saturn

/home/coolhand/datasets/us-inequality-atlas/economic/gini_by_county.csv 3,222 rows sample n=3,222 seed 42 2026-05-01T17:00:58+00:00

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/economic/gini_by_county.csv
Total rows: 3,222
Profiled sample: 3,222
Columns: 4
Generated: 2026-05-01T17:00:58+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset contains 3,222 US county-level records with four fields: county name, FIPS code, Gini index, and state. The Gini index is the most analytically interesting column, with a mean of 0.448 and a max of 0.721, plus 56 outliers worth investigating for unusually high local inequality. The state distribution is broad (52 unique values), led by Texas (254 counties) and Georgia (159), so any state-level comparison should account for that imbalance. County names show a 39% duplicate rate, reflecting common names like Washington, Jefferson, and Franklin County that recur across states.
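One way to account for the state imbalance is to summarise each state separately and carry the county count next to the statistic, so that a state with 254 counties is not read the same way as one with a single row. The sketch below is illustrative only: the column names `state` and `gini_index` come from the report, while `state_summary` and the toy rows are hypothetical and the toy Gini values are made up.

```python
from collections import defaultdict
from statistics import median


def state_summary(rows):
    """Group county rows by state; report county count and median Gini per state."""
    by_state = defaultdict(list)
    for row in rows:
        by_state[row["state"]].append(float(row["gini_index"]))
    return {
        state: {"n_counties": len(values), "median_gini": median(values)}
        for state, values in sorted(by_state.items())
    }


# Toy rows with made-up values, shaped like csv.DictReader output.
toy_rows = [
    {"state": "TX", "gini_index": "0.45"},
    {"state": "TX", "gini_index": "0.47"},
    {"state": "TX", "gini_index": "0.43"},
    {"state": "DC", "gini_index": "0.52"},
]
summary = state_summary(toy_rows)
for state, stats in summary.items():
    print(state, stats)
```

Keeping `n_counties` in the output makes it easy to filter or down-weight states whose summary rests on very few rows.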

fips high anthropic:claude-opus-4-7

This is the U.S. county FIPS code: a 5-digit identifier whose first two digits encode the state and whose last three encode the county. It is stored here as an integer, so leading zeros are dropped (1001 stands for 01001). With 3222 unique values across 3222 rows, no nulls, and a range from 1001 to 72153 spanning the standard FIPS state prefixes, every row corresponds to a distinct county. Distribution stats (mean 31378, std 16300, skew 0.157) are artifacts of the prefix encoding and not meaningful as a numeric feature.
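Because the column is profiled as numeric, the state/county split only works after restoring the leading zero. A minimal sketch, assuming the codes are plain integers as the min of 1,001 suggests; `split_fips` is a hypothetical helper name, not part of saturn:

```python
def split_fips(fips: int) -> tuple[str, str]:
    """Split an integer county FIPS code into (state, county) digit strings."""
    if not 0 < fips <= 99999:
        raise ValueError(f"not a 5-digit FIPS code: {fips}")
    code = f"{fips:05d}"  # restore the leading zero lost to integer storage
    return code[:2], code[2:]


low = split_fips(1001)    # smallest code in the report
high = split_fips(72153)  # largest code in the report
print(low, high)
```

State prefix 01 is Alabama and 72 is Puerto Rico, which lines up with PR appearing in the state column. Storing the code as a zero-padded string would avoid the problem and keep it out of numeric summaries.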

county_name high anthropic:claude-opus-4-7

This column holds US county-level place names: about 93% of values end in 'County' (2999 of 3222 rows), with smaller contingents of 'Municipio' (78, Puerto Rico), 'Parish' (64, Louisiana), and 'City' (47); by these counts, the remaining 34 rows use other endings. Heavy duplication is expected and present, with a 39.2% duplicate rate (1262 repeats), because common names like Washington, Jefferson, and Franklin County recur across states. Lengths are tight (10–46 chars, mean 14.2, ~2 words) and there are no nulls or empties.
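The duplicate figures are consistent with counting every row beyond the first occurrence of a name as a duplicate: 3,222 rows minus 1,960 unique names gives 1,262, and 1,262 / 3,222 rounds to 0.392. A small sketch of that calculation follows; this definition is an assumption inferred from the numbers, not saturn's documented method, and `duplicate_stats` is a hypothetical name.

```python
from collections import Counter


def duplicate_stats(values):
    """Return (n_unique, n_duplicates, duplicate_rate) for a list of strings.

    Assumes a duplicate is any row beyond the first occurrence of a value.
    """
    n_rows = len(values)
    n_unique = len(Counter(values))
    n_duplicates = n_rows - n_unique
    rate = n_duplicates / n_rows if n_rows else 0.0
    return n_unique, n_duplicates, rate


# Toy example using names mentioned in the report.
names = ["Washington County", "Jefferson County", "Washington County", "Bibb County"]
toy = duplicate_stats(names)
print(toy)

# Cross-check against the figures reported for county_name.
reported_duplicates = 3222 - 1960
reported_rate = round(reported_duplicates / 3222, 3)
print(reported_duplicates, reported_rate)
```

Since names repeat across states by design, a join or lookup should key on `fips`, or on the pair of `county_name` and `state`, rather than on the name alone.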

state high anthropic:claude-opus-4-7

This is a US state code field with 52 distinct values across 3222 rows and no nulls, consistent with the 50 states plus DC and Puerto Rico (PR appears in the top values with 78 rows). The distribution tracks the number of counties per state: TX leads at 254 (7.88%), followed by GA (159) and VA (133), and entropy is high at 5.31 (ratio 0.93), indicating broad spread rather than concentration. The 52-value cardinality is the only mild surprise; it is worth confirming that the two extras are DC and PR rather than stray codes.
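The reported entropy and ratio agree with Shannon entropy in bits divided by its maximum for 52 categories, log2(52) ≈ 5.70, since 5.314 / 5.70 ≈ 0.932. The sketch below shows that calculation under this assumption; the function names are hypothetical and the normalisation is inferred from the numbers rather than taken from saturn's documentation.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy (base 2) of the empirical distribution of values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_ratio(values):
    """Entropy divided by log2(number of distinct values); 1.0 means uniform."""
    k = len(set(values))
    return entropy_bits(values) / math.log2(k) if k > 1 else 0.0


# Four equally frequent codes give the maximum entropy of 2 bits.
uniform = ["TX", "GA", "VA", "KY"]
print(entropy_bits(uniform), entropy_ratio(uniform))

# Cross-check of the reported ratio for the state column.
reported_ratio = round(5.314 / math.log2(52), 3)
print(reported_ratio)
```

A ratio near 1 means the rows are spread almost evenly across states; a single dominant state would pull it toward 0.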

gini_index high anthropic:claude-opus-4-7

Numeric column holding Gini index values, all within the plausible 0.274–0.721 range with no nulls or zeros across 3222 rows. The distribution is tight (IQR 0.049, std 0.038) and centred near 0.448, but a mild right skew (0.50) and 56 outliers (1.7%) point to a small set of unusual observations, most likely concentrated at the high end given the skew and the 0.721 maximum.
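The report does not say how outliers are defined. A common convention is Tukey's rule, which flags values more than 1.5 × IQR outside the quartiles. The sketch below applies that rule to the reported quartiles as an assumption; saturn may use a different criterion, and `tukey_fences` and `is_outlier` are hypothetical names.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, lower, upper):
    """True when value falls outside the closed interval [lower, upper]."""
    return value < lower or value > upper


# Quartiles as reported for gini_index.
lower, upper = tukey_fences(0.422, 0.471)
print(round(lower, 4), round(upper, 4))

# Both reported extremes fall outside these fences.
print(is_outlier(0.274, lower, upper), is_outlier(0.721, lower, upper))
```

Under this rule the fences sit near 0.3485 and 0.5445, so the reported minimum (0.274) and maximum (0.721) would both be flagged. That suggests checking the low tail as well as the high one when reviewing the 56 outliers.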

Numeric correlation

fips numeric

rows: 3,222
null: 0 (0.0%)
unique: 3,222
min: 1,001
max: 72,153
mean: 31,378
median: 30,022
std: 16,300
q1: 19,030
q3: 46,104
iqr: 27,075
skew: 0.157
kurtosis: -0.631
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000
zero_rate0.000

county_name text

95th-percentile length under 20 chars
39.2% duplicate strings

rows: 3,222
null: 0 (0.0%)
unique: 1,960
len_min: 10
len_max: 46
len_mean: 14.172
len_median: 14.000
len_p95: 18.000
word_mean: 2.083
word_median: 2.000
n_empty: 0
n_duplicates: 1,262
duplicate_rate: 0.392
vocab_size: 1,963
readability_flesch_mean: 33.359
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Bibb County
  2. Cheatham County
  3. Piute County
  4. Lamb County
  5. Martin County
  6. Sheridan County
  7. Chickasaw County
  8. Rockingham County
  9. Liberty County
  10. Clark County

state categorical

rows: 3,222
null: 0 (0.0%)
unique: 52
top_value: TX
top_rate: 0.079
cardinality: 52
entropy: 5.314
entropy_ratio: 0.932
Top values (rank 1–20)
  1. TX — 254
  2. GA — 159
  3. VA — 133
  4. KY — 120
  5. MO — 115
  6. KS — 105
  7. IL — 102
  8. NC — 100
  9. IA — 99
  10. TN — 95
  11. NE — 93
  12. IN — 92
  13. OH — 88
  14. MN — 87
  15. MI — 83
  16. MS — 82
  17. PR — 78
  18. OK — 77
  19. AR — 75
  20. WI — 72

gini_index numeric

rows: 3,222
null: 0 (0.0%)
unique: 1,317
min: 0.274
max: 0.721
mean: 0.448
median: 0.446
std: 0.038
q1: 0.422
q3: 0.471
iqr: 0.049
skew: 0.500
kurtosis: 1.634
n_outliers: 56
outlier_rate: 0.017
zero_rate: 0.000