saturn

/home/coolhand/datasets/us-inequality-atlas/economic/poverty_depth_by_county.csv | 3,222 rows | sample n=3,222 | seed 42 | 2026-05-01T17:06:32+00:00

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/economic/poverty_depth_by_county.csv
Total rows: 3,222
Profiled sample: 3,222
Columns: 7
Generated: 2026-05-01T17:06:32+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset contains 3,222 rows of US county-level poverty statistics. Each row is identified by a FIPS code, county name, and state abbreviation, and carries three poverty rate measures and a population total. The poverty measures are all right-skewed: pct_poverty ranges from 1.6% to 66.32% with a median of 13.55%, while pct_deep_poverty has a median of 5.82% but reaches as high as 34.7%. The total population column is extremely skewed (skew of 13.4, kurtosis ~297), with a median of 25,174 but a max near 9.8 million, so aggregate analyses should weight by population or otherwise account for the few very large counties. Texas (254 counties), Georgia (159), and Virginia (133) contribute the most rows, which matters for any state-level rollup built from unweighted county counts.
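To illustrate why the population skew matters for aggregates, here is a minimal sketch comparing an unweighted mean of rates with a population-weighted one. It assumes that `total` is the population denominator behind `pct_poverty`, which the profile does not confirm; `weighted_rate` is a hypothetical helper and the three rows are made up for illustration.

```python
def weighted_rate(rates, weights):
    """Population-weighted mean of percentage rates."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(r * w for r, w in zip(rates, weights)) / total_weight


# Illustrative values only, not rows from the dataset.
rates = [10.0, 20.0, 30.0]                # percent in poverty per county
population = [1_000_000, 10_000, 10_000]  # assumed denominators

unweighted = sum(rates) / len(rates)          # every county counts equally
weighted = weighted_rate(rates, population)   # the large county dominates
print(f"unweighted={unweighted:.2f} weighted={weighted:.2f}")
```

The unweighted figure answers "what does a typical county look like", while the weighted figure approximates the rate for the combined population. Which one is appropriate depends on the question being asked.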

fips high anthropic:claude-opus-4-7

This column is the US county FIPS code: each of the 3,222 rows has a unique, non-null value, and the range (1001 to 72153) matches the standard 5-digit state+county encoding once leading zeros are restored (a numeric 1001 corresponds to the string 01001). Treating it as numeric is misleading despite the clean distribution (skew 0.16, no outliers): the digits are categorical identifiers, not measurements.
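One way to treat the code as an identifier is to store it as a zero-padded string. The sketch below is an assumption about how a consumer might normalize the column, not something saturn does; `format_fips` and `split_fips` are hypothetical helper names.

```python
def format_fips(code):
    """Render a numeric county FIPS code as a 5-character string."""
    text = str(int(code))
    if len(text) > 5:
        raise ValueError(f"not a county FIPS code: {code!r}")
    return text.zfill(5)  # restore leading zeros lost by numeric parsing


def split_fips(fips5):
    """Split a 5-character FIPS string into (state, county) parts."""
    return fips5[:2], fips5[2:]


low = format_fips(1001)     # the column minimum reported above
high = format_fips(72153)   # the column maximum reported above
print(low, split_fips(low))
print(high, split_fips(high))
```

Keeping the value as a string also prevents accidental arithmetic or averaging on the identifier.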

county_name high anthropic:claude-opus-4-7

This column holds US county-level place names. Most values end in 'County' (2999 occurrences), with smaller groups of Louisiana 'parish' (64) and Puerto Rican 'municipio' (78) entries. Of the 3222 rows, only 1960 values are unique and 39.2% (1,262) are duplicates, because common names like Washington County (30), Jefferson County (25) and Franklin County (24) recur across states. The name alone therefore does not uniquely identify a county. Values are short and uniform (mean 14.2 chars, ~2 words).
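A quick way to check whether a candidate key is unique is to count repeated key tuples. The sketch below uses a hypothetical `duplicate_keys` helper and three made-up rows; it assumes rows are available as dictionaries with `county_name` and `state` fields.

```python
from collections import Counter


def duplicate_keys(rows, key_fields):
    """Return key tuples that occur more than once, with their counts."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Illustrative rows only, not taken from the dataset.
rows = [
    {"county_name": "Washington County", "state": "AL"},
    {"county_name": "Washington County", "state": "GA"},
    {"county_name": "Jefferson County", "state": "AL"},
]

by_name = duplicate_keys(rows, ["county_name"])
by_name_state = duplicate_keys(rows, ["county_name", "state"])
print(by_name, by_name_state)
```

In this toy sample the name collides but the (name, state) pair does not. Whether that pair is unique across the full file would need to be checked; the `fips` column is already reported as fully unique and is the safer join key.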

state high anthropic:claude-opus-4-7

This is a US state code field with 52 distinct values, consistent with the 50 states plus DC and Puerto Rico (PR appears in the top values with 78 rows). The distribution is broad (entropy ratio 0.93), with TX leading at just 7.88% (254 of 3222 rows), followed by GA, VA, KY, and MO. There are no nulls, and the row count indicates multiple records per state rather than one per state.
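The entropy ratio cited above can be read as Shannon entropy divided by its maximum for the observed number of categories. The sketch below assumes entropy is measured in bits and normalized by log2 of the cardinality; the profile does not state how saturn computes it, and `entropy_ratio` is a hypothetical helper.

```python
import math


def entropy_ratio(counts):
    """Shannon entropy in bits and its ratio to the maximum, log2(k)."""
    n = sum(counts)
    probs = [c / n for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    k = len(probs)
    # With a single category there is no uncertainty; report 0.0 by convention.
    ratio = entropy / math.log2(k) if k > 1 else 0.0
    return entropy, ratio


uniform = entropy_ratio([5, 5, 5, 5])   # perfectly even split
lopsided = entropy_ratio([9, 1])        # one category dominates
print(uniform, lopsided)
```

Under the same assumption, the reported values are self-consistent: 5.314 / log2(52) is approximately 0.932. A ratio near 1 indicates that no single state holds most of the rows, not that every state has the same count.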

total high anthropic:claude-opus-4-7

A heavily right-skewed numeric measure (skew 13.36, kurtosis 297.59) ranging from 47 to 9,782,602, with a median of 25,174 but a mean of 101,340: the upper tail pulls the mean to about four times the median. Roughly 13.9% of rows (449) are flagged as outliers, and the standard deviation (324,628) is over three times the mean, signalling that a few very large values dominate. With 3,173 unique values across 3,222 rows and no nulls or zeros, this looks like a per-record aggregate total rather than a category or flag.
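The outlier count can be related to the quartiles with Tukey's 1.5 × IQR rule. The sketch below assumes that is the rule behind the "rows beyond 1.5 IQR" flag, which the profile implies but does not spell out; `tukey_fences` is a hypothetical helper, and the inputs are the q1 and q3 values reported for `total`.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences at k times the IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# q1 and q3 as reported for the `total` column.
lower, upper = tukey_fences(10_589, 65_006)
print(lower, upper)
```

Under this assumption the lower fence is negative, so with a minimum of 47 no row can fall below it, and all 449 flagged rows would be counties with a total above roughly 146,600.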

pct_deep_poverty high anthropic:claude-opus-4-7

This is a numeric feature representing the percent of the population in deep poverty for each county (n=3222 with 1131 unique values). The distribution is right-skewed (skew 2.67, kurtosis 10.4) with a median of 5.82 but a max of 34.7, and 176 outliers (5.46%) sit in the upper tail. Min is 0.0 but the zero rate is just 0.09% (about 3 rows), so the floor is rarely hit.

pct_poverty high anthropic:claude-opus-4-7

A county-level poverty rate expressed as a percentage, ranging from 1.6 to 66.32 with a median of 13.55. The distribution is heavily right-skewed (skew 2.10, kurtosis 6.89), with 137 high-end outliers (~4.3%) pulling the mean (15.10) above the median. There are no nulls or zeros across 3,222 rows.

pct_near_poverty high anthropic:claude-opus-4-7

This column reports the percentage of the population near the poverty line, ranging from 0.58 to 49.14 with a mean of 9.81 and median of 9.38. The distribution is right-skewed (skew 1.19, kurtosis 5.73) with 82 outliers (2.55%) on the high tail, but no nulls or zeros. The IQR is tight at 4.43, so the middle half of observations falls between 7.33 and 11.76, with a long upper tail.

Numeric correlation

fips numeric

rows: 3,222
null: 0 (0.0%)
unique: 3,222
min: 1,001
max: 72,153
mean: 31,378
median: 30,022
std: 16,300
q1: 19,030
q3: 46,104
iqr: 27,075
skew: 0.157
kurtosis: -0.631
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

county_name text

95th-percentile length under 20 chars | 39.2% duplicate strings
rows: 3,222
null: 0 (0.0%)
unique: 1,960
len_min: 10
len_max: 46
len_mean: 14.172
len_median: 14.000
len_p95: 18.000
word_mean: 2.083
word_median: 2.000
n_empty: 0
n_duplicates: 1,262
duplicate_rate: 0.392
vocab_size: 1,963
readability_flesch_mean: 33.359
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Bibb County
  2. Cheatham County
  3. Piute County
  4. Lamb County
  5. Martin County
  6. Sheridan County
  7. Chickasaw County
  8. Rockingham County
  9. Liberty County
  10. Clark County

state categorical

rows: 3,222
null: 0 (0.0%)
unique: 52
top_value: TX
top_rate: 0.079
cardinality: 52
entropy: 5.314
entropy_ratio: 0.932
Top values (rank 1–20)
  1. TX — 254
  2. GA — 159
  3. VA — 133
  4. KY — 120
  5. MO — 115
  6. KS — 105
  7. IL — 102
  8. NC — 100
  9. IA — 99
  10. TN — 95
  11. NE — 93
  12. IN — 92
  13. OH — 88
  14. MN — 87
  15. MI — 83
  16. MS — 82
  17. PR — 78
  18. OK — 77
  19. AR — 75
  20. WI — 72

total numeric

skew=+13.36 | 13.9% rows beyond 1.5 IQR
rows: 3,222
null: 0 (0.0%)
unique: 3,173
min: 47.000
max: 9,782,602
mean: 101,340
median: 25,174
std: 324,628
q1: 10,589
q3: 65,006
iqr: 54,417
skew: 13.364
kurtosis: 297.593
n_outliers: 449
outlier_rate: 0.139
zero_rate: 0.000

pct_deep_poverty numeric

skew=+2.67 | 5.5% rows beyond 1.5 IQR
rows: 3,222
null: 0 (0.0%)
unique: 1,131
min: 0.000
max: 34.700
mean: 6.743
median: 5.820
std: 4.154
q1: 4.270
q3: 7.918
iqr: 3.648
skew: 2.665
kurtosis: 10.402
n_outliers: 176
outlier_rate: 0.055
zero_rate: 9.31e-04

pct_poverty numeric

skew=+2.10
rows: 3,222
null: 0 (0.0%)
unique: 1,719
min: 1.600
max: 66.320
mean: 15.100
median: 13.550
std: 7.706
q1: 10.160
q3: 17.910
iqr: 7.750
skew: 2.096
kurtosis: 6.891
n_outliers: 137
outlier_rate: 0.043
zero_rate: 0.000

pct_near_poverty numeric

rows: 3,222
null: 0 (0.0%)
unique: 1,237
min: 0.580
max: 49.140
mean: 9.813
median: 9.380
std: 3.644
q1: 7.330
q3: 11.760
iqr: 4.430
skew: 1.190
kurtosis: 5.729
n_outliers: 82
outlier_rate: 0.025
zero_rate: 0.000