saturn

/home/coolhand/datasets/us-inequality-atlas/food_deserts/food_desert_merged.csv 3,222 rows sample n=3,222 seed 42 2026-05-01T17:24:10+00:00

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/food_deserts/food_desert_merged.csv
Total rows: 3,222
Profiled sample: 3,222
Columns: 11
Generated: 2026-05-01T17:24:10+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset contains 3,222 rows and 11 columns of US county-level indicators on poverty, SNAP eligibility and participation, vehicle access, and total population, keyed by FIPS and county/state codes. The population and program-count columns (total_pop, poverty_pop, snap_eligible_est, snap_participants_est, no_vehicle_total) are extremely right-skewed, with skew values from 13 to 20 and roughly 11-14% of rows flagged as outliers: a handful of very large counties dominate the raw totals. snap_eligible_est and poverty_pop have identical statistics, which suggests one is a direct copy of the other. snap_participants_est shares their skew (14.73) and kurtosis (342.21), which is consistent with it being a constant multiple of the same column. Verify both relationships before analysis. The rate-based columns are more tractable: poverty_rate has a moderate skew of 2.1 with a median of 13.55%, and no_vehicle_pct has a median of 5.41% but a long tail reaching 85.94%. Start with the rate columns for cross-county comparison and reserve the totals for absolute-magnitude questions.
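One way to check the suspected copy is to compare numeric columns row for row. The sketch below is illustrative only: `find_identical_columns` is a hypothetical helper, not part of saturn, and the toy frame uses made-up values in place of food_desert_merged.csv.

```python
import numpy as np
import pandas as pd


def find_identical_columns(df: pd.DataFrame) -> list[tuple[str, str]]:
    """Return pairs of numeric columns whose values match row for row."""
    numeric = df.select_dtypes(include="number")
    cols = list(numeric.columns)
    pairs = []
    for i, left in enumerate(cols):
        for right in cols[i + 1:]:
            # Compare by value, so an int column and a float column can still match.
            if np.array_equal(numeric[left].to_numpy(), numeric[right].to_numpy()):
                pairs.append((left, right))
    return pairs


# Toy frame standing in for the real file (values are made up).
toy = pd.DataFrame({
    "total_pop": [1000, 30000, 7000],
    "poverty_pop": [120, 4500, 980],
    "snap_eligible_est": [120, 4500, 980],
})
duplicates = find_identical_columns(toy)
print(duplicates)
```

On the real file, the same call on `pd.read_csv(...)` would list every fully duplicated numeric pair. Note that `np.array_equal` treats NaN as unequal by default, which is harmless here because the report shows no nulls.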

name high anthropic:claude-opus-4-7

This column holds full county names paired with state (e.g., "... County, Texas"), as evidenced by "county," appearing 2999 times out of 3222 rows alongside top state tokens like Texas (256), Virginia (189), and Georgia (159). Every value is unique (n_unique=3222, null_rate=0) and lengths are tightly clustered (mean 24.3, min 16, max 59, ~3 words), consistent with a canonical place-name label. The near_unique alert confirms it functions as a row identifier rather than a categorical feature.

total_pop high anthropic:claude-opus-4-7

Population counts per record, ranging from 47 to 9,782,602 with a median of 25,174, consistent with US county-level totals. The distribution is extremely right-skewed (skew 13.36, kurtosis 297.59), and 13.9% of rows (449) are flagged as outliers: a handful of very large counties pull the mean (101,340) to roughly four times the median.
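A log transform is a common way to make a column this skewed easier to plot or model. The sketch below uses a small, made-up series (a few values are borrowed from the quantiles reported for total_pop, the rest are invented) and compares skew before and after `log10`. It is an illustration, not something saturn does.

```python
import numpy as np
import pandas as pd

# Made-up, county-like populations: mostly small values plus one very large one.
pop = pd.Series(
    [47, 900, 5_000, 10_589, 25_174, 65_006, 250_000, 9_782_602],
    dtype="float64",
)

raw_skew = pop.skew()            # dominated by the single largest value
log_skew = np.log10(pop).skew()  # skew on the log scale

print(f"raw skew: {raw_skew:.2f}, log10 skew: {log_skew:.2f}")
```

All values of total_pop are positive (min 47), so `log10` is safe for that column. For no_vehicle_total, which contains zeros, `np.log1p` would be the safer choice.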

poverty_pop high anthropic:claude-opus-4-7

This is a count of population in poverty per record (likely a county or similar geographic unit), ranging from 3 to 1,343,978 with a median of 3,799.5. The distribution is extremely right-skewed (skew 14.73, kurtosis 342.21) and 362 values (11.2%) are flagged as outliers, consistent with a few very large jurisdictions dwarfing the rest. No nulls or zeros, and 2,839 of 3,222 values are unique.

state high anthropic:claude-opus-4-7

Integer codes ranging from 1 to 72, with 52 unique values across 3,222 rows and no nulls, strongly suggest US state/territory FIPS codes rather than a true measurement; 52 values is consistent with the 50 states, the District of Columbia, and Puerto Rico (FIPS 72). The near-uniform spread (mean 31.27, median 30, std 16.29, skew 0.16) and absence of outliers are consistent with a categorical identifier encoded as integers. Treating these as a continuous feature would be misleading.

county high anthropic:claude-opus-4-7

Despite the name 'county', this column is stored as numeric with 330 unique integer values from 1 to 840 across 3,222 rows — consistent with a county FIPS or lookup code rather than a measured quantity. The distribution is heavily right-skewed (skew 2.87, kurtosis 11.6) with 178 outliers (5.5%), which is expected behavior for an ID-like code, not a meaningful statistical signal. No nulls or zeros are present.

fips high anthropic:claude-opus-4-7

This is the U.S. county FIPS code: every one of the 3222 rows is unique, with values spanning 1001 to 72153, consistent with state-prefixed county identifiers. The distribution is near-symmetric (skew 0.16, kurtosis -0.63) and has no outliers or nulls, as expected for a structured code rather than a measurement. Despite being numeric, the values are categorical labels and arithmetic on them is meaningless.
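A county FIPS code is the two-digit state code followed by the three-digit county code, so the three identifier columns can be checked against each other. The sketch below assumes the columns are named `state`, `county`, and `fips` as in this report. The helper names are hypothetical, and the middle row of the toy frame is illustrative.

```python
import pandas as pd


def compose_fips(state: pd.Series, county: pd.Series) -> pd.Series:
    """Build the integer county FIPS from state and county codes."""
    return state.astype(int) * 1000 + county.astype(int)


def fips_to_string(fips: pd.Series) -> pd.Series:
    """Render FIPS as a zero-padded 5-character label, e.g. 1001 -> '01001'."""
    return fips.astype(int).astype(str).str.zfill(5)


# First and last rows use the min and max fips values from the report.
toy = pd.DataFrame({
    "state": [1, 48, 72],
    "county": [1, 279, 153],
    "fips": [1001, 48279, 72153],
})

consistent = (compose_fips(toy["state"], toy["county"]) == toy["fips"]).all()
labels = fips_to_string(toy["fips"]).tolist()
print(consistent, labels)
```

Storing the code as a zero-padded string keeps leading zeros for states 1 to 9 and prevents accidental arithmetic on an identifier.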

poverty_rate high anthropic:claude-opus-4-7

This column appears to be a county- or area-level poverty rate expressed as a percentage, with 3222 rows, 1719 unique values, and no nulls. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with a median of 13.55 and mean 15.10, but a long tail stretching to a max of 66.32 versus a min of 1.6. About 4.25% of rows (137) are flagged as outliers, consistent with a small set of severely impoverished areas.

snap_eligible_est high anthropic:claude-opus-4-7

A numeric estimate of SNAP-eligible counts per record, with 3,222 non-null rows and 2,839 unique values. The distribution is severely right-skewed (skew 14.73, kurtosis 342.21): the median is 3,799.5 but the max reaches 1,343,978, and 11.2% of rows are flagged as outliers. No nulls or zeros are present, so the spread is real and not an artefact of missing data. Every statistic matches poverty_pop, so confirm whether this column is a copy before using both.

snap_participants_est high anthropic:claude-opus-4-7

Estimated SNAP participant counts per record, ranging from 2 to 900,465 with a median of 2,546 and mean of 8,711. The distribution is severely right-skewed (skew 14.73, kurtosis 342.21) with 362 outliers (11.2%) and a standard deviation (28,987) more than three times the mean, suggesting a few very large jurisdictions dominate. The skew, kurtosis, and outlier count are identical to those of poverty_pop, which is what a constant multiple of that column would produce. No nulls or zeros are present across 3,222 rows.
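Skew and kurtosis do not change when a column is multiplied by a positive constant, so the matching values for poverty_pop and snap_participants_est hint at a fixed scaling. The reported maxima (900,465 and 1,343,978) have a ratio of about 0.67, but that factor is only inferred from the summary statistics. The sketch below shows how the row-level ratio could be inspected; the helper name and toy values are made up.

```python
import pandas as pd


def scaling_ratio_summary(numerator: pd.Series, denominator: pd.Series) -> dict:
    """Summarise the row-wise ratio; a near-zero spread implies a constant factor."""
    ratio = numerator / denominator
    return {
        "min": float(ratio.min()),
        "median": float(ratio.median()),
        "max": float(ratio.max()),
    }


# Toy data: participants built as a fixed share of poverty_pop (assumed factor 0.67).
toy = pd.DataFrame({"poverty_pop": [100, 2000, 50000]})
toy["snap_participants_est"] = (toy["poverty_pop"] * 0.67).round()

summary = scaling_ratio_summary(toy["snap_participants_est"], toy["poverty_pop"])
print(summary)
```

On the real data, integer rounding would make the ratio wobble for very small counties, so a small tolerance around the median is more realistic than exact equality.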

no_vehicle_total high anthropic:claude-opus-4-7

This column appears to be a count of households (or housing units) without access to a vehicle per record, not a count of vehicles: the name pairs with no_vehicle_pct, and the two columns share the same zero rate. The distribution is extremely heavy-tailed: the median is 580 but the mean is 3,304 and the maximum reaches 601,621, with skew of 20.26 and kurtosis of 501.27. About 12.6% of rows (407) are flagged as outliers, while only 0.37% are zeros and there are no nulls.

no_vehicle_pct high anthropic:claude-opus-4-7

Likely a per-area percentage of households without a vehicle, given values bounded between 0.0 and 85.94 with a median of 5.41 and Q1-Q3 of 3.98-7.36. The distribution is severely right-skewed (skew 6.98, kurtosis 86.23) with 140 outliers (4.35%) stretching far above the typical range, while only 0.37% of rows are exactly zero. No nulls across 3,222 rows.

Numeric correlation

name text

100.0% of rows are unique strings
rows: 3,222
null: 0 (0.0%)
unique: 3,222
len_min: 16
len_max: 59
len_mean: 24.324
len_median: 24.000
len_p95: 31.000
word_mean: 3.248
word_median: 3.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 1,990
readability_flesch_mean: 10.284
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Bibb County, Alabama
  2. Cheatham County, Tennessee
  3. Piute County, Utah
  4. Lamb County, Texas
  5. Martin County, Minnesota
  6. Sheridan County, Wyoming
  7. Chickasaw County, Mississippi
  8. Rockingham County, Virginia
  9. Liberty County, Texas
  10. Clark County, Arkansas

total_pop numeric

skew=+13.36; 13.9% of rows beyond 1.5×IQR
rows: 3,222
null: 0 (0.0%)
unique: 3,173
min: 47.000
max: 9,782,602
mean: 101,340
median: 25,174
std: 324,628
q1: 10,589
q3: 65,006
iqr: 54,417
skew: 13.364
kurtosis: 297.593
n_outliers: 449
outlier_rate: 0.139
zero_rate: 0.000
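The "beyond 1.5 IQR" alert presumably refers to Tukey's fences, q1 - 1.5·iqr and q3 + 1.5·iqr. That is an assumption about how saturn counts outliers, not something the report states. Under that assumption, the total_pop quartiles above give an upper fence of 65,006 + 1.5 × 54,417 = 146,631.5 and a negative lower fence, so every flagged row would sit on the high side. A minimal sketch with hypothetical helper names:

```python
import pandas as pd


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k times the interquartile range."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outlier_rate(values: pd.Series, k: float = 1.5) -> float:
    """Share of values falling outside the Tukey fences."""
    lower, upper = tukey_fences(values.quantile(0.25), values.quantile(0.75), k)
    return float(((values < lower) | (values > upper)).mean())


# Fences implied by the total_pop quartiles in this report.
lower, upper = tukey_fences(10_589, 65_006)
print(lower, upper)

# Toy series with a single extreme value (made-up data).
toy_rate = outlier_rate(pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]))
print(toy_rate)
```

If the rate computed this way on the real column differs from the reported 0.139, saturn is likely using a different quantile interpolation or a different rule.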

poverty_pop numeric

skew=+14.73; 11.2% of rows beyond 1.5×IQR
rows: 3,222
null: 0 (0.0%)
unique: 2,839
min: 3.000
max: 1,343,978
mean: 13,001
median: 3,800
std: 43,264
q1: 1,526
q3: 9,768
iqr: 8,242
skew: 14.731
kurtosis: 342.209
n_outliers: 362
outlier_rate: 0.112
zero_rate: 0.000

state numeric

rows: 3,222
null: 0 (0.0%)
unique: 52
min: 1.000
max: 72.000
mean: 31.275
median: 30.000
std: 16.285
q1: 19.000
q3: 46.000
iqr: 27.000
skew: 0.157
kurtosis: -0.627
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

county numeric

skew=+2.87; 5.5% of rows beyond 1.5×IQR
rows: 3,222
null: 0 (0.0%)
unique: 330
min: 1.000
max: 840.000
mean: 103.216
median: 79.000
std: 106.561
q1: 35.000
q3: 133.000
iqr: 98.000
skew: 2.866
kurtosis: 11.640
n_outliers: 178
outlier_rate: 0.055
zero_rate: 0.000

fips numeric

rows: 3,222
null: 0 (0.0%)
unique: 3,222
min: 1,001
max: 72,153
mean: 31,378
median: 30,022
std: 16,300
q1: 19,030
q3: 46,104
iqr: 27,075
skew: 0.157
kurtosis: -0.631
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

poverty_rate numeric

skew=+2.10
rows: 3,222
null: 0 (0.0%)
unique: 1,719
min: 1.600
max: 66.320
mean: 15.100
median: 13.550
std: 7.706
q1: 10.160
q3: 17.910
iqr: 7.750
skew: 2.096
kurtosis: 6.891
n_outliers: 137
outlier_rate: 0.043
zero_rate: 0.000

snap_eligible_est numeric

skew=+14.73; 11.2% of rows beyond 1.5×IQR
rows: 3,222
null: 0 (0.0%)
unique: 2,839
min: 3.000
max: 1,343,978
mean: 13,001
median: 3,800
std: 43,264
q1: 1,526
q3: 9,768
iqr: 8,242
skew: 14.731
kurtosis: 342.209
n_outliers: 362
outlier_rate: 0.112
zero_rate: 0.000

snap_participants_est numeric

skew=+14.73; 11.2% of rows beyond 1.5×IQR
rows: 3,222
null: 0 (0.0%)
unique: 2,636
min: 2.000
max: 900,465
mean: 8,711
median: 2,546
std: 28,987
q1: 1,022
q3: 6,544
iqr: 5,522
skew: 14.731
kurtosis: 342.209
n_outliers: 362
outlier_rate: 0.112
zero_rate: 0.000

no_vehicle_total numeric

skew=+20.26; 12.6% of rows beyond 1.5×IQR
rows: 3,222
null: 0 (0.0%)
unique: 1,823
min: 0.000
max: 601,621
mean: 3,304
median: 580.000
std: 20,050
q1: 223.000
q3: 1,555
iqr: 1,332
skew: 20.257
kurtosis: 501.273
n_outliers: 407
outlier_rate: 0.126
zero_rate: 3.72e-03

no_vehicle_pct numeric

skew=+6.98
rows: 3,222
null: 0 (0.0%)
unique: 1,065
min: 0.000
max: 85.940
mean: 6.197
median: 5.410
std: 4.538
q1: 3.980
q3: 7.360
iqr: 3.380
skew: 6.976
kurtosis: 86.230
n_outliers: 140
outlier_rate: 0.043
zero_rate: 3.72e-03