saturn

/home/coolhand/datasets/us-inequality-atlas/food_deserts/food_desert_merged.csv | 3,222 rows | sample n=3,222 | seed 42 | 2026-05-01T17:24:10+00:00

Overview

Source	/home/coolhand/datasets/us-inequality-atlas/food_deserts/food_desert_merged.csv
Total rows	3,222
Profiled sample	3,222
Columns	11
Generated	2026-05-01T17:24:10+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset contains 3,222 rows and 11 columns of US county-level indicators on poverty, SNAP eligibility and participation, vehicle access, and total population, keyed by FIPS and county/state codes. The population and program-count columns (total_pop, poverty_pop, snap_eligible_est, snap_participants_est, no_vehicle_total) are extremely right-skewed, with skew values from 13.4 to 20.3 and 11.2-13.9% of rows flagged as outliers: a handful of very large counties dominate the raw totals. Note that snap_eligible_est and poverty_pop have identical statistics, suggesting one is a direct copy of the other and worth verifying before analysis. snap_participants_est also shares their skew (14.73), kurtosis (342.21), and outlier count (362), and its mean, standard deviation, quartiles, and maximum are each about 0.67 times those of poverty_pop, which is consistent with it being a fixed multiple of the same column rather than an independent estimate. The rate-based columns are more tractable: poverty_rate has a moderate skew of 2.1 with a median of 13.55%, and no_vehicle_pct has a median of 5.41% but a long tail reaching 85.94%. Start with the rate columns for cross-county comparison and reserve the totals for absolute-magnitude questions.
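The shared skew and kurtosis across poverty_pop, snap_eligible_est, and snap_participants_est can be probed directly from the summary tables in this report. The sketch below is illustrative only: `constant_ratio` is a hypothetical helper, the 1% tolerance is an arbitrary choice, and the inputs are the rounded summary statistics printed further down, not row-level data.

```python
import math

def constant_ratio(xs, ys, rel_tol=0.01):
    """Return the mean of ys/xs if the ratio is nearly constant, else None."""
    ratios = [y / x for x, y in zip(xs, ys) if x != 0]
    if not ratios:
        return None
    first = ratios[0]
    if all(math.isclose(r, first, rel_tol=rel_tol) for r in ratios):
        return sum(ratios) / len(ratios)
    return None

# Summary statistics as reported in this profile (rounded by the report).
poverty_stats = {"mean": 13001, "std": 43264, "q1": 1526, "q3": 9768, "max": 1343978}
snap_part_stats = {"mean": 8711, "std": 28987, "q1": 1022, "q3": 6544, "max": 900465}

keys = list(poverty_stats)
ratio = constant_ratio(
    [poverty_stats[k] for k in keys],
    [snap_part_stats[k] for k in keys],
)
print(ratio)  # a value close to 0.67 if the statistics scale together
```

A constant ratio across summary statistics is only suggestive. Running the same check on the two columns row by row would be the stronger test.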

name high anthropic:claude-opus-4-7

This column holds full county names paired with state (e.g., "... County, Texas"), as evidenced by "county," appearing 2999 times out of 3222 rows alongside top state tokens like Texas (256), Virginia (189), and Georgia (159). Every value is unique (n_unique=3222, null_rate=0) and lengths are tightly clustered (mean 24.3, min 16, max 59, ~3 words), consistent with a canonical place-name label. The near_unique alert confirms it functions as a row identifier rather than a categorical feature.

total_pop high anthropic:claude-opus-4-7

Population counts per record, ranging from 47 to 9,782,602 with a median of 25,174 — consistent with US county-level totals. The distribution is extremely right-skewed (skew 13.36, kurtosis 297.59) and 13.9% of rows (449) flag as outliers, driven by a handful of mega-population entities pulling the mean (101,340) far above the median.
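The outlier share quoted for total_pop comes from an alert worded as "rows beyond 1.5 IQR". Assuming this means the conventional Tukey fences (an assumption, since saturn's exact rule and quantile method are not stated in this report), the cutoffs can be reproduced from the reported quartiles. The name `tukey_fences` is illustrative.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) cutoffs at k * IQR outside the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported for total_pop in this profile.
lower, upper = tukey_fences(10_589, 65_006)
print(lower, upper)  # -71036.5 146631.5
```

Under this assumption the lower fence is negative, so with a minimum of 47 every flagged total_pop row would sit above the upper fence of roughly 146,632.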

poverty_pop high anthropic:claude-opus-4-7

This is a count of population in poverty per record (likely a county or similar geographic unit), ranging from 3 to 1,343,978 with a median of 3,799.5. The distribution is extremely right-skewed (skew 14.73, kurtosis 342.21) and 362 values (11.2%) are flagged as outliers, consistent with a few very large jurisdictions dwarfing the rest. No nulls or zeros, and 2,839 of 3,222 values are unique.

state high anthropic:claude-opus-4-7

Numeric codes ranging from 1 to 72 with 52 unique values across 3222 rows and no nulls strongly suggest US state/territory FIPS codes rather than a true measurement. The near-uniform spread (mean 31.27, median 30, std 16.29, skew 0.16) and absence of outliers are consistent with a categorical identifier encoded as integers. Treating these as a continuous feature would be misleading.

county high anthropic:claude-opus-4-7

Despite the name 'county', this column is stored as numeric with 330 unique integer values from 1 to 840 across 3,222 rows — consistent with a county FIPS or lookup code rather than a measured quantity. The distribution is heavily right-skewed (skew 2.87, kurtosis 11.6) with 178 outliers (5.5%), which is expected behavior for an ID-like code, not a meaningful statistical signal. No nulls or zeros are present.

fips high anthropic:claude-opus-4-7

This is the U.S. county FIPS code: every one of the 3222 rows is unique, with values spanning 1001 to 72153, consistent with state-prefixed county identifiers. The distribution is near-symmetric (skew 0.16, kurtosis -0.63) and has no outliers or nulls, as expected for a structured code rather than a measurement. Despite being numeric, the values are categorical labels and arithmetic on them is meaningless.
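The reported statistics are consistent with fips being built as state * 1000 + county: the minimum 1001 and maximum 72153 split cleanly into a state code and a county code, and the reported means line up (31.275 * 1000 + 103.216 is about 31,378). The sketch below assumes that composition, which this report does not confirm, and uses hypothetical helper names. It also shows zero-padding to a five-character string, a common way to keep such codes from being treated as quantities.

```python
def compose_fips(state, county):
    """Combine a state code and a county code under the assumed layout."""
    return state * 1000 + county

def fips_to_str(fips):
    """Render a numeric FIPS code as a zero-padded five-character label."""
    return f"{int(fips):05d}"

print(compose_fips(1, 1))     # 1001
print(compose_fips(72, 153))  # 72153
print(fips_to_str(1001))      # 01001
```

Checking `compose_fips(state, county) == fips` on every row would confirm or refute the assumed layout for this file.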

poverty_rate high anthropic:claude-opus-4-7

This column appears to be a county- or area-level poverty rate expressed as a percentage, with 3222 rows, 1719 unique values, and no nulls. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with a median of 13.55 and mean 15.10, but a long tail stretching to a max of 66.32 versus a min of 1.6. About 4.25% of rows (137) are flagged as outliers, consistent with a small set of severely impoverished areas.

snap_eligible_est high anthropic:claude-opus-4-7

A numeric estimate of SNAP-eligible counts per record, with 3222 non-null rows and 2839 unique values. The distribution is severely right-skewed (skew 14.73, kurtosis 342.21): the median is 3799.5 but the max reaches 1,343,978, and 11.2% of rows flag as outliers. No nulls or zeros are present, so the spread is real, not missingness artefact.

snap_participants_est high anthropic:claude-opus-4-7

Estimated SNAP participant counts per record, ranging from 2 to 900,465 with a median of 2,546 and mean of 8,711. The distribution is severely right-skewed (skew 14.73, kurtosis 342.21) with 362 outliers (11.2%) and a standard deviation (28,987) more than three times the mean, suggesting a few very large jurisdictions dominate. No nulls or zeros are present across 3,222 rows.

no_vehicle_total high anthropic:claude-opus-4-7

This column appears to be a count of households (or residents) without access to a vehicle per record, not a count of vehicles: both the column name and the companion no_vehicle_pct column point that way. The distribution is extremely heavy-tailed: the median is 580 but the mean is 3,304 and the maximum reaches 601,621, with skew of 20.26 and kurtosis of 501.27. About 12.6% of rows (407) are flagged as outliers, while only 0.37% (about 12 rows) are zeros and there are no nulls.

no_vehicle_pct high anthropic:claude-opus-4-7

Likely a per-area percentage of households without a vehicle, given values bounded between 0.0 and 85.94 with a median of 5.41 and Q1-Q3 of 3.98-7.36. The distribution is severely right-skewed (skew 6.98, kurtosis 86.23) with 140 outliers (4.35%) stretching far above the typical range, while only 0.37% of rows are exactly zero. No nulls across 3,222 rows.
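The zero rate is easier to read as a row count. A minimal sketch, using the zero_rate value reported for both no_vehicle columns (3.72e-03) and the row count of 3,222; `rate_to_count` is an illustrative name.

```python
def rate_to_count(rate, n_rows):
    """Convert a reported rate back to the nearest whole number of rows."""
    return round(rate * n_rows)

zero_rows = rate_to_count(3.72e-03, 3222)
print(zero_rows)  # 12
```

Both no_vehicle_total and no_vehicle_pct report the same zero rate, which is consistent with the same rows being zero in each column, though only the row-level data can confirm that.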

Numeric correlation

name text

100.0% of rows are unique strings

rows	3,222
null	0 (0.0%)
unique	3,222
len_min	16
len_max	59
len_mean	24.324
len_median	24.000
len_p95	31.000
word_mean	3.248
word_median	3.000
n_empty	0
n_duplicates	0
duplicate_rate	0.000
vocab_size	1,990
readability_flesch_mean	10.284
emoji_rate	0.000
url_rate	0.000
one_word_rate	0.000
allcaps_rate	0.000
boilerplate_rate	0.000

Sample values (first 10)

Bibb County, Alabama
Cheatham County, Tennessee
Piute County, Utah
Lamb County, Texas
Martin County, Minnesota
Sheridan County, Wyoming
Chickasaw County, Mississippi
Rockingham County, Virginia
Liberty County, Texas
Clark County, Arkansas

total_pop numeric

skew=+13.36; 13.9% of rows beyond 1.5 IQR

rows	3,222
null	0 (0.0%)
unique	3,173
min	47.000
max	9,782,602
mean	101,340
median	25,174
std	324,628
q1	10,589
q3	65,006
iqr	54,417
skew	13.364
kurtosis	297.593
n_outliers	449
outlier_rate	0.139
zero_rate	0.000

poverty_pop numeric

skew=+14.73; 11.2% of rows beyond 1.5 IQR

rows	3,222
null	0 (0.0%)
unique	2,839
min	3.000
max	1,343,978
mean	13,001
median	3,800
std	43,264
q1	1,526
q3	9,768
iqr	8,242
skew	14.731
kurtosis	342.209
n_outliers	362
outlier_rate	0.112
zero_rate	0.000

state numeric

rows	3,222
null	0 (0.0%)
unique	52
min	1.000
max	72.000
mean	31.275
median	30.000
std	16.285
q1	19.000
q3	46.000
iqr	27.000
skew	0.157
kurtosis	-0.627
n_outliers	0
outlier_rate	0.000
zero_rate	0.000

county numeric

skew=+2.87; 5.5% of rows beyond 1.5 IQR

rows	3,222
null	0 (0.0%)
unique	330
min	1.000
max	840.000
mean	103.216
median	79.000
std	106.561
q1	35.000
q3	133.000
iqr	98.000
skew	2.866
kurtosis	11.640
n_outliers	178
outlier_rate	0.055
zero_rate	0.000

fips numeric

rows	3,222
null	0 (0.0%)
unique	3,222
min	1,001
max	72,153
mean	31,378
median	30,022
std	16,300
q1	19,030
q3	46,104
iqr	27,075
skew	0.157
kurtosis	-0.631
n_outliers	0
outlier_rate	0.000
zero_rate	0.000

poverty_rate numeric

skew=+2.10

rows	3,222
null	0 (0.0%)
unique	1,719
min	1.600
max	66.320
mean	15.100
median	13.550
std	7.706
q1	10.160
q3	17.910
iqr	7.750
skew	2.096
kurtosis	6.891
n_outliers	137
outlier_rate	0.043
zero_rate	0.000

snap_eligible_est numeric

skew=+14.73; 11.2% of rows beyond 1.5 IQR

rows	3,222
null	0 (0.0%)
unique	2,839
min	3.000
max	1,343,978
mean	13,001
median	3,800
std	43,264
q1	1,526
q3	9,768
iqr	8,242
skew	14.731
kurtosis	342.209
n_outliers	362
outlier_rate	0.112
zero_rate	0.000

snap_participants_est numeric

skew=+14.73; 11.2% of rows beyond 1.5 IQR

rows	3,222
null	0 (0.0%)
unique	2,636
min	2.000
max	900,465
mean	8,711
median	2,546
std	28,987
q1	1,022
q3	6,544
iqr	5,522
skew	14.731
kurtosis	342.209
n_outliers	362
outlier_rate	0.112
zero_rate	0.000

no_vehicle_total numeric

skew=+20.26; 12.6% of rows beyond 1.5 IQR

rows	3,222
null	0 (0.0%)
unique	1,823
min	0.000
max	601,621
mean	3,304
median	580.000
std	20,050
q1	223.000
q3	1,555
iqr	1,332
skew	20.257
kurtosis	501.273
n_outliers	407
outlier_rate	0.126
zero_rate	3.72e-03

no_vehicle_pct numeric

skew=+6.98

rows	3,222
null	0 (0.0%)
unique	1,065
min	0.000
max	85.940
mean	6.197
median	5.410
std	4.538
q1	3.980
q3	7.360
iqr	3.380
skew	6.976
kurtosis	86.230
n_outliers	140
outlier_rate	0.043
zero_rate	3.72e-03