food deserts snap participation

source /home/coolhand/html/datavis/data_trove/data/urban/food_deserts/snap_participation.csv · 3,222 rows · 9 columns · profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This dataset covers 3,222 U.S. counties with population, poverty, and SNAP participation estimates alongside FIPS and state identifiers. Population and SNAP-related counts are extremely right-skewed: total_pop has a skew of 13.4 and a max of 9.78M against a median of just 25,174, with similar long tails in poverty_pop and snap_participants_est. The poverty_rate column is better behaved (median 13.55%, max 66.32%, skew 2.1) and is probably the most useful field for cross-county comparison without log-scaling. Note that snap_eligible_est appears to be an exact duplicate of poverty_pop (identical summary statistics), which is worth verifying row by row before using either as an independent variable. The state column has 52 distinct codes ranging from 1 to 72, which is consistent with the 50 states plus DC and Puerto Rico.

citing: row_count · column_count · columns.total_pop.stats · columns.poverty_rate.stats · columns.poverty_pop.stats · columns.snap_eligible_est.stats · columns.snap_participants_est.stats · columns.state.n_unique · columns.name.top_words
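
Matching summary statistics only suggest that snap_eligible_est duplicates poverty_pop; a row-by-row comparison settles it. The following is a minimal sketch using only the standard library. The helper name `columns_identical` and the inline sample rows are illustrative assumptions, not part of the profiled file.

```python
import csv
import io


def columns_identical(rows, col_a, col_b):
    """Return True when col_a and col_b hold the same numeric value in every row."""
    return all(float(r[col_a]) == float(r[col_b]) for r in rows)


# Illustrative rows only; in practice, open snap_participation.csv instead.
sample = io.StringIO(
    "fips,poverty_pop,snap_eligible_est\n"
    "1001,100,100\n"
    "1003,250,250\n"
)
rows = list(csv.DictReader(sample))
identical = columns_identical(rows, "poverty_pop", "snap_eligible_est")
print(identical)
```

If the check returns True on the full file, one of the two columns can be dropped before modelling.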

Charts the summary said to look at first

poverty_rate · The middle half of counties falls between about 10% and 18% poverty (Q1 10.16, Q3 17.91), with a tail extending past 60%.

Histogram bins for poverty_rate (median: 13.55).
bin	count
1.6 – 3.218	7
3.218 – 4.836	34
4.836 – 6.454	106
6.454 – 8.072	246
8.072 – 9.69	320
9.69 – 11.31	354
11.31 – 12.93	393
12.93 – 14.54	364
14.54 – 16.16	306
16.16 – 17.78	262
17.78 – 19.4	192
19.4 – 21.02	149
21.02 – 22.63	123
22.63 – 24.25	91
24.25 – 25.87	52
25.87 – 27.49	44
27.49 – 29.11	34
29.11 – 30.72	23
30.72 – 32.34	18
32.34 – 33.96	14
33.96 – 35.58	6
35.58 – 37.2	8
37.2 – 38.81	3
38.81 – 40.43	8
40.43 – 42.05	5
42.05 – 43.67	9
43.67 – 45.29	4
45.29 – 46.9	11
46.9 – 48.52	7
48.52 – 50.14	8
50.14 – 51.76	2
51.76 – 53.38	6
53.38 – 54.99	5
54.99 – 56.61	5
56.61 – 58.23	1
58.23 – 59.85	0
59.85 – 61.47	0
61.47 – 63.08	0
63.08 – 64.7	1
64.7 – 66.32	1

total_pop · Heavy right skew — a handful of large-population counties dwarf the rest; consider a log scale.

Histogram bins for total_pop (median: 25174.0).
bin	count
47 – 2.446e+05	2942
2.446e+05 – 4.892e+05	137
4.892e+05 – 7.337e+05	57
7.337e+05 – 9.783e+05	39
9.783e+05 – 1.223e+06	12
1.223e+06 – 1.467e+06	9
1.467e+06 – 1.712e+06	7
1.712e+06 – 1.957e+06	3
1.957e+06 – 2.201e+06	3
2.201e+06 – 2.446e+06	4
2.446e+06 – 2.69e+06	3
2.69e+06 – 2.935e+06	0
2.935e+06 – 3.179e+06	1
3.179e+06 – 3.424e+06	1
3.424e+06 – 3.669e+06	0
3.669e+06 – 3.913e+06	0
3.913e+06 – 4.158e+06	0
4.158e+06 – 4.402e+06	1
4.402e+06 – 4.647e+06	0
4.647e+06 – 4.891e+06	1
4.891e+06 – 5.136e+06	0
5.136e+06 – 5.38e+06	1
5.38e+06 – 5.625e+06	0
5.625e+06 – 5.87e+06	0
5.87e+06 – 6.114e+06	0
6.114e+06 – 6.359e+06	0
6.359e+06 – 6.603e+06	0
6.603e+06 – 6.848e+06	0
6.848e+06 – 7.092e+06	0
7.092e+06 – 7.337e+06	0
7.337e+06 – 7.582e+06	0
7.582e+06 – 7.826e+06	0
7.826e+06 – 8.071e+06	0
8.071e+06 – 8.315e+06	0
8.315e+06 – 8.56e+06	0
8.56e+06 – 8.804e+06	0
8.804e+06 – 9.049e+06	0
9.049e+06 – 9.293e+06	0
9.293e+06 – 9.538e+06	0
9.538e+06 – 9.783e+06	1
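
With linear bins, 2,942 of the 3,222 rows land in the first bin, so the table above shows little beyond the tail. One way to apply the suggested log scale is to space the bin edges evenly in log10. The sketch below uses the reported min and max of total_pop; the function names are hypothetical.

```python
import math


def log10_bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges evenly spaced in log10 between lo and hi."""
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    step = (log_hi - log_lo) / n_bins
    return [10 ** (log_lo + i * step) for i in range(n_bins + 1)]


def bin_index(value, edges):
    """Return the index of the bin containing value (last bin is closed on the right)."""
    for i in range(len(edges) - 2):
        if value < edges[i + 1]:
            return i
    return len(edges) - 2


# min and max as reported for total_pop; 40 bins matches the table above.
edges = log10_bin_edges(47, 9_782_602, 40)
median_bin = bin_index(25_174, edges)
print(median_bin)
```

With these inputs the median of 25,174 falls in bin index 20 of 40, near the middle of the range, whereas in the linear bins it sits in the first bin.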

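snap_participants_est · The distribution of SNAP participant estimates is similarly long-tailed. Its bin counts match those of poverty_pop exactly, and its reported statistics are about 0.67 times the poverty_pop values (max 900,465 vs 1,343,978), so the estimate may be a fixed fraction of the poverty count rather than an independent measurement; this is worth verifying.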

Histogram bins for snap_participants_est (median: 2546.0).
bin	count
2 – 2.251e+04	2963
2.251e+04 – 4.503e+04	152
4.503e+04 – 6.754e+04	49
6.754e+04 – 9.005e+04	19
9.005e+04 – 1.126e+05	10
1.126e+05 – 1.351e+05	6
1.351e+05 – 1.576e+05	3
1.576e+05 – 1.801e+05	3
1.801e+05 – 2.026e+05	5
2.026e+05 – 2.251e+05	1
2.251e+05 – 2.476e+05	4
2.476e+05 – 2.701e+05	1
2.701e+05 – 2.927e+05	1
2.927e+05 – 3.152e+05	0
3.152e+05 – 3.377e+05	2
3.377e+05 – 3.602e+05	0
3.602e+05 – 3.827e+05	0
3.827e+05 – 4.052e+05	0
4.052e+05 – 4.277e+05	0
4.277e+05 – 4.502e+05	0
4.502e+05 – 4.727e+05	1
4.727e+05 – 4.953e+05	1
4.953e+05 – 5.178e+05	0
5.178e+05 – 5.403e+05	0
5.403e+05 – 5.628e+05	0
5.628e+05 – 5.853e+05	0
5.853e+05 – 6.078e+05	0
6.078e+05 – 6.303e+05	0
6.303e+05 – 6.528e+05	0
6.528e+05 – 6.753e+05	0
6.753e+05 – 6.979e+05	0
6.979e+05 – 7.204e+05	0
7.204e+05 – 7.429e+05	0
7.429e+05 – 7.654e+05	0
7.654e+05 – 7.879e+05	0
7.879e+05 – 8.104e+05	0
8.104e+05 – 8.329e+05	0
8.329e+05 – 8.554e+05	0
8.554e+05 – 8.78e+05	0
8.78e+05 – 9.005e+05	1
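
The statistics reported for snap_participants_est can be compared against those for snap_eligible_est to see whether the two columns differ by a constant factor. The sketch below uses only figures that appear in this profile. It is a consistency check on summary statistics, not a substitute for a row-level comparison.

```python
# Summary statistics as reported in this profile:
# (snap_eligible_est, snap_participants_est).
reported = {
    "max": (1_343_978, 900_465),
    "median": (3_799.5, 2_546),
    "mean": (13_001.22, 8_710.83),
    "std": (43_260, 28_987.13),
}

ratios = {name: part / elig for name, (elig, part) in reported.items()}
spread = max(ratios.values()) - min(ratios.values())

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

All four ratios come out close to 0.67, which is what a fixed 67% participation assumption applied to every row would produce. The std figure for snap_eligible_est is only reported to four significant digits, so small differences in the last decimals are expected.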

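state · Histogram of the numeric state codes. Because each bin spans about 1.775 code units, some bins combine two adjacent codes, so this is only a rough view of rows per state. Token counts in name suggest Texas, Virginia, and Georgia contribute the most rows, though the 'Virginia' token may also match West Virginia rows.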

Histogram bins for state (median: 30.0).
bin	count
1 – 2.775	97
2.775 – 4.55	15
4.55 – 6.325	133
6.325 – 8.1	64
8.1 – 9.875	9
9.875 – 11.65	4
11.65 – 13.42	226
13.42 – 15.2	5
15.2 – 16.98	44
16.98 – 18.75	194
18.75 – 20.52	204
20.52 – 22.3	184
22.3 – 24.07	40
24.07 – 25.85	14
25.85 – 27.62	170
27.62 – 29.4	197
29.4 – 31.17	149
31.17 – 32.95	17
32.95 – 34.73	31
34.73 – 36.5	95
36.5 – 38.27	153
38.27 – 40.05	165
40.05 – 41.82	36
41.82 – 43.6	67
43.6 – 45.38	51
45.38 – 47.15	161
47.15 – 48.92	254
48.92 – 50.7	43
50.7 – 52.47	133
52.47 – 54.25	94
54.25 – 56.02	95
56.02 – 57.8	0
57.8 – 59.57	0
59.57 – 61.35	0
61.35 – 63.12	0
63.12 – 64.9	0
64.9 – 66.67	0
66.67 – 68.45	0
68.45 – 70.22	0
70.22 – 72	78
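
The bins above treat the state code as a continuous number, so a bin may merge two neighbouring codes. Counting rows per distinct code gives the per-state row count directly. This is a minimal sketch; `rows_per_state` is a hypothetical helper and the sample codes are illustrative.

```python
from collections import Counter


def rows_per_state(state_codes):
    """Count rows per state code, most frequent first."""
    return Counter(state_codes).most_common()


# Illustrative codes only; in practice pass the full state column.
sample_codes = [48, 48, 48, 13, 13, 51, 72]
counts = rows_per_state(sample_codes)
print(counts)
```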

poverty_pop · Compare against snap_eligible_est — the two columns appear identical and may be redundant.

Histogram bins for poverty_pop (median: 3799.5).
bin	count
3 – 3.36e+04	2963
3.36e+04 – 6.72e+04	152
6.72e+04 – 1.008e+05	49
1.008e+05 – 1.344e+05	19
1.344e+05 – 1.68e+05	10
1.68e+05 – 2.016e+05	6
2.016e+05 – 2.352e+05	3
2.352e+05 – 2.688e+05	3
2.688e+05 – 3.024e+05	5
3.024e+05 – 3.36e+05	1
3.36e+05 – 3.696e+05	4
3.696e+05 – 4.032e+05	1
4.032e+05 – 4.368e+05	1
4.368e+05 – 4.704e+05	0
4.704e+05 – 5.04e+05	2
5.04e+05 – 5.376e+05	0
5.376e+05 – 5.712e+05	0
5.712e+05 – 6.048e+05	0
6.048e+05 – 6.384e+05	0
6.384e+05 – 6.72e+05	0
6.72e+05 – 7.056e+05	1
7.056e+05 – 7.392e+05	1
7.392e+05 – 7.728e+05	0
7.728e+05 – 8.064e+05	0
8.064e+05 – 8.4e+05	0
8.4e+05 – 8.736e+05	0
8.736e+05 – 9.072e+05	0
9.072e+05 – 9.408e+05	0
9.408e+05 – 9.744e+05	0
9.744e+05 – 1.008e+06	0
1.008e+06 – 1.042e+06	0
1.042e+06 – 1.075e+06	0
1.075e+06 – 1.109e+06	0
1.109e+06 – 1.142e+06	0
1.142e+06 – 1.176e+06	0
1.176e+06 – 1.21e+06	0
1.21e+06 – 1.243e+06	0
1.243e+06 – 1.277e+06	0
1.277e+06 – 1.31e+06	0
1.31e+06 – 1.344e+06	1

Schema

9 columns

Per-column summary.
column	type	null %	unique	Alerts
name	text	0.0%	3,222	near_unique
total_pop	numeric	0.0%	3,173	high_skew outliers
poverty_pop	numeric	0.0%	2,839	high_skew outliers
state	numeric	0.0%	52
county	numeric	0.0%	330	high_skew outliers
fips	numeric	0.0%	3,222
poverty_rate	numeric	0.0%	1,719	high_skew
snap_eligible_est	numeric	0.0%	2,839	high_skew outliers
snap_participants_est	numeric	0.0%	2,636	high_skew outliers

name

text identifier near_unique

This column holds U.S. county names with state qualifiers: 2,999 of 3,222 values contain the token 'county,' and the next most frequent tokens are state names such as Texas (256), Virginia (189), and Georgia (159). These are token counts, so 'Virginia' may also include West Virginia rows. All 3,222 values are unique with zero nulls or duplicates, and 95% of lengths fall between 16 and 31 characters (mean 24.3, max 59, 3.2 words on average). The near_unique alert plus the 'County, State' pattern strongly suggests this is a geographic identifier rather than free text. Treatment: Parse into separate county and state fields, then use as a join key to geographic reference data. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 3,222
len_min: 16
len_max: 59
len_mean: 24.32
len_median: 24
len_p95: 31
word_mean: 3.248
word_median: 3
n_empty: 0
n_duplicates: 0
duplicate_rate: 0
vocab_size: 1,990
readability_flesch_mean: 10.28
emoji_rate: 0
url_rate: 0
one_word_rate: 0
allcaps_rate: 0
boilerplate_rate: 0
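
The suggested parse of name into county and state fields can be sketched as a split on the last comma. This assumes every value follows the 'Area, State' pattern described above, which the profile does not confirm for all rows. The function name is hypothetical.

```python
def split_county_state(name):
    """Split 'Area, State' on the last comma into (area, state)."""
    area, sep, state = name.rpartition(",")
    if not sep:
        # No comma found: return the whole string as the area.
        return name.strip(), ""
    return area.strip(), state.strip()


print(split_county_state("Autauga County, Alabama"))
```

Splitting on the last comma rather than the first keeps any area name that itself contains a comma intact.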

total_pop

numeric feature high_skew outliers

This is a population count by geographic unit, with 3,222 rows, no nulls, and 3,173 unique values, suggesting one row per area (likely U.S. counties given the count). The distribution is extremely heavy-tailed: median is 25,174 but the mean is 101,340 and the max reaches 9,782,602, producing a skew of 13.36 and kurtosis of 297.59. About 13.9% of rows (449) flag as outliers, reflecting a small number of very large jurisdictions dominating the tail. Treatment: log-transform before regression or any distance-based modelling. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 3,173
min: 47
max: 9.783e+06
mean: 1.013e+05
median: 25,174
std: 3.246e+05
q1: 1.059e+04
q3: 6.501e+04
iqr: 5.442e+04
skew: 13.36
kurtosis: 297.6
n_outliers: 449
outlier_rate: 0.1394
zero_rate: 0

poverty_pop

numeric feature high_skew outliers

This is a count of the population in poverty per record (likely county or similar geography), ranging from 3 to 1,343,978 with a median of 3,799.5. The distribution is extremely right-skewed (skew 14.73, kurtosis 342.21) with 362 outliers (11.2%) reflecting a few very populous areas dwarfing the rest. No nulls or zeros, and 2,839 unique values across 3,222 rows suggest near-record-level granularity. Treatment: Log-transform or convert to a poverty rate before modelling. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 2,839
min: 3
max: 1.344e+06
mean: 1.3e+04
median: 3,799.5
std: 4.326e+04
q1: 1526
q3: 9768
iqr: 8242
skew: 14.73
kurtosis: 342.2
n_outliers: 362
outlier_rate: 0.1124
zero_rate: 0
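
Converting the count to a rate, as the treatment suggests, removes most of the dependence on county size. The sketch below divides by total_pop; the existing poverty_rate column may have been computed against a different denominator, so derived values are not guaranteed to match it. The function name and sample values are illustrative.

```python
def poverty_rate_pct(poverty_pop, total_pop):
    """Poverty count as a percentage of total population."""
    if total_pop <= 0:
        raise ValueError("total_pop must be positive")
    return 100.0 * poverty_pop / total_pop


# Illustrative values, not taken from the file.
rate = poverty_rate_pct(1_500, 10_000)
print(rate)
```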

state

numeric identifier

Stored as numeric but the values look like FIPS-style state codes: 52 unique integers spanning 1 to 72 across 3222 rows with no nulls and no zeros. The roughly uniform spread (mean 31.27, median 30, std 16.29, skew 0.16, kurtosis -0.63) and the count of 52 distinct codes are consistent with US states/territories rather than a continuous measurement. Treatment: Treat as a categorical state code and map to labels before modelling, not as a numeric feature. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 52
min: 1
max: 72
mean: 31.27
median: 30
std: 16.29
q1: 19
q3: 46
iqr: 27
skew: 0.1574
kurtosis: -0.6267
n_outliers: 0
outlier_rate: 0
zero_rate: 0
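
Mapping the codes to labels makes it explicit that they are categories. The lookup below is deliberately partial and only illustrates the shape of the mapping, assuming the column holds standard state FIPS codes. A complete table would need one entry per code present in the file.

```python
# Partial lookup for illustration; a full table has one entry per code.
STATE_LABELS = {
    1: "Alabama",
    11: "District of Columbia",
    13: "Georgia",
    48: "Texas",
    51: "Virginia",
    72: "Puerto Rico",
}


def state_label(code):
    """Return the label for a state code, or a placeholder when unmapped."""
    return STATE_LABELS.get(int(code), f"unknown ({code})")


print(state_label(72))
```

Returning a placeholder instead of raising keeps unmapped codes visible in downstream counts so they can be fixed in the lookup.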

county

numeric identifier high_skew outliers

Despite being labeled 'county', this column holds small integers ranging from 1 to 840 with 330 unique values across 3,222 rows, suggesting it's a numeric county code (likely a FIPS-style identifier) rather than a true measurement. With only 330 distinct values, the code repeats across states and identifies a county only in combination with the state column. The distribution is heavily right-skewed (skew 2.87, kurtosis 11.6) with 178 outliers (5.5%), which is expected for ID codes but meaningless as a numeric signal. The median (79) sits well below the mean (103), and there are no nulls or zeros. Treatment: Treat as a categorical code; do not use as a numeric feature. Encode it or join to a county lookup. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 330
min: 1
max: 840
mean: 103.2
median: 79
std: 106.6
q1: 35
q3: 133
iqr: 98
skew: 2.866
kurtosis: 11.64
n_outliers: 178
outlier_rate: 0.05525
zero_rate: 0
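
Because county codes repeat across states, a usable key needs both columns. The reported ranges (state 1 to 72, county 1 to 840, fips 1001 to 72153) are consistent with fips being state × 1000 + county, though the profile does not state this. A minimal sketch under that assumption, with a hypothetical function name:

```python
def compose_fips(state, county):
    """Combine a state code and a county code into one integer key."""
    return int(state) * 1000 + int(county)


# The reported fips min and max are consistent with this rule.
print(compose_fips(1, 1))      # 1001
print(compose_fips(72, 153))   # 72153
```

Comparing the composed value against the fips column for every row would confirm or refute the assumption.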

fips

numeric identifier

This is the FIPS county/state code identifier — values span 1001 to 72153 and every one of the 3222 rows is unique with no nulls. The distribution is near-uniform across the code range (skew 0.16, kurtosis -0.63), consistent with a geographic key rather than a measured quantity. Treatment: left-join on this id; do not use as a numeric feature. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 3,222
min: 1,001
max: 72,153
mean: 3.138e+04
median: 30,022
std: 1.63e+04
q1: 1.903e+04
q3: 4.61e+04
iqr: 27,075
skew: 0.1574
kurtosis: -0.6314
n_outliers: 0
outlier_rate: 0
zero_rate: 0
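
Since the column is stored as a number, codes below 10000 have lost their leading zero (the minimum is 1001). If the reference data being joined stores FIPS codes as five-character text, the key needs zero-padding first. A small sketch, with a hypothetical function name:

```python
def fips_to_key(fips):
    """Format a numeric FIPS code as a zero-padded five-character string."""
    return f"{int(fips):05d}"


print(fips_to_key(1001))    # 01001
print(fips_to_key(72153))   # 72153
```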

poverty_rate

numeric feature high_skew

This is a numeric poverty rate, almost certainly a percentage given the range from 1.6 to 66.32 and a median of 13.55, plausibly at a county level given n=3,222. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with 137 outliers (4.25%) sitting well above the Q3 of 17.91, indicating a long tail of high-poverty areas. There are no nulls and no zeros, and 1,719 unique values, together with values such as 13.55 and 66.32, suggest rounding to two decimals. Treatment: Apply a log or Yeo-Johnson transform before regression to tame the right skew. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 1,719
min: 1.6
max: 66.32
mean: 15.1
median: 13.55
std: 7.706
q1: 10.16
q3: 17.91
iqr: 7.75
skew: 2.096
kurtosis: 6.891
n_outliers: 137
outlier_rate: 0.04252
zero_rate: 0
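
The report does not say how n_outliers is computed. If it uses the common Tukey rule (values more than 1.5 × IQR beyond the quartiles), the cut-offs follow directly from the reported quartiles. The sketch below works through that arithmetic; the rule and the function name are assumptions.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences at k times the IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles as reported for poverty_rate.
lower, upper = tukey_fences(10.16, 17.91)
print(round(lower, 3), round(upper, 3))
```

Under this assumption the upper fence is about 29.5% and the lower fence is negative, so with a minimum of 1.6 every flagged row would sit in the high-poverty tail.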

snap_eligible_est

numeric feature high_skew outliers

Numeric estimate of SNAP-eligible population per record, with all 3,222 rows populated and 2,839 unique values. The distribution is extremely right-skewed (skew 14.73, kurtosis 342.21): the median is 3,799.5 but the mean is 13,001.22 and the max reaches 1,343,978, roughly 31 standard deviations above the mean. About 11.2% of rows (362) flag as outliers, so a small number of very large geographies dominate the tail. Every statistic listed here matches poverty_pop, so the column may be a copy of it. Treatment: Log-transform (no zeros, min=3) before any modelling or distance-based comparison. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 2,839
min: 3
max: 1.344e+06
mean: 1.3e+04
median: 3,799.5
std: 4.326e+04
q1: 1526
q3: 9768
iqr: 8242
skew: 14.73
kurtosis: 342.2
n_outliers: 362
outlier_rate: 0.1124
zero_rate: 0

snap_participants_est

numeric feature high_skew outliers

Estimated SNAP participant counts per row, ranging from 2 to 900,465 with a median of 2,546 and mean of 8,710.83. The distribution is severely right-skewed (skew 14.73, kurtosis 342.21) with std 28,987.13 dwarfing the IQR of 5,522, and 362 rows (11.24%) flagged as outliers — likely a few very large geographies pulling the tail. No nulls or zeros, and 2,636 of 3,222 values are unique. Treatment: log-transform before regression to tame the heavy right tail. high · anthropic:claude-opus-4-7

n: 3,222
nulls: 0 (0.0%)
unique: 2,636
min: 2
max: 900,465
mean: 8711
median: 2,546
std: 2.899e+04
q1: 1022
q3: 6544
iqr: 5,522
skew: 14.73
kurtosis: 342.2
n_outliers: 362
outlier_rate: 0.1124
zero_rate: 0