saturn

/home/coolhand/html/datavis/data_trove/data/quirky/food_desert_states.json 51 rows sample n=51 seed 42 2026-06-21T23:46:41+00:00

Overview

Source	/home/coolhand/html/datavis/data_trove/data/quirky/food_desert_states.json
Total rows	51
Profiled sample	51
Columns	11
Generated	2026-06-21T23:46:41+00:00

Per-column null rate across the corpus.
column	kind	null %
name	categorical	0.0%
abbr	categorical	0.0%
pop	numeric	0.0%
desertPop	numeric	0.0%
povertyPop	numeric	0.0%
noVehicle	numeric	0.0%
povertyRate	numeric	0.0%
noVehiclePct	numeric	0.0%
counties	numeric	0.0%
lat	numeric	0.0%
lon	numeric	0.0%
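
A minimal sketch of how a per-column null rate like the one above could be computed. It assumes the JSON file is a list of flat records, which this report does not confirm, and the `null_rates` helper and the sample rows are hypothetical; saturn's own implementation is not shown here.

```python
from typing import Any


def null_rates(rows: list[dict[str, Any]]) -> dict[str, float]:
    """Return the percentage of missing (None or absent) values per column."""
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    rates: dict[str, float] = {}
    for col in columns:
        # A key that is absent from a row is treated the same as an explicit None.
        missing = sum(1 for row in rows if row.get(col) is None)
        rates[col] = 100.0 * missing / total if total else 0.0
    return rates


# Hypothetical rows, not taken from the dataset.
sample = [
    {"name": "A", "pop": 10},
    {"name": "B", "pop": None},
    {"name": "C"},
    {"name": "D", "pop": 40},
]
rates = null_rates(sample)
print(rates)
```

On the real file every column would come back as 0.0, matching the table.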

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset high anthropic:default

This dataset contains one row per U.S. state plus D.C. (51 rows total) with figures on food desert populations, vehicle access, and poverty. The most striking feature is the extreme right skew in desert-exposed population counts: the median desertPop is just 21 but the max reaches 449 (apparently in thousands of people, judging by the scale of pop), and 6 outlier states stretch the upper tail. noVehicle shows almost the same pattern, and the two columns correlate at +1.00 after rounding. Poverty rate, by contrast, is much closer to normal (mean 12.4%, std 2.6%) and correlates only weakly with desertPop (+0.12), whereas pop correlates at +0.68. This suggests that food desert exposure tracks state size and vehicle access more than poverty alone, which is worth cross-examining. The noVehiclePct column (max 17.37% vs. median 2.45%) flags a small handful of states with dramatically higher no-vehicle rates; its correlation with desertPop is only moderate (+0.44), so those states may overlap just partly with the desertPop outliers.
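
One way to cross-examine the state-size effect is to express each count as a share of pop before comparing states. The sketch below assumes desertPop and pop use the same unit, which the report does not state; the `per_capita_share` helper and the two records are hypothetical, with made-up values.

```python
def per_capita_share(count: float, population: float) -> float:
    """Return count as a percentage of population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * count / population


# Hypothetical records shaped like the report's columns (values are made up).
records = [
    {"abbr": "XX", "pop": 10000, "desertPop": 50},
    {"abbr": "YY", "pop": 1000, "desertPop": 20},
]
shares = {r["abbr"]: per_capita_share(r["desertPop"], r["pop"]) for r in records}
print(shares)
```

In this toy input the larger unit has the bigger raw count but the smaller share, which is the kind of reversal a per-capita view can reveal.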

desertPop medium anthropic:default

This column most likely represents the population living in food deserts per U.S. state (n=51), going by the source file name food_desert_states.json. The distribution is severely right-skewed (skew=4.73, kurtosis=25.51): the median is just 21.0 while the mean is 38.27 and the max reaches 449.0, so a small number of states dominate the food desert population totals. With 6 outliers (≈11.8% of rows) and a standard deviation of 67.39 against a median of 21.0, those extreme values will heavily distort any linear model trained on raw values.
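
The outlier count rests on the "beyond 1.5 IQR" rule named in the column alerts. As a worked example, the sketch below applies the standard Tukey fences to the quartiles reported for desertPop (q1 = 6.0, q3 = 35.5). The `tukey_fences` helper is hypothetical, and whether saturn computes its quartiles the same way is an assumption.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for desertPop in this profile.
lower, upper = tukey_fences(q1=6.0, q3=35.5)
print(lower, upper)  # -38.25 79.75

# 6 flagged rows out of 51 reproduces the reported outlier_rate.
outlier_rate = round(6 / 51, 3)
print(outlier_rate)  # 0.118
```

Since the column minimum is 1, no value can fall below the lower fence, so all 6 flagged rows must lie above 79.75.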

noVehicle medium anthropic:default

This column likely represents a count of households or individuals without access to a vehicle, aggregated per U.S. state plus D.C. (51 rows). The distribution is severely right-skewed (skew = 4.41, kurtosis = 22.57), with a median of 115 but a mean pulled to 204.9 by a long upper tail reaching 2,202. Six outliers (≈11.8% of rows) drive this shape, suggesting that a few populous or low-car-ownership states dominate the upper end, while the middle half of states falls between 40 and 203 (IQR = 163).

noVehiclePct high anthropic:default

This column represents the percentage of households without a vehicle, likely a census- or survey-derived socioeconomic indicator, across the 51 rows (states plus D.C.). The distribution is heavily right-skewed (skew=4.53, kurtosis=21.72): the bulk of values cluster tightly between Q1=2.15% and Q3=3.05%, yet 3 outliers pull the max to 17.37%, about 7× the median of 2.45%. Because each row is a state-level unit, that extreme upper tail most likely reflects a highly urbanized jurisdiction (for example New York or D.C.) where car-free households are far more common than elsewhere.

pop medium anthropic:default

This column likely represents population counts for the 51 units in the dataset (U.S. states plus D.C.), given exactly 51 unique, non-null integer values; the magnitudes (min 564, max 38,643) suggest the unit is thousands of people. The distribution is heavily right-skewed (skew = 2.58, kurtosis = 7.61), with a median of 4,372 far below the mean of 6,338 and a maximum of 38,643, so a small number of very populous states stretch the tail; 4 outliers (~7.8% of rows) drive this effect. The std of 7,243 exceeds the mean, confirming high dispersion relative to the central tendency.

povertyPop high anthropic:default

This column likely represents a count of people living in poverty, measured per U.S. state (n=51, matching the 50 states plus DC). The distribution is heavily right-skewed (skew=2.53, kurtosis=6.80), with a median of 548 but a mean of 794 and a maximum of 4685, indicating a small number of high-population states pull the mean well above the typical value. Four outliers (~7.8% of rows) are flagged, likely corresponding to the most populous states with the largest absolute poverty populations. The near-uniqueness (49 of 51 distinct values) suggests this is a genuine count variable, not a derived category.

povertyRate high anthropic:default

This column represents poverty rate (likely percentage of population below the poverty line) across 51 observations — almost certainly U.S. states plus DC. Values range from 7.33 to 19.2 with a mean of 12.35 and median of 11.91, indicating a modest right skew (skew=0.75) consistent with a handful of higher-poverty states pulling the tail. Three outliers (~5.9% of rows) at the upper end are flagged, likely representing the highest-poverty states; the near-zero kurtosis (0.20) suggests the distribution is otherwise fairly normal.
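
Reading a kurtosis near zero as "fairly normal" only holds for excess kurtosis, where a normal distribution scores 0 rather than 3. The sketch below computes skewness and excess kurtosis from population central moments. Which estimator saturn actually uses (biased or bias-corrected) is not stated in this report, so the `moments_shape` helper is an assumption and will not necessarily reproduce the reported figures to the last digit.

```python
def moments_shape(values: list[float]) -> tuple[float, float]:
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("values are constant")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurtosis


# Symmetric toy data: skewness is zero.
sym_skew, sym_kurt = moments_shape([1, 2, 3, 4, 5])

# One large value on the right: skewness turns positive.
tail_skew, _ = moments_shape([1, 1, 1, 1, 10])
print(sym_skew, sym_kurt, tail_skew)
```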

abbr high anthropic:default

This column contains two-letter US state abbreviations, with exactly 51 unique values across 51 rows — covering all 50 states plus one additional entry (likely Washington D.C. or a territory). Every value appears exactly once (top_rate = 0.0196), yielding a perfect entropy ratio of 1.0, meaning this is a fully uniform identifier with zero redundancy. The 'long_tail' alert is a statistical artifact of perfect uniformity, not a genuine concern here.

name high anthropic:default

This column contains US state names, with all 51 entries being unique (cardinality = 51, n = 51), consistent with a full list of US states plus Washington D.C. or a territory. Entropy ratio is exactly 1.0, meaning perfect uniformity — every value appears exactly once (top_rate = 0.0196, or 1/51). The 'long_tail' alert is technically correct but misleading here: the distribution is not skewed, it is perfectly flat.
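
The entropy figures for the two identifier columns can be checked by hand: 51 equally likely values give an entropy of log2(51) ≈ 5.672 bits, which matches the reported value and suggests a base-2 logarithm. The sketch below assumes that base and assumes the ratio is entropy divided by log2(cardinality); the `entropy_stats` helper and the placeholder labels are hypothetical.

```python
import math
from collections import Counter


def entropy_stats(values: list[str]) -> tuple[float, float, float]:
    """Return (entropy in bits, entropy ratio, top_rate) for a categorical column."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    n = len(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Maximum entropy is reached when every distinct value is equally frequent.
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    ratio = entropy / max_entropy if max_entropy > 0 else 1.0
    top_rate = max(counts.values()) / n
    return entropy, ratio, top_rate


# 51 distinct placeholder labels standing in for the state names.
entropy, ratio, top_rate = entropy_stats([f"state_{i}" for i in range(51)])
print(round(entropy, 3), round(ratio, 3), round(top_rate, 4))
```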

counties high anthropic:default

This column most likely represents the number of counties per U.S. state (plus D.C.), matching the dataset's 51 rows exactly. The mean of ~62 and median of 62 are consistent with typical state county counts, while the maximum of 254 is almost certainly Texas (which has 254 counties). The distribution is right-skewed (skew 1.44) with high kurtosis (3.91), driven by that single outlier — Texas — which sits far above the rest of the distribution.

lat high anthropic:default

This column contains latitude coordinates, almost certainly representing the 50 US states plus Washington D.C. (n=51, all unique). The mean of 39.57 and median of 39.55 are tightly aligned, indicating near-symmetric distribution centered on the mid-continental US, though a kurtosis of 3.94 flags heavier tails than normal — driven by the 2 outliers likely corresponding to Alaska (max 64.2) and Hawaii (min 19.9).

lon high anthropic:default

This column contains longitude coordinates for the 51 rows (U.S. states plus D.C.), all negative and therefore in the Western Hemisphere. The range spans -155.58 to -69.45, and the left skew (skew = -1.27) reflects Hawaii (≈-155°) and Alaska pulling the distribution westward. Only 48 of the 51 values are unique, so a few rows share a longitude, and the 2 outliers (~3.9%) likely correspond to Hawaii and Alaska.

Numeric correlation

Pearson correlation across 9 numeric columns (values clipped to 2 decimals).
	pop	desertPop	povertyPop	noVehicle	povertyRate	noVehiclePct	counties	lat	lon
pop	+1.00	+0.68	+0.99	+0.70	+0.06	+0.05	+0.45	-0.25	+0.03
desertPop	+0.68	+1.00	+0.69	+1.00	+0.12	+0.44	+0.20	-0.07	+0.16
povertyPop	+0.99	+0.69	+1.00	+0.70	+0.17	+0.06	+0.50	-0.30	+0.04
noVehicle	+0.70	+1.00	+0.70	+1.00	+0.06	+0.44	+0.19	-0.05	+0.17
povertyRate	+0.06	+0.12	+0.17	+0.06	+1.00	+0.14	+0.28	-0.43	+0.10
noVehiclePct	+0.05	+0.44	+0.06	+0.44	+0.14	+1.00	-0.22	+0.07	+0.25
counties	+0.45	+0.20	+0.50	+0.19	+0.28	-0.22	+1.00	-0.22	+0.07
lat	-0.25	-0.07	-0.30	-0.05	-0.43	+0.07	-0.22	+1.00	-0.09
lon	+0.03	+0.16	+0.04	+0.17	+0.10	+0.25	+0.07	-0.09	+1.00
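
For reference, a minimal pure-Python sketch of the Pearson coefficient that a matrix like this reports. The `pearson` helper is hypothetical and is not saturn's code. It also illustrates that a displayed +1.00 only means the coefficient shows as 1.00 at two decimals, not that the two columns are identical.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


r_up = pearson([1, 2, 3, 4], [2, 4, 6, 8])    # perfectly increasing
r_down = pearson([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly decreasing

# A coefficient slightly below 1 still displays as +1.00 at two decimals.
label = f"{0.996:+.2f}"
print(r_up, r_down, label)
```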

name categorical

51 singleton categories

stat	value
rows	51
null	0 (0.0%)
unique	51
top_value	New York
top_rate	0.020
cardinality	51
entropy	5.672
entropy_ratio	1.000

Top values for name (20 unique shown, of 51 total).
value	count	share
New York	1	2.0%
California	1	2.0%
Texas	1	2.0%
Florida	1	2.0%
Pennsylvania	1	2.0%
Illinois	1	2.0%
Ohio	1	2.0%
Michigan	1	2.0%
New Jersey	1	2.0%
Massachusetts	1	2.0%
Georgia	1	2.0%
North Carolina	1	2.0%
Louisiana	1	2.0%
Missouri	1	2.0%
Indiana	1	2.0%
Tennessee	1	2.0%
Washington	1	2.0%
Arizona	1	2.0%
Virginia	1	2.0%
Kentucky	1	2.0%

abbr categorical

51 singleton categories

stat	value
rows	51
null	0 (0.0%)
unique	51
top_value	NY
top_rate	0.020
cardinality	51
entropy	5.672
entropy_ratio	1.000

Top values for abbr (20 unique shown, of 51 total).
value	count	share
NY	1	2.0%
CA	1	2.0%
TX	1	2.0%
FL	1	2.0%
PA	1	2.0%
IL	1	2.0%
OH	1	2.0%
MI	1	2.0%
NJ	1	2.0%
MA	1	2.0%
GA	1	2.0%
NC	1	2.0%
LA	1	2.0%
MO	1	2.0%
IN	1	2.0%
TN	1	2.0%
WA	1	2.0%
AZ	1	2.0%
VA	1	2.0%
KY	1	2.0%

pop numeric

skew=+2.58 7.8% rows beyond 1.5 IQR

stat	value
rows	51
null	0 (0.0%)
unique	51
min	564.000
max	38,643
mean	6,338
median	4,372
std	7,243
q1	1,770
q3	7,285
iqr	5,514
skew	2.583
kurtosis	7.608
n_outliers	4
outlier_rate	0.078
zero_rate	0.000

Histogram bins for pop (median: 4372.0; bin edges rounded to 4 significant figures).
bin	count
564 – 6,004	33
6,004 – 11,440	11
11,440 – 16,880	3
16,880 – 22,320	2
22,320 – 27,760	0
27,760 – 33,200	1
33,200 – 38,640	1
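
The bin edges in these histograms look like seven equal-width intervals spanning each column's min and max. The sketch below reconstructs the pop edges under that assumption; the `equal_width_edges` helper and the bin count of 7 are inferred from the table and are not documented behaviour of saturn.

```python
def equal_width_edges(lo: float, hi: float, bins: int = 7) -> list[float]:
    """Return bins + 1 edges that split [lo, hi] into equal-width intervals."""
    if bins < 1 or hi <= lo:
        raise ValueError("need bins >= 1 and hi > lo")
    width = (hi - lo) / bins
    # Use hi itself for the last edge to avoid floating-point drift.
    return [lo + i * width for i in range(bins)] + [hi]


# min and max reported for pop in this profile.
edges = equal_width_edges(564, 38643)
labels = [f"{a:,.0f} - {b:,.0f}" for a, b in zip(edges, edges[1:])]
print(labels[0])  # 564 - 6,004
```

The same helper applied to another column's min and max should give that column's bins, provided the equal-width assumption holds.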

desertPop numeric

skew=+4.73 11.8% rows beyond 1.5 IQR

stat	value
rows	51
null	0 (0.0%)
unique	34
min	1.000
max	449.000
mean	38.275
median	21.000
std	67.393
q1	6.000
q3	35.500
iqr	29.500
skew	4.734
kurtosis	25.506
n_outliers	6
outlier_rate	0.118
zero_rate	0.000

Histogram bins for desertPop (median: 21.0).
bin	count
1 – 65	44
65 – 129	5
129 – 193	1
193 – 257	0
257 – 321	0
321 – 385	0
385 – 449	1

povertyPop numeric

skew=+2.53 7.8% rows beyond 1.5 IQR

stat	value
rows	51
null	0 (0.0%)
unique	49
min	60.000
max	4,685
mean	794.020
median	548.000
std	932.900
q1	198.000
q3	860.500
iqr	662.500
skew	2.526
kurtosis	6.800
n_outliers	4
outlier_rate	0.078
zero_rate	0.000

Histogram bins for povertyPop (median: 548.0).
bin	count
60 – 720.7	32
720.7 – 1381	11
1381 – 2042	4
2042 – 2703	1
2703 – 3364	1
3364 – 4024	1
4024 – 4685	1

noVehicle numeric

skew=+4.41 11.8% rows beyond 1.5 IQR

stat	value
rows	51
null	0 (0.0%)
unique	45
min	8.000
max	2,202
mean	204.922
median	115.000
std	337.387
q1	40.000
q3	203.000
iqr	163.000
skew	4.408
kurtosis	22.566
n_outliers	6
outlier_rate	0.118
zero_rate	0.000

Histogram bins for noVehicle (median: 115.0).
bin	count
8 – 321.4	42
321.4 – 634.9	7
634.9 – 948.3	1
948.3 – 1262	0
1262 – 1575	0
1575 – 1889	0
1889 – 2202	1

povertyRate numeric

5.9% rows beyond 1.5 IQR

stat	value
rows	51
null	0 (0.0%)
unique	50
min	7.330
max	19.200
mean	12.354
median	11.910
std	2.632
q1	10.460
q3	13.570
iqr	3.110
skew	0.753
kurtosis	0.195
n_outliers	3
outlier_rate	0.059
zero_rate	0.000

Histogram bins for povertyRate (median: 11.91).
bin	count
7.33 – 9.026	2
9.026 – 10.72	14
10.72 – 12.42	14
12.42 – 14.11	11
14.11 – 15.81	4
15.81 – 17.5	3
17.5 – 19.2	3

noVehiclePct numeric

skew=+4.53 5.9% rows beyond 1.5 IQR

stat	value
rows	51
null	0 (0.0%)
unique	45
min	1.290
max	17.370
mean	3.092
median	2.450
std	2.484
q1	2.150
q3	3.050
iqr	0.900
skew	4.533
kurtosis	21.718
n_outliers	3
outlier_rate	0.059
zero_rate	0.000

Histogram bins for noVehiclePct (median: 2.45).
bin	count
1.29 – 3.587	44
3.587 – 5.884	5
5.884 – 8.181	0
8.181 – 10.48	0
10.48 – 12.78	1
12.78 – 15.07	0
15.07 – 17.37	1

counties numeric

stat	value
rows	51
null	0 (0.0%)
unique	46
min	1.000
max	254.000
mean	61.647
median	62.000
std	46.726
q1	23.500
q3	87.500
iqr	64.000
skew	1.442
kurtosis	3.907
n_outliers	1
outlier_rate	0.020
zero_rate	0.000

Histogram bins for counties (median: 62.0).
bin	count
1 – 37.14	18
37.14 – 73.29	15
73.29 – 109.4	13
109.4 – 145.6	3
145.6 – 181.7	1
181.7 – 217.9	0
217.9 – 254	1

lat numeric

stat	value
rows	51
null	0 (0.0%)
unique	51
min	19.900
max	64.200
mean	39.574
median	39.550
std	6.418
q1	35.640
q3	43.135
iqr	7.495
skew	0.407
kurtosis	3.940
n_outliers	2
outlier_rate	0.039
zero_rate	0.000

Histogram bins for lat (median: 39.55).
bin	count
19.9 – 26.23	1
26.23 – 32.56	6
32.56 – 38.89	13
38.89 – 45.21	25
45.21 – 51.54	5
51.54 – 57.87	0
57.87 – 64.2	1

lon numeric

stat	value
rows	51
null	0 (0.0%)
unique	48
min	-155.580
max	-69.450
mean	-93.363
median	-89.400
std	19.125
q1	-103.390
q3	-78.840
iqr	24.550
skew	-1.274
kurtosis	1.845
n_outliers	2
outlier_rate	0.039
zero_rate	0.000

Histogram bins for lon (median: -89.4).
bin	count
-155.6 – -143.3	2
-143.3 – -131	0
-131 – -118.7	3
-118.7 – -106.4	6
-106.4 – -94.06	9
-94.06 – -81.75	14
-81.75 – -69.45	17