saturn

/home/coolhand/html/datavis/data_trove/data/quirky/food_desert_states.json 51 rows sample n=51 seed 42 2026-06-21T23:46:41+00:00

Overview

Source           /home/coolhand/html/datavis/data_trove/data/quirky/food_desert_states.json
Total rows       51
Profiled sample  51
Columns          11
Generated        2026-06-21T23:46:41+00:00
Per-column null rate across the corpus.
column        kind         null %
name          categorical  0.0%
abbr          categorical  0.0%
pop           numeric      0.0%
desertPop     numeric      0.0%
povertyPop    numeric      0.0%
noVehicle     numeric      0.0%
povertyRate   numeric      0.0%
noVehiclePct  numeric      0.0%
counties      numeric      0.0%
lat           numeric      0.0%
lon           numeric      0.0%

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset high anthropic:default

This dataset contains one row per U.S. state plus D.C. (51 rows) with figures on food desert populations, vehicle access, and poverty. The most striking feature is the extreme right skew in desert-exposed population counts: the median desertPop is 21 but the max reaches 449 (the units are not stated; thousands of people is a plausible reading), with 6 outlier states far above the rest. noVehicle shows almost the same pattern, and the two columns correlate at +1.00 to two decimals. Poverty rate, by contrast, is much closer to normal (mean 12.35, std 2.63) and correlates only weakly with desertPop (+0.12), whereas pop correlates at +0.68. This suggests that absolute food desert exposure tracks state size and vehicle access more than poverty rate, which is worth cross-examining. The noVehiclePct column (max 17.37 against a median of 2.45) flags a few states with much higher car-free rates; they may overlap with the desertPop outliers, but the correlation between the two columns is only moderate (+0.44).
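One way to cross-examine the size effect is to divide desertPop by pop so that large states do not dominate by headcount alone. The sketch below assumes the JSON file loads as a list of objects keyed by the column names shown in this profile and that pop and desertPop share the same units; neither is confirmed by the report. The function name `desert_share` and the sample rows are hypothetical placeholders, not values from the dataset.

```python
def desert_share(records):
    """Map each row's abbr to desertPop / pop (same units assumed for both)."""
    return {r["abbr"]: r["desertPop"] / r["pop"] for r in records}


# Hypothetical placeholder rows, NOT values from food_desert_states.json.
sample = [
    {"abbr": "AA", "pop": 1000, "desertPop": 50},
    {"abbr": "BB", "pop": 4000, "desertPop": 100},
]

shares = desert_share(sample)
# Rank units by share rather than by raw count.
ranked = sorted(shares, key=shares.get, reverse=True)
print(ranked)  # ['AA', 'BB']
```

In this toy input the row with the larger raw count (BB) has the smaller share, which is the kind of reordering the normalized view is meant to surface.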

desertPop medium anthropic:default

This column most likely represents the population living in food deserts (the source file is food_desert_states.json), counted per U.S. state plus D.C. given n=51; the units are not stated. The distribution is severely right-skewed (skew=4.73, kurtosis=25.51): the median is 21.0, the mean is 38.27, and the max reaches 449.0, so a small number of states dominate the total. With 6 outliers (≈11.8% of rows) and a standard deviation of 67.39 against a median of 21.0, those extreme values would heavily influence any linear model fitted to the raw values.
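The outlier count can be related to the reported quartiles. The sketch below assumes that "beyond 1.5 IQR" means the standard Tukey fences (q1 - 1.5·IQR and q3 + 1.5·IQR); the report does not spell out the rule saturn applies, and `tukey_fences` is an illustrative helper, not part of saturn.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for desertPop in this profile: q1=6.0, q3=35.5.
lower, upper = tukey_fences(6.0, 35.5)
print(lower, upper)  # -38.25 79.75
```

Under that assumption, any desertPop value above 79.75 counts as an outlier, and the lower fence is negative, so no low-end outliers are possible for this column.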

noVehicle medium anthropic:default

This column likely represents a count of households or individuals without access to a vehicle, aggregated per U.S. state plus D.C. (51 observations); the units are not stated. The distribution is severely right-skewed (skew = 4.41, kurtosis = 22.57), with a median of 115 but a mean pulled to 204.9 by a long upper tail reaching 2,202. Six outliers (≈11.8% of rows) drive this shape, suggesting that a few populous or low-car-ownership states dominate the upper end, while the middle half of states falls between 40 and 203 (IQR = 163).

noVehiclePct high anthropic:default

This column represents the percentage of households without a vehicle, likely a census- or survey-derived indicator, across 51 geographic units (U.S. states plus D.C., judging by the name and abbr columns). The distribution is heavily right-skewed (skew=4.53, kurtosis=21.72): the middle half of values sits between Q1=2.15% and Q3=3.05%, yet 3 outliers stretch the range to a max of 17.37%, about 7× the median of 2.45%. The profile does not identify which rows are outliers; heavily urban units such as D.C. or New York are plausible candidates.

pop medium anthropic:default

This column likely represents population counts for the 50 U.S. states plus D.C., given exactly 51 unique, non-null values. The units are not stated; the range (564 to 38,643) is consistent with thousands of people. The distribution is heavily right-skewed (skew = 2.58, kurtosis = 7.61), with a median of 4,372 well below the mean of 6,338 and a maximum of 38,643, so a few very populous states stretch the upper tail; 4 outliers (~7.8% of rows) account for this. The std of 7,243 exceeds the mean, which indicates high dispersion relative to the central tendency.

povertyPop high anthropic:default

This column likely represents a count of people living in poverty, measured per U.S. state (n=51, matching the 50 states plus DC). The distribution is heavily right-skewed (skew=2.53, kurtosis=6.80), with a median of 548 but a mean of 794 and a maximum of 4685, indicating a small number of high-population states pull the mean well above the typical value. Four outliers (~7.8% of rows) are flagged, likely corresponding to the most populous states with the largest absolute poverty populations. The near-uniqueness (49 of 51 distinct values) suggests this is a genuine count variable, not a derived category.

povertyRate high anthropic:default

This column represents poverty rate (likely percentage of population below the poverty line) across 51 observations — almost certainly U.S. states plus DC. Values range from 7.33 to 19.2 with a mean of 12.35 and median of 11.91, indicating a modest right skew (skew=0.75) consistent with a handful of higher-poverty states pulling the tail. Three outliers (~5.9% of rows) at the upper end are flagged, likely representing the highest-poverty states; the near-zero kurtosis (0.20) suggests the distribution is otherwise fairly normal.

abbr high anthropic:default

This column contains two-letter US state abbreviations, with exactly 51 unique values across 51 rows — covering all 50 states plus one additional entry (likely Washington D.C. or a territory). Every value appears exactly once (top_rate = 0.0196), yielding a perfect entropy ratio of 1.0, meaning this is a fully uniform identifier with zero redundancy. The 'long_tail' alert is a statistical artifact of perfect uniformity, not a genuine concern here.

name high anthropic:default

This column contains US state names, with all 51 entries being unique (cardinality = 51, n = 51), consistent with a full list of US states plus Washington D.C. or a territory. Entropy ratio is exactly 1.0, meaning perfect uniformity — every value appears exactly once (top_rate = 0.0196, or 1/51). The 'long_tail' alert is technically correct but misleading here: the distribution is not skewed, it is perfectly flat.
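The entropy figures for name and abbr follow directly from having 51 values that each appear once. The sketch below assumes Shannon entropy in bits and an entropy ratio defined as entropy divided by log2(cardinality); the report does not state saturn's exact definitions, and `entropy_stats` and the placeholder labels are illustrative only.

```python
import math
from collections import Counter


def entropy_stats(values):
    """Return (entropy in bits, entropy / log2(cardinality), top_rate)."""
    counts = Counter(values)
    n = len(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    max_entropy = math.log2(len(counts))
    # With a single category max_entropy is 0; report a ratio of 0.0 then.
    ratio = entropy / max_entropy if max_entropy > 0 else 0.0
    top_rate = max(counts.values()) / n
    return entropy, ratio, top_rate


# 51 distinct placeholder labels standing in for the state names.
labels = [f"state_{i}" for i in range(51)]
entropy, ratio, top_rate = entropy_stats(labels)
print(round(entropy, 3), round(ratio, 3), round(top_rate, 4))  # 5.672 1.0 0.0196
```

These match the reported entropy (5.672), entropy_ratio (1.000), and top_rate (1/51 ≈ 0.0196), which supports reading the columns as flat identifiers and not as skewed categories.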

counties high anthropic:default

This column most likely represents the number of counties per U.S. state (plus D.C.), matching the dataset's 51 rows exactly. The mean of ~62 and median of 62 are consistent with typical state county counts, while the maximum of 254 is almost certainly Texas (which has 254 counties). The distribution is right-skewed (skew 1.44) with high kurtosis (3.91), driven by that single outlier — Texas — which sits far above the rest of the distribution.

lat high anthropic:default

This column contains latitude coordinates, almost certainly representing the 50 US states plus Washington D.C. (n=51, all unique). The mean of 39.57 and median of 39.55 are tightly aligned, indicating near-symmetric distribution centered on the mid-continental US, though a kurtosis of 3.94 flags heavier tails than normal — driven by the 2 outliers likely corresponding to Alaska (max 64.2) and Hawaii (min 19.9).

lon high anthropic:default

This column contains longitude coordinates for the 51 entities (U.S. states plus D.C.), all negative, which places them in the Western Hemisphere. The range spans -155.58 to -69.45, consistent with the continental U.S. plus Hawaii (≈-155°), and the left skew (skew = -1.27) reflects Hawaii and Alaska pulling the distribution westward. Only 48 of the 51 values are distinct, so a few longitudes repeat. The 2 outliers (~3.9%) are the two values in the westernmost histogram bin (-155.6 to -143.3), most likely Hawaii and Alaska.

Numeric correlation

Pearson correlation across 9 numeric columns (values clipped to 2 decimals).
              pop    desertPop  povertyPop  noVehicle  povertyRate  noVehiclePct  counties  lat    lon
pop           +1.00  +0.68      +0.99       +0.70      +0.06        +0.05         +0.45     -0.25  +0.03
desertPop     +0.68  +1.00      +0.69       +1.00      +0.12        +0.44         +0.20     -0.07  +0.16
povertyPop    +0.99  +0.69      +1.00       +0.70      +0.17        +0.06         +0.50     -0.30  +0.04
noVehicle     +0.70  +1.00      +0.70       +1.00      +0.06        +0.44         +0.19     -0.05  +0.17
povertyRate   +0.06  +0.12      +0.17       +0.06      +1.00        +0.14         +0.28     -0.43  +0.10
noVehiclePct  +0.05  +0.44      +0.06       +0.44      +0.14        +1.00         -0.22     +0.07  +0.25
counties      +0.45  +0.20      +0.50       +0.19      +0.28        -0.22         +1.00     -0.22  +0.07
lat           -0.25  -0.07      -0.30       -0.05      -0.43        +0.07         -0.22     +1.00  -0.09
lon           +0.03  +0.16      +0.04       +0.17      +0.10        +0.25         +0.07     -0.09  +1.00
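For readers who want to reproduce a single cell of the matrix, a minimal sketch of the Pearson coefficient follows. It uses the textbook definition (covariance divided by the product of standard deviations); how saturn computes it internally is not stated in the report. The function name `pearson` and the toy inputs are illustrative and are not values from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy inputs: ys is an exact linear rescaling of xs.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(round(pearson(xs, ys), 2))        # 1.0
print(round(pearson(xs, ys[::-1]), 2))  # -1.0
```

A coefficient that rounds to +1.00, as reported for desertPop against noVehicle, means one column is very nearly a linear function of the other, so the two carry almost the same information.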

name categorical

51 singleton categories
rows           51
null           0 (0.0%)
unique         51
top_value      New York
top_rate       0.020
cardinality    51
entropy        5.672
entropy_ratio  1.000
Top values for name (20 unique shown, of 51 total).
value           count  share
New York        1      2.0%
California      1      2.0%
Texas           1      2.0%
Florida         1      2.0%
Pennsylvania    1      2.0%
Illinois        1      2.0%
Ohio            1      2.0%
Michigan        1      2.0%
New Jersey      1      2.0%
Massachusetts   1      2.0%
Georgia         1      2.0%
North Carolina  1      2.0%
Louisiana       1      2.0%
Missouri        1      2.0%
Indiana         1      2.0%
Tennessee       1      2.0%
Washington      1      2.0%
Arizona         1      2.0%
Virginia        1      2.0%
Kentucky        1      2.0%

abbr categorical

51 singleton categories
rows           51
null           0 (0.0%)
unique         51
top_value      NY
top_rate       0.020
cardinality    51
entropy        5.672
entropy_ratio  1.000
Top values for abbr (20 unique shown, of 51 total).
value  count  share
NY     1      2.0%
CA     1      2.0%
TX     1      2.0%
FL     1      2.0%
PA     1      2.0%
IL     1      2.0%
OH     1      2.0%
MI     1      2.0%
NJ     1      2.0%
MA     1      2.0%
GA     1      2.0%
NC     1      2.0%
LA     1      2.0%
MO     1      2.0%
IN     1      2.0%
TN     1      2.0%
WA     1      2.0%
AZ     1      2.0%
VA     1      2.0%
KY     1      2.0%

pop numeric

skew=+2.58 7.8% rows beyond 1.5 IQR
rows          51
null          0 (0.0%)
unique        51
min           564.000
max           38,643
mean          6,338
median        4,372
std           7,243
q1            1,770
q3            7,285
iqr           5,514
skew          2.583
kurtosis      7.608
n_outliers    4
outlier_rate  0.078
zero_rate     0.000
Histogram bins for pop (median: 4372.0).
bin                    count
564 – 6004             33
6004 – 1.144e+04       11
1.144e+04 – 1.688e+04  3
1.688e+04 – 2.232e+04  2
2.232e+04 – 2.776e+04  0
2.776e+04 – 3.32e+04   1
3.32e+04 – 3.864e+04   1
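The bin boundaries in these histogram tables are consistent with 7 equal-width bins between the column's min and max. The sketch below reproduces the pop edges under that assumption; the binning rule is inferred from the numbers and is not documented in the report, and `equal_width_edges` is an illustrative helper. The four-significant-digit formatting is likewise a guess that happens to match the labels shown.

```python
def equal_width_edges(lo, hi, bins=7):
    """Return bins + 1 evenly spaced edges from lo to hi inclusive."""
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


# min and max reported for pop in this profile.
edges = equal_width_edges(564, 38643)
labels = [f"{e:.4g}" for e in edges]
print(labels)
# ['564', '6004', '1.144e+04', '1.688e+04', '2.232e+04',
#  '2.776e+04', '3.32e+04', '3.864e+04']
```

Because the bins are equal-width on a heavily skewed column, most rows land in the first bin; a log-scaled axis would spread them out, at the cost of bins that are harder to read.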

desertPop numeric

skew=+4.73 11.8% rows beyond 1.5 IQR
rows          51
null          0 (0.0%)
unique        34
min           1.000
max           449.000
mean          38.275
median        21.000
std           67.393
q1            6.000
q3            35.500
iqr           29.500
skew          4.734
kurtosis      25.506
n_outliers    6
outlier_rate  0.118
zero_rate     0.000
Histogram bins for desertPop (median: 21.0).
bin        count
1 – 65     44
65 – 129   5
129 – 193  1
193 – 257  0
257 – 321  0
321 – 385  0
385 – 449  1

povertyPop numeric

skew=+2.53 7.8% rows beyond 1.5 IQR
rows          51
null          0 (0.0%)
unique        49
min           60.000
max           4,685
mean          794.020
median        548.000
std           932.900
q1            198.000
q3            860.500
iqr           662.500
skew          2.526
kurtosis      6.800
n_outliers    4
outlier_rate  0.078
zero_rate     0.000
Histogram bins for povertyPop (median: 548.0).
bin           count
60 – 720.7    32
720.7 – 1381  11
1381 – 2042   4
2042 – 2703   1
2703 – 3364   1
3364 – 4024   1
4024 – 4685   1

noVehicle numeric

skew=+4.41 11.8% rows beyond 1.5 IQR
rows          51
null          0 (0.0%)
unique        45
min           8.000
max           2,202
mean          204.922
median        115.000
std           337.387
q1            40.000
q3            203.000
iqr           163.000
skew          4.408
kurtosis      22.566
n_outliers    6
outlier_rate  0.118
zero_rate     0.000
Histogram bins for noVehicle (median: 115.0).
bin            count
8 – 321.4      42
321.4 – 634.9  7
634.9 – 948.3  1
948.3 – 1262   0
1262 – 1575    0
1575 – 1889    0
1889 – 2202    1

povertyRate numeric

5.9% rows beyond 1.5 IQR
rows          51
null          0 (0.0%)
unique        50
min           7.330
max           19.200
mean          12.354
median        11.910
std           2.632
q1            10.460
q3            13.570
iqr           3.110
skew          0.753
kurtosis      0.195
n_outliers    3
outlier_rate  0.059
zero_rate     0.000
Histogram bins for povertyRate (median: 11.91).
bin            count
7.33 – 9.026   2
9.026 – 10.72  14
10.72 – 12.42  14
12.42 – 14.11  11
14.11 – 15.81  4
15.81 – 17.5   3
17.5 – 19.2    3

noVehiclePct numeric

skew=+4.53 5.9% rows beyond 1.5 IQR
rows          51
null          0 (0.0%)
unique        45
min           1.290
max           17.370
mean          3.092
median        2.450
std           2.484
q1            2.150
q3            3.050
iqr           0.900
skew          4.533
kurtosis      21.718
n_outliers    3
outlier_rate  0.059
zero_rate     0.000
Histogram bins for noVehiclePct (median: 2.45).
bin            count
1.29 – 3.587   44
3.587 – 5.884  5
5.884 – 8.181  0
8.181 – 10.48  0
10.48 – 12.78  1
12.78 – 15.07  0
15.07 – 17.37  1

counties numeric

rows          51
null          0 (0.0%)
unique        46
min           1.000
max           254.000
mean          61.647
median        62.000
std           46.726
q1            23.500
q3            87.500
iqr           64.000
skew          1.442
kurtosis      3.907
n_outliers    1
outlier_rate  0.020
zero_rate     0.000
Histogram bins for counties (median: 62.0).
bin            count
1 – 37.14      18
37.14 – 73.29  15
73.29 – 109.4  13
109.4 – 145.6  3
145.6 – 181.7  1
181.7 – 217.9  0
217.9 – 254    1

lat numeric

rows          51
null          0 (0.0%)
unique        51
min           19.900
max           64.200
mean          39.574
median        39.550
std           6.418
q1            35.640
q3            43.135
iqr           7.495
skew          0.407
kurtosis      3.940
n_outliers    2
outlier_rate  0.039
zero_rate     0.000
Histogram bins for lat (median: 39.55).
bin            count
19.9 – 26.23   1
26.23 – 32.56  6
32.56 – 38.89  13
38.89 – 45.21  25
45.21 – 51.54  5
51.54 – 57.87  0
57.87 – 64.2   1

lon numeric

rows          51
null          0 (0.0%)
unique        48
min           -155.580
max           -69.450
mean          -93.363
median        -89.400
std           19.125
q1            -103.390
q3            -78.840
iqr           24.550
skew          -1.274
kurtosis      1.845
n_outliers    2
outlier_rate  0.039
zero_rate     0.000
Histogram bins for lon (median: -89.4).
bin               count
-155.6 – -143.3   2
-143.3 – -131     0
-131 – -118.7     3
-118.7 – -106.4   6
-106.4 – -94.06   9
-94.06 – -81.75   14
-81.75 – -69.45   17