saturn·

merged inequality master

saturn notebook · generated 2026-05-01 Report Notebook

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/merged/inequality_master.csv

Saturn profiled 3,222 rows across 28 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
!pip install saturn-dissect
import subprocess
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/us-inequality-atlas/merged/inequality_master.csv",
    "--findings", "merged-inequality_master.json",
    "--llm", "anthropic:claude-opus-4-7",
])

Summary confidence: high

This dataset profiles 3,222 U.S. counties across 28 columns of socioeconomic indicators, including poverty, rent burden, education, healthcare, and a composite inequality index. Two things stand out for closer inspection. First, rent_to_income_ratio shows extreme skew (53.98) with a max of 1200 against a median of 17.06; the histogram puts 3,207 values below 36 and a single value in the top bin (1170 to 1200), which points to a data-entry anomaly or an isolated outlier rather than a long tail. Second, total population is also highly skewed (skew 13.36, max ~9.78M vs median 25,174), so any aggregate across counties should be population-weighted. The composite_index and the *_score columns are well-behaved and centered near 50, making them good candidates for cross-county comparison. Texas (254 counties), Georgia (159), and Virginia (133) contribute the most rows, though no state exceeds 7.9% of the total.

citing: rent_to_income_ratio · total_pop · composite_index · pct_poverty · state · pct_rent_burdened_30 · uninsured_rate
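The advice to population-weight aggregates across counties can be illustrated with a minimal sketch. The helper name `weighted_mean` and the two rows are hypothetical and not taken from the dataset; they only show how far a weighted and an unweighted mean can diverge when populations differ.

```python
def weighted_mean(values, weights):
    """Return sum(v * w) / sum(w); raises if the weights sum to zero."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Illustrative rows only (hypothetical counties, not from the dataset):
# each tuple is (pct_poverty, total_pop).
rows = [(30.0, 1_000), (10.0, 9_000)]
rates = [rate for rate, _ in rows]
pops = [pop for _, pop in rows]

unweighted = sum(rates) / len(rates)   # every county counts equally
weighted = weighted_mean(rates, pops)  # every resident counts equally
```

The unweighted figure describes the typical county, while the weighted one describes the typical resident; which is appropriate depends on the question being asked.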

Out[4]:

saturn.schema() · 28 columns

column kind n null% unique alerts
fips numeric 3,222 0.0% 3,222
county_name text 3,222 0.0% 3,222 near_unique
state categorical 3,222 0.0% 52
total_pop numeric 3,222 0.0% 3,173 high_skew outliers
composite_index numeric 3,222 0.0% 650
economic_score numeric 3,222 0.0% 908
education_score numeric 3,222 0.0% 1,001
healthcare_score numeric 3,222 0.0% 808
housing_score numeric 3,222 0.0% 937
food_score numeric 3,222 0.0% 941
disability_score numeric 3,222 0.0% 1,001
poverty_rate numeric 3,222 0.0% 1,719 high_skew
no_vehicle_pct numeric 3,222 0.0% 1,065 high_skew
uninsured_rate numeric 3,222 0.0% 152 high_skew outliers
hospital_closure_risk numeric 3,222 0.0% 3
pct_rent_burdened_30 numeric 3,222 0.0% 2,146
pct_rent_burdened_50 numeric 3,222 0.0% 1,769
median_gross_rent numeric 3,222 0.3% 983 outliers
rent_to_income_ratio numeric 3,222 0.3% 1,269 high_skew
gini_index numeric 3,222 0.0% 1,317
unemployment_rate numeric 3,222 0.0% 950 high_skew
labor_force_participation numeric 3,222 0.0% 1,944
pct_deep_poverty numeric 3,222 0.0% 1,131 high_skew outliers
pct_poverty numeric 3,222 0.0% 1,719 high_skew
pct_near_poverty numeric 3,222 0.0% 1,237
pct_hs_or_higher numeric 3,222 0.0% 1,612
pct_bachelors_or_higher numeric 3,222 0.0% 1,982
disability_rate numeric 3,222 0.0% 305 high_skew
Fig 1.
composite_index · Roughly symmetric spread from ~10 to ~90 around a median of 49.5 — a usable summary score for ranking counties.
Histogram bins for composite_index (median: 49.5).
bin  count
10.1 – 12.1  1
12.1 – 14.1  2
14.1 – 16.1  6
16.1 – 18.1  15
18.1 – 20.1  13
20.1 – 22.1  27
22.1 – 24.1  50
24.1 – 26.1  46
26.1 – 28.1  78
28.1 – 30.1  81
30.1 – 32.1  101
32.1 – 34.1  107
34.1 – 36.1  118
36.1 – 38.1  139
38.1 – 40.1  142
40.1 – 42.1  128
42.1 – 44.1  155
44.1 – 46.1  160
46.1 – 48.1  151
48.1 – 50.1  142
50.1 – 52.1  162
52.1 – 54.1  165
54.1 – 56.1  115
56.1 – 58.1  122
58.1 – 60.1  113
60.1 – 62.1  116
62.1 – 64.1  108
64.1 – 66.1  109
66.1 – 68.1  81
68.1 – 70.1  103
70.1 – 72.1  65
72.1 – 74.1  81
74.1 – 76.1  70
76.1 – 78.1  47
78.1 – 80.1  38
80.1 – 82.1  24
82.1 – 84.1  18
84.1 – 86.1  13
86.1 – 88.1  3
88.1 – 90.1  7
Fig 2.
pct_poverty · Right-skewed poverty rates (median 13.6%, max 66.3%) highlight a long tail of high-poverty counties.
Histogram bins for pct_poverty (median: 13.55).
bin  count
1.6 – 3.218  7
3.218 – 4.836  34
4.836 – 6.454  106
6.454 – 8.072  246
8.072 – 9.69  320
9.69 – 11.31  354
11.31 – 12.93  393
12.93 – 14.54  364
14.54 – 16.16  306
16.16 – 17.78  262
17.78 – 19.4  192
19.4 – 21.02  149
21.02 – 22.63  123
22.63 – 24.25  91
24.25 – 25.87  52
25.87 – 27.49  44
27.49 – 29.11  34
29.11 – 30.72  23
30.72 – 32.34  18
32.34 – 33.96  14
33.96 – 35.58  6
35.58 – 37.2  8
37.2 – 38.81  3
38.81 – 40.43  8
40.43 – 42.05  5
42.05 – 43.67  9
43.67 – 45.29  4
45.29 – 46.9  11
46.9 – 48.52  7
48.52 – 50.14  8
50.14 – 51.76  2
51.76 – 53.38  6
53.38 – 54.99  5
54.99 – 56.61  5
56.61 – 58.23  1
58.23 – 59.85  0
59.85 – 61.47  0
61.47 – 63.08  0
63.08 – 64.7  1
64.7 – 66.32  1
Fig 3.
rent_to_income_ratio · Watch for extreme outliers — max of 1200 vs median 17 suggests data quality issues worth filtering before analysis.
Histogram bins for rent_to_income_ratio (median: 17.06).
bin  count
6.1 – 35.95  3207
35.95 – 65.8  5
65.8 – 95.64  0
95.64 – 125.5  0
125.5 – 155.3  0
155.3 – 185.2  0
185.2 – 215  0
215 – 244.9  0
244.9 – 274.7  0
274.7 – 304.6  0
304.6 – 334.4  0
334.4 – 364.3  0
364.3 – 394.1  0
394.1 – 424  0
424 – 453.8  0
453.8 – 483.7  0
483.7 – 513.5  0
513.5 – 543.4  0
543.4 – 573.2  0
573.2 – 603.1  0
603.1 – 632.9  0
632.9 – 662.7  0
662.7 – 692.6  0
692.6 – 722.4  0
722.4 – 752.3  0
752.3 – 782.1  0
782.1 – 812  0
812 – 841.8  0
841.8 – 871.7  0
871.7 – 901.5  0
901.5 – 931.4  0
931.4 – 961.2  0
961.2 – 991.1  0
991.1 – 1021  0
1021 – 1051  0
1051 – 1081  0
1081 – 1110  0
1110 – 1140  0
1140 – 1170  0
1170 – 1200  1
Fig 4.
state · Coverage is national but Texas (254), Georgia, and Virginia contribute the most counties; weight comparisons accordingly.
Top values for state (20 unique shown, of 52 total).
value  count  share
TX  254  7.9%
GA  159  4.9%
VA  133  4.1%
KY  120  3.7%
MO  115  3.6%
KS  105  3.3%
IL  102  3.2%
NC  100  3.1%
IA  99  3.1%
TN  95  2.9%
NE  93  2.9%
IN  92  2.9%
OH  88  2.7%
MN  87  2.7%
MI  83  2.6%
MS  82  2.5%
PR  78  2.4%
OK  77  2.4%
AR  75  2.3%
WI  72  2.2%
Fig 5.
pct_rent_burdened_30 · Near-symmetric distribution centered around 37% shows rent burden is widespread, not just a tail phenomenon.
Histogram bins for pct_rent_burdened_30 (median: 37.36).
bin  count
0 – 1.624  9
1.624 – 3.248  5
3.248 – 4.872  3
4.872 – 6.496  5
6.496 – 8.12  9
8.12 – 9.744  13
9.744 – 11.37  11
11.37 – 12.99  16
12.99 – 14.62  26
14.62 – 16.24  19
16.24 – 17.86  35
17.86 – 19.49  43
19.49 – 21.11  52
21.11 – 22.74  52
22.74 – 24.36  73
24.36 – 25.98  99
25.98 – 27.61  109
27.61 – 29.23  116
29.23 – 30.86  132
30.86 – 32.48  159
32.48 – 34.1  189
34.1 – 35.73  209
35.73 – 37.35  227
37.35 – 38.98  239
38.98 – 40.6  205
40.6 – 42.22  209
42.22 – 43.85  210
43.85 – 45.47  190
45.47 – 47.1  131
47.1 – 48.72  114
48.72 – 50.34  118
50.34 – 51.97  69
51.97 – 53.59  51
53.59 – 55.22  34
55.22 – 56.84  24
56.84 – 58.46  6
58.46 – 60.09  3
60.09 – 61.71  2
61.71 – 63.34  3
63.34 – 64.96  3
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
Per-column null rate across the corpus.
column kind null%
fips numeric 0.0%
county_name text 0.0%
state categorical 0.0%
total_pop numeric 0.0%
composite_index numeric 0.0%
economic_score numeric 0.0%
education_score numeric 0.0%
healthcare_score numeric 0.0%
housing_score numeric 0.0%
food_score numeric 0.0%
disability_score numeric 0.0%
poverty_rate numeric 0.0%
no_vehicle_pct numeric 0.0%
uninsured_rate numeric 0.0%
hospital_closure_risk numeric 0.0%
pct_rent_burdened_30 numeric 0.0%
pct_rent_burdened_50 numeric 0.0%
median_gross_rent numeric 0.3%
rent_to_income_ratio numeric 0.3%
gini_index numeric 0.0%
unemployment_rate numeric 0.0%
labor_force_participation numeric 0.0%
pct_deep_poverty numeric 0.0%
pct_poverty numeric 0.0%
pct_near_poverty numeric 0.0%
pct_hs_or_higher numeric 0.0%
pct_bachelors_or_higher numeric 0.0%
disability_rate numeric 0.0%
Fig 7.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 12 numeric columns (values clipped to 2 decimals).
column fips total_pop composite_index economic_score education_score healthcare_score housing_score food_score disability_score poverty_rate no_vehicle_pct uninsured_rate
fips +1.00 -0.07 +0.03 +0.01 -0.06 +0.30 -0.11 +0.02 -0.04 +0.16 +0.04 +0.01
total_pop -0.07 +1.00 -0.06 +0.07 -0.29 -0.22 +0.29 -0.01 -0.09 -0.11 +0.09 -0.04
composite_index +0.03 -0.06 +1.00 +0.84 +0.60 +0.44 +0.40 +0.85 +0.43 +0.76 +0.47 +0.13
economic_score +0.01 +0.07 +0.84 +1.00 +0.28 +0.16 +0.48 +0.75 +0.17 +0.72 +0.41 -0.05
education_score -0.06 -0.29 +0.60 +0.28 +1.00 +0.38 -0.19 +0.43 +0.27 +0.43 +0.19 +0.13
healthcare_score +0.30 -0.22 +0.44 +0.16 +0.38 +1.00 -0.25 +0.22 +0.11 +0.30 +0.11 +0.57
housing_score -0.11 +0.29 +0.40 +0.48 -0.19 -0.25 +1.00 +0.33 +0.03 +0.24 +0.19 -0.18
food_score +0.02 -0.01 +0.85 +0.75 +0.43 +0.22 +0.33 +1.00 +0.23 +0.80 +0.64 +0.00
disability_score -0.04 -0.09 +0.43 +0.17 +0.27 +0.11 +0.03 +0.23 +1.00 +0.15 +0.12 +0.06
poverty_rate +0.16 -0.11 +0.76 +0.72 +0.43 +0.30 +0.24 +0.80 +0.15 +1.00 +0.45 -0.04
no_vehicle_pct +0.04 +0.09 +0.47 +0.41 +0.19 +0.11 +0.19 +0.64 +0.12 +0.45 +1.00 +0.15
uninsured_rate +0.01 -0.04 +0.13 -0.05 +0.13 +0.57 -0.18 +0.00 +0.06 -0.04 +0.15 +1.00
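For readers who want to spot-check an entry in the matrix, the Pearson coefficient can be sketched in plain Python. This is a textbook implementation under the usual definition, not saturn's own code (the report describes its version as sampled and bounded, which this sketch does not reproduce); `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centered products; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Correlations involving fips are best ignored, since the column is an identifier rather than a measurement.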

fips numeric identifier

This is the FIPS code identifying U.S. counties (or equivalents), with values spanning 1001 to 72153 and exactly one row per code (3222 unique out of 3222). The distribution is roughly symmetric (skew 0.16, kurtosis -0.63) with no nulls or outliers, consistent with a structured geographic key rather than a measured quantity. Treat the numeric stats as incidental—the magnitude has no analytic meaning.

Treatment: Cast to zero-padded string and use as a join key to county-level reference data.

anthropic:claude-opus-4-7 · confidence high
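The treatment above (cast to a zero-padded string) might look like the following minimal sketch. It assumes the standard five-character county FIPS layout, with the first two characters identifying the state; `fips_to_key` is a hypothetical helper name.

```python
def fips_to_key(code):
    """Format a numeric county FIPS code as a 5-character, zero-padded string."""
    return str(int(code)).zfill(5)

# The reported min and max of the column.
low_key = fips_to_key(1001)
high_key = fips_to_key(72153)
state_part = low_key[:2]  # leading two characters: state-level code
```

Keeping the key as a string avoids losing the leading zero again when the data is written back to CSV and re-read.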
Out[13]:

saturn.columns["fips"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
min 1,001
max 72,153
mean 3.138e+04
median 30,022
std 1.63e+04
q1 1.903e+04
q3 4.61e+04
iqr 27,075
skew 0.1574
kurtosis -0.6314
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 8.
Distribution of fips. Vertical dash marks the median.
Histogram bins for fips (median: 30022.0).
bin  count
1001 – 2780  97
2780 – 4559  15
4559 – 6337  133
6337 – 8116  59
8116 – 9895  14
9895 – 1.167e+04  4
1.167e+04 – 1.345e+04  226
1.345e+04 – 1.523e+04  5
1.523e+04 – 1.701e+04  49
1.701e+04 – 1.879e+04  189
1.879e+04 – 2.057e+04  204
2.057e+04 – 2.235e+04  184
2.235e+04 – 2.413e+04  39
2.413e+04 – 2.59e+04  15
2.59e+04 – 2.768e+04  170
2.768e+04 – 2.946e+04  196
2.946e+04 – 3.124e+04  150
3.124e+04 – 3.302e+04  27
3.302e+04 – 3.48e+04  21
3.48e+04 – 3.658e+04  95
3.658e+04 – 3.836e+04  153
3.836e+04 – 4.013e+04  155
4.013e+04 – 4.191e+04  46
4.191e+04 – 4.369e+04  67
4.369e+04 – 4.547e+04  51
4.547e+04 – 4.725e+04  161
4.725e+04 – 4.903e+04  268
4.903e+04 – 5.081e+04  29
5.081e+04 – 5.259e+04  133
5.259e+04 – 5.436e+04  94
5.436e+04 – 5.614e+04  95
5.614e+04 – 5.792e+04  0
5.792e+04 – 5.97e+04  0
5.97e+04 – 6.148e+04  0
6.148e+04 – 6.326e+04  0
6.326e+04 – 6.504e+04  0
6.504e+04 – 6.682e+04  0
6.682e+04 – 6.86e+04  0
6.86e+04 – 7.037e+04  0
7.037e+04 – 7.215e+04  78

county_name text identifier

This column appears to be a fully-qualified US county name (e.g., 'X County, State'), with all 3222 values unique and zero nulls. The token 'county,' appears in 2999 of 3222 rows, suggesting ~223 entries use a different administrative suffix (parish, borough, census area). State-name token frequencies (Texas 256, Virginia 189, Georgia 159) are close to the row counts in the state column (TX 254, VA 133, GA 159); the excess for Virginia is presumably because the token also matches 'West Virginia'. Name length ranges from 16 to 59 characters, with a median of 24.

Treatment: Use as a join key to county-level reference tables; do not feed as a feature.

anthropic:claude-opus-4-7 · confidence high
Out[16]:

saturn.columns["county_name"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
len_min 16
len_max 59
len_mean 24.32
len_median 24
len_p95 31
word_mean 3.248
word_median 3
n_empty 0
n_duplicates 0
duplicate_rate 0
vocab_size 1,990
readability_flesch_mean 10.28
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 100.0% of rows are unique strings
Fig 9.
Character-length distribution for county_name.
Character-length distribution for county_name (mean: 24.32).
chars  count
16 – 17  26
17 – 18  72
18 – 19  121
19 – 20  190
20 – 21  264
21 – 22  407
22 – 24  420
24 – 25  363
25 – 26  320
26 – 27  240
27 – 28  231
28 – 29  152
29 – 30  139
30 – 31  165
31 – 32  41
32 – 33  28
33 – 34  16
34 – 35  10
35 – 36  5
36 – 38  0
38 – 39  1
39 – 40  1
40 – 41  0
41 – 42  1
42 – 43  1
43 – 44  0
44 – 45  2
45 – 46  0
46 – 47  1
47 – 48  1
48 – 49  0
49 – 50  0
50 – 51  0
51 – 53  0
53 – 54  2
54 – 55  1
55 – 56  0
56 – 57  0
57 – 58  0
58 – 59  1

state categorical feature

This is a US state code column with 52 distinct values, consistent with the 50 states plus DC and Puerto Rico (PR appears in the top values with 78 rows). The distribution is broad, with near-maximal entropy (entropy_ratio 0.932): TX leads with just 254 of 3222 rows (7.88%), followed by GA, VA, KY, and MO. This pattern suggests one row per US county or similar geographic unit rather than a population-weighted sample. No nulls.

Treatment: Use as a categorical grouping key; one-hot or target-encode for modelling.

anthropic:claude-opus-4-7 · confidence high
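The entropy_ratio cited above can be reproduced under one assumption: that saturn reports Shannon entropy in bits and normalizes it by log2 of the cardinality. The report does not state this, but the figures in the stats table (entropy 5.314 over 52 categories, ratio 0.9322) are consistent with it. The function names below are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(labels):
    """Shannon entropy of a sequence of category labels, in bits."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_ratio(labels):
    """Entropy divided by its maximum, log2(k), for k distinct labels."""
    k = len(set(labels))
    if k < 2:
        return 0.0  # a single category carries no information
    return entropy_bits(labels) / math.log2(k)

# Assumed normalization applied to the reported figures:
# 5.314 bits over 52 categories.
reported_ratio = 5.314 / math.log2(52)
```

A ratio of 1.0 would mean every state contributes the same number of rows; 0.93 reflects the spread between Texas (254 rows) and the smallest states.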
Out[19]:

saturn.columns["state"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 52
top_value TX
top_rate 0.07883
cardinality 52
entropy 5.314
entropy_ratio 0.9322
Fig 10.
Top values for state.
Top values for state (20 unique shown, of 52 total).
value  count  share
TX  254  7.9%
GA  159  4.9%
VA  133  4.1%
KY  120  3.7%
MO  115  3.6%
KS  105  3.3%
IL  102  3.2%
NC  100  3.1%
IA  99  3.1%
TN  95  2.9%
NE  93  2.9%
IN  92  2.9%
OH  88  2.7%
MN  87  2.7%
MI  83  2.6%
MS  82  2.5%
PR  78  2.4%
OK  77  2.4%
AR  75  2.3%
WI  72  2.2%

total_pop numeric feature

This is a population count column with 3222 records and 3173 unique values, no nulls or zeros, ranging from 47 to 9,782,602. The distribution is extremely right-skewed (skew 13.36, kurtosis 297.59) with the mean (101,340) about four times the median (25,174), and 449 outliers (13.9%) sit beyond the 1.5 IQR fence. The shape is consistent with US county- or municipality-level populations where a few large metros dominate.

Treatment: log-transform before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
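A minimal sketch of the suggested log transform, applied to the reported min, median, and max of total_pop. Base 10 is an arbitrary choice made here for readability (the report does not specify a base), and a plain logarithm is only safe because the column has no zeros; `log10_transform` is a hypothetical helper name.

```python
import math

def log10_transform(values):
    """Base-10 log of strictly positive values; rejects zeros and negatives."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log10(v) for v in values]

# Reported min, median and max of total_pop.
raw = [47, 25_174, 9_782_602]
logged = log10_transform(raw)
```

On the log scale the three values sit at roughly 1.7, 4.4, and 7.0, so the median lands near the middle of the range instead of being crushed against the lower end.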
Out[22]:

saturn.columns["total_pop"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,173
min 47
max 9.783e+06
mean 1.013e+05
median 25,174
std 3.246e+05
q1 1.059e+04
q3 6.501e+04
iqr 5.442e+04
skew 13.36
kurtosis 297.6
n_outliers 449
outlier_rate 0.1394
zero_rate 0
alert: high_skew · skew=+13.36
alert: outliers · 13.9% rows beyond 1.5 IQR
Fig 11.
Distribution of total_pop. Vertical dash marks the median.
Histogram bins for total_pop (median: 25174.0).
bin  count
47 – 2.446e+05  2942
2.446e+05 – 4.892e+05  137
4.892e+05 – 7.337e+05  57
7.337e+05 – 9.783e+05  39
9.783e+05 – 1.223e+06  12
1.223e+06 – 1.467e+06  9
1.467e+06 – 1.712e+06  7
1.712e+06 – 1.957e+06  3
1.957e+06 – 2.201e+06  3
2.201e+06 – 2.446e+06  4
2.446e+06 – 2.69e+06  3
2.69e+06 – 2.935e+06  0
2.935e+06 – 3.179e+06  1
3.179e+06 – 3.424e+06  1
3.424e+06 – 3.669e+06  0
3.669e+06 – 3.913e+06  0
3.913e+06 – 4.158e+06  0
4.158e+06 – 4.402e+06  1
4.402e+06 – 4.647e+06  0
4.647e+06 – 4.891e+06  1
4.891e+06 – 5.136e+06  0
5.136e+06 – 5.38e+06  1
5.38e+06 – 5.625e+06  0
5.625e+06 – 5.87e+06  0
5.87e+06 – 6.114e+06  0
6.114e+06 – 6.359e+06  0
6.359e+06 – 6.603e+06  0
6.603e+06 – 6.848e+06  0
6.848e+06 – 7.092e+06  0
7.092e+06 – 7.337e+06  0
7.337e+06 – 7.582e+06  0
7.582e+06 – 7.826e+06  0
7.826e+06 – 8.071e+06  0
8.071e+06 – 8.315e+06  0
8.315e+06 – 8.56e+06  0
8.56e+06 – 8.804e+06  0
8.804e+06 – 9.049e+06  0
9.049e+06 – 9.293e+06  0
9.293e+06 – 9.538e+06  0
9.538e+06 – 9.783e+06  1

composite_index numeric feature

A numeric composite_index spanning 10.1 to 90.1 with mean 49.99 and median 49.5, suggesting a deliberately scaled or normalized index centered near 50. The distribution is nearly symmetric (skew 0.13) and slightly platykurtic (kurtosis -0.67), with no nulls, no zeros, and no outliers flagged. Only 650 unique values across 3222 rows points to rounding to one decimal rather than continuous measurement.

Treatment: Use as-is for modelling; already well-scaled and clean, no transform needed.

anthropic:claude-opus-4-7 · confidence high
Out[25]:

saturn.columns["composite_index"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 650
min 10.1
max 90.1
mean 49.99
median 49.5
std 15.29
q1 38.4
q3 61.5
iqr 23.1
skew 0.1295
kurtosis -0.6661
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 12.
Distribution of composite_index. Vertical dash marks the median.
Histogram bins for composite_index (median: 49.5).
bin  count
10.1 – 12.1  1
12.1 – 14.1  2
14.1 – 16.1  6
16.1 – 18.1  15
18.1 – 20.1  13
20.1 – 22.1  27
22.1 – 24.1  50
24.1 – 26.1  46
26.1 – 28.1  78
28.1 – 30.1  81
30.1 – 32.1  101
32.1 – 34.1  107
34.1 – 36.1  118
36.1 – 38.1  139
38.1 – 40.1  142
40.1 – 42.1  128
42.1 – 44.1  155
44.1 – 46.1  160
46.1 – 48.1  151
48.1 – 50.1  142
50.1 – 52.1  162
52.1 – 54.1  165
54.1 – 56.1  115
56.1 – 58.1  122
58.1 – 60.1  113
60.1 – 62.1  116
62.1 – 64.1  108
64.1 – 66.1  109
66.1 – 68.1  81
68.1 – 70.1  103
70.1 – 72.1  65
72.1 – 74.1  81
74.1 – 76.1  70
76.1 – 78.1  47
78.1 – 80.1  38
80.1 – 82.1  24
82.1 – 84.1  18
84.1 – 86.1  13
86.1 – 88.1  3
88.1 – 90.1  7

economic_score numeric feature

A bounded numeric feature ranging from 0.3 to 99.9 with mean 50.00 and median 49.6, consistent with a 0-100 economic index or score. The distribution is nearly symmetric (skew 0.084) and platykurtic (kurtosis -0.826), with no nulls, no zeros, and no outliers flagged across 3222 rows. With 908 unique values and an IQR of 35.47, the spread is wide and uniform-leaning rather than concentrated.

Treatment: Use as-is or min-max scale to [0,1]; no transformation needed given symmetric bounded distribution.

anthropic:claude-opus-4-7 · confidence high
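The min-max scaling mentioned in the treatment can be sketched as below, using the reported min, median, and max of economic_score as sample input. `min_max_scale` is a hypothetical helper; in practice the bounds should be computed once on training data and reused, which is why they are accepted as optional arguments.

```python
def min_max_scale(values, lo=None, hi=None):
    """Linearly map values so that lo -> 0.0 and hi -> 1.0."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values]

# Reported min, median and max of economic_score.
scaled = min_max_scale([0.3, 49.6, 99.9])
```

Because the column is already close to symmetric, the median maps to just under 0.5 and no further transform is needed.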
Out[28]:

saturn.columns["economic_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 908
min 0.3
max 99.9
mean 50
median 49.6
std 23.15
q1 32.2
q3 67.67
iqr 35.47
skew 0.084
kurtosis -0.8261
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 13.
Distribution of economic_score. Vertical dash marks the median.
Histogram bins for economic_score (median: 49.6).
bin  count
0.3 – 2.79  11
2.79 – 5.28  19
5.28 – 7.77  37
7.77 – 10.26  41
10.26 – 12.75  41
12.75 – 15.24  59
15.24 – 17.73  60
17.73 – 20.22  95
20.22 – 22.71  78
22.71 – 25.2  99
25.2 – 27.69  91
27.69 – 30.18  97
30.18 – 32.67  94
32.67 – 35.16  119
35.16 – 37.65  100
37.65 – 40.14  140
40.14 – 42.63  110
42.63 – 45.12  127
45.12 – 47.61  106
47.61 – 50.1  114
50.1 – 52.59  133
52.59 – 55.08  110
55.08 – 57.57  124
57.57 – 60.06  108
60.06 – 62.55  114
62.55 – 65.04  90
65.04 – 67.53  93
67.53 – 70.02  101
70.02 – 72.51  99
72.51 – 75  74
75 – 77.49  81
77.49 – 79.98  68
79.98 – 82.47  62
82.47 – 84.96  71
84.96 – 87.45  57
87.45 – 89.94  57
89.94 – 92.43  36
92.43 – 94.92  55
94.92 – 97.41  27
97.41 – 99.9  24

education_score numeric feature

This column is a numeric education score bounded between 0 and 100 with a perfectly symmetric distribution (mean and median both 50.0, skew effectively zero). The excess kurtosis of -1.20 and quartiles at exactly 25 and 75 match a uniform distribution rather than a bell curve, and the histogram agrees: every bin holds between 79 and 83 rows. That is unusual for a real-world score and hints at synthetic or rank-transformed data. With 1001 unique values across 3222 rows, no nulls, and no outliers, the column is clean but suspiciously well-behaved.

Treatment: Use as-is or scale to [0,1]; verify it isn't a synthetic/rank feature before modelling.

anthropic:claude-opus-4-7 · confidence high
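One way to test the rank-transform suspicion is to compare the reported stats with what an evenly spaced 0 to 100 score over 3,222 rows would give. The sketch below assumes a simple rank-to-percentile mapping; that is a guess at how the column might have been built, not something the source confirms, and the function names are hypothetical.

```python
import statistics

def percentile_ranks(n):
    """Evenly spaced scores from 0 to 100, one per row, as a rank transform would give."""
    return [100 * i / (n - 1) for i in range(n)]

def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment / variance^2 - 3."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m4 = sum((v - mean) ** 4 for v in values) / len(values)
    return m4 / m2 ** 2 - 3

scores = percentile_ranks(3222)
std = statistics.pstdev(scores)
kurt = excess_kurtosis(scores)
```

Under this assumption the standard deviation comes out close to the reported 28.88 and the excess kurtosis close to -1.2, so the stats are at least compatible with a rank-based score.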
Out[31]:

saturn.columns["education_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,001
min 0
max 100
mean 50
median 50
std 28.88
q1 25
q3 75
iqr 50
skew 1.2e-17
kurtosis -1.2
n_outliers 0
outlier_rate 0
zero_rate 0.0006207
Fig 14.
Distribution of education_score. Vertical dash marks the median.
Histogram bins for education_score (median: 50.0).
bin  count
0 – 2.5  79
2.5 – 5  81
5 – 7.5  80
7.5 – 10  81
10 – 12.5  81
12.5 – 15  80
15 – 17.5  81
17.5 – 20  80
20 – 22.5  81
22.5 – 25  80
25 – 27.5  81
27.5 – 30  80
30 – 32.5  81
32.5 – 35  80
35 – 37.5  81
37.5 – 40  80
40 – 42.5  81
42.5 – 45  80
45 – 47.5  81
47.5 – 50  80
50 – 52.5  81
52.5 – 55  80
55 – 57.5  81
57.5 – 60  80
60 – 62.5  81
62.5 – 65  81
65 – 67.5  80
67.5 – 70  81
70 – 72.5  80
72.5 – 75  81
75 – 77.5  80
77.5 – 80  81
80 – 82.5  80
82.5 – 85  81
85 – 87.5  80
87.5 – 90  81
90 – 92.5  80
92.5 – 95  81
95 – 97.5  80
97.5 – 100  83

healthcare_score numeric feature

A continuous healthcare quality or performance score for 3222 rows, ranging from 4.3 to 98.2 with mean 50.0 and median 48.6. The distribution is mildly right-skewed (0.24) with negative kurtosis (-0.75), suggesting a broad, near-uniform spread rather than a tight bell, and no outliers were flagged. With 808 unique values, no nulls, and no zeros, the column looks clean and ready to use.

Treatment: Use as-is as a numeric feature; standardize if combining with other scaled features.

anthropic:claude-opus-4-7 · confidence high
Out[34]:

saturn.columns["healthcare_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 808
min 4.3
max 98.2
mean 50
median 48.6
std 20.19
q1 33.9
q3 64.57
iqr 30.67
skew 0.2381
kurtosis -0.7521
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 15.
Distribution of healthcare_score. Vertical dash marks the median.
Histogram bins for healthcare_score (median: 48.6).
bin  count
4.3 – 6.647  3
6.647 – 8.995  2
8.995 – 11.34  5
11.34 – 13.69  12
13.69 – 16.04  43
16.04 – 18.39  72
18.39 – 20.73  76
20.73 – 23.08  83
23.08 – 25.43  100
25.43 – 27.78  90
27.78 – 30.12  121
30.12 – 32.47  120
32.47 – 34.82  135
34.82 – 37.16  115
37.16 – 39.51  103
39.51 – 41.86  109
41.86 – 44.21  146
44.21 – 46.55  150
46.55 – 48.9  150
48.9 – 51.25  125
51.25 – 53.6  128
53.6 – 55.95  128
55.95 – 58.29  148
58.29 – 60.64  102
60.64 – 62.99  91
62.99 – 65.34  96
65.34 – 67.68  70
67.68 – 70.03  93
70.03 – 72.38  73
72.38 – 74.73  79
74.73 – 77.07  71
77.07 – 79.42  66
79.42 – 81.77  70
81.77 – 84.11  72
84.11 – 86.46  40
86.46 – 88.81  40
88.81 – 91.16  33
91.16 – 93.51  23
93.51 – 95.85  20
95.85 – 98.2  19

housing_score numeric feature

A continuous housing_score ranging from 0.0 to 99.9 with mean 49.93 and median 49.85, suggesting a 0-100 index. The distribution is nearly symmetric (skew 0.01) and platykurtic (kurtosis -0.88), with a wide IQR of 37.98 and no detected outliers, consistent with a near-uniform spread rather than a peaked score. No nulls and only one zero across 3222 rows.

Treatment: Use as-is or min-max scale to [0,1]; no transform needed given symmetry and absence of outliers.

anthropic:claude-opus-4-7 · confidence high
Out[37]:

saturn.columns["housing_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 937
min 0
max 99.9
mean 49.93
median 49.85
std 24.47
q1 30.73
q3 68.7
iqr 37.98
skew 0.01353
kurtosis -0.8807
n_outliers 0
outlier_rate 0
zero_rate 0.0003104
Fig 16.
Distribution of housing_score. Vertical dash marks the median.
Histogram bins for housing_score (median: 49.85).
bin  count
0 – 2.498  34
2.498 – 4.995  39
4.995 – 7.492  52
7.492 – 9.99  48
9.99 – 12.49  45
12.49 – 14.98  42
14.98 – 17.48  78
17.48 – 19.98  78
19.98 – 22.48  73
22.48 – 24.98  99
24.98 – 27.47  94
27.47 – 29.97  99
29.97 – 32.47  90
32.47 – 34.97  87
34.97 – 37.46  96
37.46 – 39.96  117
39.96 – 42.46  107
42.46 – 44.95  116
44.95 – 47.45  102
47.45 – 49.95  121
49.95 – 52.45  142
52.45 – 54.95  107
54.95 – 57.44  103
57.44 – 59.94  99
59.94 – 62.44  105
62.44 – 64.94  85
64.94 – 67.43  102
67.43 – 69.93  101
69.93 – 72.43  79
72.43 – 74.92  81
74.92 – 77.42  94
77.42 – 79.92  80
79.92 – 82.42  70
82.42 – 84.92  59
84.92 – 87.41  61
87.41 – 89.91  60
89.91 – 92.41  61
92.41 – 94.91  53
94.91 – 97.4  47
97.4 – 99.9  16

food_score numeric feature

A numeric feature called food_score that ranges from 0.1 to 99.5 with mean 49.9997 and median 50.0, suggesting a percentile-style or normalised rating bounded near [0,100]. The distribution is essentially symmetric (skew 0.029) and platykurtic (kurtosis -0.96), with no nulls, no zeros, and no outliers across 3222 rows — consistent with a synthetic or uniformly distributed score rather than an organic measurement.

Treatment: Use as-is; already on a bounded 0–100 scale with no transformation needed.

anthropic:claude-opus-4-7 · confidence high
Out[40]:

saturn.columns["food_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 941
min 0.1
max 99.5
mean 50
median 50
std 25.48
q1 29.6
q3 69.8
iqr 40.2
skew 0.02926
kurtosis -0.9648
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 17.
Distribution of food_score. Vertical dash marks the median.
Histogram bins for food_score (median: 50.0).
bin  count
0.1 – 2.585  23
2.585 – 5.07  43
5.07 – 7.555  61
7.555 – 10.04  62
10.04 – 12.53  63
12.53 – 15.01  74
15.01 – 17.5  64
17.5 – 19.98  82
19.98 – 22.47  75
22.47 – 24.95  86
24.95 – 27.44  85
27.44 – 29.92  106
29.92 – 32.41  92
32.41 – 34.89  99
34.89 – 37.38  96
37.38 – 39.86  97
39.86 – 42.35  102
42.35 – 44.83  90
44.83 – 47.32  84
47.32 – 49.8  121
49.8 – 52.29  94
52.29 – 54.77  117
54.77 – 57.26  102
57.26 – 59.74  93
59.74 – 62.23  101
62.23 – 64.71  92
64.71 – 67.2  97
67.2 – 69.68  107
69.68 – 72.17  89
72.17 – 74.65  90
74.65 – 77.14  83
77.14 – 79.62  76
79.62 – 82.11  78
82.11 – 84.59  54
84.59 – 87.08  71
87.08 – 89.56  56
89.56 – 92.05  43
92.05 – 94.53  47
94.53 – 97.02  69
97.02 – 99.5  58

disability_score numeric feature

A numeric disability score bounded between 0 and 100 with mean and median both exactly 50.0 and zero skew, indicating a perfectly symmetric distribution. The excess kurtosis (-1.20) and quartiles at 25 and 75 match a uniform spread rather than a bell curve, which is unusual for a real-world severity metric and hints at synthetic or rank-based generation. No nulls and no outliers across 3222 rows with 1001 distinct values.

Treatment: use as-is or bin into quartiles; no transformation needed given symmetric bounded range.

anthropic:claude-opus-4-7 · confidence high
Out[43]:

saturn.columns["disability_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,001
min 0
max 100
mean 50
median 50
std 28.88
q1 25
q3 75
iqr 50
skew 1.2e-17
kurtosis -1.2
n_outliers 0
outlier_rate 0
zero_rate 0.0006207
Fig 18.
Distribution of disability_score. Vertical dash marks the median.
Histogram bins for disability_score (median: 50.0).
bin  count
0 – 2.5  79
2.5 – 5  81
5 – 7.5  80
7.5 – 10  81
10 – 12.5  81
12.5 – 15  80
15 – 17.5  81
17.5 – 20  80
20 – 22.5  81
22.5 – 25  80
25 – 27.5  81
27.5 – 30  80
30 – 32.5  81
32.5 – 35  80
35 – 37.5  81
37.5 – 40  80
40 – 42.5  81
42.5 – 45  80
45 – 47.5  81
47.5 – 50  80
50 – 52.5  81
52.5 – 55  80
55 – 57.5  81
57.5 – 60  80
60 – 62.5  81
62.5 – 65  81
65 – 67.5  80
67.5 – 70  81
70 – 72.5  80
72.5 – 75  81
75 – 77.5  80
77.5 – 80  81
80 – 82.5  80
82.5 – 85  81
85 – 87.5  80
87.5 – 90  81
90 – 92.5  80
92.5 – 95  81
95 – 97.5  80
97.5 – 100  83

poverty_rate numeric feature

Numeric poverty rate (likely percent of population below the poverty line) across 3,222 rows with no nulls and 1,719 distinct values. The distribution is right-skewed (skew 2.10, kurtosis 6.89): median is 13.55 and Q3 is 17.91, but the max reaches 66.32, producing 137 outliers (4.25%). Minimum is 1.6 and there are no zeros, consistent with a county- or area-level rate rather than individual records.

Treatment: Consider a log or winsorizing transform before regression to tame the right tail.

anthropic:claude-opus-4-7 · confidence high
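The winsorizing option can be sketched with the 1.5 IQR rule that the report's outlier alerts reference. Clipping at those fences is one possible choice made for this sketch (percentile-based limits are equally common), and both function names are hypothetical.

```python
def iqr_fences(q1, q3, k=1.5):
    """Lower and upper outlier fences: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def winsorize(values, lower, upper):
    """Clip each value into the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

# Reported quartiles of poverty_rate.
lower, upper = iqr_fences(10.16, 17.91)
# Reported min, median and max of poverty_rate.
clipped = winsorize([1.6, 13.55, 66.32], lower, upper)
```

With these quartiles the upper fence falls near 29.5, so the maximum of 66.32 is pulled down to the fence while the minimum and median are left unchanged; the lower fence is negative and never binds.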
Out[46]:

saturn.columns["poverty_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,719
min 1.6
max 66.32
mean 15.1
median 13.55
std 7.706
q1 10.16
q3 17.91
iqr 7.75
skew 2.096
kurtosis 6.891
n_outliers 137
outlier_rate 0.04252
zero_rate 0
alert: high_skew · skew=+2.10
Fig 19.
Distribution of poverty_rate. Vertical dash marks the median.
Histogram bins for poverty_rate (median: 13.55).
bin  count
1.6 – 3.218  7
3.218 – 4.836  34
4.836 – 6.454  106
6.454 – 8.072  246
8.072 – 9.69  320
9.69 – 11.31  354
11.31 – 12.93  393
12.93 – 14.54  364
14.54 – 16.16  306
16.16 – 17.78  262
17.78 – 19.4  192
19.4 – 21.02  149
21.02 – 22.63  123
22.63 – 24.25  91
24.25 – 25.87  52
25.87 – 27.49  44
27.49 – 29.11  34
29.11 – 30.72  23
30.72 – 32.34  18
32.34 – 33.96  14
33.96 – 35.58  6
35.58 – 37.2  8
37.2 – 38.81  3
38.81 – 40.43  8
40.43 – 42.05  5
42.05 – 43.67  9
43.67 – 45.29  4
45.29 – 46.9  11
46.9 – 48.52  7
48.52 – 50.14  8
50.14 – 51.76  2
51.76 – 53.38  6
53.38 – 54.99  5
54.99 – 56.61  5
56.61 – 58.23  1
58.23 – 59.85  0
59.85 – 61.47  0
61.47 – 63.08  0
63.08 – 64.7  1
64.7 – 66.32  1

no_vehicle_pct numeric feature

Percentage of households with no vehicle, reported per row (likely a geographic unit like county or tract). The distribution is tightly clustered with a median of 5.41 and IQR of 3.38, but a long right tail pushes the max to 85.94, yielding skew of 6.98 and kurtosis of 86.23. About 4.3% of rows are flagged as outliers, and 0.37% are exact zeros; no nulls.

Treatment: Apply log1p or winsorize before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
Out[49]:

saturn.columns["no_vehicle_pct"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,065
min 0
max 85.94
mean 6.197
median 5.41
std 4.538
q1 3.98
q3 7.36
iqr 3.38
skew 6.976
kurtosis 86.23
n_outliers 140
outlier_rate 0.04345
zero_rate 0.003724
alert: high_skew · skew=+6.98
Fig 20.
Distribution of no_vehicle_pct. Vertical dash marks the median.
Histogram bins for no_vehicle_pct (median: 5.41).
bin  count
0 – 2.148  161
2.148 – 4.297  823
4.297 – 6.445  1091
6.445 – 8.594  630
8.594 – 10.74  283
10.74 – 12.89  111
12.89 – 15.04  61
15.04 – 17.19  23
17.19 – 19.34  8
19.34 – 21.48  3
21.48 – 23.63  4
23.63 – 25.78  2
25.78 – 27.93  3
27.93 – 30.08  2
30.08 – 32.23  2
32.23 – 34.38  2
34.38 – 36.52  2
36.52 – 38.67  2
38.67 – 40.82  0
40.82 – 42.97  1
42.97 – 45.12  0
45.12 – 47.27  1
47.27 – 49.42  0
49.42 – 51.56  0
51.56 – 53.71  0
53.71 – 55.86  1
55.86 – 58.01  1
58.01 – 60.16  0
60.16 – 62.31  2
62.31 – 64.45  0
64.45 – 66.6  1
66.6 – 68.75  0
68.75 – 70.9  0
70.9 – 73.05  0
73.05 – 75.2  0
75.2 – 77.35  0
77.35 – 79.49  1
79.49 – 81.64  0
81.64 – 83.79  0
83.79 – 85.94  1

uninsured_rate numeric feature

Likely a per-record uninsured rate (probably proportion of population without insurance), ranging 0.0 to 3.7 with a median of 0.12 and IQR of 0.21. The distribution is heavily right-skewed (skew 4.10, kurtosis 27.7) with 230 outliers (7.1%) and 17.5% exact zeros; the max of 3.7 is implausible for a true rate and suggests mixed units or data-entry errors.

Treatment: Investigate values >1 for unit errors, then winsorize or log1p-transform before modelling.

anthropic:claude-opus-4-7 · confidence high
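The first step of the treatment, isolating values above 1 for review, might look like this minimal sketch. It assumes the column is meant to be a proportion in [0, 1], which the report itself only describes as probable; `split_suspect_rates` is a hypothetical helper, and the sample values are the reported min, median, q3, and max.

```python
def split_suspect_rates(values, limit=1.0):
    """Separate values within [0, limit] from those above it."""
    ok = [v for v in values if v <= limit]
    suspect = [v for v in values if v > limit]
    return ok, suspect

# Reported min, median, q3 and max of uninsured_rate.
ok, suspect = split_suspect_rates([0.0, 0.12, 0.25, 3.7])
```

Rows landing in `suspect` are candidates for a unit check before any winsorizing, since clipping them first would hide the evidence of a possible unit mix-up.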
Out[52]:

saturn.columns["uninsured_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 152
min 0
max 3.7
mean 0.2002
median 0.12
std 0.2829
q1 0.04
q3 0.25
iqr 0.21
skew 4.095
kurtosis 27.7
n_outliers 230
outlier_rate 0.07138
zero_rate 0.1754
alert: high_skew · skew=+4.10
alert: outliers · 7.1% rows beyond 1.5 IQR
Fig 21.
Distribution of uninsured_rate. Vertical dash marks the median.
Histogram bins for uninsured_rate (median: 0.12).
bin  count
0 – 0.0925  1403
0.0925 – 0.185  704
0.185 – 0.2775  403
0.2775 – 0.37  213
0.37 – 0.4625  158
0.4625 – 0.555  101
0.555 – 0.6475  65
0.6475 – 0.74  43
0.74 – 0.8325  27
0.8325 – 0.925  23
0.925 – 1.018  9
1.018 – 1.11  15
1.11 – 1.202  14
1.202 – 1.295  5
1.295 – 1.387  7
1.387 – 1.48  7
1.48 – 1.573  5
1.573 – 1.665  2
1.665 – 1.758  4
1.758 – 1.85  1
1.85 – 1.942  1
1.942 – 2.035  1
2.035 – 2.127  2
2.127 – 2.22  2
2.22 – 2.312  1
2.312 – 2.405  0
2.405 – 2.498  0
2.498 – 2.59  1
2.59 – 2.683  0
2.683 – 2.775  1
2.775 – 2.868  0
2.868 – 2.96  1
2.96 – 3.052  1
3.052 – 3.145  0
3.145 – 3.237  1
3.237 – 3.33  0
3.33 – 3.422  0
3.422 – 3.515  0
3.515 – 3.607  0
3.607 – 3.7  1

hospital_closure_risk numeric feature

A coarse risk score for hospital closure taking only 3 distinct values across 3222 rows, bounded between 0.0 and 50.0 with a median of 25.0. Despite being stored as numeric, the column behaves categorically: the histogram shows exactly three occupied bins, at 0 (929 rows, 28.8%), 25 (1,790 rows), and 50 (503 rows). No outliers and no nulls.

Treatment: Treat as an ordinal category with three levels rather than a continuous variable.

anthropic:claude-opus-4-7 · confidence high
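Treating the column as ordinal could be sketched as a lookup from the three observed values to ordered labels. The labels "low", "medium", and "high" are hypothetical; the source only establishes that the values are 0, 25, and 50.

```python
# Observed values mapped to ordered labels (label names are hypothetical).
RISK_LEVELS = {0: "low", 25: "medium", 50: "high"}

def to_risk_level(value):
    """Map a hospital_closure_risk value to its ordinal label."""
    try:
        return RISK_LEVELS[value]
    except KeyError:
        raise ValueError(f"unexpected hospital_closure_risk value: {value!r}") from None

levels = [to_risk_level(v) for v in (0.0, 25.0, 50.0)]
```

Raising on any other value is deliberate: if a later version of the dataset introduces a fourth level, the mapping fails loudly instead of silently mislabelling rows.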
Out[55]:

saturn.columns["hospital_closure_risk"].stats

statvalue
n3,222
nulls0 (0.0%)
unique3
min 0
max 50
mean 21.69
median 25
std 16.34
q1 0
q3 25
iqr 25
skew 0.1414
kurtosis -0.6949
n_outliers 0
outlier_rate 0
zero_rate 0.2883
Fig 22.
Distribution of hospital_closure_risk. Vertical dash marks the median.
Histogram bins for hospital_closure_risk (median: 25.0).
bin  count
0 – 1.25  929
1.25 – 2.5  0
2.5 – 3.75  0
3.75 – 5  0
5 – 6.25  0
6.25 – 7.5  0
7.5 – 8.75  0
8.75 – 10  0
10 – 11.25  0
11.25 – 12.5  0
12.5 – 13.75  0
13.75 – 15  0
15 – 16.25  0
16.25 – 17.5  0
17.5 – 18.75  0
18.75 – 20  0
20 – 21.25  0
21.25 – 22.5  0
22.5 – 23.75  0
23.75 – 25  0
25 – 26.25  1790
26.25 – 27.5  0
27.5 – 28.75  0
28.75 – 30  0
30 – 31.25  0
31.25 – 32.5  0
32.5 – 33.75  0
33.75 – 35  0
35 – 36.25  0
36.25 – 37.5  0
37.5 – 38.75  0
38.75 – 40  0
40 – 41.25  0
41.25 – 42.5  0
42.5 – 43.75  0
43.75 – 45  0
45 – 46.25  0
46.25 – 47.5  0
47.5 – 48.75  0
48.75 – 50  503

pct_rent_burdened_30 numeric feature

This appears to be the percentage of renter households spending at least 30% of income on rent, reported per row (likely a county or tract). Values span 0 to 64.96 with a median of 37.36 and quartiles at 30.67 and 43.48, so most areas cluster in the 30–45% range with a mild left skew (-0.57). About 0.25% of rows (8) are exact zeros, and 58 outliers (1.8%) sit outside the whiskers; both groups are worth checking for small-population geographies.

Treatment: Use as-is for modelling; optionally winsorize the 58 outliers and verify zero-valued rows.
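
The optional winsorizing step can be sketched as follows, assuming the whiskers are the usual 1.5 × IQR fences (the rule named in saturn's outlier alerts elsewhere in this report). The helper names are hypothetical, and the sample values are illustrative rather than rows from the dataset.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def winsorize(values, lower, upper):
    """Clamp each value into [lower, upper]; values inside are unchanged."""
    return [min(max(v, lower), upper) for v in values]


# Quartiles reported in the stats table for pct_rent_burdened_30.
lower, upper = tukey_fences(30.67, 43.48)

# Illustrative values only (not rows from the dataset).
sample = [0.0, 9.5, 30.67, 37.36, 43.48, 64.96]
clamped = winsorize(sample, lower, upper)

# Positions of exact zeros, to be reviewed before they are clamped away.
zero_rows = [i for i, v in enumerate(sample) if v == 0.0]
```

With the reported quartiles the fences land near 11.46 and 62.70, so the zero rows and the values close to the 64.96 maximum are the ones that would be pulled in.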

anthropic:claude-opus-4-7 · confidence high
Out[58]:

saturn.columns["pct_rent_burdened_30"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 2,146
min 0
max 64.96
mean 36.44
median 37.36
std 10.01
q1 30.67
q3 43.48
iqr 12.81
skew -0.5673
kurtosis 0.5032
n_outliers 58
outlier_rate 0.018
zero_rate 0.002483
Fig 23.
Distribution of pct_rent_burdened_30. Vertical dash marks the median.
Histogram bins for pct_rent_burdened_30 (median: 37.36).
bin  count
0 – 1.624  9
1.624 – 3.248  5
3.248 – 4.872  3
4.872 – 6.496  5
6.496 – 8.12  9
8.12 – 9.744  13
9.744 – 11.37  11
11.37 – 12.99  16
12.99 – 14.62  26
14.62 – 16.24  19
16.24 – 17.86  35
17.86 – 19.49  43
19.49 – 21.11  52
21.11 – 22.74  52
22.74 – 24.36  73
24.36 – 25.98  99
25.98 – 27.61  109
27.61 – 29.23  116
29.23 – 30.86  132
30.86 – 32.48  159
32.48 – 34.1  189
34.1 – 35.73  209
35.73 – 37.35  227
37.35 – 38.98  239
38.98 – 40.6  205
40.6 – 42.22  209
42.22 – 43.85  210
43.85 – 45.47  190
45.47 – 47.1  131
47.1 – 48.72  114
48.72 – 50.34  118
50.34 – 51.97  69
51.97 – 53.59  51
53.59 – 55.22  34
55.22 – 56.84  24
56.84 – 58.46  6
58.46 – 60.09  3
60.09 – 61.71  2
61.71 – 63.34  3
63.34 – 64.96  3

pct_rent_burdened_50 numeric feature

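This column reports the percentage of households that are severely rent-burdened (spending 50% or more of income on rent). Values range from 0 to 64.96, and the mean of 17.35 closely matches the median of 17.62. The bulk of the distribution is close to symmetric (skew 0.054), with 47 outliers (1.46%) and a zero rate of 0.93% (30 rows). Under the 1.5 × IQR rule the lower fence is about 0.23, so those 30 zero rows account for most of the flagged outliers. The histogram also shows a single row in the top bin (63.34–64.96) with nothing between 47.1 and 63.34; its maximum is identical to the pct_rent_burdened_30 maximum, so that row is worth verifying. The tight IQR of 8.56 around a median near 17.6 suggests most geographies cluster in a narrow band of severe rent burden.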

Treatment: Use directly as a numeric feature; no transform needed given near-symmetric distribution.
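
Households spending 50% or more of income on rent are a subset of those spending 30% or more, so one inexpensive sanity check before using this column is that it never exceeds pct_rent_burdened_30 in the same row. The sketch below assumes both columns share the same denominator; the helper name and the demo rows (including the fips values) are made up for illustration.

```python
import csv
import io


def rows_violating_subset(rows, tol=1e-9):
    """Return rows where the 50%+ share exceeds the 30%+ share.

    Rows with a missing or non-numeric value in either column are skipped.
    """
    bad = []
    for row in rows:
        try:
            p30 = float(row["pct_rent_burdened_30"])
            p50 = float(row["pct_rent_burdened_50"])
        except (KeyError, TypeError, ValueError):
            continue
        if p50 > p30 + tol:
            bad.append(row)
    return bad


# Illustrative rows only; these are not records from the dataset.
demo = io.StringIO(
    "fips,pct_rent_burdened_30,pct_rent_burdened_50\n"
    "00001,37.36,17.62\n"
    "00002,64.96,64.96\n"
    "00003,20.00,25.00\n"
)
violations = rows_violating_subset(csv.DictReader(demo))
```

A row where both shares are equal passes the check but is still unusual, since it implies every rent-burdened household is severely burdened.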

anthropic:claude-opus-4-7 · confidence high
Out[61]:

saturn.columns["pct_rent_burdened_50"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,769
min 0
max 64.96
mean 17.35
median 17.62
std 6.577
q1 13.07
q3 21.63
iqr 8.557
skew 0.05436
kurtosis 0.9823
n_outliers 47
outlier_rate 0.01459
zero_rate 0.009311
Fig 24.
Distribution of pct_rent_burdened_50. Vertical dash marks the median.
Histogram bins for pct_rent_burdened_50 (median: 17.62).
bin  count
0 – 1.624  42
1.624 – 3.248  27
3.248 – 4.872  34
4.872 – 6.496  63
6.496 – 8.12  102
8.12 – 9.744  148
9.744 – 11.37  163
11.37 – 12.99  214
12.99 – 14.62  242
14.62 – 16.24  310
16.24 – 17.86  315
17.86 – 19.49  332
19.49 – 21.11  335
21.11 – 22.74  264
22.74 – 24.36  219
24.36 – 25.98  150
25.98 – 27.61  99
27.61 – 29.23  64
29.23 – 30.86  39
30.86 – 32.48  20
32.48 – 34.1  21
34.1 – 35.73  9
35.73 – 37.35  2
37.35 – 38.98  3
38.98 – 40.6  1
40.6 – 42.22  1
42.22 – 43.85  1
43.85 – 45.47  0
45.47 – 47.1  1
47.1 – 48.72  0
48.72 – 50.34  0
50.34 – 51.97  0
51.97 – 53.59  0
53.59 – 55.22  0
55.22 – 56.84  0
56.84 – 58.46  0
58.46 – 60.09  0
60.09 – 61.71  0
61.71 – 63.34  0
63.34 – 64.96  1

median_gross_rent numeric feature

Numeric column capturing median gross rent in dollars, with 3,222 rows, 983 unique values, and a trivial 0.31% null rate. The distribution is right-skewed (skew 1.76, kurtosis 4.55), running from 297 to 2,805 around a median of 818 and mean of 891, and 225 values (7.0%) flag as outliers on the high end. No zeros are present, so missingness isn't being encoded as 0.

Treatment: Log-transform before regression to tame the right skew and high-rent outliers.
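
A minimal sketch of the log transform, assuming nulls arrive as `None` and should stay null rather than be imputed. The helper name `log_transform` is hypothetical; the inputs are the min, median, and max from the stats table plus one null.

```python
import math


def log_transform(values):
    """Natural-log transform; None stays None, non-positive values raise."""
    out = []
    for v in values:
        if v is None:
            out.append(None)  # keep nulls as nulls; do not encode them as 0
        elif v <= 0:
            raise ValueError(f"log undefined for non-positive value: {v!r}")
        else:
            out.append(math.log(v))
    return out


# Min, median, and max from the stats table, plus one null.
logged = log_transform([297, 818, None, 2805])
```

On the raw scale the maximum sits roughly 3.8 times as far above the median as the minimum sits below it; on the log scale that ratio drops to roughly 1.2, which is the compression of the high-rent tail the treatment is after.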

anthropic:claude-opus-4-7 · confidence high
Out[64]:

saturn.columns["median_gross_rent"].stats

stat value
n 3,222
nulls 10 (0.3%)
unique 983
min 297
max 2,805
mean 890.9
median 818
std 283.4
q1 718
q3 978
iqr 260
skew 1.763
kurtosis 4.55
n_outliers 225
outlier_rate 0.07005
zero_rate 0
alert: outliers 7.0% rows beyond 1.5 IQR
Fig 25.
Distribution of median_gross_rent. Vertical dash marks the median.
Histogram bins for median_gross_rent (median: 818.0).
bin  count
297 – 359.7  5
359.7 – 422.4  14
422.4 – 485.1  32
485.1 – 547.8  69
547.8 – 610.5  128
610.5 – 673.2  242
673.2 – 735.9  457
735.9 – 798.6  515
798.6 – 861.3  423
861.3 – 924  306
924 – 986.7  251
986.7 – 1049  140
1049 – 1112  105
1112 – 1175  98
1175 – 1238  79
1238 – 1300  71
1300 – 1363  52
1363 – 1426  48
1426 – 1488  26
1488 – 1551  22
1551 – 1614  33
1614 – 1676  13
1676 – 1739  19
1739 – 1802  10
1802 – 1864  13
1864 – 1927  8
1927 – 1990  11
1990 – 2053  6
2053 – 2115  4
2115 – 2178  3
2178 – 2241  4
2241 – 2303  1
2303 – 2366  1
2366 – 2429  0
2429 – 2492  1
2492 – 2554  0
2554 – 2617  0
2617 – 2680  0
2680 – 2742  1
2742 – 2805  1

rent_to_income_ratio numeric feature

This column reports a rent-to-income ratio, with a typical county sitting near 17.06 and an interquartile range of just 4.29 (15.1 to 19.39). However, the maximum of 1,200 against a median of 17.06 produces extreme skew (53.98) and kurtosis (3,007), and 107 values (3.33%) are flagged as outliers. The tight IQR alongside a standard deviation of 21.2 points to very few extreme records: in the histogram, 3,207 of the 3,213 non-null values fall below 35.95, five lie between 35.95 and 65.8, and a single value sits in the top bin (1170–1200).

Treatment: Cap or winsorize extreme values and log-transform before modelling.
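
One way to sketch the cap-then-log treatment is shown below. The ceiling of 100 is a hypothetical choice (it assumes the ratio is expressed as a percentage of income, which the typical values near 17 suggest but the source does not confirm), and the helper name `cap_and_log` is likewise illustrative.

```python
import math

CAP = 100.0  # hypothetical ceiling; pick from domain knowledge or a high percentile


def cap_and_log(values, cap=CAP):
    """Cap each non-null value at `cap`, then take the natural log.

    Assumes positive inputs (the reported minimum is 6.1). Returns
    (transformed, flagged_indices) so capped rows can be reviewed
    instead of being silently altered.
    """
    transformed, flagged = [], []
    for i, v in enumerate(values):
        if v is None:
            transformed.append(None)
            continue
        if v > cap:
            flagged.append(i)
            v = cap
        transformed.append(math.log(v))
    return transformed, flagged


# Min, q1, median, q3, and max from the stats table, plus one null.
values = [6.1, 15.1, 17.06, 19.39, None, 1200.0]
transformed, flagged = cap_and_log(values)
```

Returning the flagged positions keeps the extreme record visible for manual review, which matters here because a value of 1,200 may be a data-entry problem rather than a genuine observation.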

anthropic:claude-opus-4-7 · confidence high
Out[67]:

saturn.columns["rent_to_income_ratio"].stats

stat value
n 3,222
nulls 9 (0.3%)
unique 1,269
min 6.1
max 1,200
mean 17.89
median 17.06
std 21.2
q1 15.1
q3 19.39
iqr 4.29
skew 53.98
kurtosis 3007
n_outliers 107
outlier_rate 0.0333
zero_rate 0
alert: high_skew skew=+53.98
Fig 26.
Distribution of rent_to_income_ratio. Vertical dash marks the median.
Histogram bins for rent_to_income_ratio (median: 17.06).
bin  count
6.1 – 35.95  3207
35.95 – 65.8  5
65.8 – 95.64  0
95.64 – 125.5  0
125.5 – 155.3  0
155.3 – 185.2  0
185.2 – 215  0
215 – 244.9  0
244.9 – 274.7  0
274.7 – 304.6  0
304.6 – 334.4  0
334.4 – 364.3  0
364.3 – 394.1  0
394.1 – 424  0
424 – 453.8  0
453.8 – 483.7  0
483.7 – 513.5  0
513.5 – 543.4  0
543.4 – 573.2  0
573.2 – 603.1  0
603.1 – 632.9  0
632.9 – 662.7  0
662.7 – 692.6  0
692.6 – 722.4  0
722.4 – 752.3  0
752.3 – 782.1  0
782.1 – 812  0
812 – 841.8  0
841.8 – 871.7  0
871.7 – 901.5  0
901.5 – 931.4  0
931.4 – 961.2  0
961.2 – 991.1  0
991.1 – 1021  0
1021 – 1051  0
1051 – 1081  0
1081 – 1110  0
1110 – 1140  0
1140 – 1170  0
1170 – 1200  1

gini_index numeric feature

Numeric column holding Gini index values for 3,222 records, all populated and bounded between 0.2744 and 0.721 with a mean of 0.4481 and a median of 0.4457. The distribution is tight (IQR 0.0494, std 0.0384) with mild right skew (0.4999) and 56 outliers (1.74%). Most of them sit on the high side, stretching toward 0.721, but a few fall below the lower 1.5 × IQR fence of about 0.348. Values fall in the expected 0–1 range for an inequality coefficient, suggesting a clean, ready-to-use feature.

Treatment: Use as-is as a numeric feature; optionally winsorize the 56 outliers.

anthropic:claude-opus-4-7 · confidence high
Out[70]:

saturn.columns["gini_index"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,317
min 0.2744
max 0.721
mean 0.4481
median 0.4457
std 0.03841
q1 0.422
q3 0.4714
iqr 0.04938
skew 0.4999
kurtosis 1.634
n_outliers 56
outlier_rate 0.01738
zero_rate 0
Fig 27.
Distribution of gini_index. Vertical dash marks the median.
Histogram bins for gini_index (median: 0.4457).
bin  count
0.2744 – 0.2856  1
0.2856 – 0.2967  1
0.2967 – 0.3079  1
0.3079 – 0.3191  0
0.3191 – 0.3302  1
0.3302 – 0.3414  3
0.3414 – 0.3526  6
0.3526 – 0.3637  10
0.3637 – 0.3749  29
0.3749 – 0.3861  66
0.3861 – 0.3972  123
0.3972 – 0.4084  202
0.4084 – 0.4195  277
0.4195 – 0.4307  365
0.4307 – 0.4419  375
0.4419 – 0.453  402
0.453 – 0.4642  370
0.4642 – 0.4754  299
0.4754 – 0.4865  227
0.4865 – 0.4977  162
0.4977 – 0.5089  104
0.5089 – 0.52  80
0.52 – 0.5312  37
0.5312 – 0.5424  26
0.5424 – 0.5535  22
0.5535 – 0.5647  14
0.5647 – 0.5759  6
0.5759 – 0.587  5
0.587 – 0.5982  2
0.5982 – 0.6093  3
0.6093 – 0.6205  2
0.6205 – 0.6317  0
0.6317 – 0.6428  0
0.6428 – 0.654  0
0.654 – 0.6652  0
0.6652 – 0.6763  0
0.6763 – 0.6875  0
0.6875 – 0.6987  0
0.6987 – 0.7098  0
0.7098 – 0.721  1

unemployment_rate numeric feature

Likely a county/region-level unemployment rate in percent, with values ranging from 0.0 to 31.99 and a median of 4.69. The distribution is heavily right-skewed (skew 2.55, kurtosis 12.81) with 154 outliers (4.78%) pulling the mean (5.13) above the median. A small zero_rate (0.56%) suggests a handful of suspiciously perfect-zero readings worth verifying.

Treatment: Apply a log1p or Yeo-Johnson transform before regression to tame the right skew (a plain log is undefined for the zero rows), and inspect the zero values.
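
For reference, the Yeo-Johnson transform can be written out directly, as in the sketch below. The value of `LAM` is a hypothetical placeholder: in practice lambda is usually estimated from the data by maximum likelihood rather than fixed by hand.

```python
import math


def yeo_johnson(x, lam):
    """Yeo-Johnson power transform of a single value for a given lambda."""
    if x >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-x)
    return -(((-x + 1.0) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


LAM = 0.0  # hypothetical choice; normally estimated by maximum likelihood

# Min, median, and max from the stats table; the zero is handled without error.
transformed = [yeo_johnson(v, LAM) for v in (0.0, 4.69, 31.99)]
```

With lambda set to 0 the transform reduces to log(1 + x) for non-negative inputs, which is why zero readings are safe here while a plain log would fail on them.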

anthropic:claude-opus-4-7 · confidence high
Out[73]:

saturn.columns["unemployment_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 950
min 0
max 31.99
mean 5.127
median 4.69
std 2.926
q1 3.42
q3 6.08
iqr 2.66
skew 2.545
kurtosis 12.81
n_outliers 154
outlier_rate 0.0478
zero_rate 0.005587
alert: high_skew skew=+2.55
Fig 28.
Distribution of unemployment_rate. Vertical dash marks the median.
Histogram bins for unemployment_rate (median: 4.69).
bin  count
0 – 0.7997  60
0.7997 – 1.599  90
1.599 – 2.399  197
2.399 – 3.199  333
3.199 – 3.999  464
3.999 – 4.798  539
4.798 – 5.598  492
5.598 – 6.398  361
6.398 – 7.198  217
7.198 – 7.997  142
7.997 – 8.797  85
8.797 – 9.597  61
9.597 – 10.4  36
10.4 – 11.2  36
11.2 – 12  13
12 – 12.8  18
12.8 – 13.6  13
13.6 – 14.4  12
14.4 – 15.2  11
15.2 – 15.99  8
15.99 – 16.79  5
16.79 – 17.59  4
17.59 – 18.39  2
18.39 – 19.19  1
19.19 – 19.99  3
19.99 – 20.79  3
20.79 – 21.59  4
21.59 – 22.39  3
22.39 – 23.19  2
23.19 – 23.99  3
23.99 – 24.79  1
24.79 – 25.59  0
25.59 – 26.39  0
26.39 – 27.19  0
27.19 – 27.99  0
27.99 – 28.79  0
28.79 – 29.59  0
29.59 – 30.39  0
30.39 – 31.19  1
31.19 – 31.99  2

labor_force_participation numeric feature

Numeric labor force participation rate, almost certainly expressed as a percentage given the range of 18.63 to 84.04 and mean of 57.89. The distribution is moderately left-skewed (-0.58) with a tight interquartile band of 52.97 to 63.66, and only 38 outliers (1.18%) sit outside the whiskers. There are no nulls or zeros across 3,222 rows, and the 1,944 unique values suggest fine-grained measurements rather than rounded buckets.

Treatment: Use as-is in modelling; mild left skew does not require transformation.

anthropic:claude-opus-4-7 · confidence high
Out[76]:

saturn.columns["labor_force_participation"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,944
min 18.63
max 84.04
mean 57.89
median 58.72
std 8.041
q1 52.97
q3 63.66
iqr 10.7
skew -0.5766
kurtosis 0.4502
n_outliers 38
outlier_rate 0.01179
zero_rate 0
Fig 29.
Distribution of labor_force_participation. Vertical dash marks the median.
Histogram bins for labor_force_participation (median: 58.72).
bin  count
18.63 – 20.27  1
20.27 – 21.9  0
21.9 – 23.54  0
23.54 – 25.17  1
25.17 – 26.81  1
26.81 – 28.44  1
28.44 – 30.08  0
30.08 – 31.71  4
31.71 – 33.35  6
33.35 – 34.98  8
34.98 – 36.62  10
36.62 – 38.25  24
38.25 – 39.89  30
39.89 – 41.52  37
41.52 – 43.16  51
43.16 – 44.79  61
44.79 – 46.43  60
46.43 – 48.06  76
48.06 – 49.7  109
49.7 – 51.34  141
51.34 – 52.97  186
52.97 – 54.61  174
54.61 – 56.24  235
56.24 – 57.88  245
57.88 – 59.51  272
59.51 – 61.15  270
61.15 – 62.78  277
62.78 – 64.42  252
64.42 – 66.05  221
66.05 – 67.69  187
67.69 – 69.32  118
69.32 – 70.96  82
70.96 – 72.59  41
72.59 – 74.23  19
74.23 – 75.86  10
75.86 – 77.5  4
77.5 – 79.13  6
79.13 – 80.77  1
80.77 – 82.4  0
82.4 – 84.04  1

pct_deep_poverty numeric feature

Percentage of population in deep poverty across 3,222 rows, with no nulls and values bounded between 0.0 and 34.7. The distribution is right-skewed (skew 2.67, kurtosis 10.40) with median 5.82 trailing the mean 6.74, and 176 rows (5.5%) flagged as upper-tail outliers. Only 0.09% of rows are zero, so floor effects are minimal despite the long tail.

Treatment: Log-transform (using log1p, since a few rows are zero) or winsorize before linear modelling to dampen the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
Out[79]:

saturn.columns["pct_deep_poverty"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,131
min 0
max 34.7
mean 6.743
median 5.82
std 4.154
q1 4.27
q3 7.918
iqr 3.648
skew 2.665
kurtosis 10.4
n_outliers 176
outlier_rate 0.05462
zero_rate 0.0009311
alert: high_skew skew=+2.67
alert: outliers 5.5% rows beyond 1.5 IQR
Fig 30.
Distribution of pct_deep_poverty. Vertical dash marks the median.
Histogram bins for pct_deep_poverty (median: 5.82).
bin  count
0 – 0.8675  15
0.8675 – 1.735  28
1.735 – 2.603  128
2.603 – 3.47  241
3.47 – 4.338  429
4.338 – 5.205  446
5.205 – 6.073  436
6.073 – 6.94  403
6.94 – 7.808  261
7.808 – 8.675  211
8.675 – 9.543  157
9.543 – 10.41  113
10.41 – 11.28  57
11.28 – 12.15  58
12.15 – 13.01  50
13.01 – 13.88  28
13.88 – 14.75  18
14.75 – 15.62  22
15.62 – 16.48  18
16.48 – 17.35  8
17.35 – 18.22  11
18.22 – 19.09  9
19.09 – 19.95  7
19.95 – 20.82  4
20.82 – 21.69  7
21.69 – 22.55  8
22.55 – 23.42  5
23.42 – 24.29  2
24.29 – 25.16  8
25.16 – 26.03  4
26.03 – 26.89  6
26.89 – 27.76  2
27.76 – 28.63  4
28.63 – 29.5  7
29.5 – 30.36  3
30.36 – 31.23  0
31.23 – 32.1  2
32.1 – 32.97  1
32.97 – 33.83  1
33.83 – 34.7  4

pct_poverty numeric feature

Likely a county- or area-level poverty rate expressed as a percentage, ranging from 1.6 to 66.32 with a median of 13.55 and mean of 15.10. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with 137 outliers (4.25%) in the heavy upper tail, consistent with a small set of high-poverty areas pulling the mean above the median. There are no nulls or zeros, and the 1,719 unique values across 3,222 rows suggest fine-grained but repeated measurements.

Treatment: Consider a log or sqrt transform before linear modelling to tame the right skew.
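
To choose between the two transforms, one option is to compare the skewness each leaves behind. The sketch below uses a plain moment-based skewness estimate, which may differ from the estimator saturn reports, and the sample values are illustrative rather than rows from the dataset.

```python
import math


def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Illustrative right-skewed values only (not rows from the dataset).
sample = [5.0, 8.0, 10.0, 12.0, 13.5, 15.0, 18.0, 22.0, 35.0, 66.0]

skew_raw = skewness(sample)
skew_sqrt = skewness([math.sqrt(v) for v in sample])
skew_log = skewness([math.log(v) for v in sample])
```

The log is the stronger of the two corrections, so it leaves less right skew than the square root on the same positive data. Because pct_poverty has no zeros (min 1.6), both transforms are defined for every row.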

anthropic:claude-opus-4-7 · confidence high
Out[82]:

saturn.columns["pct_poverty"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,719
min 1.6
max 66.32
mean 15.1
median 13.55
std 7.706
q1 10.16
q3 17.91
iqr 7.75
skew 2.096
kurtosis 6.891
n_outliers 137
outlier_rate 0.04252
zero_rate 0
alert: high_skew skew=+2.10
Fig 31.
Distribution of pct_poverty. Vertical dash marks the median.
Histogram bins for pct_poverty (median: 13.55).
bin  count
1.6 – 3.218  7
3.218 – 4.836  34
4.836 – 6.454  106
6.454 – 8.072  246
8.072 – 9.69  320
9.69 – 11.31  354
11.31 – 12.93  393
12.93 – 14.54  364
14.54 – 16.16  306
16.16 – 17.78  262
17.78 – 19.4  192
19.4 – 21.02  149
21.02 – 22.63  123
22.63 – 24.25  91
24.25 – 25.87  52
25.87 – 27.49  44
27.49 – 29.11  34
29.11 – 30.72  23
30.72 – 32.34  18
32.34 – 33.96  14
33.96 – 35.58  6
35.58 – 37.2  8
37.2 – 38.81  3
38.81 – 40.43  8
40.43 – 42.05  5
42.05 – 43.67  9
43.67 – 45.29  4
45.29 – 46.9  11
46.9 – 48.52  7
48.52 – 50.14  8
50.14 – 51.76  2
51.76 – 53.38  6
53.38 – 54.99  5
54.99 – 56.61  5
56.61 – 58.23  1
58.23 – 59.85  0
59.85 – 61.47  0
61.47 – 63.08  0
63.08 – 64.7  1
64.7 – 66.32  1

pct_near_poverty numeric feature

Percentage of population near the poverty line (likely between 100% and 200% of the federal poverty threshold), reported per record across 3,222 rows with no nulls. The distribution centers on a median of 9.38 with an IQR of 4.43, but a right tail pushes the max to 49.14, yielding skew of 1.19 and kurtosis of 5.73. About 2.5% of values (82 rows) fall outside the outlier fence, suggesting a handful of areas with unusually high near-poverty shares worth inspecting separately.

Treatment: Consider a log or sqrt transform before regression to tame the right skew.

anthropic:claude-opus-4-7 · confidence high
Out[85]:

saturn.columns["pct_near_poverty"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,237
min 0.58
max 49.14
mean 9.813
median 9.38
std 3.644
q1 7.33
q3 11.76
iqr 4.43
skew 1.19
kurtosis 5.729
n_outliers 82
outlier_rate 0.02545
zero_rate 0
Fig 32.
Distribution of pct_near_poverty. Vertical dash marks the median.
Histogram bins for pct_near_poverty (median: 9.38).
bin  count
0.58 – 1.794  7
1.794 – 3.008  18
3.008 – 4.222  82
4.222 – 5.436  161
5.436 – 6.65  302
6.65 – 7.864  419
7.864 – 9.078  480
9.078 – 10.29  487
10.29 – 11.51  392
11.51 – 12.72  280
12.72 – 13.93  210
13.93 – 15.15  138
15.15 – 16.36  87
16.36 – 17.58  53
17.58 – 18.79  37
18.79 – 20  35
20 – 21.22  15
21.22 – 22.43  7
22.43 – 23.65  2
23.65 – 24.86  5
24.86 – 26.07  1
26.07 – 27.29  2
27.29 – 28.5  0
28.5 – 29.72  0
29.72 – 30.93  0
30.93 – 32.14  0
32.14 – 33.36  0
33.36 – 34.57  0
34.57 – 35.79  0
35.79 – 37  1
37 – 38.21  0
38.21 – 39.43  0
39.43 – 40.64  0
40.64 – 41.86  0
41.86 – 43.07  0
43.07 – 44.28  0
44.28 – 45.5  0
45.5 – 46.71  0
46.71 – 47.93  0
47.93 – 49.14  1

pct_hs_or_higher numeric feature

Percentage of population (likely adults 25+) with a high school diploma or higher, reported per row across 3,222 records. Values are tightly clustered high (mean 88.08, median 89.39, IQR 84.9–92.47) with a left tail reaching down to 33.33, producing skew of -1.33 and 86 low-end outliers (2.67%). No nulls or zeros, and 1,612 unique values suggest a county- or tract-level rate.

Treatment: Use as-is for modelling, but consider a reflected log or winsorisation given the left skew and low-end outliers.
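
A reflected log can be sketched as below, assuming the common form log(max + 1 − x) with a final sign flip so that larger inputs still map to larger outputs. The helper name `reflected_log` and the choice to reflect around the observed maximum are assumptions, not something the profiler prescribes.

```python
import math


def reflected_log(values, upper=None):
    """Reflect around (upper + 1), take the log, then negate.

    Reflection turns the left tail into a right tail that the log can
    compress; the final negation restores the original ordering.
    """
    if upper is None:
        upper = max(values)
    return [-math.log(upper + 1.0 - v) for v in values]


# Min, q1, median, q3, and max from the stats table.
values = [33.33, 84.9, 89.39, 92.47, 99.69]
transformed = reflected_log(values)
```

If the transform is fitted on one sample and applied to another, pass the same `upper` both times so the two sets of values stay comparable.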

anthropic:claude-opus-4-7 · confidence high
Out[88]:

saturn.columns["pct_hs_or_higher"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,612
min 33.33
max 99.69
mean 88.08
median 89.39
std 5.97
q1 84.9
q3 92.47
iqr 7.567
skew -1.328
kurtosis 3.742
n_outliers 86
outlier_rate 0.02669
zero_rate 0
Fig 33.
Distribution of pct_hs_or_higher. Vertical dash marks the median.
Histogram bins for pct_hs_or_higher (median: 89.39).
bin  count
33.33 – 34.99  1
34.99 – 36.65  0
36.65 – 38.31  0
38.31 – 39.97  0
39.97 – 41.62  0
41.62 – 43.28  0
43.28 – 44.94  0
44.94 – 46.6  0
46.6 – 48.26  0
48.26 – 49.92  0
49.92 – 51.58  0
51.58 – 53.24  0
53.24 – 54.9  0
54.9 – 56.56  1
56.56 – 58.22  1
58.22 – 59.87  1
59.87 – 61.53  3
61.53 – 63.19  3
63.19 – 64.85  3
64.85 – 66.51  2
66.51 – 68.17  6
68.17 – 69.83  7
69.83 – 71.49  15
71.49 – 73.15  30
73.15 – 74.81  30
74.81 – 76.46  46
76.46 – 78.12  60
78.12 – 79.78  88
79.78 – 81.44  131
81.44 – 83.1  174
83.1 – 84.76  189
84.76 – 86.42  256
86.42 – 88.08  289
88.08 – 89.74  360
89.74 – 91.39  429
91.39 – 93.05  460
93.05 – 94.71  389
94.71 – 96.37  192
96.37 – 98.03  47
98.03 – 99.69  9

pct_bachelors_or_higher numeric feature

Percent of adults with a bachelor's degree or higher, almost certainly at the county or similar geographic level given n = 3,222 with no nulls. Values range from 0 to 78.87 with median 21.07 and mean 23.50, and the distribution is right-skewed (skew 1.36, kurtosis 2.31) with 141 outliers (4.4%) on the high end, consistent with a long tail of highly educated metros above the typical county.

Treatment: Consider a sqrt or log1p transform before linear modelling to tame the right skew; a plain log is undefined for the one zero-valued row.

anthropic:claude-opus-4-7 · confidence high
Out[91]:

saturn.columns["pct_bachelors_or_higher"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,982
min 0
max 78.87
mean 23.5
median 21.07
std 9.983
q1 16.59
q3 27.85
iqr 11.26
skew 1.357
kurtosis 2.306
n_outliers 141
outlier_rate 0.04376
zero_rate 0.0003104
Fig 34.
Distribution of pct_bachelors_or_higher. Vertical dash marks the median.
Histogram bins for pct_bachelors_or_higher (median: 21.07).
bin  count
0 – 1.972  1
1.972 – 3.944  0
3.944 – 5.915  4
5.915 – 7.887  9
7.887 – 9.859  32
9.859 – 11.83  135
11.83 – 13.8  169
13.8 – 15.77  317
15.77 – 17.75  328
17.75 – 19.72  376
19.72 – 21.69  345
21.69 – 23.66  262
23.66 – 25.63  232
25.63 – 27.6  189
27.6 – 29.58  123
29.58 – 31.55  116
31.55 – 33.52  118
33.52 – 35.49  96
35.49 – 37.46  60
37.46 – 39.44  68
39.44 – 41.41  40
41.41 – 43.38  34
43.38 – 45.35  34
45.35 – 47.32  24
47.32 – 49.29  21
49.29 – 51.27  19
51.27 – 53.24  15
53.24 – 55.21  10
55.21 – 57.18  11
57.18 – 59.15  10
59.15 – 61.12  9
61.12 – 63.1  6
63.1 – 65.07  5
65.07 – 67.04  1
67.04 – 69.01  0
69.01 – 70.98  1
70.98 – 72.95  0
72.95 – 74.93  0
74.93 – 76.9  1
76.9 – 78.87  1

disability_rate numeric feature

This is a numeric disability rate per record, ranging from 0 to 9.17 with a median of 1.07 and IQR of 0.65. The distribution is heavily right-skewed (skew 2.17, kurtosis 15.24) with 117 outliers (3.6%) and a small but non-trivial share of zeros (1.7%). Only 305 unique values across 3,222 rows suggest the rate is reported at coarse precision or aggregated to a small set of geographies.

Treatment: Log-transform (using log1p, since zeros are present) or winsorize before regression to tame the right tail.

anthropic:claude-opus-4-7 · confidence high
Out[94]:

saturn.columns["disability_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 305
min 0
max 9.17
mean 1.145
median 1.07
std 0.6215
q1 0.77
q3 1.42
iqr 0.65
skew 2.167
kurtosis 15.24
n_outliers 117
outlier_rate 0.03631
zero_rate 0.01676
alert: high_skew skew=+2.17
Fig 35.
Distribution of disability_rate. Vertical dash marks the median.
Histogram bins for disability_rate (median: 1.07).
bin  count
0 – 0.2293  114
0.2293 – 0.4585  143
0.4585 – 0.6878  352
0.6878 – 0.917  590
0.917 – 1.146  634
1.146 – 1.376  496
1.376 – 1.605  362
1.605 – 1.834  200
1.834 – 2.063  118
2.063 – 2.292  77
2.292 – 2.522  45
2.522 – 2.751  31
2.751 – 2.98  22
2.98 – 3.21  10
3.21 – 3.439  7
3.439 – 3.668  4
3.668 – 3.897  3
3.897 – 4.127  3
4.127 – 4.356  3
4.356 – 4.585  2
4.585 – 4.814  0
4.814 – 5.043  2
5.043 – 5.273  2
5.273 – 5.502  0
5.502 – 5.731  0
5.731 – 5.961  0
5.961 – 6.19  0
6.19 – 6.419  0
6.419 – 6.648  0
6.648 – 6.878  0
6.878 – 7.107  0
7.107 – 7.336  0
7.336 – 7.565  1
7.565 – 7.795  0
7.795 – 8.024  0
8.024 – 8.253  0
8.253 – 8.482  0
8.482 – 8.712  0
8.712 – 8.941  0
8.941 – 9.17  1

How to cite

BibTeX
@misc{saturn-merged-inequality-master-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: merged inequality master},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/merged-inequality_master}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: merged inequality master. Source: /home/coolhand/datasets/us-inequality-atlas/merged/inequality_master.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/merged-inequality_master