merged-inequality_master · saturn notebook

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/merged/inequality_master.csv

Saturn profiled 3,222 rows across 28 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:

!pip install saturn-dissect
import subprocess
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/us-inequality-atlas/merged/inequality_master.csv",
    "--findings", "merged-inequality_master.json",
    "--llm", "anthropic:claude-opus-4-7",
])

Summary confidence: high

This dataset profiles 3,222 U.S. counties and county equivalents across 28 columns of socioeconomic indicators, including poverty, rent burden, education, healthcare, and a composite inequality index. Two things stand out for closer inspection. First, rent_to_income_ratio shows extreme skew (53.98) with a max of 1200 against a median of 17.06; the histogram places a single row in the top bin (1170–1200) and every other non-null value below 66, which points to one anomalous record rather than a heavy tail. Second, total_pop is highly skewed (skew 13.36, max ~9.78M vs median 25,174), so any aggregate across counties should be population-weighted. The composite_index and the *_score columns are well-behaved and centered near 50, making them good candidates for cross-county comparison. Texas (254 counties), Georgia (159), and Virginia (133) contribute the most rows to the state distribution.

citing: rent_to_income_ratio · total_pop · composite_index · pct_poverty · state · pct_rent_burdened_30 · uninsured_rate
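
The advice to population-weight cross-county aggregates can be sketched as follows. This is a minimal illustration in plain Python: the function name and the three toy rows are hypothetical and are not taken from the dataset.

```python
def weighted_mean(values, weights):
    """Return the weight-averaged value, e.g. a county rate weighted by total_pop."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical toy rows: a poverty rate and a population for three made-up counties.
rates = [10.0, 20.0, 30.0]
pops = [1_000_000, 50_000, 50_000]

simple = sum(rates) / len(rates)        # every county counts equally
weighted = weighted_mean(rates, pops)   # every resident counts equally
```

The unweighted mean describes the typical county, while the weighted mean describes the typical resident; with populations this skewed the two can differ substantially.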

Out[4]:

saturn.schema() · 28 columns

column	kind	n	null%	unique	alerts
fips	numeric	3,222	0.0%	3,222
county_name	text	3,222	0.0%	3,222	near_unique
state	categorical	3,222	0.0%	52
total_pop	numeric	3,222	0.0%	3,173	high_skew outliers
composite_index	numeric	3,222	0.0%	650
economic_score	numeric	3,222	0.0%	908
education_score	numeric	3,222	0.0%	1,001
healthcare_score	numeric	3,222	0.0%	808
housing_score	numeric	3,222	0.0%	937
food_score	numeric	3,222	0.0%	941
disability_score	numeric	3,222	0.0%	1,001
poverty_rate	numeric	3,222	0.0%	1,719	high_skew
no_vehicle_pct	numeric	3,222	0.0%	1,065	high_skew
uninsured_rate	numeric	3,222	0.0%	152	high_skew outliers
hospital_closure_risk	numeric	3,222	0.0%	3
pct_rent_burdened_30	numeric	3,222	0.0%	2,146
pct_rent_burdened_50	numeric	3,222	0.0%	1,769
median_gross_rent	numeric	3,222	0.3%	983	outliers
rent_to_income_ratio	numeric	3,222	0.3%	1,269	high_skew
gini_index	numeric	3,222	0.0%	1,317
unemployment_rate	numeric	3,222	0.0%	950	high_skew
labor_force_participation	numeric	3,222	0.0%	1,944
pct_deep_poverty	numeric	3,222	0.0%	1,131	high_skew outliers
pct_poverty	numeric	3,222	0.0%	1,719	high_skew
pct_near_poverty	numeric	3,222	0.0%	1,237
pct_hs_or_higher	numeric	3,222	0.0%	1,612
pct_bachelors_or_higher	numeric	3,222	0.0%	1,982
disability_rate	numeric	3,222	0.0%	305	high_skew

Fig 1.

composite_index · Roughly symmetric spread from ~10 to ~90 around a median of 49.5 — a usable summary score for ranking counties.

Histogram bins for composite_index (median: 49.5).
bin	count
10.1 – 12.1	1
12.1 – 14.1	2
14.1 – 16.1	6
16.1 – 18.1	15
18.1 – 20.1	13
20.1 – 22.1	27
22.1 – 24.1	50
24.1 – 26.1	46
26.1 – 28.1	78
28.1 – 30.1	81
30.1 – 32.1	101
32.1 – 34.1	107
34.1 – 36.1	118
36.1 – 38.1	139
38.1 – 40.1	142
40.1 – 42.1	128
42.1 – 44.1	155
44.1 – 46.1	160
46.1 – 48.1	151
48.1 – 50.1	142
50.1 – 52.1	162
52.1 – 54.1	165
54.1 – 56.1	115
56.1 – 58.1	122
58.1 – 60.1	113
60.1 – 62.1	116
62.1 – 64.1	108
64.1 – 66.1	109
66.1 – 68.1	81
68.1 – 70.1	103
70.1 – 72.1	65
72.1 – 74.1	81
74.1 – 76.1	70
76.1 – 78.1	47
78.1 – 80.1	38
80.1 – 82.1	24
82.1 – 84.1	18
84.1 – 86.1	13
86.1 – 88.1	3
88.1 – 90.1	7

Fig 2.

pct_poverty · Right-skewed poverty rates (median 13.6%, max 66.3%) highlight a long tail of high-poverty counties.

Histogram bins for pct_poverty (median: 13.55).
bin	count
1.6 – 3.218	7
3.218 – 4.836	34
4.836 – 6.454	106
6.454 – 8.072	246
8.072 – 9.69	320
9.69 – 11.31	354
11.31 – 12.93	393
12.93 – 14.54	364
14.54 – 16.16	306
16.16 – 17.78	262
17.78 – 19.4	192
19.4 – 21.02	149
21.02 – 22.63	123
22.63 – 24.25	91
24.25 – 25.87	52
25.87 – 27.49	44
27.49 – 29.11	34
29.11 – 30.72	23
30.72 – 32.34	18
32.34 – 33.96	14
33.96 – 35.58	6
35.58 – 37.2	8
37.2 – 38.81	3
38.81 – 40.43	8
40.43 – 42.05	5
42.05 – 43.67	9
43.67 – 45.29	4
45.29 – 46.9	11
46.9 – 48.52	7
48.52 – 50.14	8
50.14 – 51.76	2
51.76 – 53.38	6
53.38 – 54.99	5
54.99 – 56.61	5
56.61 – 58.23	1
58.23 – 59.85	0
59.85 – 61.47	0
61.47 – 63.08	0
63.08 – 64.7	1
64.7 – 66.32	1

Fig 3.

rent_to_income_ratio · Watch for an extreme outlier: a single row in the 1170–1200 bin against a median of 17 suggests a data-quality issue worth filtering before analysis.

Histogram bins for rent_to_income_ratio (median: 17.06).
bin	count
6.1 – 35.95	3207
35.95 – 65.8	5
65.8 – 95.64	0
95.64 – 125.5	0
125.5 – 155.3	0
155.3 – 185.2	0
185.2 – 215	0
215 – 244.9	0
244.9 – 274.7	0
274.7 – 304.6	0
304.6 – 334.4	0
334.4 – 364.3	0
364.3 – 394.1	0
394.1 – 424	0
424 – 453.8	0
453.8 – 483.7	0
483.7 – 513.5	0
513.5 – 543.4	0
543.4 – 573.2	0
573.2 – 603.1	0
603.1 – 632.9	0
632.9 – 662.7	0
662.7 – 692.6	0
692.6 – 722.4	0
722.4 – 752.3	0
752.3 – 782.1	0
782.1 – 812	0
812 – 841.8	0
841.8 – 871.7	0
871.7 – 901.5	0
901.5 – 931.4	0
931.4 – 961.2	0
961.2 – 991.1	0
991.1 – 1021	0
1021 – 1051	0
1051 – 1081	0
1081 – 1110	0
1110 – 1140	0
1140 – 1170	0
1170 – 1200	1

Fig 4.

state · Coverage is national but Texas (254), Georgia, and Virginia contribute the most counties; weight comparisons accordingly.

Top values for state (20 unique shown, of 52 total).
value	count	share
TX	254	7.9%
GA	159	4.9%
VA	133	4.1%
KY	120	3.7%
MO	115	3.6%
KS	105	3.3%
IL	102	3.2%
NC	100	3.1%
IA	99	3.1%
TN	95	2.9%
NE	93	2.9%
IN	92	2.9%
OH	88	2.7%
MN	87	2.7%
MI	83	2.6%
MS	82	2.5%
PR	78	2.4%
OK	77	2.4%
AR	75	2.3%
WI	72	2.2%

Fig 5.

pct_rent_burdened_30 · Mildly left-skewed distribution (skew -0.57) centered around 37% shows rent burden is widespread, not just a tail phenomenon.

Histogram bins for pct_rent_burdened_30 (median: 37.36).
bin	count
0 – 1.624	9
1.624 – 3.248	5
3.248 – 4.872	3
4.872 – 6.496	5
6.496 – 8.12	9
8.12 – 9.744	13
9.744 – 11.37	11
11.37 – 12.99	16
12.99 – 14.62	26
14.62 – 16.24	19
16.24 – 17.86	35
17.86 – 19.49	43
19.49 – 21.11	52
21.11 – 22.74	52
22.74 – 24.36	73
24.36 – 25.98	99
25.98 – 27.61	109
27.61 – 29.23	116
29.23 – 30.86	132
30.86 – 32.48	159
32.48 – 34.1	189
34.1 – 35.73	209
35.73 – 37.35	227
37.35 – 38.98	239
38.98 – 40.6	205
40.6 – 42.22	209
42.22 – 43.85	210
43.85 – 45.47	190
45.47 – 47.1	131
47.1 – 48.72	114
48.72 – 50.34	118
50.34 – 51.97	69
51.97 – 53.59	51
53.59 – 55.22	34
55.22 – 56.84	24
56.84 – 58.46	6
58.46 – 60.09	3
60.09 – 61.71	2
61.71 – 63.34	3
63.34 – 64.96	3

Fig 6.

Per-column null rate across the corpus. Columns are ordered by input position.

Per-column null rate across the corpus.
column	kind	null %
fips	numeric	0.0%
county_name	text	0.0%
state	categorical	0.0%
total_pop	numeric	0.0%
composite_index	numeric	0.0%
economic_score	numeric	0.0%
education_score	numeric	0.0%
healthcare_score	numeric	0.0%
housing_score	numeric	0.0%
food_score	numeric	0.0%
disability_score	numeric	0.0%
poverty_rate	numeric	0.0%
no_vehicle_pct	numeric	0.0%
uninsured_rate	numeric	0.0%
hospital_closure_risk	numeric	0.0%
pct_rent_burdened_30	numeric	0.0%
pct_rent_burdened_50	numeric	0.0%
median_gross_rent	numeric	0.3%
rent_to_income_ratio	numeric	0.3%
gini_index	numeric	0.0%
unemployment_rate	numeric	0.0%
labor_force_participation	numeric	0.0%
pct_deep_poverty	numeric	0.0%
pct_poverty	numeric	0.0%
pct_near_poverty	numeric	0.0%
pct_hs_or_higher	numeric	0.0%
pct_bachelors_or_higher	numeric	0.0%
disability_rate	numeric	0.0%

Fig 7.

Pearson correlation across numeric columns (sampled, bounded).

Pearson correlation across 12 numeric columns (values rounded to 2 decimals).
	fips	total_pop	composite_index	economic_score	education_score	healthcare_score	housing_score	food_score	disability_score	poverty_rate	no_vehicle_pct	uninsured_rate
fips	+1.00	-0.07	+0.03	+0.01	-0.06	+0.30	-0.11	+0.02	-0.04	+0.16	+0.04	+0.01
total_pop	-0.07	+1.00	-0.06	+0.07	-0.29	-0.22	+0.29	-0.01	-0.09	-0.11	+0.09	-0.04
composite_index	+0.03	-0.06	+1.00	+0.84	+0.60	+0.44	+0.40	+0.85	+0.43	+0.76	+0.47	+0.13
economic_score	+0.01	+0.07	+0.84	+1.00	+0.28	+0.16	+0.48	+0.75	+0.17	+0.72	+0.41	-0.05
education_score	-0.06	-0.29	+0.60	+0.28	+1.00	+0.38	-0.19	+0.43	+0.27	+0.43	+0.19	+0.13
healthcare_score	+0.30	-0.22	+0.44	+0.16	+0.38	+1.00	-0.25	+0.22	+0.11	+0.30	+0.11	+0.57
housing_score	-0.11	+0.29	+0.40	+0.48	-0.19	-0.25	+1.00	+0.33	+0.03	+0.24	+0.19	-0.18
food_score	+0.02	-0.01	+0.85	+0.75	+0.43	+0.22	+0.33	+1.00	+0.23	+0.80	+0.64	+0.00
disability_score	-0.04	-0.09	+0.43	+0.17	+0.27	+0.11	+0.03	+0.23	+1.00	+0.15	+0.12	+0.06
poverty_rate	+0.16	-0.11	+0.76	+0.72	+0.43	+0.30	+0.24	+0.80	+0.15	+1.00	+0.45	-0.04
no_vehicle_pct	+0.04	+0.09	+0.47	+0.41	+0.19	+0.11	+0.19	+0.64	+0.12	+0.45	+1.00	+0.15
uninsured_rate	+0.01	-0.04	+0.13	-0.05	+0.13	+0.57	-0.18	+0.00	+0.06	-0.04	+0.15	+1.00
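
For readers who want to reproduce a single cell of the matrix, the textbook Pearson coefficient can be sketched as below. How Saturn samples or bounds its inputs is not stated here, so this is only the standard formula applied to hypothetical toy data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: a perfectly linear pair and its mirror image.
r_up = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson([1, 2, 3, 4], [8, 6, 4, 2])
```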

fips numeric identifier

This is the FIPS code identifying U.S. counties (or equivalents), with values spanning 1001 to 72153 and exactly one row per code (3222 unique out of 3222). The distribution is roughly symmetric (skew 0.16, kurtosis -0.63) with no nulls or outliers, consistent with a structured geographic key rather than a measured quantity. Treat the numeric stats as incidental—the magnitude has no analytic meaning.

Treatment: Cast to zero-padded string and use as a join key to county-level reference data.

anthropic:claude-opus-4-7 · confidence high
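
The suggested cast can be sketched in plain Python. County FIPS codes are five characters (a two-digit state prefix followed by a three-digit county code), so codes below 10000 lose their leading zero when stored as numbers. The helper name is hypothetical.

```python
def fips_to_key(code: int) -> str:
    """Format a numeric county FIPS code as a 5-character, zero-padded string."""
    if not 0 <= code <= 99_999:
        raise ValueError(f"not a 5-digit county FIPS code: {code}")
    return f"{code:05d}"


# The reported min, median, and max of the fips column.
keys = [fips_to_key(c) for c in (1001, 30022, 72153)]

# The first two characters are the state-level FIPS prefix.
state_prefixes = [k[:2] for k in keys]
```

Padding before joining matters because a reference table keyed on "01001" will not match the integer 1001 or the string "1001".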

Out[13]:

saturn.columns["fips"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,222
min	1,001
max	72,153
mean	3.138e+04
median	30,022
std	1.63e+04
q1	1.903e+04
q3	4.61e+04
iqr	27,075
skew	0.1574
kurtosis	-0.6314
n_outliers	0
outlier_rate	0
zero_rate	0

Fig 8.

Distribution of fips. Vertical dash marks the median.

Histogram bins for fips (median: 30022.0).
bin	count
1001 – 2780	97
2780 – 4559	15
4559 – 6337	133
6337 – 8116	59
8116 – 9895	14
9895 – 1.167e+04	4
1.167e+04 – 1.345e+04	226
1.345e+04 – 1.523e+04	5
1.523e+04 – 1.701e+04	49
1.701e+04 – 1.879e+04	189
1.879e+04 – 2.057e+04	204
2.057e+04 – 2.235e+04	184
2.235e+04 – 2.413e+04	39
2.413e+04 – 2.59e+04	15
2.59e+04 – 2.768e+04	170
2.768e+04 – 2.946e+04	196
2.946e+04 – 3.124e+04	150
3.124e+04 – 3.302e+04	27
3.302e+04 – 3.48e+04	21
3.48e+04 – 3.658e+04	95
3.658e+04 – 3.836e+04	153
3.836e+04 – 4.013e+04	155
4.013e+04 – 4.191e+04	46
4.191e+04 – 4.369e+04	67
4.369e+04 – 4.547e+04	51
4.547e+04 – 4.725e+04	161
4.725e+04 – 4.903e+04	268
4.903e+04 – 5.081e+04	29
5.081e+04 – 5.259e+04	133
5.259e+04 – 5.436e+04	94
5.436e+04 – 5.614e+04	95
5.614e+04 – 5.792e+04	0
5.792e+04 – 5.97e+04	0
5.97e+04 – 6.148e+04	0
6.148e+04 – 6.326e+04	0
6.326e+04 – 6.504e+04	0
6.504e+04 – 6.682e+04	0
6.682e+04 – 6.86e+04	0
6.86e+04 – 7.037e+04	0
7.037e+04 – 7.215e+04	78

county_name text identifier

This column appears to be a fully qualified US county name (e.g., 'X County, State'), with all 3222 values unique and zero nulls. The token 'county,' appears in 2999 of 3222 rows, suggesting ~223 entries use a different administrative suffix (parish, borough, census area). State-name token frequencies (Texas 256, Virginia 189, Georgia 159) are broadly consistent with the state column (TX 254, VA 133, GA 159); token counts can run higher than row counts because a state name also occurs inside other names, as 'Virginia' does in 'West Virginia'. Length is tightly bounded between 16 and 59 characters.

Treatment: Use as a join key to county-level reference tables; do not feed as a feature.

anthropic:claude-opus-4-7 · confidence high

Out[16]:

saturn.columns["county_name"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,222
len_min	16
len_max	59
len_mean	24.32
len_median	24
len_p95	31
word_mean	3.248
word_median	3
n_empty	0
n_duplicates	0
duplicate_rate	0
vocab_size	1,990
readability_flesch_mean	10.28
emoji_rate	0
url_rate	0
one_word_rate	0
allcaps_rate	0
boilerplate_rate	0
alert: near_unique	100.0% of rows are unique strings

Fig 9.

Character-length distribution for county_name.

Character-length distribution for county_name (mean: 24.32).
chars	count
16 – 17	26
17 – 18	72
18 – 19	121
19 – 20	190
20 – 21	264
21 – 22	407
22 – 24	420
24 – 25	363
25 – 26	320
26 – 27	240
27 – 28	231
28 – 29	152
29 – 30	139
30 – 31	165
31 – 32	41
32 – 33	28
33 – 34	16
34 – 35	10
35 – 36	5
36 – 38	0
38 – 39	1
39 – 40	1
40 – 41	0
41 – 42	1
42 – 43	1
43 – 44	0
44 – 45	2
45 – 46	0
46 – 47	1
47 – 48	1
48 – 49	0
49 – 50	0
50 – 51	0
51 – 53	0
53 – 54	2
54 – 55	1
55 – 56	0
56 – 57	0
57 – 58	0
58 – 59	1

state categorical feature

This is a US state code column with 52 distinct values, consistent with the 50 states plus DC and Puerto Rico (PR appears with 78 rows). The distribution is broad and close to uniform by entropy (entropy_ratio 0.932), with TX leading at just 254 of 3222 rows (7.88%), followed by GA, VA, KY, and MO. That pattern suggests one row per US county or similar geographic unit rather than a population-weighted sample. No nulls.

Treatment: Use as a categorical grouping key; one-hot or target-encode for modelling.

anthropic:claude-opus-4-7 · confidence high

Out[19]:

saturn.columns["state"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	52
top_value	TX
top_rate	0.07883
cardinality	52
entropy	5.314
entropy_ratio	0.9322
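
The reported figures are consistent with entropy_ratio being the Shannon entropy in bits divided by its maximum for the column's cardinality, log2(52). That definition is inferred from the numbers above, not from Saturn's documentation, so treat this as a sanity check under that assumption.

```python
import math

entropy_bits = 5.314   # reported entropy of the state column
cardinality = 52       # reported number of distinct state codes

# A column with every category equally frequent has entropy log2(cardinality).
max_entropy = math.log2(cardinality)

# A ratio near 1.0 means the categories are close to evenly represented.
ratio = entropy_bits / max_entropy
```

Under this reading the ratio comes out at about 0.932, in line with the reported 0.9322.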

Fig 10.

Top values for state.

Top values for state (20 unique shown, of 52 total).
value	count	share
TX	254	7.9%
GA	159	4.9%
VA	133	4.1%
KY	120	3.7%
MO	115	3.6%
KS	105	3.3%
IL	102	3.2%
NC	100	3.1%
IA	99	3.1%
TN	95	2.9%
NE	93	2.9%
IN	92	2.9%
OH	88	2.7%
MN	87	2.7%
MI	83	2.6%
MS	82	2.5%
PR	78	2.4%
OK	77	2.4%
AR	75	2.3%
WI	72	2.2%

total_pop numeric feature

This is a population count column with 3222 records and 3173 unique values, no nulls or zeros, ranging from 47 to 9,782,602. The distribution is extremely right-skewed (skew 13.36, kurtosis 297.59) with the mean (101,340) about four times the median (25,174), and 449 outliers (13.9%) sit beyond the IQR fence. The shape is consistent with US county- or municipality-level populations where a few large metros dominate.

Treatment: log-transform before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
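
A minimal sketch of the suggested log transform, applied to the summary values reported for total_pop. Base 10 is an arbitrary choice made here for readability, and the helper name is hypothetical.

```python
import math


def log10_pop(pop: int) -> float:
    """Base-10 log of a population count; counts must be positive."""
    if pop <= 0:
        raise ValueError("population must be positive for a log transform")
    return math.log10(pop)


# Summary values reported for total_pop.
reported = {"min": 47, "median": 25_174, "max": 9_782_602}
logged = {name: log10_pop(value) for name, value in reported.items()}

# On the raw scale the max is hundreds of times the median;
# on the log scale the gap is only a few units.
raw_ratio = reported["max"] / reported["median"]
log_gap = logged["max"] - logged["median"]
```

On the log10 scale the column spans roughly 1.7 to 7.0 with the median near 4.4, so the largest county no longer dwarfs the rest of the range.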

Out[22]:

saturn.columns["total_pop"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,173
min	47
max	9.783e+06
mean	1.013e+05
median	25,174
std	3.246e+05
q1	1.059e+04
q3	6.501e+04
iqr	5.442e+04
skew	13.36
kurtosis	297.6
n_outliers	449
outlier_rate	0.1394
zero_rate	0
alert: high_skew	skew=+13.36
alert: outliers	13.9% rows beyond 1.5 IQR
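
The "beyond 1.5 IQR" alert can be reproduced from the quartiles in the table. The sketch below uses the rounded q1 and q3 values shown above, so the fences are approximate.

```python
# Rounded quartiles reported for total_pop.
q1 = 1.059e4
q3 = 6.501e4
iqr = q3 - q1   # 5.442e4, matching the reported iqr

# Standard 1.5 * IQR fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value: float) -> bool:
    """True when a value falls outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence
```

The lower fence is negative while the column minimum is 47, so every flagged row lies on the high side: counties above roughly 147,000 residents.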

Fig 11.

Distribution of total_pop. Vertical dash marks the median.

Histogram bins for total_pop (median: 25174.0).
bin	count
47 – 2.446e+05	2942
2.446e+05 – 4.892e+05	137
4.892e+05 – 7.337e+05	57
7.337e+05 – 9.783e+05	39
9.783e+05 – 1.223e+06	12
1.223e+06 – 1.467e+06	9
1.467e+06 – 1.712e+06	7
1.712e+06 – 1.957e+06	3
1.957e+06 – 2.201e+06	3
2.201e+06 – 2.446e+06	4
2.446e+06 – 2.69e+06	3
2.69e+06 – 2.935e+06	0
2.935e+06 – 3.179e+06	1
3.179e+06 – 3.424e+06	1
3.424e+06 – 3.669e+06	0
3.669e+06 – 3.913e+06	0
3.913e+06 – 4.158e+06	0
4.158e+06 – 4.402e+06	1
4.402e+06 – 4.647e+06	0
4.647e+06 – 4.891e+06	1
4.891e+06 – 5.136e+06	0
5.136e+06 – 5.38e+06	1
5.38e+06 – 5.625e+06	0
5.625e+06 – 5.87e+06	0
5.87e+06 – 6.114e+06	0
6.114e+06 – 6.359e+06	0
6.359e+06 – 6.603e+06	0
6.603e+06 – 6.848e+06	0
6.848e+06 – 7.092e+06	0
7.092e+06 – 7.337e+06	0
7.337e+06 – 7.582e+06	0
7.582e+06 – 7.826e+06	0
7.826e+06 – 8.071e+06	0
8.071e+06 – 8.315e+06	0
8.315e+06 – 8.56e+06	0
8.56e+06 – 8.804e+06	0
8.804e+06 – 9.049e+06	0
9.049e+06 – 9.293e+06	0
9.293e+06 – 9.538e+06	0
9.538e+06 – 9.783e+06	1

composite_index numeric feature

A numeric composite_index spanning 10.1 to 90.1 with mean 49.99 and median 49.5, suggesting a deliberately scaled or normalized index centered near 50. The distribution is nearly symmetric (skew 0.13) and slightly platykurtic (kurtosis -0.67), with no nulls, no zeros, and no outliers flagged. Only 650 unique values across 3222 rows point to rounding to one decimal rather than continuous measurement.

Treatment: Use as-is for modelling; already well-scaled and clean, no transform needed.

anthropic:claude-opus-4-7 · confidence high

Out[25]:

saturn.columns["composite_index"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	650
min	10.1
max	90.1
mean	49.99
median	49.5
std	15.29
q1	38.4
q3	61.5
iqr	23.1
skew	0.1295
kurtosis	-0.6661
n_outliers	0
outlier_rate	0
zero_rate	0

Fig 12.

Distribution of composite_index. Vertical dash marks the median.

Histogram bins for composite_index (median: 49.5).
bin	count
10.1 – 12.1	1
12.1 – 14.1	2
14.1 – 16.1	6
16.1 – 18.1	15
18.1 – 20.1	13
20.1 – 22.1	27
22.1 – 24.1	50
24.1 – 26.1	46
26.1 – 28.1	78
28.1 – 30.1	81
30.1 – 32.1	101
32.1 – 34.1	107
34.1 – 36.1	118
36.1 – 38.1	139
38.1 – 40.1	142
40.1 – 42.1	128
42.1 – 44.1	155
44.1 – 46.1	160
46.1 – 48.1	151
48.1 – 50.1	142
50.1 – 52.1	162
52.1 – 54.1	165
54.1 – 56.1	115
56.1 – 58.1	122
58.1 – 60.1	113
60.1 – 62.1	116
62.1 – 64.1	108
64.1 – 66.1	109
66.1 – 68.1	81
68.1 – 70.1	103
70.1 – 72.1	65
72.1 – 74.1	81
74.1 – 76.1	70
76.1 – 78.1	47
78.1 – 80.1	38
80.1 – 82.1	24
82.1 – 84.1	18
84.1 – 86.1	13
86.1 – 88.1	3
88.1 – 90.1	7

economic_score numeric feature

A bounded numeric feature ranging from 0.3 to 99.9 with mean 50.00 and median 49.6, consistent with a 0-100 economic index or score. The distribution is nearly symmetric (skew 0.084) and platykurtic (kurtosis -0.826), with no nulls, no zeros, and no outliers flagged across 3222 rows. With 908 unique values and an IQR of 35.47, the spread is wide and uniform-leaning rather than concentrated.

Treatment: Use as-is or min-max scale to [0,1]; no transformation needed given symmetric bounded distribution.

anthropic:claude-opus-4-7 · confidence high

Out[28]:

saturn.columns["economic_score"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	908
min	0.3
max	99.9
mean	50
median	49.6
std	23.15
q1	32.2
q3	67.67
iqr	35.47
skew	0.084
kurtosis	-0.8261
n_outliers	0
outlier_rate	0
zero_rate	0

Fig 13.

Distribution of economic_score. Vertical dash marks the median.

Histogram bins for economic_score (median: 49.6).
bin	count
0.3 – 2.79	11
2.79 – 5.28	19
5.28 – 7.77	37
7.77 – 10.26	41
10.26 – 12.75	41
12.75 – 15.24	59
15.24 – 17.73	60
17.73 – 20.22	95
20.22 – 22.71	78
22.71 – 25.2	99
25.2 – 27.69	91
27.69 – 30.18	97
30.18 – 32.67	94
32.67 – 35.16	119
35.16 – 37.65	100
37.65 – 40.14	140
40.14 – 42.63	110
42.63 – 45.12	127
45.12 – 47.61	106
47.61 – 50.1	114
50.1 – 52.59	133
52.59 – 55.08	110
55.08 – 57.57	124
57.57 – 60.06	108
60.06 – 62.55	114
62.55 – 65.04	90
65.04 – 67.53	93
67.53 – 70.02	101
70.02 – 72.51	99
72.51 – 75	74
75 – 77.49	81
77.49 – 79.98	68
79.98 – 82.47	62
82.47 – 84.96	71
84.96 – 87.45	57
87.45 – 89.94	57
89.94 – 92.43	36
92.43 – 94.92	55
94.92 – 97.41	27
97.41 – 99.9	24

education_score numeric feature

This column is a numeric education score bounded between 0 and 100 with a perfectly symmetric distribution (mean and median both 50.0, skew effectively zero). The negative kurtosis of -1.20 and IQR spanning exactly 25 to 75 suggest a near-uniform spread rather than a bell curve, which is unusual for a real-world score and hints at synthetic or rank-transformed data. With 1001 unique values across 3222 rows, no nulls, and no outliers, the column is clean but suspiciously well-behaved.

Treatment: Use as-is or scale to [0,1]; verify it isn't a synthetic/rank feature before modelling.

anthropic:claude-opus-4-7 · confidence high
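
One way to probe the rank-transformed reading is to compare the reported moments with those of an evenly spaced grid from 0 to 100 in steps of 0.1, which has exactly 1,001 distinct values. The grid is an assumption chosen for illustration, and Saturn's estimators may apply corrections this sketch omits.

```python
import math


def population_std(values):
    """Standard deviation using the population (divide-by-n) formula."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


# Hypothetical reference: 0.0, 0.1, ..., 100.0, each value occurring once.
grid = [i / 10 for i in range(1001)]

grid_std = population_std(grid)
grid_kurtosis = excess_kurtosis(grid)
```

The grid yields a standard deviation near 28.9 and an excess kurtosis near -1.2, close to the reported 28.88 and -1.2, which supports the percentile-rank interpretation without proving it.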

Out[31]:

saturn.columns["education_score"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,001
min	0
max	100
mean	50
median	50
std	28.88
q1	25
q3	75
iqr	50
skew	1.2e-17
kurtosis	-1.2
n_outliers	0
outlier_rate	0
zero_rate	0.0006207

Fig 14.

Distribution of education_score. Vertical dash marks the median.

Histogram bins for education_score (median: 50.0).
bin	count
0 – 2.5	79
2.5 – 5	81
5 – 7.5	80
7.5 – 10	81
10 – 12.5	81
12.5 – 15	80
15 – 17.5	81
17.5 – 20	80
20 – 22.5	81
22.5 – 25	80
25 – 27.5	81
27.5 – 30	80
30 – 32.5	81
32.5 – 35	80
35 – 37.5	81
37.5 – 40	80
40 – 42.5	81
42.5 – 45	80
45 – 47.5	81
47.5 – 50	80
50 – 52.5	81
52.5 – 55	80
55 – 57.5	81
57.5 – 60	80
60 – 62.5	81
62.5 – 65	81
65 – 67.5	80
67.5 – 70	81
70 – 72.5	80
72.5 – 75	81
75 – 77.5	80
77.5 – 80	81
80 – 82.5	80
82.5 – 85	81
85 – 87.5	80
87.5 – 90	81
90 – 92.5	80
92.5 – 95	81
95 – 97.5	80
97.5 – 100	83

healthcare_score numeric feature

A continuous healthcare quality or performance score for 3222 rows, ranging from 4.3 to 98.2 with mean 50.0 and median 48.6. The distribution is mildly right-skewed (0.24) with negative kurtosis (-0.75), suggesting a broad, near-uniform spread rather than a tight bell, and no outliers were flagged. With 808 unique values, no nulls, and no zeros, the column looks clean and ready to use.

Treatment: Use as-is as a numeric feature; standardize if combining with other scaled features.

anthropic:claude-opus-4-7 · confidence high

Out[34]:

saturn.columns["healthcare_score"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	808
min	4.3
max	98.2
mean	50
median	48.6
std	20.19
q1	33.9
q3	64.57
iqr	30.67
skew	0.2381
kurtosis	-0.7521
n_outliers	0
outlier_rate	0
zero_rate	0

Fig 15.

Distribution of healthcare_score. Vertical dash marks the median.

Histogram bins for healthcare_score (median: 48.6).
bin	count
4.3 – 6.647	3
6.647 – 8.995	2
8.995 – 11.34	5
11.34 – 13.69	12
13.69 – 16.04	43
16.04 – 18.39	72
18.39 – 20.73	76
20.73 – 23.08	83
23.08 – 25.43	100
25.43 – 27.78	90
27.78 – 30.12	121
30.12 – 32.47	120
32.47 – 34.82	135
34.82 – 37.16	115
37.16 – 39.51	103
39.51 – 41.86	109
41.86 – 44.21	146
44.21 – 46.55	150
46.55 – 48.9	150
48.9 – 51.25	125
51.25 – 53.6	128
53.6 – 55.95	128
55.95 – 58.29	148
58.29 – 60.64	102
60.64 – 62.99	91
62.99 – 65.34	96
65.34 – 67.68	70
67.68 – 70.03	93
70.03 – 72.38	73
72.38 – 74.73	79
74.73 – 77.07	71
77.07 – 79.42	66
79.42 – 81.77	70
81.77 – 84.11	72
84.11 – 86.46	40
86.46 – 88.81	40
88.81 – 91.16	33
91.16 – 93.51	23
93.51 – 95.85	20
95.85 – 98.2	19

housing_score numeric feature

A continuous housing_score ranging from 0.0 to 99.9 with mean 49.93 and median 49.85, suggesting a 0-100 index. The distribution is nearly symmetric (skew 0.01) and platykurtic (kurtosis -0.88), with a wide IQR of 37.98 and no detected outliers, consistent with a near-uniform spread rather than a peaked score. No nulls and only one zero across 3222 rows.

Treatment: Use as-is or min-max scale to [0,1]; no transform needed given symmetry and absence of outliers.

anthropic:claude-opus-4-7 · confidence high

Out[37]:

saturn.columns["housing_score"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	937
min	0
max	99.9
mean	49.93
median	49.85
std	24.47
q1	30.73
q3	68.7
iqr	37.98
skew	0.01353
kurtosis	-0.8807
n_outliers	0
outlier_rate	0
zero_rate	0.0003104

Fig 16.

Distribution of housing_score. Vertical dash marks the median.

Histogram bins for housing_score (median: 49.85).
bin	count
0 – 2.498	34
2.498 – 4.995	39
4.995 – 7.492	52
7.492 – 9.99	48
9.99 – 12.49	45
12.49 – 14.98	42
14.98 – 17.48	78
17.48 – 19.98	78
19.98 – 22.48	73
22.48 – 24.98	99
24.98 – 27.47	94
27.47 – 29.97	99
29.97 – 32.47	90
32.47 – 34.97	87
34.97 – 37.46	96
37.46 – 39.96	117
39.96 – 42.46	107
42.46 – 44.95	116
44.95 – 47.45	102
47.45 – 49.95	121
49.95 – 52.45	142
52.45 – 54.95	107
54.95 – 57.44	103
57.44 – 59.94	99
59.94 – 62.44	105
62.44 – 64.94	85
64.94 – 67.43	102
67.43 – 69.93	101
69.93 – 72.43	79
72.43 – 74.92	81
74.92 – 77.42	94
77.42 – 79.92	80
79.92 – 82.42	70
82.42 – 84.92	59
84.92 – 87.41	61
87.41 – 89.91	60
89.91 – 92.41	61
92.41 – 94.91	53
94.91 – 97.4	47
97.4 – 99.9	16

food_score numeric feature

A numeric feature called food_score that ranges from 0.1 to 99.5 with mean 49.9997 and median 50.0, suggesting a percentile-style or normalised rating bounded near [0,100]. The distribution is essentially symmetric (skew 0.029) and platykurtic (kurtosis -0.96), with no nulls, no zeros, and no outliers across 3222 rows — consistent with a synthetic or uniformly distributed score rather than an organic measurement.

Treatment: Use as-is; already on a bounded 0–100 scale with no transformation needed.

anthropic:claude-opus-4-7 · confidence high

Out[40]:

saturn.columns["food_score"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	941
min	0.1
max	99.5
mean	50
median	50
std	25.48
q1	29.6
q3	69.8
iqr	40.2
skew	0.02926
kurtosis	-0.9648
n_outliers	0
outlier_rate	0
zero_rate	0

Fig 17.

Distribution of food_score. Vertical dash marks the median.

Histogram bins for food_score (median: 50.0).
bin	count
0.1 – 2.585	23
2.585 – 5.07	43
5.07 – 7.555	61
7.555 – 10.04	62
10.04 – 12.53	63
12.53 – 15.01	74
15.01 – 17.5	64
17.5 – 19.98	82
19.98 – 22.47	75
22.47 – 24.95	86
24.95 – 27.44	85
27.44 – 29.92	106
29.92 – 32.41	92
32.41 – 34.89	99
34.89 – 37.38	96
37.38 – 39.86	97
39.86 – 42.35	102
42.35 – 44.83	90
44.83 – 47.32	84
47.32 – 49.8	121
49.8 – 52.29	94
52.29 – 54.77	117
54.77 – 57.26	102
57.26 – 59.74	93
59.74 – 62.23	101
62.23 – 64.71	92
64.71 – 67.2	97
67.2 – 69.68	107
69.68 – 72.17	89
72.17 – 74.65	90
74.65 – 77.14	83
77.14 – 79.62	76
79.62 – 82.11	78
82.11 – 84.59	54
84.59 – 87.08	71
87.08 – 89.56	56
89.56 – 92.05	43
92.05 – 94.53	47
94.53 – 97.02	69
97.02 – 99.5	58

disability_score numeric feature

A numeric disability score bounded between 0 and 100 with mean and median both exactly 50.0 and zero skew, indicating a perfectly symmetric distribution. The negative kurtosis (-1.20) and IQR spanning 25 to 75 suggest a near-uniform spread rather than a bell curve, which is unusual for a real-world severity metric and hints at synthetic or rank-based generation. No nulls and no outliers across 3222 rows with 1001 distinct values.

Treatment: use as-is or bin into quartiles; no transformation needed given symmetric bounded range.

anthropic:claude-opus-4-7 · confidence high

Out[43]:

saturn.columns["disability_score"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,001
min	0
max	100
mean	50
median	50
std	28.88
q1	25
q3	75
iqr	50
skew	1.2e-17
kurtosis	-1.2
n_outliers	0
outlier_rate	0
zero_rate	0.0006207

Fig 18.

Distribution of disability_score. Vertical dash marks the median.

Histogram bins for disability_score (median: 50.0).
bin	count
0 – 2.5	79
2.5 – 5	81
5 – 7.5	80
7.5 – 10	81
10 – 12.5	81
12.5 – 15	80
15 – 17.5	81
17.5 – 20	80
20 – 22.5	81
22.5 – 25	80
25 – 27.5	81
27.5 – 30	80
30 – 32.5	81
32.5 – 35	80
35 – 37.5	81
37.5 – 40	80
40 – 42.5	81
42.5 – 45	80
45 – 47.5	81
47.5 – 50	80
50 – 52.5	81
52.5 – 55	80
55 – 57.5	81
57.5 – 60	80
60 – 62.5	81
62.5 – 65	81
65 – 67.5	80
67.5 – 70	81
70 – 72.5	80
72.5 – 75	81
75 – 77.5	80
77.5 – 80	81
80 – 82.5	80
82.5 – 85	81
85 – 87.5	80
87.5 – 90	81
90 – 92.5	80
92.5 – 95	81
95 – 97.5	80
97.5 – 100	83

poverty_rate numeric feature

Numeric poverty rate (likely percent of population below the poverty line) across 3,222 rows with no nulls and 1,719 distinct values. The distribution is right-skewed (skew 2.10, kurtosis 6.89): median is 13.55 and Q3 is 17.91, but the max reaches 66.32, producing 137 outliers (4.25%). Minimum is 1.6 and there are no zeros, consistent with a county- or area-level rate rather than individual records.

Treatment: Consider a log or winsorizing transform before regression to tame the right tail.

anthropic:claude-opus-4-7 · confidence high
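
Winsorizing, one of the two options named in the treatment, clips extreme values to chosen quantiles instead of dropping them. The sketch below uses a simple nearest-rank quantile; the function name, the quantile levels, and the toy sample are all hypothetical (only 1.6 and 66.3 echo the reported min and max).

```python
def winsorize(values, lower_q=0.01, upper_q=0.99):
    """Clip values to the given lower and upper quantiles (nearest-rank)."""
    if not values:
        return []
    ordered = sorted(values)
    last = len(ordered) - 1
    lo = ordered[round(lower_q * last)]
    hi = ordered[round(upper_q * last)]
    # Values inside [lo, hi] pass through unchanged; the rest are pulled in.
    return [min(max(v, lo), hi) for v in values]


# Hypothetical toy sample of poverty rates with one extreme value at each end.
sample = [1.6, 8.0, 10.2, 13.5, 13.6, 15.0, 17.9, 21.0, 30.0, 66.3]
clipped = winsorize(sample, lower_q=0.1, upper_q=0.9)
```

Unlike filtering, this keeps every row in the dataset, which matters when each row is a county that should remain in the analysis.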

Out[46]:

saturn.columns["poverty_rate"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,719
min	1.6
max	66.32
mean	15.1
median	13.55
std	7.706
q1	10.16
q3	17.91
iqr	7.75
skew	2.096
kurtosis	6.891
n_outliers	137
outlier_rate	0.04252
zero_rate	0
alert: high_skew	skew=+2.10

Fig 19.

Distribution of poverty_rate. Vertical dash marks the median.

Histogram bins for poverty_rate (median: 13.55).
bin	count
1.6 – 3.218	7
3.218 – 4.836	34
4.836 – 6.454	106
6.454 – 8.072	246
8.072 – 9.69	320
9.69 – 11.31	354
11.31 – 12.93	393
12.93 – 14.54	364
14.54 – 16.16	306
16.16 – 17.78	262
17.78 – 19.4	192
19.4 – 21.02	149
21.02 – 22.63	123
22.63 – 24.25	91
24.25 – 25.87	52
25.87 – 27.49	44
27.49 – 29.11	34
29.11 – 30.72	23
30.72 – 32.34	18
32.34 – 33.96	14
33.96 – 35.58	6
35.58 – 37.2	8
37.2 – 38.81	3
38.81 – 40.43	8
40.43 – 42.05	5
42.05 – 43.67	9
43.67 – 45.29	4
45.29 – 46.9	11
46.9 – 48.52	7
48.52 – 50.14	8
50.14 – 51.76	2
51.76 – 53.38	6
53.38 – 54.99	5
54.99 – 56.61	5
56.61 – 58.23	1
58.23 – 59.85	0
59.85 – 61.47	0
61.47 – 63.08	0
63.08 – 64.7	1
64.7 – 66.32	1

no_vehicle_pct numeric feature

Percentage of households with no vehicle, reported per row (likely a geographic unit like county or tract). The distribution is tightly clustered with a median of 5.41 and IQR of 3.38, but a long right tail pushes the max to 85.94, yielding skew of 6.98 and kurtosis of 86.23. About 4.3% of rows are flagged as outliers, and 0.37% are exact zeros; no nulls.

Treatment: Log1p-transform or winsorize before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
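
Because this column contains exact zeros, a plain logarithm is undefined for some rows; log1p, which computes log(1 + x), maps zero to zero. A minimal sketch using the reported summary values follows, with a hypothetical helper name.

```python
import math


def log1p_pct(pct: float) -> float:
    """log(1 + pct); defined at zero, unlike a plain logarithm."""
    if pct < 0:
        raise ValueError("percentages must be non-negative")
    return math.log1p(pct)


# Summary values reported for no_vehicle_pct.
reported = {"min": 0.0, "median": 5.41, "max": 85.94}
transformed = {name: log1p_pct(value) for name, value in reported.items()}
```

After the transform the range runs from 0 to about 4.5 with the median near 1.9, so the long right tail is compressed while zero-valued rows are kept.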

Out[49]:

saturn.columns["no_vehicle_pct"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,065
min	0
max	85.94
mean	6.197
median	5.41
std	4.538
q1	3.98
q3	7.36
iqr	3.38
skew	6.976
kurtosis	86.23
n_outliers	140
outlier_rate	0.04345
zero_rate	0.003724
alert: high_skew	skew=+6.98

Fig 20.

Distribution of no_vehicle_pct. Vertical dash marks the median.

Histogram bins for no_vehicle_pct (median: 5.41).
bin	count
0 – 2.148	161
2.148 – 4.297	823
4.297 – 6.445	1091
6.445 – 8.594	630
8.594 – 10.74	283
10.74 – 12.89	111
12.89 – 15.04	61
15.04 – 17.19	23
17.19 – 19.34	8
19.34 – 21.48	3
21.48 – 23.63	4
23.63 – 25.78	2
25.78 – 27.93	3
27.93 – 30.08	2
30.08 – 32.23	2
32.23 – 34.38	2
34.38 – 36.52	2
36.52 – 38.67	2
38.67 – 40.82	0
40.82 – 42.97	1
42.97 – 45.12	0
45.12 – 47.27	1
47.27 – 49.42	0
49.42 – 51.56	0
51.56 – 53.71	0
53.71 – 55.86	1
55.86 – 58.01	1
58.01 – 60.16	0
60.16 – 62.31	2
62.31 – 64.45	0
64.45 – 66.6	1
66.6 – 68.75	0
68.75 – 70.9	0
70.9 – 73.05	0
73.05 – 75.2	0
75.2 – 77.35	0
77.35 – 79.49	1
79.49 – 81.64	0
81.64 – 83.79	0
83.79 – 85.94	1

uninsured_rate numeric feature

Likely a per-record uninsured rate (probably the proportion of the population without insurance), ranging from 0.0 to 3.7 with a median of 0.12 and IQR of 0.21. The distribution is heavily right-skewed (skew 4.10, kurtosis 27.7) with 230 outliers (7.1%) and 17.5% exact zeros. The max of 3.7 is implausible for a proportion, and the histogram places at least 73 rows above 1.0, which suggests mixed units or systematic data-entry errors rather than a single stray value.

Treatment: Investigate values >1 for unit errors, then winsorize or log1p-transform before modelling.

anthropic:claude-opus-4-7 · confidence high
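
The first step of the treatment, isolating values above 1, can be sketched as below. It assumes the column is meant to be a proportion in [0, 1], which the source treats as likely but does not confirm; the function name and sample values are hypothetical.

```python
def split_by_plausibility(rates, upper=1.0):
    """Separate values inside [0, upper] from those needing a unit check."""
    plausible = [r for r in rates if 0.0 <= r <= upper]
    suspect = [r for r in rates if not 0.0 <= r <= upper]
    return plausible, suspect


# Hypothetical toy sample mixing in-range values with two values above 1.
sample = [0.0, 0.04, 0.12, 0.25, 1.2, 3.7]
plausible, suspect = split_by_plausibility(sample)
```

Reviewing the suspect rows alongside their county identifiers before transforming anything helps distinguish a unit mix-up from a genuine entry error.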

Out[52]:

saturn.columns["uninsured_rate"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	152
min	0
max	3.7
mean	0.2002
median	0.12
std	0.2829
q1	0.04
q3	0.25
iqr	0.21
skew	4.095
kurtosis	27.7
n_outliers	230
outlier_rate	0.07138
zero_rate	0.1754
alert: high_skew	skew=+4.10
alert: outliers	7.1% rows beyond 1.5 IQR

Fig 21.

Distribution of uninsured_rate. Vertical dash marks the median.

Histogram bins for uninsured_rate (median: 0.12).
bin	count
0 – 0.0925	1403
0.0925 – 0.185	704
0.185 – 0.2775	403
0.2775 – 0.37	213
0.37 – 0.4625	158
0.4625 – 0.555	101
0.555 – 0.6475	65
0.6475 – 0.74	43
0.74 – 0.8325	27
0.8325 – 0.925	23
0.925 – 1.018	9
1.018 – 1.11	15
1.11 – 1.202	14
1.202 – 1.295	5
1.295 – 1.387	7
1.387 – 1.48	7
1.48 – 1.573	5
1.573 – 1.665	2
1.665 – 1.758	4
1.758 – 1.85	1
1.85 – 1.942	1
1.942 – 2.035	1
2.035 – 2.127	2
2.127 – 2.22	2
2.22 – 2.312	1
2.312 – 2.405	0
2.405 – 2.498	0
2.498 – 2.59	1
2.59 – 2.683	0
2.683 – 2.775	1
2.775 – 2.868	0
2.868 – 2.96	1
2.96 – 3.052	1
3.052 – 3.145	0
3.145 – 3.237	1
3.237 – 3.33	0
3.33 – 3.422	0
3.422 – 3.515	0
3.515 – 3.607	0
3.607 – 3.7	1

hospital_closure_risk numeric feature

A coarse risk score for hospital closure taking only 3 distinct values across 3222 rows, bounded between 0.0 and 50.0 with a median of 25.0. Despite being stored as numeric, the column behaves categorically: 28.8% of rows are zero, the quartiles collapse to 0.0 and 25.0, and the histogram shows exactly three occupied buckets at 0, 25, and 50 (929, 1,790, and 503 rows). No outliers and no nulls.

Treatment: Treat as an ordinal category with three levels rather than a continuous variable.

anthropic:claude-opus-4-7 · confidence high
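
One way to apply this treatment is to replace each value with its rank among the distinct levels. A minimal sketch, assuming the only levels are the three seen in the histogram; the helper name `to_ordinal` is hypothetical:

```python
def to_ordinal(values):
    """Map each distinct value to its rank among the sorted distinct values."""
    levels = sorted(set(values))
    rank = {level: code for code, level in enumerate(levels)}
    return [rank[v] for v in values], levels

# Toy values using the three buckets suggested by the histogram.
codes, levels = to_ordinal([0, 25, 50, 25, 0])
print(codes)   # [0, 1, 2, 1, 0]
print(levels)  # [0, 25, 50]
```

In pandas the same idea can be expressed with `pd.Categorical(values, categories=[0, 25, 50], ordered=True)`, which keeps the ordering explicit without inventing numeric distances between levels.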

Out[55]:

saturn.columns["hospital_closure_risk"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3
min	0
max	50
mean	21.69
median	25
std	16.34
q1	0
q3	25
iqr	25
skew	0.1414
kurtosis	-0.6949
n_outliers	0
outlier_rate	0
zero_rate	0.2883

Fig 22.

Distribution of hospital_closure_risk. Vertical dash marks the median.

Histogram bins for hospital_closure_risk (median: 25.0).
bin	count
0 – 1.25	929
1.25 – 2.5	0
2.5 – 3.75	0
3.75 – 5	0
5 – 6.25	0
6.25 – 7.5	0
7.5 – 8.75	0
8.75 – 10	0
10 – 11.25	0
11.25 – 12.5	0
12.5 – 13.75	0
13.75 – 15	0
15 – 16.25	0
16.25 – 17.5	0
17.5 – 18.75	0
18.75 – 20	0
20 – 21.25	0
21.25 – 22.5	0
22.5 – 23.75	0
23.75 – 25	0
25 – 26.25	1790
26.25 – 27.5	0
27.5 – 28.75	0
28.75 – 30	0
30 – 31.25	0
31.25 – 32.5	0
32.5 – 33.75	0
33.75 – 35	0
35 – 36.25	0
36.25 – 37.5	0
37.5 – 38.75	0
38.75 – 40	0
40 – 41.25	0
41.25 – 42.5	0
42.5 – 43.75	0
43.75 – 45	0
45 – 46.25	0
46.25 – 47.5	0
47.5 – 48.75	0
48.75 – 50	503

pct_rent_burdened_30 numeric feature

This appears to be the percentage of renter households spending at least 30% of income on rent, reported per county. Values span 0 to 64.96 with a median of 37.36 and quartiles at 30.67 and 43.48, so most counties cluster in the 30–45% range with a mild left skew (-0.57). About 0.25% of rows (8) are exact zeros, and 58 outliers (1.8%) sit outside the whiskers, nearly all on the low side according to the histogram; both groups are worth checking for small-population counties.

Treatment: Use as-is for modelling; optionally winsorize the 58 outliers and verify zero-valued rows.

anthropic:claude-opus-4-7 · confidence high

Out[58]:

saturn.columns["pct_rent_burdened_30"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	2,146
min	0
max	64.96
mean	36.44
median	37.36
std	10.01
q1	30.67
q3	43.48
iqr	12.81
skew	-0.5673
kurtosis	0.5032
n_outliers	58
outlier_rate	0.018
zero_rate	0.002483

Fig 23.

Distribution of pct_rent_burdened_30. Vertical dash marks the median.

Histogram bins for pct_rent_burdened_30 (median: 37.36).
bin	count
0 – 1.624	9
1.624 – 3.248	5
3.248 – 4.872	3
4.872 – 6.496	5
6.496 – 8.12	9
8.12 – 9.744	13
9.744 – 11.37	11
11.37 – 12.99	16
12.99 – 14.62	26
14.62 – 16.24	19
16.24 – 17.86	35
17.86 – 19.49	43
19.49 – 21.11	52
21.11 – 22.74	52
22.74 – 24.36	73
24.36 – 25.98	99
25.98 – 27.61	109
27.61 – 29.23	116
29.23 – 30.86	132
30.86 – 32.48	159
32.48 – 34.1	189
34.1 – 35.73	209
35.73 – 37.35	227
37.35 – 38.98	239
38.98 – 40.6	205
40.6 – 42.22	209
42.22 – 43.85	210
43.85 – 45.47	190
45.47 – 47.1	131
47.1 – 48.72	114
48.72 – 50.34	118
50.34 – 51.97	69
51.97 – 53.59	51
53.59 – 55.22	34
55.22 – 56.84	24
56.84 – 58.46	6
58.46 – 60.09	3
60.09 – 61.71	2
61.71 – 63.34	3
63.34 – 64.96	3

pct_rent_burdened_50 numeric feature

This column reports the percentage of households that are severely rent-burdened (spending 50% or more of income on rent), with values ranging from 0.0 to 64.96 and a mean of 17.35 close to the median of 17.62. The distribution is nearly symmetric (skew 0.054) and close to normal in shape, with only 47 outliers (1.46%) and a small zero rate of 0.93%. The IQR of 8.56 around a median near 17.6 shows that most counties cluster in a narrow band of severe rent burden. The maximum of 64.96 is a single isolated row (the next-highest populated bin ends at 47.1) and equals the maximum of pct_rent_burdened_30, so it is worth verifying.

Treatment: Use directly as a numeric feature; no transform needed given near-symmetric distribution.

anthropic:claude-opus-4-7 · confidence high
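
Households paying 50% or more of income on rent also pay 30% or more, so if both rent-burden columns share a denominator (an assumption; the source does not say), pct_rent_burdened_50 should never exceed pct_rent_burdened_30 in the same row. Both columns report the same maximum, 64.96, which makes the check worth running. A minimal sketch with a hypothetical helper name and toy rows:

```python
def nesting_violations(rows, tolerance=1e-9):
    """Return indices where the 50% burden share exceeds the 30% share.

    rows: iterable of (pct_30, pct_50) pairs; None marks a missing value.
    """
    bad = []
    for i, (p30, p50) in enumerate(rows):
        if p30 is None or p50 is None:
            continue  # cannot compare incomplete rows
        if p50 > p30 + tolerance:
            bad.append(i)
    return bad

# Toy rows, not taken from the dataset.
rows = [(37.4, 17.6), (64.96, 64.96), (20.0, 25.0), (None, 10.0)]
print(nesting_violations(rows))  # [2]
```

A row where the two shares are equal passes the check; it would mean every rent-burdened household in that county is also severely burdened, which is possible but unusual enough to inspect by hand.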

Out[61]:

saturn.columns["pct_rent_burdened_50"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,769
min	0
max	64.96
mean	17.35
median	17.62
std	6.577
q1	13.07
q3	21.63
iqr	8.557
skew	0.05436
kurtosis	0.9823
n_outliers	47
outlier_rate	0.01459
zero_rate	0.009311

Fig 24.

Distribution of pct_rent_burdened_50. Vertical dash marks the median.

Histogram bins for pct_rent_burdened_50 (median: 17.62).
bin	count
0 – 1.624	42
1.624 – 3.248	27
3.248 – 4.872	34
4.872 – 6.496	63
6.496 – 8.12	102
8.12 – 9.744	148
9.744 – 11.37	163
11.37 – 12.99	214
12.99 – 14.62	242
14.62 – 16.24	310
16.24 – 17.86	315
17.86 – 19.49	332
19.49 – 21.11	335
21.11 – 22.74	264
22.74 – 24.36	219
24.36 – 25.98	150
25.98 – 27.61	99
27.61 – 29.23	64
29.23 – 30.86	39
30.86 – 32.48	20
32.48 – 34.1	21
34.1 – 35.73	9
35.73 – 37.35	2
37.35 – 38.98	3
38.98 – 40.6	1
40.6 – 42.22	1
42.22 – 43.85	1
43.85 – 45.47	0
45.47 – 47.1	1
47.1 – 48.72	0
48.72 – 50.34	0
50.34 – 51.97	0
51.97 – 53.59	0
53.59 – 55.22	0
55.22 – 56.84	0
56.84 – 58.46	0
58.46 – 60.09	0
60.09 – 61.71	0
61.71 – 63.34	0
63.34 – 64.96	1

median_gross_rent numeric feature

Numeric column capturing median gross rent in dollars, with 3,222 rows, 983 unique values, and a trivial 0.31% null rate (10 rows). The distribution is right-skewed (skew 1.76, kurtosis 4.55), running from 297 to 2,805 around a median of 818 and a mean of 891. 225 values (7.0%) are flagged as outliers, almost all on the high end; the minimum of 297 also falls below the lower 1.5 IQR fence of 328. No zeros are present, so missingness is not being encoded as 0.

Treatment: Log-transform before regression to tame the right skew and high-rent outliers.

anthropic:claude-opus-4-7 · confidence high

Out[64]:

saturn.columns["median_gross_rent"].stats

stat	value
n	3,222
nulls	10 (0.3%)
unique	983
min	297
max	2,805
mean	890.9
median	818
std	283.4
q1	718
q3	978
iqr	260
skew	1.763
kurtosis	4.55
n_outliers	225
outlier_rate	0.07005
zero_rate	0
alert: outliers	7.0% rows beyond 1.5 IQR
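
The outlier alert can be reproduced from the quartiles alone. The sketch below assumes Saturn applies the standard Tukey rule with strict inequalities, which the alert text suggests but does not spell out; the helper names are hypothetical.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def count_by_side(values, lower, upper):
    """Count values strictly below the lower fence and above the upper fence."""
    low = sum(1 for v in values if v < lower)
    high = sum(1 for v in values if v > upper)
    return low, high

# Quartiles from the stats table above.
lower, upper = tukey_fences(718, 978)
print(lower, upper)  # 328.0 1368.0

# Toy values only; 297 and 2805 are the reported min and max.
print(count_by_side([297, 818, 978, 1400, 2805], lower, upper))  # (1, 2)
```

Splitting the count by side shows whether a transform aimed at the right tail, such as a log, would address all flagged rows or only most of them.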

Fig 25.

Distribution of median_gross_rent. Vertical dash marks the median.

Histogram bins for median_gross_rent (median: 818.0).
bin	count
297 – 359.7	5
359.7 – 422.4	14
422.4 – 485.1	32
485.1 – 547.8	69
547.8 – 610.5	128
610.5 – 673.2	242
673.2 – 735.9	457
735.9 – 798.6	515
798.6 – 861.3	423
861.3 – 924	306
924 – 986.7	251
986.7 – 1049	140
1049 – 1112	105
1112 – 1175	98
1175 – 1238	79
1238 – 1300	71
1300 – 1363	52
1363 – 1426	48
1426 – 1488	26
1488 – 1551	22
1551 – 1614	33
1614 – 1676	13
1676 – 1739	19
1739 – 1802	10
1802 – 1864	13
1864 – 1927	8
1927 – 1990	11
1990 – 2053	6
2053 – 2115	4
2115 – 2178	3
2178 – 2241	4
2241 – 2303	1
2303 – 2366	1
2366 – 2429	0
2429 – 2492	1
2492 – 2554	0
2554 – 2617	0
2617 – 2680	0
2680 – 2742	1
2742 – 2805	1

rent_to_income_ratio numeric feature

This column reports a rent-to-income ratio, apparently in percent, with a typical county near 17.06 and an interquartile range of just 4.29 (15.1 to 19.39). Nine rows (0.3%) are null. The maximum of 1,200 against a median of 17.06 produces extreme skew (53.98) and kurtosis (about 3,007), and 107 values (3.33%) are flagged as outliers. The histogram shows that the extreme tail is tiny: five rows fall between roughly 36 and 66, and a single row sits near 1,200, which alone accounts for most of the 21.2 standard deviation.

Treatment: Cap or winsorize extreme values and log-transform before modelling.

anthropic:claude-opus-4-7 · confidence high
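
A minimal sketch of the cap-then-log treatment. The upper cap used here, q3 + 1.5 × IQR from the stats table, is one possible choice and not something the source prescribes; the helper name `clip` is hypothetical, and null rows are assumed to have been dropped beforehand.

```python
import math

def clip(values, lower, upper):
    """Winsorize by clamping each value into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

# Cap at the upper Tukey fence computed from the reported q3 and IQR.
upper = 19.39 + 1.5 * 4.29

# Toy values only; 6.1 and 1200.0 are the reported min and max.
sample = [6.1, 15.1, 17.06, 19.39, 1200.0]
capped = clip(sample, lower=6.1, upper=upper)

# All capped values are positive, so a plain log is defined.
logged = [math.log(v) for v in capped]
```

Capping before the log matters here: a log alone would still leave the 1,200 row far from the rest, while capping alone would leave the mild right skew of the bulk untouched.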

Out[67]:

saturn.columns["rent_to_income_ratio"].stats

stat	value
n	3,222
nulls	9 (0.3%)
unique	1,269
min	6.1
max	1,200
mean	17.89
median	17.06
std	21.2
q1	15.1
q3	19.39
iqr	4.29
skew	53.98
kurtosis	3007
n_outliers	107
outlier_rate	0.0333
zero_rate	0
alert: high_skew	skew=+53.98

Fig 26.

Distribution of rent_to_income_ratio. Vertical dash marks the median.

Histogram bins for rent_to_income_ratio (median: 17.06).
bin	count
6.1 – 35.95	3207
35.95 – 65.8	5
65.8 – 95.64	0
95.64 – 125.5	0
125.5 – 155.3	0
155.3 – 185.2	0
185.2 – 215	0
215 – 244.9	0
244.9 – 274.7	0
274.7 – 304.6	0
304.6 – 334.4	0
334.4 – 364.3	0
364.3 – 394.1	0
394.1 – 424	0
424 – 453.8	0
453.8 – 483.7	0
483.7 – 513.5	0
513.5 – 543.4	0
543.4 – 573.2	0
573.2 – 603.1	0
603.1 – 632.9	0
632.9 – 662.7	0
662.7 – 692.6	0
692.6 – 722.4	0
722.4 – 752.3	0
752.3 – 782.1	0
782.1 – 812	0
812 – 841.8	0
841.8 – 871.7	0
871.7 – 901.5	0
901.5 – 931.4	0
931.4 – 961.2	0
961.2 – 991.1	0
991.1 – 1021	0
1021 – 1051	0
1051 – 1081	0
1081 – 1110	0
1110 – 1140	0
1140 – 1170	0
1170 – 1200	1

gini_index numeric feature

Numeric column holding Gini index values for 3,222 records, all populated and bounded between 0.2744 and 0.721 with a mean of 0.4481 and a median of 0.4457. The distribution is tight (IQR 0.0494, std 0.0384) with mild right skew (0.4999) and 56 outliers (1.74%). Most of these sit on the high side, including one isolated value at 0.721, but the histogram also shows a few rows below the lower fence of about 0.348. Values fall in the expected 0–1 range for an inequality coefficient, suggesting a clean, ready-to-use feature.

Treatment: Use as-is as a numeric feature; optionally winsorize the 56 outliers.

anthropic:claude-opus-4-7 · confidence high

Out[70]:

saturn.columns["gini_index"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,317
min	0.2744
max	0.721
mean	0.4481
median	0.4457
std	0.03841
q1	0.422
q3	0.4714
iqr	0.04938
skew	0.4999
kurtosis	1.634
n_outliers	56
outlier_rate	0.01738
zero_rate	0

Fig 27.

Distribution of gini_index. Vertical dash marks the median.

Histogram bins for gini_index (median: 0.4457).
bin	count
0.2744 – 0.2856	1
0.2856 – 0.2967	1
0.2967 – 0.3079	1
0.3079 – 0.3191	0
0.3191 – 0.3302	1
0.3302 – 0.3414	3
0.3414 – 0.3526	6
0.3526 – 0.3637	10
0.3637 – 0.3749	29
0.3749 – 0.3861	66
0.3861 – 0.3972	123
0.3972 – 0.4084	202
0.4084 – 0.4195	277
0.4195 – 0.4307	365
0.4307 – 0.4419	375
0.4419 – 0.453	402
0.453 – 0.4642	370
0.4642 – 0.4754	299
0.4754 – 0.4865	227
0.4865 – 0.4977	162
0.4977 – 0.5089	104
0.5089 – 0.52	80
0.52 – 0.5312	37
0.5312 – 0.5424	26
0.5424 – 0.5535	22
0.5535 – 0.5647	14
0.5647 – 0.5759	6
0.5759 – 0.587	5
0.587 – 0.5982	2
0.5982 – 0.6093	3
0.6093 – 0.6205	2
0.6205 – 0.6317	0
0.6317 – 0.6428	0
0.6428 – 0.654	0
0.654 – 0.6652	0
0.6652 – 0.6763	0
0.6763 – 0.6875	0
0.6875 – 0.6987	0
0.6987 – 0.7098	0
0.7098 – 0.721	1

unemployment_rate numeric feature

Likely a county-level unemployment rate in percent, with values ranging from 0.0 to 31.99 and a median of 4.69. The distribution is heavily right-skewed (skew 2.55, kurtosis 12.81) with 154 outliers (4.78%) pulling the mean (5.13) above the median. A small zero rate (0.56%, about 18 rows) suggests a handful of suspiciously exact zeros worth verifying.

Treatment: Apply a log1p or Yeo-Johnson transform before regression to tame the right skew (a plain log is undefined for the zero rows), and inspect the zero values.

anthropic:claude-opus-4-7 · confidence high
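
For reference, the Yeo-Johnson transform for a fixed parameter can be written in a few lines. This is a sketch only: in practice the parameter is usually estimated from the data (for example by scikit-learn's `PowerTransformer(method="yeo-johnson")`), whereas the value used below is an arbitrary assumption chosen to show the log1p special case.

```python
import math

def yeo_johnson(y, lam):
    """Yeo-Johnson power transform of a single value for a fixed lambda."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-y)
    return -(((-y + 1.0) ** (2.0 - lam)) - 1.0) / (2.0 - lam)

# lam=0 reduces to log1p for non-negative input; lam=1 is the identity.
rates = [0.0, 4.69, 31.99]  # reported min, median, and max
transformed = [yeo_johnson(r, 0.0) for r in rates]
```

Because the transform is defined at zero, the zero-valued rows need no special handling, although they should still be verified as the treatment suggests.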

Out[73]:

saturn.columns["unemployment_rate"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	950
min	0
max	31.99
mean	5.127
median	4.69
std	2.926
q1	3.42
q3	6.08
iqr	2.66
skew	2.545
kurtosis	12.81
n_outliers	154
outlier_rate	0.0478
zero_rate	0.005587
alert: high_skew	skew=+2.55

Fig 28.

Distribution of unemployment_rate. Vertical dash marks the median.

Histogram bins for unemployment_rate (median: 4.69).
bin	count
0 – 0.7997	60
0.7997 – 1.599	90
1.599 – 2.399	197
2.399 – 3.199	333
3.199 – 3.999	464
3.999 – 4.798	539
4.798 – 5.598	492
5.598 – 6.398	361
6.398 – 7.198	217
7.198 – 7.997	142
7.997 – 8.797	85
8.797 – 9.597	61
9.597 – 10.4	36
10.4 – 11.2	36
11.2 – 12	13
12 – 12.8	18
12.8 – 13.6	13
13.6 – 14.4	12
14.4 – 15.2	11
15.2 – 15.99	8
15.99 – 16.79	5
16.79 – 17.59	4
17.59 – 18.39	2
18.39 – 19.19	1
19.19 – 19.99	3
19.99 – 20.79	3
20.79 – 21.59	4
21.59 – 22.39	3
22.39 – 23.19	2
23.19 – 23.99	3
23.99 – 24.79	1
24.79 – 25.59	0
25.59 – 26.39	0
26.39 – 27.19	0
27.19 – 27.99	0
27.99 – 28.79	0
28.79 – 29.59	0
29.59 – 30.39	0
30.39 – 31.19	1
31.19 – 31.99	2

labor_force_participation numeric feature

Numeric labor force participation rate, almost certainly expressed as a percentage given the range of 18.63 to 84.04 and mean of 57.89. The distribution is moderately left-skewed (-0.58) with a tight interquartile band of 52.97 to 63.66, and only 38 outliers (1.18%) sit outside the whiskers. There are no nulls or zeros across 3,222 rows, and 1,944 unique values suggest fine-grained measurements rather than rounded buckets.

Treatment: Use as-is in modelling; mild left skew does not require transformation.

anthropic:claude-opus-4-7 · confidence high

Out[76]:

saturn.columns["labor_force_participation"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,944
min	18.63
max	84.04
mean	57.89
median	58.72
std	8.041
q1	52.97
q3	63.66
iqr	10.7
skew	-0.5766
kurtosis	0.4502
n_outliers	38
outlier_rate	0.01179
zero_rate	0

Fig 29.

Distribution of labor_force_participation. Vertical dash marks the median.

Histogram bins for labor_force_participation (median: 58.72).
bin	count
18.63 – 20.27	1
20.27 – 21.9	0
21.9 – 23.54	0
23.54 – 25.17	1
25.17 – 26.81	1
26.81 – 28.44	1
28.44 – 30.08	0
30.08 – 31.71	4
31.71 – 33.35	6
33.35 – 34.98	8
34.98 – 36.62	10
36.62 – 38.25	24
38.25 – 39.89	30
39.89 – 41.52	37
41.52 – 43.16	51
43.16 – 44.79	61
44.79 – 46.43	60
46.43 – 48.06	76
48.06 – 49.7	109
49.7 – 51.34	141
51.34 – 52.97	186
52.97 – 54.61	174
54.61 – 56.24	235
56.24 – 57.88	245
57.88 – 59.51	272
59.51 – 61.15	270
61.15 – 62.78	277
62.78 – 64.42	252
64.42 – 66.05	221
66.05 – 67.69	187
67.69 – 69.32	118
69.32 – 70.96	82
70.96 – 72.59	41
72.59 – 74.23	19
74.23 – 75.86	10
75.86 – 77.5	4
77.5 – 79.13	6
79.13 – 80.77	1
80.77 – 82.4	0
82.4 – 84.04	1

pct_deep_poverty numeric feature

Percentage of population in deep poverty across 3,222 rows, with no nulls and values bounded between 0.0 and 34.7. The distribution is right-skewed (skew 2.67, kurtosis 10.40) with median 5.82 trailing the mean 6.74, and 176 rows (5.5%) flagged as upper-tail outliers. Only 0.09% of rows are zero, so floor effects are minimal despite the long tail.

Treatment: Apply log1p or winsorize before linear modelling to dampen the heavy right tail; a plain log is undefined for the few zero rows.

anthropic:claude-opus-4-7 · confidence high

Out[79]:

saturn.columns["pct_deep_poverty"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,131
min	0
max	34.7
mean	6.743
median	5.82
std	4.154
q1	4.27
q3	7.918
iqr	3.648
skew	2.665
kurtosis	10.4
n_outliers	176
outlier_rate	0.05462
zero_rate	0.0009311
alert: high_skew	skew=+2.67
alert: outliers	5.5% rows beyond 1.5 IQR

Fig 30.

Distribution of pct_deep_poverty. Vertical dash marks the median.

Histogram bins for pct_deep_poverty (median: 5.82).
bin	count
0 – 0.8675	15
0.8675 – 1.735	28
1.735 – 2.603	128
2.603 – 3.47	241
3.47 – 4.338	429
4.338 – 5.205	446
5.205 – 6.073	436
6.073 – 6.94	403
6.94 – 7.808	261
7.808 – 8.675	211
8.675 – 9.543	157
9.543 – 10.41	113
10.41 – 11.28	57
11.28 – 12.15	58
12.15 – 13.01	50
13.01 – 13.88	28
13.88 – 14.75	18
14.75 – 15.62	22
15.62 – 16.48	18
16.48 – 17.35	8
17.35 – 18.22	11
18.22 – 19.09	9
19.09 – 19.95	7
19.95 – 20.82	4
20.82 – 21.69	7
21.69 – 22.55	8
22.55 – 23.42	5
23.42 – 24.29	2
24.29 – 25.16	8
25.16 – 26.03	4
26.03 – 26.89	6
26.89 – 27.76	2
27.76 – 28.63	4
28.63 – 29.5	7
29.5 – 30.36	3
30.36 – 31.23	0
31.23 – 32.1	2
32.1 – 32.97	1
32.97 – 33.83	1
33.83 – 34.7	4

pct_poverty numeric feature

Likely a county-level poverty rate expressed as a percentage, ranging from 1.6 to 66.32 with a median of 13.55 and mean of 15.10. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with 137 outliers (4.25%) in the heavy upper tail, consistent with a small set of high-poverty counties pulling the mean above the median. There are no nulls or zeros, and 1,719 unique values across 3,222 rows suggest fine-grained but repeated measurements.

Treatment: Consider a log or sqrt transform before linear modelling to tame the right skew.

anthropic:claude-opus-4-7 · confidence high

Out[82]:

saturn.columns["pct_poverty"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,719
min	1.6
max	66.32
mean	15.1
median	13.55
std	7.706
q1	10.16
q3	17.91
iqr	7.75
skew	2.096
kurtosis	6.891
n_outliers	137
outlier_rate	0.04252
zero_rate	0
alert: high_skew	skew=+2.10

Fig 31.

Distribution of pct_poverty. Vertical dash marks the median.

Histogram bins for pct_poverty (median: 13.55).
bin	count
1.6 – 3.218	7
3.218 – 4.836	34
4.836 – 6.454	106
6.454 – 8.072	246
8.072 – 9.69	320
9.69 – 11.31	354
11.31 – 12.93	393
12.93 – 14.54	364
14.54 – 16.16	306
16.16 – 17.78	262
17.78 – 19.4	192
19.4 – 21.02	149
21.02 – 22.63	123
22.63 – 24.25	91
24.25 – 25.87	52
25.87 – 27.49	44
27.49 – 29.11	34
29.11 – 30.72	23
30.72 – 32.34	18
32.34 – 33.96	14
33.96 – 35.58	6
35.58 – 37.2	8
37.2 – 38.81	3
38.81 – 40.43	8
40.43 – 42.05	5
42.05 – 43.67	9
43.67 – 45.29	4
45.29 – 46.9	11
46.9 – 48.52	7
48.52 – 50.14	8
50.14 – 51.76	2
51.76 – 53.38	6
53.38 – 54.99	5
54.99 – 56.61	5
56.61 – 58.23	1
58.23 – 59.85	0
59.85 – 61.47	0
61.47 – 63.08	0
63.08 – 64.7	1
64.7 – 66.32	1

pct_near_poverty numeric feature

Percentage of population near the poverty line (likely between 100-200% of the federal poverty threshold), reported per record across 3,222 rows with no nulls. The distribution centers around a median of 9.38 with an IQR of 4.43, but a right tail pushes the max to 49.14, yielding skew of 1.19 and kurtosis of 5.73. About 2.5% of values (82 rows) fall outside the outlier fences, suggesting a handful of counties with unusually high near-poverty shares worth inspecting separately; the two largest values (one near 36 and one at 49.14) are isolated in the histogram.

Treatment: Consider a log or sqrt transform before regression to tame the right skew.

anthropic:claude-opus-4-7 · confidence high

Out[85]:

saturn.columns["pct_near_poverty"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,237
min	0.58
max	49.14
mean	9.813
median	9.38
std	3.644
q1	7.33
q3	11.76
iqr	4.43
skew	1.19
kurtosis	5.729
n_outliers	82
outlier_rate	0.02545
zero_rate	0

Fig 32.

Distribution of pct_near_poverty. Vertical dash marks the median.

Histogram bins for pct_near_poverty (median: 9.38).
bin	count
0.58 – 1.794	7
1.794 – 3.008	18
3.008 – 4.222	82
4.222 – 5.436	161
5.436 – 6.65	302
6.65 – 7.864	419
7.864 – 9.078	480
9.078 – 10.29	487
10.29 – 11.51	392
11.51 – 12.72	280
12.72 – 13.93	210
13.93 – 15.15	138
15.15 – 16.36	87
16.36 – 17.58	53
17.58 – 18.79	37
18.79 – 20	35
20 – 21.22	15
21.22 – 22.43	7
22.43 – 23.65	2
23.65 – 24.86	5
24.86 – 26.07	1
26.07 – 27.29	2
27.29 – 28.5	0
28.5 – 29.72	0
29.72 – 30.93	0
30.93 – 32.14	0
32.14 – 33.36	0
33.36 – 34.57	0
34.57 – 35.79	0
35.79 – 37	1
37 – 38.21	0
38.21 – 39.43	0
39.43 – 40.64	0
40.64 – 41.86	0
41.86 – 43.07	0
43.07 – 44.28	0
44.28 – 45.5	0
45.5 – 46.71	0
46.71 – 47.93	0
47.93 – 49.14	1

pct_hs_or_higher numeric feature

Percentage of population (likely adults 25+) with a high school diploma or higher, reported per row across 3,222 records. Values are tightly clustered high (mean 88.08, median 89.39, IQR 84.9–92.47) with a left tail reaching down to 33.33, producing skew of -1.33 and 86 low-end outliers (2.67%). No nulls or zeros, and 1,612 unique values suggest a county- or tract-level rate.

Treatment: Use as-is for modelling, but consider a reflected log or winsorisation given the left skew and low-end outliers.

anthropic:claude-opus-4-7 · confidence high
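
A reflected log turns the left tail into a right tail by subtracting each value from the maximum, then compresses it with log1p. A minimal sketch, assuming the reflection point is the observed maximum (other choices, such as 100, are equally defensible); the helper name is hypothetical.

```python
import math

def reflected_log(values):
    """Reflect around the maximum, then apply log1p, to compress a left tail.

    Note the transform reverses order: larger inputs give smaller outputs.
    """
    top = max(values)
    return [math.log1p(top - v) for v in values]

# Toy values; 33.33 and 99.69 are the reported min and max.
sample = [33.33, 84.9, 89.39, 92.47, 99.69]
out = reflected_log(sample)
```

Because the ordering is reversed, the sign of any coefficient fitted on the transformed feature flips relative to the original percentage, which is worth noting wherever model results are reported.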

Out[88]:

saturn.columns["pct_hs_or_higher"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,612
min	33.33
max	99.69
mean	88.08
median	89.39
std	5.97
q1	84.9
q3	92.47
iqr	7.567
skew	-1.328
kurtosis	3.742
n_outliers	86
outlier_rate	0.02669
zero_rate	0

Fig 33.

Distribution of pct_hs_or_higher. Vertical dash marks the median.

Histogram bins for pct_hs_or_higher (median: 89.39).
bin	count
33.33 – 34.99	1
34.99 – 36.65	0
36.65 – 38.31	0
38.31 – 39.97	0
39.97 – 41.62	0
41.62 – 43.28	0
43.28 – 44.94	0
44.94 – 46.6	0
46.6 – 48.26	0
48.26 – 49.92	0
49.92 – 51.58	0
51.58 – 53.24	0
53.24 – 54.9	0
54.9 – 56.56	1
56.56 – 58.22	1
58.22 – 59.87	1
59.87 – 61.53	3
61.53 – 63.19	3
63.19 – 64.85	3
64.85 – 66.51	2
66.51 – 68.17	6
68.17 – 69.83	7
69.83 – 71.49	15
71.49 – 73.15	30
73.15 – 74.81	30
74.81 – 76.46	46
76.46 – 78.12	60
78.12 – 79.78	88
79.78 – 81.44	131
81.44 – 83.1	174
83.1 – 84.76	189
84.76 – 86.42	256
86.42 – 88.08	289
88.08 – 89.74	360
89.74 – 91.39	429
91.39 – 93.05	460
93.05 – 94.71	389
94.71 – 96.37	192
96.37 – 98.03	47
98.03 – 99.69	9

pct_bachelors_or_higher numeric feature

Percent of adults with a bachelor's degree or higher, almost certainly at the county level given n = 3,222 with no nulls. Values range from 0.0 to 78.87 with median 21.07 and mean 23.50, and the distribution is right-skewed (skew 1.36, kurtosis 2.31) with 141 outliers (4.4%) on the high end, consistent with a long tail of highly educated metros above the typical county.

Treatment: Consider a log1p or sqrt transform before linear modelling to tame the right skew; one row is exactly zero, so a plain log is undefined there.

anthropic:claude-opus-4-7 · confidence high

Out[91]:

saturn.columns["pct_bachelors_or_higher"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,982
min	0
max	78.87
mean	23.5
median	21.07
std	9.983
q1	16.59
q3	27.85
iqr	11.26
skew	1.357
kurtosis	2.306
n_outliers	141
outlier_rate	0.04376
zero_rate	0.0003104

Fig 34.

Distribution of pct_bachelors_or_higher. Vertical dash marks the median.

Histogram bins for pct_bachelors_or_higher (median: 21.07).
bin	count
0 – 1.972	1
1.972 – 3.944	0
3.944 – 5.915	4
5.915 – 7.887	9
7.887 – 9.859	32
9.859 – 11.83	135
11.83 – 13.8	169
13.8 – 15.77	317
15.77 – 17.75	328
17.75 – 19.72	376
19.72 – 21.69	345
21.69 – 23.66	262
23.66 – 25.63	232
25.63 – 27.6	189
27.6 – 29.58	123
29.58 – 31.55	116
31.55 – 33.52	118
33.52 – 35.49	96
35.49 – 37.46	60
37.46 – 39.44	68
39.44 – 41.41	40
41.41 – 43.38	34
43.38 – 45.35	34
45.35 – 47.32	24
47.32 – 49.29	21
49.29 – 51.27	19
51.27 – 53.24	15
53.24 – 55.21	10
55.21 – 57.18	11
57.18 – 59.15	10
59.15 – 61.12	9
61.12 – 63.1	6
63.1 – 65.07	5
65.07 – 67.04	1
67.04 – 69.01	0
69.01 – 70.98	1
70.98 – 72.95	0
72.95 – 74.93	0
74.93 – 76.9	1
76.9 – 78.87	1

disability_rate numeric feature

This is a numeric disability rate per record, ranging from 0.0 to 9.17 with a median of 1.07 and an IQR of 0.65. The distribution is heavily right-skewed (skew 2.17, kurtosis 15.24) with 117 outliers (3.6%) and a small but non-trivial share of zeros (1.7%). Only 305 unique values across 3,222 rows suggest that the rate is reported at coarse precision or aggregated to a small set of geographies.

Treatment: Apply a log1p transform or winsorize before regression to tame the right tail.

anthropic:claude-opus-4-7 · confidence high

Out[94]:

saturn.columns["disability_rate"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	305
min	0
max	9.17
mean	1.145
median	1.07
std	0.6215
q1	0.77
q3	1.42
iqr	0.65
skew	2.167
kurtosis	15.24
n_outliers	117
outlier_rate	0.03631
zero_rate	0.01676
alert: high_skew	skew=+2.17

Fig 35.

Distribution of disability_rate. Vertical dash marks the median.

Histogram bins for disability_rate (median: 1.07).
bin	count
0 – 0.2293	114
0.2293 – 0.4585	143
0.4585 – 0.6878	352
0.6878 – 0.917	590
0.917 – 1.146	634
1.146 – 1.376	496
1.376 – 1.605	362
1.605 – 1.834	200
1.834 – 2.063	118
2.063 – 2.292	77
2.292 – 2.522	45
2.522 – 2.751	31
2.751 – 2.98	22
2.98 – 3.21	10
3.21 – 3.439	7
3.439 – 3.668	4
3.668 – 3.897	3
3.897 – 4.127	3
4.127 – 4.356	3
4.356 – 4.585	2
4.585 – 4.814	0
4.814 – 5.043	2
5.043 – 5.273	2
5.273 – 5.502	0
5.502 – 5.731	0
5.731 – 5.961	0
5.961 – 6.19	0
6.19 – 6.419	0
6.419 – 6.648	0
6.648 – 6.878	0
6.878 – 7.107	0
7.107 – 7.336	0
7.336 – 7.565	1
7.565 – 7.795	0
7.795 – 8.024	0
8.024 – 8.253	0
8.253 – 8.482	0
8.482 – 8.712	0
8.712 – 8.941	0
8.941 – 9.17	1

How to cite