housing-housing_crisis_merged

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/housing/housing_crisis_merged.csv

Saturn profiled 3,222 rows across 16 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:

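# Install the profiler (IPython shell escape; this line only works inside a notebook).
!pip install saturn-dissect
import subprocess
# Run the Saturn CLI against the source CSV with the flags shown.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/us-inequality-atlas/housing/housing_crisis_merged.csv",
    "--findings", "housing-housing_crisis_merged.json",
    "--llm", "anthropic:claude-opus-4-7",
], check=True)  # raise CalledProcessError if the CLI exits non-zero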

Summary confidence: high

This dataset covers 3,222 U.S. counties and county equivalents (one row per county, identified by FIPS code) with 16 columns spanning housing stock, rent burden, income, and affordability metrics. The headline finding is that the affordability_category field is overwhelmingly imbalanced: 'Affordable' covers 3,192 of 3,222 counties (top_rate 0.99), with only 29 'Moderately Burdened' and 1 'Extremely Burdened', so this label likely needs reworking before it is useful. The rent-burden percentages tell a richer story: pct_rent_burdened_30plus has a mean of 36.4% and pct_rent_burdened_50plus a mean of 17.4%, suggesting real stress that the categorical label hides. Housing-count columns (owner_occupied, renter_occupied, total_housing_units) are extremely right-skewed (skew 9.5–15.8) with hundreds of outliers each, reflecting a few very large urban counties, so log scales are recommended. Also note that rent_to_income_ratio has an extreme max of 1200 with skew ~54, which hints at data-quality issues worth checking.

citing: row_count · column_count · affordability_category.top_rate · affordability_category.top_values · pct_rent_burdened_30plus.mean · pct_rent_burdened_50plus.mean · owner_occupied.skew · renter_occupied.skew · total_housing_units.skew · rent_to_income_ratio.max · rent_to_income_ratio.skew · median_household_income.mean

Out[4]:

saturn.schema() · 16 columns

column	kind	n	null%	unique	alerts
fips	numeric	3,222	0.0%	3,222
county_name	text	3,222	0.0%	3,222	near_unique
total_renters	numeric	3,222	0.0%	2,709	high_skew outliers
pct_rent_burdened_30plus	numeric	3,222	0.0%	2,146
pct_rent_burdened_50plus	numeric	3,222	0.0%	1,769
median_gross_rent	numeric	3,222	0.3%	983	outliers
median_household_income	numeric	3,222	0.0%	3,098	outliers
total_housing_units	numeric	3,222	0.0%	3,074	high_skew outliers
owner_occupied	numeric	3,222	0.0%	3,001	high_skew outliers
renter_occupied	numeric	3,222	0.0%	2,709	high_skew outliers
pct_renter	numeric	3,222	0.0%	1,925
annual_rent	numeric	3,222	0.3%	983	outliers
rent_to_income_ratio	numeric	3,222	0.3%	1,269	high_skew
affordability_category	categorical	3,222	0.0%	3	imbalance
hours_at_min_wage_for_rent	numeric	3,222	0.3%	229	outliers
weeks_at_min_wage_for_rent	numeric	3,222	0.3%	71	outliers

Fig 1.

affordability_category · Shows how nearly every county falls into 'Affordable', exposing how unbalanced this label is.

Top values for affordability_category (3 unique shown, of 3 total).
value	count	share
Affordable	3192	99.1%
Moderately Burdened	29	0.9%
Extremely Burdened	1	0.0%

Fig 2.

pct_rent_burdened_30plus · Distribution of the share of renters paying 30%+ of income on rent — a more honest view of burden than the categorical field.

Histogram bins for pct_rent_burdened_30plus (median: 37.36).
bin	count
0 – 1.624	9
1.624 – 3.248	5
3.248 – 4.872	3
4.872 – 6.496	5
6.496 – 8.12	9
8.12 – 9.744	13
9.744 – 11.37	11
11.37 – 12.99	16
12.99 – 14.62	26
14.62 – 16.24	19
16.24 – 17.86	35
17.86 – 19.49	43
19.49 – 21.11	52
21.11 – 22.74	52
22.74 – 24.36	73
24.36 – 25.98	99
25.98 – 27.61	109
27.61 – 29.23	116
29.23 – 30.86	132
30.86 – 32.48	159
32.48 – 34.1	189
34.1 – 35.73	209
35.73 – 37.35	227
37.35 – 38.98	239
38.98 – 40.6	205
40.6 – 42.22	209
42.22 – 43.85	210
43.85 – 45.47	190
45.47 – 47.1	131
47.1 – 48.72	114
48.72 – 50.34	118
50.34 – 51.97	69
51.97 – 53.59	51
53.59 – 55.22	34
55.22 – 56.84	24
56.84 – 58.46	6
58.46 – 60.09	3
60.09 – 61.71	2
61.71 – 63.34	3
63.34 – 64.96	3

Fig 3.

pct_rent_burdened_50plus · Severe rent burden at the county level; watch the right tail for the most stressed places.

Histogram bins for pct_rent_burdened_50plus (median: 17.62).
bin	count
0 – 1.624	42
1.624 – 3.248	27
3.248 – 4.872	34
4.872 – 6.496	63
6.496 – 8.12	102
8.12 – 9.744	148
9.744 – 11.37	163
11.37 – 12.99	214
12.99 – 14.62	242
14.62 – 16.24	310
16.24 – 17.86	315
17.86 – 19.49	332
19.49 – 21.11	335
21.11 – 22.74	264
22.74 – 24.36	219
24.36 – 25.98	150
25.98 – 27.61	99
27.61 – 29.23	64
29.23 – 30.86	39
30.86 – 32.48	20
32.48 – 34.1	21
34.1 – 35.73	9
35.73 – 37.35	2
37.35 – 38.98	3
38.98 – 40.6	1
40.6 – 42.22	1
42.22 – 43.85	1
43.85 – 45.47	0
45.47 – 47.1	1
47.1 – 48.72	0
48.72 – 50.34	0
50.34 – 51.97	0
51.97 – 53.59	0
53.59 – 55.22	0
55.22 – 56.84	0
56.84 – 58.46	0
58.46 – 60.09	0
60.09 – 61.71	0
61.71 – 63.34	0
63.34 – 64.96	1

Fig 4.

median_household_income · County income distribution to contextualise rent-burden numbers; mildly right-skewed with a long upper tail.

Histogram bins for median_household_income (median: 60461.0).
bin	count
1.452e+04 – 1.842e+04	14
1.842e+04 – 2.232e+04	30
2.232e+04 – 2.622e+04	26
2.622e+04 – 3.012e+04	10
3.012e+04 – 3.402e+04	28
3.402e+04 – 3.792e+04	52
3.792e+04 – 4.181e+04	102
4.181e+04 – 4.571e+04	154
4.571e+04 – 4.961e+04	237
4.961e+04 – 5.351e+04	286
5.351e+04 – 5.741e+04	352
5.741e+04 – 6.131e+04	400
6.131e+04 – 6.52e+04	342
6.52e+04 – 6.91e+04	294
6.91e+04 – 7.3e+04	231
7.3e+04 – 7.69e+04	150
7.69e+04 – 8.08e+04	136
8.08e+04 – 8.47e+04	99
8.47e+04 – 8.86e+04	52
8.86e+04 – 9.249e+04	43
9.249e+04 – 9.639e+04	44
9.639e+04 – 1.003e+05	26
1.003e+05 – 1.042e+05	23
1.042e+05 – 1.081e+05	23
1.081e+05 – 1.12e+05	11
1.12e+05 – 1.159e+05	9
1.159e+05 – 1.198e+05	12
1.198e+05 – 1.237e+05	10
1.237e+05 – 1.276e+05	5
1.276e+05 – 1.315e+05	4
1.315e+05 – 1.354e+05	3
1.354e+05 – 1.393e+05	6
1.393e+05 – 1.432e+05	2
1.432e+05 – 1.471e+05	1
1.471e+05 – 1.51e+05	1
1.51e+05 – 1.549e+05	1
1.549e+05 – 1.588e+05	0
1.588e+05 – 1.627e+05	0
1.627e+05 – 1.666e+05	1
1.666e+05 – 1.705e+05	1

Fig 5.

pct_renter · Share of renter households per county: most counties cluster near 27%, and a single county sits in the top bin (97.58–100), which is worth inspecting.

Histogram bins for pct_renter (median: 26.07).
bin	count
3.01 – 5.435	1
5.435 – 7.859	3
7.859 – 10.28	9
10.28 – 12.71	26
12.71 – 15.13	63
15.13 – 17.56	156
17.56 – 19.98	316
19.98 – 22.41	371
22.41 – 24.83	450
24.83 – 27.26	419
27.26 – 29.68	357
29.68 – 32.11	301
32.11 – 34.53	203
34.53 – 36.96	169
36.96 – 39.38	115
39.38 – 41.81	75
41.81 – 44.23	56
44.23 – 46.66	45
46.66 – 49.08	25
49.08 – 51.5	15
51.5 – 53.93	11
53.93 – 56.35	10
56.35 – 58.78	8
58.78 – 61.2	4
61.2 – 63.63	4
63.63 – 66.05	1
66.05 – 68.48	1
68.48 – 70.9	3
70.9 – 73.33	1
73.33 – 75.75	1
75.75 – 78.18	0
78.18 – 80.6	1
80.6 – 83.03	0
83.03 – 85.45	1
85.45 – 87.88	0
87.88 – 90.3	0
90.3 – 92.73	0
92.73 – 95.15	0
95.15 – 97.58	0
97.58 – 100	1

Fig 6.

Per-column null rate across the corpus. Columns are ordered by input position.

Per-column null rate across the corpus.
column	kind	null %
fips	numeric	0.0%
county_name	text	0.0%
total_renters	numeric	0.0%
pct_rent_burdened_30plus	numeric	0.0%
pct_rent_burdened_50plus	numeric	0.0%
median_gross_rent	numeric	0.3%
median_household_income	numeric	0.0%
total_housing_units	numeric	0.0%
owner_occupied	numeric	0.0%
renter_occupied	numeric	0.0%
pct_renter	numeric	0.0%
annual_rent	numeric	0.3%
rent_to_income_ratio	numeric	0.3%
affordability_category	categorical	0.0%
hours_at_min_wage_for_rent	numeric	0.3%
weeks_at_min_wage_for_rent	numeric	0.3%

Fig 7.

Pearson correlation across numeric columns (sampled, bounded).

Pearson correlation across 12 numeric columns (values clipped to 2 decimals).
	fips	total_renters	pct_rent_burdened_30plus	pct_rent_burdened_50plus	median_gross_rent	median_household_income	total_housing_units	owner_occupied	renter_occupied	pct_renter	annual_rent	rent_to_income_ratio
fips	+1.00	-0.06	-0.16	-0.10	-0.12	-0.11	-0.06	-0.06	-0.06	-0.10	-0.12	+0.05
total_renters	-0.06	+1.00	+0.23	+0.20	+0.17	+0.12	+0.99	+0.96	+1.00	+0.22	+0.17	+0.06
pct_rent_burdened_30plus	-0.16	+0.23	+1.00	+0.82	+0.18	+0.07	+0.26	+0.28	+0.23	+0.19	+0.18	+0.08
pct_rent_burdened_50plus	-0.10	+0.20	+0.82	+1.00	+0.13	+0.00	+0.23	+0.25	+0.20	+0.22	+0.13	+0.07
median_gross_rent	-0.12	+0.17	+0.18	+0.13	+1.00	+0.35	+0.16	+0.15	+0.17	+0.12	+1.00	+0.18
median_household_income	-0.11	+0.12	+0.07	+0.00	+0.35	+1.00	+0.13	+0.14	+0.12	+0.02	+0.35	-0.21
total_housing_units	-0.06	+0.99	+0.26	+0.23	+0.16	+0.13	+1.00	+0.99	+0.99	+0.19	+0.16	+0.05
owner_occupied	-0.06	+0.96	+0.28	+0.25	+0.15	+0.14	+0.99	+1.00	+0.96	+0.16	+0.15	+0.05
renter_occupied	-0.06	+1.00	+0.23	+0.20	+0.17	+0.12	+0.99	+0.96	+1.00	+0.22	+0.17	+0.06
pct_renter	-0.10	+0.22	+0.19	+0.22	+0.12	+0.02	+0.19	+0.16	+0.22	+1.00	+0.12	+0.09
annual_rent	-0.12	+0.17	+0.18	+0.13	+1.00	+0.35	+0.16	+0.15	+0.17	+0.12	+1.00	+0.18
rent_to_income_ratio	+0.05	+0.06	+0.08	+0.07	+0.18	-0.21	+0.05	+0.05	+0.06	+0.09	+0.18	+1.00
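
The matrix above can be scanned mechanically for near-duplicate columns. The following is a minimal sketch, assuming a hypothetical cut-off of |r| ≥ 0.95 and a hand-copied subset of the coefficients; the helper name `redundant_pairs` is illustrative and is not part of Saturn.

```python
# A hand-copied subset of the Pearson coefficients from the table above.
CORR = {
    ("total_renters", "renter_occupied"): 1.00,
    ("total_renters", "total_housing_units"): 0.99,
    ("total_renters", "owner_occupied"): 0.96,
    ("total_housing_units", "owner_occupied"): 0.99,
    ("median_gross_rent", "annual_rent"): 1.00,
    ("pct_rent_burdened_30plus", "pct_rent_burdened_50plus"): 0.82,
    ("median_household_income", "rent_to_income_ratio"): -0.21,
}


def redundant_pairs(corr, threshold=0.95):
    """Return column pairs whose absolute correlation meets the threshold."""
    return sorted(pair for pair, r in corr.items() if abs(r) >= threshold)


for left, right in redundant_pairs(CORR):
    print(f"{left} ~ {right}: r = {CORR[(left, right)]:+.2f}")
```

On these values the scan flags the housing-count block and the median_gross_rent / annual_rent pair, which suggests keeping one representative from each group in a linear model.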

fips numeric identifier

This is the US FIPS county code: every one of the 3,222 rows is unique, there are no nulls, and the value range (1001 to 72153) matches the standard 2-digit state + 3-digit county encoding once the leading zero dropped by the numeric type is restored (01001 is stored as 1001). Distribution stats such as mean 31377.89 and skew 0.157 are not meaningful here, since the integers are categorical identifiers, not quantities.

Treatment: Treat as a categorical key, zero-padded to five characters where the lookup expects strings; left-join on this code to bring in county/state attributes rather than using it as a numeric feature.

anthropic:claude-opus-4-7 · confidence high
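
Because the profiler read fips as a number, a join against a lookup keyed by the usual five-character string needs the leading zero restored first. A minimal sketch, assuming the lookup uses zero-padded strings; the helper names `fips_to_key` and `split_fips` are hypothetical.

```python
def fips_to_key(code: int) -> str:
    """Zero-pad a numeric county FIPS code to its 5-character string form."""
    return f"{code:05d}"


def split_fips(code: int) -> tuple[str, str]:
    """Split a county FIPS code into (state, county) parts."""
    key = fips_to_key(code)
    return key[:2], key[2:]


# The minimum and maximum values reported in the stats table.
for code in (1001, 72153):
    print(fips_to_key(code), split_fips(code))
```

Keeping the key as a string also stops downstream tools from computing means or skew on it.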

Out[13]:

saturn.columns["fips"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,222
min	1,001
max	72,153
mean	3.138e+04
median	30,022
std	1.63e+04
q1	1.903e+04
q3	4.61e+04
iqr	27,075
skew	0.1574
kurtosis	-0.6314
n_outliers	0
outlier_rate	0
zero_rate	0

Fig 8.

Distribution of fips. Vertical dash marks the median.

Histogram bins for fips (median: 30022.0).
bin	count
1001 – 2780	97
2780 – 4559	15
4559 – 6337	133
6337 – 8116	59
8116 – 9895	14
9895 – 1.167e+04	4
1.167e+04 – 1.345e+04	226
1.345e+04 – 1.523e+04	5
1.523e+04 – 1.701e+04	49
1.701e+04 – 1.879e+04	189
1.879e+04 – 2.057e+04	204
2.057e+04 – 2.235e+04	184
2.235e+04 – 2.413e+04	39
2.413e+04 – 2.59e+04	15
2.59e+04 – 2.768e+04	170
2.768e+04 – 2.946e+04	196
2.946e+04 – 3.124e+04	150
3.124e+04 – 3.302e+04	27
3.302e+04 – 3.48e+04	21
3.48e+04 – 3.658e+04	95
3.658e+04 – 3.836e+04	153
3.836e+04 – 4.013e+04	155
4.013e+04 – 4.191e+04	46
4.191e+04 – 4.369e+04	67
4.369e+04 – 4.547e+04	51
4.547e+04 – 4.725e+04	161
4.725e+04 – 4.903e+04	268
4.903e+04 – 5.081e+04	29
5.081e+04 – 5.259e+04	133
5.259e+04 – 5.436e+04	94
5.436e+04 – 5.614e+04	95
5.614e+04 – 5.792e+04	0
5.792e+04 – 5.97e+04	0
5.97e+04 – 6.148e+04	0
6.148e+04 – 6.326e+04	0
6.326e+04 – 6.504e+04	0
6.504e+04 – 6.682e+04	0
6.682e+04 – 6.86e+04	0
6.86e+04 – 7.037e+04	0
7.037e+04 – 7.215e+04	78

county_name text identifier

This column holds fully-qualified US county names (e.g., 'X County, State'), with the token 'county,' appearing in 2,999 of 3,222 rows and state names like Texas (256), Virginia (189), and Georgia (159) topping the word frequencies. Every one of the 3,222 values is unique with zero nulls or duplicates. Lengths start at 16 characters and 95% of values are 31 characters or shorter (mean 24.3), with a thin tail out to 59. The 223 rows missing the 'county,' token likely correspond to parishes (Louisiana), boroughs/census areas (Alaska), municipios (Puerto Rico, FIPS prefix 72), or independent cities, which an analyst should not treat as data quality issues.

Treatment: Split into county and state fields and left-join on a county FIPS lookup.

anthropic:claude-opus-4-7 · confidence high
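
The split suggested in the treatment can be sketched as follows. This assumes every value ends in ", State", as the 'X County, State' pattern implies; the function name and the sample strings are illustrative, since the report does not expose raw rows.

```python
def split_county_name(full_name: str) -> tuple[str, str]:
    """Split 'X County, State' into (county, state) on the last ', '."""
    # rsplit from the right so a comma inside the county part is preserved.
    county, state = full_name.rsplit(", ", 1)
    return county.strip(), state.strip()


# Illustrative inputs, not rows taken from the dataset.
examples = [
    "Autauga County, Alabama",
    "Orleans Parish, Louisiana",
    "Anchorage Municipality, Alaska",
]
for name in examples:
    print(split_county_name(name))
```

A value without the separator raises ValueError here, which is a deliberate choice: it surfaces unexpected formats instead of silently producing an empty state.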

Out[16]:

saturn.columns["county_name"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,222
len_min	16
len_max	59
len_mean	24.32
len_median	24
len_p95	31
word_mean	3.248
word_median	3
n_empty	0
n_duplicates	0
duplicate_rate	0
vocab_size	1,990
readability_flesch_mean	10.28
emoji_rate	0
url_rate	0
one_word_rate	0
allcaps_rate	0
boilerplate_rate	0
alert: near_unique	100.0% of rows are unique strings

Fig 9.

Character-length distribution for county_name.

Character-length distribution for county_name (mean: 24.32).
chars	count
16 – 17	26
17 – 18	72
18 – 19	121
19 – 20	190
20 – 21	264
21 – 22	407
22 – 24	420
24 – 25	363
25 – 26	320
26 – 27	240
27 – 28	231
28 – 29	152
29 – 30	139
30 – 31	165
31 – 32	41
32 – 33	28
33 – 34	16
34 – 35	10
35 – 36	5
36 – 38	0
38 – 39	1
39 – 40	1
40 – 41	0
41 – 42	1
42 – 43	1
43 – 44	0
44 – 45	2
45 – 46	0
46 – 47	1
47 – 48	1
48 – 49	0
49 – 50	0
50 – 51	0
51 – 53	0
53 – 54	2
54 – 55	1
55 – 56	0
56 – 57	0
57 – 58	0
58 – 59	1

total_renters numeric feature

This column reports a count of renters per record, ranging from 28 to 1,810,929 with a median of 2,579.5 and a mean of 13,851.1, consistent with geographic or administrative aggregates rather than individual-level data. The distribution is severely right-skewed (skew 15.82, kurtosis 398.15) and 449 of 3,222 rows (13.9%) flag as outliers, with the std (55,351.6) dwarfing the IQR (6,392). No nulls or zeros are present, and 2,709 of 3,222 values are unique. Every summary statistic matches renter_occupied and the two columns correlate at +1.00, so they appear to be duplicates and only one should be kept as a feature.

Treatment: Log-transform before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
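
To see what the log-transform buys, the sketch below applies log10 to the five summary values from the stats table. It is an illustration on summary statistics only, not on the underlying rows, and the variable names are hypothetical.

```python
import math

# Summary values for total_renters taken from the stats table.
summary = {"min": 28, "q1": 1004, "median": 2579.5, "q3": 7396, "max": 1_810_929}

logged = {name: math.log10(value) for name, value in summary.items()}

for name, value in summary.items():
    print(f"{name:>6}: raw={value:>12,.1f}  log10={logged[name]:.2f}")

# How far the maximum sits from the median, before and after the transform.
raw_ratio = summary["max"] / summary["median"]
log_spread = logged["max"] - logged["min"]
print(f"max/median (raw): {raw_ratio:.0f}x, full range in log10 units: {log_spread:.2f}")
```

On the raw scale the largest county is hundreds of times the median; on the log scale the whole column fits in fewer than five units, which is why the transform is suggested before modelling.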

Out[19]:

saturn.columns["total_renters"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	2,709
min	28
max	1.811e+06
mean	1.385e+04
median	2580
std	5.535e+04
q1	1004
q3	7396
iqr	6,392
skew	15.82
kurtosis	398.2
n_outliers	449
outlier_rate	0.1394
zero_rate	0
alert: high_skew	skew=+15.82
alert: outliers	13.9% rows beyond 1.5 IQR
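
The outlier alert refers to rows "beyond 1.5 IQR". Assuming this is the standard Tukey fence rule (an assumption; the report does not spell out the formula), the cut-offs can be recomputed from q1 and q3 alone. The function name is hypothetical.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# q1 and q3 for total_renters from the stats table above.
lower, upper = tukey_fences(q1=1004, q3=7396)
print(lower, upper)
```

Under that assumption the lower fence is negative, below the column minimum of 28, so every flagged row is on the high side.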

Fig 10.

Distribution of total_renters. Vertical dash marks the median.

Histogram bins for total_renters (median: 2579.5).
bin	count
28 – 4.53e+04	3019
4.53e+04 – 9.057e+04	109
9.057e+04 – 1.358e+05	38
1.358e+05 – 1.811e+05	17
1.811e+05 – 2.264e+05	11
2.264e+05 – 2.717e+05	9
2.717e+05 – 3.169e+05	5
3.169e+05 – 3.622e+05	0
3.622e+05 – 4.075e+05	2
4.075e+05 – 4.528e+05	2
4.528e+05 – 4.98e+05	3
4.98e+05 – 5.433e+05	1
5.433e+05 – 5.886e+05	1
5.886e+05 – 6.338e+05	1
6.338e+05 – 6.791e+05	0
6.791e+05 – 7.244e+05	1
7.244e+05 – 7.697e+05	1
7.697e+05 – 8.149e+05	0
8.149e+05 – 8.602e+05	0
8.602e+05 – 9.055e+05	1
9.055e+05 – 9.508e+05	0
9.508e+05 – 9.96e+05	0
9.96e+05 – 1.041e+06	0
1.041e+06 – 1.087e+06	0
1.087e+06 – 1.132e+06	0
1.132e+06 – 1.177e+06	0
1.177e+06 – 1.222e+06	0
1.222e+06 – 1.268e+06	0
1.268e+06 – 1.313e+06	0
1.313e+06 – 1.358e+06	0
1.358e+06 – 1.403e+06	0
1.403e+06 – 1.449e+06	0
1.449e+06 – 1.494e+06	0
1.494e+06 – 1.539e+06	0
1.539e+06 – 1.585e+06	0
1.585e+06 – 1.63e+06	0
1.63e+06 – 1.675e+06	0
1.675e+06 – 1.72e+06	0
1.72e+06 – 1.766e+06	0
1.766e+06 – 1.811e+06	1

pct_rent_burdened_30plus numeric feature

Percentage of renter households spending 30%+ of income on rent, reported per record (n=3,222). The distribution is roughly centered, with median 37.36 and an interquartile range of 30.67 to 43.48 (IQR 12.81); it is mildly left-skewed (-0.57) and spans 0 to 64.96, with 58 outliers (1.8%) and a small zero_rate of 0.25%. With 2,146 unique values out of 3,222, granularity is high but not near-unique.

Treatment: Use as-is as a numeric feature; no transform needed given near-symmetric, bounded percentage scale.

anthropic:claude-opus-4-7 · confidence high

Out[22]:

saturn.columns["pct_rent_burdened_30plus"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	2,146
min	0
max	64.96
mean	36.44
median	37.36
std	10.01
q1	30.67
q3	43.48
iqr	12.81
skew	-0.5673
kurtosis	0.5032
n_outliers	58
outlier_rate	0.018
zero_rate	0.002483

Fig 11.

Distribution of pct_rent_burdened_30plus. Vertical dash marks the median.

Histogram bins for pct_rent_burdened_30plus (median: 37.36).
bin	count
0 – 1.624	9
1.624 – 3.248	5
3.248 – 4.872	3
4.872 – 6.496	5
6.496 – 8.12	9
8.12 – 9.744	13
9.744 – 11.37	11
11.37 – 12.99	16
12.99 – 14.62	26
14.62 – 16.24	19
16.24 – 17.86	35
17.86 – 19.49	43
19.49 – 21.11	52
21.11 – 22.74	52
22.74 – 24.36	73
24.36 – 25.98	99
25.98 – 27.61	109
27.61 – 29.23	116
29.23 – 30.86	132
30.86 – 32.48	159
32.48 – 34.1	189
34.1 – 35.73	209
35.73 – 37.35	227
37.35 – 38.98	239
38.98 – 40.6	205
40.6 – 42.22	209
42.22 – 43.85	210
43.85 – 45.47	190
45.47 – 47.1	131
47.1 – 48.72	114
48.72 – 50.34	118
50.34 – 51.97	69
51.97 – 53.59	51
53.59 – 55.22	34
55.22 – 56.84	24
56.84 – 58.46	6
58.46 – 60.09	3
60.09 – 61.71	2
61.71 – 63.34	3
63.34 – 64.96	3

pct_rent_burdened_50plus numeric feature

County-level percentage of renter households spending 50%+ of income on rent (severely rent-burdened). Values span 0 to 64.96 with mean 17.35 and median 17.62, and the distribution is nearly symmetric (skew 0.05, kurtosis 0.98) with only 1.5% outliers. The maximum is isolated: the histogram shows a single row in the top bin (63.34–64.96) and nothing else above 47.1, so that row deserves a spot check. About 0.9% of rows are exactly zero and there are no nulls across 3,222 records.

Treatment: Use as-is in modelling; no transform needed given near-symmetric distribution.

anthropic:claude-opus-4-7 · confidence high

Out[25]:

saturn.columns["pct_rent_burdened_50plus"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,769
min	0
max	64.96
mean	17.35
median	17.62
std	6.577
q1	13.07
q3	21.63
iqr	8.557
skew	0.05436
kurtosis	0.9823
n_outliers	47
outlier_rate	0.01459
zero_rate	0.009311

Fig 12.

Distribution of pct_rent_burdened_50plus. Vertical dash marks the median.

Histogram bins for pct_rent_burdened_50plus (median: 17.62).
bin	count
0 – 1.624	42
1.624 – 3.248	27
3.248 – 4.872	34
4.872 – 6.496	63
6.496 – 8.12	102
8.12 – 9.744	148
9.744 – 11.37	163
11.37 – 12.99	214
12.99 – 14.62	242
14.62 – 16.24	310
16.24 – 17.86	315
17.86 – 19.49	332
19.49 – 21.11	335
21.11 – 22.74	264
22.74 – 24.36	219
24.36 – 25.98	150
25.98 – 27.61	99
27.61 – 29.23	64
29.23 – 30.86	39
30.86 – 32.48	20
32.48 – 34.1	21
34.1 – 35.73	9
35.73 – 37.35	2
37.35 – 38.98	3
38.98 – 40.6	1
40.6 – 42.22	1
42.22 – 43.85	1
43.85 – 45.47	0
45.47 – 47.1	1
47.1 – 48.72	0
48.72 – 50.34	0
50.34 – 51.97	0
51.97 – 53.59	0
53.59 – 55.22	0
55.22 – 56.84	0
56.84 – 58.46	0
58.46 – 60.09	0
60.09 – 61.71	0
61.71 – 63.34	0
63.34 – 64.96	1

median_gross_rent numeric feature

Numeric column capturing the median gross rent (presumably USD per month) across 3,222 rows with only 0.31% missing and no zeros. The distribution is right-skewed (skew 1.76, kurtosis 4.55) with median 818 and mean 890.9, and 225 values (7.0%) flagged as outliers stretching up to 2,805 against a Q3 of 978.

Treatment: Log-transform or winsorize before regression to tame the right-skew and high-rent outliers.

anthropic:claude-opus-4-7 · confidence high
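
Winsorizing, the alternative named in the treatment, clips values to chosen percentiles instead of rescaling them. A minimal stdlib sketch, assuming nearest-rank percentiles at a hypothetical 5% / 95% and that the 10 null rows have already been dropped; the toy sample is built from the summary statistics, not from real rows.

```python
def winsorize(values, lower=0.05, upper=0.95):
    """Clip values to nearest-rank percentiles of the sample itself."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lo = ordered[int(lower * last)]
    hi = ordered[int(upper * last)]
    return [min(max(v, lo), hi) for v in values]


# Toy sample: min, q1, median, q3 and max of median_gross_rent.
rents = [297, 718, 818, 978, 2805]
print(winsorize(rents))  # -> [297, 718, 818, 978, 978]
```

Unlike a log-transform, this keeps the column in dollars, at the cost of flattening real differences among the highest-rent counties.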

Out[28]:

saturn.columns["median_gross_rent"].stats

stat	value
n	3,222
nulls	10 (0.3%)
unique	983
min	297
max	2,805
mean	890.9
median	818
std	283.4
q1	718
q3	978
iqr	260
skew	1.763
kurtosis	4.55
n_outliers	225
outlier_rate	0.07005
zero_rate	0
alert: outliers	7.0% rows beyond 1.5 IQR

Fig 13.

Distribution of median_gross_rent. Vertical dash marks the median.

Histogram bins for median_gross_rent (median: 818.0).
bin	count
297 – 359.7	5
359.7 – 422.4	14
422.4 – 485.1	32
485.1 – 547.8	69
547.8 – 610.5	128
610.5 – 673.2	242
673.2 – 735.9	457
735.9 – 798.6	515
798.6 – 861.3	423
861.3 – 924	306
924 – 986.7	251
986.7 – 1049	140
1049 – 1112	105
1112 – 1175	98
1175 – 1238	79
1238 – 1300	71
1300 – 1363	52
1363 – 1426	48
1426 – 1488	26
1488 – 1551	22
1551 – 1614	33
1614 – 1676	13
1676 – 1739	19
1739 – 1802	10
1802 – 1864	13
1864 – 1927	8
1927 – 1990	11
1990 – 2053	6
2053 – 2115	4
2115 – 2178	3
2178 – 2241	4
2241 – 2303	1
2303 – 2366	1
2366 – 2429	0
2429 – 2492	1
2492 – 2554	0
2554 – 2617	0
2617 – 2680	0
2680 – 2742	1
2742 – 2805	1

median_household_income numeric feature

Median household income in dollars, at US county level given n=3,222, with values from 14,525 to 170,463. The distribution is moderately right-skewed (skew 0.95, kurtosis 2.96) and the mean (62,327) sits above the median (60,461). The 187 outliers (5.8%) are not all on the high side: with fences at 1.5 IQR (about 23,989 and 98,213), the histogram places at least 44 rows below the lower fence. Coverage is near-complete, with 1 null (0.03%) and no zeros.

Treatment: Log-transform before regression to tame the right skew and high-income outliers.

anthropic:claude-opus-4-7 · confidence high

Out[31]:

saturn.columns["median_household_income"].stats

stat	value
n	3,222
nulls	1 (0.0%)
unique	3,098
min	14,525
max	170,463
mean	6.233e+04
median	60,461
std	1.777e+04
q1	51,823
q3	70,379
iqr	18,556
skew	0.9478
kurtosis	2.962
n_outliers	187
outlier_rate	0.05806
zero_rate	0
alert: outliers	5.8% rows beyond 1.5 IQR

Fig 14.

Distribution of median_household_income. Vertical dash marks the median.

Histogram bins for median_household_income (median: 60461.0).
bin	count
1.452e+04 – 1.842e+04	14
1.842e+04 – 2.232e+04	30
2.232e+04 – 2.622e+04	26
2.622e+04 – 3.012e+04	10
3.012e+04 – 3.402e+04	28
3.402e+04 – 3.792e+04	52
3.792e+04 – 4.181e+04	102
4.181e+04 – 4.571e+04	154
4.571e+04 – 4.961e+04	237
4.961e+04 – 5.351e+04	286
5.351e+04 – 5.741e+04	352
5.741e+04 – 6.131e+04	400
6.131e+04 – 6.52e+04	342
6.52e+04 – 6.91e+04	294
6.91e+04 – 7.3e+04	231
7.3e+04 – 7.69e+04	150
7.69e+04 – 8.08e+04	136
8.08e+04 – 8.47e+04	99
8.47e+04 – 8.86e+04	52
8.86e+04 – 9.249e+04	43
9.249e+04 – 9.639e+04	44
9.639e+04 – 1.003e+05	26
1.003e+05 – 1.042e+05	23
1.042e+05 – 1.081e+05	23
1.081e+05 – 1.12e+05	11
1.12e+05 – 1.159e+05	9
1.159e+05 – 1.198e+05	12
1.198e+05 – 1.237e+05	10
1.237e+05 – 1.276e+05	5
1.276e+05 – 1.315e+05	4
1.315e+05 – 1.354e+05	3
1.354e+05 – 1.393e+05	6
1.393e+05 – 1.432e+05	2
1.432e+05 – 1.471e+05	1
1.471e+05 – 1.51e+05	1
1.51e+05 – 1.549e+05	1
1.549e+05 – 1.588e+05	0
1.588e+05 – 1.627e+05	0
1.627e+05 – 1.666e+05	1
1.666e+05 – 1.705e+05	1

total_housing_units numeric feature

Counts of total housing units per record, almost certainly aggregated to a geography (likely US counties given n=3222). The distribution is severely right-skewed (skew 12.05, kurtosis 240.5) with a median of 10,021 but a max of 3,363,093, and 443 rows (13.7%) flag as outliers — consistent with a few massive metros dwarfing thousands of small areas. No nulls or zeros, and 3,074 of 3,222 values are unique.

Treatment: Log-transform before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high

Out[34]:

saturn.columns["total_housing_units"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,074
min	32
max	3.363e+06
mean	3.94e+04
median	10,021
std	1.201e+05
q1	4211
q3	25,939
iqr	2.173e+04
skew	12.05
kurtosis	240.5
n_outliers	443
outlier_rate	0.1375
zero_rate	0
alert: high_skew	skew=+12.05
alert: outliers	13.7% rows beyond 1.5 IQR

Fig 15.

Distribution of total_housing_units. Vertical dash marks the median.

Histogram bins for total_housing_units (median: 10021.0).
bin	count
32 – 8.411e+04	2907
8.411e+04 – 1.682e+05	153
1.682e+05 – 2.523e+05	62
2.523e+05 – 3.363e+05	38
3.363e+05 – 4.204e+05	22
4.204e+05 – 5.045e+05	6
5.045e+05 – 5.886e+05	11
5.886e+05 – 6.726e+05	5
6.726e+05 – 7.567e+05	5
7.567e+05 – 8.408e+05	3
8.408e+05 – 9.249e+05	1
9.249e+05 – 1.009e+06	3
1.009e+06 – 1.093e+06	1
1.093e+06 – 1.177e+06	1
1.177e+06 – 1.261e+06	0
1.261e+06 – 1.345e+06	0
1.345e+06 – 1.429e+06	0
1.429e+06 – 1.513e+06	0
1.513e+06 – 1.597e+06	0
1.597e+06 – 1.682e+06	1
1.682e+06 – 1.766e+06	1
1.766e+06 – 1.85e+06	0
1.85e+06 – 1.934e+06	0
1.934e+06 – 2.018e+06	0
2.018e+06 – 2.102e+06	1
2.102e+06 – 2.186e+06	0
2.186e+06 – 2.27e+06	0
2.27e+06 – 2.354e+06	0
2.354e+06 – 2.438e+06	0
2.438e+06 – 2.522e+06	0
2.522e+06 – 2.606e+06	0
2.606e+06 – 2.69e+06	0
2.69e+06 – 2.775e+06	0
2.775e+06 – 2.859e+06	0
2.859e+06 – 2.943e+06	0
2.943e+06 – 3.027e+06	0
3.027e+06 – 3.111e+06	0
3.111e+06 – 3.195e+06	0
3.195e+06 – 3.279e+06	0
3.279e+06 – 3.363e+06	1

owner_occupied numeric feature

Likely a count of owner-occupied housing units per geographic area, with 3,001 unique values across 3,222 rows, no nulls, and a single zero (zero_rate 0.0003, i.e. 1 row). The distribution is severely right-skewed (skew 9.52, kurtosis 146.9): the median is 7,325.5 but the mean is 25,551.7 and the max reaches 1,552,164, producing 429 outliers (13.3%).

Treatment: Log-transform before modelling to tame the heavy right tail, using log(1 + x) or handling the zero row separately, since log(0) is undefined.

anthropic:claude-opus-4-7 · confidence high
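
This column has a minimum of 0, so a plain logarithm fails on at least one row. A minimal sketch of the usual workaround, log1p, applied to three values from the stats table; the variable names are hypothetical.

```python
import math

# min, median and max of owner_occupied from the stats table.
values = [0, 7325.5, 1_552_164]

# log1p(x) = log(1 + x) is defined at zero, unlike log(x).
transformed = [math.log1p(v) for v in values]
print([round(t, 3) for t in transformed])

# expm1 inverts the transform when predictions must return to unit counts.
restored = [math.expm1(t) for t in transformed]
print([round(r, 1) for r in restored])
```

For counts in the thousands the added 1 is negligible, so the result is effectively a natural log everywhere except at the zero row.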

Out[37]:

saturn.columns["owner_occupied"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3,001
min	0
max	1.552e+06
mean	2.555e+04
median	7326
std	6.755e+04
q1	3148
q3	1.886e+04
iqr	1.572e+04
skew	9.516
kurtosis	146.9
n_outliers	429
outlier_rate	0.1331
zero_rate	0.0003104
alert: high_skew	skew=+9.52
alert: outliers	13.3% rows beyond 1.5 IQR

Fig 16.

Distribution of owner_occupied. Vertical dash marks the median.

Histogram bins for owner_occupied (median: 7325.5).
bin	count
0 – 3.88e+04	2761
3.88e+04 – 7.761e+04	225
7.761e+04 – 1.164e+05	78
1.164e+05 – 1.552e+05	52
1.552e+05 – 1.94e+05	36
1.94e+05 – 2.328e+05	20
2.328e+05 – 2.716e+05	10
2.716e+05 – 3.104e+05	10
3.104e+05 – 3.492e+05	6
3.492e+05 – 3.88e+05	6
3.88e+05 – 4.268e+05	3
4.268e+05 – 4.656e+05	3
4.656e+05 – 5.045e+05	4
5.045e+05 – 5.433e+05	2
5.433e+05 – 5.821e+05	0
5.821e+05 – 6.209e+05	1
6.209e+05 – 6.597e+05	1
6.597e+05 – 6.985e+05	0
6.985e+05 – 7.373e+05	0
7.373e+05 – 7.761e+05	0
7.761e+05 – 8.149e+05	0
8.149e+05 – 8.537e+05	0
8.537e+05 – 8.925e+05	0
8.925e+05 – 9.313e+05	1
9.313e+05 – 9.701e+05	0
9.701e+05 – 1.009e+06	0
1.009e+06 – 1.048e+06	0
1.048e+06 – 1.087e+06	1
1.087e+06 – 1.125e+06	0
1.125e+06 – 1.164e+06	0
1.164e+06 – 1.203e+06	1
1.203e+06 – 1.242e+06	0
1.242e+06 – 1.281e+06	0
1.281e+06 – 1.319e+06	0
1.319e+06 – 1.358e+06	0
1.358e+06 – 1.397e+06	0
1.397e+06 – 1.436e+06	0
1.436e+06 – 1.475e+06	0
1.475e+06 – 1.513e+06	0
1.513e+06 – 1.552e+06	1

renter_occupied numeric feature

Counts of renter-occupied housing units per record, ranging from 28 to 1,810,929 with a median of just 2,579.5. The distribution is severely right-skewed (skew 15.82, kurtosis 398.15) with 449 outliers (13.9% of rows), consistent with a few very large geographies dominating an otherwise small-county distribution. No nulls or zeros, and 2,709 unique values across 3,222 rows are consistent with county-level granularity.

Treatment: Log-transform before regression to tame the extreme right skew.

anthropic:claude-opus-4-7 · confidence high

Out[40]:

saturn.columns["renter_occupied"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	2,709
min	28
max	1.811e+06
mean	1.385e+04
median	2580
std	5.535e+04
q1	1004
q3	7396
iqr	6,392
skew	15.82
kurtosis	398.2
n_outliers	449
outlier_rate	0.1394
zero_rate	0
alert: high_skew	skew=+15.82
alert: outliers	13.9% rows beyond 1.5 IQR

Fig 17.

Distribution of renter_occupied. Vertical dash marks the median.

Histogram bins for renter_occupied (median: 2579.5).
bin	count
28 – 4.53e+04	3019
4.53e+04 – 9.057e+04	109
9.057e+04 – 1.358e+05	38
1.358e+05 – 1.811e+05	17
1.811e+05 – 2.264e+05	11
2.264e+05 – 2.717e+05	9
2.717e+05 – 3.169e+05	5
3.169e+05 – 3.622e+05	0
3.622e+05 – 4.075e+05	2
4.075e+05 – 4.528e+05	2
4.528e+05 – 4.98e+05	3
4.98e+05 – 5.433e+05	1
5.433e+05 – 5.886e+05	1
5.886e+05 – 6.338e+05	1
6.338e+05 – 6.791e+05	0
6.791e+05 – 7.244e+05	1
7.244e+05 – 7.697e+05	1
7.697e+05 – 8.149e+05	0
8.149e+05 – 8.602e+05	0
8.602e+05 – 9.055e+05	1
9.055e+05 – 9.508e+05	0
9.508e+05 – 9.96e+05	0
9.96e+05 – 1.041e+06	0
1.041e+06 – 1.087e+06	0
1.087e+06 – 1.132e+06	0
1.132e+06 – 1.177e+06	0
1.177e+06 – 1.222e+06	0
1.222e+06 – 1.268e+06	0
1.268e+06 – 1.313e+06	0
1.313e+06 – 1.358e+06	0
1.358e+06 – 1.403e+06	0
1.403e+06 – 1.449e+06	0
1.449e+06 – 1.494e+06	0
1.494e+06 – 1.539e+06	0
1.539e+06 – 1.585e+06	0
1.585e+06 – 1.63e+06	0
1.63e+06 – 1.675e+06	0
1.675e+06 – 1.72e+06	0
1.72e+06 – 1.766e+06	0
1.766e+06 – 1.811e+06	1

pct_renter numeric feature

Percentage of renter-occupied housing units across 3,222 records, ranging from 3.01 to 100.0 with a mean of 27.35 and median of 26.07. The distribution is right-skewed (skew 1.32, kurtosis 4.41) with 88 outliers (2.7%), nearly all on the high side; the 100.0 maximum stands out against a Q3 of 31.66, and the histogram shows only one row in the top bin (97.58–100), so it is a single all-renter locality rather than a cluster.

Treatment: Use as-is or apply a mild transform (e.g., logit on the 0-100 scale) before linear models given the right skew; a logit needs values clipped away from 0 and 100 first, because it is undefined at the 100.0 maximum.

anthropic:claude-opus-4-7 · confidence high
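
A logit on a 0-100 percentage is undefined at exactly 0 or 100, and this column has a maximum of 100. The sketch below clips before transforming; the clipping margin `eps` of 0.5 percentage points is a hypothetical choice, not something the report prescribes.

```python
import math


def logit_pct(pct: float, eps: float = 0.5) -> float:
    """Logit of a 0-100 percentage, clipped to [eps, 100 - eps] first."""
    p = min(max(pct, eps), 100 - eps) / 100
    return math.log(p / (1 - p))


# min, median and max of pct_renter from the stats table.
for pct in (3.01, 26.07, 100.0):
    print(pct, round(logit_pct(pct), 3))
```

The clip maps the single 100% row to the same value as 99.5%, so the choice of `eps` directly controls how extreme that row looks to a linear model.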

Out[43]:

saturn.columns["pct_renter"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	1,925
min	3.01
max	100
mean	27.35
median	26.07
std	8.564
q1	21.64
q3	31.66
iqr	10.02
skew	1.317
kurtosis	4.412
n_outliers	88
outlier_rate	0.02731
zero_rate	0

Fig 18.

Distribution of pct_renter. Vertical dash marks the median.

Histogram bins for pct_renter (median: 26.07).
bin	count
3.01 – 5.435	1
5.435 – 7.859	3
7.859 – 10.28	9
10.28 – 12.71	26
12.71 – 15.13	63
15.13 – 17.56	156
17.56 – 19.98	316
19.98 – 22.41	371
22.41 – 24.83	450
24.83 – 27.26	419
27.26 – 29.68	357
29.68 – 32.11	301
32.11 – 34.53	203
34.53 – 36.96	169
36.96 – 39.38	115
39.38 – 41.81	75
41.81 – 44.23	56
44.23 – 46.66	45
46.66 – 49.08	25
49.08 – 51.5	15
51.5 – 53.93	11
53.93 – 56.35	10
56.35 – 58.78	8
58.78 – 61.2	4
61.2 – 63.63	4
63.63 – 66.05	1
66.05 – 68.48	1
68.48 – 70.9	3
70.9 – 73.33	1
73.33 – 75.75	1
75.75 – 78.18	0
78.18 – 80.6	1
80.6 – 83.03	0
83.03 – 85.45	1
85.45 – 87.88	0
87.88 – 90.3	0
90.3 – 92.73	0
92.73 – 95.15	0
95.15 – 97.58	0
97.58 – 100	1

annual_rent numeric feature

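An annual rent figure (presumably USD), with 3,222 records and 983 distinct values ranging from 3,564 to 33,660 and a median of 9,816. Its min, quartiles, median, and max are exactly 12 times those of median_gross_rent, and the two columns share the same unique count, skew, and kurtosis and correlate at +1.00, so this column is a rescaled copy that adds no information alongside median_gross_rent. The distribution is right-skewed (skew 1.76, kurtosis 4.55) and 225 rows (7.0%) sit beyond the outlier fences, a long tail of high-rent cases above the Q3 of 11,736. Nulls are negligible (0.31%) and there are no zero values.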

Treatment: Log-transform before regression to dampen the right-skew and high-rent outliers.

anthropic:claude-opus-4-7 · confidence high
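
The summary statistics of annual_rent line up with those of median_gross_rent multiplied by 12. The check below uses only the values printed in the two stats tables; it supports, but does not prove, that every row follows the same rule.

```python
# (median_gross_rent, annual_rent) pairs read from the two stats tables.
PAIRS = {
    "min": (297, 3564),
    "q1": (718, 8616),
    "median": (818, 9816),
    "q3": (978, 11736),
    "max": (2805, 33660),
}

for stat, (monthly, annual) in PAIRS.items():
    print(f"{stat:>6}: 12 * {monthly} = {12 * monthly}, reported {annual}")

all_match = all(12 * monthly == annual for monthly, annual in PAIRS.values())
print("all summary values match:", all_match)
```

If a row-level check confirms the relationship, one of the two columns can be dropped before regression to avoid perfect collinearity.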

Out[46]:

saturn.columns["annual_rent"].stats

stat	value
n	3,222
nulls	10 (0.3%)
unique	983
min	3,564
max	33,660
mean	1.069e+04
median	9,816
std	3400
q1	8,616
q3	11,736
iqr	3,120
skew	1.763
kurtosis	4.55
n_outliers	225
outlier_rate	0.07005
zero_rate	0
alert: outliers	7.0% rows beyond 1.5 IQR

Fig 19.

Distribution of annual_rent. Vertical dash marks the median.

Histogram bins for annual_rent (median: 9816.0).
bin	count
3564 – 4316	5
4316 – 5069	14
5069 – 5821	32
5821 – 6574	69
6574 – 7326	128
7326 – 8078	242
8078 – 8831	457
8831 – 9583	515
9583 – 1.034e+04	423
1.034e+04 – 1.109e+04	306
1.109e+04 – 1.184e+04	251
1.184e+04 – 1.259e+04	140
1.259e+04 – 1.335e+04	105
1.335e+04 – 1.41e+04	98
1.41e+04 – 1.485e+04	79
1.485e+04 – 1.56e+04	71
1.56e+04 – 1.635e+04	52
1.635e+04 – 1.711e+04	48
1.711e+04 – 1.786e+04	26
1.786e+04 – 1.861e+04	22
1.861e+04 – 1.936e+04	33
1.936e+04 – 2.012e+04	13
2.012e+04 – 2.087e+04	19
2.087e+04 – 2.162e+04	10
2.162e+04 – 2.237e+04	13
2.237e+04 – 2.313e+04	8
2.313e+04 – 2.388e+04	11
2.388e+04 – 2.463e+04	6
2.463e+04 – 2.538e+04	4
2.538e+04 – 2.614e+04	3
2.614e+04 – 2.689e+04	4
2.689e+04 – 2.764e+04	1
2.764e+04 – 2.839e+04	1
2.839e+04 – 2.915e+04	0
2.915e+04 – 2.99e+04	1
2.99e+04 – 3.065e+04	0
3.065e+04 – 3.14e+04	0
3.14e+04 – 3.216e+04	0
3.216e+04 – 3.291e+04	1
3.291e+04 – 3.366e+04	1

rent_to_income_ratio numeric feature

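Likely a rent-to-income ratio expressed as a percentage, with a tight interquartile band between 15.1 and 19.39 and median 17.06. The distribution is severely contaminated: skew of 53.98 and kurtosis of 3007 are driven by a max of 1200.0 against a mean of 17.89. Although 107 rows (3.33%) fall outside the 1.5 IQR fences (IQR 4.29), the histogram shows only six rows above 35.95 and a single row near 1200. If the ratio is derived from annual_rent and median_household_income, their reported ranges cap it at about 232% (33,660 / 14,525), so the 1200 value cannot be genuine under that reading. Nulls are negligible at 0.28% and there are no zeros, but the extreme tail suggests data-entry errors or unit inconsistencies.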

Treatment: Winsorize or cap extreme values and log-transform before modelling.

anthropic:claude-opus-4-7 · confidence high
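
One way to test whether the 1200 maximum is plausible is to bound the ratio from its likely inputs. The sketch assumes, without confirmation from the report, that the column is 100 × annual_rent / median_household_income; the constant and function names are hypothetical.

```python
# Extremes reported in the stats tables for the two assumed source columns.
MAX_ANNUAL_RENT = 33_660
MIN_MEDIAN_INCOME = 14_525
REPORTED_MAX_RATIO = 1_200


def max_possible_ratio(max_rent: float, min_income: float) -> float:
    """Largest rent-to-income percentage the two column ranges allow."""
    return 100 * max_rent / min_income


bound = max_possible_ratio(MAX_ANNUAL_RENT, MIN_MEDIAN_INCOME)
suspicious = REPORTED_MAX_RATIO > bound
print(f"upper bound: {bound:.1f}%  reported max: {REPORTED_MAX_RATIO}%  suspicious: {suspicious}")
```

Even pairing the highest rent with the lowest income stays far below 1200, so under this assumption the row is better inspected and corrected than merely winsorized.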

Out[49]:

saturn.columns["rent_to_income_ratio"].stats

stat	value
n	3,222
nulls	9 (0.3%)
unique	1,269
min	6.1
max	1,200
mean	17.89
median	17.06
std	21.2
q1	15.1
q3	19.39
iqr	4.29
skew	53.98
kurtosis	3007
n_outliers	107
outlier_rate	0.0333
zero_rate	0
alert: high_skew	skew=+53.98

Fig 20.

Distribution of rent_to_income_ratio. Vertical dash marks the median.

Histogram bins for rent_to_income_ratio (median: 17.06).
bin	count
6.1 – 35.95	3207
35.95 – 65.8	5
65.8 – 95.64	0
95.64 – 125.5	0
125.5 – 155.3	0
155.3 – 185.2	0
185.2 – 215	0
215 – 244.9	0
244.9 – 274.7	0
274.7 – 304.6	0
304.6 – 334.4	0
334.4 – 364.3	0
364.3 – 394.1	0
394.1 – 424	0
424 – 453.8	0
453.8 – 483.7	0
483.7 – 513.5	0
513.5 – 543.4	0
543.4 – 573.2	0
573.2 – 603.1	0
603.1 – 632.9	0
632.9 – 662.7	0
662.7 – 692.6	0
692.6 – 722.4	0
722.4 – 752.3	0
752.3 – 782.1	0
782.1 – 812	0
812 – 841.8	0
841.8 – 871.7	0
871.7 – 901.5	0
901.5 – 931.4	0
931.4 – 961.2	0
961.2 – 991.1	0
991.1 – 1021	0
1021 – 1051	0
1051 – 1081	0
1081 – 1110	0
1110 – 1140	0
1140 – 1170	0
1170 – 1200	1

affordability_category categorical label

A 3-level categorical flag bucketing rows into housing affordability tiers. The distribution is extremely degenerate: 'Affordable' covers 3192 of 3222 rows (top_rate 0.9907), 'Moderately Burdened' has 29, and 'Extremely Burdened' has just 1, yielding an entropy_ratio of 0.049. As a predictor it carries almost no information, and with only one 'Extremely Burdened' row that class cannot appear in both the training and the test partition of any split.

Treatment: Collapse to a binary Affordable vs. Burdened flag or drop; near-constant as-is.

anthropic:claude-opus-4-7 · confidence high
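
The entropy figures in the stats table can be reproduced from the three class counts, assuming entropy is measured in bits and entropy_ratio divides by log2 of the cardinality (an inference from the numbers, not documented Saturn behavior). The sketch also shows the suggested binary collapse; `entropy_bits` and `collapse` are hypothetical names.

```python
import math

def entropy_bits(counts):
    """Shannon entropy in bits for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Class counts as reported for affordability_category.
counts = {"Affordable": 3192, "Moderately Burdened": 29, "Extremely Burdened": 1}

h = entropy_bits(list(counts.values()))
ratio = h / math.log2(len(counts))  # normalise by the maximum entropy for 3 classes

def collapse(label):
    """Map the three tiers onto a binary Affordable vs. Burdened flag."""
    return "Affordable" if label == "Affordable" else "Burdened"

binary_counts = {}
for label, n in counts.items():
    key = collapse(label)
    binary_counts[key] = binary_counts.get(key, 0) + n

print(round(h, 5), round(ratio, 5))
print(binary_counts)
```

Under those assumptions the first line should come out close to the reported 0.07815 and 0.04931. After the collapse the minority class holds 30 rows instead of 29 and 1, which is still under 1% of the data.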

Out[52]:

saturn.columns["affordability_category"].stats

stat	value
n	3,222
nulls	0 (0.0%)
unique	3
top_value	Affordable
top_rate	0.9907
cardinality	3
entropy	0.07815
entropy_ratio	0.04931
alert: imbalance	top value is 99.1% of rows

Fig 21.

Top values for affordability_category.

Top values for affordability_category (3 unique shown, of 3 total).
value	count	share
Affordable	3192	99.1%
Moderately Burdened	29	0.9%
Extremely Burdened	1	0.0%

hours_at_min_wage_for_rent numeric feature

This column reports the number of minimum-wage work hours required to afford rent, with values ranging from 41 to 387 (median 113, mean 122.9). The distribution is right-skewed (skew 1.76, kurtosis 4.55) and 222 rows (6.9%) flag as outliers in the upper tail, suggesting a subset of high-cost areas where rent demands far more hours than typical. Nulls are negligible (0.31%) and there are no zeros, so coverage is essentially complete.

Treatment: Log-transform or winsorize before regression to dampen the right-tail outliers.

anthropic:claude-opus-4-7 · confidence high

Out[55]:

saturn.columns["hours_at_min_wage_for_rent"].stats

stat	value
n	3,222
nulls	10 (0.3%)
unique	229
min	41
max	387
mean	122.9
median	113
std	39.09
q1	99
q3	135
iqr	36
skew	1.763
kurtosis	4.546
n_outliers	222
outlier_rate	0.06912
zero_rate	0
alert: outliers	6.9% rows beyond 1.5 IQR
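
The outlier alert can be checked by hand from the quartiles in the table. The sketch below computes the 1.5 × IQR fences and the outlier rate; the choice of non-null rows as the denominator is an inference from the reported numbers (222 / 3212 matches 0.06912, while 222 / 3222 does not), not documented Saturn behavior, and both function names are hypothetical.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def outlier_rate(n_outliers, n_rows, n_nulls):
    """Share of non-null rows flagged as outliers (assumed denominator)."""
    return n_outliers / (n_rows - n_nulls)

# Values reported for hours_at_min_wage_for_rent.
low, high = tukey_fences(99, 135)
rate = outlier_rate(222, 3222, 10)

print(low, high, round(rate, 5))  # 45.0 189.0 0.06912
```

Any county needing fewer than 45 or more than 189 minimum-wage hours would therefore be flagged under this rule.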

Fig 22.

Distribution of hours_at_min_wage_for_rent. Vertical dash marks the median.

Histogram bins for hours_at_min_wage_for_rent (median: 113.0).
bin	count
41 – 49.65	5
49.65 – 58.3	14
58.3 – 66.95	29
66.95 – 75.6	72
75.6 – 84.25	132
84.25 – 92.9	232
92.9 – 101.6	463
101.6 – 110.2	536
110.2 – 118.9	392
118.9 – 127.5	319
127.5 – 136.2	251
136.2 – 144.8	131
144.8 – 153.4	111
153.4 – 162.1	103
162.1 – 170.8	74
170.8 – 179.4	73
179.4 – 188.1	53
188.1 – 196.7	44
196.7 – 205.3	27
205.3 – 214	20
214 – 222.7	35
222.7 – 231.3	14
231.3 – 240	17
240 – 248.6	11
248.6 – 257.2	13
257.2 – 265.9	8
265.9 – 274.6	11
274.6 – 283.2	6
283.2 – 291.9	4
291.9 – 300.5	3
300.5 – 309.2	4
309.2 – 317.8	1
317.8 – 326.4	1
326.4 – 335.1	0
335.1 – 343.8	1
343.8 – 352.4	0
352.4 – 361.1	0
361.1 – 369.7	0
369.7 – 378.4	1
378.4 – 387	1

weeks_at_min_wage_for_rent numeric feature

This column reports the number of weeks of minimum-wage work needed to cover rent, ranging from 1.0 to 9.7 with a median of 2.8 and IQR of 0.9. The distribution is right-skewed (skew 1.76, kurtosis 4.57) and 222 rows (6.9%) flag as outliers on the high end, pointing to localities where rent dramatically outpaces minimum wage. Nulls are negligible (0.31%) and only 71 unique values appear across 3222 rows, suggesting rounded or coarsely binned figures.

Treatment: Log-transform or winsorize before regression to dampen the right tail.

anthropic:claude-opus-4-7 · confidence high
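
The summary stats suggest this column may simply be hours_at_min_wage_for_rent divided by a 40-hour week and rounded to one decimal: the two columns share the same skew (1.763), null count, and outlier count. That derivation is an assumption, not something the source states. A minimal check against the reported min, median, and max, using a hypothetical `weeks_from_hours` helper:

```python
def weeks_from_hours(hours, hours_per_week=40):
    """Convert hours to weeks, rounded to one decimal (assumed derivation)."""
    return round(hours / hours_per_week, 1)

# (hours stat, weeks stat) pairs taken from the two stats tables:
# min, median, and max of each column.
reported = {41: 1.0, 113: 2.8, 387: 9.7}

mismatches = {
    hours: weeks
    for hours, weeks in reported.items()
    if abs(weeks_from_hours(hours) - weeks) > 1e-9
}
print(mismatches)  # {}
```

If the same relationship holds row by row in the CSV, the two columns are redundant and only one needs to enter a model.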

Out[58]:

saturn.columns["weeks_at_min_wage_for_rent"].stats

stat	value
n	3,222
nulls	10 (0.3%)
unique	71
min	1
max	9.7
mean	3.072
median	2.8
std	0.9775
q1	2.5
q3	3.4
iqr	0.9
skew	1.763
kurtosis	4.567
n_outliers	222
outlier_rate	0.06912
zero_rate	0
alert: outliers	6.9% rows beyond 1.5 IQR

Fig 23.

Distribution of weeks_at_min_wage_for_rent. Vertical dash marks the median.

Histogram bins for weeks_at_min_wage_for_rent (median: 2.8).
bin	count
1 – 1.218	5
1.218 – 1.435	14
1.435 – 1.652	29
1.652 – 1.87	61
1.87 – 2.087	107
2.087 – 2.305	294
2.305 – 2.522	437
2.522 – 2.74	483
2.74 – 2.957	408
2.957 – 3.175	298
3.175 – 3.392	230
3.392 – 3.61	240
3.61 – 3.827	95
3.827 – 4.045	89
4.045 – 4.262	74
4.262 – 4.48	68
4.48 – 4.697	48
4.697 – 4.915	56
4.915 – 5.132	25
5.132 – 5.35	20
5.35 – 5.567	34
5.567 – 5.785	11
5.785 – 6.002	23
6.002 – 6.22	13
6.22 – 6.437	12
6.437 – 6.655	5
6.655 – 6.872	11
6.872 – 7.09	5
7.09 – 7.307	5
7.307 – 7.525	3
7.525 – 7.742	4
7.742 – 7.96	1
7.96 – 8.177	1
8.177 – 8.395	0
8.395 – 8.612	1
8.612 – 8.83	0
8.83 – 9.047	0
9.047 – 9.265	0
9.265 – 9.482	1
9.482 – 9.7	1

How to cite