economic unemployment by county

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/economic/unemployment_by_county.csv

Saturn profiled 3,222 rows across 8 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
# Install the Saturn CLI (notebook shell escape).
!pip install saturn-dissect
import subprocess

# Profile the CSV, write the findings to JSON, and request LLM-written prose.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/us-inequality-atlas/economic/unemployment_by_county.csv",
    "--findings", "economic-unemployment_by_county.json",
    "--llm", "anthropic:claude-opus-4-7",
])

Summary confidence: high

This dataset contains 3,222 US county-level labor market records with 8 columns covering county identifiers (FIPS, name, state) and workforce statistics (labor force, population aged 16+, unemployed, unemployment rate, participation rate). The unemployment rate averages 5.13% with a median of 4.69% and a maximum of 31.99%, so the right tail is worth inspecting for distressed counties. The count columns (labor_force, total_16_plus, unemployed) are extremely right-skewed (skew > 13) with 417 to 459 outliers each. That is expected when a few large metros sit alongside small rural counties, but it means a log transform is advisable before modeling. Texas (254), Georgia (159), and Virginia (133) contribute the most rows, which reflects how many counties each state has rather than any sampling bias. County names show a 39% duplicate rate, driven by names such as Washington, Jefferson, and Franklin County recurring across states, so join on FIPS, not name.

citing: row_count · column_count · columns.unemployment_rate.stats · columns.labor_force.stats · columns.total_16_plus.stats · columns.unemployed.stats · columns.labor_force_participation_rate.stats · columns.state.top_values · columns.county_name.stats · columns.county_name.top_values
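The citation keys above read like dotted paths into the findings file. The sketch below shows one way such a path could be resolved against nested dictionaries. The layout of `findings` is an assumption made for illustration (the real JSON schema is not shown in this report), and `resolve` is a hypothetical helper, not part of saturn-dissect.

```python
from functools import reduce

# Hypothetical excerpt of a findings document; the real JSON layout is an assumption.
# The numbers are copied from the stats reported in this notebook.
findings = {
    "row_count": 3222,
    "column_count": 8,
    "columns": {
        "unemployment_rate": {"stats": {"mean": 5.127, "median": 4.69, "max": 31.99}},
    },
}


def resolve(doc, dotted_path):
    """Walk a dotted citation path (e.g. 'columns.x.stats') through nested dicts."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), doc)


row_count = resolve(findings, "row_count")
rate_stats = resolve(findings, "columns.unemployment_rate.stats")
print(row_count, rate_stats["median"])
```

A missing key raises `KeyError`, which is usually what you want when checking that a cited stat actually exists.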

Fig 1.
unemployment_rate · Check the right tail beyond ~10% (roughly the upper 1.5 IQR fence) to find counties with unusually high unemployment.
Histogram bins for unemployment_rate (median: 4.69).
bin              count
0 – 0.7997       60
0.7997 – 1.599   90
1.599 – 2.399    197
2.399 – 3.199    333
3.199 – 3.999    464
3.999 – 4.798    539
4.798 – 5.598    492
5.598 – 6.398    361
6.398 – 7.198    217
7.198 – 7.997    142
7.997 – 8.797    85
8.797 – 9.597    61
9.597 – 10.4     36
10.4 – 11.2      36
11.2 – 12        13
12 – 12.8        18
12.8 – 13.6      13
13.6 – 14.4      12
14.4 – 15.2      11
15.2 – 15.99     8
15.99 – 16.79    5
16.79 – 17.59    4
17.59 – 18.39    2
18.39 – 19.19    1
19.19 – 19.99    3
19.99 – 20.79    3
20.79 – 21.59    4
21.59 – 22.39    3
22.39 – 23.19    2
23.19 – 23.99    3
23.99 – 24.79    1
24.79 – 25.59    0
25.59 – 26.39    0
26.39 – 27.19    0
27.19 – 27.99    0
27.99 – 28.79    0
28.79 – 29.59    0
29.59 – 30.39    0
30.39 – 31.19    1
31.19 – 31.99    2
Fig 2.
labor_force_participation_rate · Mildly left-skewed around a median of about 58.7%; values below roughly 37% (the lower 1.5 IQR fence) flag counties with weak workforce engagement.
Histogram bins for labor_force_participation_rate (median: 58.725).
bin              count
18.63 – 20.27    1
20.27 – 21.9     0
21.9 – 23.54     0
23.54 – 25.17    1
25.17 – 26.81    1
26.81 – 28.44    1
28.44 – 30.08    0
30.08 – 31.71    4
31.71 – 33.35    6
33.35 – 34.98    8
34.98 – 36.62    10
36.62 – 38.25    24
38.25 – 39.89    30
39.89 – 41.52    37
41.52 – 43.16    51
43.16 – 44.79    61
44.79 – 46.43    60
46.43 – 48.06    76
48.06 – 49.7     109
49.7 – 51.34     141
51.34 – 52.97    186
52.97 – 54.61    174
54.61 – 56.24    235
56.24 – 57.88    245
57.88 – 59.51    272
59.51 – 61.15    270
61.15 – 62.78    277
62.78 – 64.42    252
64.42 – 66.05    221
66.05 – 67.69    187
67.69 – 69.32    118
69.32 – 70.96    82
70.96 – 72.59    41
72.59 – 74.23    19
74.23 – 75.86    10
75.86 – 77.5     4
77.5 – 79.13     6
79.13 – 80.77    1
80.77 – 82.4     0
82.4 – 84.04     1
Fig 3.
state · TX, GA, and VA lead the row count simply because they have the most counties.
Top values for state (20 unique shown, of 52 total).
value  count  share
TX     254    7.9%
GA     159    4.9%
VA     133    4.1%
KY     120    3.7%
MO     115    3.6%
KS     105    3.3%
IL     102    3.2%
NC     100    3.1%
IA     99     3.1%
TN     95     2.9%
NE     93     2.9%
IN     92     2.9%
OH     88     2.7%
MN     87     2.7%
MI     83     2.6%
MS     82     2.5%
PR     78     2.4%
OK     77     2.4%
AR     75     2.3%
WI     72     2.2%
Fig 4.
labor_force · Severe right skew (max 5.2M vs median 11.6K) — consider a log scale before any modeling.
Histogram bins for labor_force (median: 11608.5).
bin                      count
36 – 1.311e+05           2944
1.311e+05 – 2.621e+05    136
2.621e+05 – 3.931e+05    52
3.931e+05 – 5.241e+05    41
5.241e+05 – 6.551e+05    14
6.551e+05 – 7.862e+05    10
7.862e+05 – 9.172e+05    5
9.172e+05 – 1.048e+06    5
1.048e+06 – 1.179e+06    4
1.179e+06 – 1.31e+06     2
1.31e+06 – 1.441e+06     3
1.441e+06 – 1.572e+06    0
1.572e+06 – 1.703e+06    1
1.703e+06 – 1.834e+06    1
1.834e+06 – 1.965e+06    0
1.965e+06 – 2.096e+06    0
2.096e+06 – 2.227e+06    0
2.227e+06 – 2.358e+06    1
2.358e+06 – 2.489e+06    1
2.489e+06 – 2.62e+06     0
2.62e+06 – 2.751e+06     0
2.751e+06 – 2.882e+06    1
2.882e+06 – 3.013e+06    0
3.013e+06 – 3.145e+06    0
3.145e+06 – 3.276e+06    0
3.276e+06 – 3.407e+06    0
3.407e+06 – 3.538e+06    0
3.538e+06 – 3.669e+06    0
3.669e+06 – 3.8e+06      0
3.8e+06 – 3.931e+06      0
3.931e+06 – 4.062e+06    0
4.062e+06 – 4.193e+06    0
4.193e+06 – 4.324e+06    0
4.324e+06 – 4.455e+06    0
4.455e+06 – 4.586e+06    0
4.586e+06 – 4.717e+06    0
4.717e+06 – 4.848e+06    0
4.848e+06 – 4.979e+06    0
4.979e+06 – 5.11e+06     0
5.11e+06 – 5.241e+06     1
Fig 5.
unemployed · Heavy-tailed with 417 outliers; a handful of large counties account for most absolute unemployment.
Histogram bins for unemployed (median: 589.0).
bin                      count
0 – 9139                 3018
9139 – 1.828e+04         96
1.828e+04 – 2.742e+04    53
2.742e+04 – 3.655e+04    23
3.655e+04 – 4.569e+04    8
4.569e+04 – 5.483e+04    4
5.483e+04 – 6.397e+04    3
6.397e+04 – 7.311e+04    4
7.311e+04 – 8.225e+04    4
8.225e+04 – 9.139e+04    3
9.139e+04 – 1.005e+05    1
1.005e+05 – 1.097e+05    2
1.097e+05 – 1.188e+05    0
1.188e+05 – 1.279e+05    0
1.279e+05 – 1.371e+05    0
1.371e+05 – 1.462e+05    0
1.462e+05 – 1.554e+05    0
1.554e+05 – 1.645e+05    1
1.645e+05 – 1.736e+05    0
1.736e+05 – 1.828e+05    0
1.828e+05 – 1.919e+05    0
1.919e+05 – 2.01e+05     1
2.01e+05 – 2.102e+05     0
2.102e+05 – 2.193e+05    0
2.193e+05 – 2.285e+05    0
2.285e+05 – 2.376e+05    0
2.376e+05 – 2.467e+05    0
2.467e+05 – 2.559e+05    0
2.559e+05 – 2.65e+05     0
2.65e+05 – 2.742e+05     0
2.742e+05 – 2.833e+05    0
2.833e+05 – 2.924e+05    0
2.924e+05 – 3.016e+05    0
3.016e+05 – 3.107e+05    0
3.107e+05 – 3.199e+05    0
3.199e+05 – 3.29e+05     0
3.29e+05 – 3.381e+05     0
3.381e+05 – 3.473e+05    0
3.473e+05 – 3.564e+05    0
3.564e+05 – 3.655e+05    1
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
Per-column null rate across the corpus.
column                          kind         null %
fips                            numeric      0.0%
county_name                     text         0.0%
state                           categorical  0.0%
total_16_plus                   numeric      0.0%
labor_force                     numeric      0.0%
unemployed                      numeric      0.0%
labor_force_participation_rate  numeric      0.0%
unemployment_rate               numeric      0.0%
Fig 7.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 6 numeric columns (values clipped to 2 decimals).
                                fips   total_16_plus  labor_force  unemployed  labor_force_participation_rate  unemployment_rate
fips                            +1.00  -0.07          -0.06        -0.07       -0.11                           +0.10
total_16_plus                   -0.07  +1.00          +1.00        +0.98       +0.21                           +0.05
labor_force                     -0.06  +1.00          +1.00        +0.98       +0.22                           +0.04
unemployed                      -0.07  +0.98          +0.98        +1.00       +0.17                           +0.08
labor_force_participation_rate  -0.11  +0.21          +0.22        +0.17       +1.00                           -0.44
unemployment_rate               +0.10  +0.05          +0.04        +0.08       -0.44                           +1.00

fips · numeric · identifier

This is the US county FIPS code: all 3,222 rows are unique with no nulls, and values span 1001 to 72153, which matches the standard 5-digit state+county encoding once leading zeros are restored (1001 is 01001). Values are spread across the FIPS range without extreme skew (skew 0.16, kurtosis -0.63, no outliers), apart from an empty stretch between about 56140 and 70370 ahead of a final bin of 78 rows, which matches the PR row count. That pattern is expected for an identifier rather than a measured quantity.

Treatment: Treat as a categorical county key; left-join on this id rather than feeding it as a numeric feature.
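Because the column was profiled as numeric, codes such as 01001 appear as 1001. A minimal pandas sketch of restoring the 5-character key before a left join follows. The two FIPS values come from the reported min and max; the rate values, the `reference` table, and its `region` column are made up for illustration.

```python
import pandas as pd

# Toy rows using the reported min and max FIPS values; the rates are made up.
counties = pd.DataFrame({"fips": [1001, 72153], "unemployment_rate": [4.2, 9.8]})

# Hypothetical reference table keyed by 5-character FIPS strings.
reference = pd.DataFrame({"fips": ["01001", "72153"], "region": ["South", "Territory"]})

# Restore the 5-character key: cast to string, then left-pad with zeros.
counties["fips"] = counties["fips"].astype(str).str.zfill(5)

# A left join keeps every county row even if the reference table lacks a match;
# validate= raises if either side has duplicate keys.
joined = counties.merge(reference, on="fips", how="left", validate="one_to_one")
print(joined)
```

Reading the CSV with `dtype={"fips": str}` would avoid the padding step, provided the file itself stores the leading zeros.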

anthropic:claude-opus-4-7 · confidence high
Out[13]:

saturn.columns["fips"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
min 1,001
max 72,153
mean 3.138e+04
median 30,022
std 1.63e+04
q1 1.903e+04
q3 4.61e+04
iqr 27,075
skew 0.1574
kurtosis -0.6314
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 8.
Distribution of fips. Vertical dash marks the median.
Histogram bins for fips (median: 30022.0).
bin                      count
1001 – 2780              97
2780 – 4559              15
4559 – 6337              133
6337 – 8116              59
8116 – 9895              14
9895 – 1.167e+04         4
1.167e+04 – 1.345e+04    226
1.345e+04 – 1.523e+04    5
1.523e+04 – 1.701e+04    49
1.701e+04 – 1.879e+04    189
1.879e+04 – 2.057e+04    204
2.057e+04 – 2.235e+04    184
2.235e+04 – 2.413e+04    39
2.413e+04 – 2.59e+04     15
2.59e+04 – 2.768e+04     170
2.768e+04 – 2.946e+04    196
2.946e+04 – 3.124e+04    150
3.124e+04 – 3.302e+04    27
3.302e+04 – 3.48e+04     21
3.48e+04 – 3.658e+04     95
3.658e+04 – 3.836e+04    153
3.836e+04 – 4.013e+04    155
4.013e+04 – 4.191e+04    46
4.191e+04 – 4.369e+04    67
4.369e+04 – 4.547e+04    51
4.547e+04 – 4.725e+04    161
4.725e+04 – 4.903e+04    268
4.903e+04 – 5.081e+04    29
5.081e+04 – 5.259e+04    133
5.259e+04 – 5.436e+04    94
5.436e+04 – 5.614e+04    95
5.614e+04 – 5.792e+04    0
5.792e+04 – 5.97e+04     0
5.97e+04 – 6.148e+04     0
6.148e+04 – 6.326e+04    0
6.326e+04 – 6.504e+04    0
6.504e+04 – 6.682e+04    0
6.682e+04 – 6.86e+04     0
6.86e+04 – 7.037e+04     0
7.037e+04 – 7.215e+04    78

county_name · text · metadata

This column holds US county-level place names: 3,222 rows with 1,960 unique values, all between 10 and 46 characters and averaging about 2 words. The vocabulary is dominated by "county" (2,999 occurrences) but also includes "municipio" (78, Puerto Rico), "parish" (64, Louisiana), and "city" (47), so the field mixes several jurisdiction types. The 39.2% duplicate rate comes from recurring names such as Washington County (30), Jefferson County (25), and Franklin County (24) that appear across many states, so the name alone does not uniquely identify a county.

Treatment: Pair with a state code to form a unique key before joining or aggregating.
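A minimal sketch of that pairing, on toy rows: check whether the name alone is unique, then whether the (name, state) pair is, before building a combined key. The `county_key` column name is a hypothetical choice, and whether every pair in the full file is unique should be verified rather than assumed.

```python
import pandas as pd

# Toy rows: "Washington County" repeats across states, as the duplicate alert describes.
df = pd.DataFrame({
    "county_name": ["Washington County", "Washington County", "Jefferson County"],
    "state": ["AL", "GA", "AL"],
})

name_is_unique = df["county_name"].is_unique

# duplicated(keep=False) marks every row that shares its (name, state) pair with another row.
pair_is_unique = not df.duplicated(subset=["county_name", "state"], keep=False).any()

# Hypothetical combined key for joins or group-bys.
df["county_key"] = df["county_name"] + ", " + df["state"]
print(name_is_unique, pair_is_unique)
```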

anthropic:claude-opus-4-7 · confidence high
Out[16]:

saturn.columns["county_name"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,960
len_min 10
len_max 46
len_mean 14.17
len_median 14
len_p95 18
word_mean 2.083
word_median 2
n_empty 0
n_duplicates 1,262
duplicate_rate 0.3917
vocab_size 1,963
readability_flesch_mean 33.36
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: short_text · 95th-percentile length under 20 chars
alert: duplicates · 39.2% duplicate strings
Fig 9.
Character-length distribution for county_name.
Character-length distribution for county_name (mean: 14.17).
chars      count
10 – 11    29
11 – 12    255
12 – 13    465
13 – 14    682
14 – 14    588
14 – 15    493
15 – 16    291
16 – 17    219
17 – 18    67
18 – 19    0
19 – 20    49
20 – 21    23
21 – 22    16
22 – 23    14
23 – 24    8
24 – 24    4
24 – 25    5
25 – 26    2
26 – 27    1
27 – 28    0
28 – 29    1
29 – 30    0
30 – 31    0
31 – 32    2
32 – 32    1
32 – 33    1
33 – 34    1
34 – 35    1
35 – 36    0
36 – 37    0
37 – 38    0
38 – 39    0
39 – 40    0
40 – 41    2
41 – 42    1
42 – 42    0
42 – 43    0
43 – 44    0
44 – 45    0
45 – 46    1

state · categorical · feature

Two-letter US state/territory codes across 3,222 rows with 52 distinct values and no nulls, consistent with the 50 states plus DC and Puerto Rico (PR appears in the top values with 78 rows). The distribution tracks county counts rather than population: TX leads at 254 (7.88%), followed by GA (159), VA (133), and KY (120), suggesting one row per county or county equivalent. The high entropy ratio (0.93) indicates a fairly even spread across states.

Treatment: One-hot or target-encode for modelling; also useful as a join key to state-level reference tables.
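As a small illustration of the one-hot option, the sketch below encodes a toy `state` column with pandas. The four rows are made up; on the full column this would produce 52 indicator columns, one per distinct code.

```python
import pandas as pd

# Toy sample of state codes; the real column has 52 distinct values.
df = pd.DataFrame({"state": ["TX", "GA", "TX", "PR"]})

# One indicator column per distinct code, named state_<code>, stored as 0/1 integers.
dummies = pd.get_dummies(df["state"], prefix="state", dtype=int)
print(dummies)
```

For linear models, passing `drop_first=True` drops one indicator column so that the remaining ones are not perfectly collinear with the intercept.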

anthropic:claude-opus-4-7 · confidence high
Out[19]:

saturn.columns["state"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 52
top_value TX
top_rate 0.07883
cardinality 52
entropy 5.314
entropy_ratio 0.9322
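The reported entropy and entropy_ratio are consistent with Shannon entropy in bits divided by its maximum, log2 of the cardinality. That definition is inferred from the numbers in the table, not taken from Saturn's documentation, so treat this as a sanity check rather than the tool's method. `shannon_entropy_bits` is a hypothetical helper.

```python
import math


def shannon_entropy_bits(counts):
    """Shannon entropy (base 2) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


entropy_bits = 5.314   # reported entropy for state
cardinality = 52       # reported number of distinct values

max_entropy = math.log2(cardinality)        # entropy if all 52 codes were equally common
entropy_ratio = entropy_bits / max_entropy  # 1.0 would mean a perfectly even spread

# Sanity check of the helper: four equally frequent values carry exactly 2 bits.
uniform_bits = shannon_entropy_bits([5, 5, 5, 5])
print(round(entropy_ratio, 4), uniform_bits)
```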
Fig 10.
Top values for state.

total_16_plus · numeric · feature

This is a numeric count of people aged 16 and over, with 3,222 non-null rows and 3,148 unique values spanning 50 to 8,086,852. The distribution is severely right-skewed (skew 13.49, kurtosis 305.88): the median is 21,167.5 but the mean is 83,549.93, and 13.7% of rows (443) are flagged as outliers. The standard deviation of 265,514 dwarfs the IQR of 45,507.75, consistent with the long upper tail typical of geographic aggregates.

Treatment: Log-transform before regression to tame the heavy right tail.
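To illustrate why a log transform helps here, the sketch below draws synthetic log-normal counts as a stand-in for the real column and compares sample skewness before and after `log10`. The distribution parameters and seed are arbitrary assumptions; only the sample size mirrors the dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in for a county population count: log-normal, so heavily right-skewed.
counts = pd.Series(rng.lognormal(mean=10.0, sigma=1.5, size=3222))

raw_skew = counts.skew()            # large and positive for the raw values
log_skew = np.log10(counts).skew()  # close to zero after the transform

print(round(raw_skew, 2), round(log_skew, 2))
```

A plain log is safe for this column because its reported minimum is 50, so there are no zeros to handle.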

anthropic:claude-opus-4-7 · confidence high
Out[22]:

saturn.columns["total_16_plus"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,148
min 50
max 8.087e+06
mean 8.355e+04
median 2.117e+04
std 2.655e+05
q1 8986
q3 5.449e+04
iqr 4.551e+04
skew 13.49
kurtosis 305.9
n_outliers 443
outlier_rate 0.1375
zero_rate 0
alert: high_skew · skew=+13.49
alert: outliers · 13.7% rows beyond 1.5 IQR
Fig 11.
Distribution of total_16_plus. Vertical dash marks the median.
Histogram bins for total_16_plus (median: 21167.5).
bin                      count
50 – 2.022e+05           2947
2.022e+05 – 4.044e+05    134
4.044e+05 – 6.066e+05    56
6.066e+05 – 8.087e+05    37
8.087e+05 – 1.011e+06    12
1.011e+06 – 1.213e+06    10
1.213e+06 – 1.415e+06    7
1.415e+06 – 1.617e+06    4
1.617e+06 – 1.82e+06     3
1.82e+06 – 2.022e+06     4
2.022e+06 – 2.224e+06    2
2.224e+06 – 2.426e+06    0
2.426e+06 – 2.628e+06    1
2.628e+06 – 2.83e+06     1
2.83e+06 – 3.033e+06     0
3.033e+06 – 3.235e+06    0
3.235e+06 – 3.437e+06    0
3.437e+06 – 3.639e+06    2
3.639e+06 – 3.841e+06    0
3.841e+06 – 4.043e+06    0
4.043e+06 – 4.246e+06    1
4.246e+06 – 4.448e+06    0
4.448e+06 – 4.65e+06     0
4.65e+06 – 4.852e+06     0
4.852e+06 – 5.054e+06    0
5.054e+06 – 5.256e+06    0
5.256e+06 – 5.459e+06    0
5.459e+06 – 5.661e+06    0
5.661e+06 – 5.863e+06    0
5.863e+06 – 6.065e+06    0
6.065e+06 – 6.267e+06    0
6.267e+06 – 6.469e+06    0
6.469e+06 – 6.672e+06    0
6.672e+06 – 6.874e+06    0
6.874e+06 – 7.076e+06    0
7.076e+06 – 7.278e+06    0
7.278e+06 – 7.48e+06     0
7.48e+06 – 7.683e+06     0
7.683e+06 – 7.885e+06    0
7.885e+06 – 8.087e+06    1

labor_force · numeric · feature

This column appears to be the size of the labor force per record, likely at a US county or similar geographic unit given the 3,222 rows and 3,099 unique values. The distribution is severely right-skewed (skew 13.29, kurtosis 295.22) with a median of 11,608.5 but a maximum of 5,240,842, and 14.2% of values (459 rows) are flagged as outliers. There are no nulls or zeros, but the gap between Q3 (31,930.5) and the maximum signals a long tail of very large jurisdictions.

Treatment: Log-transform before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
Out[25]:

saturn.columns["labor_force"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,099
min 36
max 5.241e+06
mean 5.287e+04
median 1.161e+04
std 1.742e+05
q1 4777
q3 3.193e+04
iqr 2.715e+04
skew 13.29
kurtosis 295.2
n_outliers 459
outlier_rate 0.1425
zero_rate 0
alert: high_skew · skew=+13.29
alert: outliers · 14.2% rows beyond 1.5 IQR
Fig 12.
Distribution of labor_force. Vertical dash marks the median.

unemployed · numeric · feature

This is a numeric count of unemployed persons per record, with 3,222 rows, no nulls, and 1,859 unique values. The distribution is severely right-skewed (skew 16.82, kurtosis 450.4): the median is 589 while the mean is 2,827 and the maximum reaches 365,544, and 417 rows (12.9%) are flagged as outliers. Only 0.56% of records are zero, so sparsity is not the issue; a few extreme values are.

Treatment: Log-transform with log(1 + x), since some values are zero, or winsorize before any modelling to tame the heavy right tail.
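Because this column contains zeros, a plain logarithm would produce -inf for those rows. A minimal NumPy sketch of the zero-safe variant follows, applied to the five-number summary from the stats table rather than the raw rows.

```python
import numpy as np

# Five-number summary of `unemployed` from the stats table (min, q1, median, q3, max).
unemployed = np.array([0, 223, 589, 1706, 365544], dtype=float)

# log1p computes log(1 + x), so a zero count maps to 0.0 instead of -inf.
log_counts = np.log1p(unemployed)

# expm1 inverts the transform when predictions need to go back to raw counts.
recovered = np.expm1(log_counts)

print(log_counts.round(3))
```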

anthropic:claude-opus-4-7 · confidence high
Out[28]:

saturn.columns["unemployed"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,859
min 0
max 365,544
mean 2827
median 589
std 1.083e+04
q1 223
q3 1706
iqr 1482
skew 16.82
kurtosis 450.4
n_outliers 417
outlier_rate 0.1294
zero_rate 0.005587
alert: high_skew · skew=+16.82
alert: outliers · 12.9% rows beyond 1.5 IQR
Fig 13.
Distribution of unemployed. Vertical dash marks the median.

labor_force_participation_rate · numeric · feature

Numeric column capturing labor force participation rate, almost certainly expressed as a percentage given the 18.63 to 84.04 range and mean of 57.89. The distribution is moderately left-skewed (-0.58) with a tight IQR of 10.695 around a median of 58.72, and only 1.18% of values flagged as outliers. No nulls or zeros across 3,222 rows, and 1,944 unique values suggest a continuous measurement rather than a coded category.

Treatment: Use as-is for modelling; optionally standardize since values are bounded percentages.
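If standardizing, the usual z-score subtracts the mean and divides by the standard deviation. The worked example below applies that to the reported extremes using the mean and std from the stats table; `z_score` is a hypothetical helper and the inputs are the rounded values shown in this report.

```python
mean, std = 57.89, 8.041   # reported mean and standard deviation


def z_score(value, mean=mean, std=std):
    """Return how many standard deviations `value` sits from the mean."""
    return (value - mean) / std


z_min = z_score(18.63)   # lowest participation rate in the table
z_max = z_score(84.04)   # highest participation rate in the table
print(round(z_min, 2), round(z_max, 2))   # prints -4.88 3.25
```

The lowest county sits nearly five standard deviations below the mean, while the highest is only about three above it, which matches the mild left skew reported above.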

anthropic:claude-opus-4-7 · confidence high
Out[31]:

saturn.columns["labor_force_participation_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,944
min 18.63
max 84.04
mean 57.89
median 58.72
std 8.041
q1 52.97
q3 63.66
iqr 10.7
skew -0.5766
kurtosis 0.4502
n_outliers 38
outlier_rate 0.01179
zero_rate 0
Fig 14.
Distribution of labor_force_participation_rate. Vertical dash marks the median.

unemployment_rate · numeric · feature

This is a county- or region-level unemployment rate expressed as a percentage, with values from 0.0 to 31.99 and a median of 4.69. The distribution is heavily right-skewed (skew 2.55, kurtosis 12.81), with 154 outliers (4.78%) pulling the mean (5.13) above the median, and a small share of zero readings (0.56%).

Treatment: Apply a log(1 + x) or winsorizing transform before regression to tame the right tail; a plain log is undefined for the zero readings.
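One simple way to limit the right tail is to cap values at the upper Tukey fence, Q3 + 1.5 × IQR. This is a one-sided, fence-based variant of winsorizing, chosen here for illustration; percentile-based cut-offs are equally common, and the 1.5 multiplier is an assumption. The quartiles and sample values come from the stats table.

```python
import numpy as np

q1, q3 = 3.42, 6.08                  # quartiles from the stats table
upper_fence = q3 + 1.5 * (q3 - q1)   # about 10.07

# min, q1, median, q3, max from the stats table
rates = np.array([0.0, 3.42, 4.69, 6.08, 31.99])

# Values above the fence are pulled down to it; everything else is unchanged.
capped = np.minimum(rates, upper_fence)
print(round(upper_fence, 2), capped.round(2))
```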

anthropic:claude-opus-4-7 · confidence high
Out[34]:

saturn.columns["unemployment_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 950
min 0
max 31.99
mean 5.127
median 4.69
std 2.926
q1 3.42
q3 6.08
iqr 2.66
skew 2.545
kurtosis 12.81
n_outliers 154
outlier_rate 0.0478
zero_rate 0.005587
alert: high_skew · skew=+2.55
Fig 15.
Distribution of unemployment_rate. Vertical dash marks the median.

How to cite

BibTeX
@misc{saturn-economic-unemployment-by-county-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: economic unemployment by county},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/economic-unemployment_by_county}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: economic unemployment by county. Source: /home/coolhand/datasets/us-inequality-atlas/economic/unemployment_by_county.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/economic-unemployment_by_county