healthcare healthcare desert merged

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv

Saturn profiled 3,222 rows across 10 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
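# Notebook cell: the leading "!" is IPython shell syntax, not plain Python.
!pip install saturn-dissect
import subprocess

# Re-run the profile on the source CSV with the same flags used for this report.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv",
    "--findings", "healthcare-healthcare_desert_merged.json",
    "--llm", "anthropic:claude-opus-4-7",
], check=True)  # raise CalledProcessError if the CLI exits with a non-zero status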

Summary confidence: high

This dataset profiles 3,222 U.S. counties (one row per county, keyed by FIPS) with population, uninsured counts and rates, poverty rate, a hospital closure risk score, and rural/urban flags. Population and uninsured figures are extremely right-skewed (total_pop skew 13.4, uninsured_pop skew 17.8), so a handful of large counties will dominate any raw totals — analysis should likely use rates or log scales. The hospital_closure_risk_score collapses to just 3 distinct values (with ~29% scoring 0), and risk_category is heavily imbalanced with 84% of counties labeled 'Low' and the rest 'Moderate', which is worth examining first. About 69% of counties are flagged Rural, so rural/urban comparisons of uninsured and poverty rates should be a productive next cut.

citing: total_pop · uninsured_pop · uninsured_rate · poverty_rate · hospital_closure_risk_score · risk_category · rural_category

Out[4]:

saturn.schema() · 10 columns

column kind n null% unique alerts
fips numeric 3,222 0.0% 3,222
county_name text 3,222 0.0% 3,222 near_unique
total_pop numeric 3,222 0.0% 3,141 high_skew outliers
uninsured_pop numeric 3,222 0.0% 584 high_skew outliers
uninsured_rate numeric 3,222 0.0% 152 high_skew outliers
poverty_rate numeric 3,222 0.0% 1,719 high_skew
rural categorical 3,222 0.0% 2
rural_category categorical 3,222 0.0% 2
hospital_closure_risk_score numeric 3,222 0.0% 3
risk_category categorical 3,222 0.0% 2
Fig 1.
risk_category · Shows the strong class imbalance — about 84% of counties fall in the Low risk bucket.
Top values for risk_category (2 unique shown, of 2 total).
value     count  share
Low       2719   84.4%
Moderate  503    15.6%
Fig 2.
hospital_closure_risk_score · Only three distinct score values appear, so a bar of value counts reveals the underlying scoring buckets.
Histogram bins for hospital_closure_risk_score (median: 25.0).
bin            count
0 – 1.25       929
1.25 – 2.5     0
2.5 – 3.75     0
3.75 – 5       0
5 – 6.25       0
6.25 – 7.5     0
7.5 – 8.75     0
8.75 – 10      0
10 – 11.25     0
11.25 – 12.5   0
12.5 – 13.75   0
13.75 – 15     0
15 – 16.25     0
16.25 – 17.5   0
17.5 – 18.75   0
18.75 – 20     0
20 – 21.25     0
21.25 – 22.5   0
22.5 – 23.75   0
23.75 – 25     0
25 – 26.25     1790
26.25 – 27.5   0
27.5 – 28.75   0
28.75 – 30     0
30 – 31.25     0
31.25 – 32.5   0
32.5 – 33.75   0
33.75 – 35     0
35 – 36.25     0
36.25 – 37.5   0
37.5 – 38.75   0
38.75 – 40     0
40 – 41.25     0
41.25 – 42.5   0
42.5 – 43.75   0
43.75 – 45     0
45 – 46.25     0
46.25 – 47.5   0
47.5 – 48.75   0
48.75 – 50     503
Fig 3.
rural_category · Roughly two-thirds of counties are Rural, framing how to segment every other metric.
Top values for rural_category (2 unique shown, of 2 total).
value           count  share
Rural           2212   68.7%
Urban/Suburban  1010   31.3%
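
The rural/urban cut suggested in the summary can be sketched as a grouped comparison. This is a minimal illustration on a tiny synthetic frame (the values are made up and are not rows from the profiled file); only the column names come from the schema above. Medians are used here as an assumption-driven choice, since the rate columns are flagged high_skew.

```python
import pandas as pd

# Synthetic stand-in for the county table; values are invented for illustration.
df = pd.DataFrame({
    "rural_category": ["Rural", "Rural", "Rural", "Urban/Suburban", "Urban/Suburban"],
    "uninsured_rate": [0.10, 0.30, 0.20, 0.05, 0.15],
    "poverty_rate": [18.0, 22.0, 14.0, 9.0, 11.0],
})

# Median per group: less sensitive to long right tails than the mean.
by_group = df.groupby("rural_category")[["uninsured_rate", "poverty_rate"]].median()
print(by_group)
```

On the real file the same two lines after the frame definition would apply once the CSV is loaded with `pd.read_csv`.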
Fig 4.
uninsured_rate · Right-skewed distribution of county uninsured rates; watch the long tail above the 0.25 third quartile.
Histogram bins for uninsured_rate (median: 0.12).
bin              count
0 – 0.0925       1403
0.0925 – 0.185   704
0.185 – 0.2775   403
0.2775 – 0.37    213
0.37 – 0.4625    158
0.4625 – 0.555   101
0.555 – 0.6475   65
0.6475 – 0.74    43
0.74 – 0.8325    27
0.8325 – 0.925   23
0.925 – 1.018    9
1.018 – 1.11     15
1.11 – 1.202     14
1.202 – 1.295    5
1.295 – 1.387    7
1.387 – 1.48     7
1.48 – 1.573     5
1.573 – 1.665    2
1.665 – 1.758    4
1.758 – 1.85     1
1.85 – 1.942     1
1.942 – 2.035    1
2.035 – 2.127    2
2.127 – 2.22     2
2.22 – 2.312     1
2.312 – 2.405    0
2.405 – 2.498    0
2.498 – 2.59     1
2.59 – 2.683     0
2.683 – 2.775    1
2.775 – 2.868    0
2.868 – 2.96     1
2.96 – 3.052     1
3.052 – 3.145    0
3.145 – 3.237    1
3.237 – 3.33     0
3.33 – 3.422     0
3.422 – 3.515    0
3.515 – 3.607    0
3.607 – 3.7      1
Fig 5.
poverty_rate · Poverty rate spreads from 1.6 to 66.3 with a median of 13.55 — useful for spotting high-poverty outlier counties.
Histogram bins for poverty_rate (median: 13.55).
bin             count
1.6 – 3.218     7
3.218 – 4.836   34
4.836 – 6.454   106
6.454 – 8.072   246
8.072 – 9.69    320
9.69 – 11.31    354
11.31 – 12.93   393
12.93 – 14.54   364
14.54 – 16.16   306
16.16 – 17.78   262
17.78 – 19.4    192
19.4 – 21.02    149
21.02 – 22.63   123
22.63 – 24.25   91
24.25 – 25.87   52
25.87 – 27.49   44
27.49 – 29.11   34
29.11 – 30.72   23
30.72 – 32.34   18
32.34 – 33.96   14
33.96 – 35.58   6
35.58 – 37.2    8
37.2 – 38.81    3
38.81 – 40.43   8
40.43 – 42.05   5
42.05 – 43.67   9
43.67 – 45.29   4
45.29 – 46.9    11
46.9 – 48.52    7
48.52 – 50.14   8
50.14 – 51.76   2
51.76 – 53.38   6
53.38 – 54.99   5
54.99 – 56.61   5
56.61 – 58.23   1
58.23 – 59.85   0
59.85 – 61.47   0
61.47 – 63.08   0
63.08 – 64.7    1
64.7 – 66.32    1
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
column                       kind         null %
fips                         numeric      0.0%
county_name                  text         0.0%
total_pop                    numeric      0.0%
uninsured_pop                numeric      0.0%
uninsured_rate               numeric      0.0%
poverty_rate                 numeric      0.0%
rural                        categorical  0.0%
rural_category               categorical  0.0%
hospital_closure_risk_score  numeric      0.0%
risk_category                categorical  0.0%
Fig 7.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 6 numeric columns (values clipped to 2 decimals).
column                       fips   total_pop  uninsured_pop  uninsured_rate  poverty_rate  hospital_closure_risk_score
fips                         +1.00  -0.07      -0.02          +0.01           +0.16         +0.01
total_pop                    -0.07  +1.00      +0.81          -0.05           -0.11         -0.31
uninsured_pop                -0.02  +0.81      +1.00          +0.12           -0.09         -0.27
uninsured_rate               +0.01  -0.05      +0.12          +1.00           -0.04         +0.05
poverty_rate                 +0.16  -0.11      -0.09          -0.04           +1.00         +0.58
hospital_closure_risk_score  +0.01  -0.31      -0.27          +0.05           +0.58         +1.00
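
Fig 7 reports Pearson coefficients, and two of the columns involved (total_pop, uninsured_pop) are flagged high_skew, so a few very large counties may carry much of a value like +0.81. The sketch below shows the Pearson formula written out and a rank-based variant (Pearson on ranks, i.e. Spearman) as one possible cross-check. The data are synthetic and the helper name `pearson` is hypothetical; this is not how saturn computes its matrix.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic, heavily right-skewed "population" and a count that scales with it.
pop = pd.Series(rng.lognormal(mean=10, sigma=1.5, size=500))
count = pop * rng.uniform(0.001, 0.003, size=500)

def pearson(x, y):
    """Pearson r: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

r_raw = pearson(pop, count)                 # driven largely by the biggest values
r_rank = pearson(pop.rank(), count.rank())  # Spearman: Pearson applied to ranks

print(f"pearson={r_raw:.3f} spearman={r_rank:.3f}")
```

If the two numbers differ a lot on the real columns, the raw coefficient is probably dominated by the tail.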

fips numeric identifier

This is the FIPS county code: 3,222 rows with 3,222 unique values, no nulls, and a min of 1001 / max of 72153 consistent with the U.S. county FIPS scheme (state prefix * 1000 + county). Values spread across that range with no outliers (skew 0.16, kurtosis -0.63), though not uniformly: the histogram is empty between roughly 56,100 and 70,400, and the 78 rows in the top bin belong to the 72xxx prefix. The column indexes geography rather than measuring anything. Treat it as a categorical key, not a quantity, despite the numeric dtype.

Treatment: Cast to zero-padded string and left-join on this county FIPS code; do not use as a numeric feature.

anthropic:claude-opus-4-7 · confidence high
Out[13]:

saturn.columns["fips"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
min 1,001
max 72,153
mean 3.138e+04
median 30,022
std 1.63e+04
q1 1.903e+04
q3 4.61e+04
iqr 27,075
skew 0.1574
kurtosis -0.6314
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 8.
Distribution of fips. Vertical dash marks the median.
Histogram bins for fips (median: 30022.0).
bin                     count
1001 – 2780             97
2780 – 4559             15
4559 – 6337             133
6337 – 8116             59
8116 – 9895             14
9895 – 1.167e+04        4
1.167e+04 – 1.345e+04   226
1.345e+04 – 1.523e+04   5
1.523e+04 – 1.701e+04   49
1.701e+04 – 1.879e+04   189
1.879e+04 – 2.057e+04   204
2.057e+04 – 2.235e+04   184
2.235e+04 – 2.413e+04   39
2.413e+04 – 2.59e+04    15
2.59e+04 – 2.768e+04    170
2.768e+04 – 2.946e+04   196
2.946e+04 – 3.124e+04   150
3.124e+04 – 3.302e+04   27
3.302e+04 – 3.48e+04    21
3.48e+04 – 3.658e+04    95
3.658e+04 – 3.836e+04   153
3.836e+04 – 4.013e+04   155
4.013e+04 – 4.191e+04   46
4.191e+04 – 4.369e+04   67
4.369e+04 – 4.547e+04   51
4.547e+04 – 4.725e+04   161
4.725e+04 – 4.903e+04   268
4.903e+04 – 5.081e+04   29
5.081e+04 – 5.259e+04   133
5.259e+04 – 5.436e+04   94
5.436e+04 – 5.614e+04   95
5.614e+04 – 5.792e+04   0
5.792e+04 – 5.97e+04    0
5.97e+04 – 6.148e+04    0
6.148e+04 – 6.326e+04   0
6.326e+04 – 6.504e+04   0
6.504e+04 – 6.682e+04   0
6.682e+04 – 6.86e+04    0
6.86e+04 – 7.037e+04    0
7.037e+04 – 7.215e+04   78
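
The treatment for fips (cast to a zero-padded string, then left-join) might look like the following minimal sketch. The second table `other` and its column `some_metric` are hypothetical, as are the three example codes other than the reported min and max; the point is only that a numeric 1001 has to become "01001" before it matches a string-keyed table.

```python
import pandas as pd

# fips arrives as a number in this file, so leading zeros are lost (1001, not "01001").
left = pd.DataFrame({"fips": [1001, 6037, 72153]})

# Cast to a 5-character, zero-padded string key.
left["fips"] = left["fips"].astype(str).str.zfill(5)
left["state_fips"] = left["fips"].str[:2]  # first two digits are the state prefix

# Hypothetical second table, already keyed by 5-character FIPS strings.
other = pd.DataFrame({"fips": ["01001", "06037"], "some_metric": [1.5, 2.5]})

# Left join keeps every county; validate guards against duplicate keys on either side.
merged = left.merge(other, on="fips", how="left", validate="one_to_one")
print(merged)
```

Rows without a match keep NaN in `some_metric`, which makes missing coverage easy to count after the join.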

county_name text identifier

This column holds fully-qualified US county names (e.g. 'X County, State'), with all 3,222 values unique and no nulls. The token 'county,' appears 2,999 times, confirming a 'County, ' format, while the remaining 223 rows likely use alternate suffixes like Parish or Borough. Texas (256), Virginia (189), and Georgia (159) are the most frequent state tokens. These appear to be token counts rather than per-state row counts (for instance, 'Virginia' would also match West Virginia rows), so they only roughly track national county counts.

Treatment: Use as a join key after splitting into county and state components.

anthropic:claude-opus-4-7 · confidence high
Out[16]:

saturn.columns["county_name"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
len_min 16
len_max 59
len_mean 24.32
len_median 24
len_p95 31
word_mean 3.248
word_median 3
n_empty 0
n_duplicates 0
duplicate_rate 0
vocab_size 1,990
readability_flesch_mean 10.28
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 100.0% of rows are unique strings
Fig 9.
Character-length distribution for county_name.
Character-length distribution for county_name (mean: 24.32).
chars     count
16 – 17   26
17 – 18   72
18 – 19   121
19 – 20   190
20 – 21   264
21 – 22   407
22 – 24   420
24 – 25   363
25 – 26   320
26 – 27   240
27 – 28   231
28 – 29   152
29 – 30   139
30 – 31   165
31 – 32   41
32 – 33   28
33 – 34   16
34 – 35   10
35 – 36   5
36 – 38   0
38 – 39   1
39 – 40   1
40 – 41   0
41 – 42   1
42 – 43   1
43 – 44   0
44 – 45   2
45 – 46   0
46 – 47   1
47 – 48   1
48 – 49   0
49 – 50   0
50 – 51   0
51 – 53   0
53 – 54   2
54 – 55   1
55 – 56   0
56 – 57   0
57 – 58   0
58 – 59   1
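
Splitting county_name into county and state components, as the treatment for this column suggests, could be sketched as below. The three example strings are illustrative and assume the 'X County, State' layout described above; splitting from the right on the last ", " is a design choice so that any comma inside the county part would not break the split.

```python
import pandas as pd

# Illustrative names in the "<county>, <state>" layout; not read from the file.
names = pd.Series([
    "Autauga County, Alabama",
    "Orleans Parish, Louisiana",
    "Anchorage Municipality, Alaska",
])

# Split once, from the right, on the final ", " separator.
parts = names.str.rsplit(", ", n=1, expand=True)
parts.columns = ["county", "state"]
print(parts)
```

Because names are only unique in combination with the state, the FIPS code is likely the more robust join key when both are available.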

total_pop numeric feature

This is almost certainly a population count per geographic unit (likely US counties given n=3,222), with values ranging from 47 to 9,866,623 and a median of 25,328. The distribution is severely right-skewed (skew 13.38, kurtosis 298.69), with the mean (102,232) about four times the median and 453 outliers (14.06%); the standard deviation of 326,934 dwarfs the IQR of 54,579. No nulls or zeros, and 3,141 of 3,222 values are unique.

Treatment: Log-transform before any modelling or distance-based analysis to tame the extreme right skew.

anthropic:claude-opus-4-7 · confidence high
Out[19]:

saturn.columns["total_pop"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,141
min 47
max 9.867e+06
mean 1.022e+05
median 25,328
std 3.269e+05
q1 1.061e+04
q3 65,190
iqr 5.458e+04
skew 13.38
kurtosis 298.7
n_outliers 453
outlier_rate 0.1406
zero_rate 0
alert: high_skew · skew=+13.38
alert: outliers · 14.1% rows beyond 1.5 IQR
Fig 10.
Distribution of total_pop. Vertical dash marks the median.
Histogram bins for total_pop (median: 25328.0).
bin                     count
47 – 2.467e+05          2942
2.467e+05 – 4.934e+05   137
4.934e+05 – 7.4e+05     56
7.4e+05 – 9.867e+05     39
9.867e+05 – 1.233e+06   13
1.233e+06 – 1.48e+06    9
1.48e+06 – 1.727e+06    7
1.727e+06 – 1.973e+06   3
1.973e+06 – 2.22e+06    3
2.22e+06 – 2.467e+06    4
2.467e+06 – 2.713e+06   3
2.713e+06 – 2.96e+06    0
2.96e+06 – 3.207e+06    2
3.207e+06 – 3.453e+06   0
3.453e+06 – 3.7e+06     0
3.7e+06 – 3.947e+06     0
3.947e+06 – 4.193e+06   0
4.193e+06 – 4.44e+06    1
4.44e+06 – 4.687e+06    0
4.687e+06 – 4.933e+06   1
4.933e+06 – 5.18e+06    0
5.18e+06 – 5.427e+06    1
5.427e+06 – 5.673e+06   0
5.673e+06 – 5.92e+06    0
5.92e+06 – 6.167e+06    0
6.167e+06 – 6.413e+06   0
6.413e+06 – 6.66e+06    0
6.66e+06 – 6.907e+06    0
6.907e+06 – 7.153e+06   0
7.153e+06 – 7.4e+06     0
7.4e+06 – 7.647e+06     0
7.647e+06 – 7.893e+06   0
7.893e+06 – 8.14e+06    0
8.14e+06 – 8.387e+06    0
8.387e+06 – 8.633e+06   0
8.633e+06 – 8.88e+06    0
8.88e+06 – 9.127e+06    0
9.127e+06 – 9.373e+06   0
9.373e+06 – 9.62e+06    0
9.62e+06 – 9.867e+06    1
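
The log transform recommended for total_pop can be illustrated on synthetic data. The sketch below draws lognormal values as a rough stand-in for county populations (the parameters are assumptions chosen to give a similar median, not fitted to the file) and compares sample skewness before and after a base-10 log. A plain log is safe here because the column has no zeros (zero_rate 0).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic right-skewed "populations"; parameters are illustrative assumptions.
total_pop = pd.Series(rng.lognormal(mean=10.1, sigma=1.4, size=3222)).round().clip(lower=1)

# Base-10 log: values read as orders of magnitude (4 ~ 10,000 residents).
log_pop = np.log10(total_pop)

skew_raw = total_pop.skew()
skew_log = log_pop.skew()
print(f"skew raw={skew_raw:.2f} log10={skew_log:.2f}")
```

The raw skew is large and positive while the transformed skew sits near zero, which is the effect the treatment is after.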

uninsured_pop numeric feature

Counts of uninsured residents per record, with values ranging from 0 to 20,915 across 3,222 rows and no nulls. The distribution is severely right-skewed (skew 17.81, kurtosis 462.87): the median is 36 while the mean is 159.95, and 17.2% of rows are zero. 368 outliers (11.4%) sit far above the Q3 of 120, consistent with a few very large populations dominating the tail.

Treatment: Apply a log1p transform before modelling to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
Out[22]:

saturn.columns["uninsured_pop"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 584
min 0
max 20,915
mean 159.9
median 36
std 627.2
q1 7
q3 120
iqr 113
skew 17.81
kurtosis 462.9
n_outliers 368
outlier_rate 0.1142
zero_rate 0.1723
alert: high_skew · skew=+17.81
alert: outliers · 11.4% rows beyond 1.5 IQR
Fig 11.
Distribution of uninsured_pop. Vertical dash marks the median.
Histogram bins for uninsured_pop (median: 36.0).
bin                     count
0 – 522.9               3022
522.9 – 1046            124
1046 – 1569             32
1569 – 2092             16
2092 – 2614             7
2614 – 3137             5
3137 – 3660             5
3660 – 4183             2
4183 – 4706             0
4706 – 5229             1
5229 – 5752             2
5752 – 6274             1
6274 – 6797             0
6797 – 7320             0
7320 – 7843             0
7843 – 8366             1
8366 – 8889             1
8889 – 9412             0
9412 – 9935             0
9935 – 1.046e+04        0
1.046e+04 – 1.098e+04   0
1.098e+04 – 1.15e+04    2
1.15e+04 – 1.203e+04    0
1.203e+04 – 1.255e+04   0
1.255e+04 – 1.307e+04   0
1.307e+04 – 1.359e+04   0
1.359e+04 – 1.412e+04   0
1.412e+04 – 1.464e+04   0
1.464e+04 – 1.516e+04   0
1.516e+04 – 1.569e+04   0
1.569e+04 – 1.621e+04   0
1.621e+04 – 1.673e+04   0
1.673e+04 – 1.725e+04   0
1.725e+04 – 1.778e+04   0
1.778e+04 – 1.83e+04    0
1.83e+04 – 1.882e+04    0
1.882e+04 – 1.935e+04   0
1.935e+04 – 1.987e+04   0
1.987e+04 – 2.039e+04   0
2.039e+04 – 2.092e+04   1
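
The treatment for uninsured_pop names log1p rather than a plain log, presumably because 17.2% of rows are zero. A short sketch of the difference, using the five-number summary from the stats table above as sample inputs:

```python
import numpy as np

# min, q1, median, q3, max as reported in the uninsured_pop stats table.
uninsured_pop = np.array([0, 7, 36, 120, 20915], dtype=float)

# A plain log sends zero counts to -inf (warning silenced for the demo).
with np.errstate(divide="ignore"):
    plain_log = np.log(uninsured_pop)

# log1p computes log(1 + x), so zero maps to zero and everything stays finite.
transformed = np.log1p(uninsured_pop)

# expm1 is the exact inverse, useful for reporting results on the original scale.
restored = np.expm1(transformed)
print(transformed.round(3))
```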

uninsured_rate numeric feature

This appears to be an uninsured rate per record, expressed as a proportion ranging from 0.0 to 3.7 with a median of 0.12. The maximum of 3.7 is suspicious for a rate that should cap at 1.0, and the distribution is severely right-skewed (skew 4.10, kurtosis 27.70) with 230 outliers (7.1%) and 17.5% exact zeros.

Treatment: Investigate values >1.0 for unit errors, then log-transform or winsorize before modelling.

anthropic:claude-opus-4-7 · confidence high
Out[25]:

saturn.columns["uninsured_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 152
min 0
max 3.7
mean 0.2002
median 0.12
std 0.2829
q1 0.04
q3 0.25
iqr 0.21
skew 4.095
kurtosis 27.7
n_outliers 230
outlier_rate 0.07138
zero_rate 0.1754
alert: high_skew · skew=+4.10
alert: outliers · 7.1% rows beyond 1.5 IQR
Fig 12.
Distribution of uninsured_rate. Vertical dash marks the median.
Histogram bins for uninsured_rate (median: 0.12): same bins and counts as the table under Fig 4.
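
One way to investigate the values above 1.0 is to recompute the rate from the two count columns and see which unit reproduces the stored figure. As a rough hint only: the reported medians (36 uninsured, 25,328 residents) give about 0.14%, the same order as the median rate of 0.12, which would fit a percent scale better than a proportion. That is a hypothesis, not a finding. The rows below are synthetic and were constructed so that the percent reading matches; they say nothing about the real file.

```python
import pandas as pd

# Synthetic rows, built so that uninsured_rate equals 100 * uninsured_pop / total_pop.
df = pd.DataFrame({
    "total_pop": [25000, 1200, 500000],
    "uninsured_pop": [30, 24, 1500],
    "uninsured_rate": [0.12, 2.0, 0.30],
})

share = df["uninsured_pop"] / df["total_pop"]
tolerance = 0.005  # allows for rounding of the stored rate to two decimals

df["matches_proportion"] = (df["uninsured_rate"] - share).abs() < tolerance
df["matches_percent"] = (df["uninsured_rate"] - 100 * share).abs() < tolerance
df["above_one"] = df["uninsured_rate"] > 1.0

print(df[["uninsured_rate", "matches_proportion", "matches_percent", "above_one"]])
```

If most real rows match the percent column, values above 1.0 are simply rates above 1% and the "cap at 1.0" concern would not apply; rows that match neither reading are the ones worth inspecting by hand.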

poverty_rate numeric feature

This is a numeric poverty rate (likely percentage of population in poverty) across 3222 rows with no nulls and 1719 unique values. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with a median of 13.55 and mean 15.10, ranging from 1.6 to 66.32; 137 outliers (4.25%) sit in the upper tail. The high skew alert means a long tail of high-poverty units pulls the mean above the median.

Treatment: Consider a log or sqrt transform before regression to tame the right skew.

anthropic:claude-opus-4-7 · confidence high
Out[28]:

saturn.columns["poverty_rate"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 1,719
min 1.6
max 66.32
mean 15.1
median 13.55
std 7.706
q1 10.16
q3 17.91
iqr 7.75
skew 2.096
kurtosis 6.891
n_outliers 137
outlier_rate 0.04252
zero_rate 0
alert: high_skew · skew=+2.10
Fig 13.
Distribution of poverty_rate. Vertical dash marks the median.
Histogram bins for poverty_rate (median: 13.55): same bins and counts as the table under Fig 5.
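
The outlier counts in this report are described in the alerts as "rows beyond 1.5 IQR". Assuming that means the usual Tukey fences (an assumption about saturn's rule, not confirmed by the report), the cut-offs for poverty_rate follow directly from the reported quartiles:

```python
# Quartiles as reported in the poverty_rate stats table.
q1, q3 = 10.16, 17.91
iqr = q3 - q1  # 7.75, matching the reported iqr

# Tukey fences: anything outside [q1 - 1.5*iqr, q3 + 1.5*iqr] counts as an outlier.
lower_fence = q1 - 1.5 * iqr  # about -1.465, below any possible rate
upper_fence = q3 + 1.5 * iqr  # about 29.535

def is_outlier(value: float) -> bool:
    """True when the value falls outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence

print(round(lower_fence, 3), round(upper_fence, 3))
print(is_outlier(66.32), is_outlier(13.55))  # reported max and median
```

Because the lower fence is negative, every flagged row under this rule sits in the upper tail, which agrees with the description of the 137 outliers above.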

rural categorical feature

Binary flag indicating whether a record is rural, stored as the strings "True"/"False" rather than booleans. The split is imbalanced toward rural at 68.7% (2212 of 3222) versus 1010 non-rural, with no nulls. Entropy ratio of 0.897 confirms a meaningful but skewed distribution.

Treatment: Cast string "True"/"False" to a 0/1 boolean and use directly as a feature.

anthropic:claude-opus-4-7 · confidence high
Out[31]:

saturn.columns["rural"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 2
top_value True
top_rate 0.6865
cardinality 2
entropy 0.8971
entropy_ratio 0.8971
Fig 14.
Top values for rural.
Top values for rural (2 unique shown, of 2 total).
value  count  share
True   2212   68.7%
False  1010   31.3%
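
Casting the "True"/"False" strings in rural needs an explicit mapping, because Python treats any non-empty string as truthy. A minimal sketch with a four-row synthetic series:

```python
import pandas as pd

rural = pd.Series(["True", "False", "True", "True"])

# Pitfall: truthiness of a non-empty string is always True, even for "False".
assert bool("False") is True

# Explicit mapping; any unexpected spelling becomes NaN instead of a silent True.
mapped = rural.map({"True": True, "False": False})
if mapped.isna().any():
    raise ValueError("unexpected value in rural column")

as_int = mapped.astype("int8")  # 0/1 feature
print(as_int.tolist())
```

The NaN check is a deliberate guard: it fails loudly if the real column contains variants such as "true" or "1".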

rural_category categorical feature

Binary categorical flag splitting records into 'Rural' (2212, 68.7%) and 'Urban/Suburban' (1010), with no nulls across 3222 rows. The split is moderately imbalanced but entropy ratio of 0.90 indicates both classes are well represented. Clean two-level partition suitable as a stratifier or feature.

Treatment: One-hot or binary-encode for modelling; consider stratifying splits on this flag.

anthropic:claude-opus-4-7 · confidence high
Out[34]:

saturn.columns["rural_category"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 2
top_value Rural
top_rate 0.6865
cardinality 2
entropy 0.8971
entropy_ratio 0.8971
Fig 15.
Top values for rural_category.
Top values for rural_category (2 unique shown, of 2 total).
value           count  share
Rural           2212   68.7%
Urban/Suburban  1010   31.3%
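
The entropy figures in the categorical stats can be reproduced from the value counts. The sketch below computes Shannon entropy in bits and, as an assumption about how saturn defines entropy_ratio, normalises by log2 of the number of categories. With two categories that divisor is 1, which would explain why entropy and entropy_ratio are identical in these tables.

```python
import math

def entropy_bits(counts):
    """Shannon entropy in bits for a list of category counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def entropy_ratio(counts):
    """Entropy divided by its maximum, log2(k); assumed normalisation."""
    k = len(counts)
    return entropy_bits(counts) / math.log2(k) if k > 1 else 0.0

rural_entropy = entropy_bits([2212, 1010])  # Rural vs Urban/Suburban
risk_entropy = entropy_bits([2719, 503])    # Low vs Moderate

print(round(rural_entropy, 3), round(risk_entropy, 3))
```

Both values land close to the reported 0.8971 and 0.6249; a perfectly balanced two-way split would give 1.0.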

hospital_closure_risk_score numeric feature

Despite being typed as numeric, hospital_closure_risk_score takes only 3 distinct values across 3222 rows, spanning 0 to 50 with a median of 25 and roughly 28.8% zeros. This is effectively an ordinal risk band (likely 0/25/50) masquerading as a continuous score, so the reported mean of 21.69 and std of 16.34 reflect category mix rather than a smooth distribution.

Treatment: Treat as an ordinal categorical (low/medium/high) rather than a continuous numeric.

anthropic:claude-opus-4-7 · confidence high
Out[37]:

saturn.columns["hospital_closure_risk_score"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3
min 0
max 50
mean 21.69
median 25
std 16.34
q1 0
q3 25
iqr 25
skew 0.1414
kurtosis -0.6949
n_outliers 0
outlier_rate 0
zero_rate 0.2883
Fig 16.
Distribution of hospital_closure_risk_score. Vertical dash marks the median.
Histogram bins for hospital_closure_risk_score (median: 25.0): same bins and counts as the table under Fig 2.
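
Treating the score as an ordinal band could be sketched as follows. The histogram counts also hint at a link to risk_category: 503 rows sit in the top bin, the same as the Moderate count, and 929 + 1,790 = 2,719 matches Low. That suggests, without proving it, that risk_category may be derived from the score, which a cross-tabulation can check. The frame below is synthetic, the band labels follow the low/medium/high wording of the treatment, and the 0/25/50 values are the ones the report calls likely.

```python
import pandas as pd

# Synthetic rows; the real file would be loaded with pd.read_csv.
df = pd.DataFrame({
    "hospital_closure_risk_score": [0, 25, 50, 25, 0, 50],
    "risk_category": ["Low", "Low", "Moderate", "Low", "Low", "Moderate"],
})

# Map the three score values to an ordered categorical (labels are illustrative).
band_labels = {0: "low", 25: "medium", 50: "high"}
df["risk_band"] = pd.Categorical(
    df["hospital_closure_risk_score"].map(band_labels),
    categories=["low", "medium", "high"],
    ordered=True,
)

# Each score should land in exactly one risk_category if one is derived from the other.
table = pd.crosstab(df["hospital_closure_risk_score"], df["risk_category"])
is_function = bool((table > 0).sum(axis=1).le(1).all())
print(table)
print("risk_category determined by score:", is_function)
```

If the check holds on the real data, using both columns as model inputs or using one to predict the other would leak the label.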

risk_category categorical label

Binary risk classification flagging records as either Low or Moderate, with no nulls across 3,222 rows. The distribution is heavily imbalanced: 84.4% fall into Low (2,719) versus only 503 Moderate, and no High tier appears at all. Entropy ratio of 0.62 confirms the skew.

Treatment: Treat as binary target; account for class imbalance via stratified sampling or class weighting.

anthropic:claude-opus-4-7 · confidence high
Out[40]:

saturn.columns["risk_category"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 2
top_value Low
top_rate 0.8439
cardinality 2
entropy 0.6249
entropy_ratio 0.6249
Fig 17.
Top values for risk_category.
Top values for risk_category (2 unique shown, of 2 total).
value     count  share
Low       2719   84.4%
Moderate  503    15.6%
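
For the class weighting mentioned in the treatment, one common heuristic is the "balanced" rule, weight = n_samples / (n_classes * class_count), which is the formula scikit-learn documents for class_weight="balanced". Applied to the counts in the table above, as a sketch rather than a recommendation for any particular model:

```python
# Class counts from the risk_category table.
counts = {"Low": 2719, "Moderate": 503}

n_total = sum(counts.values())  # 3222
n_classes = len(counts)

# Balanced weights: rarer classes get proportionally larger weights.
weights = {label: n_total / (n_classes * c) for label, c in counts.items()}

for label, w in weights.items():
    print(f"{label}: {w:.4f}")
```

With these weights each class contributes the same total weight (3222 / 2 = 1611), so the 503 Moderate rows count as much in aggregate as the 2,719 Low rows.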

How to cite

BibTeX
@misc{saturn-healthcare-healthcare-desert-merged-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: healthcare healthcare desert merged},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/healthcare-healthcare_desert_merged}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: healthcare healthcare desert merged. Source: /home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/healthcare-healthcare_desert_merged