saturn

/home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv 3,222 rows sample n=3,222 seed 42 2026-05-01T17:21:19+00:00

Overview

Source: /home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv
Total rows: 3,222
Profiled sample: 3,222
Columns: 10
Generated: 2026-05-01T17:21:19+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset profiles 3,222 U.S. counties and county equivalents (one row per unit, keyed by FIPS) with population, uninsured counts and rates, poverty rate, a hospital closure risk score, and rural/urban flags. Population and uninsured counts are extremely right-skewed (total_pop skew 13.4, uninsured_pop skew 17.8), so a handful of large counties will dominate any raw totals; analysis should likely use rates or log scales. The hospital_closure_risk_score takes only 3 distinct values (about 29% of rows score 0), and risk_category is heavily imbalanced, with 84% of counties labeled 'Low' and the rest 'Moderate'; both are worth checking before any modeling. About 69% of counties are flagged Rural, so rural/urban comparisons of uninsured and poverty rates should be a productive next cut.
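
To illustrate why raw totals are dominated by large counties, the sketch below compares an unweighted mean rate with a population-weighted one and puts populations on a log scale. The three populations are the reported min, median, and max of total_pop; the poverty rates paired with them are hypothetical values chosen for illustration, not rows from the file.

```python
import math

# (total_pop, poverty_rate in percent). Populations are the profile's
# min / median / max; the rates are HYPOTHETICAL illustration values.
rows = [(47, 30.0), (25_328, 13.5), (9_866_623, 14.0)]

def unweighted_mean_rate(rows):
    """Every county counts equally, regardless of size."""
    return sum(rate for _, rate in rows) / len(rows)

def pop_weighted_mean_rate(rows):
    """Every resident counts equally, so large counties dominate."""
    total = sum(pop for pop, _ in rows)
    return sum(pop * rate for pop, rate in rows) / total

def log10_pop(rows):
    """Log scale compresses the 47 .. 9.9M range to roughly 1.7 .. 7.0."""
    return [math.log10(pop) for pop, _ in rows]

print(round(unweighted_mean_rate(rows), 2))    # county-level average
print(round(pop_weighted_mean_rate(rows), 2))  # almost equal to the largest county's rate
print([round(x, 2) for x in log10_pop(rows)])
```

Which of the two averages is appropriate depends on whether the question is about counties or about residents.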

fips high anthropic:claude-opus-4-7

This is the FIPS county code: 3,222 rows with 3,222 unique values, no nulls, and a min of 1001 / max of 72153 consistent with the U.S. county FIPS scheme (state prefix * 1000 + county). The spread across that range is near-uniform (skew 0.16, kurtosis -0.63, no outliers), which is consistent with a column that indexes geography rather than measuring anything. The numeric dtype has dropped leading zeros (01001 is stored as 1001), so zero-pad to five characters before joining to other FIPS-keyed tables. Treat it as a categorical key, not a quantity.
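
A minimal sketch of restoring the conventional five-character FIPS string from the integer form and splitting it into state and county parts. The helper name `split_fips` is hypothetical; the two inputs are the reported min and max of the column.

```python
def split_fips(fips: int) -> tuple[str, str]:
    """Zero-pad an integer county FIPS to 5 characters and split it.

    Returns (state_code, county_code), e.g. 1001 -> ("01", "001").
    """
    code = f"{fips:05d}"          # restores the leading zero lost by the int dtype
    return code[:2], code[2:]

print(split_fips(1001))    # ('01', '001')
print(split_fips(72153))   # ('72', '153')
```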

county_name high anthropic:claude-opus-4-7

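This column holds fully qualified U.S. county names in the form 'X County, State', with all 3,222 values unique and no nulls. The token 'county,' appears 2,999 times, so the remaining 223 or so rows likely use other designations such as Parish, Borough, or Municipio (the FIPS maximum of 72153 falls under Puerto Rico's 72 prefix). The most frequent state tokens are Texas (256), Virginia (189), and Georgia (159). These appear to be token counts rather than per-state row counts, so 'Virginia' probably also matches West Virginia rows and 'Texas' may match counties named Texas in other states; for a true state distribution, count rows by the text after the final comma.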
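
The difference between token counts and per-state row counts can be sketched as follows. The first four names come from the sample values listed in this report; the last two are illustrative names added here to trigger the pitfall and are not quoted from the profile. `state_of` is a hypothetical helper.

```python
from collections import Counter

names = [
    "Lamb County, Texas",
    "Liberty County, Texas",
    "Rockingham County, Virginia",
    "Bibb County, Alabama",
    "Texas County, Missouri",        # illustrative, not from the profile
    "Mercer County, West Virginia",  # illustrative, not from the profile
]

def state_of(county_name: str) -> str:
    """Take everything after the final ', ' as the state name."""
    return county_name.rsplit(", ", 1)[-1]

# Per-state row counts: parse the state field.
by_state = Counter(state_of(n) for n in names)

# Naive token counts: any occurrence of the word, anywhere in the string.
by_token = Counter(
    tok.strip(",").lower() for n in names for tok in n.split()
)

print(by_state["Texas"], by_token["texas"])        # 2 3
print(by_state["Virginia"], by_token["virginia"])  # 1 2
```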

total_pop high anthropic:claude-opus-4-7

This is a population count per county (n=3,222), with values ranging from 47 to 9,866,623 and a median of 25,328. The distribution is severely right-skewed (skew 13.38, kurtosis 298.69): the mean (102,232) is about four times the median, 453 rows (14.06%) are flagged as outliers, and the standard deviation of 326,934 dwarfs the IQR of 54,579. There are no nulls or zeros, and 3,141 of 3,222 values are unique.
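
The outlier counts in this report are labeled "beyond 1.5 IQR". Assuming that means the usual Tukey fences (an assumption about saturn's rule, not something the report states in detail), the thresholds can be worked out from the reported quartiles:

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k * IQR outside the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Reported quartiles for total_pop: q1 = 10,611 and q3 = 65,190.
lower, upper = tukey_fences(10_611, 65_190)
print(lower, upper)   # -71257.5 147058.5
```

The lower fence is negative, so under this rule every flagged total_pop row is a large county (above roughly 147,000 residents), not a small one.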

uninsured_pop high anthropic:claude-opus-4-7

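Counts of uninsured residents per record, with values ranging from 0 to 20,915 across 3,222 rows and no nulls. The distribution is severely right-skewed (skew 17.81, kurtosis 462.87): the median is 36 while the mean is 159.95, and 17.2% of rows are zero. 368 outliers (11.4%) sit far above the Q3 of 120, consistent with a few very large populations dominating the tail. The counts are small relative to total_pop (median 36 against a median population of 25,328), so the column may cover a subpopulation or a scaled estimate; check the source definition before interpreting it as all uninsured residents.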

uninsured_rate high anthropic:claude-opus-4-7

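This is an uninsured rate per record, ranging from 0.0 to 3.7 with a median of 0.12. The maximum of 3.7 rules out a 0-1 proportion, so the column is more plausibly a percentage: the median uninsured_pop of 36 against the median total_pop of 25,328 is about 0.14%, the same order as the median rate. That reading is an inference from summary statistics and should be confirmed by recomputing uninsured_pop / total_pop per row. The distribution is severely right-skewed (skew 4.10, kurtosis 27.70) with 230 outliers (7.1%) and 17.5% exact zeros.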
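
One way to settle the scale of uninsured_rate is to recompute the share from the two count columns and see which scale reproduces the stored value. The sketch below is a hypothetical helper (`rate_scale`, with an assumed tolerance of half a unit in the second decimal); the example row pairs the reported medians and is not an actual row from the file.

```python
def rate_scale(uninsured: float, total: float, reported: float,
               tol: float = 0.005) -> str:
    """Classify a stored rate as 'proportion', 'percent', 'ambiguous',
    'neither', or 'undefined' by comparing it with uninsured / total."""
    if total <= 0:
        return "undefined"
    share = uninsured / total
    as_proportion = abs(share - reported) <= tol
    as_percent = abs(100 * share - reported) <= tol
    if as_proportion and as_percent:
        return "ambiguous"          # e.g. zero counts fit both scales
    if as_proportion:
        return "proportion"
    if as_percent:
        return "percent"
    return "neither"

# Hypothetical row built from the reported medians (36 of 25,328).
print(rate_scale(36, 25_328, 0.14))   # percent
print(rate_scale(0, 100, 0.0))        # ambiguous
```

Applying this per row and tallying the labels would show whether the column is consistently a percentage or a mix of conventions.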

poverty_rate high anthropic:claude-opus-4-7

This is a numeric poverty rate (likely percentage of population in poverty) across 3222 rows with no nulls and 1719 unique values. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with a median of 13.55 and mean 15.10, ranging from 1.6 to 66.32; 137 outliers (4.25%) sit in the upper tail. The high skew alert means a long tail of high-poverty units pulls the mean above the median.
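
The link between positive skew and a mean above the median can be sketched with the moment-based skewness coefficient. Whether saturn uses this exact estimator (rather than a bias-corrected variant) is an assumption, and the values below are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = [8, 10, 12, 14, 16]        # hypothetical rates
long_tail = [8, 10, 12, 14, 60]        # one high-poverty unit

print(skewness(symmetric))                                        # 0.0
print(skewness(long_tail) > 0)                                    # True
print(statistics.mean(long_tail) > statistics.median(long_tail))  # True
```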

rural high anthropic:claude-opus-4-7

Binary flag indicating whether a record is rural, stored as the strings "True"/"False" rather than booleans, so cast it before filtering or aggregating. The split leans rural at 68.7% (2,212 of 3,222) versus 1,010 non-rural, with no nulls. The entropy ratio of 0.897 (1.0 would be an even split) indicates a moderate imbalance in which both classes are still well represented.
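
The reported entropy ratios can be reproduced from the value counts, assuming the ratio is Shannon entropy in bits divided by log2 of the number of categories. That definition is an assumption about saturn, though it matches the figures reported for both two-level columns.

```python
import math

def entropy_bits(counts):
    """Shannon entropy in bits of a list of category counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)

def entropy_ratio(counts):
    """Entropy divided by its maximum, log2(number of non-empty categories)."""
    k = sum(1 for c in counts if c > 0)
    return entropy_bits(counts) / math.log2(k) if k > 1 else 0.0

print(round(entropy_ratio([2212, 1010]), 3))  # rural: 0.897
print(round(entropy_ratio([2719, 503]), 3))   # risk_category: 0.625
```

With two categories the maximum entropy is 1 bit, which is why the report shows identical entropy and entropy_ratio values for these columns.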

rural_category high anthropic:claude-opus-4-7

Two-level categorical splitting records into 'Rural' (2,212, 68.7%) and 'Urban/Suburban' (1,010), with no nulls across 3,222 rows. The counts and the entropy ratio (0.897) match the rural column exactly, so this is probably a relabeling of that flag; if a row-level check confirms it, keep only one of the two as a stratifier or feature.
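
Matching counts suggest, but do not prove, that rural and rural_category carry the same information. A row-level check could look like the sketch below, where `is_one_to_one` is a hypothetical helper and the pairs are made-up examples of (rural, rural_category) values.

```python
def is_one_to_one(pairs):
    """True if each left value maps to exactly one right value and vice versa."""
    forward, backward = {}, {}
    for left, right in pairs:
        # setdefault records the first mapping seen and returns it afterwards
        if forward.setdefault(left, right) != right:
            return False
        if backward.setdefault(right, left) != left:
            return False
    return True

consistent = [("True", "Rural"), ("False", "Urban/Suburban"), ("True", "Rural")]
conflicting = consistent + [("True", "Urban/Suburban")]

print(is_one_to_one(consistent))    # True
print(is_one_to_one(conflicting))   # False
```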

hospital_closure_risk_score high anthropic:claude-opus-4-7

Despite being typed as numeric, hospital_closure_risk_score takes only 3 distinct values across 3,222 rows. With a minimum of 0, a median of 25, and a maximum of 50, those values are almost certainly 0, 25, and 50, and roughly 28.8% of rows are 0. This is effectively an ordinal risk band rather than a continuous score, so the reported mean of 21.69 and std of 16.34 reflect the category mix rather than a smooth distribution.
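
If the three values are 0, 25, and 50 (an assumption, since the report does not list them), the reported mean and zero rate constrain how many rows fall in each band. The sketch below enumerates the band counts consistent with the rounded statistics; `candidate_counts` is a hypothetical helper.

```python
ROWS = 3222
MEAN = 21.695       # reported mean, rounded to 3 decimals
ZERO_RATE = 0.288   # reported zero_rate, rounded to 3 decimals

def candidate_counts(rows, mean, zero_rate, tol=0.0005):
    """Enumerate (n0, n25, n50) whose zero rate and mean round to the
    reported values, assuming the only scores are 0, 25, and 50."""
    found = []
    for n0 in range(rows + 1):
        if abs(n0 / rows - zero_rate) > tol:
            continue
        for n50 in range(rows - n0 + 1):
            n25 = rows - n0 - n50
            if abs((25 * n25 + 50 * n50) / rows - mean) <= tol:
                found.append((n0, n25, n50))
    return found

for n0, n25, n50 in candidate_counts(ROWS, MEAN, ZERO_RATE):
    print(n0, n25, n50)
```

One of the candidates has 503 rows at score 50, the same as the number of 'Moderate' rows in risk_category. That coincidence hints that 'Moderate' may correspond to the top band, but only a row-level cross-tabulation can confirm it.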

risk_category high anthropic:claude-opus-4-7

Two-level risk classification labeling records as either Low or Moderate, with no nulls across 3,222 rows. The distribution is heavily imbalanced: 84.4% fall into Low (2,719) versus 15.6% Moderate (503), and no High tier appears at all. The entropy ratio of 0.625 reflects that imbalance.

Numeric correlation

fips numeric

rows: 3,222
null: 0 (0.0%)
unique: 3,222
min: 1,001
max: 72,153
mean: 31,378
median: 30,022
std: 16,300
q1: 19,030
q3: 46,104
iqr: 27,075
skew: 0.157
kurtosis: -0.631
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

county_name text

100.0% of rows are unique strings
rows: 3,222
null: 0 (0.0%)
unique: 3,222
len_min: 16
len_max: 59
len_mean: 24.324
len_median: 24.000
len_p95: 31.000
word_mean: 3.248
word_median: 3.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 1,990
readability_flesch_mean: 10.284
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Bibb County, Alabama
  2. Cheatham County, Tennessee
  3. Piute County, Utah
  4. Lamb County, Texas
  5. Martin County, Minnesota
  6. Sheridan County, Wyoming
  7. Chickasaw County, Mississippi
  8. Rockingham County, Virginia
  9. Liberty County, Texas
  10. Clark County, Arkansas

total_pop numeric

skew=+13.38; 14.1% of rows beyond 1.5 × IQR
rows: 3,222
null: 0 (0.0%)
unique: 3,141
min: 47.000
max: 9,866,623
mean: 102,232
median: 25,328
std: 326,934
q1: 10,611
q3: 65,190
iqr: 54,579
skew: 13.377
kurtosis: 298.689
n_outliers: 453
outlier_rate: 0.141
zero_rate: 0.000

uninsured_pop numeric

skew=+17.81; 11.4% of rows beyond 1.5 × IQR
rows: 3,222
null: 0 (0.0%)
unique: 584
min: 0.000
max: 20,915
mean: 159.945
median: 36.000
std: 627.163
q1: 7.000
q3: 120.000
iqr: 113.000
skew: 17.811
kurtosis: 462.866
n_outliers: 368
outlier_rate: 0.114
zero_rate: 0.172

uninsured_rate numeric

skew=+4.10; 7.1% of rows beyond 1.5 × IQR
rows: 3,222
null: 0 (0.0%)
unique: 152
min: 0.000
max: 3.700
mean: 0.200
median: 0.120
std: 0.283
q1: 0.040
q3: 0.250
iqr: 0.210
skew: 4.095
kurtosis: 27.703
n_outliers: 230
outlier_rate: 0.071
zero_rate: 0.175

poverty_rate numeric

skew=+2.10
rows: 3,222
null: 0 (0.0%)
unique: 1,719
min: 1.600
max: 66.320
mean: 15.100
median: 13.550
std: 7.706
q1: 10.160
q3: 17.910
iqr: 7.750
skew: 2.096
kurtosis: 6.891
n_outliers: 137
outlier_rate: 0.043
zero_rate: 0.000

rural categorical

rows: 3,222
null: 0 (0.0%)
unique: 2
top_value: True
top_rate: 0.687
cardinality: 2
entropy: 0.897
entropy_ratio: 0.897
Top values (rank 1–20)
  1. True — 2,212
  2. False — 1,010

rural_category categorical

rows: 3,222
null: 0 (0.0%)
unique: 2
top_value: Rural
top_rate: 0.687
cardinality: 2
entropy: 0.897
entropy_ratio: 0.897
Top values (rank 1–20)
  1. Rural — 2,212
  2. Urban/Suburban — 1,010

hospital_closure_risk_score numeric

rows: 3,222
null: 0 (0.0%)
unique: 3
min: 0.000
max: 50.000
mean: 21.695
median: 25.000
std: 16.338
q1: 0.000
q3: 25.000
iqr: 25.000
skew: 0.141
kurtosis: -0.695
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.288

risk_category categorical

rows: 3,222
null: 0 (0.0%)
unique: 2
top_value: Low
top_rate: 0.844
cardinality: 2
entropy: 0.625
entropy_ratio: 0.625
Top values (rank 1–20)
  1. Low — 2,719
  2. Moderate — 503