This dataset profiles 3,222 U.S. counties (one row per county, keyed by FIPS) with population, uninsured counts and rates, poverty rate, a hospital closure risk score, and rural/urban flags. Population and uninsured figures are extremely right-skewed (total_pop skew 13.4, uninsured_pop skew 17.8), so a handful of large counties will dominate any raw totals — analysis should likely use rates or log scales. The hospital_closure_risk_score collapses to just 3 distinct values (with ~29% scoring 0), and risk_category is heavily imbalanced with 84% of counties labeled 'Low' and the rest 'Moderate', which is worth examining first. About 69% of counties are flagged Rural, so rural/urban comparisons of uninsured and poverty rates should be a productive next cut.
saturn
/home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv 3,222 rows sample n=3,222 seed 42 2026-05-01T17:21:19+00:00
Overview
| Source | /home/coolhand/datasets/us-inequality-atlas/healthcare/healthcare_desert_merged.csv |
| Total rows | 3,222 |
| Profiled sample | 3,222 |
| Columns | 10 |
| Generated | 2026-05-01T17:21:19+00:00 |
Insights opt-in
Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.
This is the FIPS county code: 3222 rows with 3222 unique values, no nulls, and a min of 1001 / max of 72153 consistent with the U.S. county FIPS scheme (state prefix * 1000 + county). Distribution is near-uniform across that range (skew 0.16, kurtosis -0.63, no outliers), confirming it indexes geography rather than measuring anything. Treat it as a categorical key, not a quantity, despite the numeric dtype.
This column holds fully-qualified US county names (e.g. 'X County, State'), with all 3222 values unique and no nulls. The token 'county,' appears 2999 times, confirming a 'County,
This is almost certainly a population count per geographic unit (likely US counties given n=3222), with values ranging from 47 to 9,866,623 and a median of 25,328. The distribution is severely right-skewed (skew 13.38, kurtosis 298.69) with the mean (102,232) nearly four times the median and 453 outliers (14.06%) — the standard deviation of 326,934 dwarfs the IQR of 54,579. No nulls or zeros, and 3,141 of 3,222 values are unique.
Counts of uninsured residents per record, with values ranging from 0 to 20,915 across 3,222 rows and no nulls. The distribution is severely right-skewed (skew 17.81, kurtosis 462.87): the median is 36 while the mean is 159.95, and 17.2% of rows are zero. 368 outliers (11.4%) sit far above the Q3 of 120, consistent with a few very large populations dominating the tail.
This appears to be an uninsured rate per record, expressed as a proportion ranging from 0.0 to 3.7 with a median of 0.12. The maximum of 3.7 is suspicious for a rate that should cap at 1.0, and the distribution is severely right-skewed (skew 4.10, kurtosis 27.70) with 230 outliers (7.1%) and 17.5% exact zeros.
This is a numeric poverty rate (likely percentage of population in poverty) across 3222 rows with no nulls and 1719 unique values. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with a median of 13.55 and mean 15.10, ranging from 1.6 to 66.32; 137 outliers (4.25%) sit in the upper tail. The high skew alert means a long tail of high-poverty units pulls the mean above the median.
Binary flag indicating whether a record is rural, stored as the strings "True"/"False" rather than booleans. The split is imbalanced toward rural at 68.7% (2212 of 3222) versus 1010 non-rural, with no nulls. Entropy ratio of 0.897 confirms a meaningful but skewed distribution.
Binary categorical flag splitting records into 'Rural' (2212, 68.7%) and 'Urban/Suburban' (1010), with no nulls across 3222 rows. The split is moderately imbalanced but entropy ratio of 0.90 indicates both classes are well represented. Clean two-level partition suitable as a stratifier or feature.
Despite being typed as numeric, hospital_closure_risk_score takes only 3 distinct values across 3222 rows, spanning 0 to 50 with a median of 25 and roughly 28.8% zeros. This is effectively an ordinal risk band (likely 0/25/50) masquerading as a continuous score, so the reported mean of 21.69 and std of 16.34 reflect category mix rather than a smooth distribution.
Binary risk classification flagging records as either Low or Moderate, with no nulls across 3,222 rows. The distribution is heavily imbalanced: 84.4% fall into Low (2,719) versus only 503 Moderate, and no High tier appears at all. Entropy ratio of 0.62 confirms the skew.
Numeric correlation
fips numeric
county_name text
Sample values (first 10)
- Bibb County, Alabama
- Cheatham County, Tennessee
- Piute County, Utah
- Lamb County, Texas
- Martin County, Minnesota
- Sheridan County, Wyoming
- Chickasaw County, Mississippi
- Rockingham County, Virginia
- Liberty County, Texas
- Clark County, Arkansas
total_pop numeric
uninsured_pop numeric
uninsured_rate numeric
poverty_rate numeric
rural categorical
Top values (rank 1–20)
- True — 2,212
- False — 1,010
rural_category categorical
Top values (rank 1–20)
- Rural — 2,212
- Urban/Suburban — 1,010
hospital_closure_risk_score numeric
risk_category categorical
Top values (rank 1–20)
- Low — 2,719
- Moderate — 503