saturn

/home/coolhand/html/datavis/data_trove/cache/healthcare_data/county_health_rankings_20260121.parquet | 3,222 rows | sample n=3,222 | seed 42 | 2026-05-01T16:57:25+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/cache/healthcare_data/county_health_rankings_20260121.parquet
Total rows: 3,222
Profiled sample: 3,222
Columns: 5
Generated: 2026-05-01T16:57:25+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset covers 3,222 U.S. counties (one row per FIPS code) with population totals, uninsured counts, and uninsured rates. Both total_pop and uninsured_pop are extremely right-skewed (skew 13.4 and 17.8) with hundreds of outliers (453 and 368 rows), so a handful of very large counties dominate the raw counts; analysts should work in per-capita or log space. The uninsured_rate is the more comparable metric: median 0.12, about 17.5% of counties reporting zero, and a long tail reaching 3.7 that warrants a data-quality check. The county_name field shows Texas, Virginia, and Georgia contributing the most counties, which is useful context for any state-level rollups.
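The advice to work in per-capita or log space can be sketched as follows. This is a minimal illustration, not part of saturn: the helper names (`log10_scale`, `log1p_scale`, `per_capita`) are hypothetical, and the input is only the five-number summary of total_pop reported in this profile, not the full column.

```python
import math

# Five-number summary of total_pop as reported in this profile
# (min, q1, median, q3, max); not the full column.
TOTAL_POP_SUMMARY = [47, 10_611, 25_328, 65_190, 9_866_623]


def log10_scale(values):
    """Map strictly positive counts to log10 space."""
    return [math.log10(v) for v in values]


def log1p_scale(values):
    """Map non-negative counts to log(1 + x), which tolerates zeros."""
    return [math.log1p(v) for v in values]


def per_capita(count, population):
    """Return count / population, or None when the population is zero."""
    if population == 0:
        return None
    return count / population


logged = log10_scale(TOTAL_POP_SUMMARY)
print([round(x, 2) for x in logged])
```

On the log10 scale the five summary points span roughly five units instead of five orders of magnitude, so the largest counties no longer flatten everything else. `log1p_scale` is the safer choice for uninsured_pop, where about 17.2% of rows are zero.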

fips high anthropic:claude-opus-4-7

This column is a 5-character FIPS code identifying U.S. counties; all 3,222 rows hold a unique value (n_unique equals n) with zero nulls. Lengths are uniformly 5 (min, median, and max all 5), values are single tokens (one_word_rate 1.0), and leading samples such as 01001, 01003, and 01005 carry the 01 state prefix used for Alabama counties. It functions as a primary key rather than a feature.
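A primary-key check along the lines described above might look like the sketch below. The function name `check_fips_key` is hypothetical, and the input is the ten sample values listed in this report rather than the full column.

```python
import re

# Exactly five ASCII digits; leading zeros are significant.
FIPS_RE = re.compile(r"[0-9]{5}")


def check_fips_key(codes):
    """Return (malformed, duplicated) codes; both empty means a usable key."""
    malformed = [
        c for c in codes
        if not (isinstance(c, str) and FIPS_RE.fullmatch(c))
    ]
    seen, duplicated = set(), set()
    for c in codes:
        if c in seen:
            duplicated.add(c)
        seen.add(c)
    return malformed, sorted(duplicated, key=str)


# The ten sample values shown for the fips column in this report.
sample = ["01007", "47021", "49031", "48279", "27091",
          "56033", "28017", "51165", "48291", "05019"]
malformed, duplicated = check_fips_key(sample)
print(malformed, duplicated)
```

Keeping the column as text matters: casting "01007" to an integer drops the leading zero, and the code then has to be restored with something like `str(1007).zfill(5)`.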

county_name high anthropic:claude-opus-4-7

This column holds U.S. county names, likely formatted as '<Name> County, <State>': the token 'county,' appears in 2,999 of 3,222 rows, and state names such as Texas (256), Virginia (189), and Georgia (159) dominate the top tokens. All 3,222 rows are unique with zero nulls and zero duplicates, consistent with a canonical roster of U.S. counties. String lengths cluster around the median (min 16, median 24, 95th percentile 31, max 59) and rows average 3.25 words, so formatting is highly regular.
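If the 'Name, State' pattern holds, a state-level rollup can start from a split on the last comma. The sketch below assumes that format; `split_county_state` is a hypothetical helper, and the input is the ten sample values shown in this report.

```python
from collections import Counter


def split_county_state(name):
    """Split 'Bibb County, Alabama' into ('Bibb County', 'Alabama').

    Splits on the last ', ' so a comma inside the county part survives.
    Returns (name, None) when no separator is present.
    """
    county, sep, state = name.rpartition(", ")
    if not sep:
        return name, None
    return county, state


# The ten sample values shown for the county_name column in this report.
sample = [
    "Bibb County, Alabama", "Cheatham County, Tennessee",
    "Piute County, Utah", "Lamb County, Texas",
    "Martin County, Minnesota", "Sheridan County, Wyoming",
    "Chickasaw County, Mississippi", "Rockingham County, Virginia",
    "Liberty County, Texas", "Clark County, Arkansas",
]
state_counts = Counter(split_county_state(n)[1] for n in sample)
print(state_counts.most_common(3))
```

Rows that return `None` for the state are worth inspecting by hand, since 223 of the 3,222 rows do not contain the 'county,' token and may follow a different naming pattern.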

total_pop high anthropic:claude-opus-4-7

Likely a county-level total population count: 3,222 rows with 3,141 unique values, no nulls, and integer-scale magnitudes from 47 up to 9,866,623. The distribution is extremely right-skewed (skew 13.38, kurtosis 298.69), with the median of 25,328 far below the mean of 102,232 and 453 rows (14.06%) flagged as outliers. The std of 326,934 dwarfs the IQR of 54,579, consistent with a few massive metros pulling the tail.
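The outlier share can be put in context by computing the cutoffs directly. The sketch below assumes the conventional Tukey rule (points beyond 1.5 IQR from the quartiles), which the report's "beyond 1.5 IQR" label suggests but does not spell out; `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles of total_pop as reported in this profile.
lower, upper = tukey_fences(10_611, 65_190)
print(lower, upper)
```

Under that assumption the lower fence is negative, so with a minimum of 47 every flagged row would sit on the high side, above a population of roughly 147,000.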

uninsured_pop high anthropic:claude-opus-4-7

Counts of uninsured people per record, likely aggregated to a geographic unit (the 3,222 rows suggest U.S. counties). The distribution is extremely right-skewed: the median is 36, but the mean is 159.9 and the max reaches 20,915, with skew 17.8 and kurtosis 462.9. Roughly 17.2% of rows are zero and 11.4% are flagged as outliers, so a handful of large jurisdictions dominate the totals.
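For readers who want to see what the skew and kurtosis figures measure, here is a minimal moment-based sketch. It uses the plain population formulas; whether saturn applies these or a bias-corrected variant, and whether it reports raw or excess kurtosis, is not stated in the report, so the numbers are not expected to match. The input is only the reported five-number summary of uninsured_pop, and `moments_shape` is a hypothetical name.

```python
def moments_shape(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("shape is undefined for constant data")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Five-number summary of uninsured_pop from this report (min, q1, median,
# q3, max). Illustrative only: five points cannot reproduce the column's skew.
summary = [0, 7, 36, 120, 20_915]
skew, excess_kurt = moments_shape(summary)
print(skew > 0)
```

A positive skew means the right tail is the long one, which is what a mean several times the median already implies.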

uninsured_rate high anthropic:claude-opus-4-7

This looks like a per-record uninsured rate, ranging from 0.0 to 3.7 with a median of 0.12 and IQR of 0.21. The distribution is severely right-skewed (skew 4.10, kurtosis 27.70) with 230 outliers (7.14%) and 17.54% exact zeros, and the max of 3.7 is implausible if this is meant to be a proportion bounded at 1.
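One way to run the data-quality check is to recompute the rate from the two count columns and see which unit the reported value is closer to. As a rough hint only, the reported means give 159.945 / 102,232, about 0.0016 as a fraction or 0.16 on a percent scale, which is the same order as the reported mean rate of 0.200; a ratio of means is not a mean of ratios, so a row-level comparison is needed. The sketch below is hypothetical throughout: the function name and the three example rows are made up for illustration and are not taken from the dataset.

```python
def vote_rate_unit(rows):
    """Count rows whose reported rate is closer to a fraction or a percent.

    rows: iterable of (uninsured_pop, total_pop, reported_rate).
    Rows with a non-positive population or a zero count are skipped,
    because both interpretations agree on zero.
    """
    votes = {"fraction": 0, "percent": 0}
    for uninsured, total, reported in rows:
        if total <= 0 or uninsured == 0:
            continue
        fraction = uninsured / total
        err_fraction = abs(reported - fraction)
        err_percent = abs(reported - 100 * fraction)
        votes["fraction" if err_fraction <= err_percent else "percent"] += 1
    return votes


# Hypothetical rows for illustration; NOT values from the dataset.
example_rows = [
    (36, 25_328, 0.14),
    (120, 65_190, 0.18),
    (7, 10_611, 0.07),
]
print(vote_rate_unit(example_rows))
```

If the real rows vote overwhelmingly for one unit, the remaining disagreements are the ones to inspect; if neither unit fits, the rate may use a different denominator than total_pop.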

Numeric correlation
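
No correlation values survive in this section of the report. For heavily skewed columns like the three numeric ones here, a rank-based coefficient is a common choice because it is not dominated by a few very large counties. The sketch below computes a Spearman correlation with the standard library only; it is an assumption about what would be useful, not a description of how saturn computes its correlation panel, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy monotone data: ranks agree exactly even though the scales differ wildly.
print(spearman([1, 2, 3, 4], [10, 100, 1_000, 10_000]))
```

Average ranks matter here because uninsured_pop and uninsured_rate contain many tied zeros.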

fips text

100.0% of rows are unique strings | 100.0% rows are a single word | 100.0% rows are all-caps | 95th-percentile length under 20 chars
rows: 3,222
null: 0 (0.0%)
unique: 3,222
len_min: 5
len_max: 5
len_mean: 5.000
len_median: 5.000
len_p95: 5.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 3,222
readability_flesch_mean: 121.220
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 1.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. 01007
  2. 47021
  3. 49031
  4. 48279
  5. 27091
  6. 56033
  7. 28017
  8. 51165
  9. 48291
  10. 05019

county_name text

100.0% of rows are unique strings
rows: 3,222
null: 0 (0.0%)
unique: 3,222
len_min: 16
len_max: 59
len_mean: 24.324
len_median: 24.000
len_p95: 31.000
word_mean: 3.248
word_median: 3.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 1,990
readability_flesch_mean: 10.284
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Bibb County, Alabama
  2. Cheatham County, Tennessee
  3. Piute County, Utah
  4. Lamb County, Texas
  5. Martin County, Minnesota
  6. Sheridan County, Wyoming
  7. Chickasaw County, Mississippi
  8. Rockingham County, Virginia
  9. Liberty County, Texas
  10. Clark County, Arkansas

total_pop numeric

skew=+13.38 | 14.1% rows beyond 1.5 IQR
rows: 3,222
null: 0 (0.0%)
unique: 3,141
min: 47.000
max: 9,866,623
mean: 102,232
median: 25,328
std: 326,934
q1: 10,611
q3: 65,190
iqr: 54,579
skew: 13.377
kurtosis: 298.689
n_outliers: 453
outlier_rate: 0.141
zero_rate: 0.000

uninsured_pop numeric

skew=+17.81 | 11.4% rows beyond 1.5 IQR
rows: 3,222
null: 0 (0.0%)
unique: 584
min: 0.000
max: 20,915
mean: 159.945
median: 36.000
std: 627.163
q1: 7.000
q3: 120.000
iqr: 113.000
skew: 17.811
kurtosis: 462.866
n_outliers: 368
outlier_rate: 0.114
zero_rate: 0.172

uninsured_rate numeric

skew=+4.10 | 7.1% rows beyond 1.5 IQR
rows: 3,222
null: 0 (0.0%)
unique: 152
min: 0.000
max: 3.700
mean: 0.200
median: 0.120
std: 0.283
q1: 0.040
q3: 0.250
iqr: 0.210
skew: 4.095
kurtosis: 27.703
n_outliers: 230
outlier_rate: 0.071
zero_rate: 0.175