saturn

/home/coolhand/html/datavis/data_trove/demographic/veterans/cache/acs_2022_county.parquet | 3,144 rows | sample n=3,144 | seed 42 | 2026-05-01T17:02:24+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/demographic/veterans/cache/acs_2022_county.parquet
Total rows: 3,144
Profiled sample: 3,144
Columns: 7
Generated: 2026-05-01T17:02:24+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset covers 3,144 U.S. counties from the 2022 American Community Survey. Each row is identified by FIPS code, state code, county code, and name, and carries three Census variables: total population (B01003_001E), veterans in the civilian population aged 18 and over (B21001_002E), and population aged 16 and over in the labor force (B23025_002E). All three measures are extremely right-skewed (skew of 13.2, 8.0, and 13.1 respectively), and 13-14% of counties (408 to 449 rows) fall beyond the 1.5 IQR fences on each measure. Total population, for example, ranges from 50 to 9.94 million while the median is just 25,784, so the largest metro counties stretch the tails far past the typical county. Start by looking at the population and labor-force distributions on a log scale, and use the state field (51 unique values) to see how counties cluster geographically.
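
To illustrate why a log scale helps here, the following Python sketch applies log10 to the total-population summary values reported in this profile (minimum 50, median 25,784, maximum 9,936,690). It uses only the standard library and does not read the parquet file; `population_summary` and `log10_summary` are illustrative names, not part of saturn.

```python
import math

# Summary values for B01003_001E (total population) as reported in this profile.
population_summary = {"min": 50, "median": 25_784, "max": 9_936_690}


def log10_summary(summary):
    """Return the base-10 logarithm of each summary value (all must be > 0)."""
    return {name: math.log10(value) for name, value in summary.items()}


logged = log10_summary(population_summary)

# Relative position of the median within the min-max range, on each scale.
raw_position = (population_summary["median"] - population_summary["min"]) / (
    population_summary["max"] - population_summary["min"]
)
log_position = (logged["median"] - logged["min"]) / (logged["max"] - logged["min"])

for name, value in logged.items():
    print(f"{name}: log10 = {value:.2f}")
print(f"median position on raw scale: {raw_position:.4f}")
print(f"median position on log scale: {log_position:.2f}")
```

On the raw scale the median sits within the first 1% of the range, while on the log scale it lands near the middle. B21001_002E has a minimum of 0, so a plain logarithm would fail there; a shifted transform such as log10(x + 1) is one common workaround.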

NAME high anthropic:claude-opus-4-7

This column holds full U.S. county names with a state suffix (e.g. 'X County, Texas'): the token 'county,' appears in 2,999 of 3,144 rows, and the next most common tokens are state names such as Texas (256), Virginia (189), and Georgia (159). All 3,144 values are unique, with no nulls. Lengths run from 16 to 59 characters but concentrate near the median of 24 (95th percentile about 31). The 145 rows lacking the 'county,' token likely correspond to Louisiana parishes, Alaska boroughs and census areas, or independent cities, which is worth confirming.
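
One way to confirm which rows lack the 'county,' token is to split each name and test for it directly. The sketch below assumes lowercase, whitespace-based tokenization, which may differ from how saturn tokenizes. The first three names are sample values from this profile; the last two are illustrative names of real places that were not taken from the profiled file. The helpers `split_county_name` and `lacks_county_token` are hypothetical.

```python
def split_county_name(name):
    """Split 'X County, State' into (area, state) at the last ', '.

    If the separator is missing, area is '' and state is the whole string.
    """
    area, _, state = name.rpartition(", ")
    return area, state


def lacks_county_token(name):
    """True when the lowercase token 'county,' does not appear in the name."""
    return "county," not in name.lower().split()


names = [
    "Bibb County, Alabama",        # sample value from the profile
    "Day County, South Dakota",    # sample value from the profile
    "Sabine County, Texas",        # sample value from the profile
    "Acadia Parish, Louisiana",    # illustrative, not from the profile
    "Juneau City and Borough, Alaska",  # illustrative, not from the profile
]

for name in names:
    area, state = split_county_name(name)
    flag = "no 'county,' token" if lacks_county_token(name) else "county"
    print(f"{state:>12} | {area:<24} | {flag}")
```

Grouping the flagged rows by the state part would show whether the 145 exceptions fall mainly in Louisiana, Alaska, and Virginia.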

B23025_002E high anthropic:claude-opus-4-7

This is the ACS variable B23025_002E, the count of people aged 16 and over in the labor force, reported per row (likely one row per U.S. county, given n=3,144). The distribution is extremely right-skewed (skew 13.14, kurtosis 288.57): the median is 11,698 but the maximum is 5,240,842, and 449 rows (14.3%) are flagged as outliers, consistent with a few massive metros dwarfing thousands of small counties. There are no nulls or zeros (the minimum is 36), and 3,028 of 3,144 values are unique.
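
The outlier flag can be sanity-checked from the reported quartiles. The sketch below assumes saturn applies the usual Tukey rule (values below q1 - 1.5 × IQR or above q3 + 1.5 × IQR), which the "rows beyond 1.5 IQR" label suggests but does not confirm. The quartiles are the B23025_002E values from this profile; `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k interquartile ranges beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and minimum for B23025_002E as reported in this profile.
q1, q3, minimum = 4_722, 32_590, 36

lower, upper = tukey_fences(q1, q3)
print(f"lower fence: {lower:,.0f}")
print(f"upper fence: {upper:,.0f}")

# A negative lower fence cannot be crossed by a count, so under this rule
# every flagged row must lie above the upper fence.
print("all outliers on the high side:", minimum > lower)
```

Under this assumption, every one of the 449 flagged counties has a labor force above the upper fence, which says more about the skew of the distribution than about data errors.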

B21001_002E high anthropic:claude-opus-4-7

This is the ACS variable B21001_002E, the count of veterans in the civilian population aged 18 and over, per geographic unit (likely county, given n=3,144). Values span 0 to 244,160 with a median of 1,547.5 but a mean of 5,419; skew of 8.01 and kurtosis of 100.02 indicate a heavy right tail, with 408 rows (13.0%) flagged as outliers. There are no nulls, and the zero rate of 3.18e-04 across 3,144 rows corresponds to a single county reporting zero veterans.

B01003_001E high anthropic:claude-opus-4-7

This is the ACS variable B01003_001E, total population, reported for 3,144 rows, consistent with U.S. counties. Values span 50 to 9,936,690 with a median of 25,784.5 versus a mean of 105,310.94; skew of 13.17 and kurtosis of 289.76 confirm an extreme long right tail (440 outliers, 14.0%). There are no nulls or zeros, and 3,080 of 3,144 values are unique.

state high anthropic:claude-opus-4-7

This column holds 51 distinct integer codes ranging from 1 to 56 across 3144 rows with no nulls, matching the FIPS state code scheme (50 states plus DC, with gaps explaining why the max is 56). The near-uniform spread (IQR 27, skew -0.08, kurtosis -1.10) is consistent with a categorical identifier rather than a true numeric quantity. Row count of 3144 also aligns with US county-level data keyed by state.

county high anthropic:claude-opus-4-7

This column contains integer codes ranging from 1 to 840 across 3,144 rows with only 329 unique values, consistent with FIPS county codes, which are assigned within each state and therefore repeat across states. The distribution is heavily right-skewed (skew 2.84, kurtosis 11.38) with 176 outliers (5.6%), all above the upper fence because the lower fence is negative; this is expected when a handful of states use higher county numbers. Despite being stored as numeric, the values are categorical identifiers, not measurements, and they identify a county only in combination with the state code.

fips high anthropic:claude-opus-4-7

This is the U.S. county FIPS code: a 3144-row column with 3144 unique integer values, no nulls, ranging from 1001 to 56045. The distribution is uniform-like (skew -0.08, kurtosis -1.10) which is exactly what you'd expect from state-prefixed county identifiers, not a measured quantity. Treat it as a key, not a numeric feature.
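
The three identifier columns appear to be related in the standard way: the county FIPS code is the state code followed by a three-digit county code. The sketch below assumes that relationship holds in this file (the profile does not verify it) and checks it against the reported extremes: fips 1001 with state 1, and fips 56045 with state 56. The function names are hypothetical.

```python
def compose_fips(state, county):
    """Combine a state code and a county code into an integer county FIPS code."""
    return state * 1000 + county


def split_fips(fips):
    """Split an integer county FIPS code into (state, county)."""
    return divmod(fips, 1000)


def fips_label(fips):
    """Format a FIPS code as the conventional zero-padded 5-character string."""
    return f"{fips:05d}"


# Reported minimum and maximum of the fips column.
for fips in (1001, 56045):
    state, county = split_fips(fips)
    assert compose_fips(state, county) == fips
    print(f"{fips_label(fips)} -> state {state:02d}, county {county:03d}")
```

Storing the code as a zero-padded string rather than an integer keeps the leading zero for states 1 through 9 and discourages treating the key as a numeric feature.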

Numeric correlation

NAME text

100.0% of rows are unique strings
rows: 3,144
null: 0 (0.0%)
unique: 3,144
len_min: 16
len_max: 59
len_mean: 24.165
len_median: 24.000
len_p95: 30.850
word_mean: 3.224
word_median: 3.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 1,910
readability_flesch_mean: 6.826
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Bibb County, Alabama
  2. Day County, South Dakota
  3. Sabine County, Texas
  4. Fayette County, Texas
  5. Chisago County, Minnesota
  6. Dane County, Wisconsin
  7. Ramsey County, Minnesota
  8. Bath County, Virginia
  9. Freestone County, Texas
  10. Carroll County, Arkansas

B23025_002E numeric

skew=+13.14 | 14.3% of rows beyond 1.5 IQR
rows: 3,144
null: 0 (0.0%)
unique: 3,028
min: 36.000
max: 5,240,842
mean: 53,783
median: 11,698
std: 176,263
q1: 4,722
q3: 32,590
iqr: 27,868
skew: 13.138
kurtosis: 288.571
n_outliers: 449
outlier_rate: 0.143
zero_rate: 0.000

B21001_002E numeric

skew=+8.01 | 13.0% of rows beyond 1.5 IQR
rows: 3,144
null: 0 (0.0%)
unique: 2,424
min: 0.000
max: 244,160
mean: 5,419
median: 1,548
std: 13,105
q1: 634.750
q3: 4,428
iqr: 3,793
skew: 8.014
kurtosis: 100.022
n_outliers: 408
outlier_rate: 0.130
zero_rate: 3.18e-04

B01003_001E numeric

skew=+13.17 | 14.0% of rows beyond 1.5 IQR
rows: 3,144
null: 0 (0.0%)
unique: 3,080
min: 50.000
max: 9,936,690
mean: 105,311
median: 25,784
std: 333,792
q1: 10,836
q3: 68,080
iqr: 57,244
skew: 13.175
kurtosis: 289.761
n_outliers: 440
outlier_rate: 0.140
zero_rate: 0.000

state numeric

rows: 3,144
null: 0 (0.0%)
unique: 51
min: 1.000
max: 56.000
mean: 30.264
median: 29.000
std: 15.153
q1: 18.000
q3: 45.000
iqr: 27.000
skew: -0.081
kurtosis: -1.099
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

county numeric

skew=+2.84 | 5.6% of rows beyond 1.5 IQR
rows: 3,144
null: 0 (0.0%)
unique: 329
min: 1.000
max: 840.000
mean: 103.874
median: 79.000
std: 107.567
q1: 35.000
q3: 133.500
iqr: 98.500
skew: 2.841
kurtosis: 11.377
n_outliers: 176
outlier_rate: 0.056
zero_rate: 0.000

fips numeric

rows: 3,144
null: 0 (0.0%)
unique: 3,144
min: 1,001
max: 56,045
mean: 30,368
median: 29,174
std: 15,170
q1: 18,174
q3: 45,080
iqr: 26,905
skew: -0.079
kurtosis: -1.099
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000