veterans merged county analysis

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/html/datavis/data_trove/data/policy/veterans/merged_county_analysis.csv

Saturn profiled 3,144 rows across 18 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
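!pip install saturn-dissect
import subprocess

# Run the saturn CLI on the source CSV and write the findings JSON.
# check=True raises CalledProcessError if the command exits non-zero.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/html/datavis/data_trove/data/policy/veterans/merged_county_analysis.csv",
    "--findings", "veterans-merged_county_analysis.json",
    "--llm", "anthropic:claude-opus-4-7",
], check=True)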

Summary confidence: high

This dataset contains 3,144 rows — one per U.S. county — combining Census geographic identifiers (GEOID, STATE_NAME, NAMELSAD, ALAND, AWATER) with veteran and active-duty military estimates and rate-normalized fields. The raw count columns (total_pop, active_duty_est, veterans_est, ALAND) are extremely right-skewed with skew values above 8 and hundreds of outliers each, so any analysis on them should use logs or per-capita versions. The rate columns tell a cleaner story: active_duty_per_10k is roughly symmetric (skew -0.38, mean ~4,694 per 10k) while veterans_per_100 is mildly right-skewed (mean 6.19, max 18.09) and is the better candidate for ranking counties. State coverage is uneven — Texas alone supplies 254 counties (8.1%), followed by Georgia and Virginia — which matters when aggregating. Note also that LSAD is heavily imbalanced (95% code '06') and GEOID and fips are duplicates of each other.

citing: row_count · column_count · ALAND · total_pop · active_duty_est · veterans_est · active_duty_per_10k · veterans_per_100 · STATE_NAME · LSAD · NAMELSAD
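The summary recommends per-capita versions of the raw counts. A minimal sketch of such a rate follows, assuming the rate columns are simple ratios against total_pop; the report does not state how veterans_per_100 or active_duty_per_10k were actually derived, and `rate_per` and the sample figures are hypothetical.

```python
def rate_per(count, population, scale):
    """Return `count` per `scale` residents, or None when population is not positive."""
    if population <= 0:
        return None
    return count / population * scale

# Hypothetical county: 1,548 veterans among 25,784 residents (illustrative values only).
veterans_per_100 = rate_per(1548, 25784, 100)

# A zero-population row yields None instead of raising ZeroDivisionError.
missing = rate_per(5, 0, 100)
```

Returning None for an empty denominator keeps such rows visible as missing rather than silently turning them into zeros.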

Out[4]:

saturn.schema() · 18 columns

column               kind         n      null%  unique  alerts
STATEFP              numeric      3,144  0.0%   51
COUNTYFP             numeric      3,144  0.0%   329     high_skew, outliers
COUNTYNS             numeric      3,144  0.0%   3,144
GEOIDFQ              text         3,144  0.0%   3,144   near_unique, one_word, allcaps, short_text
GEOID                numeric      3,144  0.0%   3,144
NAME                 text         3,144  0.0%   1,838   one_word, short_text, duplicates
NAMELSAD             text         3,144  0.0%   1,882   short_text, duplicates
STUSPS               categorical  3,144  0.0%   51
STATE_NAME           categorical  3,144  0.0%   51
LSAD                 categorical  3,144  0.0%   9       imbalance
ALAND                numeric      3,144  0.0%   3,144   high_skew, outliers
AWATER               numeric      3,144  0.0%   3,144   high_skew, outliers
fips                 numeric      3,144  0.0%   3,144
active_duty_est      numeric      3,144  0.0%   3,028   high_skew, outliers
veterans_est         numeric      3,144  0.0%   2,424   high_skew, outliers
total_pop            numeric      3,144  0.0%   3,080   high_skew, outliers
active_duty_per_10k  numeric      3,144  0.0%   3,144
veterans_per_100     numeric      3,144  0.0%   3,143
Fig 1.
STATE_NAME · Counties per state — Texas, Georgia and Virginia dominate the row counts.
Top values for STATE_NAME (20 unique shown, of 51 total).
value           count  share
Texas             254   8.1%
Georgia           159   5.1%
Virginia          133   4.2%
Kentucky          120   3.8%
Missouri          115   3.7%
Kansas            105   3.3%
Illinois          102   3.2%
North Carolina    100   3.2%
Iowa               99   3.1%
Tennessee          95   3.0%
Nebraska           93   3.0%
Indiana            92   2.9%
Ohio               88   2.8%
Minnesota          87   2.8%
Michigan           83   2.6%
Mississippi        82   2.6%
Oklahoma           77   2.4%
Arkansas           75   2.4%
Wisconsin          72   2.3%
Alabama            67   2.1%
Fig 2.
veterans_per_100 · Distribution of veterans as a share of population; most counties cluster between 5 and 7 per 100 with a long right tail.
Histogram bins for veterans_per_100 (median: 5.984985213609011).
bin              count
0 – 0.4522           1
0.4522 – 0.9043      0
0.9043 – 1.357       6
1.357 – 1.809       13
1.809 – 2.261       21
2.261 – 2.713       34
2.713 – 3.165       60
3.165 – 3.617       84
3.617 – 4.07       137
4.07 – 4.522       187
4.522 – 4.974      271
4.974 – 5.426      319
5.426 – 5.878      346
5.878 – 6.33       358
6.33 – 6.783       313
6.783 – 7.235      259
7.235 – 7.687      173
7.687 – 8.139      128
8.139 – 8.591       94
8.591 – 9.043       88
9.043 – 9.496       60
9.496 – 9.948       38
9.948 – 10.4        39
10.4 – 10.85        26
10.85 – 11.3        16
11.3 – 11.76        21
11.76 – 12.21       14
12.21 – 12.66       15
12.66 – 13.11        4
13.11 – 13.57        6
13.57 – 14.02        6
14.02 – 14.47        1
14.47 – 14.92        1
14.92 – 15.37        3
15.37 – 15.83        0
15.83 – 16.28        0
16.28 – 16.73        1
16.73 – 17.18        0
17.18 – 17.63        0
17.63 – 18.09        1
Fig 3.
active_duty_per_10k · Active-duty rate per 10k is roughly symmetric around ~4,700, useful as a normalized comparison metric.
Histogram bins for active_duty_per_10k (median: 4733.279942644007).
bin          count
1708 – 1845      1
1845 – 1983      0
1983 – 2120      1
2120 – 2257      0
2257 – 2395      1
2395 – 2532      1
2532 – 2669      1
2669 – 2807      6
2807 – 2944      7
2944 – 3081      9
3081 – 3218     21
3218 – 3356     20
3356 – 3493     29
3493 – 3630     57
3630 – 3768     52
3768 – 3905     92
3905 – 4042    111
4042 – 4179    165
4179 – 4317    193
4317 – 4454    224
4454 – 4591    267
4591 – 4729    303
4729 – 4866    280
4866 – 5003    301
5003 – 5141    277
5141 – 5278    268
5278 – 5415    177
5415 – 5552    136
5552 – 5690     61
5690 – 5827     37
5827 – 5964     18
5964 – 6102      8
6102 – 6239      5
6239 – 6376      5
6376 – 6514      3
6514 – 6651      2
6651 – 6788      2
6788 – 6925      0
6925 – 7063      0
7063 – 7200      3
Fig 4.
total_pop · County population is extremely right-skewed (skew ~13); plot on a log scale to see structure.
Histogram bins for total_pop (median: 25784.5).
bin                    count
50 – 2.485e+05          2863
2.485e+05 – 4.969e+05    137
4.969e+05 – 7.453e+05     57
7.453e+05 – 9.937e+05     37
9.937e+05 – 1.242e+06     14
1.242e+06 – 1.491e+06     10
1.491e+06 – 1.739e+06      7
1.739e+06 – 1.987e+06      3
1.987e+06 – 2.236e+06      3
2.236e+06 – 2.484e+06      4
2.484e+06 – 2.733e+06      3
2.733e+06 – 2.981e+06      0
2.981e+06 – 3.229e+06      1
3.229e+06 – 3.478e+06      1
3.478e+06 – 3.726e+06      0
3.726e+06 – 3.975e+06      0
3.975e+06 – 4.223e+06      0
4.223e+06 – 4.472e+06      1
4.472e+06 – 4.72e+06       0
4.72e+06 – 4.968e+06       1
4.968e+06 – 5.217e+06      0
5.217e+06 – 5.465e+06      1
5.465e+06 – 5.714e+06      0
5.714e+06 – 5.962e+06      0
5.962e+06 – 6.21e+06       0
6.21e+06 – 6.459e+06       0
6.459e+06 – 6.707e+06      0
6.707e+06 – 6.956e+06      0
6.956e+06 – 7.204e+06      0
7.204e+06 – 7.453e+06      0
7.453e+06 – 7.701e+06      0
7.701e+06 – 7.949e+06      0
7.949e+06 – 8.198e+06      0
8.198e+06 – 8.446e+06      0
8.446e+06 – 8.695e+06      0
8.695e+06 – 8.943e+06      0
8.943e+06 – 9.191e+06      0
9.191e+06 – 9.44e+06       0
9.44e+06 – 9.688e+06       0
9.688e+06 – 9.937e+06      1
Fig 5.
LSAD · Legal/statistical area descriptor is dominated by code '06' (~95%), so this column adds little signal.
Top values for LSAD (9 unique shown, of 9 total).
value  count  share
06      2999  95.4%
15        64   2.0%
25        40   1.3%
04        13   0.4%
05        11   0.3%
PL         9   0.3%
03         4   0.1%
00         2   0.1%
12         2   0.1%
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
column               kind         null %
STATEFP              numeric      0.0%
COUNTYFP             numeric      0.0%
COUNTYNS             numeric      0.0%
GEOIDFQ              text         0.0%
GEOID                numeric      0.0%
NAME                 text         0.0%
NAMELSAD             text         0.0%
STUSPS               categorical  0.0%
STATE_NAME           categorical  0.0%
LSAD                 categorical  0.0%
ALAND                numeric      0.0%
AWATER               numeric      0.0%
fips                 numeric      0.0%
active_duty_est      numeric      0.0%
veterans_est         numeric      0.0%
total_pop            numeric      0.0%
active_duty_per_10k  numeric      0.0%
veterans_per_100     numeric      0.0%
Fig 7.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 12 numeric columns (values clipped to 2 decimals).
Columns: (1) STATEFP, (2) COUNTYFP, (3) COUNTYNS, (4) GEOID, (5) ALAND, (6) AWATER, (7) fips, (8) active_duty_est, (9) veterans_est, (10) total_pop, (11) active_duty_per_10k, (12) veterans_per_100
                       (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)
STATEFP              +1.00  +0.21  +0.76  +1.00  -0.11  -0.21  +1.00  -0.05  -0.04  -0.05  +0.06  +0.14
COUNTYFP             +0.21  +1.00  +0.17  +0.22  -0.11  -0.03  +0.22  -0.02  +0.01  -0.02  +0.05  +0.09
COUNTYNS             +0.76  +0.17  +1.00  +0.76  +0.10  +0.09  +0.76  -0.06  -0.05  -0.06  +0.09  +0.19
GEOID                +1.00  +0.22  +0.76  +1.00  -0.11  -0.21  +1.00  -0.05  -0.04  -0.05  +0.06  +0.14
ALAND                -0.11  -0.11  +0.10  -0.11  +1.00  +0.43  -0.11  +0.07  +0.10  +0.07  -0.05  +0.03
AWATER               -0.21  -0.03  +0.09  -0.21  +0.43  +1.00  -0.21  +0.01  +0.02  +0.01  +0.20  -0.01
fips                 +1.00  +0.22  +0.76  +1.00  -0.11  -0.21  +1.00  -0.05  -0.04  -0.05  +0.06  +0.14
active_duty_est      -0.05  -0.02  -0.06  -0.05  +0.07  +0.01  -0.05  +1.00  +0.94  +1.00  +0.27  -0.11
veterans_est         -0.04  +0.01  -0.05  -0.04  +0.10  +0.02  -0.04  +0.94  +1.00  +0.95  +0.24  +0.02
total_pop            -0.05  -0.02  -0.06  -0.05  +0.07  +0.01  -0.05  +1.00  +0.95  +1.00  +0.26  -0.11
active_duty_per_10k  +0.06  +0.05  +0.09  +0.06  -0.05  +0.20  +0.06  +0.27  +0.24  +0.26  +1.00  -0.04
veterans_per_100     +0.14  +0.09  +0.19  +0.14  +0.03  -0.01  +0.14  -0.11  +0.02  -0.11  -0.04  +1.00
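The matrix shows the raw count columns correlating near +1.00 with total_pop, while the rate columns do not. A minimal sketch of the Pearson coefficient illustrates why counts that scale with population move together; the `pearson` helper and the four sample counties are made up for illustration and are not saturn's implementation (which the report describes only as "sampled, bounded").

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical counties: veteran counts grow with population.
pop = [1_000, 5_000, 20_000, 100_000]
vets = [80, 300, 1_300, 5_500]
rate = [v / p * 100 for v, p in zip(vets, pop)]

r_counts = pearson(pop, vets)  # driven almost entirely by county size
r_rate = pearson(pop, rate)    # size effect removed by normalizing
```

Comparing `r_counts` with `r_rate` on real data is a quick way to check how much of a count-to-count correlation is only a size effect.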

STATEFP numeric foreign_key

This is the US Census STATEFP code, the two-digit state FIPS identifier, stored numerically so leading zeros are dropped. Values range from 1 to 56 with 51 unique entries across 3,144 rows, matching the 50 US states plus DC, and the row count aligns with the number of US counties. The near-uniform spread (skew -0.08, kurtosis -1.10) and zero outliers are consistent with a categorical state code rather than a measured quantity.

Treatment: Cast to zero-padded string and treat as a categorical state key for joins, not a numeric feature.

anthropic:claude-opus-4-7 · confidence high
Out[13]:

saturn.columns["STATEFP"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 51
min 1
max 56
mean 30.26
median 29
std 15.15
q1 18
q3 45
iqr 27
skew -0.08128
kurtosis -1.099
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 8.
Distribution of STATEFP. Vertical dash marks the median.
Histogram bins for STATEFP (median: 29.0).
bincount
1 – 2.37597
2.375 – 3.750
3.75 – 5.12590
5.125 – 6.558
6.5 – 7.8750
7.875 – 9.2573
9.25 – 10.623
10.62 – 121
12 – 13.38226
13.38 – 14.750
14.75 – 16.1249
16.12 – 17.5102
17.5 – 18.8892
18.88 – 20.25204
20.25 – 21.62120
21.62 – 2364
23 – 24.3840
24.38 – 25.7514
25.75 – 27.12170
27.12 – 28.582
28.5 – 29.88115
29.88 – 31.25149
31.25 – 32.6217
32.62 – 3410
34 – 35.3854
35.38 – 36.7562
36.75 – 38.12153
38.12 – 39.588
39.5 – 40.8877
40.88 – 42.25103
42.25 – 43.620
43.62 – 455
45 – 46.38112
46.38 – 47.7595
47.75 – 49.12283
49.12 – 50.514
50.5 – 51.88133
51.88 – 53.2539
53.25 – 54.6255
54.62 – 5695

COUNTYFP numeric identifier

COUNTYFP is the 3-digit FIPS county code, stored numerically across 3144 rows with no nulls and 329 unique values. The distribution is heavily right-skewed (skew 2.84, kurtosis 11.4) with a max of 840 well beyond Q3 of 133.5, flagging 176 outliers — expected behavior since FIPS codes are categorical identifiers, not measurements, and high values correspond to specific county assignments.

Treatment: Cast to zero-padded string and combine with STATEFP to form a 5-digit GEOID join key; do not treat as numeric.
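The treatment above can be sketched as follows, assuming STATEFP and COUNTYFP arrive as plain integers as the schema reports. `make_geoid` is a hypothetical helper name, not part of saturn.

```python
def make_geoid(statefp: int, countyfp: int) -> str:
    """Build a 5-character county GEOID: 2-digit state code + 3-digit county code."""
    if not (0 <= statefp <= 99 and 0 <= countyfp <= 999):
        raise ValueError("statefp must fit 2 digits and countyfp 3 digits")
    return f"{statefp:02d}{countyfp:03d}"

# The smallest and largest GEOID values reported in the stats (1001 and 56045).
first = make_geoid(1, 1)
last = make_geoid(56, 45)
```

Keeping the result as a string preserves the leading zero that the numeric GEOID column loses for state codes below 10.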

anthropic:claude-opus-4-7 · confidence high
Out[16]:

saturn.columns["COUNTYFP"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 329
min 1
max 840
mean 103.9
median 79
std 107.6
q1 35
q3 133.5
iqr 98.5
skew 2.841
kurtosis 11.38
n_outliers 176
outlier_rate 0.05598
zero_rate 0
alert: high_skew · skew=+2.84
alert: outliers · 5.6% rows beyond 1.5 IQR
Fig 9.
Distribution of COUNTYFP. Vertical dash marks the median.
Histogram bins for COUNTYFP (median: 79.0).
bincount
1 – 21.98512
21.98 – 42.95408
42.95 – 63.93399
63.93 – 84.9335
84.9 – 105.9341
105.9 – 126.9271
126.9 – 147.8225
147.8 – 168.8165
168.8 – 189.8140
189.8 – 210.871
210.8 – 231.745
231.7 – 252.725
252.7 – 273.722
273.7 – 294.723
294.7 – 315.622
315.6 – 336.613
336.6 – 357.611
357.6 – 378.610
378.6 – 399.511
399.5 – 420.510
420.5 – 441.511
441.5 – 462.510
462.5 – 483.411
483.4 – 504.410
504.4 – 525.47
525.4 – 546.42
546.4 – 567.31
567.3 – 588.32
588.3 – 609.33
609.3 – 630.23
630.2 – 651.22
651.2 – 672.22
672.2 – 693.25
693.2 – 714.22
714.2 – 735.13
735.1 – 756.12
756.1 – 777.13
777.1 – 798.11
798.1 – 8192
819 – 8403

COUNTYNS numeric identifier

COUNTYNS is the Census Bureau's permanent numeric ANSI/GNIS identifier for U.S. counties: all 3144 values are unique with no nulls or zeros, and the range (23901 to 2830254) matches the GNIS ID space. The distribution is broad but unremarkable (skew 0.17, kurtosis -0.80), as expected for an ID code rather than a measurement.

Treatment: Treat as a county-level key for joins; do not use as a numeric feature.

anthropic:claude-opus-4-7 · confidence high
Out[19]:

saturn.columns["COUNTYNS"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
min 23,901
max 2.83e+06
mean 9.503e+05
median 9.741e+05
std 5.168e+05
q1 4.85e+05
q3 1.384e+06
iqr 8.99e+05
skew 0.1721
kurtosis -0.8015
n_outliers 11
outlier_rate 0.003499
zero_rate 0
Fig 10.
Distribution of COUNTYNS. Vertical dash marks the median.
Histogram bins for COUNTYNS (median: 974128.5).
bincount
2.39e+04 – 9.406e+0490
9.406e+04 – 1.642e+0566
1.642e+05 – 2.344e+0567
2.344e+05 – 3.045e+0586
3.045e+05 – 3.747e+05161
3.747e+05 – 4.449e+05114
4.449e+05 – 5.15e+05296
5.15e+05 – 5.852e+05191
5.852e+05 – 6.553e+0521
6.553e+05 – 7.255e+05169
7.255e+05 – 7.956e+05116
7.956e+05 – 8.658e+05110
8.658e+05 – 9.36e+0553
9.36e+05 – 1.006e+0664
1.006e+06 – 1.076e+06241
1.076e+06 – 1.146e+0699
1.146e+06 – 1.217e+0681
1.217e+06 – 1.287e+06117
1.287e+06 – 1.357e+060
1.357e+06 – 1.427e+06278
1.427e+06 – 1.497e+06100
1.497e+06 – 1.567e+06121
1.567e+06 – 1.638e+06187
1.638e+06 – 1.708e+06193
1.708e+06 – 1.778e+0663
1.778e+06 – 1.848e+0643
1.848e+06 – 1.918e+060
1.918e+06 – 1.988e+061
1.988e+06 – 2.059e+061
2.059e+06 – 2.129e+060
2.129e+06 – 2.199e+060
2.199e+06 – 2.269e+060
2.269e+06 – 2.339e+060
2.339e+06 – 2.409e+062
2.409e+06 – 2.479e+060
2.479e+06 – 2.55e+062
2.55e+06 – 2.62e+060
2.62e+06 – 2.69e+060
2.69e+06 – 2.76e+060
2.76e+06 – 2.83e+0611

GEOIDFQ text identifier

This is the Census Bureau's fully-qualified GEOID (GEOIDFQ) for U.S. counties: every value is exactly 14 characters, single-token, all-caps, and consists of the `0500000US` summary-level prefix followed by a 5-digit state+county FIPS code. All 3,144 rows are unique with no nulls or duplicates, matching the count of U.S. counties. Vocab size equals row count (3,144), confirming it is a pure identifier with no analytical signal of its own.

Treatment: Use as a left-join key against Census geographies; do not feed into models.
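To join against tables keyed on the short 5-digit code, the prefix can be stripped. This is a minimal sketch based only on the layout described above (a 9-character `0500000US` prefix plus 5 digits); `parse_geoidfq` is a hypothetical helper and the sample value is illustrative.

```python
COUNTY_PREFIX = "0500000US"  # summary-level prefix reported for every row

def parse_geoidfq(value: str) -> tuple[str, str]:
    """Split a county GEOIDFQ into (state_fips, county_fips) strings."""
    if len(value) != 14 or not value.startswith(COUNTY_PREFIX):
        raise ValueError(f"not a county GEOIDFQ: {value!r}")
    geoid = value[len(COUNTY_PREFIX):]
    if not (geoid.isascii() and geoid.isdigit()):
        raise ValueError(f"non-numeric FIPS part: {geoid!r}")
    return geoid[:2], geoid[2:]

state_fips, county_fips = parse_geoidfq("0500000US01001")
```

Validating length and prefix first means a value from a different summary level fails loudly instead of producing a wrong key.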

anthropic:claude-opus-4-7 · confidence high
Out[22]:

saturn.columns["GEOIDFQ"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
len_min 14
len_max 14
len_mean 14
len_median 14
len_p95 14
word_mean 1
word_median 1
n_empty 0
n_duplicates 0
duplicate_rate 0
vocab_size 3,144
readability_flesch_mean 121.2
emoji_rate 0
url_rate 0
one_word_rate 1
allcaps_rate 1
boilerplate_rate 0
alert: near_unique · 100.0% of rows are unique strings
alert: one_word · 100.0% rows are a single word
alert: allcaps · 100.0% rows are all-caps
alert: short_text · 95th-percentile length under 20 chars
Fig 11.
Character-length distribution for GEOIDFQ.
Character-length distribution for GEOIDFQ (mean: 14.0).
chars  count
14     3,144

GEOID numeric identifier

GEOID is the 5-digit FIPS code identifying US counties: every one of the 3,144 rows is unique with no nulls, and the range 1001 to 56045 matches the state+county FIPS convention (Alabama through Wyoming). The near-zero skew (-0.08) and flat kurtosis (-1.10) reflect roughly uniform coverage across state codes rather than any meaningful distribution. Treating these as numbers is misleading—they are categorical keys.

Treatment: Cast to zero-padded string and use as a join key to county-level geographies; do not model as numeric.

anthropic:claude-opus-4-7 · confidence high
Out[25]:

saturn.columns["GEOID"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
min 1,001
max 56,045
mean 3.037e+04
median 29,174
std 1.517e+04
q1 1.817e+04
q3 4.508e+04
iqr 26,905
skew -0.07923
kurtosis -1.099
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 12.
Distribution of GEOID. Vertical dash marks the median.
Histogram bins for GEOID (median: 29174.0).
bincount
1001 – 237797
2377 – 37530
3753 – 512980
5129 – 650568
6505 – 78820
7882 – 925873
9258 – 1.063e+043
1.063e+04 – 1.201e+046
1.201e+04 – 1.339e+04221
1.339e+04 – 1.476e+040
1.476e+04 – 1.614e+0449
1.614e+04 – 1.751e+04102
1.751e+04 – 1.889e+0492
1.889e+04 – 2.027e+04204
2.027e+04 – 2.164e+04120
2.164e+04 – 2.302e+0473
2.302e+04 – 2.439e+0430
2.439e+04 – 2.577e+0415
2.577e+04 – 2.715e+04156
2.715e+04 – 2.852e+0496
2.852e+04 – 2.99e+04115
2.99e+04 – 3.128e+04149
3.128e+04 – 3.265e+0417
3.265e+04 – 3.403e+0424
3.403e+04 – 3.54e+0440
3.54e+04 – 3.678e+0462
3.678e+04 – 3.816e+04153
3.816e+04 – 3.953e+0488
3.953e+04 – 4.091e+0477
4.091e+04 – 4.228e+04103
4.228e+04 – 4.366e+040
4.366e+04 – 4.504e+0423
4.504e+04 – 4.641e+0494
4.641e+04 – 4.779e+0495
4.779e+04 – 4.916e+04283
4.916e+04 – 5.054e+0414
5.054e+04 – 5.192e+04133
5.192e+04 – 5.329e+0439
5.329e+04 – 5.467e+0455
5.467e+04 – 5.604e+0495

NAME text label

This column holds short place names — almost certainly US county names, given the dominance of 'Washington' (31), 'Franklin' (26), 'Jefferson' (26), 'Lincoln' (24) and 'Madison' (20), all classic county namesakes. Values are overwhelmingly single-word (one_word_rate 0.934, word_mean 1.07) and short (len_mean 7.0, len_max 30), with no nulls. The 41.5% duplicate_rate is expected here: the same county name recurs across states, so 3144 rows collapse to 1838 unique strings.

Treatment: Treat as a non-unique name; pair with a state/FIPS column before joining or grouping.
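A minimal sketch of why the bare name is ambiguous and how pairing it with a state code helps. The rows are made up for illustration. Even the (name, state) pair may not be unique in practice, because some states have an independent city and a county sharing one name, so a FIPS-based key remains the safer join column.

```python
from collections import Counter

# Hypothetical (NAME, STUSPS) rows; not taken from the dataset.
rows = [
    ("Washington", "AL"),
    ("Washington", "GA"),
    ("Franklin", "AL"),
    ("Franklin", "GA"),
    ("Autauga", "AL"),
]

name_counts = Counter(name for name, _state in rows)
pair_counts = Counter(rows)

# Names that appear in more than one row cannot identify a county alone.
ambiguous_names = sorted(n for n, c in name_counts.items() if c > 1)

# Pairs that still collide after adding the state need a FIPS key instead.
colliding_pairs = sorted(p for p, c in pair_counts.items() if c > 1)
```

Running the same two counters on the real NAME and STUSPS columns would show whether any pair still collides before grouping on it.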

anthropic:claude-opus-4-7 · confidence high
Out[28]:

saturn.columns["NAME"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 1,838
len_min 3
len_max 30
len_mean 7.05
len_median 7
len_p95 11
word_mean 1.072
word_median 1
n_empty 0
n_duplicates 1,306
duplicate_rate 0.4154
vocab_size 1,875
readability_flesch_mean 36.74
emoji_rate 0
url_rate 0
one_word_rate 0.9342
allcaps_rate 0
boilerplate_rate 0
alert: one_word · 93.4% rows are a single word
alert: short_text · 95th-percentile length under 20 chars
alert: duplicates · 41.5% duplicate strings
Fig 13.
Character-length distribution for NAME.
Character-length distribution for NAME (mean: 7.049618320610687).
charscount
3 – 427
4 – 4254
4 – 5460
5 – 60
6 – 6680
6 – 7592
7 – 80
8 – 8486
8 – 9282
9 – 100
10 – 10203
10 – 1155
11 – 120
12 – 1244
12 – 1318
13 – 140
14 – 1414
14 – 159
15 – 160
16 – 165
16 – 173
17 – 180
18 – 192
19 – 192
19 – 200
20 – 213
21 – 211
21 – 220
22 – 230
23 – 230
23 – 240
24 – 252
25 – 251
25 – 260
26 – 270
27 – 270
27 – 280
28 – 290
29 – 290
29 – 301

NAMELSAD text label

This is the full legal name of US county-equivalents (NAMELSAD from Census TIGER), with 'county' appearing 2999 times alongside 64 'parish' and 47 'city' entries reflecting Louisiana and independent-city conventions. Names are short (mean 14.08 chars, ~2 words) and heavily duplicated across states: 1262 duplicates (40.1%) driven by repeated names like 'Washington County' (30), 'Jefferson County' (25), and 'Franklin County' (24). Only 1882 unique values across 3144 rows, so this field alone does not identify a county.

Treatment: Use as a display label only; join on a state+FIPS code rather than this name to avoid duplicate collisions.

anthropic:claude-opus-4-7 · confidence high
Out[31]:

saturn.columns["NAMELSAD"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 1,882
len_min 10
len_max 46
len_mean 14.08
len_median 14
len_p95 18
word_mean 2.08
word_median 2
n_empty 0
n_duplicates 1,262
duplicate_rate 0.4014
vocab_size 1,883
readability_flesch_mean 35.29
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: short_text · 95th-percentile length under 20 chars
alert: duplicates · 40.1% duplicate strings
Fig 14.
Character-length distribution for NAMELSAD.
Character-length distribution for NAMELSAD (mean: 14.083651399491094).
charscount
10 – 1129
11 – 12255
12 – 13465
13 – 14682
14 – 14587
14 – 15483
15 – 16278
16 – 17200
17 – 1854
18 – 190
19 – 2043
20 – 2118
21 – 2212
22 – 2310
23 – 245
24 – 244
24 – 255
25 – 262
26 – 271
27 – 280
28 – 291
29 – 300
30 – 310
31 – 322
32 – 321
32 – 331
33 – 341
34 – 351
35 – 360
36 – 370
37 – 380
38 – 390
39 – 400
40 – 412
41 – 421
42 – 420
42 – 430
43 – 440
44 – 450
45 – 461

STUSPS categorical foreign_key

STUSPS is the USPS two-letter state abbreviation, with 51 distinct values across 3,144 rows — consistent with US states plus DC at the county grain. Distribution matches county counts: TX leads at 254 (8.08%), followed by GA (159), VA (133), and KY (120). No nulls and high entropy ratio (0.93) indicate clean, well-spread categorical coverage.

Treatment: Left-join on this code to state-level reference tables, or one-hot/target-encode for modelling.

anthropic:claude-opus-4-7 · confidence high
Out[34]:

saturn.columns["STUSPS"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 51
top_value TX
top_rate 0.08079
cardinality 51
entropy 5.277
entropy_ratio 0.9304
Fig 15.
Top values for STUSPS.
Top values for STUSPS (20 unique shown, of 51 total).
value  count  share
TX       254   8.1%
GA       159   5.1%
VA       133   4.2%
KY       120   3.8%
MO       115   3.7%
KS       105   3.3%
IL       102   3.2%
NC       100   3.2%
IA        99   3.1%
TN        95   3.0%
NE        93   3.0%
IN        92   2.9%
OH        88   2.8%
MN        87   2.8%
MI        83   2.6%
MS        82   2.6%
OK        77   2.4%
AR        75   2.4%
WI        72   2.3%
AL        67   2.1%

STATE_NAME categorical feature

STATE_NAME holds US state labels across 3,144 rows with exactly 51 unique values (50 states plus likely DC) and zero nulls. The distribution mirrors county counts per state: Texas leads at 254 (8.1%), followed by Georgia (159) and Virginia (133), consistent with this being one row per US county. Entropy ratio of 0.93 indicates a fairly even spread across states given their natural county-count differences.

Treatment: Use as a categorical grouping key or one-hot/target-encode for modelling.
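Because states contribute very different numbers of counties, a state-level rate depends on how counties are combined. The sketch below contrasts a pooled (population-weighted) rate with an unweighted mean of county rates. The `state_rates` helper and the two sample rows are hypothetical, and the rate formula is an assumption rather than the dataset's documented definition.

```python
from collections import defaultdict

def state_rates(rows):
    """rows: (state, veterans_est, total_pop) tuples.

    Returns {state: (pooled_rate, unweighted_mean_rate)}, both per 100 residents.
    Rows with a non-positive population are skipped.
    """
    sums = defaultdict(lambda: {"vets": 0, "pop": 0, "rate_sum": 0.0, "n": 0})
    for state, vets, pop in rows:
        if pop <= 0:
            continue
        s = sums[state]
        s["vets"] += vets
        s["pop"] += pop
        s["rate_sum"] += vets / pop * 100
        s["n"] += 1
    return {
        state: (s["vets"] / s["pop"] * 100, s["rate_sum"] / s["n"])
        for state, s in sums.items()
    }

# One small and one large hypothetical county in the same state.
rows = [("TX", 50, 1_000), ("TX", 4_000, 100_000)]
pooled, unweighted = state_rates(rows)["TX"]
```

The pooled figure answers "what share of the state's residents are veterans", while the unweighted mean answers "what does a typical county look like"; the two diverge whenever small counties have unusual rates.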

anthropic:claude-opus-4-7 · confidence high
Out[37]:

saturn.columns["STATE_NAME"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 51
top_value Texas
top_rate 0.08079
cardinality 51
entropy 5.277
entropy_ratio 0.9304
Fig 16.
Top values for STATE_NAME.
Top values for STATE_NAME (20 unique shown, of 51 total).
value           count  share
Texas             254   8.1%
Georgia           159   5.1%
Virginia          133   4.2%
Kentucky          120   3.8%
Missouri          115   3.7%
Kansas            105   3.3%
Illinois          102   3.2%
North Carolina    100   3.2%
Iowa               99   3.1%
Tennessee          95   3.0%
Nebraska           93   3.0%
Indiana            92   2.9%
Ohio               88   2.8%
Minnesota          87   2.8%
Michigan           83   2.6%
Mississippi        82   2.6%
Oklahoma           77   2.4%
Arkansas           75   2.4%
Wisconsin          72   2.3%
Alabama            67   2.1%

LSAD categorical metadata

LSAD is a Census Legal/Statistical Area Description code identifying the type of geographic entity for each of 3,144 rows. The distribution is extremely imbalanced: code '06' (county) accounts for 95.39% of rows, and the entropy ratio across the 9 distinct codes is only 0.117. The remaining eight codes are rare: '15' (64 rows) and '25' (40 rows) lead, and every other code has 13 rows or fewer.

Treatment: Collapse rare codes into an 'other' bucket or drop, since one value dominates.
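The imbalance figures and the suggested collapse can be sketched from the nine counts in the LSAD table. Two assumptions: the entropy formula (Shannon entropy in bits, divided by log2 of the cardinality) is inferred because it reproduces the reported 0.3707 and 0.1169, not taken from saturn's documentation; and the `min_count=40` threshold is an arbitrary illustrative choice.

```python
import math

# Counts copied from the LSAD top-values table (they sum to 3,144).
lsad_counts = {"06": 2999, "15": 64, "25": 40, "04": 13, "05": 11,
               "PL": 9, "03": 4, "00": 2, "12": 2}

def entropy_bits(counts):
    """Shannon entropy (base 2) of a value-count mapping."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c > 0)

def collapse_rare(counts, min_count, other="other"):
    """Merge every value with fewer than min_count rows into one bucket."""
    merged = {}
    for value, count in counts.items():
        key = value if count >= min_count else other
        merged[key] = merged.get(key, 0) + count
    return merged

entropy = entropy_bits(lsad_counts)
entropy_ratio = entropy / math.log2(len(lsad_counts))
collapsed = collapse_rare(lsad_counts, min_count=40)
```

With this threshold the six rarest codes merge into a single 41-row bucket, which is still only about 1.3% of rows, so dropping the column may be as reasonable as collapsing it.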

anthropic:claude-opus-4-7 · confidence high
Out[40]:

saturn.columns["LSAD"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 9
top_value 06
top_rate 0.9539
cardinality 9
entropy 0.3707
entropy_ratio 0.1169
alert: imbalance · top value is 95.4% of rows
Fig 17.
Top values for LSAD.
Top values for LSAD (9 unique shown, of 9 total).
value  count  share
06      2999  95.4%
15        64   2.0%
25        40   1.3%
04        13   0.4%
05        11   0.3%
PL         9   0.3%
03         4   0.1%
00         2   0.1%
12         2   0.1%

ALAND numeric feature

ALAND looks like land-area in square meters for 3,144 unique geographic units (matching the U.S. county count), with no nulls or zeros. The distribution is extremely right-skewed (skew 26.8, kurtosis 953) — the median is 1.59B while the max reaches 377B, and 11.5% of rows flag as outliers. A handful of very large areas dominate the mean (2.91B) versus the median.

Treatment: Log-transform before modelling to tame the heavy right tail.
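A minimal sketch of the log transform, assuming ALAND is in square meters as the description suggests. Converting to square kilometers first is an optional readability choice, and `log10_area_km2` is a hypothetical helper.

```python
import math

SQ_M_PER_SQ_KM = 1_000_000

def log10_area_km2(aland_sq_m: float) -> float:
    """Return log10 of a land area after converting square meters to square km."""
    if aland_sq_m <= 0:
        raise ValueError("log transform needs a strictly positive area")
    return math.log10(aland_sq_m / SQ_M_PER_SQ_KM)

# Min, median and max taken from the rounded ALAND stats table.
log_min = log10_area_km2(5.3e6)
log_median = log10_area_km2(1.594e9)
log_max = log10_area_km2(3.771e11)
```

On the log scale the reported range shrinks from about five orders of magnitude to a span of roughly 0.7 to 5.6, which is what makes the long tail manageable. A plain log is safe here only because the column has no zeros.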

anthropic:claude-opus-4-7 · confidence high
Out[43]:

saturn.columns["ALAND"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
min 5.3e+06
max 3.771e+11
mean 2.911e+09
median 1.594e+09
std 9.306e+09
q1 1.116e+09
q3 2.394e+09
iqr 1.277e+09
skew 26.82
kurtosis 953.2
n_outliers 362
outlier_rate 0.1151
zero_rate 0
alert: high_skew · skew=+26.82
alert: outliers · 11.5% rows beyond 1.5 IQR
Fig 18.
Distribution of ALAND. Vertical dash marks the median.
Histogram bins for ALAND (median: 1594401059.0).
bincount
5.3e+06 – 9.432e+093006
9.432e+09 – 1.886e+1097
1.886e+10 – 2.828e+1022
2.828e+10 – 3.771e+103
3.771e+10 – 4.714e+104
4.714e+10 – 5.656e+103
5.656e+10 – 6.599e+105
6.599e+10 – 7.542e+100
7.542e+10 – 8.484e+100
8.484e+10 – 9.427e+101
9.427e+10 – 1.037e+110
1.037e+11 – 1.131e+111
1.131e+11 – 1.225e+110
1.225e+11 – 1.32e+110
1.32e+11 – 1.414e+110
1.414e+11 – 1.508e+110
1.508e+11 – 1.603e+110
1.603e+11 – 1.697e+110
1.697e+11 – 1.791e+110
1.791e+11 – 1.885e+110
1.885e+11 – 1.98e+110
1.98e+11 – 2.074e+110
2.074e+11 – 2.168e+110
2.168e+11 – 2.262e+110
2.262e+11 – 2.357e+111
2.357e+11 – 2.451e+110
2.451e+11 – 2.545e+110
2.545e+11 – 2.639e+110
2.639e+11 – 2.734e+110
2.734e+11 – 2.828e+110
2.828e+11 – 2.922e+110
2.922e+11 – 3.016e+110
3.016e+11 – 3.111e+110
3.111e+11 – 3.205e+110
3.205e+11 – 3.299e+110
3.299e+11 – 3.394e+110
3.394e+11 – 3.488e+110
3.488e+11 – 3.582e+110
3.582e+11 – 3.676e+110
3.676e+11 – 3.771e+111

AWATER numeric feature

AWATER is the standard Census TIGER field for water-area in square meters, here at what looks like county granularity given n=3144 unique values. The distribution is extremely right-skewed (skew 13.18, kurtosis 210.8): the median is 19.4M but the mean is 222M and the max reaches 25.99B, with 440 outliers (14.0% of rows). One row is zero, and all 3144 values are unique, so this behaves like a continuous geographic feature rather than a key.

Treatment: Apply a log1p transform before any modelling to tame the 13.2 skew and heavy outlier tail.
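The log1p choice matters because one AWATER row is zero, and a plain logarithm is undefined there. A minimal sketch, with sample values taken from the rounded stats table (min, q1, median, q3, max); `log1p_all` is a hypothetical helper.

```python
import math

def log1p_all(values):
    """Apply log(1 + x) to each value; defined at zero, unlike a plain log."""
    if any(v < 0 for v in values):
        raise ValueError("log1p transform expects non-negative values")
    return [math.log1p(v) for v in values]

# min, q1, median, q3, max from the AWATER stats (rounded as reported).
sample = [0, 7.132e6, 1.939e7, 5.946e7, 2.599e10]
transformed = log1p_all(sample)
```

The zero row maps to exactly 0.0 and the ordering of values is preserved, so ranks and quantiles are unchanged while the tail is compressed.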

anthropic:claude-opus-4-7 · confidence high
Out[46]:

saturn.columns["AWATER"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
min 0
max 2.599e+10
mean 2.22e+08
median 1.939e+07
std 1.241e+09
q1 7.132e+06
q3 5.946e+07
iqr 5.233e+07
skew 13.18
kurtosis 210.8
n_outliers 440
outlier_rate 0.1399
zero_rate 0.0003181
alert: high_skew · skew=+13.18
alert: outliers · 14.0% rows beyond 1.5 IQR
Fig 19.
Distribution of AWATER. Vertical dash marks the median.
Histogram bins for AWATER (median: 19389010.5).
bincount
0 – 6.497e+082954
6.497e+08 – 1.299e+0989
1.299e+09 – 1.949e+0930
1.949e+09 – 2.599e+0925
2.599e+09 – 3.249e+0913
3.249e+09 – 3.898e+093
3.898e+09 – 4.548e+093
4.548e+09 – 5.198e+097
5.198e+09 – 5.848e+091
5.848e+09 – 6.497e+093
6.497e+09 – 7.147e+091
7.147e+09 – 7.797e+090
7.797e+09 – 8.447e+091
8.447e+09 – 9.096e+090
9.096e+09 – 9.746e+090
9.746e+09 – 1.04e+100
1.04e+10 – 1.105e+101
1.105e+10 – 1.17e+101
1.17e+10 – 1.235e+100
1.235e+10 – 1.299e+102
1.299e+10 – 1.364e+100
1.364e+10 – 1.429e+103
1.429e+10 – 1.494e+102
1.494e+10 – 1.559e+101
1.559e+10 – 1.624e+100
1.624e+10 – 1.689e+100
1.689e+10 – 1.754e+100
1.754e+10 – 1.819e+100
1.819e+10 – 1.884e+100
1.884e+10 – 1.949e+100
1.949e+10 – 2.014e+100
2.014e+10 – 2.079e+100
2.079e+10 – 2.144e+101
2.144e+10 – 2.209e+100
2.209e+10 – 2.274e+101
2.274e+10 – 2.339e+100
2.339e+10 – 2.404e+100
2.404e+10 – 2.469e+100
2.469e+10 – 2.534e+101
2.534e+10 – 2.599e+101

fips numeric identifier

This is the 5-digit US county FIPS code: every one of the 3144 rows is unique, there are no nulls, and the range 1001–56045 spans the standard state+county encoding. The distribution is essentially uniform across the code space (skew −0.08, kurtosis −1.10), as expected for an identifier rather than a measurement.

Treatment: Left-join on this id; do not use as a numeric feature.
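The fips stats match the GEOID stats exactly, which suggests the two columns are duplicates. A minimal sketch of confirming that row by row and keeping one zero-padded string key; the helper names and the two sample rows are hypothetical (the values are the reported min and max).

```python
def columns_identical(rows, a, b):
    """True when every row holds the same value in columns a and b."""
    return all(row[a] == row[b] for row in rows)

def to_county_key(code):
    """Render a numeric county code as a zero-padded 5-character string."""
    return f"{int(code):05d}"

# Two illustrative rows using the reported min and max codes.
rows = [
    {"GEOID": 1001, "fips": 1001},
    {"GEOID": 56045, "fips": 56045},
]

if columns_identical(rows, "GEOID", "fips"):
    # Keep one string key and drop the redundant numeric columns.
    keyed = [{"county_key": to_county_key(r["fips"])} for r in rows]
else:
    keyed = None  # columns disagree somewhere; inspect before dropping either
```

Matching summary statistics do not prove row-level equality, which is why the check compares values row by row before anything is dropped.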

anthropic:claude-opus-4-7 · confidence high
Out[49]:

saturn.columns["fips"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
min 1,001
max 56,045
mean 3.037e+04
median 29,174
std 1.517e+04
q1 1.817e+04
q3 4.508e+04
iqr 26,905
skew -0.07923
kurtosis -1.099
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 20.
Distribution of fips. Vertical dash marks the median.
Histogram bins for fips (median: 29174.0).
bincount
1001 – 237797
2377 – 37530
3753 – 512980
5129 – 650568
6505 – 78820
7882 – 925873
9258 – 1.063e+043
1.063e+04 – 1.201e+046
1.201e+04 – 1.339e+04221
1.339e+04 – 1.476e+040
1.476e+04 – 1.614e+0449
1.614e+04 – 1.751e+04102
1.751e+04 – 1.889e+0492
1.889e+04 – 2.027e+04204
2.027e+04 – 2.164e+04120
2.164e+04 – 2.302e+0473
2.302e+04 – 2.439e+0430
2.439e+04 – 2.577e+0415
2.577e+04 – 2.715e+04156
2.715e+04 – 2.852e+0496
2.852e+04 – 2.99e+04115
2.99e+04 – 3.128e+04149
3.128e+04 – 3.265e+0417
3.265e+04 – 3.403e+0424
3.403e+04 – 3.54e+0440
3.54e+04 – 3.678e+0462
3.678e+04 – 3.816e+04153
3.816e+04 – 3.953e+0488
3.953e+04 – 4.091e+0477
4.091e+04 – 4.228e+04103
4.228e+04 – 4.366e+040
4.366e+04 – 4.504e+0423
4.504e+04 – 4.641e+0494
4.641e+04 – 4.779e+0495
4.779e+04 – 4.916e+04283
4.916e+04 – 5.054e+0414
5.054e+04 – 5.192e+04133
5.192e+04 – 5.329e+0439
5.329e+04 – 5.467e+0455
5.467e+04 – 5.604e+0495

active_duty_est numeric feature

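Numeric estimate of active-duty population per record, with 3,144 rows and 3,028 unique values suggesting one row per geographic unit (likely county-level given the row count). The distribution is severely right-skewed (skew 13.14, kurtosis 288.57): the median is 11,698 but the mean is 53,782.95 and the max reaches 5,240,842, with 449 outliers (14.3%). There are no nulls or zeros, and the IQR of 27,868 is dwarfed by a std of 176,262.59. The column correlates +1.00 with total_pop and its median is roughly 45% of the median total_pop (25,784.5), which is high for an active-duty count, so confirm the column's definition before interpreting it.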

Treatment: Log-transform before modelling to tame the heavy right tail.
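The outlier alert for this column reads "rows beyond 1.5 IQR". A minimal sketch of the usual Tukey fences built from the reported quartiles follows, assuming that is the rule saturn applies (the report does not spell it out); `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (low, high) cut-offs at k * IQR below q1 and above q3."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# q1 = 4722 and iqr = 27,868 as reported, so q3 = 32,590.
low_fence, high_fence = tukey_fences(4722, 32590)

def is_outlier(value):
    return value < low_fence or value > high_fence
```

Under this assumption the low fence is negative while the column minimum is 36, so every flagged row would sit on the high side, above 74,392.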

anthropic:claude-opus-4-7 · confidence high
Out[52]:

saturn.columns["active_duty_est"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,028
min 36
max 5.241e+06
mean 5.378e+04
median 11,698
std 1.763e+05
q1 4722
q3 3.259e+04
iqr 27,868
skew 13.14
kurtosis 288.6
n_outliers 449
outlier_rate 0.1428
zero_rate 0
alert: high_skew · skew=+13.14
alert: outliers · 14.3% rows beyond 1.5 IQR
Fig 21.
Distribution of active_duty_est. Vertical dash marks the median.
Histogram bins for active_duty_est (median: 11698.0).
bincount
36 – 1.311e+052867
1.311e+05 – 2.621e+05135
2.621e+05 – 3.931e+0552
3.931e+05 – 5.241e+0541
5.241e+05 – 6.551e+0514
6.551e+05 – 7.862e+0510
7.862e+05 – 9.172e+055
9.172e+05 – 1.048e+065
1.048e+06 – 1.179e+064
1.179e+06 – 1.31e+062
1.31e+06 – 1.441e+063
1.441e+06 – 1.572e+060
1.572e+06 – 1.703e+061
1.703e+06 – 1.834e+061
1.834e+06 – 1.965e+060
1.965e+06 – 2.096e+060
2.096e+06 – 2.227e+060
2.227e+06 – 2.358e+061
2.358e+06 – 2.489e+061
2.489e+06 – 2.62e+060
2.62e+06 – 2.751e+060
2.751e+06 – 2.882e+061
2.882e+06 – 3.013e+060
3.013e+06 – 3.145e+060
3.145e+06 – 3.276e+060
3.276e+06 – 3.407e+060
3.407e+06 – 3.538e+060
3.538e+06 – 3.669e+060
3.669e+06 – 3.8e+060
3.8e+06 – 3.931e+060
3.931e+06 – 4.062e+060
4.062e+06 – 4.193e+060
4.193e+06 – 4.324e+060
4.324e+06 – 4.455e+060
4.455e+06 – 4.586e+060
4.586e+06 – 4.717e+060
4.717e+06 – 4.848e+060
4.848e+06 – 4.979e+060
4.979e+06 – 5.11e+060
5.11e+06 – 5.241e+061

veterans_est numeric feature

Estimated veteran counts per row (likely US counties given n=3,144), ranging from 0 to 244,160 with a median of 1,547.5 but a mean of 5,419.5. The distribution is heavily right-skewed (skew 8.01, kurtosis 100.0) with 408 outliers (12.98%) reflecting a few highly populous areas dwarfing the rest. There are no nulls and only one zero row (zero_rate 0.0003), so coverage is essentially complete.

Treatment: Log1p-transform before modelling to tame the heavy right skew.

anthropic:claude-opus-4-7 · confidence high
Out[55]:

saturn.columns["veterans_est"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 2,424
min 0
max 244,160
mean 5419
median 1548
std 1.311e+04
q1 634.8
q3 4428
iqr 3,793
skew 8.014
kurtosis 100
n_outliers 408
outlier_rate 0.1298
zero_rate 0.0003181
alert: high_skew · skew=+8.01
alert: outliers · 13.0% rows beyond 1.5 IQR
Fig 22.
Distribution of veterans_est. Vertical dash marks the median.
Histogram bins for veterans_est (median: 1547.5).
bin  count
0 – 6104  2534
6104 – 1.221e+04  271
1.221e+04 – 1.831e+04  116
1.831e+04 – 2.442e+04  68
2.442e+04 – 3.052e+04  49
3.052e+04 – 3.662e+04  30
3.662e+04 – 4.273e+04  18
4.273e+04 – 4.883e+04  20
4.883e+04 – 5.494e+04  7
5.494e+04 – 6.104e+04  3
6.104e+04 – 6.714e+04  4
6.714e+04 – 7.325e+04  6
7.325e+04 – 7.935e+04  0
7.935e+04 – 8.546e+04  6
8.546e+04 – 9.156e+04  2
9.156e+04 – 9.766e+04  1
9.766e+04 – 1.038e+05  0
1.038e+05 – 1.099e+05  1
1.099e+05 – 1.16e+05  1
1.16e+05 – 1.221e+05  0
1.221e+05 – 1.282e+05  0
1.282e+05 – 1.343e+05  0
1.343e+05 – 1.404e+05  1
1.404e+05 – 1.465e+05  2
1.465e+05 – 1.526e+05  0
1.526e+05 – 1.587e+05  1
1.587e+05 – 1.648e+05  0
1.648e+05 – 1.709e+05  0
1.709e+05 – 1.77e+05  0
1.77e+05 – 1.831e+05  0
1.831e+05 – 1.892e+05  0
1.892e+05 – 1.953e+05  1
1.953e+05 – 2.014e+05  0
2.014e+05 – 2.075e+05  0
2.075e+05 – 2.136e+05  0
2.136e+05 – 2.197e+05  0
2.197e+05 – 2.258e+05  0
2.258e+05 – 2.32e+05  1
2.32e+05 – 2.381e+05  0
2.381e+05 – 2.442e+05  1
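
The bin edges in the table above are consistent with 40 equal-width bins spanning the column's min and max (244,160 / 40 = 6,104). A sketch of how such a table could be rebuilt with NumPy follows. That Saturn bins this way is an inference from the printed edges, not something the report states, and `equal_width_bins` is a hypothetical helper name.

```python
import numpy as np

def equal_width_bins(values, n_bins=40):
    """Return (edges, counts) for n_bins equal-width bins from min to max."""
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(
        values, bins=n_bins, range=(values.min(), values.max())
    )
    return edges, counts

# Illustrative input: only the reported min and max of veterans_est,
# which is enough to check the edges against the table.
edges, counts = equal_width_bins([0, 244160])
print(edges[:3])  # the first three edges are 0, 6104 and 12208
```

Passing the full column instead of the two-value stand-in would return the per-bin counts as well.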

total_pop numeric feature

Looks like a per-row population total across 3,144 rows (suggestive of US counties), with no nulls and 3,080 unique values. The distribution is severely right-skewed (skew 13.17, kurtosis 289.76): median is 25,784.5 but the mean is 105,310.94 and the max reaches 9,936,690, with 440 rows (14.0%) flagged as outliers. Min is 50 and zero_rate is 0, so every row carries a real count.

Treatment: log-transform before regression to tame the heavy right tail.

anthropic:claude-opus-4-7 · confidence high
Out[58]:

saturn.columns["total_pop"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,080
min 50
max 9.937e+06
mean 1.053e+05
median 2.578e+04
std 3.338e+05
q1 1.084e+04
q3 6.808e+04
iqr 57,244
skew 13.17
kurtosis 289.8
n_outliers 440
outlier_rate 0.1399
zero_rate 0
alert: high_skew (skew=+13.17)
alert: outliers (14.0% of rows beyond 1.5 IQR)
Fig 23.
Distribution of total_pop. Vertical dash marks the median.
Histogram bins for total_pop (median: 25784.5).
bin  count
50 – 2.485e+05  2863
2.485e+05 – 4.969e+05  137
4.969e+05 – 7.453e+05  57
7.453e+05 – 9.937e+05  37
9.937e+05 – 1.242e+06  14
1.242e+06 – 1.491e+06  10
1.491e+06 – 1.739e+06  7
1.739e+06 – 1.987e+06  3
1.987e+06 – 2.236e+06  3
2.236e+06 – 2.484e+06  4
2.484e+06 – 2.733e+06  3
2.733e+06 – 2.981e+06  0
2.981e+06 – 3.229e+06  1
3.229e+06 – 3.478e+06  1
3.478e+06 – 3.726e+06  0
3.726e+06 – 3.975e+06  0
3.975e+06 – 4.223e+06  0
4.223e+06 – 4.472e+06  1
4.472e+06 – 4.72e+06  0
4.72e+06 – 4.968e+06  1
4.968e+06 – 5.217e+06  0
5.217e+06 – 5.465e+06  1
5.465e+06 – 5.714e+06  0
5.714e+06 – 5.962e+06  0
5.962e+06 – 6.21e+06  0
6.21e+06 – 6.459e+06  0
6.459e+06 – 6.707e+06  0
6.707e+06 – 6.956e+06  0
6.956e+06 – 7.204e+06  0
7.204e+06 – 7.453e+06  0
7.453e+06 – 7.701e+06  0
7.701e+06 – 7.949e+06  0
7.949e+06 – 8.198e+06  0
8.198e+06 – 8.446e+06  0
8.446e+06 – 8.695e+06  0
8.695e+06 – 8.943e+06  0
8.943e+06 – 9.191e+06  0
9.191e+06 – 9.44e+06  0
9.44e+06 – 9.688e+06  0
9.688e+06 – 9.937e+06  1
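
The outliers alert for total_pop counts rows beyond 1.5 IQR. A sketch of that rule using the quartiles reported in the stats table, assuming the conventional Tukey fences (q1 - 1.5 * IQR and q3 + 1.5 * IQR). The report does not spell out the exact formula, and `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Rounded quartiles for total_pop as printed in the stats table.
lower, upper = tukey_fences(q1=1.084e4, q3=6.808e4)
print(lower, upper)
```

With these rounded inputs the lower fence is negative, so under this assumption every flagged total_pop row lies above an upper fence of roughly 154,000.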

active_duty_per_10k numeric feature

A per-capita rate (active-duty personnel per 10,000) reported across 3,144 rows with no nulls, no zeros, and every value unique. The distribution is tight around a mean of 4,693.79 and median of 4,733.28 with std 592.22, and is mildly left-skewed (-0.38). Values range from 1,708.13 to 7,200.00, and 57 rows (1.81%) are flagged as outliers. The 3,144 row count strongly suggests one record per US county.

Treatment: Use as-is as a continuous feature; the mild skew does not require transformation.
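
As a quick unit check before using the column as-is, a rate per 10,000 converts to a population share by dividing by 10,000. The sketch below applies that to the reported min, mean and max. `rate_to_share` is a hypothetical helper, and reading the column as "active-duty personnel per 10,000 residents" follows the description above rather than anything verified against the source data.

```python
def rate_to_share(rate_per_10k):
    """Convert a rate per 10,000 into a fraction between 0 and 1."""
    return rate_per_10k / 10_000

# Reported summary values for active_duty_per_10k.
for label, rate in [("min", 1708.13), ("mean", 4693.79), ("max", 7200.00)]:
    print(f"{label}: {rate_to_share(rate):.1%}")
```

Under that reading the mean corresponds to about 47% of residents, which would be unusually high for active-duty personnel, so it may be worth confirming how the column was derived before modelling.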

anthropic:claude-opus-4-7 · confidence high
Out[61]:

saturn.columns["active_duty_per_10k"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,144
min 1708
max 7,200
mean 4694
median 4733
std 592.2
q1 4331
q3 5102
iqr 771.6
skew -0.3768
kurtosis 0.8418
n_outliers 57
outlier_rate 0.01813
zero_rate 0
Fig 24.
Distribution of active_duty_per_10k. Vertical dash marks the median.
Histogram bins for active_duty_per_10k (median: 4733.279942644007).
bin  count
1708 – 1845  1
1845 – 1983  0
1983 – 2120  1
2120 – 2257  0
2257 – 2395  1
2395 – 2532  1
2532 – 2669  1
2669 – 2807  6
2807 – 2944  7
2944 – 3081  9
3081 – 3218  21
3218 – 3356  20
3356 – 3493  29
3493 – 3630  57
3630 – 3768  52
3768 – 3905  92
3905 – 4042  111
4042 – 4179  165
4179 – 4317  193
4317 – 4454  224
4454 – 4591  267
4591 – 4729  303
4729 – 4866  280
4866 – 5003  301
5003 – 5141  277
5141 – 5278  268
5278 – 5415  177
5415 – 5552  136
5552 – 5690  61
5690 – 5827  37
5827 – 5964  18
5964 – 6102  8
6102 – 6239  5
6239 – 6376  5
6376 – 6514  3
6514 – 6651  2
6651 – 6788  2
6788 – 6925  0
6925 – 7063  0
7063 – 7200  3

veterans_per_100 numeric feature

This column reports veterans per 100 residents, with 3,143 unique values across 3,144 rows (likely one row per US county). Values range from 0 to 18.09 with a mean of 6.19 and median of 5.98, showing a mild right skew (0.88) and 125 outliers (3.98%), most of them on the high end. Only one row is zero, so the distribution is effectively continuous and well-populated.

Treatment: Use as-is for modelling; optionally winsorize the roughly 4% of rows flagged as outliers.
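
A minimal winsorizing sketch, assuming NumPy and clipping at fences computed from the quartiles. The report does not say which cut-offs to use, so the 1.5 IQR rule here is an assumption borrowed from Saturn's outlier alerts, and `winsorize_iqr` is a hypothetical helper. The sample values are illustrative, not rows from the dataset.

```python
import numpy as np

def winsorize_iqr(values, k=1.5):
    """Clip values to [q1 - k*IQR, q3 + k*IQR] instead of dropping them."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return np.clip(values, q1 - k * iqr, q3 + k * iqr)

# Illustrative rates; 18.09 is the reported column maximum.
sample = np.array([4.9, 5.5, 6.0, 6.2, 7.1, 18.09])
clipped = winsorize_iqr(sample)
print(clipped)
```

Clipping keeps every row in the data while capping the influence of extreme rates; only values outside the fences change.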

anthropic:claude-opus-4-7 · confidence high
Out[64]:

saturn.columns["veterans_per_100"].stats

stat value
n 3,144
nulls 0 (0.0%)
unique 3,143
min 0
max 18.09
mean 6.19
median 5.985
std 1.998
q1 4.92
q3 7.136
iqr 2.216
skew 0.8797
kurtosis 2.029
n_outliers 125
outlier_rate 0.03976
zero_rate 0.0003181
Fig 25.
Distribution of veterans_per_100. Vertical dash marks the median.
Histogram bins for veterans_per_100 (median: 5.984985213609011).
bin  count
0 – 0.4522  1
0.4522 – 0.9043  0
0.9043 – 1.357  6
1.357 – 1.809  13
1.809 – 2.261  21
2.261 – 2.713  34
2.713 – 3.165  60
3.165 – 3.617  84
3.617 – 4.07  137
4.07 – 4.522  187
4.522 – 4.974  271
4.974 – 5.426  319
5.426 – 5.878  346
5.878 – 6.33  358
6.33 – 6.783  313
6.783 – 7.235  259
7.235 – 7.687  173
7.687 – 8.139  128
8.139 – 8.591  94
8.591 – 9.043  88
9.043 – 9.496  60
9.496 – 9.948  38
9.948 – 10.4  39
10.4 – 10.85  26
10.85 – 11.3  16
11.3 – 11.76  21
11.76 – 12.21  14
12.21 – 12.66  15
12.66 – 13.11  4
13.11 – 13.57  6
13.57 – 14.02  6
14.02 – 14.47  1
14.47 – 14.92  1
14.92 – 15.37  3
15.37 – 15.83  0
15.83 – 16.28  0
16.28 – 16.73  1
16.73 – 17.18  0
17.18 – 17.63  0
17.63 – 18.09  1

How to cite

BibTeX
@misc{saturn-veterans-merged-county-analysis-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: veterans merged county analysis},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/veterans-merged_county_analysis}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: veterans merged county analysis. Source: /home/coolhand/html/datavis/data_trove/data/policy/veterans/merged_county_analysis.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/veterans-merged_county_analysis