saturn·

data trove veteran suicide rates

source /home/coolhand/html/datavis/data_trove/demographic/veterans/military_firearm_suicide.csv · 50 rows · 4 columns · profiled 2026-06-22

Reading

dataset summary · high confidence anthropic:default

This dataset contains state-level suicide rate statistics for all 50 U.S. states, comparing civilian and veteran populations along with a veteran risk ratio. The most striking signal is the scale of the veteran suicide burden: the mean veteran suicide rate (36.1 per 100k) is roughly double the civilian mean (17.6 per 100k), and the veteran risk ratio ranges from 1.8 to 3.23, meaning veterans are at least 1.8 times as likely to die by suicide as civilians in every state. The veteran risk ratio is right-skewed (skew 1.09, mean 2.17 above median 2.0) and deserves closer attention: about a quarter of states have ratios above roughly 2.4 (q3 = 2.397), suggesting particularly acute disparities worth investigating.

citing: mean · median · min · max · skew · iqr · std
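The headline comparison can be re-checked from the raw file. The sketch below is a minimal, standard-library illustration and not the profiler's own code. It assumes the CSV header uses the four column names listed in the schema and that veteran_risk_ratio is the veteran rate divided by the civilian rate in the same row; neither is confirmed by this report. The helper names (`load_rows`, `recompute_ratio`, `max_abs_gap`) are hypothetical.

```python
import csv


def load_rows(path):
    """Read the CSV into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def recompute_ratio(rows):
    """Return (state, stored_ratio, recomputed_ratio) for every row.

    ASSUMPTION: the stored ratio is veteran rate / civilian rate.
    """
    triples = []
    for row in rows:
        veteran = float(row["veteran_suicide_rate"])
        civilian = float(row["civilian_suicide_rate"])
        stored = float(row["veteran_risk_ratio"])
        triples.append((row["state"], stored, veteran / civilian))
    return triples


def max_abs_gap(triples):
    """Largest absolute difference between stored and recomputed ratios."""
    return max(abs(stored - recomputed) for _, stored, recomputed in triples)


# Ratio of the two reported column means (36.11 and 17.62).
ratio_of_means = 36.11 / 17.62
```

If `max_abs_gap(recompute_ratio(load_rows(path)))` comes out near the rounding precision of the file, the assumption holds. Note that the ratio of the means (about 2.05) need not equal the mean of the per-state ratios (2.171 in this report); both support the "roughly double" reading.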

Schema

4 columns
Per-column summary.

column                 type         nulls  unique  alerts
state                  categorical  0.0%   50      long_tail
veteran_suicide_rate   numeric      0.0%   50
civilian_suicide_rate  numeric      0.0%   50
veteran_risk_ratio     numeric      0.0%   41

state

categorical label long_tail
This column contains U.S. state names, with exactly 50 rows and 50 unique values: one row per state. An entropy ratio of 1.0 and a top_rate of 0.02 (1 in 50) confirm that every value occurs exactly once, so the table is effectively a lookup or reference table keyed by state, and the reported top_value (Montana) is an arbitrary tie-break. The 'long_tail' alert is a statistical artefact of this uniform distribution rather than a genuine concentration problem. Treatment: Use as a join key or grouping label; no encoding is needed unless joining to a larger fact table, in which case left-join on this field. high · anthropic:default
n: 50
nulls: 0 (0.0%)
unique: 50
top_value: Montana
top_rate: 0.02
cardinality: 50
entropy: 5.644
entropy_ratio: 1
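The categorical statistics above can be reproduced with a few lines of standard-library Python. This is a sketch, not the profiler's implementation: the function name `categorical_profile` is hypothetical, and the base-2 logarithm is an inference from the reported entropy of 5.644, which matches log2(50). The labels are placeholders standing in for the 50 state names.

```python
import math
from collections import Counter


def categorical_profile(values):
    """Compute unique count, top_rate, Shannon entropy (bits) and entropy ratio."""
    counts = Counter(values)
    n = len(values)
    probabilities = [count / n for count in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    # Maximum entropy is reached when all observed categories are equally likely.
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    return {
        "unique": len(counts),
        "top_rate": max(counts.values()) / n,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }


# Placeholder labels: 50 distinct values, one occurrence each.
labels = [f"state_{i}" for i in range(50)]
profile = categorical_profile(labels)
```

With 50 distinct values that each appear once, the entropy equals its maximum, which is why the entropy ratio is 1 and the top_rate is 1/50.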

veteran_suicide_rate

numeric numeric_target
This column holds veteran suicide rates, likely per 100,000 veterans, for the 50 U.S. states (one row per state, matching the state column). All 50 values are unique with no nulls. The distribution is broad, ranging from 24.9 to 52.3 with a mean of 36.11 and a std of 7.43, so the highest-rate state has more than double the rate of the lowest (52.3 / 24.9 ≈ 2.1). The mild positive skew (0.42), slightly flat shape (excess kurtosis −0.77), and zero flagged outliers suggest a well-behaved continuous distribution without extreme anomalies. Treatment: Use as-is in regression or ranking models; consider a log-transform only if residuals show heteroscedasticity, since the skew is mild at 0.42. high · anthropic:default
n: 50
nulls: 0 (0.0%)
unique: 50
min: 24.9
max: 52.3
mean: 36.11
median: 35
std: 7.426
q1: 30.18
q3: 41.33
iqr: 11.15
skew: 0.4169
kurtosis: -0.7696
n_outliers: 0
outlier_rate: 0
zero_rate: 0
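The zero outlier count is consistent with the common Tukey rule, which flags values more than 1.5 × IQR outside the quartiles. The report does not say which rule the profiler uses, so the 1.5 multiplier is an assumption, and `tukey_fences` is a hypothetical helper. The sketch uses only the quartiles, minimum, and maximum reported above.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences; k=1.5 is ASSUMED, not confirmed."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for veteran_suicide_rate.
lower, upper = tukey_fences(30.18, 41.33)

# The reported extremes (min 24.9, max 52.3) are the only candidates
# that could fall outside the fences.
flagged = [value for value in (24.9, 52.3) if value < lower or value > upper]
```

Under this assumption the fences sit at roughly 13.5 and 58.1, so both the minimum and the maximum fall inside them and nothing is flagged.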

civilian_suicide_rate

numeric numeric_target
This column holds civilian suicide rates, likely per 100,000 population, for the 50 U.S. states (one row per state, matching the state column). The distribution is symmetric and flat rather than bell-shaped: near-zero skew (0.09), a platykurtic shape (excess kurtosis −1.10, close to the −1.2 of a uniform distribution), and no flagged outliers indicate a fairly even spread across the full range of 7.7 to 28.9. The mean (17.62) and median (17.5) are nearly identical, and all 50 values are unique with no nulls or zeros. Treatment: Use as-is in regression; the symmetric distribution requires no transformation, but verify the unit (rate per 100k) before modelling. high · anthropic:default
n: 50
nulls: 0 (0.0%)
unique: 50
min: 7.7
max: 28.9
mean: 17.62
median: 17.5
std: 6.02
q1: 12.6
q3: 22.4
iqr: 9.8
skew: 0.08973
kurtosis: -1.103
n_outliers: 0
outlier_rate: 0
zero_rate: 0
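To put the kurtosis of −1.103 in context, the sketch below computes moment-based skewness and excess kurtosis for a synthetic, evenly spaced grid over the reported range. The grid is not the real data; it only illustrates what a perfectly even spread looks like. The function name is hypothetical, and the profiler may use bias-corrected estimators, so its values can differ slightly from these population-moment formulas.

```python
def skew_and_excess_kurtosis(values):
    """Population-moment skewness and excess kurtosis (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# SYNTHETIC benchmark: 50 evenly spaced points between the reported min and max.
evenly_spaced = [7.7 + i * (28.9 - 7.7) / 49 for i in range(50)]
grid_skew, grid_kurtosis = skew_and_excess_kurtosis(evenly_spaced)
```

An evenly spaced grid has zero skew and an excess kurtosis of about −1.2, so the reported −1.103 indicates a spread that is much closer to uniform than to a normal bell shape.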

veteran_risk_ratio

numeric feature
This column appears to represent a risk multiplier for the veteran population, most plausibly the veteran suicide rate divided by the civilian rate in the same state; this should be confirmed against the two rate columns. Values are bounded between 1.8 and 3.23, all well above 1.0, so every state shows elevated veteran risk relative to the civilian baseline. The mean (2.171) sits above the median (2.0), and a skew of 1.09 indicates a moderate right tail, though no outliers were flagged. With only 41 unique values across 50 rows, some values repeat, hinting at rounded or binned inputs. The first quartile (1.843) lies only about 0.04 above the minimum (1.8), so roughly a quarter of states are packed into that narrow band, which warrants checking whether 1.8 is a data-entry floor or a genuine distributional boundary. Treatment: Check whether the 1.8 minimum is a hard floor or truncation artifact; consider a log-transform to reduce right skew before regression modelling. medium · anthropic:default
n: 50
nulls: 0 (0.0%)
unique: 41
min: 1.8
max: 3.23
mean: 2.171
median: 2
std: 0.4021
q1: 1.843
q3: 2.397
iqr: 0.555
skew: 1.088
kurtosis: 0.0735
n_outliers: 0
outlier_rate: 0
zero_rate: 0
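Both suggested treatments for this column can be sketched in a few lines: measuring how many rows sit exactly at the minimum, and comparing skewness before and after a natural-log transform. The helper names are hypothetical, and the skewness formula is the uncorrected population-moment version, which may differ slightly from the profiler's estimator. The functions take any list of positive ratios, for example the veteran_risk_ratio column read from the CSV.

```python
import math


def share_at_minimum(values, tol=1e-9):
    """Return (minimum, fraction of values equal to the minimum within tol)."""
    floor = min(values)
    at_floor = sum(1 for v in values if v - floor <= tol)
    return floor, at_floor / len(values)


def moment_skew(values):
    """Population-moment skewness (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skew_before_after_log(values):
    """Skewness of the raw values and of their natural logs (values must be > 0)."""
    logged = [math.log(v) for v in values]
    return moment_skew(values), moment_skew(logged)
```

A large share of rows at exactly 1.8 would point to a floor or truncation artifact, while a single row at the minimum would suggest a genuine boundary. If the log transform barely lowers the skew, the pile-up near the floor, rather than the right tail, is likely driving the asymmetry.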