data trove veteran suicide rates
Reading
This dataset contains state-level suicide rate statistics for all 50 U.S. states, comparing civilian and veteran populations along with a veteran risk ratio. The most striking signal is the scale of the veteran suicide burden: the mean veteran suicide rate (36.1 per 100k) is roughly double the civilian mean (17.6 per 100k), and the veteran risk ratio ranges from 1.8 to 3.23, so in every state veterans are at least 1.8 times as likely to die by suicide as civilians. The right-skewed distribution of the veteran risk ratio (skew 1.09) deserves closer attention: half of the states sit at or below 2.0, while roughly a quarter exceed 2.4 (q3 = 2.397), suggesting particularly acute disparities worth investigating in those states.
citing: mean · median · min · max · skew · iqr · std
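As a quick arithmetic check on the "roughly double" claim, the ratio of the two reported means can be set beside the reported mean of the per-state ratios. This is a minimal sketch that uses only summary statistics quoted in this report; the variable names are illustrative and not part of the dataset.

```python
# Summary statistics as reported in the column profiles of this report.
veteran_mean = 36.106    # mean of veteran_suicide_rate
civilian_mean = 17.618   # mean of civilian_suicide_rate
ratio_mean = 2.1714      # mean of veteran_risk_ratio

# Ratio of the two means: where the "roughly double" figure comes from.
ratio_of_means = veteran_mean / civilian_mean

# A mean of per-state ratios generally differs from a ratio of means,
# so the two figures are not expected to agree exactly.
gap = ratio_mean - ratio_of_means

print(f"ratio of means: {ratio_of_means:.2f}")
print(f"mean of ratios: {ratio_mean:.2f}")
print(f"gap:            {gap:.2f}")
```

The gap between the two figures is expected: states with low civilian rates can carry high ratios without moving the overall means much.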
Charts the summary said to look at first
veteran_suicide_rate: histogram (7 equal-width bins)
| bin | count |
|---|---|
| 24.9 – 28.81 | 10 |
| 28.81 – 32.73 | 9 |
| 32.73 – 36.64 | 9 |
| 36.64 – 40.56 | 8 |
| 40.56 – 44.47 | 6 |
| 44.47 – 48.39 | 4 |
| 48.39 – 52.3 | 4 |
civilian_suicide_rate: histogram (7 equal-width bins)
| bin | count |
|---|---|
| 7.7 – 10.73 | 8 |
| 10.73 – 13.76 | 8 |
| 13.76 – 16.79 | 7 |
| 16.79 – 19.81 | 8 |
| 19.81 – 22.84 | 7 |
| 22.84 – 25.87 | 7 |
| 25.87 – 28.9 | 5 |
veteran_risk_ratio: histogram (7 equal-width bins)
| bin | count |
|---|---|
| 1.8 – 2.004 | 25 |
| 2.004 – 2.209 | 7 |
| 2.209 – 2.413 | 6 |
| 2.413 – 2.617 | 4 |
| 2.617 – 2.821 | 3 |
| 2.821 – 3.026 | 2 |
| 3.026 – 3.23 | 3 |
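The three binned tables above appear to use seven equal-width bins spanning each column's minimum and maximum. The sketch below reproduces the bin edges under that assumption; `equal_width_edges` is a hypothetical helper written for this illustration, not a function of the profiling tool, and the report seems to display edges to about four significant figures.

```python
def equal_width_edges(lo, hi, n_bins=7):
    """Return n_bins + 1 equally spaced bin edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]

# Min and max values as reported in the column profiles of this report.
columns = {
    "veteran_suicide_rate": (24.9, 52.3),
    "civilian_suicide_rate": (7.7, 28.9),
    "veteran_risk_ratio": (1.8, 3.23),
}

for name, (lo, hi) in columns.items():
    print(name, equal_width_edges(lo, hi))
```

Because the bins are equal-width rather than quantile-based, the counts themselves show the shape of each distribution: 25 of the 50 risk ratios fall in the first bin.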
state: value counts (top 20 of 50 shown)
| value | count | share |
|---|---|---|
| Montana | 1 | 2.0% |
| Wyoming | 1 | 2.0% |
| Alaska | 1 | 2.0% |
| New Mexico | 1 | 2.0% |
| Idaho | 1 | 2.0% |
| Oklahoma | 1 | 2.0% |
| Colorado | 1 | 2.0% |
| South Dakota | 1 | 2.0% |
| West Virginia | 1 | 2.0% |
| Arkansas | 1 | 2.0% |
| Nevada | 1 | 2.0% |
| Arizona | 1 | 2.0% |
| Oregon | 1 | 2.0% |
| Utah | 1 | 2.0% |
| Kentucky | 1 | 2.0% |
| Tennessee | 1 | 2.0% |
| Alabama | 1 | 2.0% |
| North Dakota | 1 | 2.0% |
| Missouri | 1 | 2.0% |
| Kansas | 1 | 2.0% |
Schema
4 columns
| column | type | nulls | unique | alerts |
|---|---|---|---|---|
| state | categorical | 0.0% | 50 | long_tail |
| veteran_suicide_rate | numeric | 0.0% | 50 | |
| civilian_suicide_rate | numeric | 0.0% | 50 | |
| veteran_risk_ratio | numeric | 0.0% | 41 | |
state
categorical label long_tail
This column contains U.S. state names, with exactly 50 rows and 50 unique values: one row per state, perfectly uniform. An entropy ratio of 1.0 and a top_rate of 0.02 confirm complete uniformity with zero repetition, meaning this is effectively a lookup or reference table keyed by state. The 'long_tail' alert is a statistical artefact: every value occurs exactly once, so every value looks rare, but there is no genuine concentration problem.
Treatment: Use as a join key or grouping label; no encoding needed unless joining to a larger fact table, in which case left-join on this field.
- n: 50
- nulls: 0 (0.0%)
- unique: 50
- top_value: Montana
- top_rate: 0.02
- cardinality: 50
- entropy: 5.644
- entropy_ratio: 1
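The entropy figures above are consistent with base-2 Shannon entropy normalised by its maximum for the column's cardinality. The following sketch checks that reading; the choice of base 2 and the normalisation are assumptions inferred from the reported numbers, and `shannon_entropy_bits` is a hypothetical helper.

```python
import math

def shannon_entropy_bits(counts):
    """Shannon entropy in bits for a list of category counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Each of the 50 states appears exactly once (top_rate = 1/50 = 0.02).
counts = [1] * 50

entropy = shannon_entropy_bits(counts)
max_entropy = math.log2(len(counts))   # entropy of a perfectly uniform column
entropy_ratio = entropy / max_entropy

print(f"entropy:       {entropy:.3f}")
print(f"entropy_ratio: {entropy_ratio:.3f}")
```

An entropy ratio of 1 is the expected value for any key-like column with one row per value, so it carries no information beyond "unique equals n".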
veteran_suicide_rate
numeric numeric_target
This column holds veteran suicide rates, likely per 100,000 veterans, for each of the 50 U.S. states listed in the state column. All 50 values are unique with no nulls, indicating clean, granular measurement. The distribution is broad, ranging from 24.9 to 52.3 with a mean of 36.106 and std of 7.43, so the highest-rate state has more than double the rate of the lowest (52.3 / 24.9 ≈ 2.1), a substantial disparity. The slight positive skew (0.42), mildly platykurtic shape (kurtosis −0.77), and zero flagged outliers suggest a well-behaved continuous distribution without extreme anomalies.
Treatment: Use as-is in regression or ranking models; consider a log-transform only if residuals show heteroscedasticity, as skew is mild at 0.42.
- n: 50
- nulls: 0 (0.0%)
- unique: 50
- min: 24.9
- max: 52.3
- mean: 36.11
- median: 35
- std: 7.426
- q1: 30.18
- q3: 41.33
- iqr: 11.15
- skew: 0.4169
- kurtosis: -0.7696
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
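The report does not state how outliers are detected. One common convention is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles; the sketch below checks whether the zero-outlier count is consistent with that rule. Both the rule and the `tukey_fences` helper are assumptions made for illustration.

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper outlier fences under Tukey's k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles and extremes as reported for veteran_suicide_rate.
q1, q3 = 30.18, 41.33
observed_min, observed_max = 24.9, 52.3

lower, upper = tukey_fences(q1, q3)

# If both extremes sit inside the fences, no value can be flagged.
no_outliers_possible = lower <= observed_min and observed_max <= upper

print(f"fences: [{lower:.3f}, {upper:.3f}]")
print(f"consistent with n_outliers = 0: {no_outliers_possible}")
```

Under this assumption the highest state rate sits comfortably inside the upper fence, so the absence of flagged outliers does not mean the spread between states is small.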
civilian_suicide_rate
numeric numeric_target
This column holds civilian suicide rates, likely per 100,000 population, for each of the 50 U.S. states. The distribution is symmetric and flat rather than bell-shaped: near-zero skew (0.09), a platykurtic shape (kurtosis −1.10, close to the −1.2 of a uniform distribution), and no outliers detected, indicating a fairly even spread across the full range of 7.7 to 28.9. The mean (17.618) and median (17.5) are nearly identical, and all 50 values are unique with no nulls or zeros.
Treatment: Use as-is in regression; the symmetric distribution requires no transformation, but verify the unit (rate per 100k) before modelling.
- n: 50
- nulls: 0 (0.0%)
- unique: 50
- min: 7.7
- max: 28.9
- mean: 17.62
- median: 17.5
- std: 6.02
- q1: 12.6
- q3: 22.4
- iqr: 9.8
- skew: 0.08973
- kurtosis: -1.103
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
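To put the reported kurtosis in context, it helps to compare it with a perfectly even spread over the same range. The sketch below builds a synthetic reference of 50 evenly spaced values between the reported min and max (these are not the dataset's values) and computes its excess kurtosis with the plain population formula; the profiling tool may use a bias-corrected estimator, so small differences are expected.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

# Synthetic reference only: 50 evenly spaced points over the reported range.
lo, hi, n = 7.7, 28.9, 50
evenly_spaced = [lo + i * (hi - lo) / (n - 1) for i in range(n)]

reference = excess_kurtosis(evenly_spaced)
reported = -1.103   # kurtosis reported for civilian_suicide_rate

print(f"evenly spaced reference: {reference:.3f}")
print(f"reported:                {reported:.3f}")
```

The reported value lies much closer to this flat reference than to the 0 of a normal distribution, which supports reading the column as evenly spread rather than bell-shaped.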
veteran_risk_ratio
numeric feature
This column appears to be a rate ratio for the veteran population, plausibly veteran_suicide_rate divided by civilian_suicide_rate (the ratio of the two column means, about 2.05, is close to this column's mean of 2.17). Values are bounded between 1.8 and 3.23, all well above 1.0, so veteran risk is elevated relative to the baseline in every state. The mean (2.1714) sits above the median (2.0), and a skew of 1.09 indicates a moderate right tail, though no outliers were flagged. With only 41 unique values across 50 rows, some values repeat, hinting at rounded inputs. A quarter of the values lie between the 1.8 minimum and q1 = 1.843, so the ratios pile up just above the floor; this warrants checking whether the 1.8 minimum is a data-entry floor or a genuine distributional boundary.
Treatment: Check whether the 1.8 minimum is a hard floor or truncation artifact; consider a log-transform to reduce right skew before regression modelling.
- n: 50
- nulls: 0 (0.0%)
- unique: 41
- min: 1.8
- max: 3.23
- mean: 2.171
- median: 2
- std: 0.4021
- q1: 1.843
- q3: 2.397
- iqr: 0.555
- skew: 1.088
- kurtosis: 0.0735
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
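The suggested log-transform can be sanity-checked without the raw values by using the veteran_risk_ratio histogram from this report. The sketch below treats every state as sitting at its bin midpoint, which is a coarse approximation, and compares the count-weighted skew before and after taking logs; `weighted_skew` is a hypothetical helper using the plain population formula.

```python
import math

# Bin edges and counts copied from the veteran_risk_ratio histogram in this report.
edges = [1.8, 2.004, 2.209, 2.413, 2.617, 2.821, 3.026, 3.23]
counts = [25, 7, 6, 4, 3, 2, 3]

def weighted_skew(values, weights):
    """Population skewness (m3 / m2**1.5) of values weighted by counts."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    m2 = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    m3 = sum(w * (v - mean) ** 3 for v, w in zip(values, weights)) / total
    return m3 / m2 ** 1.5

# Approximation: place each state at the midpoint of its bin.
midpoints = [(a + b) / 2 for a, b in zip(edges, edges[1:])]

skew_raw = weighted_skew(midpoints, counts)
skew_log = weighted_skew([math.log(m) for m in midpoints], counts)

print(f"binned skew, raw: {skew_raw:.2f}")
print(f"binned skew, log: {skew_log:.2f}")
```

On this binned approximation the log-transform lowers the skew but leaves it clearly positive, so it is worth confirming the effect on the raw values before relying on the transform.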