economic poverty depth by county
Reading
This dataset contains 3,222 rows of US county-level poverty statistics. Each row is identified by a FIPS code, county name, and state abbreviation, and carries three poverty rate measures and a population total. The poverty measures are all right-skewed: pct_poverty ranges from 1.6% to 66.32% with a median of 13.55%, while pct_deep_poverty has a median of 5.82% but reaches as high as 34.7%. The total population column is extremely skewed (skew 13.4, kurtosis ~298) with a median of 25,174 but a maximum near 9.8 million, so an unweighted aggregate counts the smallest county (47 people) the same as the largest; weight by total or transform it before aggregating. Texas (254 counties), Georgia (159), and Virginia (133) contribute the most rows, which matters for any state-level rollups because states differ widely in how many counties they contain.
citing: row_count · column_count · columns · kinds
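The point about accounting for the population skew can be illustrated with a small sketch. It assumes `total` is the population denominator behind the percentage columns, which the profile does not confirm; the three rows and the helper names are hypothetical and not taken from the dataset.

```python
# Hypothetical rows: (county, total population, pct_poverty).
# Values are illustrative only, not drawn from the dataset.
rows = [
    ("Small County", 1_000, 30.0),
    ("Mid County", 25_000, 15.0),
    ("Large County", 1_000_000, 10.0),
]


def unweighted_mean(rows):
    """Average of county rates, every county counted equally."""
    return sum(pct for _, _, pct in rows) / len(rows)


def population_weighted_mean(rows):
    """Average of county rates, each weighted by its population."""
    total_pop = sum(pop for _, pop, _ in rows)
    return sum(pop * pct for _, pop, pct in rows) / total_pop


simple = unweighted_mean(rows)
weighted = population_weighted_mean(rows)
print(f"unweighted: {simple:.2f}%  weighted: {weighted:.2f}%")
```

With these made-up rows the unweighted mean is pulled up by the small high-poverty county, while the weighted mean sits close to the large county's rate. Which one is appropriate depends on whether the question is about counties or about people.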
Charts the summary said to look at first
pct_poverty: histogram (40 equal-width bins)
| bin | count |
|---|---|
| 1.6 – 3.218 | 7 |
| 3.218 – 4.836 | 34 |
| 4.836 – 6.454 | 106 |
| 6.454 – 8.072 | 246 |
| 8.072 – 9.69 | 320 |
| 9.69 – 11.31 | 354 |
| 11.31 – 12.93 | 393 |
| 12.93 – 14.54 | 364 |
| 14.54 – 16.16 | 306 |
| 16.16 – 17.78 | 262 |
| 17.78 – 19.4 | 192 |
| 19.4 – 21.02 | 149 |
| 21.02 – 22.63 | 123 |
| 22.63 – 24.25 | 91 |
| 24.25 – 25.87 | 52 |
| 25.87 – 27.49 | 44 |
| 27.49 – 29.11 | 34 |
| 29.11 – 30.72 | 23 |
| 30.72 – 32.34 | 18 |
| 32.34 – 33.96 | 14 |
| 33.96 – 35.58 | 6 |
| 35.58 – 37.2 | 8 |
| 37.2 – 38.81 | 3 |
| 38.81 – 40.43 | 8 |
| 40.43 – 42.05 | 5 |
| 42.05 – 43.67 | 9 |
| 43.67 – 45.29 | 4 |
| 45.29 – 46.9 | 11 |
| 46.9 – 48.52 | 7 |
| 48.52 – 50.14 | 8 |
| 50.14 – 51.76 | 2 |
| 51.76 – 53.38 | 6 |
| 53.38 – 54.99 | 5 |
| 54.99 – 56.61 | 5 |
| 56.61 – 58.23 | 1 |
| 58.23 – 59.85 | 0 |
| 59.85 – 61.47 | 0 |
| 61.47 – 63.08 | 0 |
| 63.08 – 64.7 | 1 |
| 64.7 – 66.32 | 1 |
pct_deep_poverty: histogram (40 equal-width bins)
| bin | count |
|---|---|
| 0 – 0.8675 | 15 |
| 0.8675 – 1.735 | 28 |
| 1.735 – 2.603 | 128 |
| 2.603 – 3.47 | 241 |
| 3.47 – 4.338 | 429 |
| 4.338 – 5.205 | 446 |
| 5.205 – 6.073 | 436 |
| 6.073 – 6.94 | 403 |
| 6.94 – 7.808 | 261 |
| 7.808 – 8.675 | 211 |
| 8.675 – 9.543 | 157 |
| 9.543 – 10.41 | 113 |
| 10.41 – 11.28 | 57 |
| 11.28 – 12.15 | 58 |
| 12.15 – 13.01 | 50 |
| 13.01 – 13.88 | 28 |
| 13.88 – 14.75 | 18 |
| 14.75 – 15.62 | 22 |
| 15.62 – 16.48 | 18 |
| 16.48 – 17.35 | 8 |
| 17.35 – 18.22 | 11 |
| 18.22 – 19.09 | 9 |
| 19.09 – 19.95 | 7 |
| 19.95 – 20.82 | 4 |
| 20.82 – 21.69 | 7 |
| 21.69 – 22.55 | 8 |
| 22.55 – 23.42 | 5 |
| 23.42 – 24.29 | 2 |
| 24.29 – 25.16 | 8 |
| 25.16 – 26.03 | 4 |
| 26.03 – 26.89 | 6 |
| 26.89 – 27.76 | 2 |
| 27.76 – 28.63 | 4 |
| 28.63 – 29.5 | 7 |
| 29.5 – 30.36 | 3 |
| 30.36 – 31.23 | 0 |
| 31.23 – 32.1 | 2 |
| 32.1 – 32.97 | 1 |
| 32.97 – 33.83 | 1 |
| 33.83 – 34.7 | 4 |
total: histogram (40 equal-width bins)
| bin | count |
|---|---|
| 47 – 2.446e+05 | 2942 |
| 2.446e+05 – 4.892e+05 | 137 |
| 4.892e+05 – 7.337e+05 | 57 |
| 7.337e+05 – 9.783e+05 | 39 |
| 9.783e+05 – 1.223e+06 | 12 |
| 1.223e+06 – 1.467e+06 | 9 |
| 1.467e+06 – 1.712e+06 | 7 |
| 1.712e+06 – 1.957e+06 | 3 |
| 1.957e+06 – 2.201e+06 | 3 |
| 2.201e+06 – 2.446e+06 | 4 |
| 2.446e+06 – 2.69e+06 | 3 |
| 2.69e+06 – 2.935e+06 | 0 |
| 2.935e+06 – 3.179e+06 | 1 |
| 3.179e+06 – 3.424e+06 | 1 |
| 3.424e+06 – 3.669e+06 | 0 |
| 3.669e+06 – 3.913e+06 | 0 |
| 3.913e+06 – 4.158e+06 | 0 |
| 4.158e+06 – 4.402e+06 | 1 |
| 4.402e+06 – 4.647e+06 | 0 |
| 4.647e+06 – 4.891e+06 | 1 |
| 4.891e+06 – 5.136e+06 | 0 |
| 5.136e+06 – 5.38e+06 | 1 |
| 5.38e+06 – 5.625e+06 | 0 |
| 5.625e+06 – 5.87e+06 | 0 |
| 5.87e+06 – 6.114e+06 | 0 |
| 6.114e+06 – 6.359e+06 | 0 |
| 6.359e+06 – 6.603e+06 | 0 |
| 6.603e+06 – 6.848e+06 | 0 |
| 6.848e+06 – 7.092e+06 | 0 |
| 7.092e+06 – 7.337e+06 | 0 |
| 7.337e+06 – 7.582e+06 | 0 |
| 7.582e+06 – 7.826e+06 | 0 |
| 7.826e+06 – 8.071e+06 | 0 |
| 8.071e+06 – 8.315e+06 | 0 |
| 8.315e+06 – 8.56e+06 | 0 |
| 8.56e+06 – 8.804e+06 | 0 |
| 8.804e+06 – 9.049e+06 | 0 |
| 9.049e+06 – 9.293e+06 | 0 |
| 9.293e+06 – 9.538e+06 | 0 |
| 9.538e+06 – 9.783e+06 | 1 |
state: top 20 values by row count
| value | count | share |
|---|---|---|
| TX | 254 | 7.9% |
| GA | 159 | 4.9% |
| VA | 133 | 4.1% |
| KY | 120 | 3.7% |
| MO | 115 | 3.6% |
| KS | 105 | 3.3% |
| IL | 102 | 3.2% |
| NC | 100 | 3.1% |
| IA | 99 | 3.1% |
| TN | 95 | 2.9% |
| NE | 93 | 2.9% |
| IN | 92 | 2.9% |
| OH | 88 | 2.7% |
| MN | 87 | 2.7% |
| MI | 83 | 2.6% |
| MS | 82 | 2.5% |
| PR | 78 | 2.4% |
| OK | 77 | 2.4% |
| AR | 75 | 2.3% |
| WI | 72 | 2.2% |
pct_near_poverty: histogram (40 equal-width bins)
| bin | count |
|---|---|
| 0.58 – 1.794 | 7 |
| 1.794 – 3.008 | 18 |
| 3.008 – 4.222 | 82 |
| 4.222 – 5.436 | 161 |
| 5.436 – 6.65 | 302 |
| 6.65 – 7.864 | 419 |
| 7.864 – 9.078 | 480 |
| 9.078 – 10.29 | 487 |
| 10.29 – 11.51 | 392 |
| 11.51 – 12.72 | 280 |
| 12.72 – 13.93 | 210 |
| 13.93 – 15.15 | 138 |
| 15.15 – 16.36 | 87 |
| 16.36 – 17.58 | 53 |
| 17.58 – 18.79 | 37 |
| 18.79 – 20 | 35 |
| 20 – 21.22 | 15 |
| 21.22 – 22.43 | 7 |
| 22.43 – 23.65 | 2 |
| 23.65 – 24.86 | 5 |
| 24.86 – 26.07 | 1 |
| 26.07 – 27.29 | 2 |
| 27.29 – 28.5 | 0 |
| 28.5 – 29.72 | 0 |
| 29.72 – 30.93 | 0 |
| 30.93 – 32.14 | 0 |
| 32.14 – 33.36 | 0 |
| 33.36 – 34.57 | 0 |
| 34.57 – 35.79 | 0 |
| 35.79 – 37 | 1 |
| 37 – 38.21 | 0 |
| 38.21 – 39.43 | 0 |
| 39.43 – 40.64 | 0 |
| 40.64 – 41.86 | 0 |
| 41.86 – 43.07 | 0 |
| 43.07 – 44.28 | 0 |
| 44.28 – 45.5 | 0 |
| 45.5 – 46.71 | 0 |
| 46.71 – 47.93 | 0 |
| 47.93 – 49.14 | 1 |
Schema
7 columns
| column | kind | nulls | unique | alerts |
|---|---|---|---|---|
| fips | numeric | 0.0% | 3,222 | |
| county_name | text | 0.0% | 1,960 | short_text, duplicates |
| state | categorical | 0.0% | 52 | |
| total | numeric | 0.0% | 3,173 | high_skew, outliers |
| pct_deep_poverty | numeric | 0.0% | 1,131 | high_skew, outliers |
| pct_poverty | numeric | 0.0% | 1,719 | high_skew |
| pct_near_poverty | numeric | 0.0% | 1,237 | |
fips
numeric identifier
This column is the US county FIPS code: every one of the 3,222 rows is unique and null-free, and the value range (1001 to 72153) matches the standard 5-digit state+county encoding with leading zeros dropped. Treating it as numeric is misleading despite the clean distribution (skew 0.16, no outliers), because the digits are categorical identifiers, not measurements. Treatment: Cast to a zero-padded 5-character string and use as a join key to county-level data.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,222
- min: 1,001
- max: 72,153
- mean: 3.138e+04
- median: 30,022
- std: 1.63e+04
- q1: 1.903e+04
- q3: 4.61e+04
- iqr: 27,075
- skew: 0.1574
- kurtosis: -0.6314
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
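The zero-padding treatment for `fips` might look like the sketch below. The helper names `fips_to_key` and `split_fips` are hypothetical; the 2-digit state plus 3-digit county layout is the standard FIPS county code structure.

```python
def fips_to_key(fips):
    """Zero-pad a numeric county FIPS code to its 5-character string form."""
    if not 0 < fips <= 99_999:
        raise ValueError(f"not a 5-digit county FIPS code: {fips!r}")
    return f"{fips:05d}"


def split_fips(key):
    """Split a 5-character key into (state part, county part)."""
    return key[:2], key[2:]


print(fips_to_key(1001))    # 01001
print(fips_to_key(72153))   # 72153
print(split_fips("01001"))  # ('01', '001')
```

Keeping the key as a string avoids silently losing the leading zero again when the column is written out and re-read.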
county_name
text feature · short_text · duplicates
This column holds US county-level place names. Most values end in 'County' (2,999 occurrences, about 93%), with smaller groups of Louisiana 'parish' (64) and Puerto Rican 'municipio' (78) entries. Of the 3,222 rows, only 1,960 values are distinct, so 1,262 rows (39.2%) repeat an earlier name: common names like Washington County (30), Jefferson County (25) and Franklin County (24) recur across states. Values are short and uniform (mean 14.2 characters, about 2 words), and the name alone does not uniquely identify a county. Treatment: Pair with the state column to form a unique key before joining or grouping.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,960
- len_min: 10
- len_max: 46
- len_mean: 14.17
- len_median: 14
- len_p95: 18
- word_mean: 2.083
- word_median: 2
- n_empty: 0
- n_duplicates: 1,262
- duplicate_rate: 0.3917
- vocab_size: 1,963
- readability_flesch_mean: 33.36
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 0
- allcaps_rate: 0
- boilerplate_rate: 0
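A minimal sketch of why the name needs the state alongside it. The five rows are a small illustrative sample, and counting duplicates as "rows beyond the first occurrence" is an assumption about how the profiler defines `n_duplicates`; it is at least consistent with the reported figures (3,222 - 1,960 = 1,262).

```python
from collections import Counter

# Illustrative sample of (county_name, state) pairs.
rows = [
    ("Washington County", "AL"),
    ("Washington County", "GA"),
    ("Jefferson County", "KY"),
    ("Jefferson County", "TX"),
    ("Franklin County", "OH"),
]


def n_duplicates(keys):
    """Count rows that repeat a key already seen earlier."""
    counts = Counter(keys)
    return sum(c - 1 for c in counts.values())


dup_names = n_duplicates(name for name, _ in rows)  # name alone
dup_keys = n_duplicates(rows)                       # (name, state) pair

# Same arithmetic applied to the profile's own totals.
reported_duplicates = 3222 - 1960

print(dup_names, dup_keys, reported_duplicates)
```

Uniqueness of the (name, state) pair should still be checked on the real data before it is used as a join key; `fips` is the safer key where it is available.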
state
categorical feature
This is a US state code field with 52 distinct values, consistent with the 50 states plus DC and Puerto Rico (PR appears with 78 rows). The distribution is broad and close to uniform (entropy ratio 0.93), with TX leading at just 7.88% (254 of 3,222 rows), followed by GA, VA, KY, and MO. There are no nulls, and each state has many rows because the data is one row per county. Treatment: One-hot or target-encode for modelling; usable as a join key to state-level reference data.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 52
- top_value: TX
- top_rate: 0.07883
- cardinality: 52
- entropy: 5.314
- entropy_ratio: 0.9322
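The entropy figures can be reproduced under one assumption: that the profiler reports Shannon entropy in bits and divides by log2 of the cardinality. The report does not state this, but its numbers are consistent with it, since 5.314 / log2(52) is about 0.932. The function names are hypothetical.

```python
import math


def entropy_bits(counts):
    """Shannon entropy in bits of a distribution given as category counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_ratio(counts):
    """Entropy divided by its maximum, log2(number of non-empty categories)."""
    k = sum(1 for c in counts if c > 0)
    if k < 2:
        return 0.0
    return entropy_bits(counts) / math.log2(k)


# Check against the reported values: entropy 5.314 over 52 categories.
max_entropy = math.log2(52)
reported_ratio = 5.314 / max_entropy
print(f"max entropy: {max_entropy:.3f} bits, ratio: {reported_ratio:.4f}")

# A perfectly uniform distribution has ratio 1; a lopsided one scores lower.
print(entropy_ratio([10, 10, 10, 10]))
print(entropy_ratio([97, 1, 1, 1]))
```

A ratio of 0.93 therefore means the state counts are far from dominated by any single value, even though TX has more than three times the rows of the 20th-ranked state.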
total
numeric feature · high_skew · outliers
A heavily right-skewed numeric measure (skew 13.36, kurtosis 297.59) ranging from 47 to 9,782,602 with a median of 25,174 but a mean of 101,340: the upper tail dwarfs the center. Roughly 13.9% of rows (449) are flagged as outliers, and the standard deviation (324,628) is over three times the mean, signalling that a few very large values dominate. With 3,173 unique values across 3,222 rows and no nulls or zeros, this is a per-county count (the population total described in the summary) rather than a category or flag. Treatment: Log-transform before modelling and consider winsorising the extreme tail.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,173
- min: 47
- max: 9.783e+06
- mean: 1.013e+05
- median: 25,174
- std: 3.246e+05
- q1: 1.059e+04
- q3: 6.501e+04
- iqr: 5.442e+04
- skew: 13.36
- kurtosis: 297.6
- n_outliers: 449
- outlier_rate: 0.1394
- zero_rate: 0
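The two treatments for `total` can be sketched as follows. The five input values are the profile's own min, q1, median, q3 and max (q1 and q3 as rounded in the report), used here only as a tiny stand-in for the column; `winsorize_upper` is a hypothetical helper using a nearest-rank percentile, and the 80th-percentile cap is chosen only so the effect is visible on five values.

```python
import math


def winsorize_upper(values, upper_pct=99):
    """Cap values above the upper_pct percentile (nearest-rank) at that value."""
    ordered = sorted(values)
    rank = math.ceil(upper_pct * len(ordered) / 100)   # 1-based nearest rank
    cap = ordered[min(len(ordered), max(1, rank)) - 1]
    return [min(v, cap) for v in values]


# min, q1, median, q3, max of `total` as reported in the profile.
totals = [47, 10_590, 25_174, 65_010, 9_782_602]

# Log transform: compresses about five orders of magnitude into a short range.
log_totals = [math.log10(v) for v in totals]

# Winsorising: the extreme value is pulled down to the cap, not dropped.
capped = winsorize_upper(totals, upper_pct=80)

print([round(x, 2) for x in log_totals])
print(capped)
```

The log transform is safe here because the column has no zeros; winsorising changes the data, so the cap percentile is a modelling decision rather than a default.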
pct_deep_poverty
numeric feature · high_skew · outliers
This is a numeric feature representing the percent of the population in deep poverty, one value per county (n=3,222 with 1,131 unique values). The distribution is right-skewed (skew 2.67, kurtosis 10.4) with a median of 5.82 but a max of 34.7, and 176 outliers (5.46%) sit in the upper tail. Min is 0.0 but the zero rate is just 0.09% (3 rows), so the floor is rarely hit. Treatment: Apply a log1p or similar transform before regression to tame the right skew; a plain log is undefined for the zero values.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,131
- min: 0
- max: 34.7
- mean: 6.743
- median: 5.82
- std: 4.154
- q1: 4.27
- q3: 7.918
- iqr: 3.648
- skew: 2.665
- kurtosis: 10.4
- n_outliers: 176
- outlier_rate: 0.05462
- zero_rate: 0.0009311
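A short sketch of the log1p treatment, using the profile's min, q1, median, q3 and max for `pct_deep_poverty` as sample inputs. It shows why log1p is suggested for this column in particular: the minimum is 0, where a plain logarithm fails.

```python
import math

# min, q1, median, q3, max of pct_deep_poverty as reported in the profile.
values = [0.0, 4.27, 5.82, 7.918, 34.7]

# log1p(x) = log(1 + x) is defined at zero and maps 0 -> 0.
transformed = [math.log1p(v) for v in values]

# A plain log raises on the zero entries.
try:
    math.log(values[0])
    plain_log_ok = True
except ValueError:
    plain_log_ok = False

# expm1 inverts log1p, so model outputs can be mapped back to percentages.
restored = [math.expm1(t) for t in transformed]

print([round(t, 3) for t in transformed])
print(plain_log_ok)
```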
pct_poverty
numeric feature · high_skew
A county-level poverty rate expressed as a percentage, ranging from 1.6 to 66.32 with a median of 13.55. The distribution is right-skewed (skew 2.10, kurtosis 6.89) with 137 high-end outliers (~4.3%) pulling the mean (15.10) above the median. No nulls or zeros across 3,222 rows. Treatment: Consider a log or Yeo-Johnson transform before linear modelling to tame the right tail.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,719
- min: 1.6
- max: 66.32
- mean: 15.1
- median: 13.55
- std: 7.706
- q1: 10.16
- q3: 17.91
- iqr: 7.75
- skew: 2.096
- kurtosis: 6.891
- n_outliers: 137
- outlier_rate: 0.04252
- zero_rate: 0
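For reference, the Yeo-Johnson transform named in the treatment can be written out directly. This sketch applies the standard piecewise formula for a given `lam`; in practice `lam` is usually fitted by maximum likelihood, which is omitted here, and the `lam` values below are illustrative only.

```python
import math


def yeo_johnson(y, lam):
    """Yeo-Johnson power transform of a single value for a fixed lambda."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(y)                      # lambda == 0 branch
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-y)                        # lambda == 2 branch
    return -(((1.0 - y) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


# min, median, max of pct_poverty as reported in the profile.
sample = [1.6, 13.55, 66.32]

identity = [yeo_johnson(v, 1.0) for v in sample]    # lambda = 1 leaves values unchanged
log_like = [yeo_johnson(v, 0.0) for v in sample]    # lambda = 0 equals log1p
compressed = [yeo_johnson(v, 0.5) for v in sample]  # in between: milder compression

print([round(v, 3) for v in compressed])
```

Because `pct_poverty` has no zeros or negative values, a plain log is also valid for this column; Yeo-Johnson mainly adds a tunable amount of compression.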
pct_near_poverty
numeric feature
This column reports the percentage of the population near the poverty line, ranging from 0.58 to 49.14 with a mean of 9.81 and median of 9.38. The distribution is right-skewed (skew 1.19, kurtosis 5.73) with 82 outliers (2.55%), mostly in the high tail, and no nulls or zeros. The IQR is tight at 4.43: the middle half of observations fall between 7.33 and 11.76, with a long upper tail beyond that. Treatment: Consider a log transform or winsorization before regression to dampen the right tail.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,237
- min: 0.58
- max: 49.14
- mean: 9.813
- median: 9.38
- std: 3.644
- q1: 7.33
- q3: 11.76
- iqr: 4.43
- skew: 1.19
- kurtosis: 5.729
- n_outliers: 82
- outlier_rate: 0.02545
- zero_rate: 0
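The outlier counts in this report are not accompanied by a definition. One common convention is Tukey's 1.5 × IQR rule, sketched below with the reported quartiles for `pct_near_poverty`; whether the profiler actually used this rule is an assumption, and `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# q1 and q3 of pct_near_poverty as reported in the profile.
lower, upper = tukey_fences(7.33, 11.76)


def is_outlier(value):
    """True if the value falls outside the fences computed above."""
    return value < lower or value > upper


print(f"fences: {lower:.3f} to {upper:.3f}")
print(is_outlier(49.14))  # reported max
print(is_outlier(9.38))   # reported median
```

Under this rule the fences land near 0.69 and 18.4, so the reported maximum of 49.14 is far outside and the reported minimum of 0.58 falls just below the lower fence.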