median income
Reading
This dataset contains 3,222 rows covering U.S. counties, with three columns: a county name, a FIPS code, and median household income. The income column is the headline issue: its minimum is -666,666,666, and its mean is roughly -144,603 against a median of 60,458, which points to a sentinel value (likely a missing-data placeholder) distorting the mean, standard deviation, and skew (-56.7). The income histogram puts only one row in the sentinel's bin, so most of the 188 rows (about 5.8%) flagged as outliers are not sentinels and may be genuine high or low incomes. Replace the sentinel with null before computing summary statistics, and review the remaining flagged rows separately rather than dropping them. County names are essentially unique row labels, while FIPS codes look clean and span the expected national range.
citing: row_count · column_count · median_household_income.stats.min · median_household_income.stats.max · median_household_income.stats.mean · median_household_income.stats.median · median_household_income.stats.skew · median_household_income.stats.n_outliers · median_household_income.stats.outlier_rate · fips.stats.min · fips.stats.max · county_name.top_words
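As a rough consistency check on the sentinel reading, the reported mean can be unwound to see what the remaining rows would average. This is a minimal sketch using only the rounded figures quoted above; the function name is illustrative, and it assumes every bad row carries the same -666,666,666 value.

```python
# Figures quoted in the summary above (the mean is rounded).
N_ROWS = 3222
REPORTED_MEAN = -144_603
REPORTED_MAX = 170_463
SENTINEL = -666_666_666  # reported minimum, assumed to be a missing-data marker


def mean_without_sentinels(n_rows, mean, sentinel, n_sentinels=1):
    """Back out the mean of the non-sentinel rows from the overall mean."""
    total = mean * n_rows
    return (total - n_sentinels * sentinel) / (n_rows - n_sentinels)


one = mean_without_sentinels(N_ROWS, REPORTED_MEAN, SENTINEL, n_sentinels=1)
two = mean_without_sentinels(N_ROWS, REPORTED_MEAN, SENTINEL, n_sentinels=2)

print(f"implied mean with 1 sentinel removed:  {one:,.0f}")
print(f"implied mean with 2 sentinels removed: {two:,.0f}")
# A mean above the reported maximum is impossible, which rules that case out.
print("2 sentinels plausible:", two <= REPORTED_MAX)
```

With one sentinel removed, the implied mean lands in the low 60,000s, close to the reported median. Two sentinels would push the implied mean above the reported maximum, so the figures are consistent with a single sentinel row.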
Charts the summary said to look at first
median_household_income (histogram)
| bin | count |
|---|---|
| -6.667e+08 – -6.5e+08 | 1 |
| -6.5e+08 – -6.333e+08 | 0 |
| -6.333e+08 – -6.167e+08 | 0 |
| -6.167e+08 – -6e+08 | 0 |
| -6e+08 – -5.833e+08 | 0 |
| -5.833e+08 – -5.666e+08 | 0 |
| -5.666e+08 – -5.5e+08 | 0 |
| -5.5e+08 – -5.333e+08 | 0 |
| -5.333e+08 – -5.166e+08 | 0 |
| -5.166e+08 – -5e+08 | 0 |
| -5e+08 – -4.833e+08 | 0 |
| -4.833e+08 – -4.666e+08 | 0 |
| -4.666e+08 – -4.499e+08 | 0 |
| -4.499e+08 – -4.333e+08 | 0 |
| -4.333e+08 – -4.166e+08 | 0 |
| -4.166e+08 – -3.999e+08 | 0 |
| -3.999e+08 – -3.833e+08 | 0 |
| -3.833e+08 – -3.666e+08 | 0 |
| -3.666e+08 – -3.499e+08 | 0 |
| -3.499e+08 – -3.332e+08 | 0 |
| -3.332e+08 – -3.166e+08 | 0 |
| -3.166e+08 – -2.999e+08 | 0 |
| -2.999e+08 – -2.832e+08 | 0 |
| -2.832e+08 – -2.666e+08 | 0 |
| -2.666e+08 – -2.499e+08 | 0 |
| -2.499e+08 – -2.332e+08 | 0 |
| -2.332e+08 – -2.166e+08 | 0 |
| -2.166e+08 – -1.999e+08 | 0 |
| -1.999e+08 – -1.832e+08 | 0 |
| -1.832e+08 – -1.665e+08 | 0 |
| -1.665e+08 – -1.499e+08 | 0 |
| -1.499e+08 – -1.332e+08 | 0 |
| -1.332e+08 – -1.165e+08 | 0 |
| -1.165e+08 – -9.986e+07 | 0 |
| -9.986e+07 – -8.318e+07 | 0 |
| -8.318e+07 – -6.651e+07 | 0 |
| -6.651e+07 – -4.984e+07 | 0 |
| -4.984e+07 – -3.317e+07 | 0 |
| -3.317e+07 – -1.65e+07 | 0 |
| -1.65e+07 – 1.705e+05 | 3221 |
fips (histogram)
| bin | count |
|---|---|
| 1001 – 2780 | 97 |
| 2780 – 4559 | 15 |
| 4559 – 6337 | 133 |
| 6337 – 8116 | 59 |
| 8116 – 9895 | 14 |
| 9895 – 1.167e+04 | 4 |
| 1.167e+04 – 1.345e+04 | 226 |
| 1.345e+04 – 1.523e+04 | 5 |
| 1.523e+04 – 1.701e+04 | 49 |
| 1.701e+04 – 1.879e+04 | 189 |
| 1.879e+04 – 2.057e+04 | 204 |
| 2.057e+04 – 2.235e+04 | 184 |
| 2.235e+04 – 2.413e+04 | 39 |
| 2.413e+04 – 2.59e+04 | 15 |
| 2.59e+04 – 2.768e+04 | 170 |
| 2.768e+04 – 2.946e+04 | 196 |
| 2.946e+04 – 3.124e+04 | 150 |
| 3.124e+04 – 3.302e+04 | 27 |
| 3.302e+04 – 3.48e+04 | 21 |
| 3.48e+04 – 3.658e+04 | 95 |
| 3.658e+04 – 3.836e+04 | 153 |
| 3.836e+04 – 4.013e+04 | 155 |
| 4.013e+04 – 4.191e+04 | 46 |
| 4.191e+04 – 4.369e+04 | 67 |
| 4.369e+04 – 4.547e+04 | 51 |
| 4.547e+04 – 4.725e+04 | 161 |
| 4.725e+04 – 4.903e+04 | 268 |
| 4.903e+04 – 5.081e+04 | 29 |
| 5.081e+04 – 5.259e+04 | 133 |
| 5.259e+04 – 5.436e+04 | 94 |
| 5.436e+04 – 5.614e+04 | 95 |
| 5.614e+04 – 5.792e+04 | 0 |
| 5.792e+04 – 5.97e+04 | 0 |
| 5.97e+04 – 6.148e+04 | 0 |
| 6.148e+04 – 6.326e+04 | 0 |
| 6.326e+04 – 6.504e+04 | 0 |
| 6.504e+04 – 6.682e+04 | 0 |
| 6.682e+04 – 6.86e+04 | 0 |
| 6.86e+04 – 7.037e+04 | 0 |
| 7.037e+04 – 7.215e+04 | 78 |
county_name length in characters (histogram)
| chars | count |
|---|---|
| 16 – 17 | 26 |
| 17 – 18 | 72 |
| 18 – 19 | 121 |
| 19 – 20 | 190 |
| 20 – 21 | 264 |
| 21 – 22 | 407 |
| 22 – 24 | 420 |
| 24 – 25 | 363 |
| 25 – 26 | 320 |
| 26 – 27 | 240 |
| 27 – 28 | 231 |
| 28 – 29 | 152 |
| 29 – 30 | 139 |
| 30 – 31 | 165 |
| 31 – 32 | 41 |
| 32 – 33 | 28 |
| 33 – 34 | 16 |
| 34 – 35 | 10 |
| 35 – 36 | 5 |
| 36 – 38 | 0 |
| 38 – 39 | 1 |
| 39 – 40 | 1 |
| 40 – 41 | 0 |
| 41 – 42 | 1 |
| 42 – 43 | 1 |
| 43 – 44 | 0 |
| 44 – 45 | 2 |
| 45 – 46 | 0 |
| 46 – 47 | 1 |
| 47 – 48 | 1 |
| 48 – 49 | 0 |
| 49 – 50 | 0 |
| 50 – 51 | 0 |
| 51 – 53 | 0 |
| 53 – 54 | 2 |
| 54 – 55 | 1 |
| 55 – 56 | 0 |
| 56 – 57 | 0 |
| 57 – 58 | 0 |
| 58 – 59 | 1 |
Schema
3 columns
| Column | Type | Nulls | Unique | Alerts |
|---|---|---|---|---|
| fips | numeric | 0.0% | 3,222 | |
| county_name | text | 0.0% | 3,222 | near_unique |
| median_household_income | numeric | 0.0% | 3,099 | high_skew, outliers |
fips
numeric identifier
This column is a FIPS county/area code: every one of the 3,222 rows is unique with no nulls, and the values span 1001 to 72153, the range covering U.S. states and territories. The distribution is nearly symmetric (skew 0.157, kurtosis -0.631) with no outliers, consistent with a structured geographic identifier rather than a measured quantity. Because the column is stored as a number, codes that begin with a zero have lost it (1001 rather than 01001), so zero-pad to five characters before joining against string-typed FIPS keys. Treatment: Use as a categorical join key on county-level data; do not feed as a numeric feature.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,222
- min: 1,001
- max: 72,153
- mean: 3.138e+04
- median: 30,022
- std: 1.63e+04
- q1: 1.903e+04
- q3: 4.61e+04
- iqr: 27,075
- skew: 0.1574
- kurtosis: -0.6314
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
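The join-key treatment can be sketched as follows. A county FIPS code is a two-digit state part followed by a three-digit county part, so a numeric column needs zero-padding before it matches string keys. The helper names here are illustrative and are not part of the profiled dataset or any library.

```python
def fips_to_key(code: int) -> str:
    """Render a numeric county FIPS code as a zero-padded 5-character string."""
    if not 0 < code < 100_000:
        raise ValueError(f"not a 5-digit county FIPS code: {code}")
    return f"{code:05d}"


def split_fips(key: str) -> tuple[str, str]:
    """Split a 5-character key into its (state, county) parts."""
    if len(key) != 5 or not key.isdigit():
        raise ValueError(f"expected 5 digits, got {key!r}")
    return key[:2], key[2:]


# The reported minimum and maximum of the fips column.
for raw in (1001, 72153):
    key = fips_to_key(raw)
    print(raw, "->", key, split_fips(key))
```

Keeping the key as a string also stops downstream tools from averaging or scaling it by accident.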
county_name
text identifier near_unique
This column holds full U.S. county names (e.g., 'X County, State'), with all 3,222 rows unique and zero nulls. The token 'county,' appears 2,999 times, so about 223 rows use a different suffix (likely 'Parish' in Louisiana, 'Borough' or 'Census Area' in Alaska, 'Municipio' in Puerto Rico given that FIPS codes run up to 72153, or independent cities). State-name frequencies match the expected U.S. distribution, with Texas (256) leading. Treatment: Use as a join key after splitting into county and state components; do not treat as a feature.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,222
- len_min: 16
- len_max: 59
- len_mean: 24.32
- len_median: 24
- len_p95: 31
- word_mean: 3.248
- word_median: 3
- n_empty: 0
- n_duplicates: 0
- duplicate_rate: 0
- vocab_size: 1,990
- readability_flesch_mean: 10.28
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 0
- allcaps_rate: 0
- boilerplate_rate: 0
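Splitting the name into county and state components might look like the sketch below. It assumes every value follows a '<county>, <state>' pattern with the state after the last comma, which the profile suggests but does not confirm for every row. The sample strings are illustrative and were not read from the dataset.

```python
def split_county_name(full_name: str) -> tuple[str, str]:
    """Split '<county>, <state>' at the last comma into (county, state)."""
    county, sep, state = full_name.rpartition(",")
    county, state = county.strip(), state.strip()
    if not sep or not county or not state:
        raise ValueError(f"expected '<county>, <state>', got {full_name!r}")
    return county, state


def uses_county_suffix(county: str) -> bool:
    """True when the county part ends in the word 'County'."""
    return county.endswith(" County")


# Illustrative values only (assumed format).
samples = ["Autauga County, Alabama", "Yauco Municipio, Puerto Rico"]
for name in samples:
    county, state = split_county_name(name)
    print(county, "|", state, "| County suffix:", uses_county_suffix(county))
```

Raising on rows that do not match, rather than guessing, makes it easy to list the exceptions and check the roughly 223 non-'County' names by hand.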
median_household_income
numeric feature high_skew outliers
County-level median household income in dollars, with 3,099 distinct values across 3,222 rows and no nulls. The minimum of -666,666,666 is a clear sentinel for missing data; it drags the mean to -144,603 even though the median is 60,458.5 and Q1 and Q3 are 51,814.75 and 70,376.25. The sentinel also produces the extreme skew (-56.74) and kurtosis (3,216.99). 188 rows (5.83%) are flagged as outliers, but the histogram shows only one row near the sentinel, so most flagged rows are probably genuine incomes far from the quartiles rather than bad data. Treatment: Replace the -666,666,666 sentinel with null, then consider a log transform or a robust scaler before modelling.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,099
- min: -6.667e+08
- max: 170,463
- mean: -1.446e+05
- median: 6.046e+04
- std: 1.175e+07
- q1: 5.181e+04
- q3: 7.038e+04
- iqr: 1.856e+04
- skew: -56.74
- kurtosis: 3217
- n_outliers: 0
- outlier_rate: 0.05835
- zero_rate: 0
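The treatment above could be sketched with the standard library alone. The report does not say which outlier rule it used, so the 1.5 x IQR fences here are an assumption, as are the function names and every sample income except the reported maximum and the sentinel.

```python
import statistics

SENTINEL = -666_666_666  # reported minimum, assumed to mark missing data


def replace_sentinel(values, sentinel=SENTINEL):
    """Swap the sentinel for None so it no longer takes part in statistics."""
    return [None if v == sentinel else v for v in values]


def quartiles(values):
    """Return (q1, median, q3) of the non-missing values."""
    present = [v for v in values if v is not None]
    q1, med, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return q1, med, q3


def iqr_fences(values, k=1.5):
    """Lower and upper fences under the (assumed) k x IQR outlier rule."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def robust_scale(values):
    """Centre on the median and divide by the IQR; missing values stay None."""
    q1, med, q3 = quartiles(values)
    iqr = q3 - q1
    return [None if v is None else (v - med) / iqr for v in values]


# Made-up incomes, plus the reported maximum and the sentinel.
incomes = [51_000, 58_000, 60_000, 63_000, 70_000, 170_463, SENTINEL]

cleaned = replace_sentinel(incomes)
low, high = iqr_fences(cleaned)
flagged = [v for v in cleaned if v is not None and not low <= v <= high]
scaled = robust_scale(cleaned)

print("fences:", low, high)
print("flagged for review:", flagged)
```

Flagged values are listed for review rather than removed, since a high county income can be real. Median and IQR are used for scaling because a few extreme rows barely move them, unlike the mean and standard deviation.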