housing crisis counties
Reading
This dataset covers 3,222 US counties with 16 columns describing housing affordability: rents, incomes, renter shares, and rent-burden percentages. Several core numeric fields (annual_rent, median_gross_rent, median_household_income, rent_to_income_ratio) contain extreme negative sentinel values such as -666666666 and -7999999992, which drag the means deeply negative and produce skew of roughly -18 to -57; recode these to null or filter them out before any analysis. The affordability_category field is heavily imbalanced: 'Affordable' covers 99.1% of counties and only 1 county is labeled 'Extremely Burdened', which suggests the categorization rule may be miscalibrated. The rent-burden percentage columns (pct_rent_burdened_30plus with a median of 37.4%, pct_rent_burdened_50plus with a median of 17.6%) carry no sentinel values and look like the cleanest signals to start with.
citing: annual_rent · median_gross_rent · median_household_income · rent_to_income_ratio · affordability_category · pct_rent_burdened_30plus · pct_rent_burdened_50plus · pct_renter
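The cleaning step described in the summary could be sketched as follows. This is a minimal illustration, assuming the data is loaded into a pandas DataFrame; the helper name `null_negatives` and the toy rows are hypothetical, and the rule "any negative value is missing" is an assumption that only holds because rents, incomes, and ratios cannot legitimately be negative.

```python
import pandas as pd

# Columns the summary flags as carrying negative sentinel codes.
SENTINEL_COLS = [
    "annual_rent",
    "median_gross_rent",
    "median_household_income",
    "rent_to_income_ratio",
]

def null_negatives(df: pd.DataFrame, cols=SENTINEL_COLS) -> pd.DataFrame:
    """Return a copy of df where negative values in `cols` become NaN."""
    out = df.copy()
    present = [c for c in cols if c in out.columns]
    # Keep values that are >= 0; everything else is replaced by NaN.
    out[present] = out[present].where(out[present] >= 0)
    return out

# Toy rows for illustration only (not taken from the dataset).
demo = pd.DataFrame({
    "median_gross_rent": [817.5, -666666666.0, 978.0],
    "median_household_income": [60458.5, 51814.75, -666666666.0],
})
clean = null_negatives(demo)
print(clean.isna().sum())
```

Note that this rule would not catch a positive artifact such as a ratio computed from two sentinel inputs, so derived columns may need to be recomputed after the source columns are cleaned.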
Charts the summary said to look at first
affordability_category: value counts
| value | count | share |
|---|---|---|
| Affordable | 3192 | 99.1% |
| Moderately Burdened | 29 | 0.9% |
| Extremely Burdened | 1 | 0.0% |
pct_rent_burdened_30plus: histogram
| bin | count |
|---|---|
| 0 – 1.624 | 9 |
| 1.624 – 3.248 | 5 |
| 3.248 – 4.872 | 3 |
| 4.872 – 6.496 | 5 |
| 6.496 – 8.12 | 9 |
| 8.12 – 9.744 | 13 |
| 9.744 – 11.37 | 11 |
| 11.37 – 12.99 | 16 |
| 12.99 – 14.62 | 26 |
| 14.62 – 16.24 | 19 |
| 16.24 – 17.86 | 35 |
| 17.86 – 19.49 | 43 |
| 19.49 – 21.11 | 52 |
| 21.11 – 22.74 | 52 |
| 22.74 – 24.36 | 73 |
| 24.36 – 25.98 | 99 |
| 25.98 – 27.61 | 109 |
| 27.61 – 29.23 | 116 |
| 29.23 – 30.86 | 132 |
| 30.86 – 32.48 | 159 |
| 32.48 – 34.1 | 189 |
| 34.1 – 35.73 | 209 |
| 35.73 – 37.35 | 227 |
| 37.35 – 38.98 | 239 |
| 38.98 – 40.6 | 205 |
| 40.6 – 42.22 | 209 |
| 42.22 – 43.85 | 210 |
| 43.85 – 45.47 | 190 |
| 45.47 – 47.1 | 131 |
| 47.1 – 48.72 | 114 |
| 48.72 – 50.34 | 118 |
| 50.34 – 51.97 | 69 |
| 51.97 – 53.59 | 51 |
| 53.59 – 55.22 | 34 |
| 55.22 – 56.84 | 24 |
| 56.84 – 58.46 | 6 |
| 58.46 – 60.09 | 3 |
| 60.09 – 61.71 | 2 |
| 61.71 – 63.34 | 3 |
| 63.34 – 64.96 | 3 |
pct_rent_burdened_50plus: histogram
| bin | count |
|---|---|
| 0 – 1.624 | 42 |
| 1.624 – 3.248 | 27 |
| 3.248 – 4.872 | 34 |
| 4.872 – 6.496 | 63 |
| 6.496 – 8.12 | 102 |
| 8.12 – 9.744 | 148 |
| 9.744 – 11.37 | 163 |
| 11.37 – 12.99 | 214 |
| 12.99 – 14.62 | 242 |
| 14.62 – 16.24 | 310 |
| 16.24 – 17.86 | 315 |
| 17.86 – 19.49 | 332 |
| 19.49 – 21.11 | 335 |
| 21.11 – 22.74 | 264 |
| 22.74 – 24.36 | 219 |
| 24.36 – 25.98 | 150 |
| 25.98 – 27.61 | 99 |
| 27.61 – 29.23 | 64 |
| 29.23 – 30.86 | 39 |
| 30.86 – 32.48 | 20 |
| 32.48 – 34.1 | 21 |
| 34.1 – 35.73 | 9 |
| 35.73 – 37.35 | 2 |
| 37.35 – 38.98 | 3 |
| 38.98 – 40.6 | 1 |
| 40.6 – 42.22 | 1 |
| 42.22 – 43.85 | 1 |
| 43.85 – 45.47 | 0 |
| 45.47 – 47.1 | 1 |
| 47.1 – 48.72 | 0 |
| 48.72 – 50.34 | 0 |
| 50.34 – 51.97 | 0 |
| 51.97 – 53.59 | 0 |
| 53.59 – 55.22 | 0 |
| 55.22 – 56.84 | 0 |
| 56.84 – 58.46 | 0 |
| 58.46 – 60.09 | 0 |
| 60.09 – 61.71 | 0 |
| 61.71 – 63.34 | 0 |
| 63.34 – 64.96 | 1 |
pct_renter: histogram
| bin | count |
|---|---|
| 3.01 – 5.435 | 1 |
| 5.435 – 7.859 | 3 |
| 7.859 – 10.28 | 9 |
| 10.28 – 12.71 | 26 |
| 12.71 – 15.13 | 63 |
| 15.13 – 17.56 | 156 |
| 17.56 – 19.98 | 316 |
| 19.98 – 22.41 | 371 |
| 22.41 – 24.83 | 450 |
| 24.83 – 27.26 | 419 |
| 27.26 – 29.68 | 357 |
| 29.68 – 32.11 | 301 |
| 32.11 – 34.53 | 203 |
| 34.53 – 36.96 | 169 |
| 36.96 – 39.38 | 115 |
| 39.38 – 41.81 | 75 |
| 41.81 – 44.23 | 56 |
| 44.23 – 46.66 | 45 |
| 46.66 – 49.08 | 25 |
| 49.08 – 51.5 | 15 |
| 51.5 – 53.93 | 11 |
| 53.93 – 56.35 | 10 |
| 56.35 – 58.78 | 8 |
| 58.78 – 61.2 | 4 |
| 61.2 – 63.63 | 4 |
| 63.63 – 66.05 | 1 |
| 66.05 – 68.48 | 1 |
| 68.48 – 70.9 | 3 |
| 70.9 – 73.33 | 1 |
| 73.33 – 75.75 | 1 |
| 75.75 – 78.18 | 0 |
| 78.18 – 80.6 | 1 |
| 80.6 – 83.03 | 0 |
| 83.03 – 85.45 | 1 |
| 85.45 – 87.88 | 0 |
| 87.88 – 90.3 | 0 |
| 90.3 – 92.73 | 0 |
| 92.73 – 95.15 | 0 |
| 95.15 – 97.58 | 0 |
| 97.58 – 100 | 1 |
median_household_income: histogram
| bin | count |
|---|---|
| -6.667e+08 – -6.5e+08 | 1 |
| -6.5e+08 – -6.333e+08 | 0 |
| -6.333e+08 – -6.167e+08 | 0 |
| -6.167e+08 – -6e+08 | 0 |
| -6e+08 – -5.833e+08 | 0 |
| -5.833e+08 – -5.666e+08 | 0 |
| -5.666e+08 – -5.5e+08 | 0 |
| -5.5e+08 – -5.333e+08 | 0 |
| -5.333e+08 – -5.166e+08 | 0 |
| -5.166e+08 – -5e+08 | 0 |
| -5e+08 – -4.833e+08 | 0 |
| -4.833e+08 – -4.666e+08 | 0 |
| -4.666e+08 – -4.499e+08 | 0 |
| -4.499e+08 – -4.333e+08 | 0 |
| -4.333e+08 – -4.166e+08 | 0 |
| -4.166e+08 – -3.999e+08 | 0 |
| -3.999e+08 – -3.833e+08 | 0 |
| -3.833e+08 – -3.666e+08 | 0 |
| -3.666e+08 – -3.499e+08 | 0 |
| -3.499e+08 – -3.332e+08 | 0 |
| -3.332e+08 – -3.166e+08 | 0 |
| -3.166e+08 – -2.999e+08 | 0 |
| -2.999e+08 – -2.832e+08 | 0 |
| -2.832e+08 – -2.666e+08 | 0 |
| -2.666e+08 – -2.499e+08 | 0 |
| -2.499e+08 – -2.332e+08 | 0 |
| -2.332e+08 – -2.166e+08 | 0 |
| -2.166e+08 – -1.999e+08 | 0 |
| -1.999e+08 – -1.832e+08 | 0 |
| -1.832e+08 – -1.665e+08 | 0 |
| -1.665e+08 – -1.499e+08 | 0 |
| -1.499e+08 – -1.332e+08 | 0 |
| -1.332e+08 – -1.165e+08 | 0 |
| -1.165e+08 – -9.986e+07 | 0 |
| -9.986e+07 – -8.318e+07 | 0 |
| -8.318e+07 – -6.651e+07 | 0 |
| -6.651e+07 – -4.984e+07 | 0 |
| -4.984e+07 – -3.317e+07 | 0 |
| -3.317e+07 – -1.65e+07 | 0 |
| -1.65e+07 – 1.705e+05 | 3221 |
Schema
16 columns

| Column | Type | Nulls | Unique | Alerts |
|---|---|---|---|---|
| fips | numeric | 0.0% | 3,222 | |
| county_name | text | 0.0% | 3,222 | near_unique |
| total_renters | numeric | 0.0% | 2,709 | high_skew, outliers |
| pct_rent_burdened_30plus | numeric | 0.0% | 2,146 | |
| pct_rent_burdened_50plus | numeric | 0.0% | 1,769 | |
| median_gross_rent | numeric | 0.0% | 984 | high_skew, outliers |
| median_household_income | numeric | 0.0% | 3,099 | high_skew, outliers |
| total_housing_units | numeric | 0.0% | 3,074 | high_skew, outliers |
| owner_occupied | numeric | 0.0% | 3,001 | high_skew, outliers |
| renter_occupied | numeric | 0.0% | 2,709 | high_skew, outliers |
| pct_renter | numeric | 0.0% | 1,925 | |
| annual_rent | numeric | 0.0% | 984 | high_skew, outliers |
| rent_to_income_ratio | numeric | 0.0% | 1,278 | high_skew |
| affordability_category | categorical | 0.0% | 3 | imbalance |
| hours_at_min_wage_for_rent | numeric | 0.0% | 230 | high_skew, outliers |
| weeks_at_min_wage_for_rent | numeric | 0.0% | 72 | high_skew, outliers |
fips
numeric identifier
This column is the US county FIPS code, a 4-5 digit geographic identifier that is unique per row (n=3222, n_unique=3222). Values span 1001 to 72153 with no nulls or zeros, consistent with the standard state+county encoding (01001 in Alabama through 72xxx in Puerto Rico); the 4-digit values are codes whose leading zero was dropped by numeric storage. The numeric statistics (mean 31377, skew 0.16) are not meaningful here because the digits encode geography, not magnitude. Treatment: Treat as a categorical geographic key; left-join on it to bring in county-level attributes rather than using it as a numeric feature.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,222
- min: 1,001
- max: 72,153
- mean: 3.138e+04
- median: 30,022
- std: 1.63e+04
- q1: 1.903e+04
- q3: 4.61e+04
- iqr: 27,075
- skew: 0.1574
- kurtosis: -0.6314
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
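Treating fips as a categorical key usually means restoring the leading zero that numeric storage dropped. A minimal sketch, assuming the codes arrive as integers; the helper name `fips_to_key` is hypothetical.

```python
def fips_to_key(fips: int) -> str:
    """Zero-pad a numeric county FIPS code to its 5-character string form."""
    return f"{int(fips):05d}"

# The report's min and max values.
low_key = fips_to_key(1001)
high_key = fips_to_key(72153)

# The first two characters are the state code, the last three the county code.
state_code, county_code = low_key[:2], low_key[2:]
print(low_key, high_key, state_code, county_code)
```

Joining on the padded string avoids mismatches against reference tables that store FIPS codes as text.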
county_name
text identifier near_unique
This column holds fully qualified US county names in the form 'X County, State'. The token 'county,' appears in 2999 of 3222 rows, followed by state tokens such as Texas (256), Virginia (189), and Georgia (159). All 3222 values are unique with zero nulls or duplicates, and lengths run from 16 to 59 characters with a median of 24 and a 95th percentile of 31. The 223 rows missing the 'county,' token likely correspond to parishes (Louisiana), boroughs (Alaska), independent cities, or Puerto Rico municipios, which is worth confirming before any string join. Treatment: Split into county and state fields; since fips is already unique, use the name as a join key only where a FIPS code is unavailable.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,222
- len_min: 16
- len_max: 59
- len_mean: 24.32
- len_median: 24
- len_p95: 31
- word_mean: 3.248
- word_median: 3
- n_empty: 0
- n_duplicates: 0
- duplicate_rate: 0
- vocab_size: 1,990
- readability_flesch_mean: 10.28
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 0
- allcaps_rate: 0
- boilerplate_rate: 0
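The split into county and state fields could look like the sketch below. It assumes every name ends in ", State" as the description suggests; the helper name `split_county_name` and the sample strings are illustrative, and names without a comma are returned with an empty state rather than raising.

```python
def split_county_name(name: str) -> tuple[str, str]:
    """Split 'X County, State' on the last ', ' into (county, state)."""
    head, sep, tail = name.rpartition(", ")
    if not sep:
        # No separator found: keep the whole string as the county part.
        return name.strip(), ""
    return head.strip(), tail.strip()

# Illustrative inputs, including a name without the 'County' token.
samples = [
    "Autauga County, Alabama",
    "Anchorage Municipality, Alaska",
    "No Separator Here",
]
parts = [split_county_name(s) for s in samples]
print(parts)
```

Splitting on the last separator keeps any comma inside the county part intact.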
total_renters
numeric feature high_skew outliers
A count of renters per county, ranging from 28 to 1,810,929 with a median of 2,579.5 but a mean of 13,851, the classic right tail of population or household counts. The distribution is severely skewed (skew 15.82, kurtosis 398.15) with 449 outliers (13.9% of rows) and a standard deviation (55,351) far exceeding the IQR (6,392). No nulls or zeros, and 2,709 unique values across 3,222 rows suggest aggregated geographic units rather than individuals. Every summary statistic matches renter_occupied exactly, so the two columns are probably duplicates and only one should be kept. Treatment: log-transform before regression to tame the heavy right tail.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 2,709
- min: 28
- max: 1.811e+06
- mean: 1.385e+04
- median: 2580
- std: 5.535e+04
- q1: 1004
- q3: 7396
- iqr: 6,392
- skew: 15.82
- kurtosis: 398.2
- n_outliers: 449
- outlier_rate: 0.1394
- zero_rate: 0
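To illustrate what the log transform does to this tail, the sketch below applies `log1p` to the five-number summary reported above. Using `log1p` rather than a plain log is a choice made here so the same transform also works for count columns that contain a zero.

```python
import math

# Min, Q1, median, Q3 and max of total_renters as reported above.
reported = [28, 1004, 2579.5, 7396, 1810929]

# log1p(x) = ln(1 + x); it is monotonic, so the ordering is preserved.
logged = [math.log1p(v) for v in reported]

raw_spread = reported[-1] / reported[0]   # max is tens of thousands of times the min
log_spread = logged[-1] - logged[0]       # the same gap on the log scale
print([round(v, 2) for v in logged], round(log_spread, 2))
```

The transform is reversible with `math.expm1`, so predictions made on the log scale can be mapped back to counts.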
pct_rent_burdened_30plus
numeric feature
This appears to be the share of renter households spending 30% or more of income on rent, expressed as a percentage (0 to 64.96, mean 36.44, median 37.36). The distribution is moderately left-skewed (-0.57) and concentrated, with the middle half of rows between 30.67 and 43.48 (IQR 12.81). Only 0.25% of rows are zero and 1.8% are flagged as outliers, so the metric is well populated and behaves consistently across the 3,222 rows. Treatment: Use as-is as a continuous feature; no transformation is needed given the bounded, only mildly skewed distribution.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 2,146
- min: 0
- max: 64.96
- mean: 36.44
- median: 37.36
- std: 10.01
- q1: 30.67
- q3: 43.48
- iqr: 12.81
- skew: -0.5673
- kurtosis: 0.5032
- n_outliers: 58
- outlier_rate: 0.018
- zero_rate: 0.002483
pct_rent_burdened_50plus
numeric feature
This column reports the percentage of households (presumably renter households) spending 50% or more of income on rent, observed for 3,222 geographies with no nulls. The distribution is roughly symmetric (skew 0.054, kurtosis 0.98) and centered near a 17.35% mean and 17.62% median, with an IQR of 8.56 points. About 0.93% of rows (roughly 30) are exactly zero, which may reflect areas with very few renters. The 47 flagged outliers (1.46%) are not all high values: if the flag uses 1.5×IQR fences (about 0.23 and 34.47), the zero rows fall below the lower fence and account for most of them, with the rest forming a thin upper tail that reaches 64.96%. Treatment: Use as-is for modelling; no transform is needed given the near-symmetric distribution, but inspect the zero rows and consider winsorizing the upper tail.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,769
- min: 0
- max: 64.96
- mean: 17.35
- median: 17.62
- std: 6.577
- q1: 13.07
- q3: 21.63
- iqr: 8.557
- skew: 0.05436
- kurtosis: 0.9823
- n_outliers: 47
- outlier_rate: 0.01459
- zero_rate: 0.009311
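To see which rows an IQR-based outlier flag would catch in this column, the sketch below computes the fences from the reported quartiles. The multiplier of 1.5 is the usual Tukey convention and is an assumption here; the report does not state which rule it uses.

```python
def iqr_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles of pct_rent_burdened_50plus as reported above.
low, high = iqr_fences(13.07, 21.63)

# A value of exactly 0 sits below the lower fence, so zero rows get flagged.
zero_is_outlier = 0.0 < low
print(round(low, 2), round(high, 2), zero_is_outlier)
```

Under this assumption, rows at exactly zero count toward n_outliers even though they sit at the natural floor of a percentage.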
median_gross_rent
numeric feature high_skew outliers
This column reports median gross rent in dollars, with a typical value near the median of 817.5 and an interquartile range of 718 to 978. The data is corrupted by sentinel values: the minimum is -666666666 and the mean is -2068220 with std 37088473, producing extreme negative skew (-17.87) and kurtosis (317.20). A mean that negative implies only about 10 sentinel rows (-2068220 × 3222 / -666666666 ≈ 10), so most of the 235 rows (7.3%) flagged as outliers are probably genuine rents beyond the IQR fences rather than sentinel codes. Treatment: Replace negative sentinel values with nulls before any modelling or aggregation, then re-profile the remaining outliers.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 984
- min: -6.667e+08
- max: 2,805
- mean: -2.068e+06
- median: 817.5
- std: 3.709e+07
- q1: 718
- q3: 978
- iqr: 260
- skew: -17.87
- kurtosis: 317.2
- n_outliers: 235
- outlier_rate: 0.07294
- zero_rate: 0
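The number of sentinel rows can be estimated from the summary statistics alone. The sketch below solves n·mean = k·sentinel + (n − k)·typical for k, using the median as a stand-in for the average of the valid rows; that substitution is an assumption, and the helper name `estimate_sentinel_rows` is hypothetical.

```python
def estimate_sentinel_rows(n: int, mean: float, sentinel: float, typical: float) -> float:
    """Estimate how many rows hold `sentinel`, given the contaminated mean.

    Solves n * mean = k * sentinel + (n - k) * typical for k.
    """
    return n * (mean - typical) / (sentinel - typical)

# median_gross_rent: n, mean, min and median as reported above.
rent_rows = estimate_sentinel_rows(3222, -2068220.0, -666666666.0, 817.5)

# median_household_income, using the figures from its own section.
income_rows = estimate_sentinel_rows(3222, -144603.0, -666666666.0, 60458.5)

print(round(rent_rows, 2), round(income_rows, 2))
```

Because the sentinel is so much larger in magnitude than any real value, the estimate is insensitive to the exact choice of `typical`.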
median_household_income
numeric feature high_skew outliers
This column reports median household income per row (likely county-level given n=3222), with 3099 unique values and no nulls. The minimum of -666666666 is a classic sentinel for missing data; it appears to occur in a single row (3,221 of 3,222 values fall in the top histogram bin) and on its own drags the mean to -144603 despite a median of 60458.5. Skew of -56.7 and kurtosis of 3217 confirm the contamination. One row barely moves the quartiles, so the IQR of 18561.5 (51814.75 to 70376.25) already looks like a plausible income distribution, with 188 flagged outliers (5.8%). Treatment: Recode the -666666666 sentinel to null, then consider a log or robust scaler before modelling.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,099
- min: -6.667e+08
- max: 170,463
- mean: -1.446e+05
- median: 6.046e+04
- std: 1.175e+07
- q1: 5.181e+04
- q3: 7.038e+04
- iqr: 1.856e+04
- skew: -56.74
- kurtosis: 3217
- n_outliers: 188
- outlier_rate: 0.05835
- zero_rate: 0
total_housing_units
numeric feature high_skew outliers
Counts of housing units per county, with a median of 10,021 units across 3,222 rows. The mean (about 39,400) is close to the sum of the owner_occupied and renter_occupied means (about 25,550 and 13,850), which suggests the column counts occupied units. The distribution is severely right-skewed (skew 12.05, kurtosis 240.5) with a max of 3,363,093 against a Q3 of just 25,939, and 13.7% of rows flag as outliers. No nulls or zeros, and 3,074 unique values out of 3,222 suggest near-distinct totals per area. Treatment: Log-transform before regression to tame the heavy right tail.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,074
- min: 32
- max: 3.363e+06
- mean: 3.94e+04
- median: 10,021
- std: 1.201e+05
- q1: 4211
- q3: 25,939
- iqr: 2.173e+04
- skew: 12.05
- kurtosis: 240.5
- n_outliers: 443
- outlier_rate: 0.1375
- zero_rate: 0
owner_occupied
numeric feature high_skew outliers
Likely a count of owner-occupied housing units per geographic area, given the integer-like range from 0 to 1,552,164 and median of 7,325.5. The distribution is severely right-skewed (skew 9.52, kurtosis 146.9) with 429 outliers (13.3% rate) and a mean (25,551.7) far above the median, indicating a long tail of high-population areas. Near-unique values (3,001 of 3,222) and effectively no zeros (0.03%, a single row) are consistent with a per-region count rather than a categorical flag. Treatment: log-transform before regression to tame the heavy right tail, using log1p so the one zero row stays defined.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3,001
- min: 0
- max: 1.552e+06
- mean: 2.555e+04
- median: 7326
- std: 6.755e+04
- q1: 3148
- q3: 1.886e+04
- iqr: 1.572e+04
- skew: 9.516
- kurtosis: 146.9
- n_outliers: 429
- outlier_rate: 0.1331
- zero_rate: 0.0003104
renter_occupied
numeric feature high_skew outliers
Counts of renter-occupied housing units per record, ranging from 28 to 1,810,929 with a median of 2,579.5 but a mean of 13,851. The distribution is severely right-skewed (skew 15.82, kurtosis 398.15) and 13.9% of rows fall outside the IQR fence, consistent with a small number of very large geographies dominating the tail. The statistics are identical to those of total_renters, so the two columns are probably duplicates. Treatment: Log-transform before modelling to tame the heavy right tail, and keep only one of the two duplicated columns.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 2,709
- min: 28
- max: 1.811e+06
- mean: 1.385e+04
- median: 2580
- std: 5.535e+04
- q1: 1004
- q3: 7396
- iqr: 6,392
- skew: 15.82
- kurtosis: 398.2
- n_outliers: 449
- outlier_rate: 0.1394
- zero_rate: 0
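Since the statistics for this column match those of total_renters, it may be worth confirming row by row whether the two are identical. A minimal sketch, assuming a pandas DataFrame; the helper name `duplicated_column_pairs` is hypothetical and the demo values are made up for illustration.

```python
import pandas as pd

def duplicated_column_pairs(df: pd.DataFrame) -> list[tuple[str, str]]:
    """List pairs of columns whose values are identical row by row."""
    cols = list(df.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            # Series.equals requires the same values and the same dtype.
            if df[a].equals(df[b]):
                pairs.append((a, b))
    return pairs

# Made-up rows; only the column names come from the report.
demo = pd.DataFrame({
    "total_renters": [28, 1004, 7396],
    "renter_occupied": [28, 1004, 7396],
    "owner_occupied": [0, 3148, 18860],
})
pairs = duplicated_column_pairs(demo)
print(pairs)
```

If the pair is confirmed on the real data, dropping one of the columns avoids perfectly collinear features in a regression.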
pct_renter
numeric feature
Percent of housing units that are renter-occupied, reported per row across 3222 records with no nulls and no zeros. Values span 3.01 to 100.0 with a mean of 27.35 and median of 26.07, and the distribution is right-skewed (skew 1.32, kurtosis 4.41) with 88 outliers (2.7%), nearly all on the high side. The 100.0 maximum is worth checking: it implies one fully renter-occupied geography, which would be consistent with the single row where owner_occupied is 0. Treatment: Mild right skew; consider a log1p or sqrt transform before linear modelling and inspect the 100.0 case.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,925
- min: 3.01
- max: 100
- mean: 27.35
- median: 26.07
- std: 8.564
- q1: 21.64
- q3: 31.66
- iqr: 10.02
- skew: 1.317
- kurtosis: 4.412
- n_outliers: 88
- outlier_rate: 0.02731
- zero_rate: 0
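One way to inspect the 100.0 case is to recompute the percentage from the unit counts. The formula below is an assumption (renter-occupied units over owner-occupied plus renter-occupied units); the report does not state how pct_renter was derived, and the input values are illustrative.

```python
def pct_renter(owner_occupied: float, renter_occupied: float) -> float:
    """Assumed definition: renter-occupied share of occupied units, in percent."""
    occupied = owner_occupied + renter_occupied
    if occupied == 0:
        return float("nan")  # undefined when there are no occupied units
    return 100.0 * renter_occupied / occupied

# A row with no owner-occupied units yields exactly 100 percent.
all_renters = pct_renter(0, 28)

# An illustrative mixed row.
mixed = pct_renter(75, 25)
print(all_renters, mixed)
```

Comparing the recomputed value with the stored column on the real data would show whether this assumed definition holds.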
annual_rent
numeric feature high_skew outliers
Likely an annual rent amount in dollars. Every reported statistic is exactly 12 times the corresponding median_gross_rent value (median 9810 = 12 × 817.5, quartiles 8616 and 11736 = 12 × 718 and 12 × 978), so the column appears to be derived from it. The sentinel was multiplied along with the real values: the min of -7999999992 is 12 × -666666666, and the resulting mean of -24818640.7 is impossible for rent, driving extreme skew (-17.87) and kurtosis (317.2). About 7.3% of rows (235) flag as outliers while 0% are null or zero, which suggests missing values were encoded as large negatives rather than NaN. Treatment: Replace the negative values with NaN (or recompute from a cleaned median_gross_rent), then winsorize or log-transform before modelling.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 984
- min: -8e+09
- max: 33,660
- mean: -2.482e+07
- median: 9,810
- std: 4.451e+08
- q1: 8,616
- q3: 11,736
- iqr: 3,120
- skew: -17.87
- kurtosis: 317.2
- n_outliers: 235
- outlier_rate: 0.07294
- zero_rate: 0
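The relationship between this column and median_gross_rent can be checked directly from the two sets of reported statistics. The sketch below uses only figures quoted in this report; that annual_rent was computed as 12 times the monthly figure is an inference, not something the report states.

```python
# Reported statistics for median_gross_rent (monthly) and annual_rent.
monthly = {"min": -666666666, "q1": 718, "median": 817.5, "q3": 978, "max": 2805}
annual = {"min": -7999999992, "q1": 8616, "median": 9810, "q3": 11736, "max": 33660}

# Ratio of each annual statistic to its monthly counterpart.
ratios = {name: annual[name] / monthly[name] for name in monthly}
is_twelve_times = all(r == 12.0 for r in ratios.values())
print(ratios, is_twelve_times)
```

The sentinel minimum scales by the same factor as the valid values, which is why cleaning median_gross_rent first and recomputing is the simpler route.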
rent_to_income_ratio
numeric feature high_skew
Likely a rent-to-income ratio expressed in percent, with a tight interquartile range of 15.07 to 19.39 and a median of 17.05. However, the column is severely corrupted: the minimum is -24357569.09, the mean is -37244.13, std is 752361.7, skew is -22.74 and kurtosis is 570.21, values that are implausible for a ratio. 114 outliers (3.54%) are flagged. The max of 1200.0 is probably an artifact as well: it equals 100 × (-7999999992 / -666666666), which is what a row would produce if both annual_rent and median_household_income held the sentinel. The negatives are likewise consistent with sentinel rents divided by valid incomes. Treatment: Null the sentinels in the source columns and recompute the ratio, or null every negative value and the 1200.0 row; then consider a robust scaler or log-transform before modelling.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 1,278
- min: -2.436e+07
- max: 1,200
- mean: -3.724e+04
- median: 17.05
- std: 7.524e+05
- q1: 15.07
- q3: 19.39
- iqr: 4.317
- skew: -22.74
- kurtosis: 570.2
- n_outliers: 114
- outlier_rate: 0.03538
- zero_rate: 0
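The extreme values in this column can be traced back to sentinel inputs with a little arithmetic. The sketch below assumes the ratio is defined as 100 × annual_rent / median_household_income, which is not confirmed by the report; the function name `rent_to_income_pct` is hypothetical.

```python
import math

SENTINEL_RENT = -7999999992    # annual_rent minimum quoted in this report
SENTINEL_INCOME = -666666666   # median_household_income minimum

def rent_to_income_pct(annual_rent: float, income: float) -> float:
    """Assumed formula: annual rent as a percentage of household income."""
    if annual_rent < 0 or income <= 0:
        return math.nan  # treat sentinel or invalid inputs as missing
    return 100.0 * annual_rent / income

# Raw division when both inputs are sentinels: reproduces the reported max.
both_sentinel = 100.0 * SENTINEL_RENT / SENTINEL_INCOME

# Income implied by the reported minimum ratio if the rent was a sentinel.
implied_income = 100.0 * SENTINEL_RENT / -24357569.09

print(both_sentinel, round(implied_income))
```

The implied income lands on a whole-dollar figure in a plausible range, which supports the reading that the negative ratios come from sentinel rents divided by valid incomes.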
affordability_category
categorical label imbalance
A 3-level categorical bucket classifying affordability, almost certainly derived from a rent-to-income ratio. The distribution is nearly degenerate: 'Affordable' covers 3192 of 3222 rows (top_rate 0.9907), 'Moderately Burdened' has 29, and 'Extremely Burdened' has just 1, yielding an entropy ratio of 0.049. With effectively no variance, this column carries little discriminative signal. Treatment: Drop or collapse to binary (Affordable vs. Burdened); too imbalanced for direct modelling.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 3
- top_value: Affordable
- top_rate: 0.9907
- cardinality: 3
- entropy: 0.07815
- entropy_ratio: 0.04931
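The entropy figures can be reproduced from the category counts. The sketch below assumes Shannon entropy in bits, normalized by the log of the number of categories; that choice matches the reported values but is inferred rather than documented.

```python
import math

# Category counts from the value-count table in this report.
counts = {"Affordable": 3192, "Moderately Burdened": 29, "Extremely Burdened": 1}
total = sum(counts.values())

# Shannon entropy in bits, then normalized by the maximum for 3 categories.
entropy_bits = -sum((c / total) * math.log2(c / total) for c in counts.values())
entropy_ratio = entropy_bits / math.log2(len(counts))

# Collapsing to binary, as the treatment suggests.
burdened = total - counts["Affordable"]
print(round(entropy_bits, 5), round(entropy_ratio, 5), burdened)
```

Even after collapsing to two classes, the minority class holds only 30 rows, so any model using it as a target would need resampling or class weights.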
hours_at_min_wage_for_rent
numeric feature high_skew outliers
This column appears to be the number of minimum-wage hours required to afford rent, with a typical value around 113 hours (median) and an interquartile range of 99 to 135. The median is consistent with median_gross_rent divided by the $7.25 federal minimum wage (817.5 / 7.25 ≈ 113), and the minimum of -91,954,023 is the -666666666 sentinel put through the same division. The mean of -285,271 implies about 10 such rows (-285,271 × 3222 / -91,954,023 ≈ 10), enough to produce the extreme skew (-17.87) and kurtosis (317.20). The 232 flagged outliers (7.2%) therefore likely include many legitimate high values alongside the sentinel rows. Treatment: Filter or null out negative sentinel values, then consider a log or robust scaling before modelling.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 230
- min: -9.195e+07
- max: 387
- mean: -2.853e+05
- median: 113
- std: 5.116e+06
- q1: 99
- q3: 135
- iqr: 36
- skew: -17.87
- kurtosis: 317.2
- n_outliers: 232
- outlier_rate: 0.072
- zero_rate: 0
weeks_at_min_wage_for_rent
numeric feature high_skew outliers
Likely the number of 40-hour weeks of minimum-wage labor required to cover rent, with a typical value near 2.8 weeks and an interquartile range of 2.5 to 3.4. The minimum of -2,298,850.6 matches the hours_at_min_wage_for_rent minimum divided by 40 (-91,954,023 / 40), so the extreme negatives look like propagated sentinel values rather than unit or sign errors; they pull the mean to -7,131.79 and drive skew of -17.87 and kurtosis of 317.2. 7.2% of rows (232) are flagged as outliers. Treatment: Null the negative sentinel values, or recompute the column from a cleaned rent field, before any modelling or aggregation.
- n: 3,222
- nulls: 0 (0.0%)
- unique: 72
- min: -2.299e+06
- max: 9.7
- mean: -7132
- median: 2.8
- std: 1.279e+05
- q1: 2.5
- q3: 3.4
- iqr: 0.9
- skew: -17.87
- kurtosis: 317.2
- n_outliers: 232
- outlier_rate: 0.072
- zero_rate: 0
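The two minimum-wage columns can be rebuilt from a cleaned rent value instead of being patched in place. The sketch below assumes they were derived as rent / 7.25 (the US federal minimum wage) and hours / 40; both constants are inferred from the reported minimums and medians, not stated in the report, and the function name `hours_and_weeks` is hypothetical.

```python
import math

FEDERAL_MIN_WAGE = 7.25   # dollars per hour; assumed divisor
HOURS_PER_WEEK = 40       # assumed full-time week

def hours_and_weeks(median_gross_rent: float) -> tuple[float, float]:
    """Hours and weeks of minimum-wage work needed to cover monthly rent."""
    if median_gross_rent < 0:
        return math.nan, math.nan  # sentinel rent: leave both outputs missing
    hours = median_gross_rent / FEDERAL_MIN_WAGE
    return hours, hours / HOURS_PER_WEEK

# Raw division of the sentinel, for comparison with the reported minimums.
raw_hours = -666666666 / FEDERAL_MIN_WAGE
raw_weeks = raw_hours / HOURS_PER_WEEK

typical_hours, typical_weeks = hours_and_weeks(817.5)
print(round(raw_hours), round(raw_weeks, 1), round(typical_hours), round(typical_weeks, 1))
```

Under these assumptions the raw sentinel division lands on the reported minimums of both columns, and the median rent lands on their reported medians.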