This dataset contains 3,222 rows of U.S. county-level median gross rent figures, keyed by county name and FIPS code. The standout issue is the median_gross_rent column: the median is a plausible $817.50 and the IQR runs from $718 to $978, but the minimum is -666,666,666, which drags the mean to roughly -2.07M and produces extreme skew (-17.87) and kurtosis (317.2). That sentinel-style negative value should be treated as missing, and the 235 flagged outliers (7.3%) should be reviewed, before any analysis. The fips column is well-behaved and unique per row, and county_name is essentially an identifier (3,222 unique values), so neither needs deep inspection beyond confirming coverage.
saturn
/home/coolhand/html/datavis/data_trove/cache/median_rents.parquet · 3,222 rows · sample n=3,222 · seed 42 · 2026-05-01T16:53:40+00:00
Overview
| Field | Value |
| --- | --- |
| Source | /home/coolhand/html/datavis/data_trove/cache/median_rents.parquet |
| Total rows | 3,222 |
| Profiled sample | 3,222 |
| Columns | 3 |
| Generated | 2026-05-01T16:53:40+00:00 |
Insights (opt-in)
Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.
The fips column is the county-level FIPS code: an integer geographic identifier that is unique and non-null in all 3,222 rows. The range (1001 to 72153) and distribution (mean 31377, median 30022, low skew 0.16) are consistent with the standard 5-digit state+county FIPS scheme covering US states and territories. Because the values are stored as integers, leading zeros are dropped (1001 corresponds to the code 01001). Nothing here is anomalous; the column behaves as a clean primary key rather than a numeric feature.
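The state+county structure described above can be recovered from the integer values. The following is a minimal sketch, assuming the integers are ordinary 5-digit county FIPS codes with the leading zero dropped; the helper name `split_fips` is hypothetical and not part of saturn or the dataset.

```python
def split_fips(fips: int) -> tuple[str, str]:
    """Split an integer county FIPS code into (state, county) string parts.

    Assumes a 5-digit code whose leading zero may have been lost
    when the value was stored as an integer.
    """
    if not 0 <= fips <= 99999:
        raise ValueError(f"not a 5-digit FIPS code: {fips}")
    code = f"{fips:05d}"       # restore the leading zero, e.g. 1001 -> "01001"
    return code[:2], code[2:]  # 2-digit state part, 3-digit county part


# The two endpoints of the range reported in the profile.
low_state, low_county = split_fips(1001)
high_state, high_county = split_fips(72153)
print(low_state, low_county)    # state 01 is Alabama
print(high_state, high_county)  # state 72 is Puerto Rico
```

Keeping the code as a zero-padded string is usually safer for joins against other county-level tables, which tend to store FIPS as text.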
This column holds fully-qualified US county names (e.g. 'X County, Texas'), with the word 'county,' appearing in 2999 of 3222 rows and state names like Texas (256), Virginia (189) and Georgia (159) dominating the remaining tokens. Every one of the 3222 values is unique with zero nulls or duplicates, so it functions as a row identifier rather than a categorical feature. String lengths are tight (16-59 chars, median 24) and there is no boilerplate, URL or emoji noise.
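Since each value combines a county and a state, it may be convenient to split them for grouping. This is a small sketch, assuming every value follows the "Name, State" pattern seen in the sample values; the helper name `split_county_name` is hypothetical, and the report does not confirm that all 3,222 rows match this pattern.

```python
def split_county_name(full_name: str) -> tuple[str, str]:
    """Split 'Bibb County, Alabama' into ('Bibb County', 'Alabama').

    Splits on the last ', ' so that a comma inside the county part
    would not break the state part.
    """
    county, sep, state = full_name.rpartition(", ")
    if not sep:
        raise ValueError(f"no ', ' separator in {full_name!r}")
    return county, state


# Two names taken from the sample values listed in this report.
samples = ["Bibb County, Alabama", "Liberty County, Texas"]
pairs = [split_county_name(name) for name in samples]
print(pairs)
```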
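The median_gross_rent column holds median gross rent in dollars, with an interquartile range of 718 to 978 around a median of 817.5. The minimum of -666666666 is almost certainly a sentinel for missing data; it drags the mean to -2068220 and produces extreme skew (-17.87) and kurtosis (317.20). The reported mean implies roughly 10 sentinel rows (3222 × 2068220 ÷ 666666666 ≈ 10), so most of the 235 flagged outliers (7.3%) are likely genuine rents far from the bulk of the distribution rather than sentinels. The null_rate of 0 therefore understates the amount of missing data.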
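One way to handle the sentinel is to mask it before computing any statistics. The sketch below uses only the standard library and a hypothetical five-value toy list, not rows from the dataset. The fence calculation assumes Tukey's 1.5 × IQR rule, which the report does not confirm is the rule saturn uses for its outlier count.

```python
import statistics

# Minimum reported in the profile; treated here as a missing-data code (assumption).
SENTINEL = -666_666_666


def mask_sentinel(values, sentinel=SENTINEL):
    """Replace the sentinel with None so it cannot distort statistics."""
    return [None if v == sentinel else v for v in values]


def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences for the given quartiles."""
    spread = k * (q3 - q1)
    return q1 - spread, q3 + spread


# Hypothetical toy values for illustration only.
raw = [718, 817, 978, SENTINEL, 1200]
cleaned = mask_sentinel(raw)
valid = [v for v in cleaned if v is not None]

mean_raw = statistics.mean(raw)      # dominated by the sentinel
mean_valid = statistics.mean(valid)  # mean of the plausible rents only

# Fences computed from the quartiles reported in the profile (718 and 978).
low, high = tukey_fences(718, 978)
print(mean_raw, mean_valid, low, high)
```

Under that assumed rule, any rent above the upper fence would be flagged alongside the sentinel, so dropping all flagged rows would also discard real high-rent counties. Masking the sentinel first and then re-profiling is the more conservative order.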
Numeric correlation
fips (numeric)
county_name (text)
Sample values (first 10)
- Bibb County, Alabama
- Cheatham County, Tennessee
- Piute County, Utah
- Lamb County, Texas
- Martin County, Minnesota
- Sheridan County, Wyoming
- Chickasaw County, Mississippi
- Rockingham County, Virginia
- Liberty County, Texas
- Clark County, Arkansas