median rents

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/html/datavis/data_trove/cache/median_rents.parquet

Saturn profiled 3,222 rows across 3 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
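# Install the profiler (IPython shell escape, so this cell runs inside a notebook).
!pip install saturn-dissect
import subprocess

# Run `saturn analyze` on the parquet file with the findings path and LLM used for this report.
subprocess.run(
    [
        "saturn", "analyze", "/home/coolhand/html/datavis/data_trove/cache/median_rents.parquet",
        "--findings", "median_rents.json",
        "--llm", "anthropic:claude-opus-4-7",
    ],
    check=True,  # raise CalledProcessError if saturn exits with a non-zero status
)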

Summary · confidence high

This dataset contains 3,222 rows of U.S. county-level median gross rent figures, keyed by county name and FIPS code. The standout issue is the median_gross_rent column: the median is a plausible $817.50 and the IQR runs $718 to $978, but the minimum is -666,666,666, which drags the mean to roughly -2.07M and produces extreme skew (-17.87) and kurtosis (317.2). The histogram places 10 rows in the lowest bin, so that sentinel-style value affects only a handful of rows and should be replaced with a missing value before any analysis. The 235 flagged outliers (7.3%) are a broader set, defined by the 1.5 × IQR rule, and deserve a separate look rather than blanket removal. The fips column is well-behaved and unique per row, and county_name is effectively an identifier (3,222 unique values), so neither needs deep inspection beyond confirming coverage.

citing: median_gross_rent.stats.median · median_gross_rent.stats.mean · median_gross_rent.stats.min · median_gross_rent.stats.max · median_gross_rent.stats.q1 · median_gross_rent.stats.q3 · median_gross_rent.stats.skew · median_gross_rent.stats.kurtosis · median_gross_rent.stats.n_outliers · median_gross_rent.stats.outlier_rate · fips.stats.min · fips.stats.max · row_count · column_count
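
The reported mean can be roughly reproduced from the summary figures alone. The sketch below assumes that all 10 rows in the lowest histogram bin hold exactly -666,666,666 (the histogram suggests this but does not prove it) and backs out the mean of the remaining rows. The variable names are illustrative, and the reported mean is rounded, so the result is approximate.

```python
# Figures taken from the profile above.
ROW_COUNT = 3222
REPORTED_MEAN = -2_068_220   # median_gross_rent.stats.mean (rounded in the report)
SENTINEL = -666_666_666      # median_gross_rent.stats.min

# Assumption: every row in the lowest histogram bin holds the sentinel.
N_SENTINEL = 10

total = REPORTED_MEAN * ROW_COUNT
sentinel_total = SENTINEL * N_SENTINEL
valid_rows = ROW_COUNT - N_SENTINEL

# Part of the overall mean contributed by the sentinel rows alone.
sentinel_share_of_mean = sentinel_total / ROW_COUNT

# Mean of the remaining rows implied by the reported mean.
implied_valid_mean = (total - sentinel_total) / valid_rows

print(f"sentinel contribution to mean: {sentinel_share_of_mean:,.0f}")
print(f"implied mean of valid rows:    {implied_valid_mean:,.0f}")
```

Under that assumption the 10 sentinel rows account for essentially all of the -2.07M mean, and the remaining 3,212 rows average roughly $890, which is in line with the $817.50 median.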

Out[4]:

saturn.schema() · 3 columns

column kind n null% unique alerts
fips numeric 3,222 0.0% 3,222
county_name text 3,222 0.0% 3,222 near_unique
median_gross_rent numeric 3,222 0.0% 984 high_skew outliers
Fig 1.
median_gross_rent · Expect an extreme left tail from the -666,666,666 sentinel; filter it out to see the real distribution centered near $817.
Histogram bins for median_gross_rent (median: 817.5).
bin                       count
-6.667e+08 – -6.5e+08     10
-6.5e+08 – -1.666e+07     0 (38 consecutive empty bins)
-1.666e+07 – 2805         3,212
Fig 2.
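fips · Counts vary by bin because states differ in county count; what matters for coverage is that codes span the full range. The empty stretch from roughly 56,140 to 70,370 and the 78 rows in the final bin are consistent with a jump from state prefixes to territory prefixes.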
Histogram bins for fips (median: 30022.0).
bin                      count
1001 – 2780              97
2780 – 4559              15
4559 – 6337              133
6337 – 8116              59
8116 – 9895              14
9895 – 1.167e+04         4
1.167e+04 – 1.345e+04    226
1.345e+04 – 1.523e+04    5
1.523e+04 – 1.701e+04    49
1.701e+04 – 1.879e+04    189
1.879e+04 – 2.057e+04    204
2.057e+04 – 2.235e+04    184
2.235e+04 – 2.413e+04    39
2.413e+04 – 2.59e+04     15
2.59e+04 – 2.768e+04     170
2.768e+04 – 2.946e+04    196
2.946e+04 – 3.124e+04    150
3.124e+04 – 3.302e+04    27
3.302e+04 – 3.48e+04     21
3.48e+04 – 3.658e+04     95
3.658e+04 – 3.836e+04    153
3.836e+04 – 4.013e+04    155
4.013e+04 – 4.191e+04    46
4.191e+04 – 4.369e+04    67
4.369e+04 – 4.547e+04    51
4.547e+04 – 4.725e+04    161
4.725e+04 – 4.903e+04    268
4.903e+04 – 5.081e+04    29
5.081e+04 – 5.259e+04    133
5.259e+04 – 5.436e+04    94
5.436e+04 – 5.614e+04    95
5.614e+04 – 7.037e+04    0 (8 consecutive empty bins)
7.037e+04 – 7.215e+04    78
Fig 3.
county_name · County name lengths cluster tightly around 24 characters; useful as a sanity check that no rows are truncated or malformed.
Character-length distribution for county_name (mean: 24.32).
chars      count
16 – 17    26
17 – 18    72
18 – 19    121
19 – 20    190
20 – 21    264
21 – 22    407
22 – 24    420
24 – 25    363
25 – 26    320
26 – 27    240
27 – 28    231
28 – 29    152
29 – 30    139
30 – 31    165
31 – 32    41
32 – 33    28
33 – 34    16
34 – 35    10
35 – 36    5
36 – 38    0
38 – 39    1
39 – 40    1
40 – 41    0
41 – 42    1
42 – 43    1
43 – 44    0
44 – 45    2
45 – 46    0
46 – 47    1
47 – 48    1
48 – 49    0
49 – 50    0
50 – 51    0
51 – 53    0
53 – 54    2
54 – 55    1
55 – 56    0
56 – 57    0
57 – 58    0
58 – 59    1
Fig 4.
Per-column null rate across the corpus. Columns are ordered by input position.
column             kind     null %
fips               numeric  0.0%
county_name        text     0.0%
median_gross_rent  numeric  0.0%
Fig 5.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 2 numeric columns (values clipped to 2 decimals).
                   fips   median_gross_rent
fips               +1.00  -0.06
median_gross_rent  -0.06  +1.00
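
The coefficient between fips and median_gross_rent is best read as a sanity check: fips is an identifier, and the sampled rows may include the -666,666,666 sentinel. A minimal sketch of how one sentinel value can swing Pearson's r, using made-up toy data rather than rows from this dataset (the `pearson` helper is illustrative, not part of saturn):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy data (hypothetical): a perfectly linear relationship.
keys = [1, 2, 3, 4, 5, 6]
rents = [10, 20, 30, 40, 50, 60]
r_clean = pearson(keys, rents)

# Same data with the last value replaced by the sentinel.
contaminated = rents[:-1] + [-666_666_666]
r_contaminated = pearson(keys, contaminated)

print(round(r_clean, 3), round(r_contaminated, 3))
```

In this toy case a single contaminated value turns a perfect positive correlation into a clearly negative one, which is why correlations on this column are worth recomputing after the sentinel is treated as missing.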

fips · numeric · identifier

This is the county-level FIPS code: an integer geographic identifier where every one of the 3,222 rows is unique and non-null. The range (1001 to 72153) and distribution (mean 31377, median 30022, low skew 0.16) are consistent with the standard 5-digit state+county FIPS scheme covering US states and territories. Nothing here is anomalous; the column behaves as a clean primary key rather than a numeric feature.

Treatment: Treat as a categorical key for joins; do not use as a numeric feature.
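
Because the column is stored as an integer, codes below 10000 have lost their leading zero: the minimum, 1001, is written 01001 in the standard 5-digit scheme. A minimal sketch of turning the integers back into string keys before a join, assuming the reference data uses zero-padded strings; the helper names are illustrative and not part of saturn.

```python
def format_fips(code: int) -> str:
    """Render an integer county FIPS code as a zero-padded 5-character string."""
    if not 0 < code <= 99_999:
        raise ValueError(f"not a 5-digit county FIPS code: {code}")
    return f"{code:05d}"

def split_fips(code: int) -> tuple[str, str]:
    """Split a county FIPS code into (2-digit state prefix, 3-digit county suffix)."""
    key = format_fips(code)
    return key[:2], key[2:]

print(format_fips(1001))   # 01001
print(split_fips(72153))   # ('72', '153')
```

Keeping the key as a string also prevents it from being picked up as a numeric feature by accident.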

anthropic:claude-opus-4-7 · confidence high
Out[11]:

saturn.columns["fips"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
min 1,001
max 72,153
mean 3.138e+04
median 30,022
std 1.63e+04
q1 1.903e+04
q3 4.61e+04
iqr 27,075
skew 0.1574
kurtosis -0.6314
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 6.
Distribution of fips. Vertical dash marks the median.

county_name · text · identifier

This column holds fully-qualified US county names (e.g. 'X County, Texas'), with the word 'county,' appearing in 2,999 of 3,222 rows and state names like Texas (256), Virginia (189) and Georgia (159) dominating the remaining tokens. Every one of the 3,222 values is unique with zero nulls or duplicates, so it functions as a row identifier rather than a categorical feature. String lengths range from 16 to 59 characters but concentrate around the median of 24 (95th percentile 31), and there is no boilerplate, URL or emoji noise.

Treatment: Split into county and state fields, or use as a join key against county-level reference data.
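
A minimal sketch of the split, assuming every value follows the '<county part>, <state>' pattern shown in the example above. The profile only confirms 'county,' in 2,999 rows, so the remaining 223 are assumed, not verified, to end in ', <state>' as well; the function name is illustrative.

```python
def split_county_state(name: str) -> tuple[str, str]:
    """Split 'X County, Texas' into ('X County', 'Texas')."""
    # Split on the last comma so a comma inside the county part stays intact.
    county, sep, state = name.rpartition(",")
    if not sep:
        raise ValueError(f"no ', <state>' suffix found in: {name!r}")
    return county.strip(), state.strip()

print(split_county_state("X County, Texas"))  # ('X County', 'Texas')
```

Raising on values without a comma makes any row that breaks the assumed pattern visible instead of silently producing an empty county field.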

anthropic:claude-opus-4-7 · confidence high
Out[14]:

saturn.columns["county_name"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
len_min 16
len_max 59
len_mean 24.32
len_median 24
len_p95 31
word_mean 3.248
word_median 3
n_empty 0
n_duplicates 0
duplicate_rate 0
vocab_size 1,990
readability_flesch_mean 10.28
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 100.0% of rows are unique strings
Fig 7.
Character-length distribution for county_name.

median_gross_rent · numeric · feature

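Median gross rent in dollars, with a healthy interquartile range of 718 to 978 around a median of 817.5. The minimum of -666,666,666 is almost certainly a sentinel for missing data; it drags the mean to -2,068,220 and produces extreme skew (-17.87) and kurtosis (317.20). The histogram places only 10 rows in the lowest bin, so the sentinel accounts for a small share of the 235 outliers (7.3%); the rest are values beyond the 1.5 × IQR fences. Because the sentinel is stored as a number, the null rate reads 0 even though those rows are effectively missing.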

Treatment: Replace the -666666666 sentinel with NA before any modelling or aggregation.
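
A minimal sketch of that treatment in plain Python, using `None` as the missing marker and a handful of made-up values in place of the real 3,222-row column; the function names are illustrative.

```python
from statistics import mean, median

SENTINEL = -666_666_666

def clean_rents(values):
    """Replace the sentinel with None so it is treated as missing."""
    return [None if v == SENTINEL else v for v in values]

def summarize(values):
    """Mean and median over the non-missing values only."""
    present = [v for v in values if v is not None]
    return {
        "n_missing": len(values) - len(present),
        "mean": mean(present),
        "median": median(present),
    }

# Toy values (hypothetical), with one sentinel row.
raw = [718, 817, 818, 978, SENTINEL]
print(summarize(clean_rents(raw)))
# {'n_missing': 1, 'mean': 832.75, 'median': 817.5}
```

Counting the replaced rows alongside the statistics keeps the missingness visible, which the original null rate of 0 does not.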

anthropic:claude-opus-4-7 · confidence high
Out[17]:

saturn.columns["median_gross_rent"].stats

stat value
n 3,222
nulls 0 (0.0%)
unique 3,222
min -6.667e+08
max 2,805
mean -2.068e+06
median 817.5
std 3.709e+07
q1 718
q3 978
iqr 260
skew -17.87
kurtosis 317.2
n_outliers 235
outlier_rate 0.07294
zero_rate 0
alert: high_skew · skew=-17.87
alert: outliers · 7.3% rows beyond 1.5 IQR
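
The outlier alert can be made concrete with the reported quartiles. The sketch below assumes the "beyond 1.5 IQR" rule is the usual Tukey fence measured from q1 and q3; the report does not spell out the exact definition saturn uses.

```python
Q1, Q3 = 718, 978      # median_gross_rent.stats.q1 and q3
IQR = Q3 - Q1          # 260, matching the reported iqr

# Assumed Tukey fences: 1.5 x IQR below q1 and above q3.
lower_fence = Q1 - 1.5 * IQR
upper_fence = Q3 + 1.5 * IQR

def is_outlier(value: float) -> bool:
    """True when value falls outside the 1.5 x IQR fences."""
    return value < lower_fence or value > upper_fence

print(lower_fence, upper_fence)    # 328.0 1368.0
print(is_outlier(-666_666_666))    # True
print(is_outlier(2805))            # True
print(is_outlier(817.5))           # False
```

Under that definition the column maximum of 2,805 is flagged as well, so the outlier set mixes sentinel rows with rents that may be genuine.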
Fig 8.
Distribution of median_gross_rent. Vertical dash marks the median.

How to cite

BibTeX
@misc{saturn-median-rents-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: median rents},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/median_rents}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: median rents. Source: /home/coolhand/html/datavis/data_trove/cache/median_rents.parquet. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/median_rents