saturn

/home/coolhand/html/datavis/data_trove/data/urban/nyc_housing/nyc_rent_burden_by_tract.csv 2,327 rows sample n=2,327 seed 42 2026-05-01T17:26:11+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/data/urban/nyc_housing/nyc_rent_burden_by_tract.csv
Total rows: 2,327
Profiled sample: 2,327
Columns: 16
Generated: 2026-05-01T17:26:11+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset covers 2,327 NYC census tracts with 16 columns describing renter households and rent burden levels across the five boroughs. All tracts are in New York State (state is constant at 36, the state's FIPS code) and split across five counties, with Brooklyn (Kings) the largest at about 34.6% of tracts (805) and Staten Island the smallest at 126 tracts. The headline housing-affordability metric, pct_rent_burdened, is roughly symmetric around a median of 50%, with the middle half of tracts falling between 40.9% and 58.8% (IQR 17.9), indicating that in a typical tract about half of renters spend 30% or more of income on rent. The raw count columns (rent_burdened, rent_50_pct_or_more, total_renter_households) are right-skewed with notable outliers, so use the burden percentages for cross-tract comparison and reserve the count fields for identifying the highest-volume tracts.

total_renter_households high anthropic:claude-opus-4-7

This column counts renter households per census tract, ranging from 0 to 8209 with a median of 726 and mean of 946. The distribution is right-skewed (skew 1.59, kurtosis 4.63) with 69 outliers (2.97%) on the high end, and 4.38% of rows (102 tracts) are zero, the same count as the nulls in the three pct_* columns. No nulls, and 1418 unique values across 2327 rows mean that some counts repeat across tracts.

rent_30_to_34_9_pct high anthropic:claude-opus-4-7

This appears to be a count of renter households paying 30% to 34.9% of income on rent in each census tract. The distribution is heavily right-skewed (skew 2.76, kurtosis 13.86) with median 51 but mean 83 and max 1205, and 16.2% of rows are zero. About 5.3% of values (124 rows) are flagged as outliers, suggesting a few large tracts dominate the tail.

rent_35_to_39_9_pct high anthropic:claude-opus-4-7

This column appears to be a count of renter households paying 35% to 39.9% of income on rent in each census tract. The distribution is heavily right-skewed (skew 2.40, kurtosis 9.27) with median 35 but max 633, and nearly 20% of rows are zero (zero_rate 0.196), pointing to many small or sparsely populated tracts alongside a long tail of larger ones. With Q1 = 10 and Q3 = 83 (IQR 73), the upper 1.5 × IQR fence is 192.5; the 110 outliers (4.7%) all sit above it, because the lower fence is negative and counts cannot be.
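
The outlier fence for this column can be recomputed from the reported quartiles. The sketch below assumes saturn flags outliers with the usual Tukey rule (values beyond 1.5 × IQR from the quartiles), which the "rows beyond 1.5 IQR" alert suggests but the report does not state outright; `tukey_fences` is a hypothetical helper name, not part of saturn.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for rent_35_to_39_9_pct: q1 = 10, q3 = 83 (IQR 73).
lower, upper = tukey_fences(10.0, 83.0)
print(lower, upper)  # -99.5 192.5
```

Because the lower fence is negative and the column is a count, every flagged row must lie above 192.5. The interval 10 to 83 is the interquartile range itself, not the fence.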

rent_40_to_49_9_pct high anthropic:claude-opus-4-7

Likely a count of renter households whose rent falls in the 40% to 49.9% of income bracket in each census tract. The distribution is heavily right-skewed (skew 2.14, kurtosis 7.14) with a median of 49 but a max of 740, and 15.6% of rows are zero, suggesting many small tracts with no such households alongside a long tail of large ones. About 4.8% of values (111 rows) are flagged as outliers.

rent_50_pct_or_more medium anthropic:claude-opus-4-7

This column likely counts renter households spending 50% or more of income on rent in each census tract. Values span 0 to 1918 with a median of 184 and mean of 253.2, and the distribution is right-skewed (skew 1.60, kurtosis 3.44) with 87 high-end outliers (3.7%). About 6.3% of rows are zero and there are no nulls across 2327 rows. Every summary statistic matches severe_burden exactly, so the two columns are very likely duplicates of each other.

NAME high anthropic:claude-opus-4-7

This column holds fully-qualified Census tract names for New York City, with every one of the 2327 rows unique and non-null. Lengths fall between 38 and 46 characters, and every record follows the pattern "Census Tract <number>; <county> County; New York", so the tokens 'census', 'tract', 'county;', 'new', and 'york' appear in every row. The county token matches the county_name breakdown (Kings 805, Queens 725, Bronx 361, New York 310, Richmond 126). The column functions as a row identifier rather than a feature; the tract number and county name are its only varying parts, and both are already available in other columns.
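
If the embedded tract number and county are needed on their own, a regular expression is enough. This is a minimal sketch that assumes every NAME value follows the "Census Tract <number>; <county> County; New York" layout seen in the sample values; `NAME_PATTERN` and `parse_name` are hypothetical names introduced for illustration.

```python
import re

# Assumed layout, taken from the sample values shown in this report:
#   "Census Tract 399.01; Queens County; New York"
NAME_PATTERN = re.compile(
    r"^Census Tract (?P<tract>\d+(?:\.\d+)?); (?P<county>.+) County; New York$"
)


def parse_name(name: str) -> tuple[str, str]:
    """Split a NAME value into (tract number as text, county name)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected NAME format: {name!r}")
    return match.group("tract"), match.group("county")


print(parse_name("Census Tract 399.01; Queens County; New York"))
print(parse_name("Census Tract 4; Bronx County; New York"))
```

The tract number is kept as a string so that suffixes such as ".01" are not altered by float conversion.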

state high anthropic:claude-opus-4-7

The column 'state' is a numeric field that holds the single value 36 across all 2327 rows with no nulls. 36 is the FIPS code for New York State, so this is a geographic key rather than a measurement. It carries a 'constant' alert and contributes zero variance (std 0.0, n_unique 1), which makes it unusable as a feature.

county high anthropic:claude-opus-4-7

Despite being typed numeric, `county` takes only 5 distinct values across 2327 rows (min 5, max 85), so these integers are almost certainly county identifiers rather than measurements. The range, quartiles, and mean are consistent with the FIPS county codes of the five boroughs (005 Bronx, 047 Kings, 061 New York, 081 Queens, 085 Richmond). The quartiles land exactly on observed codes (Q1=47, Q3=81), confirming a small categorical support; the reported skew (-0.72), median (47), and mean (55) have no meaningful interpretation for a code column. No nulls or outliers are reported.

tract high anthropic:claude-opus-4-7

Census tract codes stored as integers, with 1530 unique values across 2327 rows and no nulls; codes repeat because tract numbers are only unique within a county. The distribution is severely right-skewed (skew 10.14, kurtosis 189.82) with a max of 990100 against a median of 30100, which is characteristic of tract identifiers rather than a measurable quantity. The 63 flagged outliers and the heavy tail are artifacts of the coding scheme, not anomalies to clean.
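
Because tract codes repeat across counties, a unique key needs state, county, and tract together. The sketch below builds the standard 11-digit Census tract GEOID (2-digit state, 3-digit county, 6-digit tract). It assumes the integer columns are FIPS codes with their leading zeros dropped, which fits the reported ranges but is not stated in the report; `tract_geoid` is a hypothetical helper name.

```python
def tract_geoid(state: int, county: int, tract: int) -> str:
    """Zero-pad the three codes and concatenate them into an 11-digit GEOID."""
    return f"{state:02d}{county:03d}{tract:06d}"


# Assumption: tract 400 in county 5 corresponds to "Census Tract 4; Bronx County",
# i.e. the tract column stores the tract number with two implied decimal places.
print(tract_geoid(36, 5, 400))     # 36005000400
print(tract_geoid(36, 81, 39901))  # 36081039901
```

Keeping the result as a string preserves the padding, which matters when joining against other Census tables keyed on GEOID.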

county_name high anthropic:claude-opus-4-7

This column is the NYC borough/county name, with exactly 5 unique values matching the city's five boroughs and no nulls across 2327 rows. Brooklyn (Kings) leads at 34.6% (805), followed by Queens (725), Bronx (361), Manhattan (310), and Staten Island (126); an entropy ratio of 0.898 (entropy 2.086 out of a possible 2.322 bits for five categories) indicates a fairly even spread, even though Staten Island has far fewer tracts than the other boroughs.
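
The entropy figures can be reproduced from the five borough counts. The sketch below assumes entropy is measured in bits (log base 2) and that entropy_ratio is entropy divided by the log of the cardinality; those assumptions match the reported 2.086 and 0.898 to within rounding, but saturn's exact definitions are not given in this report.

```python
import math


def shannon_entropy_bits(counts: list[int]) -> float:
    """Shannon entropy in bits of the distribution implied by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Tract counts per borough, as listed under county_name.
borough_counts = {
    "Brooklyn (Kings)": 805,
    "Queens": 725,
    "Bronx": 361,
    "Manhattan (New York)": 310,
    "Staten Island (Richmond)": 126,
}

entropy = shannon_entropy_bits(list(borough_counts.values()))
# Normalise by the maximum possible entropy for 5 categories, log2(5).
entropy_ratio = entropy / math.log2(len(borough_counts))
print(f"entropy={entropy:.3f} bits, ratio={entropy_ratio:.4f}")
```

A ratio of 1.0 would mean all five boroughs hold the same number of tracts; a ratio near 0 would mean nearly all tracts sit in one borough.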

moderate_burden high anthropic:claude-opus-4-7

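A non-negative integer count column named 'moderate_burden', with 2327 rows, no nulls, and 639 distinct values ranging from 0 to 1732 (median 159, mean 216). Its mean (216.076) equals the sum of the means of rent_30_to_34_9_pct, rent_35_to_39_9_pct, and rent_40_to_49_9_pct to within rounding, so it most likely counts renter households paying 30% to 49.9% of income on rent. The distribution is right-skewed (skew 1.93, kurtosis 6.05) with 86 outliers (3.7%) and 6.4% exact zeros: a long upper tail over a typical tract count in the low hundreds.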

severe_burden high anthropic:claude-opus-4-7

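Numeric count-like column 'severe_burden' spanning 0 to 1918 across 2327 rows with no nulls and 706 distinct values. The distribution is right-skewed (skew 1.60, kurtosis 3.44) with median 184 well below mean 253.18, an IQR of 278, and 87 outliers (3.7%); 6.3% of rows are exactly zero. Every summary statistic matches rent_50_pct_or_more exactly, so the two columns are very likely duplicates and only one should be kept for modelling.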

pct_moderate_burden high anthropic:claude-opus-4-7

Percentage of renter households with a moderate rent burden, expressed on a 0-100 scale (min 0.0, max 100.0, mean 22.74, median 21.8). The distribution is right-skewed (skew 1.51, kurtosis 6.70) with a tight IQR of 12.3 around the median but a long upper tail producing 59 outliers (2.65% of non-null rows). About 4.38% of rows (102) are null and 2.11% are exact zeros, both worth checking before modelling; the null count equals the number of tracts with zero renter households, which suggests an undefined ratio rather than missing source data.

pct_severe_burden high anthropic:claude-opus-4-7

A percentage feature (0 to 100) that most likely captures the share of renter households under severe rent burden, averaging 27.1% with a median of 26.2 and IQR of 15.9. The distribution is mildly right-skewed (0.57) with 30 outliers (1.35% of non-null rows) reaching up to 100, and 4.38% of rows (102) are null. With 518 unique values across 2327 rows and a 1.98% zero rate, it behaves as a continuous rate rather than a categorical bucket.

rent_burdened medium anthropic:claude-opus-4-7

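Likely a per-tract count of rent-burdened households, ranging 0 to 3153 with a median of 358 and mean of 469.26. A dollar measure is unlikely: the mean equals the sum of the moderate_burden and severe_burden means (216.076 + 253.181) to within rounding, which points to a count of renter households paying 30% or more of income on rent. The distribution is right-skewed (skew 1.49, kurtosis 3.00) with 82 high outliers (3.5%) and 4.7% zeros. With 1013 unique values across 2327 rows and no nulls, it behaves like a continuous feature rather than a category.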
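
The additive structure of the burden counts can be sanity-checked from the reported means alone. This is a rough sketch under the assumption that the means below are rounded to three decimals, so sums may differ in the last digit. Matching means are necessary but not sufficient evidence; a row-by-row comparison on the CSV would be the conclusive test. `means_add_up` is a hypothetical helper.

```python
import math

# Column means copied from the per-column statistics in this report.
means = {
    "rent_30_to_34_9_pct": 83.050,
    "rent_35_to_39_9_pct": 58.351,
    "rent_40_to_49_9_pct": 74.676,
    "moderate_burden": 216.076,
    "severe_burden": 253.181,
    "rent_burdened": 469.258,
}


def means_add_up(parts: list[str], total: str, tol: float = 0.002) -> bool:
    """True if the part means sum to the total mean within rounding tolerance."""
    return math.isclose(sum(means[p] for p in parts), means[total], abs_tol=tol)


print(means_add_up(
    ["rent_30_to_34_9_pct", "rent_35_to_39_9_pct", "rent_40_to_49_9_pct"],
    "moderate_burden",
))
print(means_add_up(["moderate_burden", "severe_burden"], "rent_burdened"))
```

Both checks pass with the reported figures, which is consistent with moderate_burden covering the 30% to 49.9% brackets and rent_burdened being moderate plus severe.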

pct_rent_burdened high anthropic:claude-opus-4-7

Likely the share of renter households that are rent-burdened, expressed as a percentage from 0 to 100. The distribution is roughly symmetric (skew -0.04) and centered near 50 (mean 49.87, median 50.0) with an IQR of 17.9, suggesting a wide spread across tracts. About 4.4% of rows (102) are null and 62 values (2.8% of non-null rows) fall outside the Tukey fences, including some at the 0 and 100 extremes.
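
The 102 nulls in each pct_* column line up with the 102 tracts that report zero renter households, which is what a share computed as count / total_renter_households would produce. The sketch below illustrates that reading; the formula and the one-decimal rounding are assumptions inferred from the reported values, not something the report documents, and `pct` is a hypothetical helper.

```python
from typing import Optional


def pct(numerator: int, denominator: int) -> Optional[float]:
    """Share on a 0-100 scale, or None when the denominator is zero."""
    if denominator == 0:
        # No renter households in the tract: the share is undefined, not 0.
        return None
    return round(100.0 * numerator / denominator, 1)


print(pct(50, 100))  # 50.0
print(pct(0, 40))    # 0.0  (renters exist, none burdened)
print(pct(0, 0))     # None (no renters at all)
```

Under this reading the nulls should be excluded from averages rather than imputed as zero, because a zero would wrongly assert that a tract has renters and none of them are burdened.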

Numeric correlation

total_renter_households numeric

rows: 2,327
null: 0 (0.0%)
unique: 1,418
min: 0.000
max: 8,209
mean: 946.145
median: 726.000
std: 815.372
q1: 346.000
q3: 1,357
iqr: 1,011
skew: 1.595
kurtosis: 4.627
n_outliers: 69
outlier_rate: 0.030
zero_rate: 0.044

rent_30_to_34_9_pct numeric

skew=+2.76; 5.3% rows beyond 1.5 IQR
rows: 2,327
null: 0 (0.0%)
unique: 355
min: 0.000
max: 1,205
mean: 83.050
median: 51.000
std: 100.320
q1: 15.000
q3: 116.000
iqr: 101.000
skew: 2.755
kurtosis: 13.860
n_outliers: 124
outlier_rate: 0.053
zero_rate: 0.162

rent_35_to_39_9_pct numeric

skew=+2.40
rows: 2,327
null: 0 (0.0%)
unique: 270
min: 0.000
max: 633.000
mean: 58.351
median: 35.000
std: 69.848
q1: 10.000
q3: 83.000
iqr: 73.000
skew: 2.395
kurtosis: 9.275
n_outliers: 110
outlier_rate: 0.047
zero_rate: 0.196

rent_40_to_49_9_pct numeric

skew=+2.14
rows: 2,327
null: 0 (0.0%)
unique: 322
min: 0.000
max: 740.000
mean: 74.676
median: 49.000
std: 83.794
q1: 14.000
q3: 106.000
iqr: 92.000
skew: 2.137
kurtosis: 7.139
n_outliers: 111
outlier_rate: 0.048
zero_rate: 0.156

rent_50_pct_or_more numeric

rows: 2,327
null: 0 (0.0%)
unique: 706
min: 0.000
max: 1,918
mean: 253.181
median: 184.000
std: 236.597
q1: 82.000
q3: 360.000
iqr: 278.000
skew: 1.603
kurtosis: 3.435
n_outliers: 87
outlier_rate: 0.037
zero_rate: 0.063

NAME text

100.0% of rows are unique strings
rows: 2,327
null: 0 (0.0%)
unique: 2,327
len_min: 38
len_max: 46
len_mean: 41.649
len_median: 41.000
len_p95: 46.000
word_mean: 7.133
word_median: 7.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 1,539
readability_flesch_mean: 91.451
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Census Tract 4; Bronx County; New York
  2. Census Tract 399.01; Queens County; New York
  3. Census Tract 779.08; Queens County; New York
  4. Census Tract 613.02; Queens County; New York
  5. Census Tract 780; Kings County; New York
  6. Census Tract 156.02; Richmond County; New York
  7. Census Tract 848; Kings County; New York
  8. Census Tract 1008.04; Queens County; New York
  9. Census Tract 618; Queens County; New York
  10. Census Tract 145; Bronx County; New York

state numeric

only one distinct value
rows: 2,327
null: 0 (0.0%)
unique: 1
min: 36.000
max: 36.000
mean: 36.000
median: 36.000
std: 0.000
q1: 36.000
q3: 36.000
iqr: 0.000
skew: 0.000
kurtosis: 0.000
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

county numeric

rows: 2,327
null: 0 (0.0%)
unique: 5
min: 5.000
max: 85.000
mean: 55.000
median: 47.000
std: 25.969
q1: 47.000
q3: 81.000
iqr: 34.000
skew: -0.720
kurtosis: -0.453
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

tract numeric

skew=+10.14
rows: 2,327
null: 0 (0.0%)
unique: 1,530
min: 100.000
max: 990,100
mean: 42,252
median: 30,100
std: 48,265
q1: 15,200
q3: 57,900
iqr: 42,700
skew: 10.143
kurtosis: 189.824
n_outliers: 63
outlier_rate: 0.027
zero_rate: 0.000

county_name categorical

rows: 2,327
null: 0 (0.0%)
unique: 5
top_value: Brooklyn (Kings)
top_rate: 0.346
cardinality: 5
entropy: 2.086
entropy_ratio: 0.898
Top values (rank 1–20)
  1. Brooklyn (Kings) — 805
  2. Queens — 725
  3. Bronx — 361
  4. Manhattan (New York) — 310
  5. Staten Island (Richmond) — 126

moderate_burden numeric

rows: 2,327
null: 0 (0.0%)
unique: 639
min: 0.000
max: 1,732
mean: 216.076
median: 159.000
std: 210.384
q1: 64.000
q3: 311.000
iqr: 247.000
skew: 1.934
kurtosis: 6.052
n_outliers: 86
outlier_rate: 0.037
zero_rate: 0.064

severe_burden numeric

rows: 2,327
null: 0 (0.0%)
unique: 706
min: 0.000
max: 1,918
mean: 253.181
median: 184.000
std: 236.597
q1: 82.000
q3: 360.000
iqr: 278.000
skew: 1.603
kurtosis: 3.435
n_outliers: 87
outlier_rate: 0.037
zero_rate: 0.063

pct_moderate_burden numeric

rows: 2,327
null: 102 (4.4%)
unique: 461
min: 0.000
max: 100.000
mean: 22.744
median: 21.800
std: 11.359
q1: 15.900
q3: 28.200
iqr: 12.300
skew: 1.509
kurtosis: 6.704
n_outliers: 59
outlier_rate: 0.027
zero_rate: 0.021

pct_severe_burden numeric

rows: 2,327
null: 102 (4.4%)
unique: 518
min: 0.000
max: 100.000
mean: 27.124
median: 26.200
std: 12.677
q1: 18.700
q3: 34.600
iqr: 15.900
skew: 0.566
kurtosis: 1.222
n_outliers: 30
outlier_rate: 0.013
zero_rate: 0.020

rent_burdened numeric

rows: 2,327
null: 0 (0.0%)
unique: 1,013
min: 0.000
max: 3,153
mean: 469.258
median: 358.000
std: 415.279
q1: 164.500
q3: 670.000
iqr: 505.500
skew: 1.494
kurtosis: 3.005
n_outliers: 82
outlier_rate: 0.035
zero_rate: 0.047

pct_rent_burdened numeric

rows: 2,327
null: 102 (4.4%)
unique: 596
min: 0.000
max: 100.000
mean: 49.867
median: 50.000
std: 14.615
q1: 40.900
q3: 58.800
iqr: 17.900
skew: -0.038
kurtosis: 0.785
n_outliers: 62
outlier_rate: 0.028
zero_rate: 0.0036