saturn

/home/coolhand/html/datavis/data_trove/data/urban/nyc_housing/nyc_tenure_by_tract.csv 2,327 rows sample n=2,327 seed 42 2026-05-01T17:16:48+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/data/urban/nyc_housing/nyc_tenure_by_tract.csv
Total rows: 2,327
Profiled sample: 2,327
Columns: 10
Generated: 2026-05-01T17:16:48+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset contains 2,327 New York City census tracts with housing tenure breakdowns across 10 columns, covering owner- and renter-occupied household counts and percentages per tract, labelled by county. Brooklyn (Kings) leads with 805 tracts (34.6% of rows), followed by Queens (725), Bronx (361) and Manhattan (310), while Staten Island has just 126. Renting dominates: the unweighted mean tract share of renter-occupied households is 62.5% versus 37.5% owner-occupied, and renter counts are right-skewed with a long tail up to 8,209 per tract. Worth a closer look: the strong skew in raw household counts (owner_occupied skew 1.76, renter_occupied skew 1.59) and the 4.1% null rate (96 rows) in the percentage columns, which matches the 4.1% zero rate of total_households and may simply reflect shares that are undefined for tracts with no households. Note that 'state' is constant (all 36, the FIPS code for New York State) and can be ignored.
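The borough tallies and the null count quoted above can be rechecked directly from the CSV. The following is a minimal sketch using only the standard library; it assumes the file has a header row with the column names used in this report and that nulls are stored as empty cells, neither of which this report confirms. The helper names `summarize` and `summarize_file` are hypothetical.

```python
import csv
from collections import Counter


def summarize(rows):
    """Tally tracts per county_name and count blank percentage cells.

    `rows` is any iterable of dicts keyed by column name, for example a
    csv.DictReader. Blank strings are treated as nulls (an assumption).
    """
    boroughs = Counter()
    null_pct = 0
    for row in rows:
        boroughs[row["county_name"]] += 1
        if row["pct_owner_occupied"].strip() == "":
            null_pct += 1
    return boroughs, null_pct


def summarize_file(path):
    """Open the CSV at `path` and summarize it."""
    with open(path, newline="", encoding="utf-8") as fh:
        return summarize(csv.DictReader(fh))


# Usage (path taken from the Source line of this report):
# boroughs, null_pct = summarize_file("nyc_tenure_by_tract.csv")
# print(boroughs.most_common(), null_pct)
```

If the assumptions hold, the tallies should agree with the county_name top values and the null count of 96 listed further down.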

total_households high anthropic:claude-opus-4-7

Counts of households per census tract, ranging from 0 to 8209 with a median of 1252 and mean of 1410.7. The distribution is right-skewed (skew 1.48, kurtosis 4.38) with 70 outliers (3.0%) on the high end, and 4.1% of rows are zeros, which may indicate uninhabited or unreported tracts.

owner_occupied high anthropic:claude-opus-4-7

Despite the boolean-sounding name, owner_occupied is an integer count ranging 0–3052 with 1001 distinct values and a mean of 464.6 versus a median of 371, suggesting a per-area tally of owner-occupied units rather than a flag. The distribution is right-skewed (skew 1.76, kurtosis 4.25) with 143 outliers (6.1%) and 7.2% exact zeros. No nulls are present.
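The outlier count can be sanity-checked from the quartiles alone. The sketch below assumes saturn applies the conventional Tukey rule (values beyond 1.5 × IQR from the quartiles), which is what the "beyond 1.5 IQR" flag on this column suggests but does not spell out. The function names are illustrative, not part of saturn.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences for the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def count_outliers(values, q1, q3, k=1.5):
    """Count values that fall strictly outside the fences."""
    low, high = tukey_fences(q1, q3, k)
    return sum(1 for v in values if v < low or v > high)


# Quartiles reported for owner_occupied: q1 = 177, q3 = 608.
low, high = tukey_fences(177.0, 608.0)
print(low, high)
```

With these quartiles the lower fence is negative, so for a non-negative count every flagged row must sit in the upper tail, above roughly 1,254 owner-occupied households.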

renter_occupied high anthropic:claude-opus-4-7

This column reports the count of renter-occupied units per record, ranging from 0 to 8209 with a mean of 946 and median of 726. The distribution is right-skewed (skew 1.59, kurtosis 4.63) with 4.4% zeros and 69 outliers (2.97%) in the upper tail. No nulls and 1418 unique values across 2327 rows suggest a per-area aggregate count rather than a per-unit flag.

NAME high anthropic:claude-opus-4-7

This column holds fully-qualified Census tract names for New York City, with every one of the 2327 rows unique and non-null. Lengths fall between 38 and 46 characters and every record contains the tokens 'new', 'york', 'census', 'tract', and 'county;'. The county tokens are skewed toward Kings (805) and Queens (725) over Bronx (361) and Richmond (126). New York County (Manhattan, 310 tracts according to county_name) does not stand out in the top words because its tokens 'new' and 'york' coincide with the state name that appears in every row, so it is masked rather than absent. With n_unique == n, this is effectively a row identifier rather than a feature.
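Splitting NAME into its parts avoids the token ambiguity between the county "New York" and the state "New York". This is a minimal sketch that assumes every value follows the "Census Tract N; X County; State" pattern seen in the sample values; `parse_tract_name` is a hypothetical helper.

```python
def parse_tract_name(name):
    """Split a NAME value into tract label, county, and state.

    Raises ValueError if the value does not match the expected
    three-part, semicolon-separated pattern.
    """
    parts = [part.strip() for part in name.split(";")]
    if len(parts) != 3:
        raise ValueError(f"unexpected NAME format: {name!r}")
    tract_part, county_part, state = parts

    prefix = "Census Tract "
    suffix = " County"
    if not tract_part.startswith(prefix) or not county_part.endswith(suffix):
        raise ValueError(f"unexpected NAME format: {name!r}")

    return {
        "tract": tract_part[len(prefix):],
        "county": county_part[: -len(suffix)],
        "state": state,
    }


print(parse_tract_name("Census Tract 399.01; Queens County; New York"))
```

Counting the parsed `county` field instead of raw word tokens keeps New York County separate from the state name.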

state high anthropic:claude-opus-4-7

The column 'state' is numeric but holds the single value 36 across all 2327 rows, with zero variance and only one unique value. It carries no information for analysis and is flagged constant.

county high anthropic:claude-opus-4-7

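Encoded as numeric but holding only 5 distinct values across 2327 rows (min 5, max 85, median 47), this is a categorical county code stored as an integer; the values are consistent with the county FIPS codes of the five boroughs (5 Bronx, 47 Kings, 61 New York, 81 Queens, 85 Richmond). Because the codes are nominal, the mean (55), median (47) and skew (-0.72) only describe how the tracts happen to fall across the numbering and should not be read as a real distribution. No nulls or outliers reported.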
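One way to test the reading of this column as county FIPS codes is to weight each code by the tract counts reported for county_name and compare the result with the reported mean. The mapping below is the standard FIPS assignment for the five boroughs; pairing it with this file's county_name labels is an assumption.

```python
# County FIPS code -> county_name label as it appears in this report.
COUNTY_FIPS = {
    5: "Bronx",
    47: "Brooklyn (Kings)",
    61: "Manhattan (New York)",
    81: "Queens",
    85: "Staten Island (Richmond)",
}

# Tract counts per label, copied from the county_name top values.
TRACTS = {
    "Brooklyn (Kings)": 805,
    "Queens": 725,
    "Bronx": 361,
    "Manhattan (New York)": 310,
    "Staten Island (Richmond)": 126,
}


def weighted_mean_code(fips=COUNTY_FIPS, tracts=TRACTS):
    """Mean of the county code when each code is repeated once per tract."""
    total = sum(tracts.values())
    weighted = sum(code * tracts[label] for code, label in fips.items())
    return weighted / total


print(sum(TRACTS.values()), weighted_mean_code())
```

The counts sum to the 2,327 rows of the file and the weighted mean comes out at 55, matching the reported mean, which supports treating the column as a label to map rather than a quantity to summarize.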

tract high anthropic:claude-opus-4-7

Census tract codes stored as integers, ranging from 100 to 990100 across 1530 distinct values in 2327 rows. Tract codes are unique only within a county, which is why the same code can recur across boroughs and the distinct count is lower than the row count. The skew of 10.14 and kurtosis of 189.8 are artefacts of the tract numbering scheme rather than a real distribution: these are categorical identifiers, not measurements. The 63 outliers (2.7%) reflect tracts with unusually high numeric codes, not anomalous data.
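Since a tract code identifies a tract only within its county, a unique key needs the state and county codes as well. The sketch below builds the standard 11-character Census tract GEOID (2-digit state, 3-digit county, 6-digit tract) and converts an integer code back to the label style used in NAME. It assumes the integer column is the six-digit tract code with leading zeros dropped and two implied decimal places; the function names are hypothetical.

```python
def tract_geoid(state, county, tract):
    """Zero-pad the three integer codes into an 11-character GEOID."""
    return f"{state:02d}{county:03d}{tract:06d}"


def tract_label(tract):
    """Render an integer tract code as it appears in tract names.

    The last two digits are an implied decimal suffix that is omitted
    when it is zero.
    """
    whole, suffix = divmod(tract, 100)
    return str(whole) if suffix == 0 else f"{whole}.{suffix:02d}"


print(tract_geoid(36, 5, 400), tract_label(400))
print(tract_geoid(36, 81, 39901), tract_label(39901))
```

Under these assumptions the reported minimum of 100 corresponds to tract 1 and the maximum of 990100 to tract 9901.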

county_name high anthropic:claude-opus-4-7

This column lists New York City borough/county names across 2327 rows, with all 5 NYC boroughs represented and no nulls. Distribution is fairly even (entropy ratio 0.898), though Brooklyn (Kings) leads at 34.6% (805) and Staten Island (Richmond) trails at 126. The parenthetical county names suggest the source schema uses formal county labels rather than borough-only naming.
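The entropy figures can be reproduced from the five tract counts. This sketch assumes saturn reports Shannon entropy in bits and normalizes it by the logarithm of the cardinality; the report does not state either choice, but the results land on the reported values of about 2.086 and 0.898.

```python
import math


def shannon_entropy_bits(counts):
    """Shannon entropy (base 2) of a list of category counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_ratio(counts):
    """Entropy divided by its maximum, log2 of the number of categories."""
    categories = sum(1 for c in counts if c > 0)
    if categories < 2:
        return 0.0
    return shannon_entropy_bits(counts) / math.log2(categories)


# Tract counts from the county_name top values.
counts = [805, 725, 361, 310, 126]
print(round(shannon_entropy_bits(counts), 3), round(entropy_ratio(counts), 3))
```

A ratio of 1.0 would mean all five boroughs hold the same number of tracts; the shortfall here comes mostly from Staten Island's small share.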

pct_owner_occupied high anthropic:claude-opus-4-7

Numeric column on a 0-100 scale (min 0.0, max 100.0) capturing the percentage of owner-occupied housing per tract. The distribution is wide and flattish (std 25.65, kurtosis -0.85) with mean 37.51 just above median 34.4, and a broad IQR from 16.4 to 56.1, indicating that in most tracts owner-occupiers are a minority. About 4.13% of rows (96) are null, the same share as the zero-household rows in total_households, so these are probably undefined shares rather than missing source data. A further 3.23% are exactly zero, which may represent fully-rental tracts worth flagging.

pct_renter_occupied high anthropic:claude-opus-4-7

Numeric share variable bounded between 0 and 100 (mean 62.49, median 65.6) — almost certainly the percentage of renter-occupied housing units in each row. The distribution is wide (std 25.65, IQR 39.7) and slightly left-skewed (skew -0.39, kurtosis -0.85), so values cluster toward the high end with a long tail of owner-dominated areas. About 4.13% of rows are null and only 0.27% are exact zeros; no outliers were flagged given the natural 0–100 bounds.
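The two percentage columns mirror each other (means 37.513 and 62.487 sum to 100, identical std, skew of opposite sign), which is what one would expect if both are derived from the same pair of counts. The sketch below shows one plausible derivation, assuming the denominator is owner_occupied + renter_occupied and that values are rounded to one decimal place; the source does not document either detail, and `tenure_shares` is a hypothetical name.

```python
def tenure_shares(owner, renter):
    """Return (pct_owner, pct_renter) on a 0-100 scale.

    Returns (None, None) when there are no households, because the
    share is undefined in that case.
    """
    total = owner + renter
    if total == 0:
        return None, None
    pct_owner = round(100.0 * owner / total, 1)
    pct_renter = round(100.0 - pct_owner, 1)
    return pct_owner, pct_renter


print(tenure_shares(0, 0))      # tract with no households
print(tenure_shares(1, 3))      # mixed tract
print(tenure_shares(0, 8209))   # fully rented tract
```

Under this derivation a null in either column always implies a null in the other, so the two columns carry one degree of freedom and only one of them is needed for modelling.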

Numeric correlation

total_households numeric

rows: 2,327
null: 0 (0.0%)
unique: 1,495
min: 0.000
max: 8,209
mean: 1,411
median: 1,252
std: 923.255
q1: 773.500
q3: 1,850
iqr: 1,076
skew: 1.479
kurtosis: 4.377
n_outliers: 70
outlier_rate: 0.030
zero_rate: 0.041

owner_occupied numeric

6.1% of rows beyond 1.5 × IQR
rows: 2,327
null: 0 (0.0%)
unique: 1,001
min: 0.000
max: 3,052
mean: 464.600
median: 371.000
std: 422.558
q1: 177.000
q3: 608.000
iqr: 431.000
skew: 1.761
kurtosis: 4.254
n_outliers: 143
outlier_rate: 0.061
zero_rate: 0.072

renter_occupied numeric

rows: 2,327
null: 0 (0.0%)
unique: 1,418
min: 0.000
max: 8,209
mean: 946.145
median: 726.000
std: 815.372
q1: 346.000
q3: 1,357
iqr: 1,011
skew: 1.595
kurtosis: 4.627
n_outliers: 69
outlier_rate: 0.030
zero_rate: 0.044

NAME text

100.0% of rows are unique strings
rows: 2,327
null: 0 (0.0%)
unique: 2,327
len_min: 38
len_max: 46
len_mean: 41.649
len_median: 41.000
len_p95: 46.000
word_mean: 7.133
word_median: 7.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 1,539
readability_flesch_mean: 91.451
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Census Tract 4; Bronx County; New York
  2. Census Tract 399.01; Queens County; New York
  3. Census Tract 779.08; Queens County; New York
  4. Census Tract 613.02; Queens County; New York
  5. Census Tract 780; Kings County; New York
  6. Census Tract 156.02; Richmond County; New York
  7. Census Tract 848; Kings County; New York
  8. Census Tract 1008.04; Queens County; New York
  9. Census Tract 618; Queens County; New York
  10. Census Tract 145; Bronx County; New York

state numeric

only one distinct value
rows: 2,327
null: 0 (0.0%)
unique: 1
min: 36.000
max: 36.000
mean: 36.000
median: 36.000
std: 0.000
q1: 36.000
q3: 36.000
iqr: 0.000
skew: 0.000
kurtosis: 0.000
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

county numeric

rows: 2,327
null: 0 (0.0%)
unique: 5
min: 5.000
max: 85.000
mean: 55.000
median: 47.000
std: 25.969
q1: 47.000
q3: 81.000
iqr: 34.000
skew: -0.720
kurtosis: -0.453
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

tract numeric

skew = +10.14
rows: 2,327
null: 0 (0.0%)
unique: 1,530
min: 100.000
max: 990,100
mean: 42,252
median: 30,100
std: 48,265
q1: 15,200
q3: 57,900
iqr: 42,700
skew: 10.143
kurtosis: 189.824
n_outliers: 63
outlier_rate: 0.027
zero_rate: 0.000

county_name categorical

rows: 2,327
null: 0 (0.0%)
unique: 5
top_value: Brooklyn (Kings)
top_rate: 0.346
cardinality: 5
entropy: 2.086
entropy_ratio: 0.898
Top values (rank 1–20)
  1. Brooklyn (Kings) — 805
  2. Queens — 725
  3. Bronx — 361
  4. Manhattan (New York) — 310
  5. Staten Island (Richmond) — 126

pct_owner_occupied numeric

rows: 2,327
null: 96 (4.1%)
unique: 823
min: 0.000
max: 100.000
mean: 37.513
median: 34.400
std: 25.651
q1: 16.400
q3: 56.100
iqr: 39.700
skew: 0.395
kurtosis: -0.854
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.032

pct_renter_occupied numeric

rows: 2,327
null: 96 (4.1%)
unique: 823
min: 0.000
max: 100.000
mean: 62.487
median: 65.600
std: 25.651
q1: 43.900
q3: 83.600
iqr: 39.700
skew: -0.395
kurtosis: -0.854
n_outliers: 0
outlier_rate: 0.000
zero_rate: 2.69e-03