data trove boy bands

saturn notebook · generated 2026-06-21

Overview

Source: /home/coolhand/html/datavis/data_trove/entertainment/pop_culture/Boy Band.csv

Saturn profiled 15 rows across 4 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
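# Install the profiler (IPython shell escape; run this inside a notebook).
!pip install saturn-dissect

import subprocess

# Profile the CSV, write the findings to JSON, and request the opt-in LLM prose.
subprocess.run(
    [
        "saturn", "analyze", "/home/coolhand/html/datavis/data_trove/entertainment/pop_culture/Boy Band.csv",
        "--findings", "data-trove-boy-bands.json",
        "--llm", "anthropic:default",
    ],
    check=True,  # raise CalledProcessError if the CLI exits with a non-zero status
)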

Summary confidence: medium

This dataset is a small reference list of 15 rows covering 11 distinct boy bands, capturing each band's name and its years active. The most immediately interesting angle is the band frequency distribution: four bands (Westlife, Jonas Brothers, Take That, and Blue) each appear twice, suggesting possible duplicate rows or multiple entries per group worth investigating. Every Years Active value is unique across the 15 rows, with start years ranging from 1958 (The Osmonds) to 2019, a wide generational spread that could reward closer reading.

citing: row_count · column_count · top_value · top_rate · n_unique · null_rate
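
One way to follow up on the repeated bands is to count the names directly. The sketch below is a minimal illustration using only the Band values listed in the top-values table; the list `bands` is re-typed by hand (an assumption, not read from the CSV), and its row order is arbitrary.

```python
from collections import Counter

# Band values re-typed from the report's top-values table (illustrative;
# a real check would read the Band column from the CSV instead).
bands = [
    "Westlife", "Westlife", "Jonas Brothers", "Jonas Brothers",
    "Take That", "Take That", "Blue", "Blue",
    "NSync", "Backstreet Boys", "BTS", "One Direction",
    "The Osmonds", "New Kids on the Block", "The Beatles",
]

counts = Counter(bands)

# Names that occur more than once are the candidates for duplicate rows.
repeated = sorted(name for name, n in counts.items() if n > 1)

print(len(bands), "rows,", len(counts), "distinct bands")
print("repeated:", repeated)
```

Whether a repeated name is a true duplicate depends on the other columns. Since every Years Active value is unique, each repeated band carries two different date ranges, which may point to separate active periods rather than copy-paste duplicates.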

Out[4]:

saturn.schema() · 4 columns

column        kind         n   null%  unique  alerts
index         numeric      15  0.0%   15
S.No.         numeric      15  0.0%   15
Band          categorical  15  0.0%   11      long_tail
Years Active  categorical  15  0.0%   15      long_tail
Fig 1.
Band · Look for the four bands with duplicate entries — these may indicate data quality issues or intentional multi-row records.
Top values for Band (11 unique shown, of 11 total).
value                  count  share
Westlife               2      13.3%
Jonas Brothers         2      13.3%
Take That              2      13.3%
Blue                   2      13.3%
NSync                  1      6.7%
Backstreet Boys        1      6.7%
BTS                    1      6.7%
One Direction          1      6.7%
The Osmonds            1      6.7%
New Kids on the Block  1      6.7%
The Beatles            1      6.7%
Fig 2.
Years Active · Each band has a unique active period; scan for the range from 1958 to present-day acts to appreciate the dataset's generational span.
Top values for Years Active (15 unique shown, of 15 total). Values are quoted so that trailing whitespace stays visible.
value            count  share
'1995-2002 '     1      6.7%
'1998-2012'      1      6.7%
'2018-present'   1      6.7%
'1993-present'   1      6.7%
'2013-present'   1      6.7%
'2010-2016'      1      6.7%
'2005-2013'      1      6.7%
'2019-present'   1      6.7%
'1958-present '  1      6.7%
'1984-1994 '     1      6.7%
'1990-1996'      1      6.7%
'2005-present'   1      6.7%
'1960-1970 '     1      6.7%
'2000-2005'      1      6.7%
'2011-present'   1      6.7%
Fig 3.
Per-column null rate across the corpus. Columns are ordered by input position.
Per-column null rate across the corpus.
column        kind         null %
index         numeric      0.0%
S.No.         numeric      0.0%
Band          categorical  0.0%
Years Active  categorical  0.0%
Fig 4.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 2 numeric columns (values clipped to 2 decimals).
        index  S.No.
index   +1.00  +1.00
S.No.   +1.00  +1.00
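
The +1.00 off-diagonal entry is what one would expect if S.No. is simply index shifted by one. A minimal sketch, assuming the two columns hold 0 to 14 and 1 to 15 in the same row order (consistent with the reported min, max, and uniqueness, but not confirmed against the raw rows); `pearson` is a helper written for this illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

index = list(range(15))        # assumed values 0..14
s_no = [i + 1 for i in index]  # assumed values 1..15, same row order

r = pearson(index, s_no)
print(f"{r:+.2f}")
```

Because one column is an exact linear function of the other, the pair carries a single piece of information (row position), which is why both are flagged as identifiers.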

index numeric identifier

This column is a row index running 0–14 across all 15 records, with perfect uniqueness and no nulls. Values are uniformly spaced (mean = median = 7.0, skew = 0.0, platykurtic at −1.21), consistent with an auto-generated sequential integer index. It carries no analytical information.

Treatment: Drop before modelling; it is a row counter with no predictive value.

anthropic:default · confidence high
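
The reported statistics can be reproduced from the integers 0 through 14 alone. The sketch below assumes a sample standard deviation (n - 1 denominator), linearly interpolated quartiles, and population excess kurtosis; these conventions are inferred from the fact that they match the numbers in this report, not taken from saturn's documentation.

```python
import statistics

values = list(range(15))  # assumed column contents: 0, 1, ..., 14

mean = statistics.mean(values)
median = statistics.median(values)
std = statistics.stdev(values)  # sample standard deviation (n - 1)

# Quartiles with linear interpolation over the closed data range.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Population central moments, used for skewness and excess kurtosis.
n = len(values)
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
skew = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2 - 3  # excess kurtosis; negative means platykurtic

zero_rate = values.count(0) / n

print(mean, median, round(std, 3), q1, q3, q3 - q1)
print(round(skew, 3), round(kurtosis, 3), round(zero_rate, 5))
```

The same script with `range(1, 16)` gives the S.No. figures: the mean and quartiles shift up by one while the spread, skew, and kurtosis are unchanged.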
Out[10]:

saturn.columns["index"].stats

stat          value
n             15
nulls         0 (0.0%)
unique        15
min           0
max           14
mean          7
median        7
std           4.472
q1            3.5
q3            10.5
iqr           7
skew          0
kurtosis      -1.211
n_outliers    0
outlier_rate  0
zero_rate     0.06667
Fig 5.
Distribution of index. Vertical dash marks the median.
Histogram bins for index (median: 7.0).
bin         count
0 – 2.8     3
2.8 – 5.6   3
5.6 – 8.4   3
8.4 – 11.2  3
11.2 – 14   3

S.No. numeric identifier

This column is a sequential row index (serial number), running from 1 to 15 with all 15 values unique and no nulls. The distribution is perfectly symmetric (skew = 0.0, mean = median = 8.0) and uniformly spread, consistent with a simple integer counter. There is nothing analytically informative here beyond row ordering.

Treatment: Drop before modelling; use only for row traceability if needed.

anthropic:default · confidence high
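
A quick way to confirm that a column is a plain row counter before dropping it is to compare it against a consecutive integer range. This is a minimal sketch; `is_sequential_counter` is a hypothetical helper, and the columns shown are placeholders standing in for the real table.

```python
def is_sequential_counter(values):
    """True if values are unique integers forming one unbroken run."""
    if not values or not all(isinstance(v, int) for v in values):
        return False
    return sorted(values) == list(range(min(values), max(values) + 1))

# Placeholder columns matching the reported ranges (not read from the CSV).
columns = {
    "index": list(range(0, 15)),
    "S.No.": list(range(1, 16)),
    "Band": ["Westlife", "Blue", "BTS"],  # truncated, illustration only
}

# Keep only the columns that are not simple counters.
to_drop = [name for name, vals in columns.items() if is_sequential_counter(vals)]
kept = {name: vals for name, vals in columns.items() if name not in to_drop}

print("drop:", to_drop)
print("keep:", list(kept))
```

If row traceability matters, one of the two counters can be retained as a key outside the feature set rather than deleted outright.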
Out[13]:

saturn.columns["S.No."].stats

stat          value
n             15
nulls         0 (0.0%)
unique        15
min           1
max           15
mean          8
median        8
std           4.472
q1            4.5
q3            11.5
iqr           7
skew          0
kurtosis      -1.211
n_outliers    0
outlier_rate  0
zero_rate     0
Fig 6.
Distribution of S.No.. Vertical dash marks the median.
Histogram bins for S.No. (median: 8.0).
bin         count
1 – 3.8     3
3.8 – 6.6   3
6.6 – 9.4   3
9.4 – 12.2  3
12.2 – 15   3

Band categorical label

This column contains the names of pop/boy bands, functioning as a categorical label in what appears to be a small reference dataset of 15 rows covering 11 distinct acts. The top four values (Westlife, Jonas Brothers, Take That, Blue) each appear exactly twice, while the remaining 7 bands appear once, which triggers the long_tail alert (7 singleton categories) despite the tiny dataset size. With only 15 rows, a high entropy ratio (0.975), and near-unique cardinality (11/15), this column behaves more like an identifier than a grouping feature.

Treatment: Use as a grouping label for lookup or display; with only 15 rows and 11 unique values, avoid treating as a statistical feature without acquiring significantly more data.

anthropic:default · confidence high
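
The entropy figures follow from the value counts. The sketch below assumes Shannon entropy in bits and an entropy ratio defined as entropy divided by log2 of the cardinality; that definition is an inference that happens to reproduce the reported 3.374 and 0.9752, not a documented saturn formula.

```python
import math

# Counts from the Band top-values table: four bands twice, seven once.
counts = [2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1]
total = sum(counts)

# Shannon entropy in bits.
entropy = -sum((c / total) * math.log2(c / total) for c in counts)

# Assumed normalisation: divide by the maximum entropy for this cardinality.
entropy_ratio = entropy / math.log2(len(counts))

# Categories seen exactly once, as counted by the long_tail alert.
singletons = sum(1 for c in counts if c == 1)

print(round(entropy, 3), round(entropy_ratio, 4), singletons)
```

A ratio near 1 means the values are spread almost evenly, so no single band dominates and the column offers little grouping power at this sample size.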
Out[16]:

saturn.columns["Band"].stats

stat              value
n                 15
nulls             0 (0.0%)
unique            11
top_value         Westlife
top_rate          0.1333
cardinality       11
entropy           3.374
entropy_ratio     0.9752
alert: long_tail  7 singleton categories
Fig 7.
Top values for Band.
Top values for Band (11 unique shown, of 11 total).
value                  count  share
Westlife               2      13.3%
Jonas Brothers         2      13.3%
Take That              2      13.3%
Blue                   2      13.3%
NSync                  1      6.7%
Backstreet Boys        1      6.7%
BTS                    1      6.7%
One Direction          1      6.7%
The Osmonds            1      6.7%
New Kids on the Block  1      6.7%
The Beatles            1      6.7%

Years Active categorical feature

This column captures the active career span of entities (likely artists, bands, or performers) as free-form date-range strings such as '1995-2002' or '2018-present'. With cardinality of 15 out of 15 rows and entropy_ratio of ~1.0, every value is unique — the column is essentially free text with no repeated categories. The trailing whitespace visible in values like '1995-2002 ' and '1958-present ' indicates inconsistent formatting that will require cleaning before any date parsing.

Treatment: Strip whitespace, split on '-' to extract start year and end year/flag 'present', then engineer numeric duration and is_active boolean features.

anthropic:default · confidence high
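
The treatment above can be sketched as a small parser. This is a minimal illustration, not part of saturn: `ActiveSpan`, `parse_years_active`, and the `as_of_year` parameter are hypothetical names, and measuring open-ended spans up to a caller-supplied year is an assumption about how duration should be defined.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActiveSpan:
    start_year: int
    end_year: Optional[int]  # None while the act is still active
    is_active: bool
    duration: int            # in years

def parse_years_active(raw: str, as_of_year: int) -> ActiveSpan:
    """Parse strings such as '1995-2002 ' or '2018-present'."""
    # Strip stray whitespace, then split into start and end parts.
    start_text, end_text = raw.strip().split("-", 1)
    start_year = int(start_text)
    is_active = end_text.strip().lower() == "present"
    end_year = None if is_active else int(end_text)
    # Open-ended spans are measured up to the caller-supplied year.
    last_year = as_of_year if is_active else end_year
    return ActiveSpan(start_year, end_year, is_active, last_year - start_year)

# Sample values taken from the report's Years Active table.
for raw in ["1995-2002 ", "1958-present ", "2018-present"]:
    print(repr(raw), parse_years_active(raw, as_of_year=2026))
```

Passing the reference year explicitly keeps the derived duration reproducible; 2026 is used here only because it is the year this report was generated.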
Out[19]:

saturn.columns["Years Active"].stats

stat              value
n                 15
nulls             0 (0.0%)
unique            15
top_value         '1995-2002 '
top_rate          0.06667
cardinality       15
entropy           3.907
entropy_ratio     1
alert: long_tail  15 singleton categories
Fig 8.
Top values for Years Active.
Top values for Years Active (15 unique shown, of 15 total). Values are quoted so that trailing whitespace stays visible.
value            count  share
'1995-2002 '     1      6.7%
'1998-2012'      1      6.7%
'2018-present'   1      6.7%
'1993-present'   1      6.7%
'2013-present'   1      6.7%
'2010-2016'      1      6.7%
'2005-2013'      1      6.7%
'2019-present'   1      6.7%
'1958-present '  1      6.7%
'1984-1994 '     1      6.7%
'1990-1996'      1      6.7%
'2005-present'   1      6.7%
'1960-1970 '     1      6.7%
'2000-2005'      1      6.7%
'2011-present'   1      6.7%

How to cite

BibTeX
@misc{saturn-data-trove-boy-bands-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: data trove boy bands},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/data-trove-boy-bands}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:default},
}
APA
Steuber, L. (2026). Saturn reading: data trove boy bands. Source: /home/coolhand/html/datavis/data_trove/entertainment/pop_culture/Boy Band.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:default). Retrieved from https://dr.eamer.dev/saturn/view/data-trove-boy-bands