saturn

/home/coolhand/datasets/us-attention-data/wikipedia_trending.json 500 rows sample n=500 seed 42 2026-05-01T17:25:20+00:00

Overview

Source: /home/coolhand/datasets/us-attention-data/wikipedia_trending.json
Total rows: 500
Profiled sample: 500
Columns: 5
Generated: 2026-05-01T17:25:20+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset captures 500 trending Wikipedia articles, each row identified by a unique title and described by days_in_top_100, peak_views, total_views, and a daily_views series. All three numeric columns are right-skewed, with roughly 10-11% of rows flagged as outliers: total_views has a skew of 10.4 and a max of ~23.9M against a median of ~213K, and peak_views behaves similarly (skew 9.7). Most articles spend only a few days in the top 100 (median 3), but a long tail reaches the maximum of 30 days. Start by examining the distributions of total_views and days_in_top_100 to see how concentrated attention is on a few breakout articles.
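One way to quantify how concentrated attention is: compute the share of all views held by the top fraction of articles. This is a minimal sketch, not something saturn computes. The function name top_share and the toy numbers are hypothetical, and the layout of the source JSON is not shown in this report, so the example works on a plain list of counts.

```python
def top_share(values, fraction=0.10):
    """Share of the grand total contributed by the largest `fraction` of values."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values, reverse=True)
    # Always keep at least one item so tiny inputs still yield a share.
    k = max(1, round(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

# Hypothetical toy counts, NOT taken from wikipedia_trending.json.
toy_total_views = [1_000_000, 50_000, 40_000, 30_000, 20_000,
                   20_000, 15_000, 10_000, 10_000, 5_000]
share = top_share(toy_total_views, fraction=0.10)
print(f"top 10% of articles hold {share:.1%} of views")
```

On the real data, pass the total_views column once it has been loaded into a list.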

title high anthropic:claude-opus-4-7

Wikipedia-style article titles with underscores (e.g. '1989_Tiananmen_Square_protests_and_massacre', 'Stranger_Things_season_5'), unique across all 500 rows (n_unique=500, entropy_ratio=1.0). Every value appears exactly once, so this functions as a row identifier rather than a categorical feature. The long_tail alert simply reflects that uniqueness.
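The reported entropy of 8.966 matches log2(500), which is what Shannon entropy in bits gives when all 500 values are distinct. The sketch below assumes that entropy_ratio is entropy divided by log2(cardinality). That definition is a guess consistent with the numbers here, not saturn's documented formula, and the function name is hypothetical.

```python
import math
from collections import Counter

def entropy_and_ratio(values):
    """Shannon entropy in bits, plus entropy / log2(number of distinct values)."""
    counts = Counter(values)
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # With a single distinct value the maximum entropy is 0; report ratio 0.
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    ratio = entropy / max_entropy if max_entropy > 0 else 0.0
    return entropy, ratio

# 500 distinct placeholder titles stand in for the real column.
titles = [f"Article_{i}" for i in range(500)]
entropy, ratio = entropy_and_ratio(titles)
print(round(entropy, 3), round(ratio, 3))
```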

total_views high anthropic:claude-opus-4-7

Likely a per-row view count, with all 500 values unique and no nulls or zeros. The distribution is severely right-skewed (skew 10.44, kurtosis 149.66): the median is 213,065 but the mean is 580,331 and the max reaches 23,890,102, roughly 112x the median. Outliers make up 10.2% of rows (51 of 500), so a small set of viral entries dominates the tail.
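The ratios quoted above can be re-derived from the reported summary statistics alone. This is plain arithmetic on numbers copied from the total_views stats in this report; no data file is read.

```python
# Values copied from the total_views stats in this report.
median = 213_065
mean = 580_331
maximum = 23_890_102

max_to_median = maximum / median   # how far the largest row sits above a typical row
mean_to_median = mean / median     # a ratio well above 1 signals right skew

print(f"max / median  = {max_to_median:.1f}")
print(f"mean / median = {mean_to_median:.2f}")
```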

days_in_top_100 high anthropic:claude-opus-4-7

This column counts the days a record spent in a top-100 ranking, with 500 non-null integer values ranging from 1 to 30 and a median of just 3. The distribution is right-skewed (skew 2.45, kurtosis 5.98): most items churn out fast while a long tail lingers, producing 56 outliers (11.2% outlier rate) above the upper fence of 12 (q3 of 6 plus 1.5 times the IQR of 4). The mean (5.18) sits well above the median, and the 27 unique values are consistent with a discrete count bounded at 30.
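The outlier counts in this report are labeled "rows beyond 1.5 IQR". Assuming that means the usual Tukey fences (q1 - 1.5 * IQR and q3 + 1.5 * IQR), the thresholds can be reproduced from the reported quartiles. The helper name tukey_fences is illustrative, not part of saturn.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles copied from the per-column stats in this report.
quartiles = {
    "total_views": (115_081, 535_224),
    "days_in_top_100": (2.0, 6.0),
    "peak_views": (77_305, 148_907),
}
fences = {name: tukey_fences(q1, q3) for name, (q1, q3) in quartiles.items()}

for name, (lower, upper) in fences.items():
    print(f"{name}: lower={lower:,.1f} upper={upper:,.1f}")
```

Every lower fence is negative while every column minimum is positive, so under this assumption all flagged rows sit on the high side.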

peak_views high anthropic:claude-opus-4-7

A numeric measure of peak viewership per record, with 499 unique values across 500 rows and no nulls or zeros. The distribution is severely right-skewed (skew 9.71, kurtosis 127.99): the median is 104,303 but the mean is 159,797 and the max reaches 4,011,044, well above q3 of 148,907. 57 outliers (11.4%) sit above the upper whisker, suggesting a small tail of viral peaks dominates the variance (std 250,171).

daily_views low anthropic:claude-opus-4-7

Column 'daily_views' was skipped by the profiler, so no type, uniqueness, or distribution stats were computed despite a full 500 non-null rows. The name suggests a per-day view count (likely numeric and right-skewed in practice), but nothing in the evidence confirms that. Re-run profiling with this column included before drawing any conclusions.
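Before re-running the profiler, it may help to check what kind of values daily_views actually holds. The sketch below only counts Python types. The function name describe_kinds and the sample rows are hypothetical, and the real column may be shaped differently (for example a dict keyed by date).

```python
from collections import Counter

def describe_kinds(values):
    """Count the type of each cell, and the element types inside list-like cells."""
    outer = Counter(type(v).__name__ for v in values)
    inner = Counter(
        type(x).__name__
        for v in values
        if isinstance(v, (list, tuple))
        for x in v
    )
    return outer, inner

# Hypothetical sample: each row is a list of per-day counts.
sample = [[120, 340, 90], [15, 20], [7]]
outer, inner = describe_kinds(sample)
print(dict(outer), dict(inner))
```

If the outer count shows something other than lists of numbers, the column likely needs flattening or a custom profiler before any distribution stats make sense.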

Numeric correlation

title categorical

500 singleton categories
rows: 500
null: 0 (0.0%)
unique: 500
top_value: 1989_Tiananmen_Square_protests_and_massacre
top_rate: 2.00e-03
cardinality: 500
entropy: 8.966
entropy_ratio: 1.000
Top values (rank 1–20)
  1. 1989_Tiananmen_Square_protests_and_massacre — 1
  2. .xxx — 1
  3. Wikipedia:Featured_pictures — 1
  4. Dhurandhar — 1
  5. Avatar:_Fire_and_Ash — 1
  6. Nicolás_Maduro — 1
  7. Stranger_Things — 1
  8. Marty_Supreme — 1
  9. Stranger_Things_season_5 — 1
  10. List_of_highest-grossing_Indian_films — 1
  11. Bruce_Lee — 1
  12. Heated_Rivalry — 1
  13. Venezuela — 1
  14. One_Battle_After_Another — 1
  15. Donald_Trump — 1
  16. ChatGPT — 1
  17. Brigitte_Bardot — 1
  18. Pluribus_(TV_series) — 1
  19. The_Housemaid_(2025_film) — 1
  20. Google_Chrome — 1

total_views numeric

skew=+10.44 10.2% rows beyond 1.5 IQR
rows: 500
null: 0 (0.0%)
unique: 500
min: 76,451
max: 23,890,102
mean: 580,331
median: 213,065
std: 1,424,123
q1: 115,081
q3: 535,224
iqr: 420,143
skew: 10.440
kurtosis: 149.659
n_outliers: 51
outlier_rate: 0.102
zero_rate: 0.000

days_in_top_100 numeric

skew=+2.45 11.2% rows beyond 1.5 IQR
rows: 500
null: 0 (0.0%)
unique: 27
min: 1.000
max: 30.000
mean: 5.176
median: 3.000
std: 6.239
q1: 2.000
q3: 6.000
iqr: 4.000
skew: 2.449
kurtosis: 5.979
n_outliers: 56
outlier_rate: 0.112
zero_rate: 0.000

peak_views numeric

skew=+9.71 11.4% rows beyond 1.5 IQR
rows: 500
null: 0 (0.0%)
unique: 499
min: 40,332
max: 4,011,044
mean: 159,797
median: 104,303
std: 250,171
q1: 77,305
q3: 148,907
iqr: 71,602
skew: 9.709
kurtosis: 127.991
n_outliers: 57
outlier_rate: 0.114
zero_rate: 0.000

daily_views unknown

no profiler for kind=unknown
rows: 500
null: 0 (0.0%)