accessibility atlas who hale long

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/datasets/accessibility-atlas/who_hale_long.csv

Saturn profiled 4,070 rows across 4 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

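[2]:
# Install the profiler (IPython shell escape), then run the saturn CLI on the source CSV.
!pip install saturn-dissect
import subprocess
subprocess.run(
    [
        "saturn", "analyze", "/home/coolhand/datasets/accessibility-atlas/who_hale_long.csv",
        "--findings", "accessibility-atlas-who_hale_long.json",
        "--llm", "anthropic:claude-opus-4-7",
    ],
    check=True,  # raise CalledProcessError if the CLI exits with a non-zero status
)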

Summary confidence: high

This dataset contains 4,070 rows of WHO Healthy Life Expectancy (HALE) data spanning 185 countries, 6 regions, and 22 years from 2000 to 2021. The panel is balanced: each country contributes 22 yearly observations, so the country_code distribution is uniform and not informative on its own. The most interesting variable is hale_years, which ranges from 35.3 to 73.8 with a mean of 61.0 and a median of 63.1. Its distribution is left-skewed (skew = -0.82), indicating a long tail of country-year observations with notably lower healthy life expectancy. Regional coverage is uneven, with Europe (1,100 rows) and Africa (1,034) dominating while South-East Asia contributes only 220 rows. Start by examining the hale_years distribution and how it breaks down by region.
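The suggested first step, breaking hale_years down by region, could be sketched as follows using only the standard library. The function name `hale_by_region` is hypothetical and not part of saturn; the column names come from the schema in this report. The toy rows use placeholder codes and made-up values purely to show the call shape.

```python
import csv
import io
import statistics
from collections import defaultdict


def hale_by_region(csv_file):
    """Group hale_years by region; return count, mean, median, min and max per region."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["region"]].append(float(row["hale_years"]))
    return {
        region: {
            "n": len(values),
            "mean": statistics.fmean(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
        }
        for region, values in groups.items()
    }


# Toy rows with made-up values and placeholder country codes (illustration only).
# For the real file, pass open("who_hale_long.csv", newline="") instead.
sample = io.StringIO(
    "country_code,region,year,hale_years\n"
    "AAA,Europe,2000,66.0\n"
    "AAA,Europe,2001,68.0\n"
    "BBB,Africa,2000,48.0\n"
    "BBB,Africa,2001,50.0\n"
)
summary = hale_by_region(sample)
for region, stats in sorted(summary.items()):
    print(region, stats)
```

Because the panel is balanced, per-region means computed this way weight every country-year equally; regions with more countries simply contribute more rows.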

citing: row_count · column_count · columns · kinds

Out[4]:

saturn.schema() · 4 columns

column        kind         n      null%  unique  alerts
country_code  categorical  4,070  0.0%   185
region        categorical  4,070  0.0%   6
year          numeric      4,070  0.0%   22
hale_years    numeric      4,070  0.0%   345
Fig 1.
hale_years · Look for the left-skewed shape and the lower tail of countries with HALE under ~50 years.
Histogram bins for hale_years (median: 63.1).
bin            count
35.3 – 36.26   1
36.26 – 37.22  2
37.22 – 38.19  4
38.19 – 39.15  12
39.15 – 40.11  13
40.11 – 41.07  10
41.07 – 42.04  14
42.04 – 43     18
43 – 43.96     24
43.96 – 44.92  36
44.92 – 45.89  38
45.89 – 46.85  50
46.85 – 47.81  51
47.81 – 48.77  46
48.77 – 49.74  58
49.74 – 50.7   62
50.7 – 51.66   85
51.66 – 52.62  100
52.62 – 53.59  98
53.59 – 54.55  114
54.55 – 55.51  96
55.51 – 56.47  102
56.47 – 57.44  115
57.44 – 58.4   117
58.4 – 59.36   130
59.36 – 60.33  124
60.33 – 61.29  136
61.29 – 62.25  196
62.25 – 63.21  217
63.21 – 64.17  316
64.17 – 65.14  301
65.14 – 66.1   256
66.1 – 67.06   255
67.06 – 68.03  215
68.03 – 68.99  177
68.99 – 69.95  214
69.95 – 70.91  163
70.91 – 71.88  70
71.88 – 72.84  19
72.84 – 73.8   15
Fig 2.
region · Note the uneven regional representation, with Europe and Africa contributing the most rows and South-East Asia the fewest.
Top values for region (6 unique shown, of 6 total).
value                  count  share
Europe                 1,100  27.0%
Africa                 1,034  25.4%
Americas               748    18.4%
Eastern Mediterranean  484    11.9%
Western Pacific        484    11.9%
South-East Asia        220    5.4%
Fig 3.
year · Confirm the uniform yearly coverage from 2000 to 2021 that makes this a balanced panel.
Histogram bins for year (median: 2010.5). Bin edges are rounded to whole years; the 18 empty bins are gaps between integer years (40 bins of width 0.525 over 2000 to 2021), so each of the 22 years holds exactly 185 rows.
bin          count
2000 – 2001  185
2001 – 2001  185
2001 – 2002  0
2002 – 2002  185
2002 – 2003  0
2003 – 2003  185
2003 – 2004  0
2004 – 2004  185
2004 – 2005  0
2005 – 2005  185
2005 – 2006  0
2006 – 2006  185
2006 – 2007  0
2007 – 2007  185
2007 – 2008  0
2008 – 2008  185
2008 – 2009  0
2009 – 2009  185
2009 – 2010  0
2010 – 2010  185
2010 – 2011  185
2011 – 2012  0
2012 – 2012  185
2012 – 2013  0
2013 – 2013  185
2013 – 2014  0
2014 – 2014  185
2014 – 2015  0
2015 – 2015  185
2015 – 2016  0
2016 – 2016  185
2016 – 2017  0
2017 – 2017  185
2017 – 2018  0
2018 – 2018  185
2018 – 2019  0
2019 – 2019  185
2019 – 2020  0
2020 – 2020  185
2020 – 2021  185
Fig 4.
country_code · 185 countries each appear exactly 22 times; useful as a sanity check on panel completeness.
Top values for country_code (20 unique shown, of 185 total).
value  count  share
AFG    22     0.5%
AGO    22     0.5%
ALB    22     0.5%
ARE    22     0.5%
ARG    22     0.5%
ARM    22     0.5%
ATG    22     0.5%
AUS    22     0.5%
AUT    22     0.5%
AZE    22     0.5%
BDI    22     0.5%
BEL    22     0.5%
BEN    22     0.5%
BFA    22     0.5%
BGD    22     0.5%
BGR    22     0.5%
BHR    22     0.5%
BHS    22     0.5%
BIH    22     0.5%
BLR    22     0.5%
Fig 5.
Per-column null rate across the corpus. Columns are ordered by input position.
column        kind         null %
country_code  categorical  0.0%
region        categorical  0.0%
year          numeric      0.0%
hale_years    numeric      0.0%
Fig 6.
Pearson correlation across 2 numeric columns (sampled, bounded; values clipped to 2 decimals).
            year   hale_years
year        +1.00  +0.20
hale_years  +0.20  +1.00

country_code categorical foreign_key

This column holds ISO 3166-1 alpha-3 country codes, with 185 unique values across 4,070 rows and zero nulls. The distribution is perfectly uniform: the top count is 22 (top_rate 0.005405 = 22 / 4,070) and 185 × 22 = 4,070, so every country must appear exactly 22 times, which matches entropy_ratio = 1.0. This indicates a balanced panel structure (185 countries × 22 periods).

Treatment: Use as a join key to country reference data; pair with a time column to model the panel.
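The panel arithmetic above can be checked directly, and the same idea extends to a completeness check on the raw rows. This is a minimal sketch: it assumes saturn reports entropy in bits (which is consistent with the reported 7.531), and `panel_gaps` is a hypothetical helper, not a saturn function.

```python
import math
from collections import Counter

N_ROWS, N_COUNTRIES, N_YEARS = 4070, 185, 22  # figures reported in this notebook

# A balanced panel has exactly countries x periods rows.
balanced = N_COUNTRIES * N_YEARS == N_ROWS

# For a uniform categorical, entropy is log2(cardinality) and top_rate is count / n.
uniform_entropy = math.log2(N_COUNTRIES)
uniform_top_rate = N_YEARS / N_ROWS
print(balanced, round(uniform_entropy, 3), round(uniform_top_rate, 6))


def panel_gaps(pairs, countries, years):
    """Return (missing, duplicated) (country, year) pairs for an intended balanced panel."""
    seen = Counter(pairs)
    missing = [(c, y) for c in countries for y in years if (c, y) not in seen]
    duplicated = [pair for pair, count in seen.items() if count > 1]
    return missing, duplicated


# Toy input: two codes from the table above and two years; ALB 2001 is omitted on purpose.
pairs = [("AFG", 2000), ("AFG", 2001), ("ALB", 2000)]
missing, duplicated = panel_gaps(pairs, ["AFG", "ALB"], [2000, 2001])
print(missing, duplicated)
```

On the real file, `pairs` would be built from the country_code and year columns; an empty `missing` and an empty `duplicated` list would confirm the 185 × 22 structure row by row.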

anthropic:claude-opus-4-7 · confidence high
Out[12]:

saturn.columns["country_code"].stats

stat           value
n              4,070
nulls          0 (0.0%)
unique         185
top_value      AFG
top_rate       0.005405
cardinality    185
entropy        7.531
entropy_ratio  1
Fig 7.
Top values for country_code.

region categorical feature

This is a categorical region field with 6 distinct values matching WHO regional groupings (Europe, Africa, Americas, Eastern Mediterranean, Western Pacific, South-East Asia) and no nulls across 4,070 rows. The distribution is moderately balanced (entropy_ratio 0.936), though row counts differ fivefold between the extremes: Europe leads at 27.0% (1,100 rows) and South-East Asia trails at 5.4% (220 rows). The WHO-style labels suggest this dataset is sourced from or aligned with WHO global health data.
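The entropy figures for region can be reproduced from the row counts alone. The sketch below assumes entropy is Shannon entropy in bits and that entropy_ratio divides it by log2 of the number of categories; both assumptions match the reported 2.42 and 0.9361 but are not stated by saturn itself. It also derives the implied number of countries per region, assuming each country belongs to exactly one region.

```python
import math

# Row counts per region, copied from the region table in this report.
region_rows = {
    "Europe": 1100,
    "Africa": 1034,
    "Americas": 748,
    "Eastern Mediterranean": 484,
    "Western Pacific": 484,
    "South-East Asia": 220,
}
ROWS_PER_COUNTRY = 22  # each country_code appears 22 times

total = sum(region_rows.values())
shares = {region: n / total for region, n in region_rows.items()}

# Shannon entropy in bits, then normalised by the maximum log2(k) for k categories.
entropy = -sum(p * math.log2(p) for p in shares.values())
entropy_ratio = entropy / math.log2(len(region_rows))

# Implied number of countries per region (assumes one region per country).
countries = {region: n // ROWS_PER_COUNTRY for region, n in region_rows.items()}

print(total, round(entropy, 2), round(entropy_ratio, 4))
print(countries)
```

Every row count is an exact multiple of 22 and the implied country counts sum to 185, so the uneven region shares reflect how many countries each region contains rather than gaps in coverage.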

Treatment: One-hot or target-encode for modelling; safe to use as a stratification key.

anthropic:claude-opus-4-7 · confidence high
Out[15]:

saturn.columns["region"].stats

stat           value
n              4,070
nulls          0 (0.0%)
unique         6
top_value      Europe
top_rate       0.2703
cardinality    6
entropy        2.42
entropy_ratio  0.9361
Fig 8.
Top values for region.

year numeric timestamp

This column captures the calendar year, spanning 2000 to 2021 with 22 distinct integer values across 4,070 rows and no nulls. The distribution is perfectly symmetric (skew 0.0, mean and median both 2010.5) with negative excess kurtosis (-1.20), consistent with a uniform spread: every year contributes the same 185 rows rather than observations concentrating in any period. No outliers are flagged.
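These statistics are what a perfectly uniform year column would produce, which can be confirmed by rebuilding the column from its known structure (22 years, 185 rows each). This sketch assumes the reported std is the sample standard deviation and the reported kurtosis is excess kurtosis; the exact estimators saturn uses are not stated.

```python
import statistics

ROWS_PER_YEAR = 185  # 4,070 rows / 22 years
years = [y for y in range(2000, 2022) for _ in range(ROWS_PER_YEAR)]

mean = statistics.fmean(years)
median = statistics.median(years)
std = statistics.stdev(years)  # sample standard deviation (n - 1 denominator)

# Population excess kurtosis: fourth central moment over squared variance, minus 3.
m2 = statistics.fmean([(y - mean) ** 2 for y in years])
m4 = statistics.fmean([(y - mean) ** 4 for y in years])
excess_kurtosis = m4 / m2**2 - 3

print(len(years), mean, median, round(std, 3), round(excess_kurtosis, 3))
```

The reconstructed values agree with the reported std of 6.345 and kurtosis of -1.205, so the stats carry no information beyond "each year appears equally often".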

Treatment: Treat as a discrete time index for grouping or trend analysis rather than a continuous numeric feature.

anthropic:claude-opus-4-7 · confidence high
Out[18]:

saturn.columns["year"].stats

stat          value
n             4,070
nulls         0 (0.0%)
unique        22
min           2000
max           2021
mean          2010.5
median        2010.5
std           6.345
q1            2005
q3            2016
iqr           11
skew          0
kurtosis      -1.205
n_outliers    0
outlier_rate  0
zero_rate     0
Fig 9.
Distribution of year. Vertical dash marks the median.

hale_years numeric feature

Healthy life expectancy in years (HALE), spanning 35.3 to 73.8 with a mean of 61.03 and a median of 63.1 across 4,070 rows. The distribution is left-skewed (skew -0.82): a long tail of low-HALE observations pulls the mean below the median, while the middle half of the values lies between Q1 = 56.3 and Q3 = 66.4. Only 45 rows (1.1%) are flagged as outliers and there are no nulls, so the column is clean and ready to use.
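The mean and the direction of the skew can be approximately recovered from the 40 histogram bins reported for hale_years in this notebook. The sketch below places every row at its bin midpoint, so the results are approximations of the exact statistics, not replacements for them; the equal-width binning between the reported min and max is an assumption that matches the listed bin edges.

```python
LOW, HIGH = 35.3, 73.8  # reported min and max of hale_years

# Bin counts copied from the hale_years histogram table, lowest bin first.
counts = [
    1, 2, 4, 12, 13, 10, 14, 18, 24, 36,
    38, 50, 51, 46, 58, 62, 85, 100, 98, 114,
    96, 102, 115, 117, 130, 124, 136, 196, 217, 316,
    301, 256, 255, 215, 177, 214, 163, 70, 19, 15,
]

width = (HIGH - LOW) / len(counts)
mids = [LOW + (i + 0.5) * width for i in range(len(counts))]
n = sum(counts)

# Weighted central moments, treating each row as sitting at its bin midpoint.
mean = sum(c * m for c, m in zip(counts, mids)) / n
m2 = sum(c * (m - mean) ** 2 for c, m in zip(counts, mids)) / n
m3 = sum(c * (m - mean) ** 3 for c, m in zip(counts, mids)) / n
skew = m3 / m2**1.5

print(n, round(mean, 2), round(m2**0.5, 2), round(skew, 2))
```

A negative `skew` here, together with a binned mean below the reported median of 63.1, is the numerical counterpart of the long lower tail visible in the histogram.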

Treatment: Use directly as a numeric feature; consider transforming the values to reduce the left skew if a model assumes roughly symmetric inputs.
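The report does not say how outliers are flagged. One common convention is Tukey's 1.5 × IQR rule, and the sketch below checks whether the reported figures are consistent with it. Treat the rule as an assumption, not as saturn's documented method.

```python
Q1, Q3 = 56.3, 66.4            # reported quartiles of hale_years
HALE_MAX = 73.8                # reported maximum
N_ROWS, N_OUTLIERS = 4070, 45  # reported row count and outlier count

iqr = Q3 - Q1
lower_fence = Q1 - 1.5 * iqr
upper_fence = Q3 + 1.5 * iqr

# Histogram bins that end at or below 41.07 lie entirely under the lower fence;
# the next bin (41.07 to 42.04, 14 rows) straddles it.
fully_below = [1, 2, 4, 12, 13, 10]
straddling = 14
min_flagged = sum(fully_below)
max_flagged = min_flagged + straddling

print(round(lower_fence, 2), round(upper_fence, 2))
print(min_flagged, max_flagged, round(N_OUTLIERS / N_ROWS, 5))
```

Under this rule the upper fence lies above the maximum, so every flagged row would sit in the low tail, and the reported 45 outliers fall inside the range the histogram allows (42 to 56).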

anthropic:claude-opus-4-7 · confidence high
Out[21]:

saturn.columns["hale_years"].stats

stat          value
n             4,070
nulls         0 (0.0%)
unique        345
min           35.3
max           73.8
mean          61.03
median        63.1
std           7.344
q1            56.3
q3            66.4
iqr           10.1
skew          -0.8235
kurtosis      0.02741
n_outliers    45
outlier_rate  0.01106
zero_rate     0
Fig 10.
Distribution of hale_years. Vertical dash marks the median.

How to cite

BibTeX
@misc{saturn-accessibility-atlas-who-hale-long-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: accessibility atlas who hale long},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/accessibility-atlas-who_hale_long}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: accessibility atlas who hale long. Source: /home/coolhand/datasets/accessibility-atlas/who_hale_long.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/accessibility-atlas-who_hale_long