saturn

accessibility atlas who hale long

source: /home/coolhand/datasets/accessibility-atlas/who_hale_long.csv · 4,070 rows · 4 columns · profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This dataset contains 4,070 rows of WHO Healthy Life Expectancy (HALE) data covering 185 countries in 6 regions over the 22 years from 2000 to 2021. The panel is balanced: each country contributes 22 yearly observations (185 × 22 = 4,070), so the country_code distribution is uniform and not informative on its own. The variable of most interest is hale_years, which ranges from 35.3 to 73.8 with a mean of 61.0 and a left-skewed distribution (skew = -0.82), indicating a tail of country-year observations with notably lower healthy life expectancy. Regional coverage is uneven, with Europe (1,100 rows) and Africa (1,034 rows) dominating while South-East Asia contributes only 220 rows. Start by examining the hale_years distribution and how it breaks down by region.

citing: row_count · column_count · columns · kinds
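
The balanced-panel claim in the summary can be checked mechanically. The sketch below is a minimal illustration, not the profiler's own logic: `panel_shape` is a hypothetical helper, and the rows are synthetic placeholders (codes `C000` to `C184`) standing in for the real CSV, which is not read here.

```python
from collections import Counter

def panel_shape(rows):
    """Return (n_units, n_periods, is_balanced) for a list of (unit, period) pairs."""
    pairs = set(rows)
    units = {unit for unit, _ in pairs}
    periods = {period for _, period in pairs}
    per_unit = Counter(unit for unit, _ in pairs)
    balanced = (
        len(pairs) == len(rows)  # no duplicated (unit, period) pair
        and all(count == len(periods) for count in per_unit.values())
    )
    return len(units), len(periods), balanced

# Synthetic stand-in for the file: 185 placeholder codes x years 2000..2021.
rows = [(f"C{i:03d}", year) for i in range(185) for year in range(2000, 2022)]
n_units, n_periods, balanced = panel_shape(rows)
print(n_units, n_periods, balanced, n_units * n_periods)  # prints: 185 22 True 4070
```

With the real file, the same function would take the (country_code, year) pairs of each row; dropping any single row makes the third value False.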

Schema

4 columns
column        kind         nulls  unique  alerts
country_code  categorical  0.0%   185     -
region        categorical  0.0%   6       -
year          numeric      0.0%   22      -
hale_years    numeric      0.0%   345     -

country_code

categorical foreign_key
This column holds ISO 3166-1 alpha-3 country codes, with 185 unique values across 4,070 rows and zero nulls. The distribution is perfectly uniform: every visible top value appears exactly 22 times and entropy_ratio is 1.0, which strongly suggests a panel structure (185 countries × 22 periods). Treatment: Use as a join key to country reference data; pair with a time column to model the panel. high · anthropic:claude-opus-4-7
n: 4,070
nulls: 0 (0.0%)
unique: 185
top_value: AFG
top_rate: 0.005405
cardinality: 185
entropy: 7.531
entropy_ratio: 1
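
The entropy figures above are consistent with Shannon entropy in bits, normalised by log2 of the cardinality. The report does not state the formula, so treat that as an assumption. A minimal sketch with hypothetical helper names and synthetic placeholder codes:

```python
import math
from collections import Counter

def entropy_bits(values):
    """Shannon entropy of the empirical distribution, in bits."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_ratio(values):
    """Entropy divided by its maximum, log2(number of distinct values)."""
    k = len(set(values))
    return entropy_bits(values) / math.log2(k) if k > 1 else 0.0

# Synthetic stand-in: 185 placeholder codes, each repeated 22 times.
codes = [f"C{i:03d}" for i in range(185) for _ in range(22)]
h = entropy_bits(codes)
ratio = entropy_ratio(codes)
top_rate = 22 / len(codes)
print(round(h, 3), round(ratio, 4), round(top_rate, 6))
```

Under this assumption a uniform column reaches the maximum entropy log2(185), which is why the ratio is 1 and the top_rate is simply 22 / 4,070.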

region

categorical feature
This is a categorical region field with 6 distinct values matching the WHO regional groupings (Europe, Africa, Americas, Eastern Mediterranean, Western Pacific, South-East Asia) and no nulls across 4,070 rows. The distribution is moderately balanced (entropy ratio 0.936), though Europe leads at 27% (1,100 rows) and South-East Asia trails at 5.4% (220 rows). The labels suggest the dataset is sourced from or aligned with WHO global health data. Treatment: One-hot or target-encode for modelling; safe to use as a stratification key. high · anthropic:claude-opus-4-7
n: 4,070
nulls: 0 (0.0%)
unique: 6
top_value: Europe
top_rate: 0.2703
cardinality: 6
entropy: 2.42
entropy_ratio: 0.9361
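
As an illustration of the one-hot treatment suggested for this column, here is a minimal sketch. The label spellings are taken from the description above and are assumed to match the file exactly; `REGIONS` and `one_hot` are hypothetical names.

```python
# Region labels as listed in the report; exact spellings in the CSV are an assumption.
REGIONS = [
    "Africa",
    "Americas",
    "Eastern Mediterranean",
    "Europe",
    "South-East Asia",
    "Western Pacific",
]

def one_hot(region, categories=REGIONS):
    """Return a 0/1 indicator vector with a single 1 at the region's position."""
    if region not in categories:
        raise ValueError(f"unknown region: {region!r}")
    return [1 if region == c else 0 for c in categories]

print(one_hot("Europe"))  # prints: [0, 0, 0, 1, 0, 0]

# The reported top_rate follows from the Europe row count.
europe_rate = 1100 / 4070
print(round(europe_rate, 4))  # prints: 0.2703
```

Raising on unknown labels is a deliberate choice here: a misspelled region should fail loudly instead of silently encoding as all zeros.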

year

numeric timestamp
This column captures the calendar year, spanning 2000 to 2021 with 22 distinct integer values across 4,070 rows and no nulls. The distribution is perfectly symmetric (skew 0.0, mean equals median at 2010.5) with negative kurtosis (-1.20), consistent with a uniform spread across years (185 observations per year in a balanced panel) rather than a concentration in any period. No outliers are flagged. Treatment: Treat as a discrete time index for grouping or trend analysis rather than a continuous numeric feature. high · anthropic:claude-opus-4-7
n: 4,070
nulls: 0 (0.0%)
unique: 22
min: 2000
max: 2021
mean: 2010.5
median: 2010.5
std: 6.345
q1: 2005
q3: 2016
iqr: 11
skew: 0
kurtosis: -1.205
n_outliers: 0
outlier_rate: 0
zero_rate: 0
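
The year statistics are what a uniform spread over 2000 to 2021 would produce. The sketch below rebuilds such a column (185 observations per year, following the balanced-panel reading) and recomputes the moments. It assumes the reported std is the sample standard deviation and the kurtosis is excess kurtosis; the report states neither.

```python
import statistics

# Synthetic year column: each of the 22 years appears 185 times.
years = [y for y in range(2000, 2022) for _ in range(185)]

mean = statistics.fmean(years)
median = statistics.median(years)
sample_std = statistics.stdev(years)  # n - 1 in the denominator

# Population central moments, used for excess kurtosis (m4 / m2^2 - 3).
m2 = sum((y - mean) ** 2 for y in years) / len(years)
m4 = sum((y - mean) ** 4 for y in years) / len(years)
excess_kurtosis = m4 / m2 ** 2 - 3

print(mean, median, round(sample_std, 3), round(excess_kurtosis, 3))
```

The mean and median both come out at 2010.5, so a displayed value of 2010 would be a rounding of that figure.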

hale_years

numeric feature
Healthy life expectancy in years (HALE), spanning 35.3 to 73.8 with a mean of 61.03 and a median of 63.1 across 4,070 rows. The distribution is left-skewed (skew -0.82): a tail of low-HALE observations pulls the mean below the median, while the bulk sits between Q1 = 56.3 and Q3 = 66.4. Only 1.1% of values (45) are flagged as outliers and there are no nulls, so the column is clean and ready to use. Treatment: Use directly as a numeric feature; consider transforming it to reduce the left skew if a downstream model assumes an approximately symmetric distribution. high · anthropic:claude-opus-4-7
n: 4,070
nulls: 0 (0.0%)
unique: 345
min: 35.3
max: 73.8
mean: 61.03
median: 63.1
std: 7.344
q1: 56.3
q3: 66.4
iqr: 10.1
skew: -0.8235
kurtosis: 0.02741
n_outliers: 45
outlier_rate: 0.01106
zero_rate: 0
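
The report does not say how outliers are flagged. If the common Tukey rule (1.5 × IQR beyond the quartiles) is assumed, the fences can be derived from the quartiles above. `tukey_fences` is a hypothetical helper, and the rule itself is an assumption, not a documented property of the profiler.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences under Tukey's k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles and extremes as reported for hale_years.
q1, q3 = 56.3, 66.4
col_min, col_max = 35.3, 73.8

lower, upper = tukey_fences(q1, q3)
print(round(lower, 2), round(upper, 2))   # fences implied by the quartiles
print(col_min < lower, col_max > upper)   # which side can hold outliers

outlier_rate = 45 / 4070
print(round(outlier_rate, 5))
```

Under that assumption the upper fence lies above the column maximum, so all 45 flagged values would sit in the low tail, in line with the left skew.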