saturn·

.cache who yld region

source /home/coolhand/html/datavis/data_trove/data/accessibility/.cache_who/yld_region.xlsx#Notes · 196 rows · 2 columns · profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This is the 'Notes' sheet from a WHO Global Health Estimates 2021 workbook on years lost due to disability (YLDs) by region, with 196 rows and just 2 columns. The first column is almost entirely empty (96.94% null) and contains only six narrative blurbs, while the second column carries 190 unique short strings, mostly country names plus a handful of header and citation lines. In other words, this is not analytical data: it is a metadata and documentation sheet listing WHO member states and citation text. Before doing any analysis, look at the workbook's other sheets; the meaningful YLD numbers live elsewhere.

citing: row_count · column_count · columns[0].null_rate · columns[0].n_unique · columns[1].n_unique · columns[1].top_value · columns[1].top_values
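
Since the summary recommends looking at the workbook's other sheets, a minimal sketch of listing them with only the Python standard library follows. It relies on the fact that an .xlsx file is a ZIP archive whose xl/workbook.xml part names each sheet; the function name list_sheet_names is hypothetical, and the namespace below is the one used by ordinary (transitional) .xlsx files, which is assumed rather than confirmed for this workbook.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of SpreadsheetML elements in ordinary (transitional) .xlsx files.
# Assumption: the WHO workbook is not saved in the "strict" OOXML variant.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_sheet_names(xlsx):
    """Return sheet names in workbook order.

    `xlsx` is a filesystem path or a binary file object (hypothetical helper).
    """
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    # Each <sheet> element carries the visible tab name in its "name" attribute.
    return [sheet.attrib["name"] for sheet in root.iter(f"{{{MAIN_NS}}}sheet")]
```

Calling list_sheet_names on the path shown in the source line would return every tab name, including 'Notes'. If pandas is available, pd.ExcelFile(path).sheet_names should give the same list.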

Schema

2 columns
Per-column summary.
column · type · null rate · unique · alerts
__UNNAMED__0 · categorical · 96.9% · 6 · long_tail, null_rate
GLOBAL HEALTH ESTIMATES 2021 SUMMARY TABLES: · categorical · 3.1% · 190 · long_tail

__UNNAMED__0

categorical metadata long_tail null_rate
This is the unnamed first column of a WHO Global Health Estimates 2021 spreadsheet, holding spillover header/notes text rather than tabular data. Out of 196 rows, 96.94% are null and only 6 unique strings appear, including the workbook's preamble, a recommended citation, and section labels like 'List of Countries'. It is documentation scaffolding, not a feature. Treatment: Drop; this column carries spreadsheet preamble text, not data. high · anthropic:claude-opus-4-7
n: 196
nulls: 190 (96.9%)
unique: 6
top_value
Global Health Estimates 2021 Summary Tables This workbook contains summary burden of disease estimates from the WHO Global Health Estimates (GHE). The estimates are based on analysis of latest available national information on levels of mortality and cause distributions as of the end of 2023 together with latest available information from WHO programs for causes of public health importance. Data, methods and cause categories are described in a Technical Paper (1) available on the WHO website. Population estimates are from the 2022 revision of the UN World Population Prospects (2). This spreadsheet includes point estimates for years lost due to disability (YLDs) by WHO region and by cause, age and sex, for the years 2000, 2010, 2015, 2019, 2020 and 2021. Documentation, country-level and regional-level summary tables are available on the WHO website ( https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates ). Depending on the available data sources, the cause-specific estimates will have quite substantial uncertainty ranges. Due to changes in data and some methods, these estimates are not comparable to previously-released WHO estimates. The preparation of these statistics was undertaken by the WHO Department of Data and Analytics, in collaboration with WHO technical programs. For further queries, please send an email to healthstat@who.int . References: (1) WHO methods and data sources for global burden of disease 2000-2021. Global Health Estimates Technical Paper WHO/DDI/DNA/GHE/2020.3.Geneva: World Health Organization; 2024 (https://www.who.int/docs/default-source/gho-documents/global-health-estimates/GlobalBurden_method_2000_2021.pdf). (2) World Population Prospects: The 2019 revision. New York: United Nations, Department of Economic and Social Affairs, Population Division; 2019 (https://esa.un.org/unpd/wpp/).
top_rate: 0.1667
cardinality: 6
entropy: 2.585
entropy_ratio: 1
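
The figures above are consistent with six distinct values that each occur exactly once. The sketch below shows one way such numbers could be derived. It assumes that entropy is Shannon entropy in bits over the non-null values and that entropy_ratio is that entropy divided by log2(cardinality); the profiler's actual definitions are not stated on this page, and categorical_stats and the placeholder strings are hypothetical.

```python
import math
from collections import Counter


def categorical_stats(values):
    """Summarise a categorical column; None marks a null cell."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no non-null values")
    counts = Counter(present)
    n = len(present)
    cardinality = len(counts)
    # Shannon entropy in bits over the observed value frequencies.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Maximum possible entropy for this many distinct values.
    max_entropy = math.log2(cardinality) if cardinality > 1 else 0.0
    return {
        "null_rate": (len(values) - n) / len(values),
        "cardinality": cardinality,
        "top_rate": max(counts.values()) / n,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }


# Placeholder data with the same shape as this column:
# 6 distinct non-null strings and 190 nulls (196 rows in total).
column = [f"blurb {i}" for i in range(6)] + [None] * 190
stats = categorical_stats(column)
```

Under these assumptions the placeholder column gives a null rate of 190/196, a top_rate of 1/6 and an entropy of log2(6), which round to the values reported for this column.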

GLOBAL HEALTH ESTIMATES 2021 SUMMARY TABLES:

categorical metadata long_tail
This column appears to be the label column of a WHO Global Health Estimates 2021 summary table, mixing report metadata (publisher, date, URL, table title) with a list of country names. It is the second column of the sheet, not the leftmost one; the unnamed column sits to its left. With 190 unique values in 190 non-null rows (196 rows in total) and a top_rate of 0.0053, every non-null value is a singleton, and the entropy_ratio of 1.0 confirms maximum dispersion. The 3.06% nulls (6 rows) likely correspond to blank spacer rows in the original spreadsheet layout. Treatment: Split into header metadata rows and country rows, then promote the country values to a proper key column. high · anthropic:claude-opus-4-7
n: 196
nulls: 6 (3.1%)
unique: 190
top_value
YLDs BY CAUSE, AGE AND SEX, BY WHO REGION, 2000-2021
top_rate: 0.005263
cardinality: 190
entropy: 7.57
entropy_ratio: 1
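
The treatment suggested for this column (separate the header metadata rows from the country rows) could be sketched as follows. The splitting rule is an assumption: it treats every second-column value that follows the 'List of Countries' label in the first column as a country name, which this page does not confirm. The function name split_notes_rows and the sample rows are hypothetical.

```python
def split_notes_rows(rows, marker="List of Countries"):
    """Split (first_col, second_col) rows into metadata and country values.

    Assumption: country names are the second-column values that appear
    after the row whose first column equals `marker`.
    """
    metadata, countries = [], []
    in_country_list = False
    for first, second in rows:
        target = countries if in_country_list else metadata
        # Skip blank spacer cells; keep everything else, trimmed.
        if second is not None and second.strip():
            target.append(second.strip())
        # Flip the flag after handling the row, so a value sitting on the
        # marker row itself is still counted as metadata.
        if first is not None and first.strip() == marker:
            in_country_list = True
    return metadata, countries


# Hypothetical sample rows in the shape of the 'Notes' sheet.
rows = [
    ("Global Health Estimates 2021 Summary Tables", None),
    (None, "YLDs BY CAUSE, AGE AND SEX, BY WHO REGION, 2000-2021"),
    ("List of Countries", None),
    (None, "Afghanistan"),
    (None, "Albania"),
]
metadata, countries = split_notes_rows(rows)
```

The resulting countries list can then serve as the key column, while the metadata list keeps the title and citation lines for documentation.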