
.cache who yld global

source: /home/coolhand/html/datavis/data_trove/data/accessibility/.cache_who/yld_global.xlsx#Notes · 196 rows · 2 columns · profiled 2026-05-01

Reading

dataset summary · high confidence · anthropic:claude-opus-4-7

This is a small 'Notes' sheet (196 rows, 2 columns) extracted from a WHO Global Health Estimates 2021 workbook on years lost due to disability (YLDs). It is metadata plus a country list rather than a tabular dataset: the unnamed first column is 96.94% null (190 of 196 rows) and holds only 6 distinct header/note strings, while the second column holds 190 distinct values, each occurring once, most of them country names. The most useful check is to read the second column's values and confirm that they form the WHO Member State list. Treat this sheet as documentation; the burden-of-disease estimates themselves live on other sheets of the source workbook.

citing: row_count · column_count · columns[0].null_rate · columns[0].n_unique · columns[1].n_unique · columns[1].top_value · columns[1].null_rate

Schema

2 columns
Per-column summary.

column · type · nulls · unique · alerts
__UNNAMED__0 · categorical · 96.9% · 6 · long_tail, null_rate
GLOBAL HEALTH ESTIMATES 2021 SUMMARY TABLES: · categorical · 3.1% · 190 · long_tail

__UNNAMED__0

categorical metadata long_tail null_rate
This unnamed column is almost certainly the leading text/header block of a WHO Global Health Estimates 2021 spreadsheet rather than a true data field: its values include the workbook description, a 'Recommended citation:' label, and 'List of Countries'. Of 196 rows, 190 (96.94%) are null and only 6 distinct strings appear, each occurring once, so the column carries no analytic signal. The top value is a multi-paragraph documentation blob, which indicates that spreadsheet preamble leaked in during ingest. Treatment: drop; this is spreadsheet header/preamble text, not a data column. high · anthropic:claude-opus-4-7
n
196
nulls
190 (96.9%)
unique
6
top_value
Global Health Estimates 2021 Summary Tables This workbook contains summary burden of disease estimates from the WHO Global Health Estimates (GHE). The estimates are based on analysis of latest available national information on levels of mortality and cause distributions as of the end of 2023 together with latest available information from WHO programs for causes of public health importance. Data, methods and cause categories are described in a Technical Paper (1) available on the WHO website. Population estimates are from the 2024 revision of the UN World Population Prospects (2). This spreadsheet includes point estimates for years lost due to disability (YLDs), globally, by cause, age and sex, for the years 2000, 2010, 2015, 2019, 2020 and 2021. Documentation, country-level and regional-level summary tables are available on the WHO website ( https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates ). Depending on the available data sources, the cause-specific estimates will have quite substantial uncertainty ranges. Due to changes in data and some methods, these estimates are not comparable to previously-released WHO estimates. The preparation of these statistics was undertaken by the WHO Department of Data and Analytics, in collaboration with WHO technical programs. For further queries, please send an email to healthstat@who.int . References: (1) WHO methods and data sources for global burden of disease 2000-2021. Global Health Estimates Technical Paper WHO/DDI/DNA/GHE/2020.3.Geneva: World Health Organization; 2024 (https://www.who.int/docs/default-source/gho-documents/global-health-estimates/GlobalBurden_method_2000_2021.pdf). (2) World Population Prospects: The 2024 revision. New York: United Nations, Department of Economic and Social Affairs, Population Division; 2024 (https://esa.un.org/unpd/wpp/).
top_rate
0.1667
cardinality
6
entropy
2.585
entropy_ratio
1
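
The statistics above are consistent with one particular reading: top_rate is the share of the most frequent value among non-null cells, entropy is Shannon entropy in bits, and entropy_ratio divides that entropy by log2 of the number of distinct values. These definitions are inferred from the reported numbers, not taken from the profiler's source, and the function name profile_values is hypothetical. A minimal Python sketch under those assumptions:

```python
import math
from collections import Counter


def profile_values(values):
    """Summarize one column given as a list; None marks a null cell.

    Assumed definitions (inferred from the report, not confirmed):
    top_rate and entropy are computed over non-null cells only,
    entropy is in bits, and entropy_ratio = entropy / log2(unique).
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    total = len(present)
    unique = len(counts)

    # Share of the single most frequent non-null value.
    top_rate = max(counts.values()) / total if total else 0.0
    # Shannon entropy of the non-null value distribution, in bits.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Largest entropy possible with this many distinct values.
    max_entropy = math.log2(unique) if unique > 1 else 0.0

    return {
        "n": len(values),
        "nulls": len(values) - total,
        "unique": unique,
        "top_rate": top_rate,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }
```

With 6 distinct strings and 190 nulls, this gives a top_rate of 1/6 (about 0.1667) and an entropy of log2(6) (about 2.585 bits), which matches the figures listed for this column. An entropy_ratio of 1 then simply means that no value repeats.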

GLOBAL HEALTH ESTIMATES 2021 SUMMARY TABLES:

categorical identifier long_tail
This appears to be a malformed header column from a WHO Global Health Estimates 2021 summary table: document metadata (title, date, publisher, URL) and country names are stacked in a single column. With 190 unique values among 190 non-null rows, no value occurs more than once (top_rate 0.005263), so the column behaves as a free-text identifier rather than a categorical feature. The entropy ratio of 1.0 confirms that every populated value is distinct, and the 3.06% null rate corresponds to 6 blank rows. Treatment: drop or re-parse; this column mixes report metadata with country labels, and every non-null value is unique. high · anthropic:claude-opus-4-7
n
196
nulls
6 (3.1%)
unique
190
top_value
GLOBAL YLDs BY CAUSE, AGE AND SEX, 2000-2021
top_rate
0.005263
cardinality
190
entropy
7.57
entropy_ratio
1
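
The "re-parse" treatment for this column could look roughly like the sketch below. It assumes, without confirmation from the sheet itself, that country names are the non-null second-column values on rows at or after the row whose first column reads 'List of Countries', and that everything before that row is report metadata. The function name split_notes_rows and the sample rows are hypothetical.

```python
def split_notes_rows(rows, marker="List of Countries"):
    """Split (first_col, second_col) rows into metadata and country names.

    Assumption: the marker string in the first column starts the country
    list; second-column values from that row onward are country names.
    None marks an empty cell.
    """
    metadata, countries = [], []
    in_country_list = False
    for label, value in rows:
        if label is not None and label.strip() == marker:
            in_country_list = True
        if value is None or not value.strip():
            continue  # skip blank cells in the second column
        target = countries if in_country_list else metadata
        target.append(value.strip())
    return metadata, countries


# Toy rows for illustration only; the real sheet layout may differ.
sample_rows = [
    (None, "GLOBAL YLDs BY CAUSE, AGE AND SEX, 2000-2021"),
    (None, None),
    ("List of Countries", "Afghanistan"),
    (None, "Albania"),
    (None, None),
    (None, " Algeria "),
]
metadata, countries = split_notes_rows(sample_rows)
```

If the marker is never found, every value lands in the metadata list and the country list stays empty, which makes a layout mismatch easy to spot before the result is compared against the WHO Member State list.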