
.cache who daly region

source /home/coolhand/html/datavis/data_trove/data/accessibility/.cache_who/daly_region.xlsx#Notes · 196 rows · 2 columns · profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This is the 'Notes' sheet from the WHO Global Health Estimates 2021 workbook on DALYs by cause, age and sex, by WHO region, 2000-2021. It is a metadata and country-listing tab rather than analytical data: 196 rows across two columns. The first column (__UNNAMED__0) is 96.94% null and carries only six header/citation strings. The second column holds 190 non-null entries, all distinct: predominantly the list of WHO Member States plus a few title and source lines. Treat this sheet as documentation; the DALY figures live on other sheets of the workbook.

citing: row_count · column_count · columns[0].null_rate · columns[0].n_unique · columns[0].top_values · columns[1].n_unique · columns[1].null_rate · columns[1].top_values

Schema

2 columns
Per-column summary.
column · type · null rate · unique · alerts
__UNNAMED__0 · categorical · 96.9% · 6 · long_tail, null_rate
GLOBAL HEALTH ESTIMATES 2021 SUMMARY TABLES: · categorical · 3.1% · 190 · long_tail

__UNNAMED__0

categorical metadata long_tail null_rate
This unnamed column appears to be spreadsheet header/preamble text from a WHO Global Health Estimates 2021 workbook, likely the first column of an Excel sheet read without a header row. Of 196 rows, 190 (96.94%) are null, and the 6 remaining values are all unique, each appearing once. They are documentation strings (citation, sheet titles, methodology notes) rather than analytical data. The column carries no tabular signal; it is metadata bleed-through from the source file. Treatment: Drop; this is sheet preamble, not data. Re-read the source with the correct header row. high · anthropic:claude-opus-4-7
n
196
nulls
190 (96.9%)
unique
6
top_value
Global Health Estimates 2021 Summary Tables This workbook contains summary burden of disease estimates from the WHO Global Health Estimates (GHE). The estimates are based on analysis of latest available national information on levels of mortality and cause distributions as of the end of 2023 together with latest available information from WHO programs for causes of public health importance. Data, methods and cause categories are described in a Technical Paper (1) available on the WHO website. Population estimates are from the 2022 revision of the UN World Population Prospects (2). This spreadsheet includes estimates for disability-adjusted life year (DALY) by WHO region and by cause, age and sex, for the years 2000, 2010, 2015, 2019, 2020 and 2021. Documentation, country-level and regional-level summary tables are available on the WHO website ( https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates ). Depending on the available data sources, the cause-specific estimates will have quite substantial uncertainty ranges. Due to changes in data and some methods, these estimates are not comparable to previously-released WHO estimates. The preparation of these statistics was undertaken by the WHO Department of Data and Analytics, in collaboration with WHO technical programs. For further queries, please send an email to healthstat@who.int . References: (1) WHO methods and data sources for global burden of disease 2000-2021. Global Health Estimates Technical Paper WHO/DDI/DNA/GHE/2020.3.Geneva: World Health Organization; 2024 (https://www.who.int/docs/default-source/gho-documents/global-health-estimates/GlobalBurden_method_2000_2021.pdf). (2) World Population Prospects: The 2019 revision. New York: United Nations, Department of Economic and Social Affairs, Population Division; 2019 (https://esa.un.org/unpd/wpp/).
top_rate
0.1667
cardinality
6
entropy
2.585
entropy_ratio
1
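The figures in this stat block can be reproduced by hand. The sketch below assumes the profiler computes top_rate, entropy and entropy_ratio over non-null values only, with entropy in bits and entropy_ratio defined as entropy divided by log2(cardinality). Those definitions are inferred from the numbers shown here, not taken from the profiler's code, and `profile_column` is a hypothetical helper name.

```python
import math
from collections import Counter


def profile_column(values):
    """Summarise one column; None stands in for a null cell (assumed definitions)."""
    n = len(values)
    non_null = [v for v in values if v is not None]
    total = len(non_null)
    counts = Counter(non_null)
    cardinality = len(counts)

    # Shannon entropy in bits over the distribution of non-null values.
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)

    # Maximum possible entropy for this many distinct values.
    max_entropy = math.log2(cardinality) if cardinality > 1 else 0.0

    return {
        "n": n,
        "nulls": n - total,
        "null_rate": (n - total) / n if n else 0.0,
        "cardinality": cardinality,
        "top_rate": max(counts.values()) / total if total else 0.0,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }


# Synthetic stand-in for __UNNAMED__0: 6 distinct strings plus 190 nulls.
column = [f"note {i}" for i in range(6)] + [None] * 190
stats = profile_column(column)

# Synthetic stand-in for the second column: 190 distinct labels plus 6 nulls.
labels = [f"label {i}" for i in range(190)] + [None] * 6
label_stats = profile_column(labels)
```

Under these assumptions, six values that each appear once give an entropy of log2(6) and an entropy_ratio of 1, which is why a ratio of 1 here signals "every value is different" rather than a healthy spread across categories.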

GLOBAL HEALTH ESTIMATES 2021 SUMMARY TABLES:

categorical label long_tail
This column appears to be a free-text leftmost label from a WHO summary table, mixing report metadata (title "DALYs BY CAUSE, AGE AND SEX, BY WHO REGION, 2000-2021", "July 2024", "World Health Organization", "Geneva, Switzerland", a URL) with country names (Afghanistan, Albania, Algeria...). With 190 unique values among the 190 non-null rows (196 rows in total) and entropy_ratio 1.0, every non-null entry is distinct; together with the long_tail alert and 3.06% nulls, this shows the column is not a clean categorical. The notable problem is the header rows bleeding into the data: the column was never normalized after import, and its name is itself a title line. Treatment: Strip the leading metadata rows and rename this column to 'country' before joining or analysis. high · anthropic:claude-opus-4-7
n
196
nulls
6 (3.1%)
unique
190
top_value
DALYs BY CAUSE, AGE AND SEX, BY WHO REGION, 2000-2021
top_rate
0.005263
cardinality
190
entropy
7.57
entropy_ratio
1
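The treatment suggested for this column (strip the leading metadata rows, keep the country names) might look like the sketch below. It works on a plain list of cell values rather than the workbook itself. `extract_countries` and its `first_country` parameter are hypothetical, and the code assumes the Member State list starts at "Afghanistan" and runs to the end of the column. Any trailing source or footnote lines would need a separate check.

```python
def extract_countries(labels, first_country="Afghanistan"):
    """Return the entries from the first country name onward, skipping nulls.

    `first_country` is an assumption about where the Member State list
    begins; it is not derived from the workbook.
    """
    # Drop null and blank cells, and trim stray whitespace.
    cleaned = [v.strip() for v in labels if isinstance(v, str) and v.strip()]
    try:
        start = cleaned.index(first_country)
    except ValueError:
        raise ValueError(
            f"{first_country!r} not found; inspect the column by hand"
        ) from None
    return cleaned[start:]


# Synthetic excerpt shaped like the label column described above.
sample = [
    "DALYs BY CAUSE, AGE AND SEX, BY WHO REGION, 2000-2021",
    "July 2024",
    None,
    "World Health Organization",
    "Geneva, Switzerland",
    "Afghanistan",
    "Albania",
    "Algeria",
]
countries = extract_countries(sample)
```

Anchoring on a known first value rather than a fixed row offset keeps the sketch independent of how many preamble rows the sheet has, at the cost of failing loudly if that value is missing or spelled differently.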