Summary confidence: high
This dataset is a comprehensive catalogue of the world's languoids from Glottolog, covering 23,740 entries that span dialects (10,920), languages (8,481), and language families (4,339). The most striking pattern is in endangerment status: while the majority (18,965) are marked 'safe', nearly 4,800 entries are endangered, extinct, or vulnerable — worth examining closely against family and geographic distribution. A second area of interest is the highly skewed child-count columns: most languoids have zero children (74% for dialects, 82% for languages), but a handful of nodes have hundreds or even thousands of descendants, suggesting a very uneven tree structure. Geographic coverage is also notably incomplete, with latitude and longitude missing for 66% of rows, limiting spatial analysis to a subset of the data.
citing: row_count · level.top_values · status.top_values · child_dialect_count.zero_rate · child_dialect_count.max · child_language_count.zero_rate · child_language_count.max · latitude.null_rate · family_id.top_values · country_ids.top_values