Summary confidence: medium
This dataset is a reference catalogue of 7,130 world languages, each identified by a unique ISO 639-3 three-letter code alongside its name and several linked data sources (Glottolog, Joshua Project, speaker counts, and more). Every row is distinct — no duplicate language codes or names — making this primarily a lookup/reference table. Two things stand out for closer inspection: first, the name column reveals notable clusters around directional qualifiers (Southern, Northern, Western, Eastern) and language families like Zapotec (58), Mixtec (52), and Naga (49), suggesting rich geographic and genealogical structure worth exploring. Second, 'sign' appears 152 times in language names, indicating a surprisingly large representation of sign languages across the world's documented tongues.
citing: row_count · column_count · columns[0].n_unique · columns[1].n_unique · columns[1].stats.len_max · columns[1].stats.len_mean · columns[1].top_words · columns[1].stats.one_word_rate