saturn

/home/coolhand/servers/diachronica/data_raw/wals_language.csv; 3,573 rows; sample n=3,573; seed 42; 2026-05-01T17:52:07+00:00

Overview

Source: /home/coolhand/servers/diachronica/data_raw/wals_language.csv
Total rows: 3,573
Profiled sample: 3,573
Columns: 17
Generated: 2026-05-01T17:52:07+00:00

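Insights (opt-in)

Model-generated narrative. These are opinions, not facts; the stats below are what saturn measured. Requested model: anthropic:claude-opus-4-7. No narrative was produced in this run: all 18 insight requests (1 dataset-level, 17 per-column) failed with the errors listed below.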

Errors during insight pass (18)
  • dataset:__global__:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGsv59tCMcNmTCSFGmJ'}
  • column:ID:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGsvaun1aSDxETPBuJj'}
  • column:Name:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGsw99QVV9hP7u9U3Bq'}
  • column:Macroarea:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGswfemh6twvutVRr24'}
  • column:Latitude:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGsxDtCotu8uTyoJwCU'}
  • column:Longitude:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGsxjeKqvDrF4EVt7Lb'}
  • column:Glottocode:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGsyDR2GyWfviyxMT7U'}
  • column:ISO639P3code:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGszfyPtr3rge8nax8d'}
  • column:Family:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt1DU2a9tTWcULDzTn'}
  • column:Subfamily:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt1nCge2xRm4p7LPYD'}
  • column:Genus:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt2JhYeEaQYaSTt6R2'}
  • column:GenusIcon:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt2oU91AZQpbZjSqcE'}
  • column:ISO_codes:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt3U9uA8ChTBEWe8fG'}
  • column:Samples_100:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt438sKtDkSTPE3ct7'}
  • column:Samples_200:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt4UBCVcVVukzj5DRE'}
  • column:Country_ID:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt51g2qiTWEApSLcZH'}
  • column:Source:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt5e7yLWHsqCD3y361'}
  • column:Parent_ID:anthropic:claude-opus-4-7: BadRequestError — Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}, 'request_id': 'req_011CacGt6MHgbcSqLFBdB24q'}

Numeric correlation

ID (text)

100.0% of rows are unique strings; 100.0% of rows are a single word; 95th-percentile length under 20 chars
rows: 3,573
null: 0 (0.0%)
unique: 3,573
len_min: 2
len_max: 36
len_mean: 5.982
len_median: 3.000
len_p95: 17.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 3,573
readability_flesch_mean: 61.577
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. abd
  2. genus-araucanian
  3. genus-misumalpan
  4. subfamily-palaihnihan
  5. mmp
  6. family-tunica
  7. mrj
  8. genus-northwestcaucasian
  9. genus-huavean
  10. arg
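
The sample IDs above suggest a naming convention: IDs prefixed with family-, subfamily- or genus- appear to be classification nodes, and unprefixed IDs appear to be languages. This is an inference from the ten samples, not something saturn reports. It is consistent with the counts elsewhere in this report: 254 unique Family values + 32 unique Subfamily values + 625 unique Genus values = 911, which equals the number of rows with a null Macroarea. A minimal sketch of a classifier under that assumption (node_level and NODE_PREFIXES are hypothetical names):

```python
# Hypothetical helper: the prefix convention is inferred from the sample IDs
# in the report, not confirmed against the full file.
NODE_PREFIXES = ("family-", "subfamily-", "genus-")


def node_level(row_id: str) -> str:
    """Return 'family', 'subfamily', 'genus', or 'language' for an ID."""
    for prefix in NODE_PREFIXES:
        if row_id.startswith(prefix):
            return prefix.rstrip("-")
    # Anything without a known prefix is assumed to be a language row.
    return "language"


# Sample values copied from the ID column above.
sample_ids = ["abd", "genus-araucanian", "subfamily-palaihnihan", "family-tunica", "mmp"]
levels = {row_id: node_level(row_id) for row_id in sample_ids}
print(levels)
```

If the assumption holds, filtering on node_level(row_id) == "language" before profiling would remove the shared 25.5% null block seen in Macroarea, Latitude, Longitude, Family and Genus.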

Name (text)

80.0% of rows are a single word; 95th-percentile length under 20 chars
rows: 3,573
null: 0 (0.0%)
unique: 3,198
len_min: 2
len_max: 46
len_mean: 8.705
len_median: 7.000
len_p95: 19.000
word_mean: 1.258
word_median: 1.000
n_empty: 0
n_duplicates: 375
duplicate_rate: 0.105
vocab_size: 3,383
readability_flesch_mean: 48.158
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.800
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Abidji
  2. Araucanian
  3. Misumalpan
  4. Palaihnihan
  5. Mampruli
  6. Tunica
  7. Mirniny
  8. Northwest Caucasian
  9. Huavean
  10. Arabic (Gulf)

Macroarea (categorical)

25.5% null
rows: 3,573
null: 911 (25.5%)
unique: 6
top_value: Eurasia
top_rate: 0.248
cardinality: 6
entropy: 2.459
entropy_ratio: 0.951
Top values (rank 1–20)
  1. Eurasia — 659
  2. Africa — 606
  3. Papunesia — 560
  4. North America — 396
  5. South America — 258
  6. Australia — 183
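
The entropy and entropy_ratio figures above can be reproduced from the listed counts. The sketch below assumes entropy is Shannon entropy in bits over the non-null values and that entropy_ratio divides it by log2(cardinality). saturn does not state either definition, but both match the reported 2.459 and 0.951. The function name shannon_entropy_bits is illustrative.

```python
import math

# Counts copied from the Macroarea "Top values" list above (non-null rows only).
macroarea_counts = {
    "Eurasia": 659,
    "Africa": 606,
    "Papunesia": 560,
    "North America": 396,
    "South America": 258,
    "Australia": 183,
}


def shannon_entropy_bits(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


entropy = shannon_entropy_bits(macroarea_counts.values())
# Assumed normalisation: divide by the maximum entropy for this cardinality.
entropy_ratio = entropy / math.log2(len(macroarea_counts))

print(f"entropy={entropy:.3f} entropy_ratio={entropy_ratio:.3f}")
```

A ratio near 1 means the categories are close to evenly used; a ratio near 0 means one category dominates.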

Latitude (numeric)

25.5% null
rows: 3,573
null: 911 (25.5%)
unique: 887
min: -55.000
max: 71.250
mean: 11.880
median: 8.292
std: 22.722
q1: -5.000
q3: 28.000
iqr: 33.000
skew: 0.356
kurtosis: -0.502
n_outliers: 1
outlier_rate: 3.76e-04
zero_rate: 2.25e-03

Longitude (numeric)

25.5% null
rows: 3,573
null: 911 (25.5%)
unique: 1,360
min: -178.167
max: 179.167
mean: 35.172
median: 34.792
std: 89.352
q1: -45.750
q3: 121.000
iqr: 166.750
skew: -0.326
kurtosis: -1.047
n_outliers: 0
outlier_rate: 0.000
zero_rate: 1.50e-03
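
The outlier counts for Latitude and Longitude are consistent with Tukey's 1.5 × IQR fences, although the report does not say which rule saturn applies, so treat the rule as an assumption. The sketch below recomputes the fences from the reported quartiles and compares them with the reported min and max. A 3-standard-deviation rule would flag nothing for Latitude, since the minimum is only about 2.9 standard deviations from the mean.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5):
    """Return (low, high) fences; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and extremes copied from the Latitude and Longitude blocks above.
lat_low, lat_high = tukey_fences(-5.0, 28.0)
lon_low, lon_high = tukey_fences(-45.75, 121.0)

lat_min, lat_max = -55.0, 71.25
lon_min, lon_max = -178.167, 179.167

# Latitude: the minimum falls just below the lower fence, matching n_outliers = 1
# (assuming only that extreme value lies outside the fence).
lat_min_is_outlier = lat_min < lat_low
# Longitude: both extremes are inside the fences, matching n_outliers = 0.
lon_has_outliers = lon_min < lon_low or lon_max > lon_high

# outlier_rate appears to use non-null rows (3,573 - 911 = 2,662) as denominator.
lat_outlier_rate = 1 / (3573 - 911)

print(lat_low, lat_high, lat_min_is_outlier)
print(lon_low, lon_high, lon_has_outliers)
print(f"{lat_outlier_rate:.2e}")
```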

Glottocode (text)

100.0% of rows are a single word; 26.0% null; 95th-percentile length under 20 chars
rows: 3,573
null: 928 (26.0%)
unique: 2,502
len_min: 8
len_max: 8
len_mean: 8.000
len_median: 8.000
len_p95: 8.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 143
duplicate_rate: 0.054
vocab_size: 2,502
readability_flesch_mean: 92.879
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. chad1249
  2. nyor1246
  3. musl1236
  4. taga1270
  5. kuni1267
  6. yukp1241
  7. libe1247
  8. tuva1244
  9. tali1258
  10. yane1238
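
The duplicate figures for the text columns appear to be computed over non-null rows: n_duplicates matches (non-null rows − unique), and duplicate_rate matches n_duplicates divided by non-null rows. These definitions are inferred from the reported numbers, not documented by saturn. A short check against the Glottocode and Name blocks (duplicate_stats is a hypothetical helper):

```python
def duplicate_stats(rows: int, nulls: int, unique: int):
    """Return (n_duplicates, duplicate_rate) under the inferred definitions."""
    non_null = rows - nulls
    # Surplus rows beyond the first occurrence of each distinct value.
    n_duplicates = non_null - unique
    return n_duplicates, n_duplicates / non_null


# Inputs copied from the report.
glottocode = duplicate_stats(rows=3573, nulls=928, unique=2502)
name = duplicate_stats(rows=3573, nulls=0, unique=3198)

print(glottocode[0], round(glottocode[1], 3))
print(name[0], round(name[1], 3))
```

Under this reading, n_duplicates counts surplus rows, not the number of distinct values that repeat, so 143 duplicated Glottocode rows could come from fewer than 143 distinct codes.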

ISO639P3code (text)

100.0% of rows are a single word; 26.8% null; 95th-percentile length under 20 chars
rows: 3,573
null: 959 (26.8%)
unique: 2,442
len_min: 3
len_max: 3
len_mean: 3.000
len_median: 3.000
len_p95: 3.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 172
duplicate_rate: 0.066
vocab_size: 2,442
readability_flesch_mean: 119.528
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. shu
  2. nyo
  3. ttt
  4. tgl
  5. kup
  6. yup
  7. kpk
  8. tvl
  9. tlj
  10. adx

Family (categorical)

25.5% null
rows: 3,573
null: 911 (25.5%)
unique: 254
top_value: Niger-Congo
top_rate: 0.122
cardinality: 254
entropy: 5.631
entropy_ratio: 0.705
Top values (rank 1–20)
  1. Niger-Congo — 324
  2. Austronesian — 324
  3. Indo-European — 176
  4. Sino-Tibetan — 146
  5. Afro-Asiatic — 145
  6. Pama-Nyungan — 121
  7. Trans-New Guinea — 98
  8. other — 72
  9. Altaic — 65
  10. Oto-Manguean — 56
  11. Austro-Asiatic — 48
  12. Eastern Sudanic — 47
  13. Uto-Aztecan — 44
  14. Algic — 31
  15. Mayan — 30
  16. Arawakan — 29
  17. Nakh-Daghestanian — 28
  18. Mande — 28
  19. Uralic — 27
  20. Hokan — 26

Subfamily (categorical)

74.5% null
rows: 3,573
null: 2,662 (74.5%)
unique: 32
top_value: Benue-Congo
top_rate: 0.220
cardinality: 32
entropy: 3.856
entropy_ratio: 0.771
Top values (rank 1–20)
  1. Benue-Congo — 200
  2. Eastern Malayo-Polynesian — 159
  3. Tibeto-Burman — 139
  4. Chadic — 47
  5. Mon-Khmer — 38
  6. Adamawa-Ubangi — 30
  7. Gur — 27
  8. Daghestanian — 25
  9. Cushitic — 24
  10. Finno-Ugric — 21
  11. Kwa — 20
  12. North-Central Atlantic — 20
  13. Nilotic — 19
  14. Mixtecan — 18
  15. Omotic — 15
  16. Kainantu-Goroka — 14
  17. Madang — 13
  18. Awyu-Ok — 10
  19. Surmic — 10
  20. Je — 9

Genus (categorical)

25.5% null
rows: 3,573
null: 911 (25.5%)
unique: 625
top_value: Oceanic
top_rate: 0.056
cardinality: 625
entropy: 7.950
entropy_ratio: 0.856
Top values (rank 1–20)
  1. Oceanic — 149
  2. Bantu — 141
  3. Indic — 50
  4. Western Pama-Nyungan — 49
  5. Semitic — 43
  6. Turkic — 41
  7. Sign Languages — 40
  8. Bodic — 40
  9. Germanic — 39
  10. Northern Pama-Nyungan — 33
  11. Creoles and Pidgins — 32
  12. Mayan — 30
  13. Algonquian — 29
  14. Central Malayo-Polynesian — 29
  15. Iranian — 26
  16. Romance — 24
  17. Biu-Mandara — 24
  18. Southeastern Pama-Nyungan — 23
  19. Dravidian — 23
  20. Malayo-Sumbawan — 22

GenusIcon (categorical)

601 singleton categories; 82.5% null
rows: 3,573
null: 2,948 (82.5%)
unique: 613
top_value: c688033
top_rate: 3.20e-03
cardinality: 613
entropy: 9.249
entropy_ratio: 0.999
Top values (rank 1–20)
  1. c688033 — 2
  2. c803E33 — 2
  3. c804733 — 2
  4. c807D33 — 2
  5. c806233 — 2
  6. c805033 — 2
  7. c7A8033 — 2
  8. c805933 — 2
  9. c807433 — 2
  10. c806B33 — 2
  11. c718033 — 2
  12. c803533 — 2
  13. cCC8C51 — 1
  14. cCC6851 — 1
  15. cCC7E51 — 1
  16. c8FCC51 — 1
  17. cCC8051 — 1
  18. c528033 — 1
  19. cCC9F51 — 1
  20. cCCB551 — 1

ISO_codes (text)

99.0% of rows are a single word; 26.1% null; 95th-percentile length under 20 chars
rows: 3,573
null: 933 (26.1%)
unique: 2,468
len_min: 3
len_max: 7
len_mean: 3.039
len_median: 3.000
len_p95: 3.000
word_mean: 1.010
word_median: 1.000
n_empty: 0
n_duplicates: 172
duplicate_rate: 0.065
vocab_size: 2,486
readability_flesch_mean: 117.413
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.990
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. shu
  2. nyo
  3. dto
  4. sps
  5. kpx
  6. yux
  7. kff
  8. tue
  9. tlj
  10. ame

Samples_100 (categorical)

25.5% null; top value is 96.2% of non-null rows
rows: 3,573
null: 911 (25.5%)
unique: 2
top_value: False
top_rate: 0.962
cardinality: 2
entropy: 0.231
entropy_ratio: 0.231
Top values (rank 1–20)
  1. False — 2,562
  2. True — 100
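
The top_rate for Samples_100 only matches the listed counts if the denominator is the number of non-null rows, not all rows. The check below uses the counts above. That saturn excludes nulls from top_rate is inferred from this arithmetic, not documented in the report.

```python
# Counts copied from the Samples_100 block above.
total_rows = 3573
null_rows = 911
false_count = 2562
true_count = 100

non_null_rows = total_rows - null_rows

# Share of the top value among non-null rows (matches the reported 0.962).
top_rate_non_null = false_count / non_null_rows
# Share of the top value among all rows, nulls included (does not match).
top_rate_all_rows = false_count / total_rows

print(round(top_rate_non_null, 3), round(top_rate_all_rows, 3))
```

So "False" accounts for roughly 72% of all rows once the 911 null rows are included in the denominator.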

Samples_200 (categorical)

25.5% null
rows: 3,573
null: 911 (25.5%)
unique: 2
top_value: False
top_rate: 0.925
cardinality: 2
entropy: 0.385
entropy_ratio: 0.385
Top values (rank 1–20)
  1. False — 2,462
  2. True — 200

Country_ID (categorical)

25.7% null
rows: 3,573
null: 918 (25.7%)
unique: 337
top_value: PG
top_rate: 0.081
cardinality: 337
entropy: 6.314
entropy_ratio: 0.752
Top values (rank 1–20)
  1. PG — 214
  2. AU — 185
  3. US — 177
  4. ID — 177
  5. IN — 120
  6. MX — 120
  7. RU — 89
  8. NG — 66
  9. BR — 66
  10. CN — 54
  11. CD — 49
  12. CM — 46
  13. CA — 45
  14. CO — 39
  15. ET — 36
  16. PH — 36
  17. PE — 35
  18. NP — 32
  19. TZ — 28
  20. VU — 28

Source (text)

45.5% of rows are a single word; 30.1% null
rows: 3,573
null: 1,074 (30.1%)
unique: 2,373
len_min: 7
len_max: 452
len_mean: 42.071
len_median: 25.000
len_p95: 135.000
word_mean: 2.854
word_median: 2.000
n_empty: 0
n_duplicates: 126
duplicate_rate: 0.050
vocab_size: 5,899
readability_flesch_mean: 21.332
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.455
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. Abu-Absi-1995
  2. Grenoble-1992
  3. Goldstein-1991
  4. Ross-2002g
  5. Bacelar-2004
  6. Hanes-1952 de-Vegamian-1978
  7. Laanest-1982 Leskinen-1984 Raun-1964b Rjagoev-1993
  8. Haas-1940 Haas-1953 Nichols-1992 Swanton-1919 Swanton-1921
  9. Ross-2002h
  10. Duff-Tripp-1997 Fast-1953 Wise-1958 Wise-1978 Wise-1986 Wise-1990
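
The sample values above look like space-separated citation keys of the form Author-Year, with an optional letter suffix on the year. That would explain why only 45.5% of rows are a single word. The sketch below splits a cell into keys under that assumption. The pattern is inferred from these ten samples only, and longer cells (len_max is 452) may contain forms it does not cover. KEY_PATTERN and parse_source are hypothetical names.

```python
import re

# Assumed key shape: anything, then "-", then a 4-digit year with an optional
# lowercase suffix (e.g. "Ross-2002g"). Inferred from the samples, not a spec.
KEY_PATTERN = re.compile(r"^(?P<author>.+)-(?P<year>\d{4}[a-z]?)$")


def parse_source(cell: str):
    """Split a Source cell into (author, year) pairs; year is None if unmatched."""
    refs = []
    for key in cell.split():
        match = KEY_PATTERN.match(key)
        if match:
            refs.append((match.group("author"), match.group("year")))
        else:
            # Keep unrecognised tokens as-is so nothing is silently dropped.
            refs.append((key, None))
    return refs


print(parse_source("Hanes-1952 de-Vegamian-1978"))
print(parse_source("Duff-Tripp-1997 Fast-1953"))
print(parse_source("Ross-2002g"))
```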

Parent_ID (categorical)

501 singleton categories
rows: 3,573
null: 254 (7.1%)
unique: 911
top_value: genus-oceanic
top_rate: 0.045
cardinality: 911
entropy: 8.554
entropy_ratio: 0.870
Top values (rank 1–20)
  1. genus-oceanic — 149
  2. genus-bantu — 141
  3. genus-indic — 50
  4. genus-westernpamanyungan — 49
  5. genus-semitic — 43
  6. genus-turkic — 41
  7. genus-signlanguages — 40
  8. genus-bodic — 40
  9. genus-germanic — 39
  10. genus-northernpamanyungan — 33
  11. genus-creolesandpidgins — 32
  12. genus-mayan — 30
  13. family-austronesian — 30
  14. genus-algonquian — 29
  15. genus-centralmalayopolynesian — 29
  16. genus-iranian — 26
  17. family-transnewguinea — 25
  18. genus-romance — 24
  19. genus-biumandara — 24
  20. genus-southeasternpamanyungan — 23