saturn

/home/coolhand/datasets/language-data/wals_values.csv | 76,475 rows | sample n=76,475 | seed 42 | 2026-05-01T18:36:31+00:00

Overview

Source: /home/coolhand/datasets/language-data/wals_values.csv
Total rows: 76,475
Profiled sample: 76,475
Columns: 8
Generated: 2026-05-01T18:36:31+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Errors during insight pass (9)
All nine requests to anthropic:claude-opus-4-7 failed with the same BadRequestError (Error code: 400, type invalid_request_error): 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.' No narrative was generated. Scope and request_id of each failed request:
  • dataset:__global__: req_011CacLGNSsr2Fxd7sWhgmym
  • column:ID: req_011CacLGNxP8UUwqv24E7VJu
  • column:Language_ID: req_011CacLGPUPVqyexG9N1xphg
  • column:Parameter_ID: req_011CacLGPvBWD7Pk5f7ciEoB
  • column:Value: req_011CacLGQSgHvU7YsPksThZb
  • column:Code_ID: req_011CacLGQxSPC4pdSurughww
  • column:Comment: req_011CacLGRTwmbdpia8aGMkD1
  • column:Source: req_011CacLGRzwm446p5ULgkc5E
  • column:Example_ID: req_011CacLGSXwL4xv6XvEvfpJY

Languages detected

Per-string language detection across text columns (sampled).

ID text

100.0% of rows are unique strings | 100.0% of rows are a single word | 95th-percentile length under 20 chars
rows: 76,475
null: 0 (0.0%)
unique: 76,475
len_min: 5
len_max: 8
len_mean: 7.271
len_median: 7.000
len_p95: 8.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 0
duplicate_rate: 0.000
vocab_size: 20,000
readability_flesch_mean: 88.649
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. 18A-abi
  2. 96A-run
  3. 87A-tmn
  4. 144J-taf
  5. 144P-kmz
  6. 95A-yim
  7. 144B-kom
  8. 114A-tuy
  9. 131A-tag
  10. 96A-aml

Language_ID text

100.0% of rows are a single word | 95th-percentile length under 20 chars | 96.5% duplicate strings
rows: 76,475
null: 0 (0.0%)
unique: 2,660
len_min: 2
len_max: 3
len_mean: 2.996
len_median: 3.000
len_p95: 3.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 73,815
duplicate_rate: 0.965
vocab_size: 2,238
readability_flesch_mean: 117.653
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. abi
  2. run
  3. tmn
  4. taf
  5. kmz
  6. yim
  7. kom
  8. tuy
  9. tag
  10. aml
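
The duplicate figures for this column are consistent with counting every non-null row beyond the first occurrence of each distinct value: 76,475 - 2,660 = 73,815, and 73,815 / 76,475 ≈ 0.965. The report does not state saturn's formula, so the sketch below is an assumption that happens to reproduce the reported numbers; `duplicate_stats` is a hypothetical helper name, not part of saturn.

```python
def duplicate_stats(rows: int, nulls: int, unique: int) -> tuple[int, float]:
    """Return (n_duplicates, duplicate_rate) from summary counts.

    Assumption (not confirmed by the report): duplicates are non-null rows
    minus distinct values, and the rate is taken over non-null rows.
    """
    non_null = rows - nulls
    n_duplicates = non_null - unique
    rate = n_duplicates / non_null if non_null else 0.0
    return n_duplicates, rate


# Language_ID figures from the report: rows=76,475, null=0, unique=2,660
n_dup, rate = duplicate_stats(76_475, 0, 2_660)
print(n_dup, round(rate, 3))
```

The same arithmetic also matches the mostly-null Comment column (76,475 rows, 74,102 null, 2,068 unique), which suggests the rate is computed over non-null rows rather than all rows.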

Parameter_ID categorical

rows: 76,475
null: 0 (0.0%)
unique: 192
top_value: 83A
top_rate: 0.020
cardinality: 192
entropy: 7.103
entropy_ratio: 0.937
Top values (rank 1–20)
  1. 83A — 1,518
  2. 82A — 1,496
  3. 81A — 1,376
  4. 87A — 1,367
  5. 143A — 1,325
  6. 143E — 1,325
  7. 143F — 1,325
  8. 143G — 1,325
  9. 97A — 1,316
  10. 86A — 1,249
  11. 88A — 1,225
  12. 144A — 1,190
  13. 85A — 1,184
  14. 112A — 1,157
  15. 89A — 1,154
  16. 95A — 1,142
  17. 69A — 1,131
  18. 33A — 1,066
  19. 51A — 1,031
  20. 26A — 969
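
The entropy figures are consistent with Shannon entropy measured in bits and normalized by its maximum for 192 categories: log2(192) ≈ 7.585, and 7.103 / 7.585 is roughly 0.936 to 0.937 depending on how the entropy input was rounded. The value cannot be in nats, since ln(192) ≈ 5.26 is below 7.103. The sketch below assumes that definition; the function names are illustrative and not taken from saturn.

```python
import math
from collections import Counter


def entropy_bits(values) -> float:
    """Shannon entropy (base 2) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_ratio(entropy: float, cardinality: int) -> float:
    """Entropy divided by its maximum, log2(cardinality).

    Assumption: this is how the report's entropy_ratio is normalized.
    """
    return entropy / math.log2(cardinality) if cardinality > 1 else 0.0


# Figures from the report: entropy=7.103, cardinality=192
print(entropy_ratio(7.103, 192))
```

A ratio near 1 means the 192 parameters are spread fairly evenly across rows, which agrees with the top value (83A) covering only about 2% of them.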

Value numeric

skew=+3.49
rows: 76,475
null: 0 (0.0%)
unique: 28
min: 1.000
max: 28.000
mean: 2.854
median: 2.000
std: 2.824
q1: 1.000
q3: 4.000
iqr: 3.000
skew: 3.493
kurtosis: 16.361
n_outliers: 2,469
outlier_rate: 0.032
zero_rate: 0.000
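
The report does not say which rule produced n_outliers. A common choice is Tukey's 1.5 × IQR fences, and the sketch below shows what that rule would give for the reported quartiles. This is an assumption about saturn's method, not a confirmed description; `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Lower and upper outlier fences: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles from the report: q1=1.000, q3=4.000 (so iqr=3.000)
lower, upper = tukey_fences(1.0, 4.0)
print(lower, upper)  # -3.5 8.5

# outlier_rate is n_outliers over rows, using the reported counts
print(round(2_469 / 76_475, 3))  # 0.032
```

Under this assumed rule, only values of 9 and above would be flagged, since the minimum of 1 lies well above the lower fence. Value is numeric here, but the Code_ID column implies it is a per-parameter category code, so "outlier" likely just means a parameter with many possible codes.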

Code_ID text

100.0% of rows are a single word | 100.0% of rows are all-caps | 95th-percentile length under 20 chars | 98.5% duplicate strings
rows: 76,475
null: 0 (0.0%)
unique: 1,139
len_min: 4
len_max: 7
len_mean: 5.300
len_median: 5.000
len_p95: 6.000
word_mean: 1.000
word_median: 1.000
n_empty: 0
n_duplicates: 75,336
duplicate_rate: 0.985
vocab_size: 911
readability_flesch_mean: 121.220
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 1.000
allcaps_rate: 1.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. 18A-1
  2. 96A-2
  3. 87A-2
  4. 144J-7
  5. 144P-4
  6. 95A-1
  7. 144B-3
  8. 114A-7
  9. 131A-1
  10. 96A-4
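
Read side by side, the first ten samples suggest that ID is Parameter_ID, a hyphen, then Language_ID (18A-abi next to abi), and that Code_ID is Parameter_ID, a hyphen, then Value (18A-1). This is inferred from ten rows only and is not stated by the report. The sketch below is one way to test that hypothesis row by row. The Parameter_ID and Value fields in the example row are inferred from the ID and Code_ID samples, not shown in the report.

```python
def split_id(identifier: str) -> tuple[str, str]:
    """Split 'PREFIX-suffix' at the first hyphen."""
    prefix, _, suffix = identifier.partition("-")
    return prefix, suffix


def row_is_consistent(row: dict) -> bool:
    """Check the assumed composition of ID and Code_ID for one row."""
    param_from_id, lang = split_id(row["ID"])
    param_from_code, value = split_id(row["Code_ID"])
    # Value may arrive as "1", 1, or 1.0 depending on the CSV reader.
    expected_value = str(int(float(row["Value"])))
    return (
        param_from_id == row["Parameter_ID"] == param_from_code
        and lang == row["Language_ID"]
        and value == expected_value
    )


# ID, Language_ID and Code_ID are sample 1 from the report;
# Parameter_ID and Value are inferred, not reported.
row = {"ID": "18A-abi", "Language_ID": "abi", "Parameter_ID": "18A",
       "Value": 1, "Code_ID": "18A-1"}
print(row_is_consistent(row))  # True
```

Running this over every row (for example with `csv.DictReader`) and counting the failures would confirm or refute the pattern for the full file.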

Comment text

8 languages detected in sample | 96.9% null
rows: 76,475
null: 74,102 (96.9%)
unique: 2,068
len_min: 36
len_max: 15,127
len_mean: 372.224
len_median: 196.000
len_p95: 917.000
word_mean: 49.535
word_median: 8.000
n_empty: 0
n_duplicates: 305
duplicate_rate: 0.129
vocab_size: 7,544
readability_flesch_mean: -127.131
emoji_rate: 0.000
url_rate: 1.69e-03
one_word_rate: 0.000
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1.  

    pronominal and adnominal identical:

     

    a[b]nə̀y

    wəy / wəbrə̀y

  2. máŋa~máŋa

    "finger/toe"

  3. ARM: pakán

    "el brazo ... la ala ... la pluma grande de las alas"

    HAND: macán

    "la mano ... la manga de una camisa"

  4. du

    2nd sg. fam.

    - used to address friends, relatives, children, and betweenmembers of the same professional group without previousacquaintance

    ni

    ò, lè, là, ì, ŋ, mà

    ‘3rd pronouns’ showing noun class distinctions

    hóò, lêŋ, lâŋ, etc.

    ‘proximate dem.’

    kóN, léN, lâN,<…

  5. a. 

    Declarative 

    b. 

    Prohibitive 

    maʹa+pukpe

    "finger" [pukpe "child"]

  6. mâ=hǔ́-nǎ

    "finger"

  7. h=

    2nd sg. hon.

    - 2nd plural actor clitic is used as honorific form of

    address if the addressee is older than the speaker;

    - plural is also indicator of respect when adressing/…

  8.  

    pronominal and adnominal differentiated:

     

    mǎn

    mànɔ́gɔ̀

    màn(ɪ̀)cɛ́<…

Source text

83.2% of rows are a single word | 57.2% duplicate strings
rows: 76,475
null: 7,092 (9.3%)
unique: 29,715
len_min: 7
len_max: 165
len_mean: 21.859
len_median: 19.000
len_p95: 44.000
word_mean: 1.239
word_median: 1.000
n_empty: 0
n_duplicates: 39,668
duplicate_rate: 0.572
vocab_size: 13,953
readability_flesch_mean: -9.644
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.832
allcaps_rate: 5.77e-05
boilerplate_rate: 0.000
Sample values (first 10)
  1. Najlis-1966
  2. Bivon-1971[42]
  3. Bergman-et-al-1969
  4. Schachter-and-Otanes-1972[123, 131];Aldridge-2004[200]
  5. Schiffman-1983[113-116];Bright-1958[24];Sridhar-1990[227-228]
  6. Foley-1991[101-103]
  7. Sohn-1994[243]
  8. Barnes-1994[330]
  9. Rastorgueva-1963[passim]
  10. Derbyshire-and-Payne-1990[260]
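
The Source samples appear to pack one or more references into a single cell: a reference key, an optional bracketed page specification, and semicolons between references. The sketch below parses that shape. The format is inferred from the ten samples only, and the code assumes that semicolons never occur inside the brackets. `parse_sources` is a hypothetical helper.

```python
import re
from typing import Optional

# key, then an optional [pages] suffix; assumes no nested brackets
_REF = re.compile(r"^(?P<key>[^\[\]]+?)(?:\[(?P<pages>[^\]]*)\])?$")


def parse_sources(cell: str) -> list[tuple[str, Optional[str]]]:
    """Split a Source cell into (reference_key, pages_or_None) pairs."""
    refs = []
    for part in cell.split(";"):
        part = part.strip()
        if not part:
            continue
        match = _REF.match(part)
        if match:
            refs.append((match.group("key"), match.group("pages")))
        else:
            # Unexpected shape: keep the raw text rather than drop it.
            refs.append((part, None))
    return refs


# Sample 4 from the report
print(parse_sources("Schachter-and-Otanes-1972[123, 131];Aldridge-2004[200]"))
```

Page specifications are kept as raw strings because the samples mix ranges (101-103), lists (123, 131) and words (passim).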

Example_ID text

39.1% of rows are a single word | 97.9% null
rows: 76,475
null: 74,863 (97.9%)
unique: 1,444
len_min: 5
len_max: 575
len_mean: 23.574
len_median: 17.000
len_p95: 53.000
word_mean: 2.818
word_median: 2.000
n_empty: 0
n_duplicates: 168
duplicate_rate: 0.104
vocab_size: 3,810
readability_flesch_mean: 119.703
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.391
allcaps_rate: 0.000
boilerplate_rate: 0.000
Sample values (first 10)
  1. igt-2884
  2. igt-1810 igt-1811 igt-1812 igt-3738 igt-3739
  3. igt-2404
  4. igt-2961
  5. igt-741 igt-742 igt-743 igt-744
  6. igt-229 igt-230
  7. igt-3397 igt-3398
  8. igt-1494
  9. igt-1194
  10. igt-1498 igt-1499 igt-1500 igt-1501 igt-1502
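
The samples suggest that an Example_ID cell holds one or more example identifiers separated by spaces, which would explain why only 39.1% of non-null cells are a single word. A minimal sketch of splitting such a cell, assuming whitespace is the only separator; `example_ids` is a hypothetical helper.

```python
from typing import Optional


def example_ids(cell: Optional[str]) -> list[str]:
    """Split a whitespace-separated Example_ID cell; null or empty gives []."""
    return cell.split() if cell else []


# Sample 6 from the report
print(example_ids("igt-229 igt-230"))  # ['igt-229', 'igt-230']
```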