saturn
/home/coolhand/servers/diachronica/etymology_atlas/parquet/cognate_sets.parquet 4,981 rows sample n=4,981 seed 42 2026-05-01T17:52:34+00:00
Overview
| Source | /home/coolhand/servers/diachronica/etymology_atlas/parquet/cognate_sets.parquet |
| Total rows | 4,981 |
| Profiled sample | 4,981 |
| Columns | 7 |
| Generated | 2026-05-01T17:52:34+00:00 |
Insights opt-in
Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.
Errors during insight pass (8)
All 8 requests to anthropic:claude-opus-4-7 failed with the same BadRequestError (Error code: 400, type 'invalid_request_error'): 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'
| Scope | Request ID |
| dataset:__global__ | req_011CacGuur4r7Zdxk1iJjbGX |
| column:cognate_id | req_011CacGuvMaZVBmMBiD3ZfHn |
| column:concept | req_011CacGuvv4UNvr7cmfyAzCy |
| column:word_count | req_011CacGuweiYcAFWhx4qsxh8 |
| column:language_count | req_011CacGux9Uj4jkn47uauwiC |
| column:words | req_011CacGuxgj6Sf37gZZZjoFu |
| column:source_dataset | req_011CacGuyDUXDiq9D5e9qwqL |
| column:confidence | req_011CacGuyiVGkWMg3F5hcvci |
Numeric correlation
cognate_id text
100.0% of rows are unique strings
100.0% of rows are a single word
95th-percentile length under 20 chars
| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 4,981 |
| len_min | 7 |
| len_max | 10 |
| len_mean | 9.884 |
| len_median | 10.000 |
| len_p95 | 10.000 |
| word_mean | 1.000 |
| word_median | 1.000 |
| n_empty | 0 |
| n_duplicates | 0 |
| duplicate_rate | 0.000 |
| vocab_size | 4,981 |
| readability_flesch_mean | 121.220 |
| emoji_rate | 0.000 |
| url_rate | 0.000 |
| one_word_rate | 1.000 |
| allcaps_rate | 0.000 |
| boilerplate_rate | 0.000 |
Sample values (first 10)
- iecor:12
- iecor:8032
- iecor:9076
- iecor:8599
- iecor:5170
- iecor:9758
- iecor:5291
- iecor:9282
- iecor:8613
- iecor:322
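The sample values suggest an identifier of the form `iecor:<number>`, which would agree with len_min 7 (a 6-character prefix plus one digit) and len_max 10 (four digits). The following is a minimal sketch for splitting such identifiers, assuming that pattern holds for every row; the regular expression and the helper name `split_cognate_id` are illustrative and not part of saturn or the dataset.

```python
import re

# Assumed pattern (not confirmed by the report): a lowercase source prefix,
# a colon, then a run of digits, e.g. "iecor:8032".
COGNATE_ID = re.compile(r"^(?P<source>[a-z]+):(?P<number>\d+)$")


def split_cognate_id(value: str) -> tuple[str, int]:
    """Split an identifier such as 'iecor:12' into ('iecor', 12)."""
    match = COGNATE_ID.match(value)
    if match is None:
        raise ValueError(f"unexpected cognate_id format: {value!r}")
    return match.group("source"), int(match.group("number"))


# Three of the sample values listed in the report.
samples = ["iecor:12", "iecor:8032", "iecor:322"]
parsed = [split_cognate_id(s) for s in samples]
lengths = [len(s) for s in samples]
```

Raising on a mismatch, rather than returning a default, makes any row that breaks the assumed format visible immediately.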
concept categorical
top value is 100.0% of rows
| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 1 |
| top_value | (blank) |
| top_rate | 1.000 |
| cardinality | 1 |
| entropy | -0.000 |
| entropy_ratio | 0.000 |
Top values (rank 1–20)
- (blank) — 4,981
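The entropy of -0.000 reported for this column looks odd but is what floating-point arithmetic yields when a single value covers every row. The sketch below shows one way such a figure could arise, assuming a plain Shannon entropy over value counts. How saturn actually computes entropy and entropy_ratio (including the logarithm base) is not stated in the report, so the function names and the base-2 choice here are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits over the observed value frequencies."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_ratio(values):
    """Entropy divided by its maximum, log2(k), for k distinct values."""
    k = len(set(values))
    if k <= 1:
        return 0.0  # a constant column carries no information
    return shannon_entropy(values) / math.log2(k)


# A constant column like `concept`: one distinct (blank) value in 4,981 rows.
constant_column = [""] * 4981
entropy = shannon_entropy(constant_column)
formatted = f"{entropy:.3f}"  # negating 0.0 gives -0.0, which prints as "-0.000"
```

With one distinct value the only term is 1.0 * log2(1.0) = 0.0, and negating it produces negative zero, so the sign is a formatting artifact and not a real negative entropy.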
word_count numeric
skew=+6.84
13.0% of rows lie beyond 1.5 × IQR
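| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 93 |
| min | 1.000 |
| max | 157.000 |
| mean | 5.168 |
| median | 2.000 |
| std | 12.135 |
| q1 | 1.000 |
| q3 | 4.000 |
| iqr | 3.000 |
| skew | 6.837 |
| kurtosis | 59.740 |
| n_outliers | 649 |
| outlier_rate | 0.130 |
| zero_rate | 0.000 |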
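The outlier figures above can be related to the quartiles with a short calculation. This is a sketch that assumes the "1.5 IQR" rule means Tukey's fences (values below q1 - 1.5 × iqr or above q3 + 1.5 × iqr); the report does not spell out the exact rule, and `tukey_fences` is an illustrative helper, not a saturn function.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for word_count: q1 = 1, q3 = 4, so IQR = 3.
lower, upper = tukey_fences(1.0, 4.0)  # lower = -3.5, upper = 8.5

# The reported rate follows from the reported counts: 649 of 4,981 rows.
outlier_rate = round(649 / 4981, 3)  # 0.13
```

Because the column minimum is 1, nothing can fall below the lower fence of -3.5. Under this assumption every flagged row sits above 8.5, which is consistent with the strong positive skew.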
language_count numeric
skew=+6.84
13.0% of rows lie beyond 1.5 × IQR
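| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 94 |
| min | 1.000 |
| max | 157.000 |
| mean | 5.166 |
| median | 2.000 |
| std | 12.130 |
| q1 | 1.000 |
| q3 | 4.000 |
| iqr | 3.000 |
| skew | 6.838 |
| kurtosis | 59.775 |
| n_outliers | 649 |
| outlier_rate | 0.130 |
| zero_rate | 0.000 |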
words text
99.6% of rows are unique strings
| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 4,963 |
| len_min | 83 |
| len_max | 14,956 |
| len_mean | 498.881 |
| len_median | 184.000 |
| len_p95 | 1,988 |
| word_mean | 44.534 |
| word_median | 16.000 |
| n_empty | 0 |
| n_duplicates | 18 |
| duplicate_rate | 3.61e-03 |
| vocab_size | 12,094 |
| readability_flesch_mean | 48.433 |
| emoji_rate | 0.000 |
| url_rate | 0.000 |
| one_word_rate | 0.000 |
| allcaps_rate | 0.000 |
| boilerplate_rate | 0.000 |
Sample values (first 10)
- [{"form": "s\u016bxtan", "language": "Persian: Tehran", "iso_639_3": "pes", "glottocode": "west2369"}, {"form": "su\u0292yn", "language": "Ossetic: Iron", "iso_639_3": "oss", "glottocode": "iron1242"}, {"form": "so\u0292un", "language": "Ossetic: Digor", "iso_639_3": "oss", "glot…
- [{"form": "r\u00e9st\u00e2", "language": "Franco-Proven\u00e7al", "iso_639_3": "frp", "glottocode": "fran1269"}]
- [{"form": "s\u0101pt", "language": "Sogdian", "iso_639_3": "sog", "glottocode": "sogd1245"}]
- [{"form": "wa\u0113", "language": "Kurdish S.: Elami", "iso_639_3": "sdh", "glottocode": "sout2640"}, {"form": "a-w(a)-\u0101", "language": "Kurdish S.: Qorveh", "iso_639_3": "sdh", "glottocode": "sout2640"}]
- [{"form": "sp\u00edti", "language": "Greek: Modern Std", "iso_639_3": "ell", "glottocode": "mode1248"}, {"form": "sp\u00edtin", "language": "Greek: Cypriot", "iso_639_3": "ell", "glottocode": "cypr1249"}, {"form": "sp\u00edti", "language": "Greek: Italiot", "iso_639_3": "ell", "g…
- [{"form": "\u01f0\u0259tu", "language": "Pashai: North-West", "iso_639_3": "glh", "glottocode": "nort2665"}]
- [{"form": "c\u014dgit\u0101re", "language": "Latin", "iso_639_3": "lat", "glottocode": "lati1261"}, {"form": "cuider", "language": "Anglo-Norman", "iso_639_3": "xno", "glottocode": "angl1258"}, {"form": "cuidier", "language": "Old French", "iso_639_3": "fro", "glottocode": "oldf1…
- [{"form": "peden", "language": "Old Occitan", "iso_639_3": "pro", "glottocode": "oldp1253"}]
- [{"form": "t\u0113\u03b3", "language": "Khwarazmian", "iso_639_3": "xco", "glottocode": "khwa1238"}]
- [{"form": "denken", "language": "Dutch", "iso_639_3": "nld", "glottocode": "dutc1256"}, {"form": "think", "language": "English", "iso_639_3": "eng", "glottocode": "stan1293"}, {"form": "denken", "language": "Flemish", "iso_639_3": "vls", "glottocode": "vlaa1240"}, {"form": "tinke…
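Although profiled as text, the sample values are JSON arrays of objects with the keys `form`, `language`, `iso_639_3` and `glottocode`. The sketch below decodes one complete (untruncated) sample value from the list above and derives per-row counts from it. That `word_count` counts the entries and `language_count` counts the distinct `language` labels is an assumption suggested by the near-identical statistics of those two columns, not something the report states.

```python
import json

# One complete sample value from the report (raw string keeps the \u escapes
# as JSON escapes, to be decoded by json.loads).
raw = (
    r'[{"form": "wa\u0113", "language": "Kurdish S.: Elami", '
    r'"iso_639_3": "sdh", "glottocode": "sout2640"}, '
    r'{"form": "a-w(a)-\u0101", "language": "Kurdish S.: Qorveh", '
    r'"iso_639_3": "sdh", "glottocode": "sout2640"}]'
)

entries = json.loads(raw)

forms = [entry["form"] for entry in entries]
languages = {entry["language"] for entry in entries}
iso_codes = {entry["iso_639_3"] for entry in entries}

# Assumed derivations (hypothetical, see the note above):
word_count = len(entries)         # number of attested forms in the set
language_count = len(languages)   # number of distinct language labels
```

In this row the two language labels share one ISO 639-3 code, so counting by `iso_639_3` instead of `language` would give a different figure. It is worth checking which key the upstream pipeline uses before relying on `language_count`.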
source_dataset categorical
top value is 100.0% of rows
| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 1 |
| top_value | iecor |
| top_rate | 1.000 |
| cardinality | 1 |
| entropy | -0.000 |
| entropy_ratio | 0.000 |
Top values (rank 1–20)
- iecor — 4,981
confidence numeric
only one distinct value
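| rows | 4,981 |
| null | 0 (0.0%) |
| unique | 1 |
| min | 1.000 |
| max | 1.000 |
| mean | 1.000 |
| median | 1.000 |
| std | 0.000 |
| q1 | 1.000 |
| q3 | 1.000 |
| iqr | 0.000 |
| skew | 0.000 |
| kurtosis | 0.000 |
| n_outliers | 0 |
| outlier_rate | 0.000 |
| zero_rate | 0.000 |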