witnessed meteorite falls witnessed meteorite falls

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/datasets/witnessed-meteorite-falls/witnessed_meteorite_falls.json

Saturn profiled 1,097 rows across 10 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:

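# Notebook cell: the leading "!" is IPython shell syntax, not plain Python.
!pip install saturn-dissect
import subprocess

# Profile the dataset, write the findings JSON, and request LLM commentary.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/witnessed-meteorite-falls/witnessed_meteorite_falls.json",
    "--findings", "witnessed-meteorite-falls-witnessed_meteorite_falls.json",
    "--llm", "anthropic:claude-opus-4-7",
], check=True)  # raise CalledProcessError if saturn exits with a non-zero status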

Summary confidence: high

This dataset catalogs 1,097 witnessed meteorite falls, with each row identified by a unique name and described by date, geographic coordinates, meteorite class, and a short description. Two columns (category and fall_type) are constants ('witnessed_meteorite_falls' and 'Fell') and offer no analytical value. The most informative dimensions are meteorite_class — heavily dominated by L6 (260 falls, ~24%) followed by H5 (163) and H6 (91) — and the latitude/longitude pair, where latitude skews north (median 36.1) with about 8% outliers and longitude spans the full globe. The date column covers 231 distinct years with 1933 as the most frequent (17 falls), suggesting room for a time-trend exploration.

citing: row_count · column_count · columns.category.stats.top_value · columns.fall_type.stats.top_value · columns.meteorite_class.top_values · columns.meteorite_class.stats.cardinality · columns.latitude.stats · columns.longitude.stats · columns.date.top_values · columns.date.stats.cardinality

Out[4]:

saturn.schema() · 10 columns

column	kind	n	null%	unique	alerts
latitude	numeric	1,097	0.0%	958	outliers
longitude	numeric	1,097	0.0%	1,030
name	text	1,097	0.0%	1,097	near_unique one_word short_text
description	text	1,097	0.0%	1,097	near_unique
category	categorical	1,097	0.0%	1	imbalance
date	categorical	1,097	1.7%	231
country	unknown	1,097	0.0%	—	skipped
mass_g	unknown	1,097	0.0%	—	skipped
meteorite_class	categorical	1,097	0.0%	125
fall_type	categorical	1,097	0.0%	1	imbalance

Fig 1.

meteorite_class · L6, H5, and H6 dominate — check how concentrated the class distribution really is across 125 categories.

Top values for meteorite_class (20 unique shown, of 125 total).
value	count	share
L6	260	23.7%
H5	163	14.9%
H6	91	8.3%
L5	76	6.9%
H4	50	4.6%
LL6	41	3.7%
Stone-uncl	39	3.6%
OC	24	2.2%
LL5	19	1.7%
Eucrite-mmict	18	1.6%
L4	18	1.6%
Howardite	16	1.5%
CM2	15	1.4%
H	13	1.2%
L	10	0.9%
Iron, IIIAB	10	0.9%
Aubrite	9	0.8%
Diogenite	8	0.7%
EL6	8	0.7%
CV3	7	0.6%

Fig 2.

latitude · Latitude skews toward the northern hemisphere (median 36.1) with ~8% outliers worth inspecting.

Histogram bins for latitude (median: 36.1).
bin	count
-44.12 – -40.77	2
-40.77 – -37.42	1
-37.42 – -34.07	3
-34.07 – -30.73	31
-30.73 – -27.38	14
-27.38 – -24.03	14
-24.03 – -20.68	9
-20.68 – -17.34	11
-17.34 – -13.99	6
-13.99 – -10.64	4
-10.64 – -7.295	11
-7.295 – -3.948	18
-3.948 – -0.6002	9
-0.6002 – 2.747	10
2.747 – 6.095	6
6.095 – 9.442	13
9.442 – 12.79	34
12.79 – 16.14	32
16.14 – 19.48	20
19.48 – 22.83	34
22.83 – 26.18	57
26.18 – 29.53	59
29.53 – 32.87	61
32.87 – 36.22	96
36.22 – 39.57	79
39.57 – 42.92	78
42.92 – 46.26	119
46.26 – 49.61	81
49.61 – 52.96	92
52.96 – 56.31	53
56.31 – 59.65	22
59.65 – 63	14
63 – 66.35	4

Fig 3.

longitude · Longitude spans the globe with a wide IQR of 80.5 — useful for spotting regional clustering of fall sites.

Histogram bins for longitude (median: 18.71667).
bin	count
-157.9 – -147.8	2
-147.8 – -137.7	0
-137.7 – -127.7	1
-127.7 – -117.6	7
-117.6 – -107.5	11
-107.5 – -97.45	44
-97.45 – -87.39	49
-87.39 – -77.32	51
-77.32 – -67.25	26
-67.25 – -57.18	21
-57.18 – -47.11	14
-47.11 – -37.04	7
-37.04 – -26.97	2
-26.97 – -16.91	0
-16.91 – -6.836	23
-6.836 – 3.232	102
3.232 – 13.3	135
13.3 – 23.37	88
23.37 – 33.44	91
33.44 – 43.51	59
43.51 – 53.58	25
53.58 – 63.64	11
63.64 – 73.71	29
73.71 – 83.78	104
83.78 – 93.85	31
93.85 – 103.9	13
103.9 – 114	40
114 – 124.1	38
124.1 – 134.1	28
134.1 – 144.2	33
144.2 – 154.3	9
154.3 – 164.3	1
164.3 – 174.4	2

Fig 4.

date · Top years like 1933 and 1949 hint at temporal patterns; look for eras with elevated witnessed-fall counts.

Top values for date (20 unique shown, of 231 total).
value	count	share
1933-01-01	17	1.5%
1949-01-01	13	1.2%
1950-01-01	12	1.1%
1976-01-01	11	1.0%
1930-01-01	11	1.0%
1938-01-01	11	1.0%
1910-01-01	11	1.0%
1868-01-01	11	1.0%
1977-01-01	10	0.9%
1939-01-01	10	0.9%
1984-01-01	10	0.9%
1934-01-01	10	0.9%
1916-01-01	10	0.9%
1924-01-01	10	0.9%
1917-01-01	10	0.9%
2008-01-01	9	0.8%
2003-01-01	9	0.8%
1998-01-01	9	0.8%
1890-01-01	9	0.8%
1986-01-01	9	0.8%

Fig 5.

description · Descriptions are tightly templated (46–72 chars); length variation reflects which fields are populated.

Character-length distribution for description (mean: 54.30811303555151).
chars	count
46 – 47	1
47 – 47	5
47 – 48	0
48 – 49	29
49 – 49	79
49 – 50	0
50 – 51	118
51 – 51	137
51 – 52	0
52 – 52	129
52 – 53	110
53 – 54	0
54 – 54	76
54 – 55	68
55 – 56	0
56 – 56	58
56 – 57	54
57 – 58	0
58 – 58	34
58 – 59	0
59 – 60	40
60 – 60	22
60 – 61	0
61 – 62	26
62 – 62	21
62 – 63	0
63 – 64	20
64 – 64	20
64 – 65	0
65 – 66	14
66 – 66	9
66 – 67	0
67 – 67	11
67 – 68	4
68 – 69	0
69 – 69	3
69 – 70	5
70 – 71	0
71 – 71	1
71 – 72	3

Fig 6.

Per-column null rate across the corpus. Columns are ordered by input position.

Per-column null rate across the corpus.
column	kind	null %
latitude	numeric	0.0%
longitude	numeric	0.0%
name	text	0.0%
description	text	0.0%
category	categorical	0.0%
date	categorical	1.7%
country	unknown	0.0%
mass_g	unknown	0.0%
meteorite_class	categorical	0.0%
fall_type	categorical	0.0%

Fig 7.

Pearson correlation across numeric columns (sampled, bounded).

Pearson correlation across 2 numeric columns (values clipped to 2 decimals).
	latitude	longitude
latitude	+1.00	-0.09
longitude	-0.09	+1.00
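
For reference, the coefficient in the table above is the standard Pearson r. The sketch below computes it in pure Python on made-up coordinate pairs; the sample values and the `pearson_r` helper are illustrative assumptions, not Saturn's implementation or rows from the dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative (latitude, longitude) pairs only.
lats = [10.0, 20.0, 30.0, 40.0]
lons = [5.0, -3.0, 8.0, 1.0]
r = pearson_r(lats, lons)
print(f"{r:+.2f}")
```

A value near zero, like the -0.09 reported here, says only that the two coordinates have no linear relationship; it does not rule out regional clustering.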

latitude numeric feature

Geographic latitude in degrees, spanning -44.12 to 66.35 and covering most of the inhabited globe. The distribution is left-skewed (skew -1.28): a long southern tail pulls the mean (30.04°) below the median (36.1°), while the bulk of the falls sit in the Northern Hemisphere. Roughly 8.2% of values (90 rows) are flagged as outliers. Because the maximum (66.35°) lies inside the upper 1.5 × IQR fence (about 82.4°), all of them are far-southern points, below roughly -14.4° rather than merely below the Q1 of 21.87°.

Treatment: Pair with longitude for geospatial features; keep outliers as legitimate Southern Hemisphere observations rather than trimming.

anthropic:claude-opus-4-7 · confidence high

Out[13]:

saturn.columns["latitude"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	958
min	-44.12
max	66.35
mean	30.04
median	36.1
std	23.13
q1	21.87
q3	46.07
iqr	24.2
skew	-1.276
kurtosis	1.01
n_outliers	90
outlier_rate	0.08204
zero_rate	0.001823
alert: outliers	8.2% rows beyond 1.5 IQR
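
The outlier alert refers to the 1.5 × IQR rule. Its fences can be recomputed from the q1 and q3 values in the table above. This is a sketch of the standard Tukey rule, which is assumed (not confirmed) to match Saturn's exact implementation; `iqr_fences` is a hypothetical helper.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) Tukey fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles copied from the latitude stats table.
lower, upper = iqr_fences(21.87, 46.07)
print(round(lower, 2), round(upper, 2))

# The reported maximum (66.35) is inside the upper fence, so under this rule
# only values below the lower fence can be flagged.
print(66.35 < upper)
```

Under these assumptions the flagged rows are exactly the latitudes south of the lower fence, which is why trimming them would remove genuine Southern Hemisphere falls.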

Fig 8.

Distribution of latitude. Vertical dash marks the median.

Histogram bins for latitude (median: 36.1).
bin	count
-44.12 – -40.77	2
-40.77 – -37.42	1
-37.42 – -34.07	3
-34.07 – -30.73	31
-30.73 – -27.38	14
-27.38 – -24.03	14
-24.03 – -20.68	9
-20.68 – -17.34	11
-17.34 – -13.99	6
-13.99 – -10.64	4
-10.64 – -7.295	11
-7.295 – -3.948	18
-3.948 – -0.6002	9
-0.6002 – 2.747	10
2.747 – 6.095	6
6.095 – 9.442	13
9.442 – 12.79	34
12.79 – 16.14	32
16.14 – 19.48	20
19.48 – 22.83	34
22.83 – 26.18	57
26.18 – 29.53	59
29.53 – 32.87	61
32.87 – 36.22	96
36.22 – 39.57	79
39.57 – 42.92	78
42.92 – 46.26	119
46.26 – 49.61	81
49.61 – 52.96	92
52.96 – 56.31	53
56.31 – 59.65	22
59.65 – 63	14
63 – 66.35	4

longitude numeric feature

Geographic longitude in decimal degrees, spanning -157.87 to 174.4, close to the full -180/180 range. The distribution is broad (std 68.87, IQR 80.5), only mildly left-skewed (-0.23), and flat-tailed (kurtosis -0.62), indicating worldwide coverage rather than a single region. With 1,030 unique values across 1,097 rows, most entries are distinct point locations with little repetition; there are no nulls and only 3 outliers.

Treatment: Pair with latitude as a geospatial coordinate; avoid treating as a standalone scalar feature.

anthropic:claude-opus-4-7 · confidence high
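
One common way to pair the two coordinates is to map each (latitude, longitude) to a point on the unit sphere, which removes the artificial jump at ±180° longitude. The sketch below illustrates that encoding as one option; the report does not prescribe it, and `to_unit_vector` is a hypothetical helper.

```python
import math

def to_unit_vector(lat_deg, lon_deg):
    """Map geographic degrees to (x, y, z) on the unit sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )

# Longitudes -180 and 180 are the same meridian and map to the same point.
west = to_unit_vector(36.1, -180.0)
east = to_unit_vector(36.1, 180.0)
print(west, east)
```

Distances between these vectors behave sensibly across the date line, which a raw longitude scalar does not.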

Out[16]:

saturn.columns["longitude"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	1,030
min	-157.9
max	174.4
mean	20.13
median	18.72
std	68.87
q1	-4.233
q3	76.27
iqr	80.5
skew	-0.2257
kurtosis	-0.6185
n_outliers	3
outlier_rate	0.002735
zero_rate	0.0009116

Fig 9.

Distribution of longitude. Vertical dash marks the median.

Histogram bins for longitude (median: 18.71667).
bin	count
-157.9 – -147.8	2
-147.8 – -137.7	0
-137.7 – -127.7	1
-127.7 – -117.6	7
-117.6 – -107.5	11
-107.5 – -97.45	44
-97.45 – -87.39	49
-87.39 – -77.32	51
-77.32 – -67.25	26
-67.25 – -57.18	21
-57.18 – -47.11	14
-47.11 – -37.04	7
-37.04 – -26.97	2
-26.97 – -16.91	0
-16.91 – -6.836	23
-6.836 – 3.232	102
3.232 – 13.3	135
13.3 – 23.37	88
23.37 – 33.44	91
33.44 – 43.51	59
43.51 – 53.58	25
53.58 – 63.64	11
63.64 – 73.71	29
73.71 – 83.78	104
83.78 – 93.85	31
93.85 – 103.9	13
103.9 – 114	40
114 – 124.1	38
124.1 – 134.1	28
134.1 – 144.2	33
144.2 – 154.3	9
154.3 – 164.3	1
164.3 – 174.4	2

name text identifier

The `name` column holds 1,097 fully unique short strings (n_unique equals n, duplicate_rate 0.0) averaging 8.56 characters and 1.21 words; 82.95% are single-word entries. Top tokens such as `st.`, `county`, `san`, `santa`, and `creek`, plus the Spanish function words `de`, `la`, and `el`, strongly suggest place names (likely US and Latin-influenced toponyms) rather than person names. Every row is distinct, so the column functions as an identifier-like label rather than a learnable feature.

Treatment: Treat as a unique label/key; drop from modelling features or use only for joins and display.

anthropic:claude-opus-4-7 · confidence high

Out[19]:

saturn.columns["name"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	1,097
len_min	2
len_max	28
len_mean	8.557
len_median	8
len_p95	15
word_mean	1.209
word_median	1
n_empty	0
n_duplicates	0
duplicate_rate	0
vocab_size	1,238
readability_flesch_mean	40.67
emoji_rate	0
url_rate	0
one_word_rate	0.8295
allcaps_rate	0
boilerplate_rate	0
alert: near_unique	100.0% of rows are unique strings
alert: one_word	83.0% rows are a single word
alert: short_text	95th-percentile length under 20 chars

Fig 10.

Character-length distribution for name.

Character-length distribution for name (mean: 8.55697356426618).
chars	count
2 – 3	1
3 – 3	7
3 – 4	0
4 – 5	40
5 – 5	104
5 – 6	0
6 – 7	174
7 – 7	170
7 – 8	0
8 – 8	144
8 – 9	136
9 – 10	0
10 – 10	78
10 – 11	62
11 – 12	0
12 – 12	51
12 – 13	44
13 – 14	0
14 – 14	20
14 – 15	0
15 – 16	17
16 – 16	11
16 – 17	0
17 – 18	15
18 – 18	4
18 – 19	0
19 – 20	9
20 – 20	3
20 – 21	0
21 – 22	2
22 – 22	3
22 – 23	0
23 – 23	1
23 – 24	0
24 – 25	0
25 – 25	0
25 – 26	0
26 – 27	0
27 – 27	0
27 – 28	1

description text free_text

Short, templated descriptions of meteorite records: every one of the 1,097 rows contains the tokens 'meteorite', 'mass:', 'found:', and 'fell.', which points to a generated sentence rather than free prose. Lengths are tight (46 to 72 chars, mean 54.3, about 8 words) and each row is unique (n_unique=1097, duplicate_rate=0), so the field carries the same signal as the underlying structured columns. Class codes such as 'l6.' (260), 'h5.' (163), 'h6.' (91), and 'l5.' (76) leak the meteorite classification into the text.

Treatment: Drop or parse into structured fields (mass, found, class) rather than embedding — it is a template over existing columns.

anthropic:claude-opus-4-7 · confidence high
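
If the description is parsed rather than dropped, a regular expression over the template is usually enough. The exact template is not shown in this report, so the sample string and the pattern below are hypothetical, built only from the tokens reported above (`meteorite`, `mass:`, `found:`, `fell.`); both need adjusting to the real text.

```python
import re

# Hypothetical template; the real wording must be checked against the data.
PATTERN = re.compile(
    r"^(?P<meteorite_class>.+?) meteorite\. "
    r"mass: (?P<mass>\S+)\. "
    r"found: (?P<found>.+?)\. "
    r"fell\.$",
    re.IGNORECASE,
)

def parse_description(text):
    """Return the captured fields as a dict, or None if the template does not match."""
    match = PATTERN.match(text)
    return match.groupdict() if match else None

sample = "L6 meteorite. Mass: 1200g. Found: 1933. Fell."  # made-up example
print(parse_description(sample))
```

Rows that return None are worth counting: a non-zero miss rate means the assumed template is wrong or incomplete.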

Out[22]:

saturn.columns["description"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	1,097
len_min	46
len_max	72
len_mean	54.31
len_median	53
len_p95	64
word_mean	8.254
word_median	8
n_empty	0
n_duplicates	0
duplicate_rate	0
vocab_size	1,372
readability_flesch_mean	52.62
emoji_rate	0
url_rate	0
one_word_rate	0
allcaps_rate	0
boilerplate_rate	0
alert: near_unique	100.0% of rows are unique strings

Fig 11.

Character-length distribution for description.

Character-length distribution for description (mean: 54.30811303555151).
chars	count
46 – 47	1
47 – 47	5
47 – 48	0
48 – 49	29
49 – 49	79
49 – 50	0
50 – 51	118
51 – 51	137
51 – 52	0
52 – 52	129
52 – 53	110
53 – 54	0
54 – 54	76
54 – 55	68
55 – 56	0
56 – 56	58
56 – 57	54
57 – 58	0
58 – 58	34
58 – 59	0
59 – 60	40
60 – 60	22
60 – 61	0
61 – 62	26
62 – 62	21
62 – 63	0
63 – 64	20
64 – 64	20
64 – 65	0
65 – 66	14
66 – 66	9
66 – 67	0
67 – 67	11
67 – 68	4
68 – 69	0
69 – 69	3
69 – 70	5
70 – 71	0
71 – 71	1
71 – 72	3

category categorical metadata

This column is a single-valued categorical tag, with all 1097 rows labeled "witnessed_meteorite_falls". Cardinality is 1 and entropy is 0, so it carries no information for modelling and merely records the dataset's provenance or scope.

Treatment: Drop before modelling; retain only as a dataset-level annotation.

anthropic:claude-opus-4-7 · confidence high
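
Constant columns such as this one can be detected generically by counting distinct values per column. A minimal sketch over a list of dict records; the toy rows and the `constant_columns` helper are illustrative assumptions, and values are assumed to be hashable.

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value across all rows."""
    distinct = {}
    for row in rows:
        for column, value in row.items():
            distinct.setdefault(column, set()).add(value)
    return [column for column, values in distinct.items() if len(values) == 1]

# Toy records; only the two constant values are taken from the report.
rows = [
    {"name": "A", "category": "witnessed_meteorite_falls", "fall_type": "Fell"},
    {"name": "B", "category": "witnessed_meteorite_falls", "fall_type": "Fell"},
]
to_drop = constant_columns(rows)
cleaned = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
print(to_drop, cleaned)
```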

Out[25]:

saturn.columns["category"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	1
top_value	witnessed_meteorite_falls
top_rate	1
cardinality	1
entropy	0
entropy_ratio	0
alert: imbalance	top value is 100.0% of rows

Fig 12.

Top values for category.

Top values for category (1 unique shown, of 1 total).
value	count	share
witnessed_meteorite_falls	1097	100.0%

date categorical timestamp

This column holds dates stored as strings, all snapped to January 1st, which indicates year-only granularity stored as full dates. Across 1,097 rows there are 231 distinct values with a very high entropy ratio (0.967), and no single year exceeds 1.6% of the non-null rows, so falls are spread broadly across years. The top-20 list alone runs from 1868 to 2008; the full range is not reported in these stats. The null rate is low at 1.73% (19 rows).

Treatment: Parse to datetime and extract year as the working feature, since month/day are constant.

anthropic:claude-opus-4-7 · confidence high
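
A minimal sketch of the year extraction using only the standard library, assuming every non-null value is an ISO `YYYY-01-01` string as in the top-values table (`extract_year` is a hypothetical helper). If a dataframe library is used instead, note that pandas' default nanosecond timestamps cannot represent dates before 1677-09-21, so reading the year directly is the safer route should the catalogue contain older falls.

```python
from datetime import date

def extract_year(value):
    """Return the year of an ISO date string, or None for missing values."""
    if value is None or value == "":
        return None
    return date.fromisoformat(value).year

raw = ["1933-01-01", None, "1868-01-01", "2008-01-01"]
years = [extract_year(v) for v in raw]
print(years)  # [1933, None, 1868, 2008]
```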

Out[28]:

saturn.columns["date"].stats

stat	value
n	1,097
nulls	19 (1.7%)
unique	231
top_value	1933-01-01
top_rate	0.01577
cardinality	231
entropy	7.593
entropy_ratio	0.967
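
The `top_rate` of 0.01577 and the 1.5% share listed for 1933 in the top-values table differ because they appear to use different denominators. The arithmetic below reproduces both figures from counts in this report; that Saturn computes `top_rate` over non-null rows is an inference from the numbers, not documented behaviour.

```python
n_rows = 1097      # total rows
n_nulls = 19       # null dates
top_count = 17     # rows dated 1933-01-01

rate_non_null = top_count / (n_rows - n_nulls)  # matches the reported top_rate
share_all = top_count / n_rows                  # matches the table's share column

print(round(rate_non_null, 5), f"{share_all:.1%}")
```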

Fig 13.

Top values for date.

Top values for date (20 unique shown, of 231 total).
value	count	share
1933-01-01	17	1.5%
1949-01-01	13	1.2%
1950-01-01	12	1.1%
1976-01-01	11	1.0%
1930-01-01	11	1.0%
1938-01-01	11	1.0%
1910-01-01	11	1.0%
1868-01-01	11	1.0%
1977-01-01	10	0.9%
1939-01-01	10	0.9%
1984-01-01	10	0.9%
1934-01-01	10	0.9%
1916-01-01	10	0.9%
1924-01-01	10	0.9%
1917-01-01	10	0.9%
2008-01-01	9	0.8%
2003-01-01	9	0.8%
1998-01-01	9	0.8%
1890-01-01	9	0.8%
1986-01-01	9	0.8%

country unknown metadata

This column is labeled "country" and contains 1097 non-null values, but saturn skipped detailed profiling so neither the cardinality nor value distribution is available. Without unique counts or sample values, I cannot confirm whether it holds country names, ISO codes, or something else. The only firm signals are full population (null_rate 0.0) and the skipped alert.

Treatment: Re-profile with categorical stats enabled, then standardize to ISO codes before use.

anthropic:claude-opus-4-7 · confidence low

Out[31]:

saturn.columns["country"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	—
alert: skipped	no profiler for kind=unknown

mass_g unknown other

Column `mass_g` was skipped by the profiler, so its kind is unknown and no descriptive statistics are available. The only confirmed signals are 1097 rows with a 0.0 null rate; uniqueness, distribution, and type are all missing. The name suggests a numeric mass measurement in grams, but this cannot be verified from the evidence.

Treatment: Re-run profiling on this column to recover type and distribution before any downstream use.

anthropic:claude-opus-4-7 · confidence low
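
Before re-profiling, a quick check of how many values parse as numbers can show whether `mass_g` is numeric data stored in a form the profiler did not recognise. A minimal sketch; the sample values are invented and `numeric_parse_rate` is a hypothetical helper.

```python
def numeric_parse_rate(values):
    """Fraction of non-null values that float() can parse."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return 0.0
    parsed = 0
    for value in non_null:
        try:
            float(value)
            parsed += 1
        except (TypeError, ValueError):
            pass  # not a number, e.g. free text or a nested structure
    return parsed / len(non_null)

# Invented values mixing numbers, numeric strings, and a non-numeric entry.
sample = [1200, "350.5", "unknown", None, 7.0]
print(numeric_parse_rate(sample))  # 0.75
```

A rate close to 1.0 would support casting the column to a numeric type; a low rate suggests inspecting the raw values first.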

Out[33]:

saturn.columns["mass_g"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	—
alert: skipped	no profiler for kind=unknown

meteorite_class categorical label

This column captures the petrologic classification of meteorites, with 125 distinct classes across 1,097 records and no nulls. The distribution is dominated by ordinary chondrite types: L6 alone covers 23.7% of rows, and together with H5 (163) and H6 (91) the top three classes account for 514 rows (46.9%). The 105 classes outside the top 20 share the remaining 202 rows, and this long tail keeps the entropy ratio at 0.67. Analysts should note the heavy concentration in a handful of chondrite groups alongside niche entries like Eucrite-mmict (18).

Treatment: Group rare classes into an 'other' bucket before encoding for modelling.

anthropic:claude-opus-4-7 · confidence high
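
A minimal sketch of the rare-class grouping suggested above, using three counts from the top-values table. The threshold of 10 rows and the helper name `group_rare` are assumptions to tune, not recommendations from the report.

```python
from collections import Counter

def group_rare(labels, min_count=10, other="other"):
    """Replace labels seen fewer than min_count times with a shared bucket."""
    counts = Counter(labels)
    return [label if counts[label] >= min_count else other for label in labels]

# Counts for L6, H5, and CV3 come from the report; the list itself is synthetic.
labels = ["L6"] * 260 + ["H5"] * 163 + ["CV3"] * 7
grouped = Counter(group_rare(labels))
print(grouped)
```

The threshold is best chosen on training data only, so that the mapping applied to held-out rows does not depend on their class counts.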

Out[35]:

saturn.columns["meteorite_class"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	125
top_value	L6
top_rate	0.237
cardinality	125
entropy	4.639
entropy_ratio	0.666
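
The entropy figures can be sanity-checked by hand. Assuming `entropy` is Shannon entropy in bits and `entropy_ratio` is that value divided by log2 of the cardinality (an inference that matches the reported numbers, not documented behaviour), the ratio follows directly:

```python
import math
from collections import Counter

def shannon_entropy_bits(labels):
    """Shannon entropy (base 2) of the empirical label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Reported values for meteorite_class: entropy 4.639 over 125 classes.
ratio = 4.639 / math.log2(125)
print(round(ratio, 3))  # 0.666

# A uniform distribution over 4 labels reaches the maximum, log2(4) = 2 bits.
print(shannon_entropy_bits(["a", "b", "c", "d"]))  # 2.0
```

A ratio of 1.0 would mean all 125 classes are equally common; 0.666 reflects the concentration in a few chondrite classes.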

Fig 14.

Top values for meteorite_class.

Top values for meteorite_class (20 unique shown, of 125 total).
value	count	share
L6	260	23.7%
H5	163	14.9%
H6	91	8.3%
L5	76	6.9%
H4	50	4.6%
LL6	41	3.7%
Stone-uncl	39	3.6%
OC	24	2.2%
LL5	19	1.7%
Eucrite-mmict	18	1.6%
L4	18	1.6%
Howardite	16	1.5%
CM2	15	1.4%
H	13	1.2%
L	10	0.9%
Iron, IIIAB	10	0.9%
Aubrite	9	0.8%
Diogenite	8	0.7%
EL6	8	0.7%
CV3	7	0.6%

fall_type categorical metadata

This column records the type of fall event but contains the single value "Fell" across all 1097 rows, with zero nulls. Entropy is 0.0 and top_rate is 1.0, so it carries no information for any downstream model.

Treatment: Drop; constant column with a single value.

anthropic:claude-opus-4-7 · confidence high

Out[38]:

saturn.columns["fall_type"].stats

stat	value
n	1,097
nulls	0 (0.0%)
unique	1
top_value	Fell
top_rate	1
cardinality	1
entropy	0
entropy_ratio	0
alert: imbalance	top value is 100.0% of rows

Fig 15.

Top values for fall_type.

Top values for fall_type (1 unique shown, of 1 total).
value	count	share
Fell	1097	100.0%

How to cite

BibTeX

@misc{saturn-witnessed-meteorite-falls-witnessed-meteorite-falls-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: witnessed meteorite falls witnessed meteorite falls},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/witnessed-meteorite-falls-witnessed_meteorite_falls}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}

APA

Steuber, L. (2026). Saturn reading: witnessed meteorite falls witnessed meteorite falls. Source: /home/coolhand/datasets/witnessed-meteorite-falls/witnessed_meteorite_falls.json. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/witnessed-meteorite-falls-witnessed_meteorite_falls