witnessed meteorite falls
Reading
This dataset catalogs 1,097 witnessed meteorite falls, with each row identified by a unique name and described by date, geographic coordinates, meteorite class, and a short description. Two columns (category and fall_type) are constants ('witnessed_meteorite_falls' and 'Fell') and offer no analytical value. The most informative dimensions are meteorite_class — heavily dominated by L6 (260 falls, ~24%) followed by H5 (163) and H6 (91) — and the latitude/longitude pair, where latitude skews north (median 36.1) with about 8% outliers and longitude spans the full globe. The date column covers 231 distinct years with 1933 as the most frequent (17 falls), suggesting room for a time-trend exploration.
citing: row_count · column_count · columns.category.stats.top_value · columns.fall_type.stats.top_value · columns.meteorite_class.top_values · columns.meteorite_class.stats.cardinality · columns.latitude.stats · columns.longitude.stats · columns.date.top_values · columns.date.stats.cardinality
Charts the summary said to look at first
meteorite_class: top 20 values
| value | count | share |
|---|---|---|
| L6 | 260 | 23.7% |
| H5 | 163 | 14.9% |
| H6 | 91 | 8.3% |
| L5 | 76 | 6.9% |
| H4 | 50 | 4.6% |
| LL6 | 41 | 3.7% |
| Stone-uncl | 39 | 3.6% |
| OC | 24 | 2.2% |
| LL5 | 19 | 1.7% |
| Eucrite-mmict | 18 | 1.6% |
| L4 | 18 | 1.6% |
| Howardite | 16 | 1.5% |
| CM2 | 15 | 1.4% |
| H | 13 | 1.2% |
| L | 10 | 0.9% |
| Iron, IIIAB | 10 | 0.9% |
| Aubrite | 9 | 0.8% |
| Diogenite | 8 | 0.7% |
| EL6 | 8 | 0.7% |
| CV3 | 7 | 0.6% |
latitude: histogram
| bin | count |
|---|---|
| -44.12 – -40.77 | 2 |
| -40.77 – -37.42 | 1 |
| -37.42 – -34.07 | 3 |
| -34.07 – -30.73 | 31 |
| -30.73 – -27.38 | 14 |
| -27.38 – -24.03 | 14 |
| -24.03 – -20.68 | 9 |
| -20.68 – -17.34 | 11 |
| -17.34 – -13.99 | 6 |
| -13.99 – -10.64 | 4 |
| -10.64 – -7.295 | 11 |
| -7.295 – -3.948 | 18 |
| -3.948 – -0.6002 | 9 |
| -0.6002 – 2.747 | 10 |
| 2.747 – 6.095 | 6 |
| 6.095 – 9.442 | 13 |
| 9.442 – 12.79 | 34 |
| 12.79 – 16.14 | 32 |
| 16.14 – 19.48 | 20 |
| 19.48 – 22.83 | 34 |
| 22.83 – 26.18 | 57 |
| 26.18 – 29.53 | 59 |
| 29.53 – 32.87 | 61 |
| 32.87 – 36.22 | 96 |
| 36.22 – 39.57 | 79 |
| 39.57 – 42.92 | 78 |
| 42.92 – 46.26 | 119 |
| 46.26 – 49.61 | 81 |
| 49.61 – 52.96 | 92 |
| 52.96 – 56.31 | 53 |
| 56.31 – 59.65 | 22 |
| 59.65 – 63 | 14 |
| 63 – 66.35 | 4 |
longitude: histogram
| bin | count |
|---|---|
| -157.9 – -147.8 | 2 |
| -147.8 – -137.7 | 0 |
| -137.7 – -127.7 | 1 |
| -127.7 – -117.6 | 7 |
| -117.6 – -107.5 | 11 |
| -107.5 – -97.45 | 44 |
| -97.45 – -87.39 | 49 |
| -87.39 – -77.32 | 51 |
| -77.32 – -67.25 | 26 |
| -67.25 – -57.18 | 21 |
| -57.18 – -47.11 | 14 |
| -47.11 – -37.04 | 7 |
| -37.04 – -26.97 | 2 |
| -26.97 – -16.91 | 0 |
| -16.91 – -6.836 | 23 |
| -6.836 – 3.232 | 102 |
| 3.232 – 13.3 | 135 |
| 13.3 – 23.37 | 88 |
| 23.37 – 33.44 | 91 |
| 33.44 – 43.51 | 59 |
| 43.51 – 53.58 | 25 |
| 53.58 – 63.64 | 11 |
| 63.64 – 73.71 | 29 |
| 73.71 – 83.78 | 104 |
| 83.78 – 93.85 | 31 |
| 93.85 – 103.9 | 13 |
| 103.9 – 114 | 40 |
| 114 – 124.1 | 38 |
| 124.1 – 134.1 | 28 |
| 134.1 – 144.2 | 33 |
| 144.2 – 154.3 | 9 |
| 154.3 – 164.3 | 1 |
| 164.3 – 174.4 | 2 |
date: top 20 values
| value | count | share |
|---|---|---|
| 1933-01-01 | 17 | 1.5% |
| 1949-01-01 | 13 | 1.2% |
| 1950-01-01 | 12 | 1.1% |
| 1976-01-01 | 11 | 1.0% |
| 1930-01-01 | 11 | 1.0% |
| 1938-01-01 | 11 | 1.0% |
| 1910-01-01 | 11 | 1.0% |
| 1868-01-01 | 11 | 1.0% |
| 1977-01-01 | 10 | 0.9% |
| 1939-01-01 | 10 | 0.9% |
| 1984-01-01 | 10 | 0.9% |
| 1934-01-01 | 10 | 0.9% |
| 1916-01-01 | 10 | 0.9% |
| 1924-01-01 | 10 | 0.9% |
| 1917-01-01 | 10 | 0.9% |
| 2008-01-01 | 9 | 0.8% |
| 2003-01-01 | 9 | 0.8% |
| 1998-01-01 | 9 | 0.8% |
| 1890-01-01 | 9 | 0.8% |
| 1986-01-01 | 9 | 0.8% |
description: length in characters
| chars | count |
|---|---|
| 46 – 47 | 1 |
| 47 – 47 | 5 |
| 47 – 48 | 0 |
| 48 – 49 | 29 |
| 49 – 49 | 79 |
| 49 – 50 | 0 |
| 50 – 51 | 118 |
| 51 – 51 | 137 |
| 51 – 52 | 0 |
| 52 – 52 | 129 |
| 52 – 53 | 110 |
| 53 – 54 | 0 |
| 54 – 54 | 76 |
| 54 – 55 | 68 |
| 55 – 56 | 0 |
| 56 – 56 | 58 |
| 56 – 57 | 54 |
| 57 – 58 | 0 |
| 58 – 58 | 34 |
| 58 – 59 | 0 |
| 59 – 60 | 40 |
| 60 – 60 | 22 |
| 60 – 61 | 0 |
| 61 – 62 | 26 |
| 62 – 62 | 21 |
| 62 – 63 | 0 |
| 63 – 64 | 20 |
| 64 – 64 | 20 |
| 64 – 65 | 0 |
| 65 – 66 | 14 |
| 66 – 66 | 9 |
| 66 – 67 | 0 |
| 67 – 67 | 11 |
| 67 – 68 | 4 |
| 68 – 69 | 0 |
| 69 – 69 | 3 |
| 69 – 70 | 5 |
| 70 – 71 | 0 |
| 71 – 71 | 1 |
| 71 – 72 | 3 |
Schema
10 columns
| Column | Type | Nulls | Unique | Alerts |
|---|---|---|---|---|
| latitude | numeric | 0.0% | 958 | outliers |
| longitude | numeric | 0.0% | 1,030 | |
| name | text | 0.0% | 1,097 | near_unique, one_word, short_text |
| description | text | 0.0% | 1,097 | near_unique |
| category | categorical | 0.0% | 1 | imbalance |
| date | categorical | 1.7% | 231 | |
| country | unknown | 0.0% | — | skipped |
| mass_g | unknown | 0.0% | — | skipped |
| meteorite_class | categorical | 0.0% | 125 | |
| fall_type | categorical | 0.0% | 1 | imbalance |
latitude
numeric feature outliers
Geographic latitude coordinates spanning -44.12° to 66.35°, covering most of the inhabited globe. The distribution is left-skewed (skew -1.28): a tail of southern values pulls the mean (30.04°) below the median (36.1°), indicating a Northern Hemisphere concentration. Roughly 8.2% of values (90 rows) are flagged as outliers, likely far-southern points well below the Q1 of 21.87°. Treatment: Pair with longitude for geospatial features; keep outliers as legitimate Southern Hemisphere observations rather than trimming.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 958
- min: -44.12
- max: 66.35
- mean: 30.04
- median: 36.1
- std: 23.13
- q1: 21.87
- q3: 46.07
- iqr: 24.2
- skew: -1.276
- kurtosis: 1.01
- n_outliers: 90
- outlier_rate: 0.08204
- zero_rate: 0.001823
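The outlier count above is consistent with Tukey's 1.5 × IQR rule, although the report does not state which rule the profiler applies, so treat the rule as an assumption. A minimal sketch using the quartiles listed above (the function name `tukey_fences` is illustrative, not part of any profiler API):

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences for the k * IQR rule.

    Assumption: the profiler flags values outside these fences;
    the report itself does not confirm the rule or the value of k.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for the latitude column.
lower, upper = tukey_fences(21.87, 46.07)
print(round(lower, 2), round(upper, 2))
```

Under this assumption the lower fence sits near -14.43° and the upper fence near 82.37°, which is above the observed maximum of 66.35°, so every flagged row would be a southern one. The latitude histogram bins below roughly -14° sum to at most 91 rows, which is in line with the reported n_outliers of 90.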
longitude
numeric feature
Geographic longitude in decimal degrees, with values spanning -157.87 to 174.4, which covers most of the -180 to 180 range. The distribution is broad (std 68.87, IQR 80.5), only mildly left-skewed (-0.23), and flatter than a normal distribution (kurtosis -0.62), indicating worldwide coverage rather than a single region. The 1,030 unique values across 1,097 rows suggest distinct point locations with minimal repetition; there are no nulls and only 3 outliers. Treatment: Pair with latitude as a geospatial coordinate; avoid treating as a standalone scalar feature.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 1,030
- min: -157.9
- max: 174.4
- mean: 20.13
- median: 18.72
- std: 68.87
- q1: -4.233
- q3: 76.27
- iqr: 80.5
- skew: -0.2257
- kurtosis: -0.6185
- n_outliers: 3
- outlier_rate: 0.002735
- zero_rate: 0.0009116
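One way to pair longitude with latitude, rather than using it as a standalone scalar, is to map each coordinate onto a 3D unit vector. This is only a sketch of one possible encoding, not something the report prescribes; the function names and the sample coordinates are hypothetical. The benefit is that points on either side of the ±180° meridian end up close together, which raw longitude values do not guarantee.

```python
import math


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to an (x, y, z) point on the unit sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def chord_distance(a, b):
    """Straight-line distance between two unit-sphere points (0 = same, 2 = antipodal)."""
    return math.dist(a, b)


# Hypothetical coordinates: two points straddling the antimeridian.
east = to_unit_vector(0.0, 179.9)
west = to_unit_vector(0.0, -179.9)
origin = to_unit_vector(0.0, 0.0)

print(chord_distance(east, west))    # small, although the raw longitudes differ by 359.8
print(chord_distance(east, origin))  # close to 2, nearly antipodal
```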
name
text identifier near_unique one_word short_text
This is a `name` column with 1,097 fully unique short strings (n_unique equals n, duplicate_rate 0.0), averaging 8.56 characters and 1.21 words, with 82.95% being single-word entries. Top tokens like `st.`, `county`, `san`, `santa`, `creek`, plus Spanish articles `de`, `la`, `el`, strongly suggest place names (likely US/Latin-influenced toponyms) rather than person names. Every row is distinct, so this functions as an identifier-like label rather than a learnable feature. Treatment: Treat as a unique label/key; drop from modelling features or use only for joins and display.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 1,097
- len_min: 2
- len_max: 28
- len_mean: 8.557
- len_median: 8
- len_p95: 15
- word_mean: 1.209
- word_median: 1
- n_empty: 0
- n_duplicates: 0
- duplicate_rate: 0
- vocab_size: 1,238
- readability_flesch_mean: 40.67
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 0.8295
- allcaps_rate: 0
- boilerplate_rate: 0
description
text free_text near_unique
Short, templated descriptions of meteorite records: every one of the 1,097 rows contains the tokens 'meteorite', 'mass:', 'found:', and 'fell.', confirming a generated sentence rather than free prose. Lengths are tight (46–72 chars, mean 54.3, ~8 words) and each row is unique (n_unique=1097, duplicate_rate=0), so the field carries the same signal as the underlying structured columns. Class codes like 'l6.' (260), 'h5.' (163), 'h6.' (91), 'l5.' (76) leak the meteorite classification into the text. Treatment: Drop, or parse into structured fields (mass, found, class) rather than embedding, since it is a template over existing columns.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 1,097
- len_min: 46
- len_max: 72
- len_mean: 54.31
- len_median: 53
- len_p95: 64
- word_mean: 8.254
- word_median: 8
- n_empty: 0
- n_duplicates: 0
- duplicate_rate: 0
- vocab_size: 1,372
- readability_flesch_mean: 52.62
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 0
- allcaps_rate: 0
- boilerplate_rate: 0
category
categorical metadata imbalance
This column is a single-valued categorical tag, with all 1,097 rows labeled "witnessed_meteorite_falls". Cardinality is 1 and entropy is 0, so it carries no information for modelling and merely records the dataset's provenance or scope. Treatment: Drop before modelling; retain only as a dataset-level annotation.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 1
- top_value: witnessed_meteorite_falls
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
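The entropy and entropy_ratio figures in this report appear to be Shannon entropy in bits and that entropy divided by log2(cardinality). This is inferred from the reported numbers rather than documented: 7.593 / log2(231) ≈ 0.967 for date and 4.639 / log2(125) ≈ 0.666 for meteorite_class. A minimal sketch under that assumption (function names are illustrative):

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy of the value distribution, in bits."""
    n = len(values)
    counts = Counter(values)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h + 0.0  # normalises -0.0 to 0.0 for constant columns


def entropy_ratio(values):
    """Entropy divided by its maximum, log2(number of distinct values).

    Assumption: a column with a single distinct value gets ratio 0,
    since log2(1) == 0 would otherwise cause a division by zero.
    """
    k = len(set(values))
    if k <= 1:
        return 0.0
    return entropy_bits(values) / math.log2(k)


# A constant column, like category or fall_type, has zero entropy.
print(entropy_bits(["Fell"] * 4), entropy_ratio(["Fell"] * 4))

# Cross-check against the figures reported for date and meteorite_class.
print(round(7.593 / math.log2(231), 3), round(4.639 / math.log2(125), 3))
```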
date
categorical timestamp
This column holds dates stored as strings, all snapped to January 1st, which suggests year-only granularity stored as full dates. Across 1,097 rows there are 231 distinct values with a very high entropy ratio (0.967), and no single year exceeds 1.6% frequency, so the falls are spread broadly across years; the 20 most frequent years alone range from 1868 to 2008. The null rate is low at 1.73% (19 rows). Treatment: Parse to datetime and extract year as the working feature, since month/day are constant.
- n: 1,097
- nulls: 19 (1.7%)
- unique: 231
- top_value: 1933-01-01
- top_rate: 0.01577
- cardinality: 231
- entropy: 7.593
- entropy_ratio: 0.967
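The treatment above can be sketched with the standard library alone. The example assumes the strings are ISO `YYYY-MM-DD` values, as in the top-values table, and that missing dates arrive as `None` or an empty string; how the source file actually encodes nulls is not shown in the report. The helper names are illustrative.

```python
from datetime import date


def year_of(value):
    """Return the year of an ISO date string, or None for a missing value."""
    if value is None or value == "":
        return None
    return date.fromisoformat(value).year


def decade_of(year):
    """Bucket a year into its decade, e.g. 1933 -> 1930; None stays None."""
    return None if year is None else (year // 10) * 10


# Three values from the top-values table plus one missing entry.
raw = ["1933-01-01", "1868-01-01", "2008-01-01", None]
years = [year_of(v) for v in raw]
print(years)
print([decade_of(y) for y in years])
```

Bucketing by decade is one option for the time-trend exploration; with 231 distinct years and at most 17 falls in any single year, per-year counts are likely to be noisy.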
country
unknown metadata skipped
This column is labeled "country" and contains 1,097 non-null values, but saturn skipped detailed profiling so neither the cardinality nor value distribution is available. Without unique counts or sample values, I cannot confirm whether it holds country names, ISO codes, or something else. The only firm signals are full population (null_rate 0.0) and the skipped alert. Treatment: Re-profile with categorical stats enabled, then standardize to ISO codes before use.
- n: 1,097
- nulls: 0 (0.0%)
- unique: —
mass_g
unknown other skipped
Column `mass_g` was skipped by the profiler, so its kind is unknown and no descriptive statistics are available. The only confirmed signals are 1,097 rows with a 0.0 null rate; uniqueness, distribution, and type are all missing. The name suggests a numeric mass measurement in grams, but this cannot be verified from the evidence. Treatment: Re-run profiling on this column to recover type and distribution before any downstream use.
- n: 1,097
- nulls: 0 (0.0%)
- unique: —
meteorite_class
categorical label
This column captures the petrologic classification of meteorites, with 125 distinct classes across 1,097 records and no nulls. The distribution is dominated by ordinary chondrite types (L6 alone covers 23.7% of rows, followed by H5 with 163 and H6 with 91), while a long tail of 115+ rare classes pushes the entropy ratio to 0.67. Analysts should note the heavy concentration in a handful of chondrite groups alongside niche entries like Eucrite-mmict (18). Treatment: Group rare classes into an 'other' bucket before encoding for modelling.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 125
- top_value: L6
- top_rate: 0.237
- cardinality: 125
- entropy: 4.639
- entropy_ratio: 0.666
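Grouping rare classes can be sketched as a count threshold. The counts below are taken from the top-values table; the threshold of 50 and the helper name `group_rare` are illustrative assumptions, and a suitable cut-off depends on the downstream model.

```python
from collections import Counter


def group_rare(labels, min_count, other="other"):
    """Replace every label that occurs fewer than min_count times with `other`."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else other for lab in labels]


# A subset of classes, with counts copied from the top-values table.
sample = (
    ["L6"] * 260 + ["H5"] * 163 + ["H6"] * 91
    + ["Aubrite"] * 9 + ["CV3"] * 7
)

grouped = Counter(group_rare(sample, min_count=50))
print(grouped)
```

In this subset L6, H5, and H6 keep their own labels, while Aubrite and CV3 fall into the `other` bucket with 16 rows in total.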
fall_type
categorical metadata imbalance
This column records the type of fall event but contains the single value "Fell" across all 1,097 rows, with zero nulls. Entropy is 0.0 and top_rate is 1.0, so it carries no information for any downstream model. Treatment: Drop; constant column with a single value.
- n: 1,097
- nulls: 0 (0.0%)
- unique: 1
- top_value: Fell
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
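Constant columns such as category and fall_type can be detected programmatically instead of being hard-coded. The sketch below assumes the rows are available as a list of dictionaries with identical keys; the two sample rows use placeholder names and are not taken from the dataset.

```python
def constant_columns(rows):
    """Return the columns whose value is identical in every row."""
    if not rows:
        return []
    return [
        col for col in rows[0]
        if len({row[col] for row in rows}) == 1
    ]


def drop_columns(rows, to_drop):
    """Return copies of the rows without the given columns."""
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]


# Placeholder rows for illustration only.
rows = [
    {"name": "A", "meteorite_class": "L6",
     "category": "witnessed_meteorite_falls", "fall_type": "Fell"},
    {"name": "B", "meteorite_class": "H5",
     "category": "witnessed_meteorite_falls", "fall_type": "Fell"},
]

constants = constant_columns(rows)
print(constants)
print(drop_columns(rows, constants))
```

With only one row every column looks constant, so a check like this is only meaningful on the full table.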