saturn

/home/coolhand/html/datavis/data_trove/data/wild/nasa_meteorites.csv; 45,716 rows; sample n=45,716; seed 42; 2026-06-22T00:21:51+00:00

Overview

Source	/home/coolhand/html/datavis/data_trove/data/wild/nasa_meteorites.csv
Total rows	45,716
Profiled sample	45,716
Columns	20
Generated	2026-06-22T00:21:51+00:00

Per-column null rate across the corpus.
column	kind	null %
sid	text	0.0%
id	text	0.0%
position	numeric	0.0%
created_at	numeric	0.0%
created_meta	unknown	0.0%
updated_at	numeric	0.0%
updated_meta	unknown	0.0%
meta	categorical	0.0%
name	text	0.0%
id_1	numeric	0.0%
nametype	categorical	0.0%
recclass	categorical	0.0%
mass (g)	numeric	0.3%
fall	categorical	0.0%
year	categorical	0.6%
reclat	numeric	16.0%
reclong	numeric	16.0%
GeoLocation	text	16.0%
States	numeric	96.4%
Counties	numeric	96.4%

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset high anthropic:default

This dataset is a NASA meteorite landings catalogue covering 45,716 unique meteorite records with attributes including mass, classification, discovery year, and geographic coordinates. The most striking feature is the mass distribution: the median mass is just 32.6 g but the maximum reaches 60,000,000 g, producing extreme skew (skew=76.9) and over 7,000 statistical outliers — a handful of enormous meteorites are pulling the mean to 13,278 g. A second key finding is that 97.6% of records are classified as 'Found' rather than 'Fell', meaning nearly all entries are meteorites discovered on the ground rather than witnessed falling, which has strong implications for geographic and temporal bias in the data. The meteorite classification column (recclass) spans 466 types, dominated by ordinary chondrites (L6, H5, L5), and year of discovery shows a clear spike in the late 1990s–2000s likely tied to Antarctic collection campaigns.

id high anthropic:default

This column contains UUID-formatted identifiers that serve as a primary key: all 45,716 values are exactly 36 characters long and fully unique, with zero duplicates and zero nulls. All sampled values share the prefix '00000000-0000-0000-', so the first three groups, including the nibble that normally carries the UUID version, are zeroed out. That is atypical of standard UUID v4 generation and may indicate a custom or synthetic ID scheme. The allcaps_rate of 1.0 reflects uppercase hexadecimal digits, which matters if downstream systems compare IDs case-sensitively.
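
The zeroed prefix can be inspected field by field with Python's standard `uuid` module. This is a minimal sketch on two of the sample values from this report; it only illustrates the observation and says nothing about how the IDs were actually generated.

```python
import uuid

# Two sample values copied from the report's "Sample values" list for id.
samples = [
    "00000000-0000-0000-160C-D58AE0A2ECD9",
    "00000000-0000-0000-D601-94E78E43026D",
]

for text in samples:
    u = uuid.UUID(text)
    # The first three groups map to time_low, time_mid and time_hi_version.
    # uuid.UUID.version is None unless the variant bits mark an RFC 4122 UUID.
    print(u.time_low, u.time_mid, u.time_hi_version, u.version, u.variant)
```

In these two samples the variant bits differ from one value to the next, which is consistent with the fourth group being arbitrary hex rather than a structured UUID field.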

sid high anthropic:default

This column is a Socrata-style row identifier, recognizable from the 'row-' prefix followed by three four-character groups joined by '-', '.', '_' or '~' in all sampled values. Every value is exactly 18 characters long (len_min = len_max = len_mean = 18.0), and all 45,716 rows are unique with a duplicate_rate of 0.0 and null_rate of 0.0, which makes it a perfect surrogate key. No surprises in the data; it is entirely consistent with an auto-generated system identifier from a Socrata open-data platform.

mass (g) high anthropic:default

This column records the physical mass of each meteorite specimen in grams. The median is just 32.6 g, while the mean is 13,278 g and the maximum reaches 60,000,000 g (60 tonnes): a small number of very massive specimens stretch the distribution, and a skew of 76.9 and kurtosis of 6,796 confirm an extreme long tail. The 1.5 × IQR rule flags 15.5% of rows (7,086) as outliers. With q1 = 7.2 g and q3 = 202.6 g, and assuming the usual Tukey fences, the upper cut-off is about 495.7 g, so the flagged group includes every specimen heavier than roughly half a kilogram, not only the giants that dominate the mean.
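
The outlier count is easier to interpret once the cut-off is explicit. The sketch below recomputes the fences from the quartiles saturn reports, assuming the standard Tukey rule (the report itself only says "beyond 1.5 IQR"); `tukey_fences` is a hypothetical helper, not a saturn function.

```python
# Quartiles copied from the mass (g) stats in this report.
Q1 = 7.2
Q3 = 202.6
K = 1.5  # assumed multiplier, taken from the "beyond 1.5 IQR" alert


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) cut-offs; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = tukey_fences(Q1, Q3, K)
print(f"lower fence: {lower:.1f} g, upper fence: {upper:.1f} g")
```

Since mass cannot be negative, only the upper fence matters here: under this assumption, anything above roughly 496 g is counted as an outlier.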

GeoLocation high anthropic:default

GeoLocation stores geographic coordinates as serialized Python list strings in the format [None, '<latitude>', '<longitude>', None, False], which appears to be a structured geo-point object flattened to text. The most common value, [None, '0.0', '0.0', None, False], appears 6,214 times and is almost certainly a null/unknown sentinel rather than a genuine location at 0°N 0°E, so true missingness exceeds the 16% null rate: 7,315 nulls plus 6,214 placeholders is 13,529 rows, or about 29.6% of the corpus. The duplicate rate is high at 55.5% (21,301 duplicates across 17,100 unique values among the 38,401 non-null rows), consistent with many records sharing the same coordinates. The column should be parsed into numeric latitude and longitude fields rather than used as raw text.
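
A minimal parsing sketch for this column, assuming the second and third list elements are latitude and longitude in that order (the sample values are consistent with this, since some third elements exceed 90) and that the (0.0, 0.0) pair should be treated as missing. `parse_geolocation` is a hypothetical helper name.

```python
import ast
from typing import Optional, Tuple


def parse_geolocation(raw: Optional[str]) -> Optional[Tuple[float, float]]:
    """Parse "[None, '<lat>', '<lon>', None, False]" into (lat, lon).

    Returns None for nulls and for the (0.0, 0.0) pair, which is treated
    here as a missing-value placeholder (an assumption, not a measured fact).
    """
    if raw is None or not raw.strip():
        return None
    fields = ast.literal_eval(raw)  # evaluates Python literals only, no code
    lat, lon = float(fields[1]), float(fields[2])
    if lat == 0.0 and lon == 0.0:
        return None
    return lat, lon


samples = [
    "[None, '38.5', '-94.3', None, False]",
    "[None, '0.0', '0.0', None, False]",
    None,
]
parsed = [parse_geolocation(s) for s in samples]
print(parsed)
```

`ast.literal_eval` is used instead of `eval` because the strings come from a data file and should never be executed as code.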

fall high anthropic:default

This column captures whether a meteorite was discovered on the ground ('Found') or observed falling ('Fell'), making it a binary label for meteorite recovery type. The distribution is severely imbalanced: 'Found' accounts for 97.6% of 45,716 records (44,609), while 'Fell' represents only 2.4% (1,107). The entropy ratio of 0.164, where 1.0 would be a perfectly balanced split, quantifies this and is flagged explicitly as an imbalance alert. Any model that uses this column as a target will likely need class-balancing techniques or imbalance-aware metrics.
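
The entropy figures can be reproduced from the two counts. The sketch below assumes entropy is measured in bits and that entropy_ratio is entropy divided by log2(cardinality); that definition is inferred from the reported numbers, not taken from saturn's documentation.

```python
import math


def entropy_bits(counts):
    """Shannon entropy in bits of a list of category counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_ratio(counts):
    """Entropy divided by log2(cardinality); 1.0 means perfectly balanced."""
    k = len(counts)
    return entropy_bits(counts) / math.log2(k) if k > 1 else 0.0


fall_counts = [44609, 1107]  # Found, Fell (from the report)
print(round(entropy_bits(fall_counts), 3), round(entropy_ratio(fall_counts), 3))
```

For a two-category column log2(cardinality) is 1, so entropy and entropy ratio coincide, which matches the identical values the report shows for fall and nametype.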

meta high anthropic:default

This column is a metadata field that contains exclusively the empty object literal '{ }' across all 45,716 rows, with zero nulls and a cardinality of 1. It carries no information whatsoever — entropy is 0.0 and top_rate is 1.0, meaning every single record is identical. This is almost certainly an unfilled placeholder or a defaulted JSON field that was never populated.

nametype high anthropic:default

This column is a meteorite name-type flag that distinguishes ordinary meteorites ('Valid') from relict ones ('Relict'). In meteorite nomenclature a relict meteorite is one whose original material has been largely altered or replaced by terrestrial weathering; the label does not mean a superseded name. The distribution is extremely imbalanced: 45,641 of 45,716 records (99.84%) are 'Valid', with only 75 'Relict' entries. The near-zero entropy (0.018) confirms that this column carries almost no information, which triggered the imbalance alert.

name high anthropic:default

This column contains meteorite names, which by convention are taken from the find locality and often carry a numeric suffix (for example 'Yamato 790306' or 'Queen Alexandra Range 97947'). That explains why the top words ('range', 'hills', 'mountains', 'northwest', 'africa', 'grove', 'yamato', 'Queen Alexandra') are typical components of place names. Every one of the 45,716 rows has a distinct value (duplicate_rate 0.0, n_unique = 45,716) with zero nulls, making it a perfect natural identifier. The mean word count of 2.77 and median of 3.0 confirm multi-token names rather than single labels. 'yamato' is the top individual word with 3,317 occurrences, which points to a large block of specimens named after one locality (the Yamato Mountains in Antarctica), so tokens repeat heavily even though full names are unique.

created_at high anthropic:default

This column is a Unix timestamp named 'created_at', but every one of its 45,716 non-null rows holds the identical value 1446143734 (2015-10-29T18:35:34 UTC), triggering a 'constant' alert. With n_unique of 1, std of 0.0, and IQR of 0.0, the column carries no information. The most likely explanation is a bulk load or one-time snapshot import in which every row received the same timestamp; a pipeline bug is also possible. Either way, the column cannot serve as a temporal signal.
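
The epoch value can be decoded with the standard library to confirm the date. This sketch assumes the value is in seconds since 1970-01-01 UTC, which its magnitude suggests.

```python
from datetime import datetime, timezone

CREATED_AT = 1446143734  # the single value found in created_at and updated_at

stamp = datetime.fromtimestamp(CREATED_AT, tz=timezone.utc)
print(stamp.isoformat())  # 2015-10-29T18:35:34+00:00
```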

position high anthropic:default

This column, named 'position', is a numeric field that is entirely constant — every one of its 45,716 non-null values is exactly 0.0 (zero_rate = 1.0, n_unique = 1). It carries zero information and would contribute nothing to any model or analysis. This is flagged as a constant alert, confirming it is safe to drop.

updated_at high anthropic:default

This column is a Unix epoch timestamp named `updated_at`, representing a last-modified datetime for each row. Every single one of the 45,716 non-null records holds the identical value 1446143734 (approximately 2015-10-29 UTC), meaning the column is a constant — it carries zero information variance. This strongly suggests a bulk data load or migration event where all rows were stamped with the same timestamp rather than tracking real update times.

created_meta low anthropic:default

The column 'created_meta' is likely a creation timestamp or metadata field associated with record provenance, but saturn classified it as 'unknown' kind and skipped all profiling, yielding zero stats and no uniqueness count. With 45,716 rows, zero nulls, and no further signal available, its actual dtype, distribution, and content cannot be assessed from this evidence alone.

updated_meta low anthropic:default

The column 'updated_meta' was skipped by the profiler, yielding no stats, no uniqueness count, and no type resolution beyond 'unknown'. With 45,716 non-null rows and a null rate of 0.0, the column is fully populated, but its content, structure, and distribution are entirely opaque from the available evidence. The name suggests it may hold metadata update timestamps or serialized metadata objects (e.g., JSON blobs), but nothing in the evidence confirms this.

id_1 high anthropic:default

This column is almost certainly a row or entity identifier: it has 45,716 unique values across 45,716 rows with zero nulls and zero duplicates, indicating a perfect 1-to-1 mapping. Values run from 1 to 57,458, suggesting either a sparse sequential ID (gaps exist since max > n) or a pre-filtered subset of a larger table. The near-uniform distribution (kurtosis −1.16, skew 0.27, zero outliers) is consistent with a sequential or pseudo-random integer key rather than a meaningful numeric feature.

reclat high anthropic:default

This column represents the recorded latitude of meteorite find/fall locations, with values ranging from -87.37° to +81.17°, within valid latitude bounds. Two signals stand out. First, the median of -71.5° means at least half of the non-null records lie at high southern latitudes (likely Antarctic recovery sites). Second, Q3 is exactly 0.0° and the zero_rate is 16.8%, a cluster far larger than genuine equatorial finds would explain; together with the 6,214 GeoLocation strings that pair latitude 0.0 with longitude 0.0, this suggests that zeros encode missing coordinates. The zero_rate appears to be computed over non-null rows (16.8% of all 45,716 rows would exceed the 6,448 records in the histogram bin that contains zero), so these placeholders come on top of the 16% null rate rather than overlapping with it. Kurtosis of -1.48 is consistent with a flat, multimodal distribution rather than a normal one.

reclong high anthropic:default

This column represents the recorded longitude of meteorite landing or find locations, covering a range from -165.43° to 354.47°. The maximum of 354.47° is surprising: conventional longitude runs from -180° to 180°, so at least one record appears to use a 0–360° convention (the histogram shows a single value above 185.5°), which will cause mapping errors if not normalised. The zero_rate of 16.2% is close to the share of non-null GeoLocation strings that hold the (0.0, 0.0) pair, which suggests that zero-filled values are placeholders for missing coordinates rather than genuine locations on the prime meridian; they come on top of the 16% null rate. The distribution is near-symmetric (skew -0.17, kurtosis -0.73) with a large IQR of 157.17, consistent with a globally spread geographic variable.
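
One common way to normalise out-of-range longitudes is modular wrapping. The sketch below is an illustration, not something saturn applies; `wrap_longitude` is a hypothetical helper, and whether 354.47° is really a 0–360° value or a data-entry error has to be checked against the source record.

```python
def wrap_longitude(lon: float) -> float:
    """Map any longitude in degrees onto the half-open range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


# The reported maximum, 354.473, wraps to about 5.5 degrees west.
print(round(wrap_longitude(354.473), 3))
# Values already in range are unchanged apart from float rounding.
print(round(wrap_longitude(-165.433), 3))
```

With this convention +180° maps to -180°, which describes the same meridian.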

recclass high anthropic:default

This column contains meteorite classification codes, identifying the mineralogical and petrologic type of each recovered meteorite specimen. The top 7 values (L6, H5, L5, H6, H4, LL5, LL6) are all ordinary chondrite classes and together account for about 74% of records (33,771 of 45,716), with L6 alone representing 18.1%. Despite 466 unique classes, the entropy ratio of 0.51 indicates moderate concentration: the long tail of rare classes (CM2, ranked tenth, already has only 416 occurrences) will create sparse dummy variables if one-hot encoded naively.
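
The concentration in the top classes, and one simple way to tame the long tail before encoding, can be sketched as follows. The counts are copied from the top-values table in this report; `lump_rare` and its `min_count` threshold are hypothetical, and the demo labels are illustrative only.

```python
from collections import Counter

# Counts for the seven most frequent classes, copied from the report.
TOP7 = {"L6": 8285, "H5": 7142, "L5": 4796, "H6": 4528,
        "H4": 4211, "LL5": 2766, "LL6": 2043}
TOTAL_ROWS = 45716

top7_share = sum(TOP7.values()) / TOTAL_ROWS
print(f"top-7 share: {top7_share:.1%}")


def lump_rare(labels, min_count):
    """Replace labels seen fewer than min_count times with 'Other'."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else "Other" for lab in labels]


demo = ["L6", "L6", "L6", "H5", "H5", "CM2", "Howardite"]
print(lump_rare(demo, min_count=2))
```

Lumping trades detail for fewer, denser indicator columns; the right threshold depends on the analysis and is not implied by the profile.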

year high anthropic:default

This column represents a calendar year, stored as full ISO-8601 timestamps normalised to January 1st of each year (e.g. '2003-01-01T00:00:00'), so the month, day, and time components carry no information. Despite being profiled as categorical, it is effectively an annual time dimension. The 20 top values shown span 1979–2010, yet cardinality is 266 distinct values, so the column must cover a much wider date range, possibly along with some malformed or unexpected entries worth inspecting. The top year (2003) accounts for just 7.3% of rows, indicating a reasonably spread distribution rather than heavy concentration.
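
Converting the column to an integer year makes it usable as a time dimension. This is a minimal sketch assuming every valid entry follows the 'YYYY-01-01T00:00:00' shape seen in the top values; `year_of` is a hypothetical helper, and anything it cannot parse is returned as None for manual inspection.

```python
from datetime import datetime
from typing import Optional


def year_of(value: Optional[str]) -> Optional[int]:
    """Extract the calendar year from strings like '2003-01-01T00:00:00'."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value).year
    except ValueError:
        return None  # malformed entry: leave for manual inspection


values = ["2003-01-01T00:00:00", "1979-01-01T00:00:00", None, "n/a"]
print([year_of(v) for v in values])
```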

Numeric correlation

Pearson correlation across 9 numeric columns (values clipped to 2 decimals).
	position	created_at	updated_at	id_1	mass (g)	reclat	reclong	States	Counties
position	+nan	+nan	+nan	+nan	+nan	+nan	+nan	+nan	+nan
created_at	+nan	+nan	+nan	+nan	+nan	+nan	+nan	+nan	+nan
updated_at	+nan	+nan	+nan	+nan	+nan	+nan	+nan	+nan	+nan
id_1	+nan	+nan	+nan	+1.00	-0.04	+0.09	-0.18	-0.06	-0.09
mass (g)	+nan	+nan	+nan	-0.04	+1.00	+0.06	+0.02	+0.09	-0.01
reclat	+nan	+nan	+nan	+0.09	+0.06	+1.00	-0.56	+0.07	+0.02
reclong	+nan	+nan	+nan	-0.18	+0.02	-0.56	+1.00	-0.03	-0.08
States	+nan	+nan	+nan	-0.06	+0.09	+0.07	-0.03	+1.00	+0.15
Counties	+nan	+nan	+nan	-0.09	-0.01	+0.02	-0.08	+0.15	+1.00
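
The +nan rows and columns belong to the three constant columns (position, created_at, updated_at). Pearson correlation divides by the product of the two standard deviations, so it is undefined whenever one of them is zero. The sketch below shows this with a plain-Python implementation; it is not saturn's code, and the mass values are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; NaN when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else math.nan


position = [0.0, 0.0, 0.0, 0.0]      # constant, like the real column
mass = [32.6, 7.2, 202.6, 13278.0]   # illustrative values only

print(pearson(position, mass))       # nan: position has zero variance
print(round(pearson(mass, mass), 6))
```

The same reasoning explains why even the diagonal entries for the constant columns are +nan rather than +1.00.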

Languages detected

Per-string language detection across text columns (sampled).

Per-language counts (total 4,211 detected strings).
lang	count	share
en	4209	100.0%
sh	2	0.0%

sid text

100.0% of rows are unique strings; 100.0% rows are a single word; 95th-percentile length under 20 chars

rows	45,716
null	0 (0.0%)
unique	45,716
len_min	18
len_max	18
len_mean	18.000
len_median	18.000
len_p95	18.000
word_mean	1.000
word_median	1.000
n_empty	0
n_duplicates	0
duplicate_rate	0.000
vocab_size	20,000
readability_flesch_mean	-5.680
emoji_rate	0.000
url_rate	0.000
one_word_rate	1.000
allcaps_rate	0.000
boilerplate_rate	0.000

Character-length distribution for sid (mean: 18.0).
chars	count
18	45716

Sample values (first 10)

row-r47i_enas-8d5d
row-2gxt~mwbj.kygv
row-97xa-5e59-vagr
row-7cp3~x6x4.vm6k
row-3wek_jm8i.pni8
row-y6uh~zk3x.wra8
row-t5mj_vvcr~wn2s
row-ki6c~wwdn-e92u
row-8f4u.pck5.95b7
row-7i8i-ffdi~r4gj

id text

100.0% of rows are unique strings; 100.0% rows are a single word; 100.0% rows are all-caps

rows	45,716
null	0 (0.0%)
unique	45,716
len_min	36
len_max	36
len_mean	36.000
len_median	36.000
len_p95	36.000
word_mean	1.000
word_median	1.000
n_empty	0
n_duplicates	0
duplicate_rate	0.000
vocab_size	20,000
readability_flesch_mean	65.384
emoji_rate	0.000
url_rate	0.000
one_word_rate	1.000
allcaps_rate	1.000
boilerplate_rate	0.000

Character-length distribution for id (mean: 36.0).
chars	count
36	45716

Sample values (first 10)

00000000-0000-0000-160C-D58AE0A2ECD9
00000000-0000-0000-D601-94E78E43026D
00000000-0000-0000-5666-66D2DD76BE81
00000000-0000-0000-FAAF-5F9E2B6945C2
00000000-0000-0000-7DCE-D65B4702C6F1
00000000-0000-0000-A883-F7AB796E948C
00000000-0000-0000-8F12-152021CC7C30
00000000-0000-0000-B720-5ED557EDEF92
00000000-0000-0000-57BA-C8C8DC37D4C8
00000000-0000-0000-7DF4-0F5269CA84E7

position numeric

only one distinct value

rows	45,716
null	0 (0.0%)
unique	1
min	0.000
max	0.000
mean	0.000
median	0.000
std	0.000
q1	0.000
q3	0.000
iqr	0.000
skew	0.000
kurtosis	0.000
n_outliers	0
outlier_rate	0.000
zero_rate	1.000

Histogram bins for position (median: 0.0).
bin	count
0	45716

created_at numeric

only one distinct value

rows	45,716
null	0 (0.0%)
unique	1
min	1,446,143,734
max	1,446,143,734
mean	1,446,143,734
median	1,446,143,734
std	0.000
q1	1,446,143,734
q3	1,446,143,734
iqr	0.000
skew	0.000
kurtosis	0.000
n_outliers	0
outlier_rate	0.000
zero_rate	0.000

Histogram bins for created_at (median: 1446143734.0).
bin	count
1446143734	45716

created_meta unknown

no profiler for kind=unknown

rows	45,716
null	0 (0.0%)

updated_at numeric

only one distinct value

rows	45,716
null	0 (0.0%)
unique	1
min	1,446,143,734
max	1,446,143,734
mean	1,446,143,734
median	1,446,143,734
std	0.000
q1	1,446,143,734
q3	1,446,143,734
iqr	0.000
skew	0.000
kurtosis	0.000
n_outliers	0
outlier_rate	0.000
zero_rate	0.000

Histogram bins for updated_at (median: 1446143734.0).
bin	count
1446143734	45716

updated_meta unknown

no profiler for kind=unknown

rows	45,716
null	0 (0.0%)

meta categorical

top value is 100.0% of rows

rows	45,716
null	0 (0.0%)
unique	1
top_value	{ }
top_rate	1.000
cardinality	1
entropy	-0.000
entropy_ratio	0.000

Top values for meta (1 unique shown, of 1 total).
value	count	share
{ }	45716	100.0%

name text

100.0% of rows are unique strings

rows	45,716
null	0 (0.0%)
unique	45,716
len_min	2
len_max	28
len_mean	17.785
len_median	19.000
len_p95	27.000
word_mean	2.772
word_median	3.000
n_empty	0
n_duplicates	0
duplicate_rate	0.000
vocab_size	17,917
readability_flesch_mean	63.744
emoji_rate	0.000
url_rate	0.000
one_word_rate	0.047
allcaps_rate	0.000
boilerplate_rate	0.000

Character-length distribution for name (mean: 17.785).
chars	count
2	2
3	16
4	97
5	224
6	420
7	430
8	449
9	958
10	1392
11	1235
12	3961
13	5478
14	297
15	1217
16	237
17	2772
18	2194
19	1826
20	3427
21	8562
22	5145
23	1466
24	33
25	405
26	21
27	3398
28	54

Sample values (first 10)

Atoka
Queen Alexandra Range 97947
Yamato 790306
Superior Valley 005
Larkman Nunatak 06417
Yamato 982185
Lewis Cliff 86041
Yamato 791912
Tanezrouft 033
Al Huwaysah 005

id_1 numeric

rows	45,716
null	0 (0.0%)
unique	45,716
min	1.000
max	57,458
mean	26,890
median	24,262
std	16,861
q1	12,689
q3	40,657
iqr	27,968
skew	0.267
kurtosis	-1.160
n_outliers	0
outlier_rate	0.000
zero_rate	0.000

Histogram bins for id_1 (median: 24261.5).
bin	count
1 – 1437	1354
1437 – 2874	1151
2874 – 4310	814
4310 – 5747	1270
5747 – 7183	1416
7183 – 8620	1428
8620 – 1.006e+04	1433
1.006e+04 – 1.149e+04	1404
1.149e+04 – 1.293e+04	1394
1.293e+04 – 1.437e+04	1437
1.437e+04 – 1.58e+04	1415
1.58e+04 – 1.724e+04	1414
1.724e+04 – 1.867e+04	1420
1.867e+04 – 2.011e+04	1423
2.011e+04 – 2.155e+04	1437
2.155e+04 – 2.298e+04	1432
2.298e+04 – 2.442e+04	1368
2.442e+04 – 2.586e+04	1296
2.586e+04 – 2.729e+04	1368
2.729e+04 – 2.873e+04	900
2.873e+04 – 3.017e+04	1368
3.017e+04 – 3.16e+04	1078
3.16e+04 – 3.304e+04	529
3.304e+04 – 3.448e+04	763
3.448e+04 – 3.591e+04	1205
3.591e+04 – 3.735e+04	1049
3.735e+04 – 3.878e+04	582
3.878e+04 – 4.022e+04	810
4.022e+04 – 4.166e+04	392
4.166e+04 – 4.309e+04	0
4.309e+04 – 4.453e+04	186
4.453e+04 – 4.597e+04	1300
4.597e+04 – 4.74e+04	1281
4.74e+04 – 4.884e+04	1413
4.884e+04 – 5.028e+04	1129
5.028e+04 – 5.171e+04	1274
5.171e+04 – 5.315e+04	1022
5.315e+04 – 5.459e+04	1219
5.459e+04 – 5.602e+04	1275
5.602e+04 – 5.746e+04	1267

nametype categorical

top value is 99.8% of rows

rows	45,716
null	0 (0.0%)
unique	2
top_value	Valid
top_rate	0.998
cardinality	2
entropy	0.018
entropy_ratio	0.018

Top values for nametype (2 unique shown, of 2 total).
value	count	share
Valid	45641	99.8%
Relict	75	0.2%

recclass categorical

rows	45,716
null	0 (0.0%)
unique	466
top_value	L6
top_rate	0.181
cardinality	466
entropy	4.548
entropy_ratio	0.513

Top values for recclass (20 unique shown, of 466 total).
value	count	share
L6	8285	18.1%
H5	7142	15.6%
L5	4796	10.5%
H6	4528	9.9%
H4	4211	9.2%
LL5	2766	6.1%
LL6	2043	4.5%
L4	1253	2.7%
H4/5	428	0.9%
CM2	416	0.9%
H3	386	0.8%
L3	365	0.8%
CO3	335	0.7%
Ureilite	300	0.7%
Iron, IIIAB	285	0.6%
LL4	268	0.6%
CV3	256	0.6%
Diogenite	241	0.5%
Howardite	240	0.5%
LL	225	0.5%

mass (g) numeric

skew=+76.91; 15.5% rows beyond 1.5 IQR

rows	45,716
null	131 (0.3%)
unique	12,576
min	0.000
max	60,000,000
mean	13,278
median	32.600
std	574,989
q1	7.200
q3	202.600
iqr	195.400
skew	76.908
kurtosis	6,796
n_outliers	7,086
outlier_rate	0.155
zero_rate	4.17e-04

Histogram bins for mass (g) (median: 32.6).
bin	count
0 – 1.5e+06	45544
1.5e+06 – 3e+06	16
3e+06 – 4.5e+06	8
4.5e+06 – 6e+06	1
6e+06 – 7.5e+06	1
7.5e+06 – 9e+06	1
9e+06 – 1.05e+07	2
1.05e+07 – 1.2e+07	0
1.2e+07 – 1.35e+07	0
1.35e+07 – 1.5e+07	0
1.5e+07 – 1.65e+07	2
1.65e+07 – 1.8e+07	0
1.8e+07 – 1.95e+07	0
1.95e+07 – 2.1e+07	0
2.1e+07 – 2.25e+07	1
2.25e+07 – 2.4e+07	1
2.4e+07 – 2.55e+07	2
2.55e+07 – 2.7e+07	1
2.7e+07 – 2.85e+07	1
2.85e+07 – 3e+07	0
3e+07 – 3.15e+07	1
3.15e+07 – 3.3e+07	0
3.3e+07 – 3.45e+07	0
3.45e+07 – 3.6e+07	0
3.6e+07 – 3.75e+07	0
3.75e+07 – 3.9e+07	0
3.9e+07 – 4.05e+07	0
4.05e+07 – 4.2e+07	0
4.2e+07 – 4.35e+07	0
4.35e+07 – 4.5e+07	0
4.5e+07 – 4.65e+07	0
4.65e+07 – 4.8e+07	0
4.8e+07 – 4.95e+07	0
4.95e+07 – 5.1e+07	1
5.1e+07 – 5.25e+07	0
5.25e+07 – 5.4e+07	0
5.4e+07 – 5.55e+07	0
5.55e+07 – 5.7e+07	0
5.7e+07 – 5.85e+07	1
5.85e+07 – 6e+07	1

fall categorical

top value is 97.6% of rows

rows	45,716
null	0 (0.0%)
unique	2
top_value	Found
top_rate	0.976
cardinality	2
entropy	0.164
entropy_ratio	0.164

Top values for fall (2 unique shown, of 2 total).
value	count	share
Found	44609	97.6%
Fell	1107	2.4%

year categorical

rows	45,716
null	291 (0.6%)
unique	266
top_value	2003-01-01T00:00:00
top_rate	0.073
cardinality	266
entropy	5.299
entropy_ratio	0.658

Top values for year (20 unique shown, of 266 total).
value	count	share
2003-01-01T00:00:00	3323	7.3%
1979-01-01T00:00:00	3046	6.7%
1998-01-01T00:00:00	2697	5.9%
2006-01-01T00:00:00	2456	5.4%
1988-01-01T00:00:00	2296	5.0%
2002-01-01T00:00:00	2078	4.5%
2004-01-01T00:00:00	1940	4.2%
2000-01-01T00:00:00	1792	3.9%
1997-01-01T00:00:00	1696	3.7%
1999-01-01T00:00:00	1691	3.7%
2001-01-01T00:00:00	1650	3.6%
1990-01-01T00:00:00	1518	3.3%
2009-01-01T00:00:00	1497	3.3%
1986-01-01T00:00:00	1375	3.0%
2007-01-01T00:00:00	1189	2.6%
2010-01-01T00:00:00	1005	2.2%
1993-01-01T00:00:00	979	2.1%
2008-01-01T00:00:00	957	2.1%
1987-01-01T00:00:00	916	2.0%
1991-01-01T00:00:00	877	1.9%

reclat numeric

rows	45,716
null	7,315 (16.0%)
unique	12,738
min	-87.367
max	81.167
mean	-39.123
median	-71.500
std	46.379
q1	-76.714
q3	0.000
iqr	76.714
skew	0.492
kurtosis	-1.477
n_outliers	0
outlier_rate	0.000
zero_rate	0.168

Histogram bins for reclat (median: -71.5).
bin	count
-87.37 – -83.15	7090
-83.15 – -78.94	1218
-78.94 – -74.73	4083
-74.73 – -70.51	9707
-70.51 – -66.3	1
-66.3 – -62.09	0
-62.09 – -57.87	0
-57.87 – -53.66	1
-53.66 – -49.45	0
-49.45 – -45.23	3
-45.23 – -41.02	11
-41.02 – -36.81	27
-36.81 – -32.59	91
-32.59 – -28.38	550
-28.38 – -24.17	436
-24.17 – -19.95	93
-19.95 – -15.74	35
-15.74 – -11.53	18
-11.53 – -7.313	19
-7.313 – -3.1	24
-3.1 – 1.113	6448
1.113 – 5.327	15
5.327 – 9.54	19
9.54 – 13.75	55
13.75 – 17.97	40
17.97 – 22.18	3197
22.18 – 26.39	315
26.39 – 30.61	2239
30.61 – 34.82	859
34.82 – 39.03	649
39.03 – 43.25	403
43.25 – 47.46	230
47.46 – 51.67	196
51.67 – 55.89	155
55.89 – 60.1	119
60.1 – 64.31	30
64.31 – 68.53	17
68.53 – 72.74	4
72.74 – 76.95	3
76.95 – 81.17	1

reclong numeric

rows	45,716
null	7,315 (16.0%)
unique	14,640
min	-165.433
max	354.473
mean	61.074
median	35.667
std	80.647
q1	0.000
q3	157.167
iqr	157.167
skew	-0.174
kurtosis	-0.731
n_outliers	0
outlier_rate	0.000
zero_rate	0.162

Histogram bins for reclong (median: 35.66667).
bin	count
-165.4 – -152.4	30
-152.4 – -139.4	228
-139.4 – -126.4	6
-126.4 – -113.4	444
-113.4 – -100.4	795
-100.4 – -87.45	462
-87.45 – -74.45	214
-74.45 – -61.45	1386
-61.45 – -48.45	57
-48.45 – -35.46	33
-35.46 – -22.46	2
-22.46 – -9.461	76
-9.461 – 3.536	6696
3.536 – 16.53	2208
16.53 – 29.53	1782
29.53 – 42.53	5243
42.53 – 55.53	1818
55.53 – 68.52	1420
68.52 – 81.52	2616
81.52 – 94.52	78
94.52 – 107.5	45
107.5 – 120.5	131
120.5 – 133.5	483
133.5 – 146.5	178
146.5 – 159.5	4052
159.5 – 172.5	7724
172.5 – 185.5	193
185.5 – 198.5	0
198.5 – 211.5	0
211.5 – 224.5	0
224.5 – 237.5	0
237.5 – 250.5	0
250.5 – 263.5	0
263.5 – 276.5	0
276.5 – 289.5	0
289.5 – 302.5	0
302.5 – 315.5	0
315.5 – 328.5	0
328.5 – 341.5	0
341.5 – 354.5	1

GeoLocation text

55.5% duplicate strings

rows	45,716
null	7,315 (16.0%)
unique	17,100
len_min	33
len_max	47
len_mean	40.305
len_median	41.000
len_p95	45.000
word_mean	5.000
word_median	5.000
n_empty	0
n_duplicates	21,301
duplicate_rate	0.555
vocab_size	15,461
readability_flesch_mean	117.160
emoji_rate	0.000
url_rate	0.000
one_word_rate	0.000
allcaps_rate	0.000
boilerplate_rate	0.000

Character-length distribution for GeoLocation (mean: 40.305).
chars	count
33	6214
34	27
35	139
36	1693
37	3283
38	527
39	488
40	5488
41	2086
42	3512
43	3760
44	2811
45	7678
46	672
47	23

Sample values (first 10)

[None, '38.5', '-94.3', None, False]
[None, '-84.0', '168.0', None, False]
[None, '-71.5', '35.66667', None, False]
[None, '-71.5', '35.66667', None, False]
[None, '19.72767', '55.72677', None, False]
[None, '0.0', '0.0', None, False]
[None, '50.375', '21.73333', None, False]
[None, '-71.5', '35.66667', None, False]
[None, '-71.5', '35.66667', None, False]
[None, '27.63556', '4.02528', None, False]

States numeric

96.4% null

rows	45,716
null	44,057 (96.4%)
unique	45
min	1.000
max	51.000
mean	17.338
median	15.000
std	10.411
q1	9.000
q3	23.000
iqr	14.000
skew	1.115
kurtosis	0.689
n_outliers	40
outlier_rate	0.024
zero_rate	0.000

Histogram bins for States (median: 15.0).
bin	count
1 – 2.25	13
2.25 – 3.5	9
3.5 – 4.75	2
4.75 – 6	6
6 – 7.25	125
7.25 – 8.5	224
8.5 – 9.75	87
9.75 – 11	95
11 – 12.25	229
12.25 – 13.5	20
13.5 – 14.75	14
14.75 – 16	15
16 – 17.25	146
17.25 – 18.5	23
18.5 – 19.75	49
19.75 – 21	40
21 – 22.25	19
22.25 – 23.5	297
23.5 – 24.75	4
24.75 – 26	1
26 – 27.25	0
27.25 – 28.5	0
28.5 – 29.75	17
29.75 – 31	5
31 – 32.25	29
32.25 – 33.5	6
33.5 – 34.75	10
34.75 – 36	12
36 – 37.25	55
37.25 – 38.5	12
38.5 – 39.75	23
39.75 – 41	14
41 – 42.25	18
42.25 – 43.5	0
43.5 – 44.75	0
44.75 – 46	3
46 – 47.25	11
47.25 – 48.5	8
48.5 – 49.75	5
49.75 – 51	13

Counties numeric

96.4% null

rows	45,716
null	44,057 (96.4%)
unique	662
min	5.000
max	3,210
mean	1,353
median	1,195
std	994.089
q1	482.000
q3	2,113
iqr	1,631
skew	0.237
kurtosis	-1.190
n_outliers	0
outlier_rate	0.000
zero_rate	0.000

Histogram bins for Counties (median: 1195.0).
bin	count
5 – 85.12	303
85.12 – 165.2	8
165.2 – 245.4	17
245.4 – 325.5	26
325.5 – 405.6	20
405.6 – 485.8	64
485.8 – 565.9	8
565.9 – 646	34
646 – 726.1	20
726.1 – 806.2	113
806.2 – 886.4	59
886.4 – 966.5	46
966.5 – 1047	49
1047 – 1127	25
1127 – 1207	40
1207 – 1287	57
1287 – 1367	28
1367 – 1447	25
1447 – 1527	16
1527 – 1608	10
1608 – 1688	13
1688 – 1768	11
1768 – 1848	8
1848 – 1928	13
1928 – 2008	198
2008 – 2088	25
2088 – 2168	21
2168 – 2248	28
2248 – 2329	28
2329 – 2409	69
2409 – 2489	18
2489 – 2569	34
2569 – 2649	11
2649 – 2729	34
2729 – 2809	19
2809 – 2890	18
2890 – 2970	28
2970 – 3050	19
3050 – 3130	24
3130 – 3210	72