saturn

/home/coolhand/html/datavis/data_trove/data/wild/nasa_meteorites.csv 45,716 rows sample n=45,716 seed 42 2026-06-22T00:21:51+00:00

Overview

Source           /home/coolhand/html/datavis/data_trove/data/wild/nasa_meteorites.csv
Total rows       45,716
Profiled sample  45,716
Columns          20
Generated        2026-06-22T00:21:51+00:00
Per-column null rate across the corpus.
column        kind         null %
sid           text         0.0%
id            text         0.0%
position      numeric      0.0%
created_at    numeric      0.0%
created_meta  unknown      0.0%
updated_at    numeric      0.0%
updated_meta  unknown      0.0%
meta          categorical  0.0%
name          text         0.0%
id_1          numeric      0.0%
nametype      categorical  0.0%
recclass      categorical  0.0%
mass (g)      numeric      0.3%
fall          categorical  0.0%
year          categorical  0.6%
reclat        numeric      16.0%
reclong       numeric      16.0%
GeoLocation   text         16.0%
States        numeric      96.4%
Counties      numeric      96.4%

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset high anthropic:default

This dataset is a NASA meteorite landings catalogue of 45,716 uniquely named meteorite records, with attributes including mass, classification, year, and geographic coordinates. The most striking feature is the mass distribution: the median mass is just 32.6 g but the maximum reaches 60,000,000 g, producing extreme skew (skew=76.9) and 7,086 flagged outliers, with a small number of enormous meteorites pulling the mean up to 13,278 g. A second key finding is that 97.6% of records are classified as 'Found' rather than 'Fell', meaning nearly all entries are meteorites discovered on the ground rather than witnessed falling, which has strong implications for geographic and temporal bias in the data. The meteorite classification column (recclass) spans 466 types, dominated by ordinary chondrites (L6, H5, L5). Eight of the ten most frequent years fall between 1997 and 2006, with 1979 and 1988 also prominent, a pattern likely tied to Antarctic collection campaigns.

id high anthropic:default

This column contains UUID-formatted identifiers serving as a primary key: all 45,716 values are exactly 36 characters long and fully unique, with zero duplicates and zero nulls. Notably, all sampled values share the prefix '00000000-0000-0000-', so the first 64 bits, including the version nibble in the third group, are zeroed out. That is atypical of standard UUID v4 generation and may indicate a custom or synthetic ID scheme that populates only the lower 64 bits. The allcaps_rate of 1.0 reflects uppercase hex digits, which matters if downstream systems compare IDs case-sensitively.
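
The zero-prefix observation can be checked mechanically. The following is a minimal sketch using Python's standard `uuid` module; the helper name `has_zero_upper_half` is hypothetical, and the two inputs are sample values copied from this report.

```python
import uuid

def has_zero_upper_half(value: str) -> bool:
    """Return True when the first 64 bits of a UUID string are all zero."""
    parsed = uuid.UUID(value)  # raises ValueError on malformed input
    # The first three hyphenated groups (8 + 4 + 4 hex digits) are the upper 64 bits.
    return (parsed.int >> 64) == 0

samples = [
    "00000000-0000-0000-160C-D58AE0A2ECD9",
    "00000000-0000-0000-D601-94E78E43026D",
]
flags = [has_zero_upper_half(s) for s in samples]
```

Running the same check over the full column would show whether the zeroed prefix holds for every row or only for the sampled values.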

sid high anthropic:default

This column is a Socrata-style row identifier: every sampled value is 'row-' followed by three four-character lowercase alphanumeric groups, joined by separators that vary among '-', '.', '_' and '~' (for example 'row-2gxt~mwbj.kygv'). Every value is exactly 18 characters long (len_min = len_max = len_mean = 18.0), and all 45,716 rows are unique with a duplicate_rate of 0.0 and null_rate of 0.0, making it a clean surrogate key. Nothing here is surprising; the column is consistent with an auto-generated system identifier from a Socrata open-data platform.
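
A format check for these identifiers can be sketched with a regular expression. The pattern below is an assumption inferred only from the ten sample values in this report (four separator characters, lowercase alphanumeric groups); it is not a documented Socrata specification, and `SID_PATTERN` and `is_valid_sid` are hypothetical names.

```python
import re

# Assumption: separators are limited to the four characters seen in the samples.
SID_PATTERN = re.compile(r"row-[a-z0-9]{4}[-._~][a-z0-9]{4}[-._~][a-z0-9]{4}")

def is_valid_sid(value: str) -> bool:
    """Return True when the whole string matches the inferred sid layout."""
    return SID_PATTERN.fullmatch(value) is not None

samples = ["row-r47i_enas-8d5d", "row-2gxt~mwbj.kygv", "row-8f4u.pck5.95b7"]
results = [is_valid_sid(s) for s in samples]
```

Any value that fails the check on the full column would be worth inspecting before the pattern is relied on.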

mass (g) high anthropic:default

This column records meteorite mass in grams. The median is just 32.6 g, while the mean is 13,278 g and the maximum reaches 60,000,000 g (60 tonnes), so a small fraction of very massive specimens drags the mean far above the typical value; a skew of 76.9 and kurtosis of 6,796 indicate an extreme long tail. The outlier flag uses a 1.5 × IQR rule, and with q1 = 7.2 g and q3 = 202.6 g it marks 15.5% of rows (7,086) as outliers. Most of those are only moderately heavy: the histogram shows just 41 specimens above 1,500,000 g, and it is these few giants that dominate the aggregate statistics.
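
To make the outlier threshold concrete, the sketch below computes Tukey fences from the reported quartiles. It assumes saturn applies the conventional 1.5 × IQR rule on both sides, which the "beyond 1.5 IQR" flag suggests but the report does not spell out; `tukey_fences` is a hypothetical helper name.

```python
from typing import Tuple

def tukey_fences(q1: float, q3: float, k: float = 1.5) -> Tuple[float, float]:
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported for mass (g): q1 = 7.2, q3 = 202.6
lower, upper = tukey_fences(7.2, 202.6)
```

With these inputs the upper fence is about 495.7 g and the lower fence is negative, so under this assumption only the upper fence can ever be crossed and any specimen heavier than roughly half a kilogram counts as an outlier.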

GeoLocation high anthropic:default

GeoLocation stores geographic coordinates as serialized Python list strings of the form [None, 'LAT', 'LON', None, False], where LAT and LON are the latitude and longitude as quoted decimal strings; this appears to be a structured geo-point object flattened to text. The most common value, '[None, '0.0', '0.0', None, False]', appears 6,214 times and is almost certainly a null/unknown sentinel rather than a genuine location on the equator at the prime meridian, so true missingness is higher than the 16% null rate suggests. The duplicate rate is high at 55.5% (21,301 duplicates across 17,100 unique values), consistent with many records sharing the same coordinates. The column should be parsed into numeric latitude and longitude fields rather than used as raw text.
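
A minimal parsing sketch follows. It assumes the second list element is latitude and the third is longitude (the sample pair '-71.5', '35.66667' matches the reclat and reclong medians), and it treats the (0.0, 0.0) pair as unknown, which is an interpretation rather than something the profile proves. `parse_geolocation` is a hypothetical helper name.

```python
import ast
from typing import Optional, Tuple

def parse_geolocation(raw: Optional[str]) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude), or None for nulls and the (0.0, 0.0) placeholder."""
    if raw is None:
        return None
    fields = ast.literal_eval(raw)  # safely evaluates the Python-literal list
    lat, lon = float(fields[1]), float(fields[2])
    if lat == 0.0 and lon == 0.0:
        return None  # assumption: (0.0, 0.0) marks an unknown location
    return lat, lon

point = parse_geolocation("[None, '38.5', '-94.3', None, False]")
placeholder = parse_geolocation("[None, '0.0', '0.0', None, False]")
```

`ast.literal_eval` is used instead of `eval` because it accepts only literals, so a malformed or hostile string raises an error instead of executing code.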

fall high anthropic:default

This column captures whether a meteorite was discovered on the ground ('Found') or observed falling ('Fell'), making it a binary label for recovery type. The distribution is severely imbalanced: 'Found' accounts for 97.6% of the 45,716 records (44,609), while 'Fell' represents only 2.4% (1,107). The entropy ratio of 0.164 is far below the maximum of 1.0 for a balanced two-class column, which is what triggered the imbalance alert. A model that uses this column as a target would likely need class weighting or resampling.
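
The entropy ratio can be reproduced from the two counts. The sketch below assumes the ratio is Shannon entropy in bits divided by log2 of the cardinality; that formula is not stated in the report, but it is consistent with the reported figures (for recclass, 4.548 / log2(466) is about 0.513). `entropy_ratio` is a hypothetical helper name.

```python
import math
from typing import List

def entropy_ratio(counts: List[int]) -> float:
    """Shannon entropy (bits) of the counts, divided by log2(number of categories)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    # A single-category column has no uncertainty; report 0.0 rather than divide by zero.
    return entropy / max_entropy if max_entropy > 0 else 0.0

fall_ratio = entropy_ratio([44609, 1107])    # Found, Fell
nametype_ratio = entropy_ratio([45641, 75])  # Valid, Relict
```

Under this assumption the two results land close to the reported 0.164 for fall and 0.018 for nametype.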

meta high anthropic:default

This column is a metadata field that contains exclusively the empty object literal '{ }' across all 45,716 rows, with zero nulls and a cardinality of 1. It carries no information whatsoever — entropy is 0.0 and top_rate is 1.0, meaning every single record is identical. This is almost certainly an unfilled placeholder or a defaulted JSON field that was never populated.

nametype high anthropic:default

This column is a name-type flag distinguishing standard meteorite names ('Valid') from 'Relict' entries; in meteorite nomenclature 'relict' usually denotes heavily altered material of meteoritic origin rather than a superseded name, though the profile itself does not define the term. The distribution is extremely imbalanced: 45,641 of 45,716 records (99.84%) are 'Valid', with only 75 'Relict' entries. The near-zero entropy ratio (0.018) shows the column carries almost no information, which is what triggered the imbalance alert.

name high anthropic:default

This column contains meteorite names, which in the sampled values are built from a find locality plus, for many entries, a serial number (for example 'Queen Alexandra Range 97947' or 'Yamato 790306'). That explains why the top words ('range', 'hills', 'mountains', 'northwest', 'africa', 'grove', 'yamato', 'Queen Alexandra') look like components of place names. Every one of the 45,716 rows has a distinct value (duplicate_rate 0.0, n_unique = 45,716) with zero nulls, making it a natural identifier. The mean word count of 2.77 and median of 3.0 confirm multi-token names rather than single labels, while 'yamato' appearing 3,317 times as the top individual word points to a large group of finds sharing the Yamato locality name, so words repeat heavily even though full names are unique.
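
If the locality part of a name is needed on its own (for grouping, say), it can be separated from a trailing serial number. This is a sketch under the assumption that a serial is a final run of digits preceded by whitespace, as in the sample values; `split_name` is a hypothetical helper name and the pattern has not been validated against the full column.

```python
import re
from typing import Optional, Tuple

# Group 1: locality (lazy match); group 2: trailing digits after whitespace.
_NAME_PATTERN = re.compile(r"(.+?)\s+(\d+)")

def split_name(name: str) -> Tuple[str, Optional[str]]:
    """Split 'Locality 12345' into ('Locality', '12345'); names without a serial keep None."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        return name, None
    return match.group(1), match.group(2)

parts = [split_name(n) for n in ["Atoka", "Queen Alexandra Range 97947", "Superior Valley 005"]]
```

The serial is kept as a string so that leading zeros such as '005' survive.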

created_at high anthropic:default

This column is a Unix timestamp named 'created_at', but every single one of its 45,716 non-null rows holds the identical value 1446143734 (approximately 2015-10-29 UTC), triggering a 'constant' alert. With n_unique of 1, std of 0.0, and IQR of 0.0, the column carries zero information variance — strongly suggesting a bulk-load default, a data pipeline bug, or a one-time snapshot import where timestamps were not properly captured. This is a critical data quality issue that renders the column useless as a temporal signal.
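
The epoch value can be converted to a readable UTC datetime with the standard library, as in this short sketch. It assumes the integer is in seconds since 1970-01-01 UTC, which its magnitude suggests.

```python
from datetime import datetime, timezone

created_at = 1446143734  # the single value held by every row
as_utc = datetime.fromtimestamp(created_at, tz=timezone.utc)
iso_text = as_utc.isoformat()
```

Here `iso_text` is '2015-10-29T18:35:34+00:00', which agrees with the approximate date given above.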

position high anthropic:default

This column, named 'position', is a numeric field that is entirely constant — every one of its 45,716 non-null values is exactly 0.0 (zero_rate = 1.0, n_unique = 1). It carries zero information and would contribute nothing to any model or analysis. This is flagged as a constant alert, confirming it is safe to drop.
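
Constant columns such as position, created_at, updated_at and meta can be found programmatically before modelling. The sketch below works on a plain dict of column name to values; `constant_columns` and the toy table are hypothetical and only illustrate the check.

```python
from typing import Dict, List, Sequence

def constant_columns(table: Dict[str, Sequence[object]]) -> List[str]:
    """Return the names of columns whose non-null values are all identical."""
    constant = []
    for name, values in table.items():
        distinct = {v for v in values if v is not None}
        if len(distinct) in (0, 1):
            constant.append(name)
    return constant

toy = {
    "position": [0.0, 0.0, 0.0],
    "created_at": [1446143734, 1446143734, 1446143734],
    "mass (g)": [21.0, 720.0, 107000.0],
}
to_drop = constant_columns(toy)
```

Columns that are entirely null also come back from this check, which is usually the desired behaviour when pruning uninformative features.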

updated_at high anthropic:default

This column is a Unix epoch timestamp named `updated_at`, representing a last-modified datetime for each row. Every single one of the 45,716 non-null records holds the identical value 1446143734 (approximately 2015-10-29 UTC), meaning the column is a constant — it carries zero information variance. This strongly suggests a bulk data load or migration event where all rows were stamped with the same timestamp rather than tracking real update times.

created_meta low anthropic:default

The column 'created_meta' is likely a creation timestamp or metadata field associated with record provenance, but saturn classified it as 'unknown' kind and skipped all profiling, yielding zero stats and no uniqueness count. With 45,716 rows, zero nulls, and no further signal available, its actual dtype, distribution, and content cannot be assessed from this evidence alone.

updated_meta low anthropic:default

The column 'updated_meta' was skipped by the profiler, yielding no stats, no uniqueness count, and no type resolution beyond 'unknown'. With 45,716 non-null rows and a null rate of 0.0, the column is fully populated, but its content, structure, and distribution are entirely opaque from the available evidence. The name suggests it may hold metadata update timestamps or serialized metadata objects (e.g., JSON blobs), but nothing in the evidence confirms this.

id_1 high anthropic:default

This column is almost certainly a row or entity identifier: it has 45,716 unique values across 45,716 rows with zero nulls and zero duplicates, indicating a perfect 1-to-1 mapping. Values run from 1 to 57,458, suggesting either a sparse sequential ID (gaps exist since max > n) or a pre-filtered subset of a larger table. The near-uniform distribution (kurtosis −1.16, skew 0.27, zero outliers) is consistent with a sequential or pseudo-random integer key rather than a meaningful numeric feature.

reclat high anthropic:default

This column represents the recorded latitude of meteorite find/fall locations, with values ranging from -87.37° to +81.17°, within valid latitude bounds. Two signals stand out. The median of -71.5° shows that at least half of the non-null records lie at high southern latitudes (likely Antarctic recovery sites). Yet Q3 is exactly 0.0° and the zero_rate is 16.8%, so about one value in six is exactly zero. Given that GeoLocation's most common value is the (0.0, 0.0) pair, most of these zeros are probably placeholders for unknown coordinates rather than true equatorial finds, and they would add to the 16% of rows that are already null. Kurtosis of -1.48 is consistent with a flat, bimodal-like distribution rather than a normal one.

reclong high anthropic:default

This column represents the recorded longitude of meteorite landing or find locations, covering a range from -165.43° to 354.47°. The maximum value of 354.47 is surprising, because longitude is conventionally expressed within -180° to 180°. The histogram shows only one value above 185.5°, so this looks like an isolated entry recorded on a 0–360° scale (or a data-entry error) rather than a systematic convention, but it may still be misplaced on a map if not normalised. The zero_rate of 16.2% sits alongside a null_rate of 16%, and GeoLocation's most common value is the (0.0, 0.0) pair, which strongly implies that the zero-filled values are placeholder/missing entries rather than genuine coordinates on the prime meridian. The distribution is near-symmetric (skew -0.17, kurtosis -0.73) with a large IQR of 157.17, consistent with a globally spread geographic variable.
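
Out-of-range longitudes can be wrapped back into the conventional range with modular arithmetic. This sketch assumes the out-of-range value really is a 0–360° encoding and not a typo, which the profile cannot confirm; `normalise_longitude` is a hypothetical helper name.

```python
def normalise_longitude(lon: float) -> float:
    """Map any longitude in degrees onto the half-open range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0

wrapped_max = normalise_longitude(354.473)    # reported maximum
unchanged_min = normalise_longitude(-165.433)  # reported minimum, already in range
```

The reported maximum maps to about -5.527°, while values already inside the range are returned unchanged up to floating-point rounding. Note that an input of exactly 180° maps to -180° under this convention.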

recclass high anthropic:default

This column contains meteorite classification codes, identifying the mineralogical and petrologic type of each recovered specimen. The top 7 values (L6, H5, L5, H6, H4, LL5, LL6) are all ordinary chondrite classes and together account for 33,771 records, about 73.9% of the 45,716 rows, with L6 alone representing 18.1%. Despite 466 unique classes, the entropy ratio of 0.51 indicates moderate concentration. Counts fall off quickly (the tenth-ranked class, CM2, has only 416 occurrences), so the long tail of rare classes will create many sparse dummy variables if one-hot encoded naively.
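
One common way to tame a long categorical tail before one-hot encoding is to lump rare classes into a single bucket. The sketch below illustrates that idea; the threshold, the 'Other' label and the helper name `lump_rare` are all hypothetical choices, not something saturn prescribes.

```python
from collections import Counter
from typing import Iterable, List

def lump_rare(values: Iterable[str], min_count: int, other: str = "Other") -> List[str]:
    """Replace every category seen fewer than min_count times with a shared label."""
    values = list(values)
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

toy = ["L6", "L6", "L6", "H5", "H5", "CM2", "Howardite"]
lumped = lump_rare(toy, min_count=2)
```

On the real column the threshold would need tuning, since lumping scientifically distinct classes together discards information that some analyses depend on.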

year high anthropic:default

This column represents a calendar year, stored as full ISO-8601 timestamps normalised to January 1st of each year (e.g. '2003-01-01T00:00:00'), so the month, day, and time components carry no information. Despite being profiled as categorical, it is effectively an annual time dimension. The 20 top values shown all fall between 1979 and 2010, but cardinality is 266 distinct values, so the full range must be much wider than the top values suggest, or include malformed or unexpected entries worth inspecting. The top year (2003) accounts for just 7.3% of rows, indicating a reasonably spread distribution rather than heavy concentration.
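
Reducing the timestamps to integer years makes the column usable as a numeric time dimension. This sketch assumes every non-null value follows the 'YYYY-01-01T00:00:00' layout seen in the top values; `year_of` is a hypothetical helper name.

```python
from datetime import datetime
from typing import Optional

def year_of(value: Optional[str]) -> Optional[int]:
    """Extract the calendar year from an ISO-8601 timestamp string; nulls pass through."""
    if value is None:
        return None
    # Raises ValueError on malformed input, which surfaces unexpected entries early.
    return datetime.fromisoformat(value).year

years = [year_of(v) for v in ["2003-01-01T00:00:00", "1979-01-01T00:00:00", None]]
```

Letting malformed strings raise, rather than silently coercing them, is a deliberate choice here given that 266 distinct values hint at entries outside the range shown.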

Numeric correlation

Pearson correlation across 9 numeric columns (values clipped to 2 decimals).
            position  created_at  updated_at  id_1   mass (g)  reclat  reclong  States  Counties
position    +nan      +nan        +nan        +nan   +nan      +nan    +nan     +nan    +nan
created_at  +nan      +nan        +nan        +nan   +nan      +nan    +nan     +nan    +nan
updated_at  +nan      +nan        +nan        +nan   +nan      +nan    +nan     +nan    +nan
id_1        +nan      +nan        +nan        +1.00  -0.04     +0.09   -0.18    -0.06   -0.09
mass (g)    +nan      +nan        +nan        -0.04  +1.00     +0.06   +0.02    +0.09   -0.01
reclat      +nan      +nan        +nan        +0.09  +0.06     +1.00   -0.56    +0.07   +0.02
reclong     +nan      +nan        +nan        -0.18  +0.02     -0.56   +1.00    -0.03   -0.08
States      +nan      +nan        +nan        -0.06  +0.09     +0.07   -0.03    +1.00   +0.15
Counties    +nan      +nan        +nan        -0.09  -0.01     +0.02   -0.08    +0.15   +1.00
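
The nan entries for position, created_at and updated_at follow from the definition of Pearson correlation: the denominator is the product of the two standard deviations, and a constant column has a standard deviation of zero. The sketch below is a plain-Python illustration of that behaviour, not saturn's implementation; `pearson` is a hypothetical helper name.

```python
import math
from typing import Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences; nan when either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return math.nan  # undefined: a constant column has no variance
    return cov / math.sqrt(var_x * var_y)

constant_vs_varying = pearson([0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
perfectly_linear = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Dropping the constant columns before computing the matrix would leave a 6 × 6 table with no undefined cells.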

Languages detected

Per-string language detection across text columns (sampled).

Per-language counts (total 4,211 detected strings).
lang  count  share
en    4209   100.0%
sh    2      0.0%

sid text

100.0% of rows are unique strings; 100.0% of rows are a single word; 95th-percentile length under 20 chars
rows                      45,716
null                      0 (0.0%)
unique                    45,716
len_min                   18
len_max                   18
len_mean                  18.000
len_median                18.000
len_p95                   18.000
word_mean                 1.000
word_median               1.000
n_empty                   0
n_duplicates              0
duplicate_rate            0.000
vocab_size                20,000
readability_flesch_mean   -5.680
emoji_rate                0.000
url_rate                  0.000
one_word_rate             1.000
allcaps_rate              0.000
boilerplate_rate          0.000
Character-length distribution for sid (mean: 18.0): all 45,716 values are exactly 18 characters long.
Sample values (first 10)
  1. row-r47i_enas-8d5d
  2. row-2gxt~mwbj.kygv
  3. row-97xa-5e59-vagr
  4. row-7cp3~x6x4.vm6k
  5. row-3wek_jm8i.pni8
  6. row-y6uh~zk3x.wra8
  7. row-t5mj_vvcr~wn2s
  8. row-ki6c~wwdn-e92u
  9. row-8f4u.pck5.95b7
  10. row-7i8i-ffdi~r4gj

id text

100.0% of rows are unique strings; 100.0% of rows are a single word; 100.0% of rows are all-caps
rows                      45,716
null                      0 (0.0%)
unique                    45,716
len_min                   36
len_max                   36
len_mean                  36.000
len_median                36.000
len_p95                   36.000
word_mean                 1.000
word_median               1.000
n_empty                   0
n_duplicates              0
duplicate_rate            0.000
vocab_size                20,000
readability_flesch_mean   65.384
emoji_rate                0.000
url_rate                  0.000
one_word_rate             1.000
allcaps_rate              1.000
boilerplate_rate          0.000
Character-length distribution for id (mean: 36.0): all 45,716 values are exactly 36 characters long.
Sample values (first 10)
  1. 00000000-0000-0000-160C-D58AE0A2ECD9
  2. 00000000-0000-0000-D601-94E78E43026D
  3. 00000000-0000-0000-5666-66D2DD76BE81
  4. 00000000-0000-0000-FAAF-5F9E2B6945C2
  5. 00000000-0000-0000-7DCE-D65B4702C6F1
  6. 00000000-0000-0000-A883-F7AB796E948C
  7. 00000000-0000-0000-8F12-152021CC7C30
  8. 00000000-0000-0000-B720-5ED557EDEF92
  9. 00000000-0000-0000-57BA-C8C8DC37D4C8
  10. 00000000-0000-0000-7DF4-0F5269CA84E7

position numeric

only one distinct value
rows          45,716
null          0 (0.0%)
unique        1
min           0.000
max           0.000
mean          0.000
median        0.000
std           0.000
q1            0.000
q3            0.000
iqr           0.000
skew          0.000
kurtosis      0.000
n_outliers    0
outlier_rate  0.000
zero_rate     1.000
Histogram bins for position (median: 0.0): all 45,716 values fall in the single bin 0 – 0.025; the other 39 bins between -0.5 and 0.5 are empty.

created_at numeric

only one distinct value
rows          45,716
null          0 (0.0%)
unique        1
min           1,446,143,734
max           1,446,143,734
mean          1,446,143,734
median        1,446,143,734
std           0.000
q1            1,446,143,734
q3            1,446,143,734
iqr           0.000
skew          0.000
kurtosis      0.000
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for created_at (median: 1446143734.0): all 45,716 values fall in a single bin at 1.446e+09; the other 39 bins are empty.

created_meta unknown

no profiler for kind=unknown
rows  45,716
null  0 (0.0%)

updated_at numeric

only one distinct value
rows          45,716
null          0 (0.0%)
unique        1
min           1,446,143,734
max           1,446,143,734
mean          1,446,143,734
median        1,446,143,734
std           0.000
q1            1,446,143,734
q3            1,446,143,734
iqr           0.000
skew          0.000
kurtosis      0.000
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for updated_at (median: 1446143734.0): all 45,716 values fall in a single bin at 1.446e+09; the other 39 bins are empty.

updated_meta unknown

no profiler for kind=unknown
rows  45,716
null  0 (0.0%)

meta categorical

top value is 100.0% of rows
rows           45,716
null           0 (0.0%)
unique         1
top_value      { }
top_rate       1.000
cardinality    1
entropy        0.000
entropy_ratio  0.000
Top values for meta (1 unique shown, of 1 total).
value  count  share
{ }    45716  100.0%
Top values (rank 1–20)
  1. { } — 45,716

name text

100.0% of rows are unique strings
rows                      45,716
null                      0 (0.0%)
unique                    45,716
len_min                   2
len_max                   28
len_mean                  17.785
len_median                19.000
len_p95                   27.000
word_mean                 2.772
word_median               3.000
n_empty                   0
n_duplicates              0
duplicate_rate            0.000
vocab_size                17,917
readability_flesch_mean   63.744
emoji_rate                0.000
url_rate                  0.000
one_word_rate             0.047
allcaps_rate              0.000
boilerplate_rate          0.000
Character-length distribution for name (mean: 17.785).
chars    count
2 – 3    2
3 – 3    16
3 – 4    0
4 – 5    97
5 – 5    224
5 – 6    0
6 – 7    420
7 – 7    430
7 – 8    0
8 – 8    449
8 – 9    958
9 – 10   0
10 – 10  1392
10 – 11  1235
11 – 12  0
12 – 12  3961
12 – 13  5478
13 – 14  0
14 – 14  297
14 – 15  0
15 – 16  1217
16 – 16  237
16 – 17  0
17 – 18  2772
18 – 18  2194
18 – 19  0
19 – 20  1826
20 – 20  3427
20 – 21  0
21 – 22  8562
22 – 22  5145
22 – 23  0
23 – 23  1466
23 – 24  33
24 – 25  0
25 – 25  405
25 – 26  21
26 – 27  0
27 – 27  3398
27 – 28  54
Sample values (first 10)
  1. Atoka
  2. Queen Alexandra Range 97947
  3. Yamato 790306
  4. Superior Valley 005
  5. Larkman Nunatak 06417
  6. Yamato 982185
  7. Lewis Cliff 86041
  8. Yamato 791912
  9. Tanezrouft 033
  10. Al Huwaysah 005

id_1 numeric

rows          45,716
null          0 (0.0%)
unique        45,716
min           1.000
max           57,458
mean          26,890
median        24,262
std           16,861
q1            12,689
q3            40,657
iqr           27,968
skew          0.267
kurtosis      -1.160
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for id_1 (median: 24261.5).
bin                    count
1 – 1437               1354
1437 – 2874            1151
2874 – 4310            814
4310 – 5747            1270
5747 – 7183            1416
7183 – 8620            1428
8620 – 1.006e+04       1433
1.006e+04 – 1.149e+04  1404
1.149e+04 – 1.293e+04  1394
1.293e+04 – 1.437e+04  1437
1.437e+04 – 1.58e+04   1415
1.58e+04 – 1.724e+04   1414
1.724e+04 – 1.867e+04  1420
1.867e+04 – 2.011e+04  1423
2.011e+04 – 2.155e+04  1437
2.155e+04 – 2.298e+04  1432
2.298e+04 – 2.442e+04  1368
2.442e+04 – 2.586e+04  1296
2.586e+04 – 2.729e+04  1368
2.729e+04 – 2.873e+04  900
2.873e+04 – 3.017e+04  1368
3.017e+04 – 3.16e+04   1078
3.16e+04 – 3.304e+04   529
3.304e+04 – 3.448e+04  763
3.448e+04 – 3.591e+04  1205
3.591e+04 – 3.735e+04  1049
3.735e+04 – 3.878e+04  582
3.878e+04 – 4.022e+04  810
4.022e+04 – 4.166e+04  392
4.166e+04 – 4.309e+04  0
4.309e+04 – 4.453e+04  186
4.453e+04 – 4.597e+04  1300
4.597e+04 – 4.74e+04   1281
4.74e+04 – 4.884e+04   1413
4.884e+04 – 5.028e+04  1129
5.028e+04 – 5.171e+04  1274
5.171e+04 – 5.315e+04  1022
5.315e+04 – 5.459e+04  1219
5.459e+04 – 5.602e+04  1275
5.602e+04 – 5.746e+04  1267

nametype categorical

top value is 99.8% of rows
rows           45,716
null           0 (0.0%)
unique         2
top_value      Valid
top_rate       0.998
cardinality    2
entropy        0.018
entropy_ratio  0.018
Top values for nametype (2 unique shown, of 2 total).
value   count  share
Valid   45641  99.8%
Relict  75     0.2%
Top values (rank 1–20)
  1. Valid — 45,641
  2. Relict — 75

recclass categorical

rows           45,716
null           0 (0.0%)
unique         466
top_value      L6
top_rate       0.181
cardinality    466
entropy        4.548
entropy_ratio  0.513
Top values for recclass (20 unique shown, of 466 total).
value        count  share
L6           8285   18.1%
H5           7142   15.6%
L5           4796   10.5%
H6           4528   9.9%
H4           4211   9.2%
LL5          2766   6.1%
LL6          2043   4.5%
L4           1253   2.7%
H4/5         428    0.9%
CM2          416    0.9%
H3           386    0.8%
L3           365    0.8%
CO3          335    0.7%
Ureilite     300    0.7%
Iron, IIIAB  285    0.6%
LL4          268    0.6%
CV3          256    0.6%
Diogenite    241    0.5%
Howardite    240    0.5%
LL           225    0.5%
Top values (rank 1–20)
  1. L6 — 8,285
  2. H5 — 7,142
  3. L5 — 4,796
  4. H6 — 4,528
  5. H4 — 4,211
  6. LL5 — 2,766
  7. LL6 — 2,043
  8. L4 — 1,253
  9. H4/5 — 428
  10. CM2 — 416
  11. H3 — 386
  12. L3 — 365
  13. CO3 — 335
  14. Ureilite — 300
  15. Iron, IIIAB — 285
  16. LL4 — 268
  17. CV3 — 256
  18. Diogenite — 241
  19. Howardite — 240
  20. LL — 225

mass (g) numeric

skew=+76.91; 15.5% of rows beyond 1.5 IQR
rows          45,716
null          131 (0.3%)
unique        12,576
min           0.000
max           60,000,000
mean          13,278
median        32.600
std           574,989
q1            7.200
q3            202.600
iqr           195.400
skew          76.908
kurtosis      6,796
n_outliers    7,086
outlier_rate  0.155
zero_rate     4.17e-04
Histogram bins for mass (g) (median: 32.6).
bin                  count
0 – 1.5e+06          45544
1.5e+06 – 3e+06      16
3e+06 – 4.5e+06      8
4.5e+06 – 6e+06      1
6e+06 – 7.5e+06      1
7.5e+06 – 9e+06      1
9e+06 – 1.05e+07     2
1.05e+07 – 1.2e+07   0
1.2e+07 – 1.35e+07   0
1.35e+07 – 1.5e+07   0
1.5e+07 – 1.65e+07   2
1.65e+07 – 1.8e+07   0
1.8e+07 – 1.95e+07   0
1.95e+07 – 2.1e+07   0
2.1e+07 – 2.25e+07   1
2.25e+07 – 2.4e+07   1
2.4e+07 – 2.55e+07   2
2.55e+07 – 2.7e+07   1
2.7e+07 – 2.85e+07   1
2.85e+07 – 3e+07     0
3e+07 – 3.15e+07     1
3.15e+07 – 3.3e+07   0
3.3e+07 – 3.45e+07   0
3.45e+07 – 3.6e+07   0
3.6e+07 – 3.75e+07   0
3.75e+07 – 3.9e+07   0
3.9e+07 – 4.05e+07   0
4.05e+07 – 4.2e+07   0
4.2e+07 – 4.35e+07   0
4.35e+07 – 4.5e+07   0
4.5e+07 – 4.65e+07   0
4.65e+07 – 4.8e+07   0
4.8e+07 – 4.95e+07   0
4.95e+07 – 5.1e+07   1
5.1e+07 – 5.25e+07   0
5.25e+07 – 5.4e+07   0
5.4e+07 – 5.55e+07   0
5.55e+07 – 5.7e+07   0
5.7e+07 – 5.85e+07   1
5.85e+07 – 6e+07     1

fall categorical

top value is 97.6% of rows
rows           45,716
null           0 (0.0%)
unique         2
top_value      Found
top_rate       0.976
cardinality    2
entropy        0.164
entropy_ratio  0.164
Top values for fall (2 unique shown, of 2 total).
value  count  share
Found  44609  97.6%
Fell   1107   2.4%
Top values (rank 1–20)
  1. Found — 44,609
  2. Fell — 1,107

year categorical

rows           45,716
null           291 (0.6%)
unique         266
top_value      2003-01-01T00:00:00
top_rate       0.073
cardinality    266
entropy        5.299
entropy_ratio  0.658
Top values for year (20 unique shown, of 266 total).
value                count  share
2003-01-01T00:00:00  3323   7.3%
1979-01-01T00:00:00  3046   6.7%
1998-01-01T00:00:00  2697   5.9%
2006-01-01T00:00:00  2456   5.4%
1988-01-01T00:00:00  2296   5.0%
2002-01-01T00:00:00  2078   4.5%
2004-01-01T00:00:00  1940   4.2%
2000-01-01T00:00:00  1792   3.9%
1997-01-01T00:00:00  1696   3.7%
1999-01-01T00:00:00  1691   3.7%
2001-01-01T00:00:00  1650   3.6%
1990-01-01T00:00:00  1518   3.3%
2009-01-01T00:00:00  1497   3.3%
1986-01-01T00:00:00  1375   3.0%
2007-01-01T00:00:00  1189   2.6%
2010-01-01T00:00:00  1005   2.2%
1993-01-01T00:00:00  979    2.1%
2008-01-01T00:00:00  957    2.1%
1987-01-01T00:00:00  916    2.0%
1991-01-01T00:00:00  877    1.9%
Top values (rank 1–20)
  1. 2003-01-01T00:00:00 — 3,323
  2. 1979-01-01T00:00:00 — 3,046
  3. 1998-01-01T00:00:00 — 2,697
  4. 2006-01-01T00:00:00 — 2,456
  5. 1988-01-01T00:00:00 — 2,296
  6. 2002-01-01T00:00:00 — 2,078
  7. 2004-01-01T00:00:00 — 1,940
  8. 2000-01-01T00:00:00 — 1,792
  9. 1997-01-01T00:00:00 — 1,696
  10. 1999-01-01T00:00:00 — 1,691
  11. 2001-01-01T00:00:00 — 1,650
  12. 1990-01-01T00:00:00 — 1,518
  13. 2009-01-01T00:00:00 — 1,497
  14. 1986-01-01T00:00:00 — 1,375
  15. 2007-01-01T00:00:00 — 1,189
  16. 2010-01-01T00:00:00 — 1,005
  17. 1993-01-01T00:00:00 — 979
  18. 2008-01-01T00:00:00 — 957
  19. 1987-01-01T00:00:00 — 916
  20. 1991-01-01T00:00:00 — 877

reclat numeric

rows          45,716
null          7,315 (16.0%)
unique        12,738
min           -87.367
max           81.167
mean          -39.123
median        -71.500
std           46.379
q1            -76.714
q3            0.000
iqr           76.714
skew          0.492
kurtosis      -1.477
n_outliers    0
outlier_rate  0.000
zero_rate     0.168
Histogram bins for reclat (median: -71.5).
bin              count
-87.37 – -83.15  7090
-83.15 – -78.94  1218
-78.94 – -74.73  4083
-74.73 – -70.51  9707
-70.51 – -66.3   1
-66.3 – -62.09   0
-62.09 – -57.87  0
-57.87 – -53.66  1
-53.66 – -49.45  0
-49.45 – -45.23  3
-45.23 – -41.02  11
-41.02 – -36.81  27
-36.81 – -32.59  91
-32.59 – -28.38  550
-28.38 – -24.17  436
-24.17 – -19.95  93
-19.95 – -15.74  35
-15.74 – -11.53  18
-11.53 – -7.313  19
-7.313 – -3.1    24
-3.1 – 1.113     6448
1.113 – 5.327    15
5.327 – 9.54     19
9.54 – 13.75     55
13.75 – 17.97    40
17.97 – 22.18    3197
22.18 – 26.39    315
26.39 – 30.61    2239
30.61 – 34.82    859
34.82 – 39.03    649
39.03 – 43.25    403
43.25 – 47.46    230
47.46 – 51.67    196
51.67 – 55.89    155
55.89 – 60.1     119
60.1 – 64.31     30
64.31 – 68.53    17
68.53 – 72.74    4
72.74 – 76.95    3
76.95 – 81.17    1

reclong numeric

rows          45,716
null          7,315 (16.0%)
unique        14,640
min           -165.433
max           354.473
mean          61.074
median        35.667
std           80.647
q1            0.000
q3            157.167
iqr           157.167
skew          -0.174
kurtosis      -0.731
n_outliers    0
outlier_rate  0.000
zero_rate     0.162
Histogram bins for reclong (median: 35.66667).
bin              count
-165.4 – -152.4  30
-152.4 – -139.4  228
-139.4 – -126.4  6
-126.4 – -113.4  444
-113.4 – -100.4  795
-100.4 – -87.45  462
-87.45 – -74.45  214
-74.45 – -61.45  1386
-61.45 – -48.45  57
-48.45 – -35.46  33
-35.46 – -22.46  2
-22.46 – -9.461  76
-9.461 – 3.536   6696
3.536 – 16.53    2208
16.53 – 29.53    1782
29.53 – 42.53    5243
42.53 – 55.53    1818
55.53 – 68.52    1420
68.52 – 81.52    2616
81.52 – 94.52    78
94.52 – 107.5    45
107.5 – 120.5    131
120.5 – 133.5    483
133.5 – 146.5    178
146.5 – 159.5    4052
159.5 – 172.5    7724
172.5 – 185.5    193
185.5 – 198.5    0
198.5 – 211.5    0
211.5 – 224.5    0
224.5 – 237.5    0
237.5 – 250.5    0
250.5 – 263.5    0
263.5 – 276.5    0
276.5 – 289.5    0
289.5 – 302.5    0
302.5 – 315.5    0
315.5 – 328.5    0
328.5 – 341.5    0
341.5 – 354.5    1

GeoLocation text

55.5% duplicate strings
rows                      45,716
null                      7,315 (16.0%)
unique                    17,100
len_min                   33
len_max                   47
len_mean                  40.305
len_median                41.000
len_p95                   45.000
word_mean                 5.000
word_median               5.000
n_empty                   0
n_duplicates              21,301
duplicate_rate            0.555
vocab_size                15,461
readability_flesch_mean   117.160
emoji_rate                0.000
url_rate                  0.000
one_word_rate             0.000
allcaps_rate              0.000
boilerplate_rate          0.000
Character-length distribution for GeoLocation (mean: 40.305).
chars    count
33 – 33  6214
33 – 34  0
34 – 34  27
34 – 34  0
34 – 35  0
35 – 35  139
35 – 35  0
35 – 36  0
36 – 36  1693
36 – 36  0
36 – 37  0
37 – 37  3283
37 – 38  0
38 – 38  0
38 – 38  527
38 – 39  0
39 – 39  0
39 – 39  488
39 – 40  0
40 – 40  0
40 – 40  5488
40 – 41  0
41 – 41  2086
41 – 41  0
41 – 42  0
42 – 42  3512
42 – 42  0
42 – 43  0
43 – 43  3760
43 – 44  0
44 – 44  0
44 – 44  2811
44 – 45  0
45 – 45  0
45 – 45  7678
45 – 46  0
46 – 46  0
46 – 46  672
46 – 47  0
47 – 47  23
Sample values (first 10)
  1. [None, '38.5', '-94.3', None, False]
  2. [None, '-84.0', '168.0', None, False]
  3. [None, '-71.5', '35.66667', None, False]
  4. [None, '-71.5', '35.66667', None, False]
  5. [None, '19.72767', '55.72677', None, False]
  6. [None, '0.0', '0.0', None, False]
  7. [None, '50.375', '21.73333', None, False]
  8. [None, '-71.5', '35.66667', None, False]
  9. [None, '-71.5', '35.66667', None, False]
  10. [None, '27.63556', '4.02528', None, False]

States numeric

96.4% null
rows          45,716
null          44,057 (96.4%)
unique        45
min           1.000
max           51.000
mean          17.338
median        15.000
std           10.411
q1            9.000
q3            23.000
iqr           14.000
skew          1.115
kurtosis      0.689
n_outliers    40
outlier_rate  0.024
zero_rate     0.000
Histogram bins for States (median: 15.0).
bin            count
1 – 2.25       13
2.25 – 3.5     9
3.5 – 4.75     2
4.75 – 6       6
6 – 7.25       125
7.25 – 8.5     224
8.5 – 9.75     87
9.75 – 11      95
11 – 12.25     229
12.25 – 13.5   20
13.5 – 14.75   14
14.75 – 16     15
16 – 17.25     146
17.25 – 18.5   23
18.5 – 19.75   49
19.75 – 21     40
21 – 22.25     19
22.25 – 23.5   297
23.5 – 24.75   4
24.75 – 26     1
26 – 27.25     0
27.25 – 28.5   0
28.5 – 29.75   17
29.75 – 31     5
31 – 32.25     29
32.25 – 33.5   6
33.5 – 34.75   10
34.75 – 36     12
36 – 37.25     55
37.25 – 38.5   12
38.5 – 39.75   23
39.75 – 41     14
41 – 42.25     18
42.25 – 43.5   0
43.5 – 44.75   0
44.75 – 46     3
46 – 47.25     11
47.25 – 48.5   8
48.5 – 49.75   5
49.75 – 51     13

Counties numeric

96.4% null
rows          45,716
null          44,057 (96.4%)
unique        662
min           5.000
max           3,210
mean          1,353
median        1,195
std           994.089
q1            482.000
q3            2,113
iqr           1,631
skew          0.237
kurtosis      -1.190
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for Counties (median: 1195.0).
bin            count
5 – 85.12      303
85.12 – 165.2  8
165.2 – 245.4  17
245.4 – 325.5  26
325.5 – 405.6  20
405.6 – 485.8  64
485.8 – 565.9  8
565.9 – 646    34
646 – 726.1    20
726.1 – 806.2  113
806.2 – 886.4  59
886.4 – 966.5  46
966.5 – 1047   49
1047 – 1127    25
1127 – 1207    40
1207 – 1287    57
1287 – 1367    28
1367 – 1447    25
1447 – 1527    16
1527 – 1608    10
1608 – 1688    13
1688 – 1768    11
1768 – 1848    8
1848 – 1928    13
1928 – 2008    198
2008 – 2088    25
2088 – 2168    21
2168 – 2248    28
2248 – 2329    28
2329 – 2409    69
2409 – 2489    18
2489 – 2569    34
2569 – 2649    11
2649 – 2729    34
2729 – 2809    19
2809 – 2890    18
2890 – 2970    28
2970 – 3050    19
3050 – 3130    24
3130 – 3210    72