Summary confidence: high
This dataset contains 22,043 fossil occurrence records with 21 columns spanning taxonomy (phylum, class, order, family, genus, name, rank), geography (country, state, lat/lon, paleolat/paleolng), and geologic age (early_age_mya, late_age_mya, period, late_interval). Taxonomy is dominated by Chordata (about 82% of rows), with Mammalia as the leading class (~32%) followed by Saurischia and Ornithischia; this points to a strong vertebrate and dinosaur emphasis, so these groups are a natural place to start. Geographically, the data skews heavily toward the US (~51%), with Wyoming, Montana, and New Mexico topping the state list, so any spatial analysis should account for this North American concentration. The age columns (early_age_mya, late_age_mya) are right-skewed, with medians around 100 Mya and ~11% of values flagged as outliers, which indicates a long tail of very old records. Note that 'collection' and 'formation' are entirely empty and should be ignored.
citing: row_count · column_count · phylum.top_values · class.top_values · country.top_values · state.top_values · early_age_mya.stats · late_age_mya.stats · rank.top_values · collection.stats · formation.stats
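
The checks behind this summary (empty columns, category shares, outlier rate) can be reproduced with a short script. The sketch below is illustrative only: the sample rows are made up and are not taken from the dataset, the helper names (empty_columns, value_shares, iqr_outlier_fraction) are hypothetical, and the 1.5 x IQR outlier rule is an assumption, since the summary does not say how outliers were flagged.

```python
from collections import Counter
from statistics import quantiles

# Made-up sample standing in for the real records (illustrative values only).
ages = [60, 62, 64, 66, 68, 70, 72, 74, 76, 500]
countries = ["US"] * 5 + ["CA"] * 2 + ["CN"] * 2 + ["AR"]
phyla = ["Chordata"] * 8 + ["Mollusca"] * 2
rows = [
    {"phylum": p, "country": c, "early_age_mya": a, "collection": None, "formation": ""}
    for p, c, a in zip(phyla, countries, ages)
]

def empty_columns(records):
    """Return column names whose value is missing (None or '') in every record."""
    return [
        col for col in records[0].keys()
        if all(r.get(col) in (None, "") for r in records)
    ]

def value_shares(records, column):
    """Return {value: fraction of records} for one column, skipping missing values."""
    counts = Counter(r[column] for r in records if r.get(column) not in (None, ""))
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

def iqr_outlier_fraction(values, k=1.5):
    """Fraction of values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return sum(1 for v in values if v < low or v > high) / len(values)

print("empty columns:", empty_columns(rows))
print("country shares:", value_shares(rows, "country"))
print("age outlier fraction:", iqr_outlier_fraction(ages))
```

Run against the real records, the same three helpers would be expected to flag 'collection' and 'formation' as empty and to show the US and Chordata concentrations described above; the outlier fraction will match the reported ~11% only if the original profile used the same rule.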