Centralized index of datasets spanning paranormal phenomena, natural disasters, linguistics, economics, culture, health, geography, and accessibility research
Sources of truth published on GitHub, Hugging Face, or Kaggle. Includes interactive explorers, data poems, and Jupyter analysis notebooks.
354,770 georeferenced mysterious phenomena (UFOs, Bigfoot, haunted places). Canonical source for all quirky geospatial research.
Demographic and linguistic data for 16,382 people groups. Powering the 'Souls' visualization series.
Global disability demographics and web accessibility metrics (WebAIM Million 2025).
County-level metrics for food deserts, healthcare access, housing affordability, and demographics.
25K+ games connected by 861K shared-reviewer edges. Includes full game metadata (2005-2025).
Unified dataset of NOAA storms, USGS earthquakes, NTSB accidents, and EPA Superfund sites.
Digital attention metrics from Wikipedia, Google Trends, and GDELT events.
4.2M linguistic relationships and cognate sets covering global language evolution.
7,900+ languages integrating ISO 639-3, Glottolog, speaker populations, and endangerment status.
NASA catalog of meteorite falls witnessed and recorded by observers, with coordinates, mass, and classification.
County-level housing affordability analysis with median rents, household incomes, cost burden, and homelessness data.
Veteran demographics merged from 5 sources: population, employment, homelessness, suicide rates, and firearm correlations.
Bluesky post data export for sentiment analysis and social network research.
NASA catalog of significant meteorite falls and finds over 10kg with coordinates, mass, and discovery year.
Ghost sightings, UFO reports, Bigfoot encounters, cryptids, crop circles, and roller coaster data.
354,770 georeferenced mysterious phenomena from NASA, NOAA, USGS, BFRO, NUFORC, OpenStreetMap, the Megalithic Portal, and the Shadowlands Haunted Places Index. Tornadoes, caves, UFOs, megaliths, meteorites, ghost towns, storms, haunted places, thermal springs, bigfoot sightings, earthquakes, shipwrecks, fireballs, and volcanoes.
Ghost sighting reports from The Shadowlands archive with location data, descriptions, and regional analysis.
Detailed breakdowns of UFO sightings from NUFORC data with aggregations by state, shape classification, region, year, and cross-tabulations.
Every report from the Bigfoot Field Researchers Organization, scraped county by county across all 50 US states and 7 Canadian provinces. Includes Class A/B/C classification, date, county, and description for each sighting. Phase 2 enrichment adds GPS coordinates, season, and nearest town from individual report pages.
Roller coaster specifications from RCDB including height, speed, length, inversions, manufacturer, and operating status.
Meteorites, earthquakes, exoplanets, solar flares, bird migration, coral reefs, waterfalls, volcanoes, planets, moons, and aurora forecasts.
NASA's catalog of significant meteorite falls and finds over 10kg with coordinates, mass, classification, and discovery year.
Significant earthquake events from USGS with magnitude, depth, location, and impact data.
Confirmed exoplanets from NASA with orbital parameters, mass, radius, discovery method, and host star properties.
NOAA space weather data including solar flares, coronal mass ejections, and geomagnetic storm forecasts.
eBird observation data showing bird species migration patterns, stopover locations, and seasonal timing across North America.
Global coral reef health monitoring data including bleaching events, temperature anomalies, and biodiversity assessments.
Notable waterfalls worldwide with height, flow rate, location, and accessibility information.
Smithsonian Institution's database of active and potentially active volcanoes with eruption histories and hazard assessments.
Solar system planet data: mass, diameter, orbital period, distance from sun, and atmospheric composition.
Natural satellites in our solar system with parent planet, orbital characteristics, physical properties, and discovery information.
NOAA Space Weather Prediction Center's Ovation aurora forecast model with predicted viewing probabilities by location and time.
Lighthouse locations from OpenStreetMap with height, year built, operator
Cave locations from OpenStreetMap with access info, tourism status, descriptions
Shipwreck locations from OpenStreetMap with year sunk, type, Wikipedia links
Megalithic site locations from OpenStreetMap including dolmens, menhirs, stone circles
Fossil occurrence records from the Paleobiology Database with taxonomic classification, age, and coordinates
Deep sea organism occurrences (depth >1000m) from OBIS with full taxonomic hierarchy
Bioluminescent organism occurrences from OBIS with taxonomic classification and coordinates
US tornado records from NOAA Storm Prediction Center with magnitude, injuries, fatalities, path data
571 NOAA weather alert records covering 43 event types including ball lightning, waterspouts, mammatus clouds, volcanic lightning, nacreous clouds, fire whirls, derechos, and sprites. Combines current NOAA alerts with documented rare atmospheric phenomena.
610 GBIF occurrence records for carnivorous plant species across 35 countries. Covers 123 species including Venus flytraps, sundews, pitcher plants, and bladderworts, with taxonomic classification and georeferenced coordinates.
Airplane crashes, shark attacks, lightning strikes, storms, temperature anomalies, wildfires, and disaster mashups.
Historical record of airplane crashes and fatalities worldwide from the first recorded airplane fatality (1908) through 2009. Includes operator, aircraft type, route, and casualty details.
Shark attack records from the Shark Research Institute covering provoked, unprovoked, and boat attacks with species, activity, and outcome details.
Daily lightning strike counts with geographic center points from NOAA's National Lightning Detection Network. Preprocessed data available for heatmap, monthly, and state-level analysis.
Combined disaster dataset merging NOAA storms, USGS earthquakes, NTSB aviation accidents, and EPA Superfund sites across the US.
Curated significant US storm events from NOAA with geographic coordinates, event types, and descriptions.
NASA GISS Surface Temperature Analysis showing global temperature anomalies relative to 1951-1980 baseline with monthly and annual data.
National Interagency Fire Center data on wildfire incidents including location, size, containment status, and fire perimeters.
National Flood Insurance Program claims data showing flood event frequency, severity, and financial impacts by county.
US Drought Monitor weekly drought severity classifications showing spatial extent and intensity of drought conditions.
Extreme heat event tracking with heat index calculations, duration, affected populations, and heat-related mortality statistics.
NYC 311, GDELT events, parking tickets, restaurant inspections, bridges, Wikipedia activity, Internet Archive, and Bluesky social.
Complete NYC 311 service request data including noise complaints, street conditions, housing issues, and response times by neighborhood.
Cross-platform attention metrics tracking global focus on the United States. Combines Wikipedia pageviews, GDELT global event mentions, and Google Trends search interest on a weekly cadence from 2020-2025.
GDELT Project's global event database tracking political, social, and conflict events worldwide from news media coverage.
Wikipedia pageview statistics showing trending articles, search patterns, and attention flows across topics and languages.
Anonymized Bluesky posts with VADER sentiment analysis. SHA-256 hashed DIDs/URIs, 44K unique authors. Dec 2-25, 2025. CC-BY-4.0 license.
NYC parking violation data with ticket types, fine amounts, vehicle types, and issuance locations for spatial and temporal analysis.
Internet Archive's digital lending library statistics showing book circulation patterns, popular titles, and access trends.
Federal Highway Administration bridge inventory with structural condition ratings, age, traffic volume, and deficiency status.
WALS, PHOIBLE, Glottolog, ISO 639-3, World Languages, CLMET, Historical Lemma, Language Families, Brown Corpus, COCA API, Historical Corpora, Joshua Project languages, and etymology atlas.
Language database integrating ISO 639-3, Glottolog, native names, speaker populations, endangerment status, and geographic distribution.
7,101 world languages with endangerment status on a scale from extinct to safe. Data integrates Glottolog, UNESCO Atlas of the World's Languages in Danger, and Ethnologue classifications. Powers the "Silence" data poetry visualization.
Typological feature database covering phonology, morphology, syntax, and lexicon across languages worldwide.
Phoneme inventory database with IPA symbols, features (consonant/vowel/tone), and cross-linguistic phonological patterns.
Language classification with family hierarchies, geographic coordinates, and historical relationships.
Official ISO 639-3 three-letter language codes with mappings to ISO 639-1 (2-letter) and ISO 639-2 (bibliographic) codes.
Etymological database from Lexibank CLDF data with cognate relationships, word forms, and language evolution patterns.
Demographic, linguistic, and religious data for 16,382 people groups worldwide from the Joshua Project API. Includes Bible translation status, evangelical percentages, and enriched Parquet files for fast analysis.
Billionaires, market caps, World Bank GDP, patents, retail locations, and NYC housing.
Forbes billionaire rankings with net worth, source of wealth, age, country, and industry classification.
Historical GDP data from World Bank with total GDP, per capita GDP, and growth rates for countries worldwide.
Market capitalization data for major corporations including Dow Jones Industrial Average components and Fortune 500 companies.
US Patent and Trademark Office data with patent classifications, filing dates, inventors, and citation networks.
Retail store location data for major chains including Walmart, Target, Kroger, CVS, Walgreens, and regional chains.
Census tract-level housing data for all five NYC boroughs: median rent, household income, rent burden, and tenure (owner vs renter). Includes borough and neighborhood GeoJSON boundaries with data joined.
Executive orders, tariff policies, federal workforce changes, market sector performance, and geopolitical events.
Executive orders issued by U.S. presidents with metadata including issue date, Federal Register number, title, status, and policy areas.
Historical and current tariff rates by product category, country, and time period. Includes weighted average rates and trade impact analysis.
Department of Government Efficiency proposed workforce reductions by federal agency, with personnel counts and policy rationale.
Economic sector performance indices tracking technology, finance, healthcare, energy, and other major market sectors over time.
Timeline of significant geopolitical events including military actions, diplomatic incidents, and international policy changes.
US favorability ratings and presidential confidence across 15 countries from the Pew Research Center Global Attitudes Survey (2020-2025). Tracks international perception of the United States over the Trump and Biden administrations.
Cheese, wine, edible insects, OpenFoodFacts, baby names, emoji, street art, social norms, food access, SNAP, and grocery density.
Cheese database with varieties, countries of origin, milk types, aging processes, and flavor profiles.
Wine varieties with regional classifications, grape composition, flavor profiles, and production statistics.
Edible insect species with nutritional profiles, regional consumption patterns, and sustainability metrics.
USDA atlas measuring food access by census tract with low-income/low-access designations and SNAP usage statistics.
State-level aggregation of USDA Food Access Atlas data showing percentage of population in food deserts and low-access areas.
Supplemental Nutrition Assistance Program data showing participation rates, benefit amounts, and coverage gaps by county.
Grocery store density measurements by county showing stores per capita and accessibility metrics for fresh food.
Complete Unicode emoji dataset with categories, release dates, names, and historical evolution of emoji standards.
Social Security Administration baby name data showing popularity trends, regional variations, and cultural naming patterns.
Open database of food products worldwide with nutrition facts, ingredients, allergens, labels, and Nutri-Score ratings.
Chocolate bar ratings with company, bean origin, cocoa percentage, and tasting notes
Hot sauces and peppers with Scoville ratings, origins, and species
175 hot pepper varieties with Scoville heat unit ranges, origin countries, flavor profiles, and common uses. Covers everything from Bell peppers (0 SHU) to Carolina Reapers (2.2M SHU). Source: PepperScale comprehensive pepper list.
Network graph of social actions and moral judgments from the Social Chemistry 101 dataset
Housing affordability, HUD PIT counts, eviction, CMS hospitals, Medicaid, hospital closures, food access, and veterans data (5 military datasets).
County-level inequality data for all ~3,200 US counties, keyed on 5-digit FIPS codes. Covers food deserts, healthcare access, housing affordability, hospital infrastructure, and veteran demographics. All from Census ACS, CMS, USDA, and HRSA.
County-level housing affordability analysis with median rents, household incomes, cost burden percentages, and homelessness data.
HUD Point-in-Time homeless population counts by CoC (Continuum of Care) with sheltered/unsheltered breakdowns and demographics.
Eviction Lab data showing eviction filing rates, judgment rates, and temporal trends across US counties.
Veteran demographics merged from 5 sources: population, employment, homelessness, suicide rates, and firearm ownership correlations.
VA and Census veteran population estimates by county with age, gender, and period-of-service breakdowns.
Bureau of Labor Statistics veteran employment data with labor force participation, unemployment, and industry breakdowns.
HUD Point-in-Time counts for veteran homelessness with sheltered/unsheltered status and CoC-level data.
VA data on veteran suicide rates by state with comparisons to civilian populations and mental health service utilization.
Centers for Medicare & Medicaid Services hospital data with quality ratings, ownership, emergency services, and patient experience scores.
State-level Medicaid enrollment data with monthly trends, eligibility categories, and managed care penetration.
Healthcare desert identification using provider-to-population ratios, travel distances, and HRSA shortage area designations.
Health Resources and Services Administration shortage area designations (HPSA) for primary care, dental, and mental health.
County analysis, SCARS, country centroids, election results, geology, county boundaries, air quality, ocean plastic, light pollution, infrastructure (submarine cables, satellites, internet topology), and endangered languages.
Standardized FIPS county codes with names, state associations, land area, population, and geographic centroids for spatial joins.
Unified county-level dataset integrating economic, demographic, health, housing, and infrastructure indicators for comparative analysis.
Geographic centroids for all countries with ISO 3166-1 alpha-2/alpha-3 codes, names, and population-weighted centers.
County-level presidential election results with total votes, party breakdown, and vote margins. Mergeable with other FIPS-keyed datasets.
Geological regions, mineral deposits, Cretaceous-era arc boundaries, and Deep South historical timeline features. Part of the SCARS project mapping geology against socioeconomic data.
County and state boundary polygons in GeoJSON format, including simplified versions for web rendering and full-resolution versions with data attributes joined.
EPA Air Quality Index data with daily measurements for PM2.5, PM10, ozone, and other criteria pollutants by monitoring station.
Ocean plastic pollution density estimates with gyres mapping, microplastic concentrations, and source attribution.
Light pollution measurements showing night sky brightness levels, Bortle scale classifications, and dark sky preserve locations.
Global submarine telecommunications cable routes with landing points, capacity, owners, and ready-for-service dates.
Two-Line Element (TLE) data for active satellites from CelesTrak with orbital parameters and classification.
UNESCO Atlas of Endangered Languages with vitality assessments, speaker populations, and documentation status.
SAP (Augmentative Communication), WLASL (Sign Language), ACS disability data, WebAIM web compliance, WHO global health, and accessibility audit tools.
Disability demographics (3,222 US counties, 185 WHO countries), web accessibility compliance (WebAIM Million 2025, screen reader survey), ADA lawsuits, Section 508 federal compliance, VizWiz blind user VQA, and WLASL sign language index.
Census ACS data on communication disabilities by state, covering hearing, vision, cognitive, and self-care difficulty rates.
Word-level American Sign Language video dataset with annotations, glosses, and metadata for sign language recognition research.
American Community Survey disability data with prevalence rates for hearing, vision, cognitive, ambulatory, self-care, and independent living disabilities.
Accessibility audit results and tools for WCAG 2.1 AA compliance checking including axe-core and Pa11y scan outputs.
Gaming, film, music, and satire datasets
Steam game catalog with titles, release dates, pricing, and platform availability
41 million Steam user game recommendations. File is 1.9 GB and gitignored.
14 million Steam user profiles. File is 185 MB and gitignored.
Bond franchise companion characters with actress/Bond actor ages, film financials in actual and inflation-adjusted USD
Boy band discography with members, active years, and chart performance