Centralized index of datasets spanning paranormal phenomena, natural disasters, linguistics, economics, culture, health, geography, and accessibility research
Ghost sightings, UFO reports, Bigfoot encounters, cryptids, crop circles, and roller coaster data.
354,770 georeferenced mysterious phenomena from NASA, NOAA, USGS, BFRO, NUFORC, OpenStreetMap, the Megalithic Portal, and the Shadowlands Haunted Places Index. Tornadoes, caves, UFOs, megaliths, meteorites, ghost towns, storms, haunted places, thermal springs, bigfoot sightings, earthquakes, shipwrecks, fireballs, and volcanoes.
Ghost sighting reports from The Shadowlands archive with location data, descriptions, and regional analysis.
Detailed breakdowns of UFO sightings from NUFORC data with aggregations by state, shape classification, region, year, and cross-tabulations.
Every report from the Bigfoot Field Researchers Organization, scraped county by county across all 50 US states and 7 Canadian provinces. Includes Class A/B/C classification, date, county, and description for each sighting. Phase 2 enrichment adds GPS coordinates, season, and nearest town from individual report pages.
Roller coaster specifications from RCDB including height, speed, length, inversions, manufacturer, and operating status.
European witch trial records with year, city, country, number tried and killed
Meteorites, earthquakes, exoplanets, solar flares, bird migration, coral reefs, waterfalls, volcanoes, planets, moons, and aurora forecasts.
NASA's catalog of significant meteorite falls and finds over 10kg with coordinates, mass, classification, and discovery year.
Significant earthquake events from USGS with magnitude, depth, location, and impact data.
Confirmed exoplanets from NASA with orbital parameters, mass, radius, discovery method, and host star properties.
NOAA space weather data including solar flares, coronal mass ejections, and geomagnetic storm forecasts.
eBird observation data showing bird species migration patterns, stopover locations, and seasonal timing across North America.
Global coral reef health monitoring data including bleaching events, temperature anomalies, and biodiversity assessments.
Notable waterfalls worldwide with height, flow rate, location, and accessibility information.
Smithsonian Institution's database of active and potentially active volcanoes with eruption histories and hazard assessments.
Solar system planet data: mass, diameter, orbital period, distance from sun, and atmospheric composition.
Natural satellites in our solar system with parent planet, orbital characteristics, physical properties, and discovery information.
NOAA Space Weather Prediction Center's Ovation aurora forecast model with predicted viewing probabilities by location and time.
Lighthouse locations from OpenStreetMap with height, year built, operator
Cave locations from OpenStreetMap with access info, tourism status, descriptions
Shipwreck locations from OpenStreetMap with year sunk, type, Wikipedia links
Megalithic site locations from OpenStreetMap including dolmens, menhirs, stone circles
Archaeological and historic ruin locations from OpenStreetMap with site type, period, Wikipedia links
Fossil occurrence records from the Paleobiology Database with taxonomic classification, age, and coordinates
Deep sea organism occurrences (depth >1000m) from OBIS with full taxonomic hierarchy
Bioluminescent organism occurrences from OBIS with taxonomic classification and coordinates
US tornado records from NOAA Storm Prediction Center with magnitude, injuries, fatalities, path data
Airplane crashes, shark attacks, lightning strikes, storms, temperature anomalies, wildfires, and disaster mashups.
Historical record of airplane crashes and fatalities worldwide from the first recorded airplane fatality (1908) through 2009. Includes operator, aircraft type, route, and casualty details.
Shark attack records from the Shark Research Institute covering provoked, unprovoked, and boat attacks with species, activity, and outcome details.
Daily lightning strike counts with geographic center points from NOAA's National Lightning Detection Network. Preprocessed data available for heatmap, monthly, and state-level analysis.
Combined disaster dataset merging NOAA storms, USGS earthquakes, NTSB aviation accidents, and EPA Superfund sites across the US.
Curated significant US storm events from NOAA with geographic coordinates, event types, and descriptions.
NASA GISS Surface Temperature Analysis showing global temperature anomalies relative to 1951-1980 baseline with monthly and annual data.
National Interagency Fire Center data on wildfire incidents including location, size, containment status, and fire perimeters.
National Flood Insurance Program claims data showing flood event frequency, severity, and financial impacts by county.
US Drought Monitor weekly drought severity classifications showing spatial extent and intensity of drought conditions.
Extreme heat event tracking with heat index calculations, duration, affected populations, and heat-related mortality statistics.
NYC 311, GDELT events, parking tickets, restaurant inspections, bridges, Wikipedia activity, Internet Archive, and Bluesky social.
Complete NYC 311 service request data including noise complaints, street conditions, housing issues, and response times by neighborhood.
Cross-platform attention metrics tracking global focus on the United States. Combines Wikipedia pageviews, GDELT global event mentions, and Google Trends search interest on a weekly cadence from 2020-2025.
GDELT Project's global event database tracking political, social, and conflict events worldwide from news media coverage.
Wikipedia pageview statistics showing trending articles, search patterns, and attention flows across topics and languages.
Anonymized Bluesky posts with VADER sentiment analysis. SHA-256 hashed DIDs/URIs, 44K unique authors. Dec 2-25, 2025. CC-BY-4.0 license.
NYC parking violation data with ticket types, fine amounts, vehicle types, and issuance locations for spatial and temporal analysis.
Internet Archive's digital lending library statistics showing book circulation patterns, popular titles, and access trends.
Federal Highway Administration bridge inventory with structural condition ratings, age, traffic volume, and deficiency status.
WALS, PHOIBLE, Glottolog, ISO 639-3, World Languages, CLMET, Historical Lemma, Language Families, Brown Corpus, COCA API, Historical Corpora, Joshua Project languages, and etymology atlas.
Language database integrating ISO 639-3, Glottolog, native names, speaker populations, endangerment status, and geographic distribution.
Typological feature database covering phonology, morphology, syntax, and lexicon across languages worldwide.
Phoneme inventory database with IPA symbols, features (consonant/vowel/tone), and cross-linguistic phonological patterns.
Language classification with family hierarchies, geographic coordinates, and historical relationships.
Official ISO 639-3 three-letter language codes with mappings to ISO 639-1 (2-letter) and ISO 639-2 (bibliographic) codes.
Etymological database from Lexibank CLDF data with cognate relationships, word forms, and language evolution patterns.
Demographic, linguistic, and religious data for 16,382 people groups worldwide from the Joshua Project API. Includes Bible translation status, evangelical percentages, and enriched Parquet files for fast analysis.
Billionaires, market caps, World Bank GDP, patents, retail locations, and NYC housing.
Forbes billionaire rankings with net worth, source of wealth, age, country, and industry classification.
Historical GDP data from World Bank with total GDP, per capita GDP, and growth rates for countries worldwide.
Market capitalization data for major corporations including Dow Jones Industrial Average components and Fortune 500 companies.
US Patent and Trademark Office data with patent classifications, filing dates, inventors, and citation networks.
Retail store location data for major chains including Walmart, Target, Kroger, CVS, Walgreens, and regional chains.
Census tract-level housing data for all five NYC boroughs: median rent, household income, rent burden, and tenure (owner vs renter). Includes borough and neighborhood GeoJSON boundaries with data joined.
Executive orders, tariff policies, federal workforce changes, market sector performance, and geopolitical events.
Executive orders issued by U.S. presidents with metadata including issue date, Federal Register number, title, status, and policy areas.
Historical and current tariff rates by product category, country, and time period. Includes weighted average rates and trade impact analysis.
Department of Government Efficiency proposed workforce reductions by federal agency, with personnel counts and policy rationale.
Economic sector performance indices tracking technology, finance, healthcare, energy, and other major market sectors over time.
Timeline of significant geopolitical events including military actions, diplomatic incidents, and international policy changes.
Cheese, wine, edible insects, OpenFoodFacts, baby names, emoji, street art, social norms, food access, SNAP, and grocery density.
Cheese database with varieties, countries of origin, milk types, aging processes, and flavor profiles.
Wine varieties with regional classifications, grape composition, flavor profiles, and production statistics.
Edible insect species with nutritional profiles, regional consumption patterns, and sustainability metrics.
USDA atlas measuring food access by census tract with low-income/low-access designations and SNAP usage statistics.
State-level aggregation of USDA Food Access Atlas data showing percentage of population in food deserts and low-access areas.
Supplemental Nutrition Assistance Program data showing participation rates, benefit amounts, and coverage gaps by county.
Grocery store density measurements by county showing stores per capita and accessibility metrics for fresh food.
Complete Unicode emoji dataset with categories, release dates, names, and historical evolution of emoji standards.
Social Security Administration baby name data showing popularity trends, regional variations, and cultural naming patterns.
Open database of food products worldwide with nutrition facts, ingredients, allergens, labels, and Nutri-Score ratings.
Chocolate bar ratings with company, bean origin, cocoa percentage, and tasting notes
Hot sauces and peppers with Scoville ratings, origins, and species
Network graph of social actions and moral judgments from the Social Chemistry 101 dataset
Housing affordability, HUD PIT counts, eviction, CMS hospitals, Medicaid, hospital closures, food access, and veterans data (5 military datasets).
County-level inequality data for all ~3,200 US counties, keyed on 5-digit FIPS codes. Covers food deserts, healthcare access, housing affordability, hospital infrastructure, and veteran demographics. All from Census ACS, CMS, USDA, and HRSA.
County-level housing affordability analysis with median rents, household incomes, cost burden percentages, and homelessness data.
HUD Point-in-Time homeless population counts by CoC (Continuum of Care) with sheltered/unsheltered breakdowns and demographics.
Eviction Lab data showing eviction filing rates, judgment rates, and temporal trends across US counties.
Veteran demographics merged from 5 sources: population, employment, homelessness, suicide rates, and firearm ownership correlations.
VA and Census veteran population estimates by county with age, gender, and period-of-service breakdowns.
Bureau of Labor Statistics veteran employment data with labor force participation, unemployment, and industry breakdowns.
HUD Point-in-Time counts for veteran homelessness with sheltered/unsheltered status and CoC-level data.
VA data on veteran suicide rates by state with comparisons to civilian populations and mental health service utilization.
Centers for Medicare & Medicaid Services hospital data with quality ratings, ownership, emergency services, and patient experience scores.
State-level Medicaid enrollment data with monthly trends, eligibility categories, and managed care penetration.
Healthcare desert identification using provider-to-population ratios, travel distances, and HRSA shortage area designations.
Health Resources and Services Administration shortage area designations (HPSA) for primary care, dental, and mental health.
County analysis, SCARS, country centroids, election results, geology, county boundaries, air quality, ocean plastic, light pollution, infrastructure (submarine cables, satellites, internet topology), and endangered languages.
Standardized FIPS county codes with names, state associations, land area, population, and geographic centroids for spatial joins.
Unified county-level dataset integrating economic, demographic, health, housing, and infrastructure indicators for comparative analysis.
Geographic centroids for all countries with ISO 3166-1 alpha-2/alpha-3 codes, names, and population-weighted centers.
County-level presidential election results with total votes, party breakdown, and vote margins. Mergeable with other FIPS-keyed datasets.
Geological regions, mineral deposits, Cretaceous-era arc boundaries, and Deep South historical timeline features. Part of the SCARS project mapping geology against socioeconomic data.
County and state boundary polygons in GeoJSON format, including simplified versions for web rendering and full-resolution versions with data attributes joined.
EPA Air Quality Index data with daily measurements for PM2.5, PM10, ozone, and other criteria pollutants by monitoring station.
Ocean plastic pollution density estimates with gyres mapping, microplastic concentrations, and source attribution.
Light pollution measurements showing night sky brightness levels, Bortle scale classifications, and dark sky preserve locations.
Global submarine telecommunications cable routes with landing points, capacity, owners, and ready-for-service dates.
Two-Line Element (TLE) data for active satellites from CelesTrak with orbital parameters and classification.
UNESCO Atlas of Endangered Languages with vitality assessments, speaker populations, and documentation status.
SAP (Augmentative Communication), WLASL (Sign Language), ACS disability data, WebAIM web compliance, WHO global health, and accessibility audit tools.
Disability demographics (3,222 US counties, 185 WHO countries), web accessibility compliance (WebAIM Million 2025, screen reader survey), ADA lawsuits, Section 508 federal compliance, VizWiz blind user VQA, and WLASL sign language index.
Census ACS data on communication disabilities by state, covering hearing, vision, cognitive, and self-care difficulty rates.
Word-level American Sign Language video dataset with annotations, glosses, and metadata for sign language recognition research.
American Community Survey disability data with prevalence rates for hearing, vision, cognitive, ambulatory, self-care, and independent living disabilities.
Accessibility audit results and tools for WCAG 2.1 AA compliance checking including axe-core and Pa11y scan outputs.
Gaming, film, music, and satire datasets
Steam game catalog with titles, release dates, pricing, and platform availability
41 million Steam user game recommendations. File is 1.9 GB and gitignored.
14 million Steam user profiles. File is 185 MB and gitignored.
Bond franchise companion characters with actress/Bond actor ages, film financials in actual and inflation-adjusted USD
Boy band discography with members, active years, and chart performance
The Onion headline index with article IDs and image URLs. No header row.