saturn

/home/coolhand/html/datavis/data_trove/geographic/geology/geological_regions.geojson | 14 rows | sample n=14 | seed 42 | 2026-06-22T01:05:37+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/geographic/geology/geological_regions.geojson
Total rows: 14
Profiled sample: 14
Columns: 7
Generated: 2026-06-22T01:05:37+00:00
Per-column null rate across the corpus.
column | kind | null %
name | categorical | 0.0%
geology_type | categorical | 0.0%
primary_resources | categorical | 0.0%
age | categorical | 0.0%
description | categorical | 0.0%
color | categorical | 0.0%
geometry_type | categorical | 0.0%
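
The per-column null rates can be reproduced in a few lines. This is a minimal sketch, not saturn's actual implementation. It assumes that each GeoJSON feature's properties become columns, that geometry_type is read from the feature's geometry.type, and that a missing key or a JSON null counts as null. The two-feature sample and the null_rates helper are hypothetical and are not taken from the profiled file.

```python
import json

# Hypothetical two-feature sample; the profiled file has 14 features.
GEOJSON_TEXT = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "properties": {"name": "Region A", "age": "Period X", "color": "#000000"},
   "geometry": {"type": "Polygon", "coordinates": []}},
  {"type": "Feature",
   "properties": {"name": "Region B", "age": null},
   "geometry": {"type": "Polygon", "coordinates": []}}
]}
"""


def null_rates(geojson_text, columns):
    """Return the fraction of rows whose value is missing or null, per column."""
    features = json.loads(geojson_text)["features"]
    rows = []
    for feature in features:
        row = dict(feature.get("properties") or {})
        geometry = feature.get("geometry") or {}
        # Assumption: geometry_type is derived from the geometry object.
        row["geometry_type"] = geometry.get("type")
        rows.append(row)
    total = len(rows)
    return {
        col: (sum(1 for r in rows if r.get(col) is None) / total if total else 0.0)
        for col in columns
    }


rates = null_rates(GEOJSON_TEXT, ["name", "age", "color", "geometry_type"])
for column, rate in rates.items():
    print(f"{column}: {rate:.1%}")
```

In the sample, age and color are each null in one of two rows, so they report 50.0%, while name and geometry_type report 0.0%, matching the all-zero pattern in the table above for complete columns.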

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset medium anthropic:default

This dataset is a small geospatial catalogue of 14 named geological regions across the United States, each described as a polygon with attributes covering geology type, geological age, primary resources, and a short description. The most notable pattern is that 'Sedimentary Basin' is the leading geology type (4 of 14 regions), which aligns with the prevalence of oil and natural gas among primary resources (7 of 14 rows list one or both). The geological ages range from Precambrian to Tertiary, so the catalogue covers regions with very different formation histories. It is worth examining age alongside resource type to spot any age-resource relationships.

geometry_type high anthropic:default

This column records the geometry type of spatial features and contains exactly one value, 'Polygon', across all 14 rows with no nulls. It is a constant column — zero entropy, cardinality of 1, and a top_rate of 1.0 — meaning it carries no discriminative information whatsoever. The imbalance alert is technically correct but understates the situation: this is not imbalanced, it is entirely invariant.
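
The statistics quoted in these insights (cardinality, top_rate, entropy, entropy_ratio) can be sketched as follows. This is an illustration under stated assumptions, not saturn's code: it assumes entropy is Shannon entropy in bits and that entropy_ratio divides by log2(cardinality), with a constant column defined as ratio 0. The categorical_stats helper is hypothetical. The inputs are the value counts that this report lists for geometry_type and geology_type.

```python
import math
from collections import Counter


def categorical_stats(values):
    """Summarise a categorical column with frequency-based statistics."""
    counts = Counter(values)
    n = len(values)
    cardinality = len(counts)
    # Shannon entropy in bits; abs() avoids a negative zero (-0.0)
    # when every term is zero, as happens for a constant column.
    entropy = abs(-sum((c / n) * math.log2(c / n) for c in counts.values()))
    # Assumption: the ratio is entropy over the maximum possible entropy.
    max_entropy = math.log2(cardinality) if cardinality > 1 else 0.0
    return {
        "cardinality": cardinality,
        "top_rate": max(counts.values()) / n,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }


# Counts as reported: one value repeated 14 times.
GEOMETRY_TYPE = ["Polygon"] * 14

# Counts as reported: one value four times, ten values once each.
GEOLOGY_TYPE = ["Sedimentary Basin"] * 4 + [
    "Ancient Marine Basin", "Shale Formation", "Shale/Carbonate",
    "Precambrian Shield", "Igneous/Metamorphic", "Basin and Range",
    "Porphyry Copper", "Precambrian Uplift", "Metamorphic", "Sedimentary",
]

for name, column in [("geometry_type", GEOMETRY_TYPE), ("geology_type", GEOLOGY_TYPE)]:
    stats = categorical_stats(column)
    print(name, {k: round(v, 3) for k, v in stats.items()})
# geometry_type: cardinality 1, top_rate 1.0, entropy 0.0, entropy_ratio 0.0
# geology_type: cardinality 11, top_rate 0.286, entropy 3.236, entropy_ratio 0.935
```

Under these assumptions the sketch reproduces the reported figures for both columns, which suggests, but does not confirm, that the tool uses the same definitions.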

age high anthropic:default

This column captures geological age, classifying each region by the era or period of its formation (e.g., 'Precambrian', 'Devonian', 'Cretaceous-Tertiary'). With 14 rows, 12 distinct values, and an entropy ratio of 0.98, the distribution is nearly flat: 10 of the 12 labels occur exactly once, which limits the column's usefulness as a categorical feature. The 'long_tail' alert is consistent with this near-uniform spread; the two most frequent values, 'Precambrian' and 'Tertiary', each appear only twice (14.3%). The labels are also inconsistent: 'Cretaceous-Tertiary' and 'Tertiary-Cretaceous' likely refer to the same interval, 'Cretaceous (145-66 million years ago)' carries a parenthetical date range that no other label has, and the labels mix ranks (the era 'Paleozoic' alongside periods and epochs). All of this suggests unstandardized entry.
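
One way to collapse the equivalent age labels is to build a canonical key per label. The sketch below is a hypothetical cleanup step, not part of saturn. It assumes that a trailing parenthetical can be dropped and that the two ends of a hyphenated range can be sorted alphabetically to form the key. Alphabetical order is not chronological order, so the key is only suitable for grouping, not for display. The list reproduces the reported value counts; the row order is arbitrary.

```python
import re
from collections import Counter

# Values and counts as reported for the age column (row order is arbitrary).
AGE_LABELS = [
    "Precambrian", "Precambrian", "Tertiary", "Tertiary",
    "Cretaceous (145-66 million years ago)", "Pennsylvanian-Permian",
    "Devonian", "Tertiary-Cretaceous", "Permian", "Devonian-Mississippian",
    "Pennsylvanian", "Cretaceous-Tertiary", "Paleozoic", "Miocene-Pliocene",
]


def normalize_age(label):
    """Return an order-insensitive grouping key for an age label."""
    # Drop a trailing parenthetical such as "(145-66 million years ago)".
    base = re.sub(r"\s*\(.*?\)\s*$", "", label).strip()
    # Sort the range endpoints so "A-B" and "B-A" share one key.
    parts = sorted(part.strip() for part in base.split("-"))
    return "-".join(parts)


before = Counter(AGE_LABELS)
after = Counter(normalize_age(label) for label in AGE_LABELS)
print(len(before), "->", len(after))  # 12 -> 11
print(after["Cretaceous-Tertiary"])   # 2
```

With this key the two Cretaceous/Tertiary spellings merge, reducing the cardinality from 12 to 11. Whether 'Cretaceous' alone should also merge with that range is a domain decision the sketch does not make.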

color high anthropic:default

This column contains six-digit CSS hex color codes (e.g., '#1e3a8a', '#4a5568'). Since each row is a polygon region, these are most likely per-region fill or stroke colors for map rendering. With 13 unique values across 14 rows and an entropy ratio of 0.99, the distribution is nearly uniform: every color appears exactly once except '#1e3a8a', which appears twice. The long-tail alert is technically triggered but is an artefact of the tiny dataset; the most frequent value holds only a 14.3% share.
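
The claim that every value is a hex color code can be checked mechanically. The following sketch assumes the six-digit '#rrggbb' form only (three-digit and eight-digit CSS forms are deliberately rejected); the is_hex_color helper is hypothetical. The list holds the 13 distinct values this report shows for the column.

```python
import re

# Six-digit form only; an assumption based on the values seen in the report.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def is_hex_color(value):
    """True if value is a string of the form '#rrggbb'."""
    return isinstance(value, str) and HEX_COLOR.fullmatch(value) is not None


# The 13 distinct values reported for the color column.
COLORS = [
    "#1e3a8a", "#4a5568", "#2d3748", "#744210", "#92400e", "#374151",
    "#7c2d12", "#ca8a04", "#a16207", "#b45309", "#713f12", "#854d0e",
    "#065f46",
]

invalid = [color for color in COLORS if not is_hex_color(color)]
print(invalid)  # []
```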

description high anthropic:default

This column contains free-text descriptive annotations for 14 geographic or geological regions, each explaining their natural resource profile and economic significance (oil, gas, coal, mining, agriculture). Every row has a unique description (cardinality 14, entropy_ratio 1.0), meaning it functions purely as a human-readable label with no repeated values. The top_rate of 0.071 confirms perfect uniformity — no single value dominates. The 'long_tail' alert is technically triggered but is trivially explained by all values appearing exactly once.

geology_type medium anthropic:default

This column classifies the geological formation type of each region, covering 11 distinct categories across only 14 rows. 'Sedimentary Basin' dominates with 4 occurrences (top_rate 28.6%), while the other 10 categories appear exactly once, which is the long-tail pattern flagged in the alerts. The entropy ratio of 0.935 shows the distribution is close to uniform apart from the top value, and with only 14 rows, frequency-based conclusions are unreliable. The mix of broadly defined types ('Sedimentary', 'Metamorphic') with highly specific ones ('Porphyry Copper', 'Shale/Carbonate'), along with overlapping labels such as 'Sedimentary' and 'Sedimentary Basin', suggests an inconsistent taxonomy.

name high anthropic:default

This column contains names of geological formations, basins, and resource districts (e.g., 'Permian Basin', 'Marcellus-Utica Shale', 'Bakken Formation'), making it a label or identifier for geological regions in a small reference dataset of 14 rows. Every value is unique (cardinality = 14, n = 14), producing a perfect entropy ratio of 1.0 — the column is essentially a primary key of human-readable names. The 'long_tail' alert is a statistical artefact of all values appearing exactly once (top_rate = 0.071), not a meaningful distribution signal. No nulls are present.

primary_resources high anthropic:default

This column lists the primary natural resources of each geological region as free-form comma-separated lists. With 14 rows and 12 unique values, the top values 'Oil, Natural Gas' and 'Gold' each appear twice (14.3% each), while the other 10 entries are singletons. The near-maximum entropy ratio (0.982) and the long-tail alert reflect this fragmentation. Part of it is an artefact of inconsistent ordering: 'Oil, Natural Gas' and 'Natural Gas, Oil' are semantically equivalent but counted as distinct values, which inflates the apparent cardinality.
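
The effect of inconsistent ordering can be measured by comparing resource lists as sets. This is a minimal sketch of a hypothetical cleanup step, assuming that entries are comma-separated and that order and surrounding whitespace carry no meaning. The counts are the ones this report lists for the column.

```python
from collections import Counter

# Value counts as reported for primary_resources.
RESOURCE_COUNTS = {
    "Oil, Natural Gas": 2,
    "Gold": 2,
    "Oil, Natural Gas, Coal, Rich Soils": 1,
    "Coal, Natural Gas": 1,
    "Natural Gas, Oil": 1,
    "Oil, Natural Gas, Sulfur": 1,
    "Coal, Oil, Natural Gas": 1,
    "Iron Ore": 1,
    "Gold, Silver, Copper, Lead, Zinc": 1,
    "Gold, Silver, Copper": 1,
    "Copper, Molybdenum": 1,
    "Phosphate": 1,
}


def normalize_resources(value):
    """Turn a comma-separated list into an order-insensitive set."""
    return frozenset(part.strip() for part in value.split(",") if part.strip())


normalized = Counter()
for value, count in RESOURCE_COUNTS.items():
    normalized[normalize_resources(value)] += count

print(len(RESOURCE_COUNTS), "->", len(normalized))        # 12 -> 11
print(normalized[frozenset({"Oil", "Natural Gas"})])      # 3
```

After normalization the oil and natural gas pair accounts for 3 of 14 rows and becomes the single top value. Splitting the lists into individual resources would be a further step for per-resource analysis.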

name categorical

14 singleton categories
rows: 14
null: 0 (0.0%)
unique: 14
top_value: Cretaceous Interior Seaway
top_rate: 0.071
cardinality: 14
entropy: 3.807
entropy_ratio: 1.000
Top values for name (14 unique shown, of 14 total).
value | count | share
Cretaceous Interior Seaway | 1 | 7.1%
Appalachian Coal Basin | 1 | 7.1%
Marcellus-Utica Shale | 1 | 7.1%
Gulf Coastal Plain | 1 | 7.1%
Permian Basin | 1 | 7.1%
Bakken Formation | 1 | 7.1%
Illinois Basin | 1 | 7.1%
Mesabi Iron Range | 1 | 7.1%
Colorado Mineral Belt | 1 | 7.1%
Nevada Mining District | 1 | 7.1%
Copper Belt - Arizona | 1 | 7.1%
Black Hills | 1 | 7.1%
Southern Appalachian Gold Belt | 1 | 7.1%
Florida Phosphate District | 1 | 7.1%
Top values (rank 1–20)
  1. Cretaceous Interior Seaway — 1
  2. Appalachian Coal Basin — 1
  3. Marcellus-Utica Shale — 1
  4. Gulf Coastal Plain — 1
  5. Permian Basin — 1
  6. Bakken Formation — 1
  7. Illinois Basin — 1
  8. Mesabi Iron Range — 1
  9. Colorado Mineral Belt — 1
  10. Nevada Mining District — 1
  11. Copper Belt - Arizona — 1
  12. Black Hills — 1
  13. Southern Appalachian Gold Belt — 1
  14. Florida Phosphate District — 1

geology_type categorical

10 singleton categories
rows: 14
null: 0 (0.0%)
unique: 11
top_value: Sedimentary Basin
top_rate: 0.286
cardinality: 11
entropy: 3.236
entropy_ratio: 0.935
Top values for geology_type (11 unique shown, of 11 total).
value | count | share
Sedimentary Basin | 4 | 28.6%
Ancient Marine Basin | 1 | 7.1%
Shale Formation | 1 | 7.1%
Shale/Carbonate | 1 | 7.1%
Precambrian Shield | 1 | 7.1%
Igneous/Metamorphic | 1 | 7.1%
Basin and Range | 1 | 7.1%
Porphyry Copper | 1 | 7.1%
Precambrian Uplift | 1 | 7.1%
Metamorphic | 1 | 7.1%
Sedimentary | 1 | 7.1%
Top values (rank 1–20)
  1. Sedimentary Basin — 4
  2. Ancient Marine Basin — 1
  3. Shale Formation — 1
  4. Shale/Carbonate — 1
  5. Precambrian Shield — 1
  6. Igneous/Metamorphic — 1
  7. Basin and Range — 1
  8. Porphyry Copper — 1
  9. Precambrian Uplift — 1
  10. Metamorphic — 1
  11. Sedimentary — 1

primary_resources categorical

10 singleton categories
rows: 14
null: 0 (0.0%)
unique: 12
top_value: Oil, Natural Gas
top_rate: 0.143
cardinality: 12
entropy: 3.522
entropy_ratio: 0.982
Top values for primary_resources (12 unique shown, of 12 total).
value | count | share
Oil, Natural Gas | 2 | 14.3%
Gold | 2 | 14.3%
Oil, Natural Gas, Coal, Rich Soils | 1 | 7.1%
Coal, Natural Gas | 1 | 7.1%
Natural Gas, Oil | 1 | 7.1%
Oil, Natural Gas, Sulfur | 1 | 7.1%
Coal, Oil, Natural Gas | 1 | 7.1%
Iron Ore | 1 | 7.1%
Gold, Silver, Copper, Lead, Zinc | 1 | 7.1%
Gold, Silver, Copper | 1 | 7.1%
Copper, Molybdenum | 1 | 7.1%
Phosphate | 1 | 7.1%
Top values (rank 1–20)
  1. Oil, Natural Gas — 2
  2. Gold — 2
  3. Oil, Natural Gas, Coal, Rich Soils — 1
  4. Coal, Natural Gas — 1
  5. Natural Gas, Oil — 1
  6. Oil, Natural Gas, Sulfur — 1
  7. Coal, Oil, Natural Gas — 1
  8. Iron Ore — 1
  9. Gold, Silver, Copper, Lead, Zinc — 1
  10. Gold, Silver, Copper — 1
  11. Copper, Molybdenum — 1
  12. Phosphate — 1

age categorical

10 singleton categories
rows: 14
null: 0 (0.0%)
unique: 12
top_value: Precambrian
top_rate: 0.143
cardinality: 12
entropy: 3.522
entropy_ratio: 0.982
Top values for age (12 unique shown, of 12 total).
value | count | share
Precambrian | 2 | 14.3%
Tertiary | 2 | 14.3%
Cretaceous (145-66 million years ago) | 1 | 7.1%
Pennsylvanian-Permian | 1 | 7.1%
Devonian | 1 | 7.1%
Tertiary-Cretaceous | 1 | 7.1%
Permian | 1 | 7.1%
Devonian-Mississippian | 1 | 7.1%
Pennsylvanian | 1 | 7.1%
Cretaceous-Tertiary | 1 | 7.1%
Paleozoic | 1 | 7.1%
Miocene-Pliocene | 1 | 7.1%
Top values (rank 1–20)
  1. Precambrian — 2
  2. Tertiary — 2
  3. Cretaceous (145-66 million years ago) — 1
  4. Pennsylvanian-Permian — 1
  5. Devonian — 1
  6. Tertiary-Cretaceous — 1
  7. Permian — 1
  8. Devonian-Mississippian — 1
  9. Pennsylvanian — 1
  10. Cretaceous-Tertiary — 1
  11. Paleozoic — 1
  12. Miocene-Pliocene — 1

description categorical

14 singleton categories
rows: 14
null: 0 (0.0%)
unique: 14
top_value: Ancient sea divided North America; left rich sediments forming oil/gas deposits and fertile agricultural soils. Shaped settlement, agriculture, and economy across the Great Plains.
top_rate: 0.071
cardinality: 14
entropy: 3.807
entropy_ratio: 1.000
Top values for description (14 unique shown, of 14 total).
value | count | share
Ancient sea divided North America; left rich sediments forming oil/gas deposits and fertile agricultural soils. Shaped settlement, agriculture, and economy across the Great Plains. | 1 | 7.1%
Major coal-producing region, historically drove industrialization | 1 | 7.1%
Major shale gas play, modern fracking boom | 1 | 7.1%
Major oil and gas region, petrochemical industry center | 1 | 7.1%
One of the most productive oil regions in US history | 1 | 7.1%
Major shale oil play, North Dakota boom | 1 | 7.1%
Coal and oil production, agricultural region | 1 | 7.1%
Historic iron mining, built US steel industry | 1 | 7.1%
Rich mining district, gold rush history | 1 | 7.1%
Comstock Lode, major silver and gold production | 1 | 7.1%
Major copper mining, mining towns | 1 | 7.1%
Homestake Mine, gold rush history | 1 | 7.1%
First US gold rush, Dahlonega | 1 | 7.1%
Major phosphate mining for fertilizers | 1 | 7.1%
Top values (rank 1–20)
  1. Ancient sea divided North America; left rich sediments forming oil/gas deposits and fertile agricultural soils. Shaped settlement, agriculture, and economy across the Great Plains. — 1
  2. Major coal-producing region, historically drove industrialization — 1
  3. Major shale gas play, modern fracking boom — 1
  4. Major oil and gas region, petrochemical industry center — 1
  5. One of the most productive oil regions in US history — 1
  6. Major shale oil play, North Dakota boom — 1
  7. Coal and oil production, agricultural region — 1
  8. Historic iron mining, built US steel industry — 1
  9. Rich mining district, gold rush history — 1
  10. Comstock Lode, major silver and gold production — 1
  11. Major copper mining, mining towns — 1
  12. Homestake Mine, gold rush history — 1
  13. First US gold rush, Dahlonega — 1
  14. Major phosphate mining for fertilizers — 1

color categorical

12 singleton categories
rows: 14
null: 0 (0.0%)
unique: 13
top_value: #1e3a8a
top_rate: 0.143
cardinality: 13
entropy: 3.664
entropy_ratio: 0.990
Top values for color (13 unique shown, of 13 total).
value | count | share
#1e3a8a | 2 | 14.3%
#4a5568 | 1 | 7.1%
#2d3748 | 1 | 7.1%
#744210 | 1 | 7.1%
#92400e | 1 | 7.1%
#374151 | 1 | 7.1%
#7c2d12 | 1 | 7.1%
#ca8a04 | 1 | 7.1%
#a16207 | 1 | 7.1%
#b45309 | 1 | 7.1%
#713f12 | 1 | 7.1%
#854d0e | 1 | 7.1%
#065f46 | 1 | 7.1%
Top values (rank 1–20)
  1. #1e3a8a — 2
  2. #4a5568 — 1
  3. #2d3748 — 1
  4. #744210 — 1
  5. #92400e — 1
  6. #374151 — 1
  7. #7c2d12 — 1
  8. #ca8a04 — 1
  9. #a16207 — 1
  10. #b45309 — 1
  11. #713f12 — 1
  12. #854d0e — 1
  13. #065f46 — 1

geometry_type categorical

top value is 100.0% of rows
rows: 14
null: 0 (0.0%)
unique: 1
top_value: Polygon
top_rate: 1.000
cardinality: 1
entropy: 0.000
entropy_ratio: 0.000
Top values for geometry_type (1 unique shown, of 1 total).
value | count | share
Polygon | 14 | 100.0%
Top values (rank 1–20)
  1. Polygon — 14