saturn

/home/coolhand/html/datavis/data_trove/data/quirky/volcanoes.json 200 rows sample n=200 seed 42 2026-06-22T00:26:41+00:00

Overview

Source           /home/coolhand/html/datavis/data_trove/data/quirky/volcanoes.json
Total rows       200
Profiled sample  200
Columns          9
Generated        2026-06-22T00:26:41+00:00
Per-column null rate across the corpus.
column         kind         null %
name           categorical  0.0%
country        categorical  0.0%
lat            numeric      0.0%
lon            numeric      0.0%
elevation      numeric      0.0%
type           categorical  0.0%
vei            numeric      0.0%
year           numeric      0.0%
last_eruption  numeric      0.0%
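
A null rate of this kind can be reproduced with a few lines of Python. This is a minimal sketch, not saturn's implementation: the `null_rates` helper and the three toy records are hypothetical, and it assumes a value counts as null when it is `None` or the key is absent.

```python
# Hypothetical toy records; these are NOT rows from volcanoes.json.
records = [
    {"name": "volcano_a", "country": "country_x", "vei": 2},
    {"name": "volcano_b", "country": "country_y", "vei": None},
    {"name": "volcano_c", "country": None, "vei": 3},
    {"name": "volcano_d", "country": "country_x", "vei": 2},
]

def null_rates(rows):
    """Return {column: fraction of rows whose value is None or missing}."""
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is None) / total
        for col in columns
    }

rates = null_rates(records)
for col, rate in rates.items():
    print(f"{col:<10}{rate:.1%}")
```

Run against the real file, every column would be expected to report 0.0%, matching the table above.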

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset high anthropic:default

This dataset captures 200 volcanic eruption records across 33 countries, covering events from 1900 to 1999, with 9 attributes including eruption intensity, volcano type, elevation, and geographic coordinates. The most striking feature is the heavy geographic concentration: Indonesia alone accounts for 28.5% of all records (57 out of 200), and Semeru is the single most frequent volcano with 13 rows. Volcano type is strongly skewed toward stratovolcanoes, which make up 69.5% of all records, so the 'type' breakdown is worth examining to see how rare other forms such as calderas (9 rows) and shield volcanoes (12 rows) are by comparison. The Volcanic Explosivity Index (VEI) flags 15 outliers under the 1.5 × IQR rule, 7 at VEI 0 and 8 at VEI 5 or 6, with a maximum of 6.0 against a mean of 2.6; the small number of exceptionally powerful eruptions deserves individual attention.

vei high anthropic:default

This column is almost certainly the Volcanic Explosivity Index (VEI), a logarithmic scale rating volcanic eruption intensity. With only 7 unique integer values ranging from 0 to 6 and a median of 3, it behaves more like an ordinal category than a continuous numeric column. The 15 outliers (7.5% of rows) sit at both ends of the scale: the histogram shows 7 rows at VEI 0 and 8 rows at VEI 5–6. VEI 5–6 events are rare in real-world volcanology, so their presence is worth verifying. The IQR of 1.0 and the tight Q1–Q3 band of 2–3 confirm that most eruptions cluster at moderate intensity.
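
The outlier count can be checked by hand. The sketch below assumes saturn's "beyond 1.5 IQR" alert uses the usual Tukey fences (Q1 − 1.5·IQR and Q3 + 1.5·IQR); the per-value counts are read from the vei histogram in this report, and the variable names are illustrative.

```python
# Per-value row counts read from the vei histogram in this report.
vei_counts = {0: 7, 1: 15, 2: 77, 3: 70, 4: 23, 5: 6, 6: 2}

q1, q3 = 2.0, 3.0             # quartiles reported for vei
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # 0.5
upper_fence = q3 + 1.5 * iqr  # 4.5

# Rows strictly outside the fences count as outliers.
low_outliers = sum(n for v, n in vei_counts.items() if v < lower_fence)
high_outliers = sum(n for v, n in vei_counts.items() if v > upper_fence)
total_rows = sum(vei_counts.values())
outlier_rate = (low_outliers + high_outliers) / total_rows

print(low_outliers, high_outliers, outlier_rate)
```

Under that assumption the 15 flagged rows are the 7 rows at VEI 0 plus the 8 rows at VEI 5 or 6, which agrees with the reported n_outliers of 15 and outlier_rate of 0.075.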

name high anthropic:default

This column contains volcano names, functioning as a label for individual volcanic entities in the dataset. With 111 unique values across 200 rows, many volcanoes appear multiple times — 'Semeru' leads with 13 occurrences (6.5% of rows), suggesting repeated eruption or activity events per volcano rather than one row per volcano. The high entropy ratio of 0.946 combined with the long-tail alert indicates the distribution is broad but uneven, with a handful of well-known volcanoes (Semeru, Merapi, Etna, Stromboli) dominating while most names appear only once or twice.

elevation high anthropic:default

This column records summit elevation for the 200 records, most likely in metres (the unit is not stated in the data). Values span from -185.0 to 5393.0; in a volcano dataset the negative minimum is more plausibly a submarine vent than dry land below sea level, and the type column does list 4 'Submarine volcano' rows. The distribution is broad and fairly flat (IQR of 1809.75 against a mean of 2074.21), with a slight positive skew (0.39) and a mildly platykurtic shape (kurtosis -0.57), suggesting a deliberately diverse geographic sample rather than a natural population draw. With only 109 unique values across 200 rows, 91 rows (about 45%) repeat an elevation that already appears; because name has just 111 unique values, this repetition most likely reflects the same volcanoes recurring rather than rounding or binning.

last_eruption high anthropic:default

This column almost certainly records the year of a volcano's last known eruption, ranging from 1900 to 1999 with a mean of 1952.3 and median of 1955 — consistent with a dataset scoped to the 20th century. The distribution is notably platykurtic (kurtosis ≈ −1.17), meaning eruption years are spread fairly uniformly across the century rather than clustering tightly around any single period. With only 84 unique values across 200 rows, many volcanoes share the same recorded eruption year, which is unsurprising given that annual granularity naturally produces ties. No nulls, no outliers, and near-zero skew make this a clean numeric feature.

lat high anthropic:default

This column contains geographic latitude values, spanning from -41.33° (southern hemisphere, e.g., southern South America or New Zealand) to 63.983° (northern hemisphere, e.g., Scandinavia or Canada), consistent with a globally distributed dataset. With only 111 unique values across 200 rows, many locations are repeated, suggesting the dataset references a limited set of geographic points rather than unique coordinates per record. The distribution is nearly symmetric (skew 0.20, kurtosis -0.48) and spans a wide IQR of 40.73°, indicating broad global coverage rather than clustering in one region. No nulls, outliers, or zeros are present.

lon high anthropic:default

This column is a geographic longitude coordinate, spanning −175.65 to 177.18 degrees, nearly the full valid range of ±180, which indicates global coverage. With only 111 unique values across 200 rows (~55% uniqueness), coordinates repeat often; the count matches the 111 unique values in both name and lat, so the repetition most likely comes from the same volcanoes recurring rather than from rounding or binning. The mean (59.16) sits well below the median (112.31) because of a left skew (−0.86): most observations cluster at Eastern Hemisphere longitudes, and a tail of negative (Western Hemisphere) values drags the mean down.

year high anthropic:default

This column represents a calendar year, spanning 1900 to 1999, exactly one century of data with no nulls. With 84 unique values across 200 rows, many years appear more than once, which is expected when several eruptions fall in the same year. The distribution is notably flat (kurtosis ≈ −1.17) and nearly symmetric (skew ≈ −0.21); the middle half of the records falls between 1928 and 1977 (IQR = 49.25 years), close to the 50-year spread a uniform distribution over the century would give. Every summary statistic matches last_eruption and the two columns correlate at +1.00, so one of them is probably redundant.

country high anthropic:default

This column records the country associated with each record — likely the location of a seismic, volcanic, or natural-disaster event given the top countries (Indonesia, Japan, Philippines, Papua New Guinea, Chile). Indonesia dominates heavily at 28.5% of all 200 rows (57 occurrences), followed by Japan at 14.5%, which is a pronounced geographic skew toward the Pacific Ring of Fire. With only 33 unique values and zero nulls, coverage is clean, but the top-heavy distribution (entropy ratio 0.78) means most records cluster around a handful of high-activity nations.

type high anthropic:default

This column classifies volcanic structures into 13 morphological types, making it a geological label for each record. 'Stratovolcano' dominates heavily at 69.5% of 200 records (139 occurrences), while the remaining 12 types share the rest — an extreme concentration that yields an entropy ratio of only 0.47. The long tail of rare categories (e.g., 'Cinder cone' and 'Maar' each appearing ≤2 times) may cause class-imbalance problems in any supervised modelling task.
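
The entropy figures can be recomputed from the category counts. This sketch assumes saturn reports Shannon entropy in bits and defines entropy_ratio as entropy divided by log2 of the cardinality; the report does not state either definition, but both assumptions reproduce the published numbers for type.

```python
import math

# Row counts from the "Top values for type" table in this report.
type_counts = [139, 22, 12, 9, 4, 3, 3, 2, 2, 1, 1, 1, 1]

total = sum(type_counts)

# Shannon entropy in bits: H = -sum(p * log2(p)).
entropy_bits = -sum((n / total) * math.log2(n / total) for n in type_counts)

# A uniform distribution over the 13 categories would give the maximum.
max_entropy_bits = math.log2(len(type_counts))
entropy_ratio = entropy_bits / max_entropy_bits

print(round(entropy_bits, 3), round(entropy_ratio, 3))
```

A ratio near 1 means the categories are close to evenly used; the value for type is low because a single category holds 69.5% of the rows.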

Numeric correlation

Pearson correlation across 6 numeric columns (values clipped to 2 decimals).
               lat    lon    elevation  vei    year   last_eruption
lat            +1.00  -0.04  -0.07      +0.13  +0.03  +0.03
lon            -0.04  +1.00  -0.26      -0.16  -0.00  -0.00
elevation      -0.07  -0.26  +1.00      +0.06  +0.14  +0.14
vei            +0.13  -0.16  +0.06      +1.00  +0.03  +0.03
year           +0.03  -0.00  +0.14      +0.03  +1.00  +1.00
last_eruption  +0.03  -0.00  +0.14      +0.03  +1.00  +1.00
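
The +1.00 between year and last_eruption, together with their identical summary statistics further down, suggests one column duplicates the other. A correlation of 1.00 alone does not prove that, so an element-wise comparison is the safer check. The sketch below uses a hand-written `pearson` helper and hypothetical values, not rows from volcanoes.json.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only.
year = [1902, 1955, 1977, 1991, 1930]
last_eruption = list(year)          # an exact copy of year
shifted = [y + 10 for y in year]    # perfectly correlated, but not a copy

r_copy = pearson(year, last_eruption)
r_shift = pearson(year, shifted)
is_duplicate = year == last_eruption

print(r_copy, r_shift, is_duplicate, shifted == year)
```

Both pairs correlate at 1.0, but only the exact copy passes the equality test, which is why the comparison matters before dropping a column.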

name categorical

69 singleton categories
rows           200
null           0 (0.0%)
unique         111
top_value      Semeru
top_rate       0.065
cardinality    111
entropy        6.427
entropy_ratio  0.946
Top values for name (20 unique shown, of 111 total).
rank  value                   count  share
1     Semeru                  13     6.5%
2     Merapi                  8      4.0%
3     Fuego                   5      2.5%
4     Kelud                   5      2.5%
5     Etna                    5      2.5%
6     Mayon                   5      2.5%
7     Asamayama               5      2.5%
8     Stromboli               4      2.0%
9     Paluweh                 4      2.0%
10    Dieng Volcanic Complex  4      2.0%
11    Lengai, Ol Doinyo       4      2.0%
12    Izu-Oshima              3      1.5%
13    Iliwerung               3      1.5%
14    Toya                    3      1.5%
15    Karangetang             3      1.5%
16    Villarrica              3      1.5%
17    Aira                    3      1.5%
18    Vesuvius                3      1.5%
19    Tungurahua              2      1.0%
20    Rabaul                  2      1.0%

country categorical

rows           200
null           0 (0.0%)
unique         33
top_value      Indonesia
top_rate       0.285
cardinality    33
entropy        3.934
entropy_ratio  0.780
Top values for country (20 unique shown, of 33 total).
rank  value             count  share
1     Indonesia         57     28.5%
2     Japan             29     14.5%
3     Italy             13     6.5%
4     Philippines       12     6.0%
5     Guatemala         8      4.0%
6     Papua New Guinea  8      4.0%
7     United States     8      4.0%
8     Chile             8      4.0%
9     Russia            7      3.5%
10    Ecuador           4      2.0%
11    Mexico            4      2.0%
12    Tanzania          4      2.0%
13    Cameroon          3      1.5%
14    Iceland           3      1.5%
15    Congo, DRC        3      1.5%
16    Solomon Is.       3      1.5%
17    New Zealand       3      1.5%
18    Nicaragua         2      1.0%
19    Vanuatu           2      1.0%
20    Colombia          2      1.0%

lat numeric

rows          200
null          0 (0.0%)
unique        111
min           -41.330
max           63.983
mean          10.078
median        4.548
std           24.290
q1            -7.935
q3            32.792
iqr           40.727
skew          0.199
kurtosis      -0.476
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for lat (median: 4.5475).
bin              count
-41.33 – -33.81  11
-33.81 – -26.29  0
-26.29 – -18.76  3
-18.76 – -11.24  4
-11.24 – -3.718  57
-3.718 – 3.804   24
3.804 – 11.33    8
11.33 – 18.85    29
18.85 – 26.37    6
26.37 – 33.89    10
33.89 – 41.42    27
41.42 – 48.94    7
48.94 – 56.46    8
56.46 – 63.98    6

lon numeric

rows          200
null          0 (0.0%)
unique        111
min           -175.650
max           177.180
mean          59.156
median        112.314
std           97.973
q1            2.107
q3            130.389
iqr           128.282
skew          -0.856
kurtosis      -0.679
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for lon (median: 112.314).
bin              count
-175.7 – -150.4  9
-150.4 – -125.2  0
-125.2 – -100    2
-100 – -74.84    23
-74.84 – -49.64  13
-49.64 – -24.44  0
-24.44 – 0.765   3
0.765 – 25.97    17
25.97 – 51.17    10
51.17 – 76.37    1
76.37 – 101.6    2
101.6 – 126.8    66
126.8 – 152      37
152 – 177.2      17

elevation numeric

rows          200
null          0 (0.0%)
unique        109
min           -185.000
max           5,393
mean          2,074
median        1,848
std           1,235
q1            1,113
q3            2,923
iqr           1,810
skew          0.388
kurtosis      -0.574
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for elevation (median: 1848.5).
bin            count
-185 – 213.4   4
213.4 – 611.9  17
611.9 – 1010   27
1010 – 1409    24
1409 – 1807    27
1807 – 2206    12
2206 – 2604    22
2604 – 3002    23
3002 – 3401    11
3401 – 3799    22
3799 – 4198    3
4198 – 4596    0
4596 – 4995    4
4995 – 5393    4

type categorical

rows           200
null           0 (0.0%)
unique         13
top_value      Stratovolcano
top_rate       0.695
cardinality    13
entropy        1.740
entropy_ratio  0.470
Top values for type (13 unique shown, of 13 total).
rank  value               count  share
1     Stratovolcano       139    69.5%
2     Complex volcano     22     11.0%
3     Shield volcano      12     6.0%
4     Caldera             9      4.5%
5     Submarine volcano   4      2.0%
6     Pyroclastic shield  3      1.5%
7     Lava dome           3      1.5%
8     Maar                2      1.0%
9     Tuff cone           2      1.0%
10    Cinder cone         1      0.5%
11    Compound volcano    1      0.5%
12    Pyroclastic cone    1      0.5%
13    Subglacial volcano  1      0.5%

vei numeric

7.5% rows beyond 1.5 IQR
rows          200
null          0 (0.0%)
unique        7
min           0.000
max           6.000
mean          2.565
median        3.000
std           1.068
q1            2.000
q3            3.000
iqr           1.000
skew          0.214
kurtosis      0.838
n_outliers    15
outlier_rate  0.075
zero_rate     0.035
Histogram bins for vei (median: 3.0).
bin              count
0 – 0.4286       7
0.4286 – 0.8571  0
0.8571 – 1.286   15
1.286 – 1.714    0
1.714 – 2.143    77
2.143 – 2.571    0
2.571 – 3        0
3 – 3.429        70
3.429 – 3.857    0
3.857 – 4.286    23
4.286 – 4.714    0
4.714 – 5.143    6
5.143 – 5.571    0
5.571 – 6        2
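
Half of the vei bins are empty because seven integer values are spread over 14 equal-width bins of width 6/14 ≈ 0.4286. The sketch below shows where each integer lands. It assumes left-closed bins with the maximum folded into the last bin (the convention `numpy.histogram` uses); saturn's actual binning code is not shown in this report, and `bin_index` is an illustrative helper.

```python
# Per-value row counts implied by the vei histogram in this report.
vei_counts = {0: 7, 1: 15, 2: 77, 3: 70, 4: 23, 5: 6, 6: 2}

N_BINS = 14     # every histogram in this report shows 14 bins
lo, hi = 0, 6   # reported min and max of vei

def bin_index(value):
    """Index of the left-closed equal-width bin; the max goes in the last bin."""
    idx = int((value - lo) * N_BINS / (hi - lo))
    return min(idx, N_BINS - 1)

bin_counts = [0] * N_BINS
for value, n in vei_counts.items():
    bin_counts[bin_index(value)] += n

print(bin_counts)
```

Under that assumption VEI 3 falls in the "3 – 3.429" bin rather than "2.571 – 3", which matches the counts listed above.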

year numeric

rows          200
null          0 (0.0%)
unique        84
min           1900
max           1999
mean          1952
median        1955
std           28.791
q1            1928
q3            1977
iqr           49.250
skew          -0.213
kurtosis      -1.169
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for year (median: 1955.0).
bin          count
1900 – 1907  16
1907 – 1914  14
1914 – 1921  12
1921 – 1928  9
1928 – 1935  12
1935 – 1942  10
1942 – 1950  14
1950 – 1957  15
1957 – 1964  15
1964 – 1971  17
1971 – 1978  16
1978 – 1985  21
1985 – 1992  13
1992 – 1999  16

last_eruption numeric

rows          200
null          0 (0.0%)
unique        84
min           1900
max           1999
mean          1952
median        1955
std           28.791
q1            1928
q3            1977
iqr           49.250
skew          -0.213
kurtosis      -1.169
n_outliers    0
outlier_rate  0.000
zero_rate     0.000
Histogram bins for last_eruption (median: 1955.0).
bin          count
1900 – 1907  16
1907 – 1914  14
1914 – 1921  12
1921 – 1928  9
1928 – 1935  12
1935 – 1942  10
1942 – 1950  14
1950 – 1957  15
1957 – 1964  15
1964 – 1971  17
1971 – 1978  16
1978 – 1985  21
1985 – 1992  13
1992 – 1999  16