saturn

/home/coolhand/html/datavis/data_trove/data/quirky/carnivorous_plants_real.json 610 rows sample n=610 seed 42 2026-05-01T17:28:58+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/data/quirky/carnivorous_plants_real.json
Total rows: 610
Profiled sample: 610
Columns: 14
Generated: 2026-05-01T17:28:58+00:00

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.

Dataset high anthropic:claude-opus-4-7

This dataset holds 610 GBIF biodiversity occurrence records across 14 columns, mixing taxonomy (family, genus, species), geography (country, stateProvince, latitude/longitude), and observation metadata (basisOfRecord, year, month, coordinateUncertainty). Despite the 'carnivorous_plants' filename, the taxonomy is dominated by two unrelated families — Hesperiidae (skipper butterflies) and Canellaceae — each with 300 records, plus a small Araceae tail; this taxonomic split is the first thing worth investigating. Geographically, records skew to the Americas (USA 130, Mexico 73, Brazil 51) but span 35 countries, and 90% are HUMAN_OBSERVATION rather than preserved specimens. Watch coordinateUncertainty closely: it is highly skewed (skew 17.3) with a max of 766,917 m and 22.6% nulls, so any spatial analysis needs filtering. Years are tightly clustered in 2021–2026, indicating a recent-only snapshot.
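The filtering suggested above could look something like the following sketch. It assumes the JSON file is a flat array of record objects keyed by the column names in this report, which the profile does not confirm. The 1,000 m cutoff and both function names are illustrative choices, not saturn recommendations.

```python
import json
from typing import Any

# Hypothetical cutoff in meters; choose one that matches the analysis scale.
MAX_UNCERTAINTY_M = 1000.0


def filter_by_uncertainty(
    records: list[dict[str, Any]],
    max_uncertainty: float = MAX_UNCERTAINTY_M,
    keep_missing: bool = False,
) -> list[dict[str, Any]]:
    """Keep records whose coordinateUncertainty is at most max_uncertainty.

    Records with a missing value are dropped unless keep_missing is True.
    """
    kept = []
    for rec in records:
        value = rec.get("coordinateUncertainty")
        if value is None:
            if keep_missing:
                kept.append(rec)
            continue
        if float(value) <= max_uncertainty:
            kept.append(rec)
    return kept


def load_records(path: str) -> list[dict[str, Any]]:
    # Assumes the file holds a JSON array of objects (an assumption, see above).
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Whether to keep the 22.6% of rows with no uncertainty value is a judgment call: dropping them is the conservative choice, while keeping them preserves coverage at the cost of unknown positional error.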

scientificName high anthropic:claude-opus-4-7

Taxonomic binomials with authorship — almost certainly biodiversity occurrence records keyed by Linnaean scientific name. The distribution is heavily concentrated: 157 distinct taxa across 610 rows, with Canella winterana alone claiming 28.5% (174 records) and a long tail flagged by the profiler. Notably the names mix plants (Canella, Warburgia, Cinnamodendron, Pinellia) with butterflies (Hylephila, Ocybadistes, Urbanus), so this column spans multiple kingdoms rather than a single clade.

species high anthropic:claude-opus-4-7

Categorical taxonomic labels, mostly Linnaean binomials (e.g. Canella winterana, Warburgia salutaris) with family-level names mixed in (Droseraceae, 38 rows; Sarraceniaceae, 19 rows), which suggests inconsistent taxonomic granularity. Neither of those families appears in the family column, which lists only Hesperiidae, Canellaceae, and Araceae. One species, Canella winterana, accounts for 28.5% of the 610 rows (174 records), yet 123 distinct values and an entropy ratio of 0.74 indicate a long tail. The mix of plant species (Cinnamodendron dinisii, Cinnamosma fragrans) and skipper butterfly species (Hylephila phyleus, Ocybadistes walkeri, Urbanus dorantes) is unusual for a single 'species' column.

genus high anthropic:claude-opus-4-7

Categorical genus name with 94 distinct values across 610 rows and no nulls. The distribution is heavy-tailed: 'Canella' alone accounts for 28.5% (174 records), and the top four values appear to be plant genera (Canella, Warburgia, Cinnamosma, Cinnamodendron) while subsequent entries (Urbanus, Hylephila, Burnsius, Pyrgus) are butterfly/skipper genera, suggesting the column mixes taxa from different kingdoms. Entropy ratio of 0.74 reflects moderate concentration around the dominant genus.

family high anthropic:claude-opus-4-7

Categorical column holding taxonomic family labels across 610 rows with only 3 distinct values and no nulls. The distribution is essentially bimodal — Hesperiidae and Canellaceae each appear 300 times (top_rate 0.492) while Araceae appears just 10 times — and notably mixes an animal family (Hesperiidae, skipper butterflies) with two plant families, which is an unusual cross-kingdom blend.
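The cross-kingdom split can be made concrete by tallying the family counts per kingdom. The sketch below uses a small hand-written lookup purely for illustration; GBIF occurrence records normally carry their own kingdom field, but it is not among the 14 columns profiled here.

```python
from collections import Counter

# Family counts as listed in this profile.
family_counts = {"Hesperiidae": 300, "Canellaceae": 300, "Araceae": 10}

# Hand-written lookup for illustration only (not part of the dataset).
FAMILY_TO_KINGDOM = {
    "Hesperiidae": "Animalia",  # skipper butterflies
    "Canellaceae": "Plantae",
    "Araceae": "Plantae",
}

kingdom_counts = Counter()
for family, n in family_counts.items():
    kingdom_counts[FAMILY_TO_KINGDOM.get(family, "unknown")] += n

print(dict(kingdom_counts))  # {'Animalia': 300, 'Plantae': 310}
```

Under this mapping, just under half of the rows (300 of 610) are animal records.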

latitude high anthropic:claude-opus-4-7

This column holds geographic latitudes in decimal degrees, ranging from -43.245933 to 46.704735 with a median of 17.008014. The wide IQR of 47.748 and bimodal-leaning kurtosis of -1.28 suggest observations are spread across both hemispheres rather than clustered in one region. With 466 unique values across 610 rows and no nulls or outliers, coverage is clean but globally dispersed.

longitude high anthropic:claude-opus-4-7

Geographic longitude in decimal degrees, spanning -115.04 to 153.39 across 610 rows with no nulls and 467 unique values. The distribution is right-skewed (1.18) with a median of -63.06 sitting well below the mean of -32.94, suggesting a concentration of points in the Western Hemisphere with a long tail reaching into the Eastern Hemisphere. No outliers flagged, consistent with valid lon bounds.

country high anthropic:claude-opus-4-7

Country of origin or observation, with 35 distinct values across 610 complete rows. The distribution is moderately concentrated: United States of America leads at 21.3% (130 rows), followed by Mexico (73) and Brazil (51), and the entropy ratio of 0.77 indicates a fairly diverse but US-tilted mix. Notable is the prominence of small territories like Guadeloupe (48) and Puerto Rico (37) ranking above larger nations, suggesting a tropical/Americas sampling bias rather than a global population sample.

stateProvince high anthropic:claude-opus-4-7

Holds state or province names for 610 records spanning 108 distinct values across multiple countries (Texas, Florida, Nayarit, Queensland, KwaZulu-Natal). The mix is uneven: Texas alone covers 13.3% of rows, and the categories blend US states, Mexican states, Brazilian states, and a French city ('Pointe-à-Pitre'), suggesting inconsistent administrative granularity. 30 rows carry an empty-string value that null_rate=0 does not flag, and an explicit 'Other' bucket appears 11 times.
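Because empty strings are not counted as nulls, an "effective" missing rate has to be computed separately. A minimal sketch, assuming that None, empty, and whitespace-only values should all count as missing (the function name is hypothetical, and the placeholder label "elsewhere" stands in for the other 107 categories):

```python
from typing import Iterable, Optional


def effective_missing_rate(values: Iterable[Optional[str]]) -> float:
    """Share of values that are None, empty, or whitespace-only."""
    values = list(values)
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or not str(v).strip())
    return missing / len(values)


# Counts from the profile: 30 empty strings and 81 'Texas' among 610 values.
state_province = [""] * 30 + ["Texas"] * 81 + ["elsewhere"] * 499
print(round(effective_missing_rate(state_province), 3))  # 0.049
```

Whether the explicit 'Other' bucket (11 rows) should also count as missing depends on how the field will be used, so the sketch leaves it out.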

locality high anthropic:claude-opus-4-7

Free-text locality descriptions for specimen records. 563 of 610 rows (top_rate 0.923) are empty strings, so the field is effectively blank for the vast majority of records. The 47 non-empty rows hold 28 distinct values (the reported 29 unique values include the empty string). These are long, sentence-length descriptions rather than controlled vocabulary; the most frequent are in French with Malagasy place names (districts, communes, fokontany in Madagascar), alongside Portuguese and Spanish entries. An entropy ratio of 0.154 confirms that the empty value dominates the distribution.

basisOfRecord high anthropic:claude-opus-4-7

Categorical provenance flag from a biodiversity occurrence record (GBIF-style basisOfRecord), with only two values present out of the wider controlled vocabulary. HUMAN_OBSERVATION dominates at 550/610 (90.2%), with PRESERVED_SPECIMEN making up the remaining 60; no nulls. Entropy ratio 0.46 confirms the heavy imbalance.

year high anthropic:claude-opus-4-7

Calendar year of the record, spanning only 2021 to 2026 across 610 rows with 6 distinct values. The distribution is left-skewed (skew -0.80) and concentrated at the recent end: median and Q3 both sit at 2026, with Q1 at 2024.

month high anthropic:claude-opus-4-7

Integer values bounded between 1 and 12 with 12 unique levels strongly indicate a calendar month index. The distribution is heavily front-loaded: the median is 1.0 and Q3 is only 7.0, so at least half the rows fall in January and the skew of 1.00 confirms a long tail toward year-end months. Nulls are negligible (0.16%) and no outliers are flagged.

coordinateUncertainty high anthropic:claude-opus-4-7

Numeric coordinate uncertainty values, almost certainly meters of GPS/locality error attached to occurrence records. The distribution is severely right-skewed (skew 17.3, kurtosis 335.7): the median is 35 but the mean is 6463 and the max reaches 766917, with 19.3% of values flagged as outliers. Roughly 22.6% of rows are null, so coverage is partial.

gbifID high anthropic:claude-opus-4-7

This is the GBIF occurrence identifier: every one of the 610 rows carries a unique numeric ID (n_unique=610, top_rate=0.0016, entropy_ratio≈1.0) with no nulls. Because every value occurs exactly once, the 'top values' list carries no frequency information. The 20 IDs shown fall in the narrow range 5937748304–5937748369, which may indicate that the records were ingested in a single contiguous GBIF batch rather than sampled across time.

Numeric correlation

scientificName categorical

81 singleton categories
rows: 610
null: 0 (0.0%)
unique: 157
top_value: Canella winterana (L.) Gaertn.
top_rate: 0.285
cardinality: 157
entropy: 5.517
entropy_ratio: 0.756
Top values (rank 1–20)
  1. Canella winterana (L.) Gaertn. — 174
  2. Warburgia salutaris (Bertol.fil.) Chiov. — 35
  3. Cinnamodendron dinisii Schwacke — 20
  4. Hylephila phyleus (Drury, 1773) — 18
  5. Cinnamosma Baill. — 17
  6. Cinnamosma fragrans Baill. — 14
  7. Ocybadistes walkeri Heron, 1894 — 11
  8. Cinnamodendron occhionianum F.Barros & J.Salazar — 10
  9. Pinellia fujianensis H.Li & G.H.Zhu — 10
  10. Urbanus proteus (Linnaeus, 1758) — 8
  11. Cinnamosma madagascariensis Danguy — 8
  12. Warburgia ugandensis Sprague — 8
  13. Lerodea eufala (Edwards, 1869) — 7
  14. Burnsius albezens Grishin, 2022 — 7
  15. Lerema Scudder, 1872 — 7
  16. Quasimellana eulogius (Plötz, 1882) — 6
  17. Spicauda procne (Plötz, 1880) — 6
  18. Burnsius oileus (Linnaeus, 1767) — 5
  19. Burnsius orcynoides — 5
  20. Cephrenes augiades (Felder, 1860) — 5

species categorical

rows: 610
null: 0 (0.0%)
unique: 123
top_value: Canella winterana
top_rate: 0.285
cardinality: 123
entropy: 5.144
entropy_ratio: 0.741
Top values (rank 1–20)
  1. Canella winterana — 174
  2. Droseraceae — 38
  3. Warburgia salutaris — 35
  4. Cinnamodendron dinisii — 20
  5. Hylephila phyleus — 19
  6. Sarraceniaceae — 19
  7. Cinnamosma fragrans — 14
  8. Ocybadistes walkeri — 11
  9. Urbanus dorantes — 10
  10. Cinnamodendron occhionianum — 10
  11. Pinellia fujianensis — 10
  12. Pyrgus oileus — 9
  13. Cinnamosma madagascariensis — 9
  14. Mellana eulogius — 8
  15. Urbanus proteus — 8
  16. Warburgia ugandensis — 8
  17. Urbanus procne — 7
  18. Lerodea eufala — 7
  19. Burnsius albezens — 7
  20. Gorgythion begga — 6
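Family-level names such as Droseraceae and Sarraceniaceae can be separated from binomials with a simple shape check. The sketch below assumes a family name is a single capitalised token ending in -aceae (plants) or -idae (animals); the helper name is hypothetical, and a handful of traditional plant family names (for example Compositae) would slip past it.

```python
import re

# Botanical family names usually end in -aceae and zoological ones in -idae.
# A binomial contains whitespace, so it never matches this single-token pattern.
FAMILY_SUFFIX = re.compile(r"^[A-Z][a-z]+(aceae|idae)$")


def looks_like_family(name: str) -> bool:
    """True for a single capitalised token with a family-rank suffix."""
    return bool(FAMILY_SUFFIX.match(name.strip()))


species_values = [
    "Canella winterana",
    "Droseraceae",
    "Warburgia salutaris",
    "Sarraceniaceae",
]
flagged = [s for s in species_values if looks_like_family(s)]
print(flagged)  # ['Droseraceae', 'Sarraceniaceae']
```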

genus categorical

rows: 610
null: 0 (0.0%)
unique: 94
top_value: Canella
top_rate: 0.285
cardinality: 94
entropy: 4.840
entropy_ratio: 0.738
Top values (rank 1–20)
  1. Canella — 174
  2. Warburgia — 43
  3. Cinnamosma — 40
  4. Cinnamodendron — 34
  5. Urbanus — 25
  6. Hylephila — 19
  7. Burnsius — 16
  8. Pyrgus — 11
  9. Lerema — 11
  10. Ocybadistes — 11
  11. Pinellia — 10
  12. Mellana — 8
  13. Trapezites — 8
  14. Heliopetes — 7
  15. Lerodea — 7
  16. Toxidia — 7
  17. Pleodendron — 7
  18. Staphylus — 6
  19. Gorgythion — 6
  20. Polites — 6

family categorical

rows: 610
null: 0 (0.0%)
unique: 3
top_value: Hesperiidae
top_rate: 0.492
cardinality: 3
entropy: 1.104
entropy_ratio: 0.697
Top values (rank 1–20)
  1. Hesperiidae — 300
  2. Canellaceae — 300
  3. Araceae — 10
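The entropy figures for this column can be reproduced from the three counts. The sketch below assumes entropy is Shannon entropy in bits and entropy_ratio is that value divided by log2 of the number of categories. This definition is inferred from the reported numbers rather than taken from saturn's documentation, and the function names are illustrative.

```python
import math


def entropy_bits(counts: list[int]) -> float:
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_ratio(counts: list[int]) -> float:
    """Entropy divided by its maximum, log2 of the number of categories."""
    k = sum(1 for c in counts if c > 0)
    return entropy_bits(counts) / math.log2(k) if k > 1 else 0.0


family = [300, 300, 10]  # Hesperiidae, Canellaceae, Araceae
print(round(entropy_bits(family), 3), round(entropy_ratio(family), 3))  # 1.104 0.697
```

The same definition matches basisOfRecord (counts 550 and 60 give 0.464 for both figures, since log2 of 2 is 1).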

latitude numeric

rows: 610
null: 0 (0.0%)
unique: 466
min: -43.246
max: 46.705
mean: 5.200
median: 17.008
std: 22.746
q1: -22.922
q3: 24.826
iqr: 47.748
skew: -0.652
kurtosis: -1.283
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

longitude numeric

rows: 610
null: 0 (0.0%)
unique: 467
min: -115.044
max: 153.391
mean: -32.935
median: -63.056
std: 78.931
q1: -89.369
q3: 30.839
iqr: 120.208
skew: 1.184
kurtosis: 0.084
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

country categorical

rows: 610
null: 0 (0.0%)
unique: 35
top_value: United States of America
top_rate: 0.213
cardinality: 35
entropy: 3.961
entropy_ratio: 0.772
Top values (rank 1–20)
  1. United States of America — 130
  2. Mexico — 73
  3. Brazil — 51
  4. Guadeloupe — 48
  5. Australia — 47
  6. South Africa — 41
  7. Madagascar — 40
  8. Puerto Rico — 37
  9. Dominican Republic — 16
  10. Panama — 15
  11. Argentina — 14
  12. Singapore — 10
  13. Cayman Islands — 10
  14. Antigua and Barbuda — 10
  15. China — 10
  16. Virgin Islands (U.S.) — 8
  17. Kenya — 8
  18. Hong Kong — 6
  19. Costa Rica — 5
  20. Sint Maarten (Dutch part) — 4

stateProvince categorical

rows: 610
null: 0 (0.0%)
unique: 108
top_value: Texas
top_rate: 0.133
cardinality: 108
entropy: 5.530
entropy_ratio: 0.819
Top values (rank 1–20)
  1. Texas — 81
  2. Florida — 46
  3. Pointe-à-Pitre — 35
  4. Nayarit — 33
  5. (empty string) — 30
  6. Sinaloa — 26
  7. Queensland — 24
  8. KwaZulu-Natal — 17
  9. Santa Catarina — 16
  10. Other — 11
  11. Rio Grande do Sul — 11
  12. Mpumalanga — 10
  13. Mahajanga — 10
  14. Fujian — 10
  15. Cabo Rojo — 9
  16. Limpopo — 9
  17. Toliara — 9
  18. New South Wales — 8
  19. Fajardo — 8
  20. Paraná — 8

locality categorical

17 singleton categories
rows: 610
null: 0 (0.0%)
unique: 29
top_value: (empty string)
top_rate: 0.923
cardinality: 29
entropy: 0.747
entropy_ratio: 0.154
Top values (rank 1–20)
  1. (empty string) — 563
  2. District de Soanierana Ivongo, Commune de Manompana, Fokontany de Vohijiny, Village d'Ambohitsara. Forêt littorale de Sahavalanina, au Sud-Est d'Ambohitsara. — 3
  3. District Mahabo, Commune Analamisandy,Fokontany Soazato, Forêt d'Azohy. Collectés avec: Ando, Tefy, Cécile, Jean Michel. Échantillon préservé en l'alcool. — 3
  4. Région Vatovavy, Kianjavato, Ambodifandramanana, Ankarabo, vestige de forêt au sud du Mt Vatovavy. Echantillons préservés en alcool, récoltés avec équipe polisinala (Auguste, Jean Frédéric). — 3
  5. Antsiranana, SAVA, District de Vohémar, Commune rurale d'Antsirabe-nord, Fokontany d'Andravinambo, foret d'Antsolatra Marojala Sokitra. Plantes préservées en alcool, récoltées avec Bezanaka Jean Honoré. — 3
  6. Région Sofia, District de Mandritsara, commune rurale Marotandrano, fokontany Antsiatsiaka. Foret de Bezavona à 2 km à l'Est du village d'Antsiatsiaka, foret humide sempervirente de moyenne altitude sur latérite. Avec Raharimanana Théo, Ranaivoson Ernest, Marojery Réné chef FKT, Traravola, Rabemalaza Justin, Risy guides locaux. — 3
  7. Distrit Sakaraha Commune Rurale Amboronabo Fokontany Mitia village Belambo Collecté avec Mamomjy, Tariha, Rehary — 3
  8. District Sakaraha, Commune rurale Amboronabo, Fokotany Mitia-Est. Forêt de Herea, au Nord d'Analavelona, sur sable. Hameau le plus proche Belambo. — 3
  9. District Vaingaindrano, Commnune Tsianofana, Fokontany Abaronga, localité Andasibe . Forêt humide de la nouvelle aire protégée d' Agnakatrika. Collecté avec Iakily Armand. — 3
  10. Serra da Farinha-seca, encosta do Morro Sete. — 2
  11. Serra da Graciosa. Encosta próxima ao Recanto Bela Vista. — 2
  12. Estância do Meio. — 2
  13. UTM25_32T_0600_5150 — 1
  14. Parque Estadual da Serra da Baitaca, proximidades da Cachoeira do Samambaia. — 1
  15. Parque Estadual da Serra da Baitaca, — 1
  16. Ca. 700 m al sur de San Francisco de San Isidro, costado sur (del parqueo sur) de la escuela Golden Valley. Remanentes de bosque muy húmedo, en cafetales, casas, potreros y finca de Hammel y Pérez por el Río Tures. — 1
  17. Estação de Tratamento de Água Piraí (ETA Piraí) — 1
  18. Alto Benedito. — 1
  19. Comfloresta. — 1
  20. Sítio Barcelos. Área de PRAD. Propriedade de Vilmar de Lima Barcelos. — 1

basisOfRecord categorical

rows: 610
null: 0 (0.0%)
unique: 2
top_value: HUMAN_OBSERVATION
top_rate: 0.902
cardinality: 2
entropy: 0.464
entropy_ratio: 0.464
Top values (rank 1–20)
  1. HUMAN_OBSERVATION — 550
  2. PRESERVED_SPECIMEN — 60

year numeric

rows: 610
null: 0 (0.0%)
unique: 6
min: 2021
max: 2026
mean: 2025
median: 2026
std: 1.503
q1: 2024
q3: 2026
iqr: 2.000
skew: -0.797
kurtosis: -0.793
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

month numeric

rows: 610
null: 1 (0.2%)
unique: 12
min: 1.000
max: 12.000
mean: 3.752
median: 1.000
std: 3.750
q1: 1.000
q3: 7.000
iqr: 6.000
skew: 1.002
kurtosis: -0.508
n_outliers: 0
outlier_rate: 0.000
zero_rate: 0.000

coordinateUncertainty numeric

22.6% null; skew=+17.30; 19.3% of non-null values beyond 1.5 IQR
rows: 610
null: 138 (22.6%)
unique: 151
min: 1.000
max: 766,917
mean: 6,463
median: 35.000
std: 38,136
q1: 5.000
q3: 466.750
iqr: 461.750
skew: 17.305
kurtosis: 335.690
n_outliers: 91
outlier_rate: 0.193
zero_rate: 0.000
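The outlier figures can be checked by hand from the quartiles. The sketch below assumes "beyond 1.5 IQR" means the usual Tukey fences (q1 - 1.5*iqr and q3 + 1.5*iqr) and that outlier_rate is computed over non-null values; both are inferred from the reported numbers, not confirmed by saturn.

```python
# Quartiles and counts as reported in this profile.
q1, q3 = 5.0, 466.75
rows, nulls, n_outliers = 610, 138, 91

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
print(lower_fence, upper_fence)  # -687.625 1159.375

non_null = rows - nulls
print(round(n_outliers / non_null, 3))  # 0.193, matches outlier_rate
print(round(n_outliers / rows, 3))      # 0.149, share of all rows
```

Since the minimum is 1.000 and the lower fence is negative, all 91 flagged values must lie above the upper fence of roughly 1,159.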

gbifID categorical

610 singleton categories
rows: 610
null: 0 (0.0%)
unique: 610
top_value: 5937748304
top_rate: 1.64e-03
cardinality: 610
entropy: 9.253
entropy_ratio: 1.000
Top values (rank 1–20)
  1. 5937748304 — 1
  2. 5937748308 — 1
  3. 5937748309 — 1
  4. 5937748312 — 1
  5. 5937748316 — 1
  6. 5937748322 — 1
  7. 5937748325 — 1
  8. 5937748327 — 1
  9. 5937748329 — 1
  10. 5937748333 — 1
  11. 5937748335 — 1
  12. 5937748336 — 1
  13. 5937748338 — 1
  14. 5937748342 — 1
  15. 5937748344 — 1
  16. 5937748350 — 1
  17. 5937748352 — 1
  18. 5937748353 — 1
  19. 5937748363 — 1
  20. 5937748369 — 1