saturn

/home/coolhand/html/datavis/data_trove/data/quirky/bioluminescence.json 43,060 rows sample n=43,060 seed 42 2026-06-22T00:42:25+00:00

Overview

Source: /home/coolhand/html/datavis/data_trove/data/quirky/bioluminescence.json
Total rows: 43,060
Profiled sample: 43,060
Columns: 14
Generated: 2026-06-22T00:42:25+00:00

Per-column null rate across the corpus.

| column | kind | null % |
| --- | --- | --- |
| scientificName | categorical | 0.0% |
| genus | categorical | 0.0% |
| family | categorical | 0.0% |
| phylum | categorical | 0.0% |
| class | categorical | 0.0% |
| order | categorical | 0.0% |
| latitude | numeric | 0.0% |
| longitude | numeric | 0.0% |
| depth | numeric | 24.8% |
| date | text | 12.0% |
| year | categorical | 42.2% |
| country | categorical | 0.0% |
| dataset | categorical | 0.0% |
| bioluminescence_group | categorical | 0.0% |

Insights opt-in

Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:default.

Dataset medium anthropic:default

This dataset contains 43,060 occurrence records of bioluminescent marine organisms, covering 26 named groups across 7 phyla (from dinoflagellates and jellyfish to krill and bacteria), with geographic coordinates, taxonomy, and sampling depth. The most notable issue is depth: it has a 24.75% null rate, extreme skew (max 10,000 m vs. median 52.5 m), and over 10% outliers, so depth-based analysis needs careful filtering before any conclusions are drawn. A second area to investigate is geographic bias: over 63% of country values are blank, yet Australia, the United States, Peru, and Canada dominate the named entries, suggesting strong regional over-representation in the sourced datasets. The year column also carries a 42% null rate, which limits time-trend analysis even though the records span at least 1962 to 2020.

date high anthropic:default

This column contains date strings stored as text, in multiple formats: year ranges ('1962/1964'), year-month ranges ('2010-05/2010-06'), year-month ('2013-08'), full ISO dates ('2017-05-30'), and ISO datetimes with or without a 'Z' suffix ('2016-09-20T10:18:00Z', '2017-04-29T00:01:00'). The heterogeneous format mix and wide length range (min 4, max 51 chars) will require normalization before any temporal analysis. The 67.4% duplicate rate is measured over the 37,878 non-null values (25,540 duplicates, 12,338 unique), so many records share the same date string, which is consistent with records entered in batches. The 12.0% null rate is also notable and should be investigated for systematic missingness.
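One way to approach the normalization is to keep only the start of any interval and try a short list of patterns from most to least specific. The sketch below is an illustration under that assumption, not something saturn does; `parse_start` is a hypothetical helper and it only covers the formats quoted in this profile.

```python
from datetime import datetime
from typing import Optional

# Formats seen in the profile, tried from most to least specific.
_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
    "%Y-%m",
    "%Y",
)


def parse_start(raw: Optional[str]) -> Optional[datetime]:
    """Return the start of a date or date interval, or None if unparseable."""
    if raw is None or not raw.strip():
        return None
    # Intervals such as '1962/1964' use '/' as the separator; keep the start.
    start = raw.strip().split("/", 1)[0]
    # Drop a trailing 'Z' so one pattern covers both datetime forms.
    if start.endswith("Z"):
        start = start[:-1]
    for fmt in _FORMATS:
        try:
            return datetime.strptime(start, fmt)
        except ValueError:
            continue
    return None


print(parse_start("2010-05/2010-06"))  # 2010-05-01 00:00:00
```

Values that match none of the patterns come back as `None`, so they can be counted and inspected rather than silently coerced.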

depth high anthropic:default

This column records depth, presumably sampling depth below the sea surface in metres (the dataset summary quotes the maximum as 10,000 m). Values range from -53.0 to 10,000.0 with a median of only 52.5, meaning most records are shallow, but a long tail of deep measurements drives extreme skew (4.72) and very high kurtosis (35.89). The null rate of 24.75% and outlier rate of 10.63% (3,444 of the 32,402 non-null rows) are both flagged as alerts, and 11.92% of values are exactly zero, suggesting possible default-fill or surface-level records that may need special handling. Negative values (minimum -53.0) also need checking, since they may be sign errors or heights above the surface. The IQR of 313.5 against a std of 570.18 confirms the heavy-tailed distribution driven by a minority of extreme deep values up to 10,000.0.
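The outlier alert is worded as "rows beyond 1.5 IQR". Assuming that means the standard Tukey fences (an assumption, since saturn's exact rule is not shown here), the cut-offs can be worked out from the reported quartiles alone:

```python
# Quartiles reported for depth in the profile.
Q1, Q3 = 7.5, 321.0


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = tukey_fences(Q1, Q3)
print(lower, upper)  # -462.75 791.25
```

Under that assumption the lower fence sits below the observed minimum of -53.0, so every flagged row would be on the deep side, above 791.25.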

year high anthropic:default

This column represents a calendar year associated with each record, stored as a categorical type despite being numeric in nature. The top value is '2000' with 1,287 occurrences (about 5.2% of the 24,896 non-null rows, or 3.0% of all rows), and the 137 unique values span a range that includes at least 1979 through 2020. Two signals stand out: the null rate is 42.18%, which is severe enough to warrant an alert, and the year 2000 is notably over-represented relative to adjacent years (e.g., 2001 has only 703), suggesting either a data collection artifact or a large batch of undated records defaulted to that year.
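The profile reports two different rates for the top year: a top_rate of 0.052 and a 3.0% share in the top-values table. The arithmetic below uses only counts from this report and suggests the two differ by denominator (non-null rows versus all rows); that reading is an inference, not something saturn documents here.

```python
ROWS = 43_060
NULLS = 18_164
TOP_COUNT = 1_287  # rows where year == '2000'

non_null = ROWS - NULLS
share_of_all = TOP_COUNT / ROWS
share_of_non_null = TOP_COUNT / non_null

print(non_null)                    # 24896
print(f"{share_of_all:.1%}")       # 3.0%
print(f"{share_of_non_null:.1%}")  # 5.2%
```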

latitude high anthropic:default

This column contains geographic latitude values, spanning from -76.619° (high southern latitudes) to 88.29° (near the North Pole), with 14,146 unique values across 43,060 rows, so many locations repeat. The mean (19.1°) is well below the median (36.7°), and the IQR of 69.6° (q1 -19.3°, q3 50.3°) shows records in both hemispheres. The distribution is not uniform, though: the histogram puts 20,499 rows (47.6%) between 30.57° and 59.43°, with a secondary peak of 2,736 rows between -35.39° and -31.27°. The negative skew (-0.66) and zero flagged outliers reflect that broad, multi-peaked spread rather than a flat one. The 0.046% zero rate (≈20 rows) warrants inspection, as 0.0° latitude may represent missing/default values.

longitude high anthropic:default

This column contains geographic longitude values, spanning nearly the full valid range from -179.9987 to 179.99 degrees, so coverage is global. Skew (0.138) and kurtosis (-0.646) are low and the IQR is 124.12 degrees, but the histogram is not flat: the three 9-degree bins from about -9 to 18 hold 10,780 rows (25.0%), with further peaks at 144 to 153 (3,177 rows) and -72 to -54 (4,583 rows). The mean (9.64) is higher than the median (3.06), hinting at a slight eastward bias in the dataset. A zero_rate of 0.11% (≈48 rows) warrants a check for null-substituted zeros masquerading as the Prime Meridian.
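A quick way to follow up on the zero-rate warnings for latitude and longitude is to list the rows where either coordinate is exactly 0.0 and inspect them by hand. The sketch below assumes each record is a mapping with `latitude` and `longitude` keys (matching the column names in this profile); the function name and the toy records are hypothetical.

```python
from typing import Iterable, Mapping, Optional


def zero_coordinate_rows(
    records: Iterable[Mapping[str, Optional[float]]],
) -> list[int]:
    """Return the indexes of records with latitude or longitude exactly 0.0."""
    return [
        i
        for i, rec in enumerate(records)
        if rec.get("latitude") == 0.0 or rec.get("longitude") == 0.0
    ]


# Toy records for illustration only, not taken from the dataset.
toy = [
    {"latitude": 0.0, "longitude": 0.0},
    {"latitude": 36.7, "longitude": 3.06},
    {"latitude": -19.3, "longitude": 0.0},
]
print(zero_coordinate_rows(toy))  # [0, 2]
```

Rows where both coordinates are zero are the strongest candidates for default-filled values; a zero in only one coordinate can be a genuine position on the Equator or the Prime Meridian.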

bioluminescence_group high anthropic:default

This column classifies observations into one of 26 named bioluminescent organism groups (e.g., 'Dinoflagellate', 'Crystal jelly (source of GFP)', 'Krill (many species bioluminescent)'), covering marine taxa from dinoflagellates to jellyfish and crustaceans. The distribution is close to uniform: the top category 'Dinoflagellate' holds only 9.3% of rows (4,000), and the entropy ratio of 0.95 (near-maximum for 26 categories) indicates near-flat class balance. There are no nulls across 43,060 rows, and 18 of the 20 listed categories have exactly 2,000 records (the exceptions are 'Dinoflagellate' at 4,000 and 'Boring clam (piddock)' at 928), so the dataset appears deliberately balanced or synthetically constructed.

class high anthropic:default

This column contains biological taxonomic class labels for marine/aquatic organisms, with 13 distinct classes across 43,060 records and zero nulls. The entropy ratio of 0.927 indicates a fairly even distribution across classes, though imbalance exists: 'Dinophyceae' is the most frequent at 18.6% (8,000 records), 'Scyphozoa' and 'Malacostraca' each hold exactly 6,000 records, and 'Ostracoda' and 'Gammaproteobacteria' each hold exactly 4,000, suggesting some classes may have been deliberately sampled to round numbers. The mix spans dinoflagellates (Dinophyceae), bacteria (Gammaproteobacteria), crustaceans (Malacostraca, Ostracoda, Copepoda), molluscs (Cephalopoda, Bivalvia), cnidarians (Scyphozoa, Hydrozoa, Octocorallia), ctenophores (Nuda, Tentaculata), and annelids (Polychaeta), pointing to a plankton or marine biodiversity dataset.

country high anthropic:default

This column captures the country associated with each record, with 130 distinct values across 43,060 rows. The dominant 'value' is an empty string, accounting for 63.7% of all records (27,422 rows). Because blanks are stored as empty strings rather than nulls, the reported 0.0% null rate hides what is functionally a very high missing rate, a critical data quality issue. The remaining values show inconsistent normalisation: mixed casing ('PERU', 'SOVIET UNION' vs. 'Australia', 'Canada'), two-letter codes alongside full names ('GB', 'FR'), duplicate spellings of the same country ('United States' and 'USA'), and anachronistic entities ('SOVIET UNION'), suggesting data was collected from heterogeneous or historical sources without standardisation.
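A cleanup pass for this column could map blanks to a real missing value and fold known variants onto one spelling. The sketch below is one possible shape for that, not a prescribed fix: `ALIASES` is a hypothetical table covering only variants visible in the top-20 list, and historical entities such as 'SOVIET UNION' are deliberately left untouched because they have no single modern equivalent.

```python
from typing import Optional

# Hypothetical alias table; keys are case-folded raw values.
ALIASES = {
    "usa": "United States",
    "united states": "United States",
    "gb": "United Kingdom",
    "fr": "France",
    "peru": "Peru",
}


def normalize_country(raw: Optional[str]) -> Optional[str]:
    """Map blank values to None and known variants to one spelling."""
    if raw is None or not raw.strip():
        return None
    cleaned = raw.strip()
    # Unknown values pass through unchanged so nothing is silently lost.
    return ALIASES.get(cleaned.casefold(), cleaned)


for value in ["", "USA", "PERU", "GB", "Australia", "SOVIET UNION"]:
    print(repr(value), "->", repr(normalize_country(value)))
```

With a mapping like this, the 1,416 'United States' rows and 323 'USA' rows would be counted together as 1,739.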

dataset high anthropic:default

This column identifies the source dataset or survey program for each observation, with 214 distinct values across 43,060 rows (213 named sources plus the empty string). The dominant value is an empty string, accounting for 61.1% of all records (26,317 rows), meaning the majority of observations carry no dataset attribution, a significant data quality concern that the reported 0.0% null rate does not capture. The remaining records span named marine/environmental monitoring programs (trawl surveys, coastal monitoring, jellyfish sightings, etc.), with the largest named source ('Environmental Monitoring database (MOD) DNV') covering only ~4% of rows.
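Since the null rate in this profile does not count empty strings, an "effective" missing rate that treats blanks as missing may be more useful for columns like this one. A minimal sketch, with `missing_rate` as a hypothetical helper and a toy column rather than real data:

```python
from typing import Iterable, Optional


def missing_rate(values: Iterable[Optional[str]]) -> float:
    """Share of values that are None or blank after stripping whitespace."""
    total = 0
    missing = 0
    for value in values:
        total += 1
        if value is None or not value.strip():
            missing += 1
    return missing / total if total else 0.0


# Toy column for illustration: three of five values are blank or None.
toy = ["CPR", "", None, "  ", "Atlantic Reference Centre"]
print(missing_rate(toy))  # 0.6
```

Applied to the counts in this report, the same definition gives 26,317 / 43,060 (61.1%) for dataset and 27,422 / 43,060 (63.7%) for country.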

family high anthropic:default

This column contains biological family-level taxonomic classifications, covering 22 distinct families across 43,060 records with no nulls. The distribution is notably near-uniform for the top entries: four families (Pyrocystaceae, Euphausiidae, Cypridinidae, Vibrionaceae) each have exactly 4,000 records, strongly suggesting deliberate stratified sampling or synthetic balancing rather than natural occurrence frequencies. The high entropy ratio of 0.932 (close to maximum for 22 categories) confirms an unusually flat distribution. Families represented include bioluminescent marine organisms (dinoflagellates, krill, ostracods, bacteria), hinting this is a marine bioluminescence or plankton dataset.

genus high anthropic:default

This column contains biological genus names for marine organisms — including bioluminescent dinoflagellates (Noctiluca, Pyrocystis, Lingulodinium, Alexandrium), jellyfish (Pelagia, Atolla, Periphylla), and ctenophores (Mnemiopsis, Beroe) — suggesting a marine biology or bioluminescence dataset. With 27 unique genera across 43,060 rows and no nulls, the distribution is remarkably flat: every visible top value appears exactly 2,000 times, implying a deliberately balanced or stratified dataset. The entropy ratio of 0.9586 is very high for only 27 categories, confirming near-uniform representation across genera. No skew or imbalance alerts were triggered.

order high anthropic:default

This column contains biological taxonomic order classifications for marine organisms, with 17 distinct values: 16 named orders spanning bacteria (Vibrionales), dinoflagellates (Gonyaulacales, Noctilucales), jellyfish (Coronatae, Leptothecata), crustaceans (Euphausiacea, Calanoida), and cephalopods (Oegopsida) among others, plus an empty string. Gonyaulacales is the largest at 13.9% (6,000 rows) and four orders share exactly 4,000 rows, suggesting possible stratified or quota-based sampling rather than natural observation frequencies. The entropy ratio of 0.949 indicates a near-uniform spread across the 17 values, which is unusually high for a taxonomic label and reinforces the structured-sampling hypothesis. No nulls are reported, but 2,000 rows (4.6%) carry an empty order string and are effectively missing.

phylum high anthropic:default

This column encodes biological phylum classifications across 43,060 records with exactly 7 distinct values and no nulls, making it a clean taxonomic label field. The distribution spans both animal (Arthropoda, Cnidaria, Ctenophora, Mollusca, Annelida) and non-animal (Myzozoa, Proteobacteria) kingdoms, suggesting the dataset covers a broad range of marine or environmental organisms. Arthropoda dominates at 28.6% (12,297 records), while the entropy ratio of 0.923 indicates a fairly well-spread distribution across categories. The presence of Proteobacteria (bacteria) and Myzozoa (protists) alongside metazoans may surprise analysts expecting a purely animal-focused dataset.
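The entropy and entropy_ratio figures can be reproduced from the phylum counts listed in this report, assuming entropy is Shannon entropy in bits and the ratio divides by log2 of the cardinality. That definition is an assumption (saturn's formula is not shown here), but it is consistent with the reported values.

```python
import math
from typing import Iterable

# Phylum counts as listed in the top-values table of this profile.
PHYLUM_COUNTS = {
    "Arthropoda": 12_297,
    "Cnidaria": 8_874,
    "Myzozoa": 8_000,
    "Ctenophora": 4_168,
    "Proteobacteria": 4_000,
    "Mollusca": 3_721,
    "Annelida": 2_000,
}


def entropy_bits(counts: Iterable[int]) -> float:
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


h = entropy_bits(PHYLUM_COUNTS.values())
# Maximum entropy for k categories is log2(k), reached when all are equal.
ratio = h / math.log2(len(PHYLUM_COUNTS))
print(f"entropy={h:.4f} ratio={ratio:.4f}")
```

The result should land close to the reported entropy of 2.593 and entropy_ratio of 0.923; a ratio of 1.0 would mean all seven phyla are equally frequent.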

scientificName high anthropic:default

This column contains scientific (Latin) names of marine organisms, covering 245 distinct taxa across 43,060 records with no nulls. The values span both genus-only entries (e.g., 'Lingulodinium', 'Photobacterium', 'Vibrio') and full binomial species names, suggesting inconsistent taxonomic resolution across records. The top value 'Mnemiopsis leidyi' appears 2,000 times (~4.6% of rows), and the top 10 values together account for 17,430 records (about 40.5% of rows), so the dataset is dominated by a relatively small set of taxa. The entropy ratio of 0.747 confirms moderate-to-high concentration for a 245-cardinality field.

Numeric correlation

Pearson correlation across 3 numeric columns (values clipped to 2 decimals).

| | latitude | longitude | depth |
| --- | --- | --- | --- |
| latitude | +1.00 | -0.33 | -0.06 |
| longitude | -0.33 | +1.00 | +0.00 |
| depth | -0.06 | +0.00 | +1.00 |
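For reference, a Pearson coefficient like the ones in this matrix can be computed as below. This is a generic sketch of the formula, not saturn's code; in particular, dropping pairs where either value is missing is an assumption about how the 24.8% null depth values might be handled.

```python
import math
from typing import Optional, Sequence


def pearson(
    xs: Sequence[Optional[float]], ys: Sequence[Optional[float]]
) -> float:
    """Pearson r over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


# Toy values: the pair containing None is ignored.
print(pearson([1.0, 2.0, 3.0, None], [2.0, 4.0, 6.0, 8.0]))  # 1.0
```

Pearson only measures linear association, so the near-zero depth coefficients do not rule out a non-linear relationship between depth and position.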

scientificName categorical

rows: 43,060
null: 0 (0.0%)
unique: 245
top_value: Mnemiopsis leidyi
top_rate: 0.046
cardinality: 245
entropy: 5.928
entropy_ratio: 0.747

Top values for scientificName (20 unique shown, of 245 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Mnemiopsis leidyi | 2,000 | 4.6% |
| 2 | Lingulodinium | 1,976 | 4.6% |
| 3 | Meganyctiphanes norvegica | 1,928 | 4.5% |
| 4 | Photobacterium | 1,842 | 4.3% |
| 5 | Periphylla periphylla | 1,802 | 4.2% |
| 6 | Pelagia noctiluca | 1,768 | 4.1% |
| 7 | Noctiluca scintillans | 1,728 | 4.0% |
| 8 | Vibrio | 1,584 | 3.7% |
| 9 | Vargula norvegica | 1,482 | 3.4% |
| 10 | Cypridina dentata | 1,320 | 3.1% |
| 11 | Euphausia superba | 1,298 | 3.0% |
| 12 | Chaetopterus variopedatus | 1,222 | 2.8% |
| 13 | Beroe | 1,202 | 2.8% |
| 14 | Oplophorus spinosus | 1,170 | 2.7% |
| 15 | Histioteuthis | 952 | 2.2% |
| 16 | Alexandrium | 944 | 2.2% |
| 17 | Metridia lucens | 872 | 2.0% |
| 18 | Aequorea | 798 | 1.9% |
| 19 | Atolla wyvillei | 756 | 1.8% |
| 20 | Pyrocystis pseudonoctiluca | 742 | 1.7% |

genus categorical

rows: 43,060
null: 0 (0.0%)
unique: 27
top_value: Noctiluca
top_rate: 0.046
cardinality: 27
entropy: 4.558
entropy_ratio: 0.959

Top values for genus (20 unique shown, of 27 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Noctiluca | 2,000 | 4.6% |
| 2 | Pyrocystis | 2,000 | 4.6% |
| 3 | Lingulodinium | 2,000 | 4.6% |
| 4 | Alexandrium | 2,000 | 4.6% |
| 5 | Aequorea | 2,000 | 4.6% |
| 6 | Pelagia | 2,000 | 4.6% |
| 7 | Mnemiopsis | 2,000 | 4.6% |
| 8 | Atolla | 2,000 | 4.6% |
| 9 | Periphylla | 2,000 | 4.6% |
| 10 | Beroe | 2,000 | 4.6% |
| 11 | Euphausia | 2,000 | 4.6% |
| 12 | Meganyctiphanes | 2,000 | 4.6% |
| 13 | Metridia | 2,000 | 4.6% |
| 14 | Oplophorus | 2,000 | 4.6% |
| 15 | Vargula | 2,000 | 4.6% |
| 16 | Cypridina | 2,000 | 4.6% |
| 17 | Histioteuthis | 2,000 | 4.6% |
| 18 | Vibrio | 2,000 | 4.6% |
| 19 | Photobacterium | 2,000 | 4.6% |
| 20 | Chaetopterus | 2,000 | 4.6% |

family categorical

rows: 43,060
null: 0 (0.0%)
unique: 22
top_value: Pyrocystaceae
top_rate: 0.093
cardinality: 22
entropy: 4.157
entropy_ratio: 0.932

Top values for family (20 unique shown, of 22 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Pyrocystaceae | 4,000 | 9.3% |
| 2 | Euphausiidae | 4,000 | 9.3% |
| 3 | Cypridinidae | 4,000 | 9.3% |
| 4 | Vibrionaceae | 4,000 | 9.3% |
| 5 | Metridinidae | 2,297 | 5.3% |
| 6 | Noctilucaceae | 2,000 | 4.6% |
| 7 | Lingulodiniaceae | 2,000 | 4.6% |
| 8 | Aequoreidae | 2,000 | 4.6% |
| 9 | Pelagiidae | 2,000 | 4.6% |
| 10 | Bolinopsidae | 2,000 | 4.6% |
| 11 | Atollidae | 2,000 | 4.6% |
| 12 | Periphyllidae | 2,000 | 4.6% |
| 13 | Beroidae | 2,000 | 4.6% |
| 14 | Oplophoridae | 2,000 | 4.6% |
| 15 | Histioteuthidae | 2,000 | 4.6% |
| 16 | Chaetopteridae | 2,000 | 4.6% |
| 17 | Pholadidae | 928 | 2.2% |
| 18 | Renillidae | 874 | 2.0% |
| 19 | Vampyroteuthidae | 484 | 1.1% |
| 20 | Thysanoteuthidae | 209 | 0.5% |

phylum categorical

rows: 43,060
null: 0 (0.0%)
unique: 7
top_value: Arthropoda
top_rate: 0.286
cardinality: 7
entropy: 2.593
entropy_ratio: 0.923

Top values for phylum (7 unique shown, of 7 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Arthropoda | 12,297 | 28.6% |
| 2 | Cnidaria | 8,874 | 20.6% |
| 3 | Myzozoa | 8,000 | 18.6% |
| 4 | Ctenophora | 4,168 | 9.7% |
| 5 | Proteobacteria | 4,000 | 9.3% |
| 6 | Mollusca | 3,721 | 8.6% |
| 7 | Annelida | 2,000 | 4.6% |

class categorical

rows: 43,060
null: 0 (0.0%)
unique: 13
top_value: Dinophyceae
top_rate: 0.186
cardinality: 13
entropy: 3.430
entropy_ratio: 0.927

Top values for class (13 unique shown, of 13 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Dinophyceae | 8,000 | 18.6% |
| 2 | Scyphozoa | 6,000 | 13.9% |
| 3 | Malacostraca | 6,000 | 13.9% |
| 4 | Ostracoda | 4,000 | 9.3% |
| 5 | Gammaproteobacteria | 4,000 | 9.3% |
| 6 | Cephalopoda | 2,793 | 6.5% |
| 7 | Copepoda | 2,297 | 5.3% |
| 8 | Tentaculata | 2,168 | 5.0% |
| 9 | Hydrozoa | 2,000 | 4.6% |
| 10 | Nuda | 2,000 | 4.6% |
| 11 | Polychaeta | 2,000 | 4.6% |
| 12 | Bivalvia | 928 | 2.2% |
| 13 | Octocorallia | 874 | 2.0% |

order categorical

rows: 43,060
null: 0 (0.0%)
unique: 17
top_value: Gonyaulacales
top_rate: 0.139
cardinality: 17
entropy: 3.879
entropy_ratio: 0.949

Top values for order (17 unique shown, of 17 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Gonyaulacales | 6,000 | 13.9% |
| 2 | Coronatae | 4,000 | 9.3% |
| 3 | Euphausiacea | 4,000 | 9.3% |
| 4 | Myodocopida | 4,000 | 9.3% |
| 5 | Vibrionales | 4,000 | 9.3% |
| 6 | Oegopsida | 2,309 | 5.4% |
| 7 | Calanoida | 2,297 | 5.3% |
| 8 | Lobata | 2,168 | 5.0% |
| 9 | Noctilucales | 2,000 | 4.6% |
| 10 | Leptothecata | 2,000 | 4.6% |
| 11 | Semaeostomeae | 2,000 | 4.6% |
| 12 | Beroida | 2,000 | 4.6% |
| 13 | Decapoda | 2,000 | 4.6% |
| 14 | (empty string) | 2,000 | 4.6% |
| 15 | Myida | 928 | 2.2% |
| 16 | Scleralcyonacea | 874 | 2.0% |
| 17 | Vampyromorpha | 484 | 1.1% |

latitude numeric

rows: 43,060
null: 0 (0.0%)
unique: 14,146
min: -76.619
max: 88.290
mean: 19.105
median: 36.710
std: 40.266
q1: -19.308
q3: 50.303
iqr: 69.612
skew: -0.661
kurtosis: -0.936
n_outliers: 0
outlier_rate: 0.000
zero_rate: 4.64e-04

Histogram bins for latitude (median: 36.710105896).

| bin | count |
| --- | --- |
| -76.62 – -72.5 | 34 |
| -72.5 – -68.37 | 134 |
| -68.37 – -64.25 | 770 |
| -64.25 – -60.13 | 872 |
| -60.13 – -56.01 | 500 |
| -56.01 – -51.88 | 309 |
| -51.88 – -47.76 | 279 |
| -47.76 – -43.64 | 377 |
| -43.64 – -39.51 | 1,598 |
| -39.51 – -35.39 | 900 |
| -35.39 – -31.27 | 2,736 |
| -31.27 – -27.15 | 1,218 |
| -27.15 – -23.02 | 589 |
| -23.02 – -18.9 | 615 |
| -18.9 – -14.78 | 671 |
| -14.78 – -10.66 | 768 |
| -10.66 – -6.533 | 598 |
| -6.533 – -2.41 | 504 |
| -2.41 – 1.713 | 319 |
| 1.713 – 5.836 | 199 |
| 5.836 – 9.958 | 628 |
| 9.958 – 14.08 | 953 |
| 14.08 – 18.2 | 793 |
| 18.2 – 22.33 | 744 |
| 22.33 – 26.45 | 566 |
| 26.45 – 30.57 | 783 |
| 30.57 – 34.69 | 1,840 |
| 34.69 – 38.82 | 2,424 |
| 38.82 – 42.94 | 2,931 |
| 42.94 – 47.06 | 3,500 |
| 47.06 – 51.19 | 4,244 |
| 51.19 – 55.31 | 3,508 |
| 55.31 – 59.43 | 2,052 |
| 59.43 – 63.55 | 1,070 |
| 63.55 – 67.68 | 764 |
| 67.68 – 71.8 | 1,560 |
| 71.8 – 75.92 | 532 |
| 75.92 – 80.04 | 94 |
| 80.04 – 84.17 | 52 |
| 84.17 – 88.29 | 32 |

longitude numeric

rows: 43,060
null: 0 (0.0%)
unique: 14,637
min: -179.999
max: 179.990
mean: 9.640
median: 3.057
std: 88.609
q1: -60.186
q3: 63.933
iqr: 124.119
skew: 0.138
kurtosis: -0.646
n_outliers: 0
outlier_rate: 0.000
zero_rate: 1.11e-03

Histogram bins for longitude (median: 3.05735505).

| bin | count |
| --- | --- |
| -180 – -171 | 405 |
| -171 – -162 | 653 |
| -162 – -153 | 381 |
| -153 – -144 | 284 |
| -144 – -135 | 177 |
| -135 – -126 | 485 |
| -126 – -117 | 1,914 |
| -117 – -108 | 117 |
| -108 – -99 | 151 |
| -99 – -90 | 289 |
| -90 – -81 | 785 |
| -81 – -72 | 1,676 |
| -72 – -63 | 2,314 |
| -63 – -54 | 2,269 |
| -54 – -45 | 643 |
| -45 – -36 | 680 |
| -36 – -27 | 556 |
| -27 – -18 | 530 |
| -18 – -9.004 | 1,463 |
| -9.004 – -0.004333 | 3,887 |
| -0.004333 – 8.995 | 4,520 |
| 8.995 – 18 | 2,373 |
| 18 – 26.99 | 1,671 |
| 26.99 – 35.99 | 2,361 |
| 35.99 – 44.99 | 834 |
| 44.99 – 53.99 | 313 |
| 53.99 – 62.99 | 501 |
| 62.99 – 71.99 | 696 |
| 71.99 – 80.99 | 561 |
| 80.99 – 89.99 | 360 |
| 89.99 – 98.99 | 288 |
| 98.99 – 108 | 70 |
| 108 – 117 | 753 |
| 117 – 126 | 480 |
| 126 – 135 | 964 |
| 135 – 144 | 761 |
| 144 – 153 | 3,177 |
| 153 – 162 | 1,422 |
| 162 – 171 | 582 |
| 171 – 180 | 714 |

depth numeric

24.8% null; skew=+4.72; 10.6% rows beyond 1.5 IQR
rows: 43,060
null: 10,658 (24.8%)
unique: 3,283
min: -53.000
max: 10,000
mean: 281.209
median: 52.500
std: 570.178
q1: 7.500
q3: 321.000
iqr: 313.500
skew: 4.724
kurtosis: 35.887
n_outliers: 3,444
outlier_rate: 0.106
zero_rate: 0.119

Histogram bins for depth (median: 52.5).

| bin | count |
| --- | --- |
| -53 – 198.3 | 21,893 |
| 198.3 – 449.6 | 4,443 |
| 449.6 – 701 | 1,966 |
| 701 – 952.3 | 1,504 |
| 952.3 – 1204 | 1,070 |
| 1204 – 1455 | 303 |
| 1455 – 1706 | 226 |
| 1706 – 1958 | 182 |
| 1958 – 2209 | 255 |
| 2209 – 2460 | 95 |
| 2460 – 2712 | 111 |
| 2712 – 2963 | 57 |
| 2963 – 3214 | 55 |
| 3214 – 3466 | 56 |
| 3466 – 3717 | 42 |
| 3717 – 3968 | 24 |
| 3968 – 4220 | 31 |
| 4220 – 4471 | 20 |
| 4471 – 4722 | 6 |
| 4722 – 4974 | 14 |
| 4974 – 5225 | 12 |
| 5225 – 5476 | 14 |
| 5476 – 5727 | 6 |
| 5727 – 5979 | 2 |
| 5979 – 6230 | 4 |
| 6230 – 6481 | 2 |
| 6481 – 6733 | 0 |
| 6733 – 6984 | 0 |
| 6984 – 7235 | 0 |
| 7235 – 7487 | 0 |
| 7487 – 7738 | 3 |
| 7738 – 7989 | 0 |
| 7989 – 8241 | 0 |
| 8241 – 8492 | 0 |
| 8492 – 8743 | 2 |
| 8743 – 8995 | 0 |
| 8995 – 9246 | 0 |
| 9246 – 9497 | 0 |
| 9497 – 9749 | 0 |
| 9749 – 1e+04 | 4 |

date text

97.0% rows are a single word; 100.0% rows are all-caps; 67.4% duplicate strings
rows: 43,060
null: 5,182 (12.0%)
unique: 12,338
len_min: 4
len_max: 51
len_mean: 16.448
len_median: 19.000
len_p95: 39.000
word_mean: 1.030
word_median: 1.000
n_empty: 0
n_duplicates: 25,540
duplicate_rate: 0.674
vocab_size: 10,135
readability_flesch_mean: 121.200
emoji_rate: 0.000
url_rate: 0.000
one_word_rate: 0.970
allcaps_rate: 1.000
boilerplate_rate: 0.000

Character-length distribution for date (mean: 16.4484133269972).

| chars | count |
| --- | --- |
| 4 – 5 | 276 |
| 5 – 6 | 11 |
| 6 – 8 | 1,316 |
| 8 – 9 | 78 |
| 9 – 10 | 573 |
| 10 – 11 | 13,982 |
| 11 – 12 | 0 |
| 12 – 13 | 4 |
| 13 – 15 | 0 |
| 15 – 16 | 1,820 |
| 16 – 17 | 691 |
| 17 – 18 | 49 |
| 18 – 19 | 3,297 |
| 19 – 20 | 11,038 |
| 20 – 22 | 392 |
| 22 – 23 | 858 |
| 23 – 24 | 102 |
| 24 – 25 | 993 |
| 25 – 26 | 0 |
| 26 – 28 | 15 |
| 28 – 29 | 100 |
| 29 – 30 | 126 |
| 30 – 31 | 12 |
| 31 – 32 | 0 |
| 32 – 33 | 224 |
| 33 – 35 | 0 |
| 35 – 36 | 0 |
| 36 – 37 | 1 |
| 37 – 38 | 0 |
| 38 – 39 | 1,632 |
| 39 – 40 | 2 |
| 40 – 42 | 226 |
| 42 – 43 | 0 |
| 43 – 44 | 0 |
| 44 – 45 | 18 |
| 45 – 46 | 0 |
| 46 – 47 | 0 |
| 47 – 49 | 0 |
| 49 – 50 | 0 |
| 50 – 51 | 42 |
Sample values (first 10)
  1. 2016-05-20
  2. 1996-07-04
  3. 2016-09-20T10:18:00Z
  4. 2015-01-18T14:35:00Z
  5. 2017-08-19
  6. 2017-04-29T00:01:00
  7. 2018-08-30
  8. 2020-11-05T16:06:00Z
  9. 2013-08-05T14:15:00Z
  10. 2012-03-24

year categorical

42.2% null
rows: 43,060
null: 18,164 (42.2%)
unique: 137
top_value: 2000
top_rate: 0.052
cardinality: 137
entropy: 6.142
entropy_ratio: 0.865

Top values for year (20 unique shown, of 137 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | 2000 | 1,287 | 3.0% |
| 2 | 2001 | 703 | 1.6% |
| 3 | 2016 | 691 | 1.6% |
| 4 | 2008 | 688 | 1.6% |
| 5 | 2010 | 651 | 1.5% |
| 6 | 2002 | 579 | 1.3% |
| 7 | 2013 | 556 | 1.3% |
| 8 | 2011 | 554 | 1.3% |
| 9 | 1979 | 524 | 1.2% |
| 10 | 2014 | 519 | 1.2% |
| 11 | 2003 | 514 | 1.2% |
| 12 | 2004 | 511 | 1.2% |
| 13 | 2015 | 504 | 1.2% |
| 14 | 2012 | 493 | 1.1% |
| 15 | 2007 | 459 | 1.1% |
| 16 | 2006 | 442 | 1.0% |
| 17 | 2005 | 438 | 1.0% |
| 18 | 1998 | 437 | 1.0% |
| 19 | 2020 | 437 | 1.0% |
| 20 | 2019 | 436 | 1.0% |

country categorical

rows: 43,060
null: 0 (0.0%)
unique: 130
top_value: (empty string)
top_rate: 0.637
cardinality: 130
entropy: 2.569
entropy_ratio: 0.366

Top values for country (20 unique shown, of 130 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | (empty string) | 27,422 | 63.7% |
| 2 | Australia | 4,573 | 10.6% |
| 3 | United States | 1,416 | 3.3% |
| 4 | PERU | 1,098 | 2.5% |
| 5 | Canada | 976 | 2.3% |
| 6 | SOVIET UNION | 634 | 1.5% |
| 7 | Israel | 550 | 1.3% |
| 8 | GB | 465 | 1.1% |
| 9 | Spain | 370 | 0.9% |
| 10 | Sweden | 340 | 0.8% |
| 11 | USA | 323 | 0.8% |
| 12 | Ukraine | 316 | 0.7% |
| 13 | Romania | 310 | 0.7% |
| 14 | Antarctica | 242 | 0.6% |
| 15 | Republic of Korea | 225 | 0.5% |
| 16 | Colombia | 214 | 0.5% |
| 17 | Italy | 213 | 0.5% |
| 18 | New Zealand | 212 | 0.5% |
| 19 | FR | 210 | 0.5% |
| 20 | Brazil | 179 | 0.4% |

dataset categorical

rows: 43,060
null: 0 (0.0%)
unique: 214
top_value: (empty string)
top_rate: 0.611
cardinality: 214
entropy: 3.190
entropy_ratio: 0.412

Top values for dataset (20 unique shown, of 214 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | (empty string) | 26,317 | 61.1% |
| 2 | Environmental Monitoring database (MOD) DNV | 1,760 | 4.1% |
| 3 | Jellyfish sightings along the Italian coastline from 2009 to 2017 | 1,024 | 2.4% |
| 4 | QUADRIGE - Coastal monitoring database and products, 1974 onwards. (6064) | 978 | 2.3% |
| 5 | MBIS research trawl surveys | 714 | 1.7% |
| 6 | Groundfish Survey Invertebrate Data | 674 | 1.6% |
| 7 | DFO Quebec Region Ecosystemic bottom trawl surveys | 650 | 1.5% |
| 8 | Marine Recorder Snapshot extract of surveys entered by SeaSearch | 643 | 1.5% |
| 9 | CPR | 604 | 1.4% |
| 10 | DATRAS: ICES Database of trawl surveys | 591 | 1.4% |
| 11 | Citizen Science based jellyfish observations along the Israeli Mediterranean coast in 2011-2025 | 546 | 1.3% |
| 12 | BioChem: Sameoto zooplankton collection | 516 | 1.2% |
| 13 | Marine Recorder Snapshot extract of surveys entered by JNCC | 396 | 0.9% |
| 14 | Atlantic Reference Centre | 383 | 0.9% |
| 15 | DFO Central and Arctic Multi-species Stock Assessment Surveys | 364 | 0.8% |
| 16 | MEDITS-Spain: Demersal and mega-benthic species from the MEDITS (Mediterranean International Trawl Survey) project on the Spanish continental shelf between 1994 and 2010 | 277 | 0.6% |
| 17 | NIWA Invertebrate Collection | 267 | 0.6% |
| 18 | ANEMOON Beach washup monitoring (SMP) data along the Dutch coastline collected through citizen science | 240 | 0.6% |
| 19 | Phytoplankton abundance and composition in the Ebro delta embayments (Alfacs Bay and Fangar Bay, North Western Mediterranean) during 1990-2019 | 198 | 0.5% |
| 20 | Romanian Black Sea Zooplankton data from 1981 to 2000 | 196 | 0.5% |

bioluminescence_group categorical

rows: 43,060
null: 0 (0.0%)
unique: 26
top_value: Dinoflagellate
top_rate: 0.093
cardinality: 26
entropy: 4.465
entropy_ratio: 0.950

Top values for bioluminescence_group (20 unique shown, of 26 total).

| rank | value | count | share |
| --- | --- | --- | --- |
| 1 | Dinoflagellate | 4,000 | 9.3% |
| 2 | Sea sparkle dinoflagellate | 2,000 | 4.6% |
| 3 | Bioluminescent dinoflagellate | 2,000 | 4.6% |
| 4 | Crystal jelly (source of GFP) | 2,000 | 4.6% |
| 5 | Mauve stinger jellyfish | 2,000 | 4.6% |
| 6 | Warty comb jelly | 2,000 | 4.6% |
| 7 | Crown jellyfish (alarm jelly) | 2,000 | 4.6% |
| 8 | Helmet jellyfish | 2,000 | 4.6% |
| 9 | Comb jelly | 2,000 | 4.6% |
| 10 | Krill (many species bioluminescent) | 2,000 | 4.6% |
| 11 | Northern krill | 2,000 | 4.6% |
| 12 | Copepod (secretes luminous fluid) | 2,000 | 4.6% |
| 13 | Deep-sea shrimp (NanoLuc source) | 2,000 | 4.6% |
| 14 | Sea firefly ostracod | 2,000 | 4.6% |
| 15 | Bioluminescent ostracod | 2,000 | 4.6% |
| 16 | Cock-eyed squid | 2,000 | 4.6% |
| 17 | Bioluminescent marine bacteria | 2,000 | 4.6% |
| 18 | Marine luminous bacteria | 2,000 | 4.6% |
| 19 | Parchment tube worm | 2,000 | 4.6% |
| 20 | Boring clam (piddock) | 928 | 2.2% |