quirky hot sauces

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/html/datavis/data_trove/data/quirky/hot_sauces.json

Saturn profiled 258 rows across 9 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
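# Install the profiler CLI (IPython shell escape; run inside a notebook).
!pip install saturn-dissect
import subprocess

# Profile the source file; the flags are reproduced exactly as recorded for this run.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/html/datavis/data_trove/data/quirky/hot_sauces.json",
    "--findings", "quirky-hot_sauces.json",
    "--llm", "anthropic:claude-opus-4-7",
])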

Summary confidence: high

This dataset catalogs 258 hot sauce products sourced entirely from OpenFoodFacts, described by 9 categorical columns: name, brand, countries, categories, ingredients, labels, url, source, and type. Brands are highly fragmented across 158 unique values: Tabasco (12) and McIlhenny Company, Tabasco (11) lead the named brands, but neither dominates, and the single most common value is the blank brand (37 records), which is worth investigating. Geographically, United States (54) and France (28) are the most frequent of the 123 distinct country values, though inconsistent encoding (e.g., 'en:us' vs 'United States') splits the same country across several values and calls for a data-cleaning pass. The labels column is sparse: 145 of 258 rows are blank, so dietary tags like 'No gluten' or 'Non GMO project' each apply to only a small minority of products. Note that source and type are constant (OpenFoodFacts / hot_sauce_product) and carry no analytical signal.

citing: brand · countries · labels · categories · name · source · type

Out[4]:

saturn.schema() · 9 columns

column kind n null% unique alerts
name categorical 258 0.0% 221 long_tail
brand categorical 258 0.0% 158 long_tail
countries categorical 258 0.0% 123 long_tail
categories categorical 258 0.0% 106 long_tail
ingredients categorical 258 0.0% 207 long_tail
labels categorical 258 0.0% 77 long_tail
url categorical 258 0.0% 258 long_tail
source categorical 258 0.0% 1 imbalance
type categorical 258 0.0% 1 imbalance
Fig 1.
brand · Top brands are Tabasco-related, but the long tail of 158 brands and 37 blanks dominates the field.
Top values for brand (20 unique shown, of 158 total).
value | count | share
(empty string) | 37 | 14.3%
Tabasco | 12 | 4.7%
McIlhenny Company, Tabasco | 11 | 4.3%
Flying Goose Brand | 6 | 2.3%
Melinda's | 5 | 1.9%
Lola's Fine Hot Sauce | 5 | 1.9%
Cholula | 4 | 1.6%
Encona | 4 | 1.6%
El Yucateco | 4 | 1.6%
Mrs. Renfro's | 4 | 1.6%
Huy Fong Foods, Inc. | 3 | 1.2%
Sauce Shop | 3 | 1.2%
Go-Tan | 2 | 0.8%
Vitasia | 2 | 0.8%
Valentina | 2 | 0.8%
Heinz | 2 | 0.8%
sauce shop | 2 | 0.8%
CHOLULA | 2 | 0.8%
TABASCO | 2 | 0.8%
Serpis | 2 | 0.8%
Fig 2.
countries · United States and France lead, but watch for duplicate encodings like 'en:us' that need normalization.
Top values for countries (20 unique shown, of 123 total).
value | count | share
United States | 54 | 20.9%
France | 28 | 10.9%
en:us | 10 | 3.9%
en:gb | 8 | 3.1%
en:fr | 8 | 3.1%
en:france | 4 | 1.6%
en:germany | 4 | 1.6%
United States, World | 4 | 1.6%
en:United States | 4 | 1.6%
United Kingdom | 3 | 1.2%
en:United Kingdom | 3 | 1.2%
France, United States | 3 | 1.2%
en:Canada | 3 | 1.2%
World | 3 | 1.2%
France, en:morocco | 2 | 0.8%
en:ma | 2 | 0.8%
France,Royaume-Uni | 2 | 0.8%
en:Germany | 2 | 0.8%
Belgique,France | 2 | 0.8%
Canada | 2 | 0.8%
Fig 3.
labels · Over half the rows have no label at all — non-blank tags like 'No gluten' and 'Non GMO project' are rare.
Top values for labels (20 unique shown, of 77 total).
value | count | share
(empty string) | 145 | 56.2%
No gluten | 9 | 3.5%
No GMOs, Non GMO project | 9 | 3.5%
Sans gluten | 5 | 1.9%
Halal | 4 | 1.6%
en:vegan | 4 | 1.6%
No GMOs, Non GMO project, en:no-gluten | 3 | 1.2%
Point Vert | 3 | 1.2%
Vegetarian, Vegan, Green Dot | 2 | 0.8%
Triman | 2 | 0.8%
No gluten, en:vegan | 2 | 0.8%
Punto Verde | 2 | 0.8%
Sans OGM,en:Non GMO project | 2 | 0.8%
en:halal | 2 | 0.8%
en:no-gluten | 2 | 0.8%
Sin gluten,Punto Verde | 1 | 0.4%
Vegetarian, Vegan, European Vegetarian Union, European Vegetarian Union Vegan, Nutriscore, Rainforest Alliance, en:green-dot | 1 | 0.4%
Vegetarian | 1 | 0.4%
Thai quality label, Halal, Natural colorings, Thailand Diversity & Refinement, The Central Islamic Committee of Thailand | 1 | 0.4%
No gluten, No added MSG | 1 | 0.4%
Fig 4.
categories · Most products cluster into a few Condiments/Sauces/Hot sauces variants with inconsistent delimiters.
Top values for categories (20 unique shown, of 106 total).
value | count | share
(empty string) | 35 | 13.6%
Condiments, Sauces, Hot sauces, Groceries | 32 | 12.4%
Condiments, Sauces, Groceries | 23 | 8.9%
Condiments, Sauces, Dips, Groceries | 13 | 5.0%
Condiments, Sauces, Sauces chili, en:groceries | 9 | 3.5%
Condiments,Sauces | 8 | 3.1%
Condiments, Sauces, Hot sauces | 7 | 2.7%
Hot sauces | 5 | 1.9%
Condiments,Sauces,Hot sauces | 5 | 1.9%
Condiments,Sauces,Hot sauces,Groceries | 5 | 1.9%
Condiments, Sauces, en:hot-sauces | 4 | 1.6%
Condiments,Sauces,Sauces chili | 4 | 1.6%
Sauces chili | 3 | 1.2%
Condiments, Sauces, Sauces chili, Sauces sriracha, en:groceries | 3 | 1.2%
Condiments, Sauces, Barbecue sauces, Groceries | 3 | 1.2%
Condiments, Sauces | 3 | 1.2%
undefined | 3 | 1.2%
Condimentos,Salsas,Salsas de chiles,en:groceries | 2 | 0.8%
en:hot-sauces | 2 | 0.8%
Condiments, Sauces, Hot sauces, Sriracha sauces | 2 | 0.8%
Fig 5.
name · Product names are nearly unique (221 of 258); only a few staples like 'Carolina Reaper Hot Sauce' and 'Tabasco' repeat.
Top values for name (20 unique shown, of 221 total).
value | count | share
Carolina Reaper Hot Sauce | 6 | 2.3%
Tabasco | 5 | 1.9%
Sriracha Hot Chilli Sauce | 3 | 1.2%
Sriracha Hot Chili Sauce | 3 | 1.2%
Sauce de piment sriracha | 3 | 1.2%
(empty string) | 3 | 1.2%
Ghost pepper hot sauce | 3 | 1.2%
Carolina Reaper Sauce | 3 | 1.2%
Carolina reaper hot sauce | 3 | 1.2%
Carolina Reaper | 3 | 1.2%
Salsa Picante | 2 | 0.8%
Sriracha Sauce | 2 | 0.8%
Sriracha | 2 | 0.8%
Sauce sriracha | 2 | 0.8%
Sauce de Piment Sriracha | 2 | 0.8%
Tabasco Green Pepper Sauce | 2 | 0.8%
Tabasco® brand pepper sauce | 2 | 0.8%
Habanero Hot Sauce | 2 | 0.8%
Hot Sauce Chile Habanero | 2 | 0.8%
Ghost Pepper | 2 | 0.8%
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
column kind null %
name categorical 0.0%
brand categorical 0.0%
countries categorical 0.0%
categories categorical 0.0%
ingredients categorical 0.0%
labels categorical 0.0%
url categorical 0.0%
source categorical 0.0%
type categorical 0.0%
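
The null rates above count only true nulls, while several columns in this report store missing values as empty strings. A minimal sketch of measuring both rates side by side is shown here; the function name `missing_rates` and the sample column are illustrative assumptions, not part of saturn.

```python
from typing import Iterable, Optional


def missing_rates(values: Iterable[Optional[str]]) -> dict:
    """Return the true-null rate and the blank-string rate for one column."""
    items = list(values)
    n = len(items)
    nulls = sum(1 for v in items if v is None)
    # A blank is a non-null string that is empty once whitespace is trimmed.
    blanks = sum(1 for v in items if v is not None and v.strip() == "")
    return {
        "n": n,
        "null_rate": nulls / n if n else 0.0,
        "blank_rate": blanks / n if n else 0.0,
    }


# Synthetic column shaped like `labels`: 145 blanks among 258 rows, no nulls.
labels = [""] * 145 + ["No gluten"] * 113
rates = missing_rates(labels)
print(rates)
```

Reporting the blank rate next to the null rate makes the 0.0% figures easier to read correctly for columns such as brand, categories, ingredients, and labels.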

name categorical label

This is a product name field for hot sauces, with 221 unique values across 258 rows and a near-maximal entropy ratio of 0.984. The top value, 'Carolina Reaper Hot Sauce', covers only 2.3% of rows. Casing and spelling variants ('Carolina Reaper Hot Sauce' vs 'Carolina reaper hot sauce', 'Sriracha Hot Chilli Sauce' vs 'Sriracha Hot Chili Sauce'), French-language entries such as 'Sauce de piment sriracha', and 3 empty strings indicate inconsistent normalization despite a 0.0% null rate.

Treatment: normalize casing and spelling variants (and treat empty strings as missing) before grouping or joining.
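
The treatment above can be sketched as a small normalization function. This is a minimal illustration: the `SPELLING` map and the function name `normalize_name` are hypothetical, and a real variant map would have to be built from the data.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical spelling-variant map; extend it after inspecting the data.
SPELLING = {"chilli": "chili"}


def normalize_name(raw: Optional[str]) -> Optional[str]:
    """Build a grouping key: case-folded, whitespace-collapsed, variants unified."""
    if raw is None:
        return None
    text = re.sub(r"\s+", " ", raw).strip().casefold()
    if not text:
        return None  # empty strings are treated as missing
    return " ".join(SPELLING.get(token, token) for token in text.split(" "))


names = [
    "Carolina Reaper Hot Sauce",
    "Carolina reaper hot sauce",
    "Sriracha Hot Chilli Sauce",
    "Sriracha Hot Chili Sauce",
    "",
]
keys = [normalize_name(n) for n in names]
grouped = Counter(k for k in keys if k is not None)
print(grouped)
```

Keeping the original string alongside the key preserves the display name while grouping and joining on the normalized form.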

anthropic:claude-opus-4-7 · confidence high
Out[12]:

saturn.columns["name"].stats

stat value
n 258
nulls 0 (0.0%)
unique 221
top_value Carolina Reaper Hot Sauce
top_rate 0.02326
cardinality 221
entropy 7.666
entropy_ratio 0.9843
alert: long_tail 199 singleton categories
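
The reported figures are consistent with Shannon entropy in bits divided by log2 of the cardinality (for name, 7.666 / log2(221) ≈ 0.984). That reading is inferred from the numbers in this report, not from saturn's source, so the sketch below is an assumption about how such stats could be computed; `entropy_stats` is a hypothetical name.

```python
import math
from collections import Counter


def entropy_stats(values):
    """Cardinality, top rate, entropy (bits) and entropy ratio of a column."""
    counts = Counter(values)
    n = sum(counts.values())
    k = len(counts)
    # Shannon entropy in bits over the empirical value distribution.
    entropy = sum(-(c / n) * math.log2(c / n) for c in counts.values())
    # Normalise by the maximum entropy for k distinct values, log2(k).
    ratio = entropy / math.log2(k) if k > 1 else 0.0
    return {
        "cardinality": k,
        "top_rate": max(counts.values()) / n,
        "entropy": entropy,
        "entropy_ratio": ratio,
    }


# Two extremes that also appear in this report: all-unique and constant.
unique_stats = entropy_stats([f"id-{i}" for i in range(258)])
constant_stats = entropy_stats(["OpenFoodFacts"] * 258)
print(unique_stats)
print(constant_stats)
```

A ratio near 1 means values are spread almost evenly (as with url), while 0 means a single constant value (as with source and type).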
Fig 7.
Top values for name (20 unique shown, of 221 total). The data table is identical to the one listed under Fig 5.

brand categorical feature

Categorical brand label for what appears to be a hot sauce catalogue, with 158 distinct brands across 258 rows and very high entropy ratio (0.894) indicating a long tail. The most common value is the empty string at 37 occurrences (14.3% top rate), meaning missing-as-blank dominates over real brands like Tabasco (12) and McIlhenny Company, Tabasco (11). Note also that 'Tabasco' and 'McIlhenny Company, Tabasco' likely refer to the same maker but appear as separate categories, suggesting inconsistent normalisation.

Treatment: Replace empty strings with explicit nulls, normalise brand aliases (e.g. Tabasco vs McIlhenny), then group rare brands into 'Other' before encoding.
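
A minimal sketch of the three steps in this treatment follows. The `ALIASES` table, the `min_count` threshold, and both function names are illustrative assumptions; whether 'McIlhenny Company, Tabasco' should really be merged into 'Tabasco' is a judgment call for whoever owns the data.

```python
from collections import Counter

# Hypothetical alias table mapping a case-folded variant to a canonical brand.
ALIASES = {"mcilhenny company, tabasco": "tabasco"}


def clean_brand(raw):
    """Blank -> None, otherwise a case-folded, alias-resolved brand key."""
    if raw is None:
        return None
    key = " ".join(raw.split()).casefold()
    if not key:
        return None
    return ALIASES.get(key, key)


def lump_rare(brands, min_count=2, other="Other"):
    """Replace brands seen fewer than min_count times with a shared bucket."""
    counts = Counter(b for b in brands if b is not None)
    return [b if b is None or counts[b] >= min_count else other for b in brands]


raw = ["Tabasco", "TABASCO", "McIlhenny Company, Tabasco", "", "Cholula", "CHOLULA", "Serpis"]
cleaned = [clean_brand(b) for b in raw]
encoded = lump_rare(cleaned)
print(encoded)
```

Lumping is done after alias resolution so that variants of one brand are counted together before the rarity threshold is applied.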

anthropic:claude-opus-4-7 · confidence high
Out[15]:

saturn.columns["brand"].stats

stat value
n 258
nulls 0 (0.0%)
unique 158
top_value (empty string)
top_rate 0.1434
cardinality 158
entropy 6.53
entropy_ratio 0.8941
alert: long_tail 132 singleton categories
Fig 8.
Top values for brand (20 unique shown, of 158 total). The data table is identical to the one listed under Fig 1.

countries categorical feature

This is a country-of-origin or sale label for 258 records, with 123 distinct values and no nulls. The encoding is inconsistent: plain names ('United States', 54) coexist with Open Food Facts-style tag prefixes ('en:us', 10; 'en:United States', 4) and multi-country strings ('United States, World'), so the same country appears under several spellings. High entropy ratio (0.82) and a long tail confirm the values are fragmented well beyond the 20.9% top rate.

Treatment: Normalize to ISO country codes (strip 'en:' prefixes, split comma lists) before grouping or encoding.
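
The normalization described above might look like the following sketch. The `ISO2` lookup is a small hand-written assumption that covers only values visible in this report; a real pipeline would use a complete country table. Tokens that do not map to a country, such as 'World', are returned separately rather than guessed.

```python
import re

# Hypothetical lookup from case-folded names or codes to ISO 3166-1 alpha-2.
ISO2 = {
    "united states": "US", "us": "US",
    "france": "FR", "fr": "FR",
    "united kingdom": "GB", "gb": "GB", "royaume-uni": "GB",
    "germany": "DE", "canada": "CA",
    "morocco": "MA", "ma": "MA",
    "belgique": "BE",
}
LANG_PREFIX = re.compile(r"^[a-z]{2}:")  # e.g. the 'en:' in 'en:us'


def parse_countries(raw):
    """Split a countries cell into (ISO codes, unmapped tokens), de-duplicated."""
    codes, unmapped = [], []
    for part in (raw or "").split(","):
        token = LANG_PREFIX.sub("", part.strip().casefold()).strip()
        if not token:
            continue
        code = ISO2.get(token)
        if code is None:
            if token not in unmapped:
                unmapped.append(token)
        elif code not in codes:
            codes.append(code)
    return codes, unmapped


for cell in ["United States", "en:us", "en:United States", "France,Royaume-Uni"]:
    print(cell, "->", parse_countries(cell))
```

Returning unmapped tokens makes gaps in the lookup visible instead of silently dropping rows.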

anthropic:claude-opus-4-7 · confidence high
Out[18]:

saturn.columns["countries"].stats

stat value
n 258
nulls 0 (0.0%)
unique 123
top_value United States
top_rate 0.2093
cardinality 123
entropy 5.676
entropy_ratio 0.8176
alert: long_tail 99 singleton categories
Fig 9.
Top values for countries (20 unique shown, of 123 total). The data table is identical to the one listed under Fig 2.

categories categorical feature

Comma-delimited product category tags, dominated by condiment/sauce/hot-sauce hierarchies. Cardinality is high (106 unique across 258 rows, entropy ratio 0.82) and the most common value is the empty string at 13.6% (35 rows), indicating missing labels encoded as blanks rather than nulls. Near-duplicate variants differ only by spacing, casing, or 'en:' prefixes (e.g., 'Condiments,Sauces' vs 'Condiments, Sauces', or 'Hot sauces' vs 'en:hot-sauces'), so raw cardinality overstates the true taxonomy.

Treatment: Normalise delimiters/casing, treat empty strings as missing, then split into a multi-hot tag encoding.
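
A minimal sketch of this treatment follows. Treating the literal value 'undefined' as missing and turning hyphens into spaces are assumptions made for illustration; `split_tags` and `multi_hot` are hypothetical names.

```python
import re

LANG_PREFIX = re.compile(r"^[a-z]{2}:")  # e.g. the 'en:' in 'en:hot-sauces'


def split_tags(raw):
    """Split a categories cell into normalized, de-duplicated tags."""
    tags = []
    for part in (raw or "").split(","):
        tag = LANG_PREFIX.sub("", part.strip().casefold())
        tag = tag.replace("-", " ").strip()
        # Assumption: the placeholder 'undefined' carries no category.
        if tag and tag != "undefined" and tag not in tags:
            tags.append(tag)
    return tags


def multi_hot(cells):
    """Return (sorted vocabulary, one 0/1 row per cell)."""
    tagged = [split_tags(c) for c in cells]
    vocab = sorted({t for tags in tagged for t in tags})
    matrix = [[1 if v in tags else 0 for v in vocab] for tags in tagged]
    return vocab, matrix


cells = [
    "Condiments, Sauces, Hot sauces",
    "Condiments,Sauces,Hot sauces",
    "Condiments, Sauces, en:hot-sauces",
    "",
]
vocab, matrix = multi_hot(cells)
print(vocab)
print(matrix)
```

The three differently formatted cells collapse to the same encoded row, and a blank cell becomes an all-zero row.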

anthropic:claude-opus-4-7 · confidence high
Out[21]:

saturn.columns["categories"].stats

stat value
n 258
nulls 0 (0.0%)
unique 106
top_value (empty string)
top_rate 0.1357
cardinality 106
entropy 5.506
entropy_ratio 0.8183
alert: long_tail 85 singleton categories
Fig 10.
Top values for categories (20 unique shown, of 106 total). The data table is identical to the one listed under Fig 4.

ingredients categorical free_text

Free-text ingredient lists for what appears to be hot-sauce or chili products, with 207 distinct strings across 258 rows and an entropy ratio of 0.90 indicating near-unique values. The dominant 'value' is an empty string at 49 rows (19% top_rate), so roughly a fifth of records have no ingredients recorded. The remaining entries mix multiple languages (English, French, Norwegian, German, Dutch) and formatting conventions, so direct categorical use is not viable.

Treatment: Treat empty strings as missing, then tokenize/normalize across languages and extract ingredient features before modelling.
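
The tokenization step can be sketched as below, under simplifying assumptions: ingredients are separated by commas outside brackets, bracketed detail and percentages are dropped, and no cross-language mapping is attempted (that would need a translation table not shown here). The function names are hypothetical.

```python
import re


def split_top_level(text):
    """Split on commas that are not nested inside () or []."""
    parts, buf, depth = [], [], 0
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts


def ingredient_tokens(raw):
    """Return a list of case-folded ingredient names, or None if blank."""
    if raw is None or not raw.strip():
        return None  # empty strings are treated as missing
    tokens = []
    for part in split_top_level(raw):
        head = re.split(r"[(\[]", part, maxsplit=1)[0]   # drop bracketed detail
        head = re.sub(r"\d+(?:[.,]\d+)?\s*%", "", head)  # drop percentages
        head = head.strip(" .;:*").casefold()
        if head:
            tokens.append(head)
    return tokens


print(ingredient_tokens("Distilled Vinegar, Red Pepper (19%), Salt."))
print(ingredient_tokens("Vinaigre d'alcool, piment rouge (19%), sel."))
```

The English Tabasco-style strings reduce to the same token list, while the French string stays distinct until a language mapping is added.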

anthropic:claude-opus-4-7 · confidence high
Out[24]:

saturn.columns["ingredients"].stats

stat value
n 258
nulls 0 (0.0%)
unique 207
top_value (empty string)
top_rate 0.1899
cardinality 207
entropy 6.922
entropy_ratio 0.8997
alert: long_tail 203 singleton categories
Fig 11.
Top values for ingredients.
Top values for ingredients (20 unique shown, of 207 total).
value | count | share
(empty string) | 49 | 19.0%
Distilled Vinegar, Red Pepper (19%), Salt. | 2 | 0.8%
Vinaigre d'alcool, piment rouge (19%), sel. | 2 | 0.8%
Distilled vinegar, red pepper, salt. | 2 | 0.8%
Rød chillipepper 54%, sukker, hvitløk, salt, vann, syre (eddiksyre, sitronsyre), smaksforsterker (mononatriumglutamat), konserveringsmiddel (natriumbenzoat). | 1 | 0.4%
Wasser, 30% Zucker, 8% Chilischoten*, Paprika, modifizierte Stärke, Speisesalz Säuerungsmittel: Essigsäure; Knoblauch, Zwiebeln, Verdickungsmittel: Xanthan; Konservierungsstoff: Kaliumsorbat. | 1 | 0.4%
soybean oil [45%], chilli [25%], onion [15%], fermented soybeans [soybeans, water], flavour enhancer [e621], salt, sugar, sichuan pepper powder, | 1 | 0.4%
Water, Chili Pepper, Vinegar, Salt, Spice, Sodium Benzoate (Preservative). | 1 | 0.4%
Fermented Red Cayenne Peppers (35%), Spirit Vinegar, Water, Salt, Garlic Powder. | 1 | 0.4%
Eau, piments (5%), sel, acidifiant (acide acétique), stabilisant (gomme xanthane), farine de riz, épices, vinaigre de cidre, arômes naturels. | 1 | 0.4%
Vineger, Louisiana type Red Chili Pepper, Salt, Thickener(Xanthan Gum), Green Pepper Natural Identical Flavor, Natural Color(E120), Antioxidant (Ascorbic Acid). May Contain Celery. | 1 | 0.4%
chilli 61%, sugar, water, salt, garlic, flavour enhancer: monosodium glutamate, stabiliser: xanthan gum, acidity regulator: acetic acid, citric acid, preservative: potassium sorbate | 1 | 0.4%
pickled red chilli 64% [chili, salt, acidity regulator (acetic acid)), sugar, water, garlic, salt, thickener (modified starch, xanthan gum), acidity regulator (acetic acid, citric acid), flavour enhancer (yeast extract), preservative (potassium sorbate), colour (paprika oleoresin). | 1 | 0.4%
WATER, DRIED CHILI PEPPERS (5.0%) (ARBOL & PIQUIN), SALT, VINEGAR BLEND (SPIRIT VINEGAR CIDER VINEGAR), SPICES, STABILISER (XANTHAN GUM) | 1 | 0.4%
Red hot pepper (87%), Garlic, Coriander, Salt, Caraway, Acidifying : E330, | 1 | 0.4%
45% raapzaadolie, water, 20% sriracha saus (rode pepers, suiker, knoflook, zout, water, voedingszuur (azijnzuur, citroenzuur),smaakversterker (mononatriumglutamaat), conserveermiddel (natriumbenzoaat)), suiker, azijn, mosterd (water, azijn, MOSTERDZAAD,suiker, zout), zout, gemodificeerde zetmelen, voedingszuur (melkzuur), HEEL EIPOEDER, conserveermiddelen (kaliumsorbaat,natriumbenzoaat), verdikkingsmiddel (xanthaangom), antioxidant (calcium-dinatrium-EDTA). | 1 | 0.4%
Chilis, Zucker, Knoblauch, Salz, Essigsäure E260, Konservierungsmittel Kaliumsorbat E202, Konservierungsmittel Natriumbisulfit E222, Xanthan E415. | 1 | 0.4%
water, 32% piri-piri pepper, salt, acidity regulators: acetic acid, lactic acid, citric acid, wine vinegar (contains sulphites), spices, thickener: xanthan gum, paprika extract, preservatives: sodium benzoate, potassium sorbate, | 1 | 0.4%
Chili (83.23%), Sugar, Salt, Garlic (3.60%), Acetic Acid, Potassium Sorbate and Sodium Bisulfite as preservatives, Xanthan Gum. CONTAINS SULPHITE (SODIUM BISULFITE) INGR | 1 | 0.4%
Chili 70%, Zucker, Wasser, Salz, Sauerungsmittel: Essigsäure, Citronensäure; Verdickungsmittel: Xanthan; Geschmacksverstärker Mononatriumglutamat; Konservierungsstoff Kaliumsorbat | 1 | 0.4%

labels categorical feature

Free-form product label tags (dietary, certification, packaging) with 77 distinct values across 258 rows. Over half the rows (56.2%) carry an empty string rather than a true null, so null_rate=0 is misleading. Values mix languages (English 'No gluten' vs French 'Sans gluten') and formats (raw text vs Open Food Facts taxonomy codes like 'en:vegan'), and many cells concatenate multiple labels with commas.

Treatment: Treat empty strings as missing, split on commas, normalise language/taxonomy variants, then multi-hot encode.
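
The language and taxonomy normalization in this treatment could be sketched as follows. The `CANONICAL` map is a hand-written assumption covering a few variants visible in this report, and `label_tags` is a hypothetical name; the resulting tag sets can then be multi-hot encoded.

```python
import re
from collections import Counter

# Hypothetical map from language variants to one canonical English tag.
CANONICAL = {
    "sans gluten": "no gluten",
    "sin gluten": "no gluten",
    "point vert": "green dot",
    "punto verde": "green dot",
}
LANG_PREFIX = re.compile(r"^[a-z]{2}:")  # e.g. the 'en:' in 'en:vegan'


def label_tags(raw):
    """Return the set of canonical tags in a labels cell, or None if blank."""
    if raw is None or not raw.strip():
        return None  # empty strings are treated as missing
    tags = set()
    for part in raw.split(","):
        tag = LANG_PREFIX.sub("", part.strip().casefold()).replace("-", " ").strip()
        if tag:
            tags.add(CANONICAL.get(tag, tag))
    return tags


cells = ["No gluten", "Sans gluten", "en:no-gluten", "Sin gluten,Punto Verde", "Point Vert", ""]
tagged = [label_tags(c) for c in cells]
freq = Counter(t for tags in tagged if tags for t in tags)
print(freq)
```

Counting canonical tags rather than raw cell strings shows how often a claim such as 'no gluten' actually occurs across languages and formats.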

anthropic:claude-opus-4-7 · confidence high
Out[27]:

saturn.columns["labels"].stats

stat value
n 258
nulls 0 (0.0%)
unique 77
top_value (empty string)
top_rate 0.562
cardinality 77
entropy 3.557
entropy_ratio 0.5675
alert: long_tail 62 singleton categories
Fig 12.
Top values for labels (20 unique shown, of 77 total). The data table is identical to the one listed under Fig 3.

url categorical identifier

This column holds Open Food Facts product URLs, one per row, with the trailing path segment being the product barcode. Every one of the 258 values is unique (entropy_ratio 1.0, top_rate 0.0039), so it functions as a row identifier rather than a feature.

Treatment: Drop from modelling; keep as a lookup link or join key on the embedded barcode.
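
Extracting the embedded barcode for use as a join key might look like this minimal sketch; `barcode_from_url` is a hypothetical helper, and it assumes the barcode is the final, all-digit path segment as in the URLs listed in this report.

```python
from typing import Optional
from urllib.parse import urlparse


def barcode_from_url(url: str) -> Optional[str]:
    """Return the trailing numeric path segment of a product URL, if any."""
    path = urlparse(url).path.rstrip("/")
    segment = path.rsplit("/", 1)[-1]
    # Keep the barcode as a string so that leading zeros are preserved.
    return segment if segment.isdigit() else None


print(barcode_from_url("https://world.openfoodfacts.org/product/8710605030051"))
print(barcode_from_url("https://world.openfoodfacts.org/product/0097339000054"))
```

Storing the key as text matters here because several barcodes in the table begin with zeros, which an integer conversion would drop.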

anthropic:claude-opus-4-7 · confidence high
Out[30]:

saturn.columns["url"].stats

stat value
n 258
nulls 0 (0.0%)
unique 258
top_value https://world.openfoodfacts.org/product/8710605030051
top_rate 0.003876
cardinality 258
entropy 8.011
entropy_ratio 1
alert: long_tail 258 singleton categories
Fig 13.
Top values for url.
Top values for url (20 unique shown, of 258 total).
value | count | share
https://world.openfoodfacts.org/product/8710605030051 | 1 | 0.4%
https://world.openfoodfacts.org/product/20170196 | 1 | 0.4%
https://world.openfoodfacts.org/product/6921804700269 | 1 | 0.4%
https://world.openfoodfacts.org/product/0097339000054 | 1 | 0.4%
https://world.openfoodfacts.org/product/0041500888125 | 1 | 0.4%
https://world.openfoodfacts.org/product/3166296552214 | 1 | 0.4%
https://world.openfoodfacts.org/product/6221033171107 | 1 | 0.4%
https://world.openfoodfacts.org/product/8853662056029 | 1 | 0.4%
https://world.openfoodfacts.org/product/5020580016999 | 1 | 0.4%
https://world.openfoodfacts.org/product/0049733000215 | 1 | 0.4%
https://world.openfoodfacts.org/product/6194049100044 | 1 | 0.4%
https://world.openfoodfacts.org/product/8710605030044 | 1 | 0.4%
https://world.openfoodfacts.org/product/0024463061095 | 1 | 0.4%
https://world.openfoodfacts.org/product/20026752 | 1 | 0.4%
https://world.openfoodfacts.org/product/0024463061163 | 1 | 0.4%
https://world.openfoodfacts.org/product/8853662056067 | 1 | 0.4%
https://world.openfoodfacts.org/product/0702382999100 | 1 | 0.4%
https://world.openfoodfacts.org/product/9556041131063 | 1 | 0.4%
https://world.openfoodfacts.org/product/0016229912437 | 1 | 0.4%
https://world.openfoodfacts.org/product/0633148100624 | 1 | 0.4%

source categorical metadata

This column records the data provenance, with every one of the 258 rows tagged 'OpenFoodFacts'. Cardinality is 1 and entropy is 0, so it carries no information for modelling and simply documents that the entire slice came from a single source.

Treatment: Drop before modelling; retain only as dataset-level provenance.

anthropic:claude-opus-4-7 · confidence high
Out[33]:

saturn.columns["source"].stats

stat value
n 258
nulls 0 (0.0%)
unique 1
top_value OpenFoodFacts
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 14.
Top values for source.
Top values for source (1 unique shown, of 1 total).
value | count | share
OpenFoodFacts | 258 | 100.0%

type categorical metadata

This column is a constant categorical tag identifying every row as 'hot_sauce_product', appearing in all 258 records with no nulls. Cardinality is 1 and entropy is 0, so it carries no discriminative information. It likely served as a type marker from an ingestion pipeline rather than a usable feature.

Treatment: Drop before modelling; single constant value provides no signal.
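
Constant columns such as source and type can be detected and dropped programmatically. The sketch below assumes the records are available as a list of dictionaries; both function names and the two sample rows are illustrative.

```python
def constant_columns(rows):
    """Names of columns whose value is identical in every row."""
    if not rows:
        return []
    return [col for col in rows[0] if len({row.get(col) for row in rows}) == 1]


def drop_columns(rows, columns):
    """Return copies of the rows without the given columns."""
    drop = set(columns)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


rows = [
    {"name": "Tabasco", "source": "OpenFoodFacts", "type": "hot_sauce_product"},
    {"name": "Sriracha", "source": "OpenFoodFacts", "type": "hot_sauce_product"},
]
constants = constant_columns(rows)
features = drop_columns(rows, constants)
print(constants)
print(features)
```

The dropped values can still be recorded once at dataset level as provenance.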

anthropic:claude-opus-4-7 · confidence high
Out[36]:

saturn.columns["type"].stats

stat value
n 258
nulls 0 (0.0%)
unique 1
top_value hot_sauce_product
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 15.
Top values for type.
Top values for type (1 unique shown, of 1 total).
value | count | share
hot_sauce_product | 258 | 100.0%

How to cite

BibTeX
@misc{saturn-quirky-hot-sauces-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: quirky hot sauces},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/quirky-hot_sauces}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: quirky hot sauces. Source: /home/coolhand/html/datavis/data_trove/data/quirky/hot_sauces.json. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/quirky-hot_sauces