saturn

data trove openfoodfacts database

saturn notebook · generated 2026-06-21

Overview

Source: /home/coolhand/html/datavis/data_trove/cache/wild/openfoodfacts_sample.json

Saturn profiled 50 rows across 545 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
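# Install the package that provides the saturn command (IPython shell escape; run in a notebook cell).
!pip install saturn-dissect
import subprocess

# Run the analysis on the sample file and write the findings JSON.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/html/datavis/data_trove/cache/wild/openfoodfacts_sample.json",
    "--findings", "data-trove-openfoodfacts-database.json",
    "--llm", "anthropic:default",
], check=True)  # raise CalledProcessError if saturn exits with a non-zero status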

Summary confidence: medium

This is a 50-product sample from the Open Food Facts database, an open crowdsourced food product catalogue. Its 545 columns span multilingual product names, ingredient texts, allergen data, nutritional scores, packaging details, and community metadata. The most striking structural issue is extreme sparsity: outside French and English, language-specific columns have null rates ranging from 60% (e.g. product_name_de, product_name_es) to 98% (e.g. product_name_dz, ingredients_text_ja), so content is concentrated in the French and English fields. Two things most deserve a closer look. First, the Nutri-Score distribution is heavily skewed toward grade 'e' (54% of products), suggesting the sample leans toward nutritionally poor items. Second, scan counts (scans_n, mean 578, max 2523) have a long right tail, with a few highly popular products dominating community attention.

citing: nutriscore_grade.top_value · nutriscore_grade.stats.top_rate · scans_n.stats.mean · scans_n.stats.max · scans_n.alerts · nova_groups.top_value · nova_groups.stats.top_rate · ecoscore_grade.stats.cardinality · emb_code.null_rate · ingredients_text_ja.null_rate
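
The null rates cited in the summary can be spot-checked without Saturn. The following is a minimal sketch, assuming the records are available as a list of dicts and that missing keys, None, and empty strings all count as null; Saturn's own definition of null may differ. The function name null_rates and the demo records are hypothetical and are not rows from the sample.

```python
from collections import Counter


def null_rates(records):
    """Return {column: percent of records where the value is missing or empty}."""
    # Collect every column name that appears in at least one record.
    columns = set()
    for rec in records:
        columns.update(rec)

    # Count records where the column is absent, None, or an empty string.
    nulls = Counter()
    for rec in records:
        for col in columns:
            if rec.get(col) in (None, ""):
                nulls[col] += 1

    total = len(records)
    return {col: 100.0 * nulls[col] / total for col in columns}


# Made-up records for illustration only.
demo = [
    {"product_name_fr": "Biscuits", "product_name_ja": None},
    {"product_name_fr": "Chocolat"},
    {"product_name_fr": "", "product_name_ja": "チョコ"},
    {"product_name_fr": "Soupe"},
]
rates = null_rates(demo)
for col in sorted(rates):
    print(f"{col}: {rates[col]:.1f}%")
```

To apply it to the real sample, load the file with json.load first; whether the top level is a list of products or an object wrapping one is not shown in this report, so check before passing it in.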

Out[4]:

saturn.schema() · 545 columns

column kind n null% unique alerts
ingredients_with_unspecified_percent_sum numeric 50 0.0% 22
purchase_places categorical 50 2.0% 32 long_tail
rev numeric 50 0.0% 46
product_name_it categorical 50 68.0% 12 long_tail null_rate
editors unknown 50 0.0% skipped
nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients numeric 50 10.0% 1 constant
traces_hierarchy unknown 50 0.0% skipped
packaging categorical 50 12.0% 41 long_tail
packagings_n numeric 50 18.0% 5 outliers
categories_properties unknown 50 0.0% skipped
generic_name_en categorical 50 14.0% 8 long_tail
food_groups categorical 50 2.0% 11
ingredients_without_ciqual_codes_n numeric 50 0.0% 15
origin_sv categorical 50 92.0% 1 null_rate imbalance
product_name_ja categorical 50 98.0% 1 long_tail null_rate imbalance
data_quality_warnings_tags unknown 50 0.0% skipped
packaging_recycling_tags unknown 50 0.0% skipped
scores unknown 50 0.0% skipped
nucleotides_prev_tags unknown 50 0.0% skipped
data_quality_dimensions unknown 50 0.0% skipped
product_name_fi categorical 50 90.0% 4 long_tail null_rate
origin_de categorical 50 60.0% 1 null_rate imbalance
packaging_lc categorical 50 12.0% 7
correctors_tags unknown 50 0.0% skipped
categories_hierarchy unknown 50 0.0% skipped
ingredients_ids_debug unknown 50 0.0% skipped
traces_lc categorical 50 4.0% 6
environment_impact_level_tags unknown 50 0.0% skipped
last_image_t numeric 50 0.0% 50 high_skew
ingredients_that_may_be_from_palm_oil_n numeric 50 8.0% 3 high_skew outliers
max_imgid categorical 50 0.0% 38 long_tail
nutriscore_tags unknown 50 0.0% skipped
generic_name_sv categorical 50 92.0% 4 long_tail null_rate
ingredients_text_with_allergens_nb categorical 50 96.0% 1 null_rate imbalance
quantity categorical 50 2.0% 36 long_tail
countries_hierarchy unknown 50 0.0% skipped
data_quality_tags unknown 50 0.0% skipped
ingredients_n numeric 50 0.0% 22
grades unknown 50 0.0% skipped
additives_original_tags unknown 50 0.0% skipped
nutrition_score_beverage numeric 50 0.0% 2 high_skew
packaging_text_nl categorical 50 76.0% 1 null_rate imbalance
photographers unknown 50 0.0% skipped
pnns_groups_1 categorical 50 0.0% 7
product_name_en categorical 50 14.0% 34 long_tail
traces_from_user categorical 50 0.0% 35 long_tail
generic_name_nl categorical 50 76.0% 4 long_tail null_rate
nutrition_grade_fr categorical 50 0.0% 6
image_front_thumb_url categorical 50 0.0% 50 long_tail
last_editor categorical 50 2.0% 24 long_tail
nutrient_levels_tags unknown 50 0.0% skipped
product_name_nb categorical 50 96.0% 2 long_tail null_rate
packaging_shapes_tags unknown 50 0.0% skipped
_keywords unknown 50 0.0% skipped
emb_codes_tags unknown 50 0.0% skipped
images unknown 50 0.0% skipped
states_tags unknown 50 0.0% skipped
packaging_text_sv categorical 50 92.0% 1 null_rate imbalance
informers_tags unknown 50 0.0% skipped
ingredients_text_pl categorical 50 90.0% 3 long_tail null_rate
labels categorical 50 2.0% 42 long_tail
sources unknown 50 0.0% skipped
checkers_tags unknown 50 0.0% skipped
product_quantity_unit categorical 50 10.0% 2 imbalance
last_modified_by categorical 50 2.0% 24 long_tail
image_front_url categorical 50 0.0% 50 long_tail
nutrition_data_prepared categorical 50 4.0% 1 imbalance
packaging_text_fi categorical 50 90.0% 1 null_rate imbalance
interface_version_created categorical 50 2.0% 3
nutrient_levels unknown 50 0.0% skipped
languages_tags unknown 50 0.0% skipped
vitamins_prev_tags unknown 50 0.0% skipped
other_nutritional_substances_tags unknown 50 0.0% skipped
product_name_de categorical 50 60.0% 16 long_tail null_rate
nutrition_grades categorical 50 0.0% 6
countries_beforescanbot categorical 50 14.0% 38 long_tail
ingredients_text_with_allergens_es categorical 50 62.0% 13 long_tail null_rate
labels_lc categorical 50 2.0% 6
nova_group_debug categorical 50 0.0% 3 long_tail imbalance
nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value numeric 50 8.0% 6 high_skew outliers
lc categorical 50 0.0% 5
allergens_from_user categorical 50 0.0% 34 long_tail
debug_param_sorted_langs unknown 50 0.0% skipped
ecoscore_tags unknown 50 0.0% skipped
nutriscore_score_opposite numeric 50 2.0% 28
image_small_url categorical 50 0.0% 50 long_tail
codes_tags unknown 50 0.0% skipped
pnns_groups_2_tags unknown 50 0.0% skipped
ingredients_analysis_tags unknown 50 0.0% skipped
purchase_places_tags unknown 50 0.0% skipped
unique_scans_n numeric 50 0.0% 48 high_skew outliers
update_key categorical 50 0.0% 9 long_tail
emb_codes_orig categorical 50 34.0% 5 long_tail null_rate
ingredients_text_with_allergens_de categorical 50 66.0% 16 long_tail null_rate
ingredients_without_ecobalyse_ids_n numeric 50 0.0% 20
main_countries_tags unknown 50 0.0% skipped
ingredients_text_with_allergens_en categorical 50 16.0% 36 long_tail
nucleotides_tags unknown 50 0.0% skipped
ingredients_text_with_allergens_sv categorical 50 92.0% 4 long_tail null_rate
entry_dates_tags unknown 50 0.0% skipped
allergens_from_ingredients categorical 50 0.0% 35 long_tail
nova_groups categorical 50 4.0% 3
product_quantity categorical 50 6.0% 27 long_tail
ingredients_debug unknown 50 0.0% skipped
generic_name categorical 50 4.0% 28 long_tail
origins_tags unknown 50 0.0% skipped
added_countries_tags unknown 50 0.0% skipped
categories_lc categorical 50 0.0% 6
image_url categorical 50 0.0% 50 long_tail
ingredients_sweeteners_n numeric 50 0.0% 1 constant
ingredients_text_ja categorical 50 98.0% 1 long_tail null_rate imbalance
allergens_tags unknown 50 0.0% skipped
origin_es categorical 50 60.0% 1 null_rate imbalance
last_updated_t numeric 50 0.0% 50 outliers
origin_fr categorical 50 8.0% 7 long_tail
nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients_value numeric 50 10.0% 13 high_skew outliers
ingredients_without_ecobalyse_ids unknown 50 0.0% skipped
ingredients_text_with_allergens_it categorical 50 68.0% 12 long_tail null_rate
data_quality_errors_tags unknown 50 0.0% skipped
origin_pl categorical 50 90.0% 1 null_rate imbalance
packaging_text_fr categorical 50 6.0% 14 long_tail
debug_tags unknown 50 0.0% skipped
ingredients_text_sv categorical 50 92.0% 4 long_tail null_rate
cities_tags unknown 50 0.0% skipped
ingredients_with_unspecified_percent_n numeric 50 0.0% 18
product_name_fr categorical 50 2.0% 47 long_tail
traces categorical 50 0.0% 23 long_tail
known_ingredients_n numeric 50 0.0% 22
packaging_text_pl categorical 50 90.0% 1 null_rate imbalance
image_front_small_url categorical 50 0.0% 50 long_tail
origin_en categorical 50 14.0% 2 imbalance
interface_version_modified categorical 50 0.0% 2
serving_size categorical 50 12.0% 37 long_tail
states categorical 50 0.0% 26 long_tail
generic_name_fi categorical 50 90.0% 5 long_tail null_rate
schema_version numeric 50 0.0% 1 constant
packaging_old_before_taxonomization categorical 50 24.0% 36 long_tail null_rate
nova_groups_markers unknown 50 0.0% skipped
amino_acids_prev_tags unknown 50 0.0% skipped
product unknown 50 0.0% skipped
emb_codes categorical 50 4.0% 11 long_tail
labels_tags unknown 50 0.0% skipped
selected_images unknown 50 0.0% skipped
nutriscore unknown 50 0.0% skipped
packaging_tags unknown 50 0.0% skipped
traces_from_ingredients categorical 50 0.0% 12 long_tail
nutrition_data_per categorical 50 0.0% 2
ecoscore_grade categorical 50 0.0% 9
packaging_hierarchy unknown 50 0.0% skipped
nova_group numeric 50 4.0% 3 high_skew
additives_tags unknown 50 0.0% skipped
emb_codes_20141016 categorical 50 58.0% 7 long_tail null_rate
ingredients_without_ciqual_codes unknown 50 0.0% skipped
categories_tags unknown 50 0.0% skipped
category_properties unknown 50 0.0% skipped
packagings unknown 50 0.0% skipped
languages_codes unknown 50 0.0% skipped
ingredients_text_with_allergens_fi categorical 50 90.0% 4 long_tail null_rate
ciqual_food_name_tags unknown 50 0.0% skipped
complete numeric 50 0.0% 2
ingredients_text_with_allergens_pl categorical 50 92.0% 3 long_tail null_rate
allergens_hierarchy unknown 50 0.0% skipped
languages_hierarchy unknown 50 0.0% skipped
nova_groups_tags unknown 50 0.0% skipped
ingredients_tags unknown 50 0.0% skipped
ingredients_text_it categorical 50 68.0% 12 long_tail null_rate
informers unknown 50 0.0% skipped
origin_nb categorical 50 96.0% 1 null_rate imbalance
creator categorical 50 0.0% 13 long_tail
packaging_text_ja categorical 50 98.0% 1 long_tail null_rate imbalance
sortkey numeric 50 12.0% 44 high_skew outliers
packagings_materials_main categorical 50 62.0% 3 null_rate
ingredients_percent_analysis numeric 50 0.0% 2 high_skew outliers
amino_acids_tags unknown 50 0.0% skipped
categories_properties_tags unknown 50 0.0% skipped
environment_impact_level categorical 50 56.0% 1 null_rate imbalance
expiration_date categorical 50 4.0% 34 long_tail
ingredients_from_or_that_may_be_from_palm_oil_n numeric 50 6.0% 3
nutriscore_score numeric 50 2.0% 28
ingredients_text_with_allergens categorical 50 0.0% 50 long_tail
ingredients_with_specified_percent_sum numeric 50 0.0% 22
nutriscore_version categorical 50 0.0% 1 imbalance
lang categorical 50 0.0% 5
origins_hierarchy unknown 50 0.0% skipped
origins_lc categorical 50 4.0% 6
origin_it categorical 50 68.0% 1 null_rate imbalance
serving_quantity categorical 50 12.0% 27 long_tail
checkers unknown 50 0.0% skipped
editors_tags unknown 50 0.0% skipped
stores categorical 50 4.0% 31 long_tail
product_name_pl categorical 50 90.0% 3 long_tail null_rate
weighters_tags unknown 50 0.0% skipped
ecoscore_score numeric 50 14.0% 31
generic_name_it categorical 50 68.0% 5 long_tail null_rate
obsolete categorical 50 12.0% 1 imbalance
other_nutritional_substances_prev_tags unknown 50 0.0% skipped
compared_to_category categorical 50 0.0% 35 long_tail
generic_name_es categorical 50 60.0% 7 long_tail null_rate
correctors unknown 50 0.0% skipped
additives_n numeric 50 0.0% 8
ingredients_text_nb categorical 50 96.0% 1 null_rate imbalance
ingredients_text_es categorical 50 60.0% 13 long_tail null_rate
manufacturing_places_tags unknown 50 0.0% skipped
origin categorical 50 6.0% 6 long_tail
origins_old categorical 50 22.0% 9 long_tail null_rate
packaging_text_de categorical 50 60.0% 2 null_rate
languages unknown 50 0.0% skipped
categories_old categorical 50 2.0% 45 long_tail
ingredients_from_palm_oil_tags unknown 50 0.0% skipped
minerals_prev_tags unknown 50 0.0% skipped
origin_fi categorical 50 90.0% 1 null_rate imbalance
packaging_old categorical 50 14.0% 40 long_tail
ingredients_text_fi categorical 50 90.0% 4 long_tail null_rate
product_type categorical 50 0.0% 1 imbalance
ingredients_hierarchy unknown 50 0.0% skipped
removed_countries_tags unknown 50 0.0% skipped
unknown_nutrients_tags unknown 50 0.0% skipped
no_nutrition_data categorical 50 4.0% 1 imbalance
ingredients_analysis unknown 50 0.0% skipped
packagings_materials unknown 50 0.0% skipped
serving_quantity_unit categorical 50 8.0% 2 imbalance
product_name categorical 50 0.0% 49 long_tail
id categorical 50 0.0% 50 long_tail
ingredients_text_with_allergens_nl categorical 50 78.0% 9 long_tail null_rate
categories categorical 50 0.0% 46 long_tail
nutrition_grades_tags unknown 50 0.0% skipped
nutriscore_2023_tags unknown 50 0.0% skipped
origin_ja categorical 50 98.0% 1 long_tail null_rate imbalance
nutrition_score_debug categorical 50 0.0% 2 imbalance
teams categorical 50 8.0% 39 long_tail
unknown_ingredients_n numeric 50 0.0% 6 high_skew outliers
url categorical 50 0.0% 50 long_tail
data_quality_completeness_tags unknown 50 0.0% skipped
ecoscore_data unknown 50 0.0% skipped
generic_name_pl categorical 50 90.0% 2 null_rate
nutrition_data categorical 50 2.0% 1 imbalance
generic_name_ja categorical 50 98.0% 1 long_tail null_rate imbalance
nutriments unknown 50 0.0% skipped
last_image_dates_tags unknown 50 0.0% skipped
brands categorical 50 0.0% 41 long_tail
minerals_tags unknown 50 0.0% skipped
nutrition_data_prepared_per categorical 50 0.0% 1 imbalance
popularity_tags unknown 50 0.0% skipped
packaging_text_es categorical 50 60.0% 2 null_rate
manufacturing_places categorical 50 2.0% 20 long_tail
generic_name_nb categorical 50 96.0% 1 null_rate imbalance
last_modified_t numeric 50 0.0% 50 outliers
vitamins_tags unknown 50 0.0% skipped
_id categorical 50 0.0% 50 long_tail
teams_tags unknown 50 0.0% skipped
countries categorical 50 0.0% 43 long_tail
pnns_groups_2 categorical 50 0.0% 11
states_hierarchy unknown 50 0.0% skipped
code categorical 50 0.0% 50 long_tail
countries_lc categorical 50 2.0% 6
stores_tags unknown 50 0.0% skipped
generic_name_de categorical 50 60.0% 9 long_tail null_rate
ingredients_n_tags unknown 50 0.0% skipped
allergens categorical 50 0.0% 16
allergens_lc categorical 50 4.0% 6
ingredients_text_en categorical 50 12.0% 36 long_tail
misc_tags unknown 50 0.0% skipped
photographers_tags unknown 50 0.0% skipped
packaging_materials_tags unknown 50 0.0% skipped
product_name_nl categorical 50 76.0% 7 long_tail null_rate
nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients numeric 50 8.0% 1 constant
product_name_sv categorical 50 92.0% 4 long_tail null_rate
food_groups_tags unknown 50 0.0% skipped
completeness numeric 50 0.0% 14 outliers
pnns_groups_1_tags unknown 50 0.0% skipped
ingredients_with_specified_percent_n numeric 50 0.0% 7
origin_nl categorical 50 76.0% 1 null_rate imbalance
fruits-vegetables-nuts_100g_estimate numeric 50 46.0% 2 null_rate high_skew
brands_old categorical 50 32.0% 29 long_tail null_rate
generic_name_fr categorical 50 6.0% 34 long_tail
ingredients unknown 50 0.0% skipped
countries_tags unknown 50 0.0% skipped
ingredients_original_tags unknown 50 0.0% skipped
ingredients_text_de categorical 50 60.0% 16 long_tail null_rate
nutriscore_grade categorical 50 0.0% 6
image_thumb_url categorical 50 0.0% 50 long_tail
packaging_text_en categorical 50 14.0% 5 long_tail
packaging_text_it categorical 50 68.0% 3 long_tail null_rate
traces_tags unknown 50 0.0% skipped
brands_tags unknown 50 0.0% skipped
nutriscore_2021_tags unknown 50 0.0% skipped
packaging_text categorical 50 4.0% 13 long_tail
popularity_key numeric 50 0.0% 49 high_skew outliers
ingredients_text categorical 50 0.0% 50 long_tail
ingredients_text_with_allergens_fr categorical 50 4.0% 47 long_tail
ingredients_text_nl categorical 50 76.0% 9 long_tail null_rate
product_name_es categorical 50 60.0% 17 long_tail null_rate
data_sources_tags unknown 50 0.0% skipped
data_quality_bugs_tags unknown 50 0.0% skipped
obsolete_since_date categorical 50 12.0% 1 imbalance
weighers_tags unknown 50 0.0% skipped
ingredients_text_debug categorical 50 28.0% 35 long_tail null_rate
link categorical 50 4.0% 28 long_tail
created_t numeric 50 0.0% 50
ingredients_text_fr categorical 50 4.0% 47 long_tail
labels_hierarchy unknown 50 0.0% skipped
ingredients_non_nutritive_sweeteners_n numeric 50 0.0% 1 constant
last_edit_dates_tags unknown 50 0.0% skipped
packaging_text_nb categorical 50 96.0% 1 null_rate imbalance
packagings_complete numeric 50 4.0% 2
data_sources categorical 50 0.0% 43 long_tail
labels_old categorical 50 8.0% 38 long_tail
data_quality_info_tags unknown 50 0.0% skipped
ingredients_from_palm_oil_n numeric 50 8.0% 2 outliers
ingredients_text_with_allergens_ja categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_lc categorical 50 0.0% 4
origins categorical 50 4.0% 20 long_tail
nutriscore_data unknown 50 0.0% skipped
scans_n numeric 50 0.0% 49 high_skew outliers
ingredients_that_may_be_from_palm_oil_tags unknown 50 0.0% skipped
generic_name_ar categorical 50 80.0% 2 null_rate
product_name_uk categorical 50 98.0% 1 long_tail null_rate imbalance
last_checked_t numeric 50 86.0% 7 null_rate
last_check_dates_tags unknown 50 0.0% skipped
ingredients_text_uk categorical 50 98.0% 1 long_tail null_rate imbalance
carbon_footprint_from_known_ingredients_debug categorical 50 72.0% 14 long_tail null_rate
packaging_text_ar categorical 50 80.0% 1 null_rate imbalance
generic_name_uk categorical 50 98.0% 1 long_tail null_rate imbalance
last_checker categorical 50 86.0% 4 null_rate
checked categorical 50 86.0% 1 null_rate imbalance
product_name_ar categorical 50 78.0% 6 long_tail null_rate
ingredients_text_with_allergens_uk categorical 50 98.0% 1 long_tail null_rate imbalance
origin_uk categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_uk categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_ar categorical 50 78.0% 2 null_rate
ingredients_text_with_allergens_ar categorical 50 82.0% 2 null_rate
carbon_footprint_percent_of_known_ingredients numeric 50 62.0% 19 null_rate
origin_ar categorical 50 80.0% 1 null_rate imbalance
nutrition_score_warning_no_fiber numeric 50 70.0% 1 null_rate constant
ingredients_text_debug_tags unknown 50 0.0% skipped
nutriments_estimated unknown 50 0.0% skipped
completed_t numeric 50 68.0% 16 null_rate
taxonomies_enhancer_tags unknown 50 0.0% skipped
ingredients_text_with_allergens_sl categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_sk categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_bg categorical 50 94.0% 3 long_tail null_rate
ingredients_text_pt categorical 50 80.0% 4 long_tail null_rate
ingredients_text_dz categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_ca categorical 50 96.0% 1 null_rate imbalance
generic_name_bg categorical 50 94.0% 1 null_rate imbalance
origin_sl categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_et categorical 50 94.0% 3 long_tail null_rate
origin_et categorical 50 94.0% 1 null_rate imbalance
ingredients_text_with_allergens_sk categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_et categorical 50 94.0% 3 long_tail null_rate
nutrition_score_warning_nutriments_estimated numeric 50 96.0% 1 null_rate constant
ingredients_text_sk categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_pt categorical 50 80.0% 3 long_tail null_rate
ingredients_text_bg categorical 50 94.0% 3 long_tail null_rate
packaging_text_et categorical 50 94.0% 1 null_rate imbalance
product_name_sk categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_ca categorical 50 96.0% 1 null_rate imbalance
ingredients_text_with_allergens_ca categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_dz categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_sl categorical 50 98.0% 1 long_tail null_rate imbalance
origin_sk categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_et categorical 50 94.0% 1 null_rate imbalance
ingredients_text_et categorical 50 94.0% 3 long_tail null_rate
packaging_text_ca categorical 50 96.0% 1 null_rate imbalance
packaging_text_sl categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_dz categorical 50 98.0% 1 long_tail null_rate imbalance
origin_ca categorical 50 96.0% 1 null_rate imbalance
product_name_ca categorical 50 96.0% 1 null_rate imbalance
packaging_text_pt categorical 50 80.0% 1 null_rate imbalance
origin_bg categorical 50 94.0% 1 null_rate imbalance
packaging_text_bg categorical 50 94.0% 1 null_rate imbalance
origin_pt categorical 50 80.0% 1 null_rate imbalance
ingredients_text_with_allergens_pt categorical 50 84.0% 4 long_tail null_rate
product_name_bg categorical 50 94.0% 3 long_tail null_rate
ingredients_text_sl categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_sl categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_sk categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_pt categorical 50 80.0% 7 long_tail null_rate
lc_imported categorical 50 84.0% 2 null_rate
abbreviated_product_name_fr_imported categorical 50 86.0% 7 long_tail null_rate
generic_name_zh categorical 50 98.0% 1 long_tail null_rate imbalance
obsolete_imported categorical 50 86.0% 1 null_rate imbalance
generic_name_fr_imported categorical 50 86.0% 7 long_tail null_rate
owners_tags categorical 50 86.0% 6 long_tail null_rate
owner_imported categorical 50 88.0% 5 long_tail null_rate
customer_service categorical 50 86.0% 6 long_tail null_rate
ingredients_text_zh_debug_tags unknown 50 0.0% skipped
countries_imported categorical 50 84.0% 2 null_rate
data_sources_imported categorical 50 84.0% 8 long_tail null_rate
product_name_zh categorical 50 98.0% 1 long_tail null_rate imbalance
categories_imported categorical 50 88.0% 5 long_tail null_rate
quantity_imported categorical 50 86.0% 7 long_tail null_rate
ingredients_text_zh categorical 50 98.0% 1 long_tail null_rate imbalance
emb_code categorical 50 98.0% 1 long_tail null_rate imbalance
origins_fr categorical 50 96.0% 2 long_tail null_rate
nutrition_data_prepared_per_imported categorical 50 86.0% 1 null_rate imbalance
product_name_zh_debug_tags unknown 50 0.0% skipped
sources_fields unknown 50 0.0% skipped
customer_service_fr categorical 50 86.0% 6 long_tail null_rate
nutrition_data_per_imported categorical 50 84.0% 1 null_rate imbalance
owner categorical 50 86.0% 6 long_tail null_rate
abbreviated_product_name categorical 50 86.0% 7 long_tail null_rate
conservation_conditions_fr categorical 50 86.0% 7 long_tail null_rate
brands_imported categorical 50 86.0% 6 long_tail null_rate
owner_fields unknown 50 0.0% skipped
conservation_conditions_fr_imported categorical 50 86.0% 7 long_tail null_rate
origin_fr_imported categorical 50 96.0% 2 long_tail null_rate
customer_service_fr_imported categorical 50 86.0% 6 long_tail null_rate
generic_name_zh_debug_tags unknown 50 0.0% skipped
product_name_fr_imported categorical 50 86.0% 7 long_tail null_rate
lang_imported categorical 50 86.0% 1 null_rate imbalance
abbreviated_product_name_fr categorical 50 86.0% 7 long_tail null_rate
ingredients_text_fr_imported categorical 50 86.0% 7 long_tail null_rate
conservation_conditions categorical 50 86.0% 7 long_tail null_rate
nova_group_error categorical 50 96.0% 1 null_rate imbalance
producer_version_id_imported categorical 50 92.0% 3 long_tail null_rate
ingredients_text_de_ocr_1648990410 categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_ro categorical 50 96.0% 2 long_tail null_rate
packaging_imported categorical 50 92.0% 2 null_rate
ingredients_text_de_ocr_1648990410_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_ro categorical 50 96.0% 1 null_rate imbalance
producer_version_id categorical 50 92.0% 3 long_tail null_rate
labels_imported categorical 50 90.0% 3 long_tail null_rate
allergens_imported categorical 50 90.0% 4 long_tail null_rate
origin_ro categorical 50 96.0% 1 null_rate imbalance
no_nutrition_data_imported categorical 50 92.0% 1 null_rate imbalance
serving_size_imported categorical 50 88.0% 6 long_tail null_rate
generic_name_ro categorical 50 96.0% 1 null_rate imbalance
ingredients_text_de_ocr_1648897071 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_de_ocr_1648897071_result categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_ro categorical 50 96.0% 1 null_rate imbalance
abbreviated_product_name_imported categorical 50 94.0% 3 long_tail null_rate
traces_imported categorical 50 92.0% 4 long_tail null_rate
specific_ingredients unknown 50 0.0% skipped
packaging_text_ru categorical 50 94.0% 1 null_rate imbalance
origin_ru categorical 50 94.0% 1 null_rate imbalance
ingredients_text_with_allergens_ru categorical 50 94.0% 1 null_rate imbalance
product_name_ru categorical 50 94.0% 2 null_rate
generic_name_ru categorical 50 94.0% 2 null_rate
ingredients_text_ru categorical 50 94.0% 1 null_rate imbalance
packaging_text_da categorical 50 96.0% 1 null_rate imbalance
generic_name_da categorical 50 96.0% 2 long_tail null_rate
forest_footprint_data unknown 50 0.0% skipped
product_name_da categorical 50 96.0% 2 long_tail null_rate
ingredients_text_with_allergens_da categorical 50 96.0% 2 long_tail null_rate
origin_da categorical 50 96.0% 1 null_rate imbalance
ingredients_text_da categorical 50 96.0% 2 long_tail null_rate
ingredients_text_cs categorical 50 94.0% 2 null_rate
ingredients_text_nl_ocr_1675675383_result categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_cs categorical 50 94.0% 2 null_rate
ingredients_text_hu_ocr_1571428260_result categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_cs categorical 50 94.0% 1 null_rate imbalance
ingredients_text_sr categorical 50 96.0% 2 long_tail null_rate
origin_sr categorical 50 96.0% 1 null_rate imbalance
ingredients_text_hu_ocr_1571428260 categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_hu categorical 50 92.0% 1 null_rate imbalance
origin_cs categorical 50 96.0% 1 null_rate imbalance
ingredients_text_nl_ocr_1675675383 categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_sr categorical 50 96.0% 2 long_tail null_rate
generic_name_hu categorical 50 92.0% 2 null_rate
packaging_text_sr categorical 50 96.0% 1 null_rate imbalance
ingredients_text_with_allergens_cs categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_sr categorical 50 96.0% 2 long_tail null_rate
ingredients_text_hu categorical 50 92.0% 4 long_tail null_rate
product_name_hu categorical 50 92.0% 3 long_tail null_rate
generic_name_sr categorical 50 96.0% 2 long_tail null_rate
origin_hu categorical 50 92.0% 1 null_rate imbalance
ingredients_text_with_allergens_hu categorical 50 94.0% 3 long_tail null_rate
generic_name_cs categorical 50 94.0% 1 null_rate imbalance
ingredients_text_xx categorical 50 96.0% 1 null_rate imbalance
origin_xx categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_xx categorical 50 96.0% 1 null_rate imbalance
packaging_text_xx categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_xx categorical 50 96.0% 1 null_rate imbalance
ingredients_text_es_ocr_1548767061 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_es_ocr_1548767061_result categorical 50 98.0% 1 long_tail null_rate imbalance
origin_ur categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_he categorical 50 96.0% 2 long_tail null_rate
ingredients_text_he categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_ur categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_he categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_he categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_ur categorical 50 98.0% 1 long_tail null_rate imbalance
origin_he categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_ur categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_ur categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_he categorical 50 98.0% 1 long_tail null_rate imbalance
nutriscore_grade_producer_imported categorical 50 94.0% 3 long_tail null_rate
nutriscore_grade_producer categorical 50 94.0% 3 long_tail null_rate
ingredients_text_el categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_el categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_el categorical 50 98.0% 1 long_tail null_rate imbalance
origin_el categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_el categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_el categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_it_ocr_1559410715 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_de_ocr_1559410715 categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_th categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_th categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_de_ocr_1548767354 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_de_ocr_1548767354_result categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_th categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_it_ocr_1559410715_result categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_th categorical 50 98.0% 1 long_tail null_rate imbalance
origin_th categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_th categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_de_ocr_1559410715_result categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_fr_imported categorical 50 98.0% 1 long_tail null_rate imbalance
preparation categorical 50 98.0% 1 long_tail null_rate imbalance
preparation_fr_imported categorical 50 98.0% 1 long_tail null_rate imbalance
preparation_fr categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_lc categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_lc categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_lc categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_lc categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_xx_debug_tags unknown 50 0.0% skipped
ingredients_text_xx_debug_tags unknown 50 0.0% skipped
product_name_xx_debug_tags unknown 50 0.0% skipped
ingredients_text_fr_ocr_1561814324_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1561814324 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1624039072 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1624039072_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573108349 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573108349_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573107560_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573108360 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573107556_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573109955 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1566920858 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573107560 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573108346 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573108346_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573109955_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1566920858_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573107556 categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1573108360_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_ro categorical 50 98.0% 1 long_tail null_rate imbalance
origin_lt categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_with_allergens_lt categorical 50 98.0% 1 long_tail null_rate imbalance
product_name_lt categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_lt categorical 50 98.0% 1 long_tail null_rate imbalance
packaging_text_lt categorical 50 98.0% 1 long_tail null_rate imbalance
generic_name_lt categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1713713129_result categorical 50 98.0% 1 long_tail null_rate imbalance
ingredients_text_fr_ocr_1713713129 categorical 50 98.0% 1 long_tail null_rate imbalance
Fig 1.
nutriscore_grade · Check whether grade 'e' dominates the distribution, which would indicate the sample skews toward nutritionally poor products.
Top values for nutriscore_grade (6 unique shown, of 6 total).
value count share
e 27 54.0%
d 9 18.0%
c 7 14.0%
a 4 8.0%
b 2 4.0%
unknown 1 2.0%
Fig 2.
scans_n · Look for the strong right skew (mean 578, max 2523) revealing a small set of highly-scanned popular products.
Histogram bins for scans_n (median: 492.0).
bin count
333–645.9 39
645.9–958.7 7
958.7–1272 3
1272–1584 0
1584–1897 0
1897–2210 0
2210–2523 1
Fig 3.
nova_groups · Notice that NOVA group 4 (ultra-processed) accounts for the majority of products in this sample.
Top values for nova_groups (3 unique shown, of 3 total).
value count share
4 33 66.0%
3 14 28.0%
1 1 2.0%
Fig 4.
ecoscore_grade · Inspect how eco-scores spread across nine values (grades a-plus through f, plus 'unknown' and 'not-applicable'); 'e' is the most common grade at 24%, and 'unknown' accounts for a further 12%.
Top values for ecoscore_grade (9 unique shown, of 9 total).
value count share
e 12 24.0%
d 9 18.0%
b 8 16.0%
c 8 16.0%
unknown 6 12.0%
a 3 6.0%
a-plus 2 4.0%
not-applicable 1 2.0%
f 1 2.0%
Fig 5.
pnns_groups_1 · See that 'Sugary snacks' dominates the food group classification, confirming the sample's heavy confectionery bias.
Top values for pnns_groups_1 (7 unique shown, of 7 total).
value count share
Sugary snacks 38 76.0%
Salty snacks 4 8.0%
Cereals and potatoes 3 6.0%
unknown 2 4.0%
Milk and dairy products 1 2.0%
Beverages 1 2.0%
Fruits and vegetables 1 2.0%
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
Per-column null rate across the corpus.
column kind null %
ingredients_with_unspecified_percent_sum numeric 0.0%
purchase_places categorical 2.0%
rev numeric 0.0%
product_name_it categorical 68.0%
editors unknown 0.0%
nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients numeric 10.0%
traces_hierarchy unknown 0.0%
packaging categorical 12.0%
packagings_n numeric 18.0%
categories_properties unknown 0.0%
generic_name_en categorical 14.0%
food_groups categorical 2.0%
ingredients_without_ciqual_codes_n numeric 0.0%
origin_sv categorical 92.0%
product_name_ja categorical 98.0%
data_quality_warnings_tags unknown 0.0%
packaging_recycling_tags unknown 0.0%
scores unknown 0.0%
nucleotides_prev_tags unknown 0.0%
data_quality_dimensions unknown 0.0%
product_name_fi categorical 90.0%
origin_de categorical 60.0%
packaging_lc categorical 12.0%
correctors_tags unknown 0.0%
categories_hierarchy unknown 0.0%
ingredients_ids_debug unknown 0.0%
traces_lc categorical 4.0%
environment_impact_level_tags unknown 0.0%
last_image_t numeric 0.0%
ingredients_that_may_be_from_palm_oil_n numeric 8.0%
max_imgid categorical 0.0%
nutriscore_tags unknown 0.0%
generic_name_sv categorical 92.0%
ingredients_text_with_allergens_nb categorical 96.0%
quantity categorical 2.0%
countries_hierarchy unknown 0.0%
data_quality_tags unknown 0.0%
ingredients_n numeric 0.0%
grades unknown 0.0%
additives_original_tags unknown 0.0%
nutrition_score_beverage numeric 0.0%
packaging_text_nl categorical 76.0%
photographers unknown 0.0%
pnns_groups_1 categorical 0.0%
product_name_en categorical 14.0%
traces_from_user categorical 0.0%
generic_name_nl categorical 76.0%
nutrition_grade_fr categorical 0.0%
image_front_thumb_url categorical 0.0%
last_editor categorical 2.0%
nutrient_levels_tags unknown 0.0%
product_name_nb categorical 96.0%
packaging_shapes_tags unknown 0.0%
_keywords unknown 0.0%
emb_codes_tags unknown 0.0%
images unknown 0.0%
states_tags unknown 0.0%
packaging_text_sv categorical 92.0%
informers_tags unknown 0.0%
ingredients_text_pl categorical 90.0%
labels categorical 2.0%
sources unknown 0.0%
checkers_tags unknown 0.0%
product_quantity_unit categorical 10.0%
last_modified_by categorical 2.0%
image_front_url categorical 0.0%
nutrition_data_prepared categorical 4.0%
packaging_text_fi categorical 90.0%
interface_version_created categorical 2.0%
nutrient_levels unknown 0.0%
languages_tags unknown 0.0%
vitamins_prev_tags unknown 0.0%
other_nutritional_substances_tags unknown 0.0%
product_name_de categorical 60.0%
nutrition_grades categorical 0.0%
countries_beforescanbot categorical 14.0%
ingredients_text_with_allergens_es categorical 62.0%
labels_lc categorical 2.0%
nova_group_debug categorical 0.0%
nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value numeric 8.0%
lc categorical 0.0%
allergens_from_user categorical 0.0%
debug_param_sorted_langs unknown 0.0%
ecoscore_tags unknown 0.0%
nutriscore_score_opposite numeric 2.0%
image_small_url categorical 0.0%
codes_tags unknown 0.0%
pnns_groups_2_tags unknown 0.0%
ingredients_analysis_tags unknown 0.0%
purchase_places_tags unknown 0.0%
unique_scans_n numeric 0.0%
update_key categorical 0.0%
emb_codes_orig categorical 34.0%
ingredients_text_with_allergens_de categorical 66.0%
ingredients_without_ecobalyse_ids_n numeric 0.0%
main_countries_tags unknown 0.0%
ingredients_text_with_allergens_en categorical 16.0%
nucleotides_tags unknown 0.0%
ingredients_text_with_allergens_sv categorical 92.0%
entry_dates_tags unknown 0.0%
allergens_from_ingredients categorical 0.0%
nova_groups categorical 4.0%
product_quantity categorical 6.0%
ingredients_debug unknown 0.0%
generic_name categorical 4.0%
origins_tags unknown 0.0%
added_countries_tags unknown 0.0%
categories_lc categorical 0.0%
image_url categorical 0.0%
ingredients_sweeteners_n numeric 0.0%
ingredients_text_ja categorical 98.0%
allergens_tags unknown 0.0%
origin_es categorical 60.0%
last_updated_t numeric 0.0%
origin_fr categorical 8.0%
nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients_value numeric 10.0%
ingredients_without_ecobalyse_ids unknown 0.0%
ingredients_text_with_allergens_it categorical 68.0%
data_quality_errors_tags unknown 0.0%
origin_pl categorical 90.0%
packaging_text_fr categorical 6.0%
debug_tags unknown 0.0%
ingredients_text_sv categorical 92.0%
cities_tags unknown 0.0%
ingredients_with_unspecified_percent_n numeric 0.0%
product_name_fr categorical 2.0%
traces categorical 0.0%
known_ingredients_n numeric 0.0%
packaging_text_pl categorical 90.0%
image_front_small_url categorical 0.0%
origin_en categorical 14.0%
interface_version_modified categorical 0.0%
serving_size categorical 12.0%
states categorical 0.0%
generic_name_fi categorical 90.0%
schema_version numeric 0.0%
packaging_old_before_taxonomization categorical 24.0%
nova_groups_markers unknown 0.0%
amino_acids_prev_tags unknown 0.0%
product unknown 0.0%
emb_codes categorical 4.0%
labels_tags unknown 0.0%
selected_images unknown 0.0%
nutriscore unknown 0.0%
packaging_tags unknown 0.0%
traces_from_ingredients categorical 0.0%
nutrition_data_per categorical 0.0%
ecoscore_grade categorical 0.0%
packaging_hierarchy unknown 0.0%
nova_group numeric 4.0%
additives_tags unknown 0.0%
emb_codes_20141016 categorical 58.0%
ingredients_without_ciqual_codes unknown 0.0%
categories_tags unknown 0.0%
category_properties unknown 0.0%
packagings unknown 0.0%
languages_codes unknown 0.0%
ingredients_text_with_allergens_fi categorical 90.0%
ciqual_food_name_tags unknown 0.0%
complete numeric 0.0%
ingredients_text_with_allergens_pl categorical 92.0%
allergens_hierarchy unknown 0.0%
languages_hierarchy unknown 0.0%
nova_groups_tags unknown 0.0%
ingredients_tags unknown 0.0%
ingredients_text_it categorical 68.0%
informers unknown 0.0%
origin_nb categorical 96.0%
creator categorical 0.0%
packaging_text_ja categorical 98.0%
sortkey numeric 12.0%
packagings_materials_main categorical 62.0%
ingredients_percent_analysis numeric 0.0%
amino_acids_tags unknown 0.0%
categories_properties_tags unknown 0.0%
environment_impact_level categorical 56.0%
expiration_date categorical 4.0%
ingredients_from_or_that_may_be_from_palm_oil_n numeric 6.0%
nutriscore_score numeric 2.0%
ingredients_text_with_allergens categorical 0.0%
ingredients_with_specified_percent_sum numeric 0.0%
nutriscore_version categorical 0.0%
lang categorical 0.0%
origins_hierarchy unknown 0.0%
origins_lc categorical 4.0%
origin_it categorical 68.0%
serving_quantity categorical 12.0%
checkers unknown 0.0%
editors_tags unknown 0.0%
stores categorical 4.0%
product_name_pl categorical 90.0%
weighters_tags unknown 0.0%
ecoscore_score numeric 14.0%
generic_name_it categorical 68.0%
obsolete categorical 12.0%
other_nutritional_substances_prev_tags unknown 0.0%
compared_to_category categorical 0.0%
generic_name_es categorical 60.0%
correctors unknown 0.0%
additives_n numeric 0.0%
ingredients_text_nb categorical 96.0%
ingredients_text_es categorical 60.0%
manufacturing_places_tags unknown 0.0%
origin categorical 6.0%
origins_old categorical 22.0%
packaging_text_de categorical 60.0%
languages unknown 0.0%
categories_old categorical 2.0%
ingredients_from_palm_oil_tags unknown 0.0%
minerals_prev_tags unknown 0.0%
origin_fi categorical 90.0%
packaging_old categorical 14.0%
ingredients_text_fi categorical 90.0%
product_type categorical 0.0%
ingredients_hierarchy unknown 0.0%
removed_countries_tags unknown 0.0%
unknown_nutrients_tags unknown 0.0%
no_nutrition_data categorical 4.0%
ingredients_analysis unknown 0.0%
packagings_materials unknown 0.0%
serving_quantity_unit categorical 8.0%
product_name categorical 0.0%
id categorical 0.0%
ingredients_text_with_allergens_nl categorical 78.0%
categories categorical 0.0%
nutrition_grades_tags unknown 0.0%
nutriscore_2023_tags unknown 0.0%
origin_ja categorical 98.0%
nutrition_score_debug categorical 0.0%
teams categorical 8.0%
unknown_ingredients_n numeric 0.0%
url categorical 0.0%
data_quality_completeness_tags unknown 0.0%
ecoscore_data unknown 0.0%
generic_name_pl categorical 90.0%
nutrition_data categorical 2.0%
generic_name_ja categorical 98.0%
nutriments unknown 0.0%
last_image_dates_tags unknown 0.0%
brands categorical 0.0%
minerals_tags unknown 0.0%
nutrition_data_prepared_per categorical 0.0%
popularity_tags unknown 0.0%
packaging_text_es categorical 60.0%
manufacturing_places categorical 2.0%
generic_name_nb categorical 96.0%
last_modified_t numeric 0.0%
vitamins_tags unknown 0.0%
_id categorical 0.0%
teams_tags unknown 0.0%
countries categorical 0.0%
pnns_groups_2 categorical 0.0%
states_hierarchy unknown 0.0%
code categorical 0.0%
countries_lc categorical 2.0%
stores_tags unknown 0.0%
generic_name_de categorical 60.0%
ingredients_n_tags unknown 0.0%
allergens categorical 0.0%
allergens_lc categorical 4.0%
ingredients_text_en categorical 12.0%
misc_tags unknown 0.0%
photographers_tags unknown 0.0%
packaging_materials_tags unknown 0.0%
product_name_nl categorical 76.0%
nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients numeric 8.0%
product_name_sv categorical 92.0%
food_groups_tags unknown 0.0%
completeness numeric 0.0%
pnns_groups_1_tags unknown 0.0%
ingredients_with_specified_percent_n numeric 0.0%
origin_nl categorical 76.0%
fruits-vegetables-nuts_100g_estimate numeric 46.0%
brands_old categorical 32.0%
generic_name_fr categorical 6.0%
ingredients unknown 0.0%
countries_tags unknown 0.0%
ingredients_original_tags unknown 0.0%
ingredients_text_de categorical 60.0%
nutriscore_grade categorical 0.0%
image_thumb_url categorical 0.0%
packaging_text_en categorical 14.0%
packaging_text_it categorical 68.0%
traces_tags unknown 0.0%
brands_tags unknown 0.0%
nutriscore_2021_tags unknown 0.0%
packaging_text categorical 4.0%
popularity_key numeric 0.0%
ingredients_text categorical 0.0%
ingredients_text_with_allergens_fr categorical 4.0%
ingredients_text_nl categorical 76.0%
product_name_es categorical 60.0%
data_sources_tags unknown 0.0%
data_quality_bugs_tags unknown 0.0%
obsolete_since_date categorical 12.0%
weighers_tags unknown 0.0%
ingredients_text_debug categorical 28.0%
link categorical 4.0%
created_t numeric 0.0%
ingredients_text_fr categorical 4.0%
labels_hierarchy unknown 0.0%
ingredients_non_nutritive_sweeteners_n numeric 0.0%
last_edit_dates_tags unknown 0.0%
packaging_text_nb categorical 96.0%
packagings_complete numeric 4.0%
data_sources categorical 0.0%
labels_old categorical 8.0%
data_quality_info_tags unknown 0.0%
ingredients_from_palm_oil_n numeric 8.0%
ingredients_text_with_allergens_ja categorical 98.0%
ingredients_lc categorical 0.0%
origins categorical 4.0%
nutriscore_data unknown 0.0%
scans_n numeric 0.0%
ingredients_that_may_be_from_palm_oil_tags unknown 0.0%
generic_name_ar categorical 80.0%
product_name_uk categorical 98.0%
last_checked_t numeric 86.0%
last_check_dates_tags unknown 0.0%
ingredients_text_uk categorical 98.0%
carbon_footprint_from_known_ingredients_debug categorical 72.0%
packaging_text_ar categorical 80.0%
generic_name_uk categorical 98.0%
last_checker categorical 86.0%
checked categorical 86.0%
product_name_ar categorical 78.0%
ingredients_text_with_allergens_uk categorical 98.0%
origin_uk categorical 98.0%
packaging_text_uk categorical 98.0%
ingredients_text_ar categorical 78.0%
ingredients_text_with_allergens_ar categorical 82.0%
carbon_footprint_percent_of_known_ingredients numeric 62.0%
origin_ar categorical 80.0%
nutrition_score_warning_no_fiber numeric 70.0%
ingredients_text_debug_tags unknown 0.0%
nutriments_estimated unknown 0.0%
completed_t numeric 68.0%
taxonomies_enhancer_tags unknown 0.0%
ingredients_text_with_allergens_sl categorical 98.0%
packaging_text_sk categorical 98.0%
ingredients_text_with_allergens_bg categorical 94.0%
ingredients_text_pt categorical 80.0%
ingredients_text_dz categorical 98.0%
generic_name_ca categorical 96.0%
generic_name_bg categorical 94.0%
origin_sl categorical 98.0%
product_name_et categorical 94.0%
origin_et categorical 94.0%
ingredients_text_with_allergens_sk categorical 98.0%
ingredients_text_with_allergens_et categorical 94.0%
nutrition_score_warning_nutriments_estimated numeric 96.0%
ingredients_text_sk categorical 98.0%
generic_name_pt categorical 80.0%
ingredients_text_bg categorical 94.0%
packaging_text_et categorical 94.0%
product_name_sk categorical 98.0%
ingredients_text_ca categorical 96.0%
ingredients_text_with_allergens_ca categorical 98.0%
product_name_dz categorical 98.0%
product_name_sl categorical 98.0%
origin_sk categorical 98.0%
generic_name_et categorical 94.0%
ingredients_text_et categorical 94.0%
packaging_text_ca categorical 96.0%
packaging_text_sl categorical 98.0%
generic_name_dz categorical 98.0%
origin_ca categorical 96.0%
product_name_ca categorical 96.0%
packaging_text_pt categorical 80.0%
origin_bg categorical 94.0%
packaging_text_bg categorical 94.0%
origin_pt categorical 80.0%
ingredients_text_with_allergens_pt categorical 84.0%
product_name_bg categorical 94.0%
ingredients_text_sl categorical 98.0%
generic_name_sl categorical 98.0%
generic_name_sk categorical 98.0%
product_name_pt categorical 80.0%
lc_imported categorical 84.0%
abbreviated_product_name_fr_imported categorical 86.0%
generic_name_zh categorical 98.0%
obsolete_imported categorical 86.0%
generic_name_fr_imported categorical 86.0%
owners_tags categorical 86.0%
owner_imported categorical 88.0%
customer_service categorical 86.0%
ingredients_text_zh_debug_tags unknown 0.0%
countries_imported categorical 84.0%
data_sources_imported categorical 84.0%
product_name_zh categorical 98.0%
categories_imported categorical 88.0%
quantity_imported categorical 86.0%
ingredients_text_zh categorical 98.0%
emb_code categorical 98.0%
origins_fr categorical 96.0%
nutrition_data_prepared_per_imported categorical 86.0%
product_name_zh_debug_tags unknown 0.0%
sources_fields unknown 0.0%
customer_service_fr categorical 86.0%
nutrition_data_per_imported categorical 84.0%
owner categorical 86.0%
abbreviated_product_name categorical 86.0%
conservation_conditions_fr categorical 86.0%
brands_imported categorical 86.0%
owner_fields unknown 0.0%
conservation_conditions_fr_imported categorical 86.0%
origin_fr_imported categorical 96.0%
customer_service_fr_imported categorical 86.0%
generic_name_zh_debug_tags unknown 0.0%
product_name_fr_imported categorical 86.0%
lang_imported categorical 86.0%
abbreviated_product_name_fr categorical 86.0%
ingredients_text_fr_imported categorical 86.0%
conservation_conditions categorical 86.0%
nova_group_error categorical 96.0%
producer_version_id_imported categorical 92.0%
ingredients_text_de_ocr_1648990410 categorical 98.0%
product_name_ro categorical 96.0%
packaging_imported categorical 92.0%
ingredients_text_de_ocr_1648990410_result categorical 98.0%
ingredients_text_ro categorical 96.0%
producer_version_id categorical 92.0%
labels_imported categorical 90.0%
allergens_imported categorical 90.0%
origin_ro categorical 96.0%
no_nutrition_data_imported categorical 92.0%
serving_size_imported categorical 88.0%
generic_name_ro categorical 96.0%
ingredients_text_de_ocr_1648897071 categorical 98.0%
ingredients_text_de_ocr_1648897071_result categorical 98.0%
packaging_text_ro categorical 96.0%
abbreviated_product_name_imported categorical 94.0%
traces_imported categorical 92.0%
specific_ingredients unknown 0.0%
packaging_text_ru categorical 94.0%
origin_ru categorical 94.0%
ingredients_text_with_allergens_ru categorical 94.0%
product_name_ru categorical 94.0%
generic_name_ru categorical 94.0%
ingredients_text_ru categorical 94.0%
packaging_text_da categorical 96.0%
generic_name_da categorical 96.0%
forest_footprint_data unknown 0.0%
product_name_da categorical 96.0%
ingredients_text_with_allergens_da categorical 96.0%
origin_da categorical 96.0%
ingredients_text_da categorical 96.0%
ingredients_text_cs categorical 94.0%
ingredients_text_nl_ocr_1675675383_result categorical 98.0%
product_name_cs categorical 94.0%
ingredients_text_hu_ocr_1571428260_result categorical 98.0%
packaging_text_cs categorical 94.0%
ingredients_text_sr categorical 96.0%
origin_sr categorical 96.0%
ingredients_text_hu_ocr_1571428260 categorical 98.0%
packaging_text_hu categorical 92.0%
origin_cs categorical 96.0%
ingredients_text_nl_ocr_1675675383 categorical 98.0%
product_name_sr categorical 96.0%
generic_name_hu categorical 92.0%
packaging_text_sr categorical 96.0%
ingredients_text_with_allergens_cs categorical 98.0%
ingredients_text_with_allergens_sr categorical 96.0%
ingredients_text_hu categorical 92.0%
product_name_hu categorical 92.0%
generic_name_sr categorical 96.0%
origin_hu categorical 92.0%
ingredients_text_with_allergens_hu categorical 94.0%
generic_name_cs categorical 94.0%
ingredients_text_xx categorical 96.0%
origin_xx categorical 98.0%
product_name_xx categorical 96.0%
packaging_text_xx categorical 98.0%
generic_name_xx categorical 96.0%
ingredients_text_es_ocr_1548767061 categorical 98.0%
ingredients_text_es_ocr_1548767061_result categorical 98.0%
origin_ur categorical 98.0%
product_name_he categorical 96.0%
ingredients_text_he categorical 98.0%
product_name_ur categorical 98.0%
generic_name_he categorical 98.0%
packaging_text_he categorical 98.0%
ingredients_text_ur categorical 98.0%
origin_he categorical 98.0%
generic_name_ur categorical 98.0%
packaging_text_ur categorical 98.0%
ingredients_text_with_allergens_he categorical 98.0%
nutriscore_grade_producer_imported categorical 94.0%
nutriscore_grade_producer categorical 94.0%
ingredients_text_el categorical 98.0%
ingredients_text_with_allergens_el categorical 98.0%
packaging_text_el categorical 98.0%
origin_el categorical 98.0%
product_name_el categorical 98.0%
generic_name_el categorical 98.0%
ingredients_text_it_ocr_1559410715 categorical 98.0%
ingredients_text_de_ocr_1559410715 categorical 98.0%
product_name_th categorical 98.0%
ingredients_text_th categorical 98.0%
ingredients_text_de_ocr_1548767354 categorical 98.0%
ingredients_text_de_ocr_1548767354_result categorical 98.0%
generic_name_th categorical 98.0%
ingredients_text_it_ocr_1559410715_result categorical 98.0%
packaging_text_th categorical 98.0%
origin_th categorical 98.0%
ingredients_text_with_allergens_th categorical 98.0%
ingredients_text_de_ocr_1559410715_result categorical 98.0%
packaging_text_fr_imported categorical 98.0%
preparation categorical 98.0%
preparation_fr_imported categorical 98.0%
preparation_fr categorical 98.0%
generic_name_lc categorical 98.0%
product_name_lc categorical 98.0%
ingredients_text_lc categorical 98.0%
ingredients_text_with_allergens_lc categorical 98.0%
generic_name_xx_debug_tags unknown 0.0%
ingredients_text_xx_debug_tags unknown 0.0%
product_name_xx_debug_tags unknown 0.0%
ingredients_text_fr_ocr_1561814324_result categorical 98.0%
ingredients_text_fr_ocr_1561814324 categorical 98.0%
ingredients_text_fr_ocr_1624039072 categorical 98.0%
ingredients_text_fr_ocr_1624039072_result categorical 98.0%
ingredients_text_fr_ocr_1573108349 categorical 98.0%
ingredients_text_fr_ocr_1573108349_result categorical 98.0%
ingredients_text_fr_ocr_1573107560_result categorical 98.0%
ingredients_text_fr_ocr_1573108360 categorical 98.0%
ingredients_text_fr_ocr_1573107556_result categorical 98.0%
ingredients_text_fr_ocr_1573109955 categorical 98.0%
ingredients_text_fr_ocr_1566920858 categorical 98.0%
ingredients_text_fr_ocr_1573107560 categorical 98.0%
ingredients_text_fr_ocr_1573108346 categorical 98.0%
ingredients_text_fr_ocr_1573108346_result categorical 98.0%
ingredients_text_fr_ocr_1573109955_result categorical 98.0%
ingredients_text_fr_ocr_1566920858_result categorical 98.0%
ingredients_text_fr_ocr_1573107556 categorical 98.0%
ingredients_text_fr_ocr_1573108360_result categorical 98.0%
ingredients_text_with_allergens_ro categorical 98.0%
origin_lt categorical 98.0%
ingredients_text_with_allergens_lt categorical 98.0%
product_name_lt categorical 98.0%
ingredients_text_lt categorical 98.0%
packaging_text_lt categorical 98.0%
generic_name_lt categorical 98.0%
ingredients_text_fr_ocr_1713713129_result categorical 98.0%
ingredients_text_fr_ocr_1713713129 categorical 98.0%
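The null rates above can be grouped by language suffix to check the summary's claim that content is concentrated in the French and English fields. The following is a minimal sketch, not saturn output: the percentages are copied from the product_name_* rows of the table, while the 50% cut-off and the variable names are assumptions chosen for illustration.

```python
# Null rates (percent) copied from the product_name_* rows of the null-rate table.
null_pct = {
    "product_name_fr": 2.0,
    "product_name_en": 14.0,
    "product_name_de": 60.0,
    "product_name_es": 60.0,
    "product_name_it": 68.0,
    "product_name_nl": 76.0,
    "product_name_ja": 98.0,
}

# The language code is the final underscore-separated token of the column name.
by_lang = {name.rsplit("_", 1)[1]: pct for name, pct in null_pct.items()}

# Assumed threshold: call a language "mostly filled" when under half the rows are null.
MOSTLY_FILLED_BELOW = 50.0
mostly_filled = [
    lang
    for lang, pct in sorted(by_lang.items(), key=lambda kv: kv[1])
    if pct < MOSTLY_FILLED_BELOW
]

print(mostly_filled)
```

Under that threshold only the French and English product names qualify; every other language listed here is null in at least 60% of rows.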
Fig 7.
Pearson correlation across numeric columns (sampled, bounded).
Pearson correlation across 12 numeric columns (values clipped to 2 decimals).
Column order (left to right): ingredients_with_unspecified_percent_sum, rev, nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients, packagings_n, ingredients_without_ciqual_codes_n, last_image_t, ingredients_that_may_be_from_palm_oil_n, ingredients_n, nutrition_score_beverage, nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value, nutriscore_score_opposite, unique_scans_n
ingredients_with_unspecified_percent_sum +1.00 -1.00 +nan -1.00 -1.00 +1.00 +nan -1.00 +nan +1.00 -1.00 +1.00
rev -1.00 +1.00 +nan +1.00 +1.00 -1.00 +nan +1.00 +nan -1.00 +1.00 -1.00
nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan
packagings_n -1.00 +1.00 +nan +1.00 +1.00 -1.00 +nan +1.00 +nan -1.00 +1.00 -1.00
ingredients_without_ciqual_codes_n -1.00 +1.00 +nan +1.00 +1.00 -1.00 +nan +1.00 +nan -1.00 +1.00 -1.00
last_image_t +1.00 -1.00 +nan -1.00 -1.00 +1.00 +nan -1.00 +nan +1.00 -1.00 +1.00
ingredients_that_may_be_from_palm_oil_n +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan
ingredients_n -1.00 +1.00 +nan +1.00 +1.00 -1.00 +nan +1.00 +nan -1.00 +1.00 -1.00
nutrition_score_beverage +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan +nan
nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value +1.00 -1.00 +nan -1.00 -1.00 +1.00 +nan -1.00 +nan +1.00 -1.00 +1.00
nutriscore_score_opposite -1.00 +1.00 +nan +1.00 +1.00 -1.00 +nan +1.00 +nan -1.00 +1.00 -1.00
unique_scans_n +1.00 -1.00 +nan -1.00 -1.00 +1.00 +nan -1.00 +nan +1.00 -1.00 +1.00
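Every finite entry in this matrix is exactly ±1.00 and the rest are nan, so it should be read with caution. That pattern is what Pearson's r produces when it is computed from only two rows (any two distinct points lie on a line), with nan arising for columns that have zero variance in the rows used. Whether saturn's "sampled, bounded" computation actually reduced the data to two rows is an assumption, not something the report states. The sketch below uses a hand-written helper and hypothetical values purely to illustrate the arithmetic.

```python
import math


def pearson(xs, ys):
    """Pearson's r; returns nan when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return math.nan
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical two-row samples (not taken from the dataset).
col_a = [19.0, 674.0]
col_b = [100.0, 0.4]
constant_col = [1.0, 1.0]  # a zero-variance column

r_pair = pearson(col_a, col_b)          # two points: |r| is 1 up to rounding
r_const = pearson(col_a, constant_col)  # zero variance: nan

print(round(r_pair, 2), r_const)
```

If this is the cause, recomputing the correlations over all 50 rows would be needed before drawing any conclusion from Fig 7.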

ingredients_with_unspecified_percent_sum numeric

Out[13]:

saturn.columns["ingredients_with_unspecified_percent_sum"].stats

stat value
n 50
nulls 0 (0.0%)
unique 22
min 0.4
max 100
mean 79.42
median 100
std 31.64
q1 53.6
q3 100
iqr 46.4
skew -1.183
kurtosis -0.133
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 8.
Distribution of ingredients_with_unspecified_percent_sum. Vertical dash marks the median.
Histogram bins for ingredients_with_unspecified_percent_sum (median: 100.0).
bin count
0.4–14.63 2
14.63–28.86 4
28.86–43.09 4
43.09–57.31 3
57.31–71.54 2
71.54–85.77 1
85.77–100 34

purchase_places categorical

Out[16]:

saturn.columns["purchase_places"].stats

stat value
n 50
nulls 1 (2.0%)
unique 32
top_value France
top_rate 0.1837
cardinality 32
entropy 4.479
entropy_ratio 0.8958
alert: long_tail · 29 singleton categories
Fig 9.
Top values for purchase_places.
Top values for purchase_places (20 unique shown, of 32 total).
value count share
France 9 18.0%
(blank) 6 12.0%
Maroc 5 10.0%
Casablanca,Morocco 1 2.0%
F-77480 Mousseaux-les-Bray,France 1 2.0%
Madrid,España,Montargis,France,Würzburg,Deutschland,Italia,Singapore,République tchèque,Toronto,Burlington,Oakville 1 2.0%
France,Lacaune 1 2.0%
Slovenija,Finland,United Kingdom 1 2.0%
Villeurbanne,France,Toulon 1 2.0%
Lund,Sweden 1 2.0%
Fez,Morocco 1 2.0%
Italien,France,Lacaune,Portugal 1 2.0%
Bar-le-Duc,France 1 2.0%
France,République tchèque,Lacaune 1 2.0%
France,United Kingdom 1 2.0%
Veynes,France,Trignac 1 2.0%
Morocco 1 2.0%
France,Belgique,Espagne,Estonie 1 2.0%
España,France,Serbia,Praha,Czechia 1 2.0%
France,Normandie 1 2.0%
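The purchase_places stats can be reproduced from the category counts, which also explains why top_rate (0.1837) differs from the 18.0% share shown for France. The sketch below assumes, based on the figures matching, that entropy is Shannon entropy in bits over non-null values, that entropy_ratio divides it by log2(cardinality), and that top_rate uses the non-null count as its denominator while the share column uses all 50 rows. These definitions are inferred from the numbers, not taken from saturn's documentation.

```python
import math

# Counts read from the purchase_places table: France 9, blank 6, Maroc 5,
# plus the 29 singleton categories reported by the long_tail alert.
counts = [9, 6, 5] + [1] * 29
n_rows = 50
non_null = sum(counts)  # 49, i.e. 50 rows minus 1 null


def shannon_entropy_bits(category_counts):
    """Shannon entropy in bits of the distribution given by category counts."""
    total = sum(category_counts)
    return -sum((c / total) * math.log2(c / total) for c in category_counts)


entropy = shannon_entropy_bits(counts)
entropy_ratio = entropy / math.log2(len(counts))  # normalised by log2(cardinality)
top_rate = max(counts) / non_null                 # denominator: non-null rows
top_share = max(counts) / n_rows                  # denominator: all rows

print(round(entropy, 3), round(entropy_ratio, 4), round(top_rate, 4), top_share)
```

An entropy_ratio near 1 means values are spread almost evenly across categories, which is consistent with the long_tail alert on this free-text field.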

rev numeric

Out[19]:

saturn.columns["rev"].stats

stat value
n 50
nulls 0 (0.0%)
unique 46
min 19
max 674
mean 230
median 233.5
std 166.6
q1 72.75
q3 310.5
iqr 237.8
skew 0.7092
kurtosis -0.02278
n_outliers 1
outlier_rate 0.02
zero_rate 0
Fig 10.
Distribution of rev. Vertical dash marks the median.
Histogram bins for rev (median: 233.5).
bin count
19–112.6 15
112.6–206.1 9
206.1–299.7 12
299.7–393.3 6
393.3–486.9 3
486.9–580.4 3
580.4–674 2

product_name_it categorical

Out[22]:

saturn.columns["product_name_it"].stats

stat value
n 50
nulls 34 (68.0%)
unique 12
top_value (blank)
top_rate 0.3125
cardinality 12
entropy 3.274
entropy_ratio 0.9134
alert: long_tail · 11 singleton categories
alert: null_rate · 68.0% null
Fig 11.
Top values for product_name_it.
Top values for product_name_it (12 unique shown, of 12 total).
value count share
(blank) 5 10.0%
Fondente Prodigioso 90% Cacao 1 2.0%
Croccanti biscotti con cuore cremoso di Nutella 1 2.0%
Excellence 85% Cacao Chocolat Noir Puissant Lindt % Lindt 1 2.0%
cioccolato fondente 1 2.0%
Original 1 2.0%
Excellence 70% Cocoa Fondente Intenso 1 2.0%
Cioccolato fondente 1 2.0%
Pringles classiche 175 gr 1 2.0%
Milka 1 2.0%
Mix di frutta secca 1 2.0%
Granola 1 2.0%

editors unknown

Out[25]:

saturn.columns["editors"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped · no profiler for kind=unknown
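The skipped alert means no statistics were computed for this column. Fields of this kind in Open Food Facts exports typically hold JSON arrays rather than scalars; assuming that is the case here (the report does not show raw values), one way to get a usable profile is to summarise list lengths and item frequencies. The helper name and the sample rows below are hypothetical and are not part of saturn.

```python
from collections import Counter


def profile_list_column(values):
    """Summarise a column whose cells are lists (or None for missing)."""
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]
    items = Counter(item for v in present for item in v)
    return {
        "n": len(values),
        "nulls": len(values) - len(present),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
        "unique_items": len(items),
        "top_item": items.most_common(1)[0] if items else None,
    }


# Hypothetical rows for illustration; not taken from the dataset.
sample = [["alice", "bob"], ["bob"], [], None]
summary = profile_list_column(sample)
print(summary)
```

The same approach would apply to the other kind=unknown columns only if they are list-valued; dict-valued fields would need a different flattening step.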

nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients numeric

Out[27]:

saturn.columns["nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients"].stats

stat value
n 50
nulls 5 (10.0%)
unique 1
min 1
max 1
mean 1
median 1
std 0
q1 1
q3 1
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 0
alert: constant · only one distinct value
Fig 12.
Distribution of nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients. Vertical dash marks the median.
Histogram bins for nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients (median: 1.0).
bin count
0.5–0.6667 0
0.6667–0.8333 0
0.8333–1 0
1–1.167 45
1.167–1.333 0
1.333–1.5 0

traces_hierarchy unknown

Out[30]:

saturn.columns["traces_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped · no profiler for kind=unknown

packaging categorical

Out[32]:

saturn.columns["packaging"].stats

stat value
n 50
nulls 6 (12.0%)
unique 41
top_value Plastique
top_rate 0.09091
cardinality 41
entropy 5.278
entropy_ratio 0.9851
alert: long_tail · 40 singleton categories
Fig 13.
Top values for packaging.
Top values for packaging (20 unique shown, of 41 total).
value count share
Plastique 4 8.0%
Packet,Hdpe film-packet,Etui en carton,Film en plastique 1 2.0%
en:Aluminium wrap,en:Box cardboard,en:Caja de cartón,en:Card-box,en:Foil-wrapper,es:Recipiente,pt:Papel de aluminio,Étui carton,Feuille aluminium 1 2.0%
Cardboard,Plastic 1 2.0%
Cardboard,Non-corrugated cardboard 1 2.0%
Plastique,Bouteille ou Flacon,PET 1 - Polytéréphtalate d'éthylène,Bouteille,Bouchon en plastique 1 2.0%
Métal,Papier,en:Recyclable Metals,Aluminium 1 2.0%
Paper/Foil 1 2.0%
Plastique,O 7 - Autres plastiques 1 2.0%
Papier,Enveloppe,en:Package paper,en:Paper recycling 1 2.0%
Métal,Carton,Métaux recyclables,Aluminium 1 2.0%
en:MixedPlasticFilm-packet,en:mixed plastic film-packet 1 2.0%
1 film to recycle, 1 paper wrap to recycle, en:paper-wrapper, en:foil-wrapper 1 2.0%
fr:emballage carton,fr:papier aluminium 1 2.0%
Étui,Carton,Plastique,Sec,Film 1 2.0%
Plastikowe,Mixed plastic-packet,Sachet plastique de 3g, 1 2.0%
Carta,Busta 1 2.0%
Papier 1 2.0%
(blank) 1 2.0%
Plástico 1 2.0%

packagings_n numeric

Out[35]:

saturn.columns["packagings_n"].stats

stat value
n 50
nulls 9 (18.0%)
unique 5
min 1
max 5
mean 2.073
median 2
std 0.8772
q1 2
q3 2
iqr 0
skew 0.9834
kurtosis 1.602
n_outliers 20
outlier_rate 0.4878
zero_rate 0
alert: outliers · 48.8% rows beyond 1.5 IQR
Fig 14.
Distribution of packagings_n. Vertical dash marks the median.
Histogram bins for packagings_n (median: 2.0).
bin count
1–1.667 10
1.667–2.333 21
2.333–3 0
3–3.667 8
3.667–4.333 1
4.333–5 1
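The outliers alert on packagings_n is an artifact of the 1.5 IQR rule rather than a sign of anomalous data. Because q1 and q3 are both 2, the IQR is 0 and the fences collapse onto the value 2, so every row that is not exactly 2 is flagged. The sketch below rebuilds the values from the histogram, assuming packagings_n holds integer counts so that each non-empty bin contains a single value; the quartile method is also an assumption, since the report does not say which one saturn uses.

```python
import statistics

# Values rebuilt from the histogram bins (assumes integer counts):
# 10 ones, 21 twos, 8 threes, one 4 and one 5, for 41 non-null rows.
values = [1] * 10 + [2] * 21 + [3] * 8 + [4, 5]

# Quartiles via the inclusive method (assumed; saturn's method is not stated).
q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Tukey fences: with iqr == 0 both fences equal the quartile value itself.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < lower or v > upper]
outlier_rate = len(outliers) / len(values)

print(q1, q3, iqr, len(outliers), round(outlier_rate, 4))
```

The reconstruction matches the reported mean (2.073), n_outliers (20) and outlier_rate (0.4878), which supports reading the alert as a degenerate-IQR effect.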

categories_properties unknown

Out[38]:

saturn.columns["categories_properties"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped · no profiler for kind=unknown

generic_name_en categorical

Out[40]:

saturn.columns["generic_name_en"].stats

stat value
n 50
nulls 7 (14.0%)
unique 8
top_value (blank)
top_rate 0.8372
cardinality 8
entropy 1.098
entropy_ratio 0.366
alert: long_tail · 7 singleton categories
Fig 15.
Top values for generic_name_en.
Top values for generic_name_en (8 unique shown, of 8 total).
value count share
(blank) 36 72.0%
Extra fine dark chocolate 90% cocoa 1 2.0%
Dark chocolate 1 2.0%
Compound Chocolate with MILK AND ALMONDS 1 2.0%
Lightly sea salted potato chips 1 2.0%
Crackers 1 2.0%
Dark Chocolate 70% cocoa 1 2.0%
Chocolate bar with milk and hazelnuts 1 2.0%

food_groups categorical

Out[43]:

saturn.columns["food_groups"].stats

stat value
n 50
nulls 1 (2.0%)
unique 11
top_value en:biscuits-and-cakes
top_rate 0.3469
cardinality 11
entropy 2.549
entropy_ratio 0.7367
Fig 16.
Top values for food_groups.
Top values for food_groups (11 unique shown, of 11 total).
value count share
en:biscuits-and-cakes 17 34.0%
en:chocolate-products 16 32.0%
en:appetizers 4 8.0%
en:pastries 3 6.0%
en:bread 2 4.0%
en:sweets 2 4.0%
en:dairy-desserts 1 2.0%
en:unsweetened-beverages 1 2.0%
en:cereals 1 2.0%
en:dried-fruits 1 2.0%
en:cereals-and-potatoes 1 2.0%

ingredients_without_ciqual_codes_n numeric

Out[46]:

saturn.columns["ingredients_without_ciqual_codes_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 15
min 0
max 22
mean 4.98
median 3.5
std 4.825
q1 1
q3 7.75
iqr 6.75
skew 1.208
kurtosis 1.491
n_outliers 1
outlier_rate 0.02
zero_rate 0.18
Fig 17.
Distribution of ingredients_without_ciqual_codes_n. Vertical dash marks the median.
Histogram bins for ingredients_without_ciqual_codes_n (median: 3.5).
bin count
0–3.143 25
3.143–6.286 9
6.286–9.429 8
9.429–12.57 4
12.57–15.71 3
15.71–18.86 0
18.86–22 1

origin_sv categorical other

This column is the Swedish-language ('sv') variant of the origin field, and it is effectively empty: 46 of its 50 rows (92%) are null, and the only non-null value, present in the remaining 4 rows, is an empty string. With a cardinality of 1 and an entropy of 0, the column carries no usable signal.

Treatment: Drop — column contains no information (92% null, remaining values are empty strings).

anthropic:default · confidence high
Out[49]:

saturn.columns["origin_sv"].stats

stat value
n 50
nulls 46 (92.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate · 92.0% null
alert: imbalance · top value is 100.0% of rows
Fig 18.
Top values for origin_sv.
Top values for origin_sv (1 unique shown, of 1 total).
value count share
(blank) 4 8.0%
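The drop recommendation rests on a simple test: a column is uninformative when every cell is either null or a blank string. The sketch below shows one way to apply that test across columns before dropping. The helper name is hypothetical, origin_sv is rebuilt from the counts reported here, and the non-null values of the contrast column are invented for illustration (only its 4 nulls match the report).

```python
def is_uninformative(values):
    """True when every cell is None or a string that is empty after stripping."""
    return all(
        v is None or (isinstance(v, str) and v.strip() == "")
        for v in values
    )


# origin_sv as reported: 46 nulls and 4 empty strings.
origin_sv = [None] * 46 + [""] * 4

# Contrast column: 4 nulls as reported for origin_fr; the values are hypothetical.
origin_fr = [None] * 4 + ["France"] * 46

columns = {"origin_sv": origin_sv, "origin_fr": origin_fr}
to_drop = [name for name, cells in columns.items() if is_uninformative(cells)]

print(to_drop)
```

A check like this catches columns that a plain null-rate threshold would miss, since empty strings are counted as non-null values in the stats above.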

product_name_ja categorical

Out[52]:

saturn.columns["product_name_ja"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail · 1 singleton categories
alert: null_rate · 98.0% null
alert: imbalance · top value is 100.0% of rows
Fig 19.
Top values for product_name_ja.
Top values for product_name_ja (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

data_quality_warnings_tags unknown

Out[55]:

saturn.columns["data_quality_warnings_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)
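Columns of kind `unknown` are skipped by the profiler, so they need a separate pass. One plausible reason is that each cell holds a list of tags rather than a scalar, but this report does not show the values, so that is an assumption. Under it, a per-tag frequency count is a reasonable first profile. `tag_frequencies` and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence

def tag_frequencies(rows: Iterable[Sequence[str]]) -> Counter:
    """Count in how many rows each tag appears (duplicates within a row count once)."""
    counts: Counter = Counter()
    for tags in rows:
        counts.update(set(tags or ()))
    return counts

# Hypothetical values; the real column contents are not shown in this report.
rows = [["en:tag-a", "en:tag-b"], ["en:tag-a"], []]
print(tag_frequencies(rows).most_common(1))  # [('en:tag-a', 2)]
```

The same helper would apply to any other `*_tags` column in this report, provided the list-of-strings assumption holds for it.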

packaging_recycling_tags unknown

Out[57]:

saturn.columns["packaging_recycling_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

scores unknown

Out[59]:

saturn.columns["scores"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

nucleotides_prev_tags unknown

Out[61]:

saturn.columns["nucleotides_prev_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

data_quality_dimensions unknown

Out[63]:

saturn.columns["data_quality_dimensions"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

product_name_fi categorical

Out[65]:

saturn.columns["product_name_fi"].stats

stat value
n 50
nulls 45 (90.0%)
unique 4
top_value (blank)
top_rate 0.4
cardinality 4
entropy 1.922
entropy_ratio 0.961
alert: long_tail (3 singleton categories)
alert: null_rate (90.0% null)
Fig 20.
Top values for product_name_fi.
Top values for product_name_fi (4 unique shown, of 4 total).
value  count  share
(blank)  2  4.0%
Excellence: 90% cocoa Dark Supreme  1  2.0%
Arriba 85% Cacao Dark Chocolate  1  2.0%
Original  1  2.0%
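The entropy figures for this column can be reproduced from the value counts. The sketch assumes `entropy` is Shannon entropy in bits over the non-null values and `entropy_ratio` is that entropy divided by log2(cardinality). These definitions are inferred from the reported numbers, not taken from Saturn's documentation.

```python
import math
from typing import Sequence

def shannon_entropy(counts: Sequence[int]) -> float:
    """Shannon entropy in bits of a categorical distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Non-null value counts for product_name_fi: one value twice, three singletons.
counts = [2, 1, 1, 1]
entropy = shannon_entropy(counts)
ratio = entropy / math.log2(len(counts))  # normalise by the maximum, log2(cardinality)
print(round(entropy, 3), round(ratio, 3))  # 1.922 0.961
```

An entropy_ratio near 1 means the non-null values are spread almost evenly, which here simply reflects that four of the five populated rows are distinct.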

origin_de categorical label

This column appears to be the German-language origin field ('origin_de'), but it contains no usable data: 60% of the 50 rows (30) are null, and the only observed value, an empty string, fills the remaining 20 rows. Cardinality is 1, entropy is 0, and top_rate is 1.0, so the column is uninformative in its current state.

Treatment: Drop this column; it carries no information (60% of rows are null and every non-null value is an empty string).

anthropic:default · confidence high
Out[68]:

saturn.columns["origin_de"].stats

stat value
n 50
nulls 30 (60.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (60.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 21.
Top values for origin_de.
Top values for origin_de (1 unique shown, of 1 total).
value  count  share
(blank)  20  40.0%

packaging_lc categorical

Out[71]:

saturn.columns["packaging_lc"].stats

stat value
n 50
nulls 6 (12.0%)
unique 7
top_value fr
top_rate 0.3864
cardinality 7
entropy 1.992
entropy_ratio 0.7094
Fig 22.
Top values for packaging_lc.
Top values for packaging_lc (7 unique shown, of 7 total).
value  count  share
fr  17  34.0%
en  17  34.0%
de  5  10.0%
pt  2  4.0%
it  1  2.0%
es  1  2.0%
hr  1  2.0%
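The `top_rate` in the stats (0.3864) and the `share` in the table (34.0%) describe the same 17 rows but use different denominators. The sketch below shows the arithmetic, on the assumption (inferred from the numbers, not documented) that `top_rate` divides by non-null rows and `share` divides by all rows. `rates` is a hypothetical helper.

```python
def rates(count: int, n_rows: int, n_null: int) -> tuple[float, float]:
    """Return (rate over non-null rows, share over all rows) for one value."""
    return count / (n_rows - n_null), count / n_rows

# packaging_lc: 'fr' appears 17 times in 50 rows, 6 of which are null.
top_rate, share = rates(count=17, n_rows=50, n_null=6)
print(round(top_rate, 4), f"{share:.1%}")  # 0.3864 34.0%
```

The same reading explains why mostly-null columns in this report can raise an imbalance alert of "top value is 100.0% of rows" while the table shows a share of only a few percent.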

correctors_tags unknown

Out[74]:

saturn.columns["correctors_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

categories_hierarchy unknown

Out[76]:

saturn.columns["categories_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

ingredients_ids_debug unknown

Out[78]:

saturn.columns["ingredients_ids_debug"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

traces_lc categorical

Out[80]:

saturn.columns["traces_lc"].stats

stat value
n 50
nulls 2 (4.0%)
unique 6
top_value fr
top_rate 0.4792
cardinality 6
entropy 1.575
entropy_ratio 0.6093
Fig 23.
Top values for traces_lc.
Top values for traces_lc (6 unique shown, of 6 total).
value  count  share
fr  23  46.0%
en  20  40.0%
es  2  4.0%
de  1  2.0%
it  1  2.0%
pl  1  2.0%

environment_impact_level_tags unknown

Out[83]:

saturn.columns["environment_impact_level_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

last_image_t numeric

Out[85]:

saturn.columns["last_image_t"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
min 1.639e+09
max 1.768e+09
mean 1.745e+09
median 1.752e+09
std 2.681e+07
q1 1.735e+09
q3 1.764e+09
iqr 2.896e+07
skew -2.443
kurtosis 7.36
n_outliers 2
outlier_rate 0.04
zero_rate 0
alert: high_skew (skew=-2.44)
Fig 24.
Distribution of last_image_t. Vertical dash marks the median.
Histogram bins for last_image_t (median: 1752195111.0).
bin  count
1.639e+09 – 1.658e+09  2
1.658e+09 – 1.676e+09  0
1.676e+09 – 1.694e+09  0
1.694e+09 – 1.713e+09  0
1.713e+09 – 1.731e+09  5
1.731e+09 – 1.749e+09  17
1.749e+09 – 1.768e+09  26
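The values in this column are easier to read as dates. The sketch assumes `last_image_t` stores Unix timestamps in seconds; the magnitudes (around 1.7e9) fit that reading, but the report does not state the unit. `epoch_to_date` is a hypothetical helper.

```python
from datetime import datetime, timezone

def epoch_to_date(seconds: float) -> str:
    """Format a Unix timestamp (seconds) as an ISO date in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).date().isoformat()

# The median is printed in full in the histogram caption for this column.
print(epoch_to_date(1752195111))  # 2025-07-11
```

Read this way, the rounded min and max (1.639e+09 and 1.768e+09) correspond roughly to December 2021 and January 2026, and the strong negative skew means most products had an image uploaded recently, with two much older stragglers.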

ingredients_that_may_be_from_palm_oil_n numeric

Out[88]:

saturn.columns["ingredients_that_may_be_from_palm_oil_n"].stats

stat value
n 50
nulls 4 (8.0%)
unique 3
min 0
max 2
mean 0.1957
median 0
std 0.4531
q1 0
q3 0
iqr 0
skew 2.23
kurtosis 4.321
n_outliers 8
outlier_rate 0.1739
zero_rate 0.8261
alert: high_skew (skew=+2.23)
alert: outliers (17.4% rows beyond 1.5 IQR)
Fig 25.
Distribution of ingredients_that_may_be_from_palm_oil_n. Vertical dash marks the median.
Histogram bins for ingredients_that_may_be_from_palm_oil_n (median: 0.0).
bin  count
0 – 0.3333  38
0.3333 – 0.6667  0
0.6667 – 1  0
1 – 1.333  7
1.333 – 1.667  0
1.667 – 2  1
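The 17.4% outlier rate here is an artefact of a zero IQR rather than a sign of bad data. The sketch rebuilds the summary stats from the histogram, assuming the column holds whole-number counts (so each non-empty bin corresponds to a single value) and that rates are computed over the 46 non-null rows.

```python
# Histogram of ingredients_that_may_be_from_palm_oil_n, read as value -> row count.
# Assumes the column holds whole-number counts, so each non-empty bin is one value.
value_counts = {0: 38, 1: 7, 2: 1}

n = sum(value_counts.values())  # 46 non-null rows
mean = sum(v * c for v, c in value_counts.items()) / n
zero_rate = value_counts[0] / n
# With q1 = q3 = 0 the IQR is 0, so both 1.5 x IQR fences sit at 0
# and every non-zero row counts as an outlier.
outlier_rate = (n - value_counts[0]) / n

print(n, round(mean, 4), round(zero_rate, 4), round(outlier_rate, 4))
# 46 0.1957 0.8261 0.1739
```

All three figures match the reported stats, so the 8 flagged rows are simply the products with one or two possibly palm-derived ingredients.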

max_imgid categorical

Out[91]:

saturn.columns["max_imgid"].stats

stat value
n 50
nulls 0 (0.0%)
unique 38
top_value 47
top_rate 0.06
cardinality 38
entropy 5.149
entropy_ratio 0.9811
alert: long_tail (27 singleton categories)
Fig 26.
Top values for max_imgid.
Top values for max_imgid (20 unique shown, of 38 total).
value  count  share
47  3  6.0%
108  2  4.0%
13  2  4.0%
12  2  4.0%
6  2  4.0%
7  2  4.0%
88  2  4.0%
15  2  4.0%
82  2  4.0%
68  2  4.0%
79  2  4.0%
28  1  2.0%
235  1  2.0%
9  1  2.0%
105  1  2.0%
80  1  2.0%
158  1  2.0%
11  1  2.0%
73  1  2.0%
66  1  2.0%

nutriscore_tags unknown

Out[94]:

saturn.columns["nutriscore_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

generic_name_sv categorical

Out[96]:

saturn.columns["generic_name_sv"].stats

stat value
n 50
nulls 46 (92.0%)
unique 4
top_value Fin mörk choklad med 90% kakao
top_rate 0.25
cardinality 4
entropy 2
entropy_ratio 1
alert: long_tail (4 singleton categories)
alert: null_rate (92.0% null)
Fig 27.
Top values for generic_name_sv.
Top values for generic_name_sv (4 unique shown, of 4 total).
value  count  share
Fin mörk choklad med 90% kakao  1  2.0%
Mörk choklad  1  2.0%
(blank)  1  2.0%
Kex  1  2.0%

ingredients_text_with_allergens_nb categorical

Out[99]:

saturn.columns["ingredients_text_with_allergens_nb"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (96.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 28.
Top values for ingredients_text_with_allergens_nb.
Top values for ingredients_text_with_allergens_nb (1 unique shown, of 1 total).
value  count  share
(blank)  2  4.0%

quantity categorical

Out[102]:

saturn.columns["quantity"].stats

stat value
n 50
nulls 1 (2.0%)
unique 36
top_value 100 g
top_rate 0.1224
cardinality 36
entropy 4.956
entropy_ratio 0.9587
alert: long_tail (28 singleton categories)
Fig 29.
Top values for quantity.
Top values for quantity (20 unique shown, of 36 total).
value  count  share
100 g  6  12.0%
100g  3  6.0%
125g  2  4.0%
42g  2  4.0%
90g  2  4.0%
(blank)  2  4.0%
100 gram  2  4.0%
230 g  2  4.0%
300 g  1  2.0%
22 g  1  2.0%
230g  1  2.0%
500 ml  1  2.0%
150 g  1  2.0%
304 g  1  2.0%
275 g  1  2.0%
150g  1  2.0%
225 g  1  2.0%
85 g  1  2.0%
36 g  1  2.0%
52  1  2.0%
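Much of the cardinality in this column comes from spelling variants of the same quantity ("100 g", "100g", "100 gram"). The sketch below shows one way such strings might be normalised before profiling. The regular expression and the `UNIT_ALIASES` table are illustrative assumptions covering only the formats visible in the table above; real data will need more cases.

```python
import re
from typing import Optional

# Hypothetical unit aliases for illustration; extend as needed for real data.
UNIT_ALIASES = {"g": "g", "gram": "g", "grams": "g", "ml": "ml"}
QUANTITY_RE = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*([a-zA-Z]*)\s*$")

def parse_quantity(text: str) -> Optional[tuple[float, Optional[str]]]:
    """Parse strings such as '100 g' or '100g' into (amount, unit)."""
    match = QUANTITY_RE.match(text)
    if match is None:
        return None  # blank or free-form text: leave for manual review
    amount = float(match.group(1).replace(",", "."))
    raw_unit = match.group(2).lower()
    # Unknown units are passed through unchanged rather than guessed.
    unit = UNIT_ALIASES.get(raw_unit, raw_unit) if raw_unit else None
    return amount, unit

for raw in ["100 g", "100g", "100 gram", "500 ml", "52", ""]:
    print(repr(raw), "->", parse_quantity(raw))
```

After normalisation the three "100 g" variants collapse into one value covering 11 rows, while unit-less entries such as "52" stay flagged with a unit of `None` instead of being silently assigned grams.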

countries_hierarchy unknown

Out[105]:

saturn.columns["countries_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

data_quality_tags unknown

Out[107]:

saturn.columns["data_quality_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

ingredients_n numeric

Out[109]:

saturn.columns["ingredients_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 22
min 1
max 39
mean 11.7
median 9
std 8.244
q1 5
q3 16
iqr 11
skew 1.237
kurtosis 1.435
n_outliers 2
outlier_rate 0.04
zero_rate 0
Fig 30.
Distribution of ingredients_n. Vertical dash marks the median.
Histogram bins for ingredients_n (median: 9.0).
bin  count
1 – 6.429  18
6.429 – 11.86  9
11.86 – 17.29  13
17.29 – 22.71  6
22.71 – 28.14  2
28.14 – 33.57  0
33.57 – 39  2

grades unknown

Out[112]:

saturn.columns["grades"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

additives_original_tags unknown

Out[114]:

saturn.columns["additives_original_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

nutrition_score_beverage numeric

Out[116]:

saturn.columns["nutrition_score_beverage"].stats

stat value
n 50
nulls 0 (0.0%)
unique 2
min 0
max 1
mean 0.02
median 0
std 0.1414
q1 0
q3 0
iqr 0
skew 6.857
kurtosis 45.02
n_outliers 1
outlier_rate 0.02
zero_rate 0.98
alert: high_skew (skew=+6.86)
Fig 31.
Distribution of nutrition_score_beverage. Vertical dash marks the median.
Histogram bins for nutrition_score_beverage (median: 0.0).
bin  count
0 – 0.1429  49
0.1429 – 0.2857  0
0.2857 – 0.4286  0
0.4286 – 0.5714  0
0.5714 – 0.7143  0
0.7143 – 0.8571  0
0.8571 – 1  1
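In this sample the column takes only the values 0 and 1, so the extreme skew and kurtosis follow directly from the proportion of ones rather than from any unusual rows. The sketch uses the standard closed forms for a 0/1 variable and assumes Saturn reports population skewness and excess kurtosis, an assumption that happens to reproduce the reported figures.

```python
import math

def bernoulli_shape(p: float) -> tuple[float, float]:
    """Population skewness and excess kurtosis of a 0/1 variable with mean p."""
    var = p * (1 - p)
    skew = (1 - 2 * p) / math.sqrt(var)
    excess_kurtosis = (1 - 6 * var) / var
    return skew, excess_kurtosis

# nutrition_score_beverage: 1 row out of 50 is 1, so p = 0.02.
skew, kurt = bernoulli_shape(0.02)
print(round(skew, 3), round(kurt, 2))  # 6.857 45.02
```

For a flag like this, the `high_skew` alert carries no extra information beyond the zero_rate of 0.98; treating the column as a boolean is likely more useful than treating it as numeric.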

packaging_text_nl categorical other

This column appears to hold Dutch-language packaging text, but it is effectively empty: 76% of the 50 rows (38) are null, and the only non-null value is an empty string that appears 12 times, giving a cardinality of 1 and zero entropy. Every row is either missing or blank, so the column carries no usable information in this sample.

Treatment: Drop this column; it contains no informative values in this sample.

anthropic:default · confidence high
Out[119]:

saturn.columns["packaging_text_nl"].stats

stat value
n 50
nulls 38 (76.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (76.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 32.
Top values for packaging_text_nl.
Top values for packaging_text_nl (1 unique shown, of 1 total).
value  count  share
(blank)  12  24.0%

photographers unknown

Out[122]:

saturn.columns["photographers"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

pnns_groups_1 categorical

Out[124]:

saturn.columns["pnns_groups_1"].stats

stat value
n 50
nulls 0 (0.0%)
unique 7
top_value Sugary snacks
top_rate 0.76
cardinality 7
entropy 1.36
entropy_ratio 0.4846
Fig 33.
Top values for pnns_groups_1.
Top values for pnns_groups_1 (7 unique shown, of 7 total).
value  count  share
Sugary snacks  38  76.0%
Salty snacks  4  8.0%
Cereals and potatoes  3  6.0%
unknown  2  4.0%
Milk and dairy products  1  2.0%
Beverages  1  2.0%
Fruits and vegetables  1  2.0%

product_name_en categorical

Out[127]:

saturn.columns["product_name_en"].stats

stat value
n 50
nulls 7 (14.0%)
unique 34
top_value (blank)
top_rate 0.2326
cardinality 34
entropy 4.654
entropy_ratio 0.9147
alert: long_tail (33 singleton categories)
Fig 34.
Top values for product_name_en.
Top values for product_name_en (20 unique shown, of 34 total).
value  count  share
(blank)  10  20.0%
Perly  1  2.0%
Prince Gout Chocolat  1  2.0%
Edelbitter-Schokolade  1  2.0%
tonik  1  2.0%
Gerblé - Sesame Cookie, 230g (8.2oz)  1  2.0%
Chocolat noir - 85% cacao  1  2.0%
Hhhhh  1  2.0%
Organic 70% Dark Chocolate Bar  1  2.0%
biscuits  1  2.0%
AUTHENTIQUE  1  2.0%
Tyrell's Lightly Sea Salted  1  2.0%
Intense dark chocolate  1  2.0%
Excellence 85% Cacao Chocolat Noir Puissant Lindt % Lindt  1  2.0%
Filled - Dark Chocolate  1  2.0%
Extra dark 74% Cocoa  1  2.0%
Fine Rye Crispbread - Fibre  1  2.0%
Tuc Original  1  2.0%
Intense Dark 70% Cocoa  1  2.0%
Gerblé - Apple Hazelnut Cookie, 230g (8.2oz)  1  2.0%

traces_from_user categorical

Out[130]:

saturn.columns["traces_from_user"].stats

stat value
n 50
nulls 0 (0.0%)
unique 35
top_value (en)
top_rate 0.14
cardinality 35
entropy 4.811
entropy_ratio 0.9379
alert: long_tail (29 singleton categories)
Fig 35.
Top values for traces_from_user.
Top values for traces_from_user (20 unique shown, of 35 total).
value  count  share
(en)  7  14.0%
(fr)  4  8.0%
(fr) en:milk,en:nuts  4  8.0%
(en) en:milk,en:nuts  2  4.0%
(en) en:milk,en:nuts,en:sesame-seeds,en:soybeans  2  4.0%
(en) en:nuts  2  4.0%
(en) Eggs  1  2.0%
(fr) en:milk,en:nuts,en:soybeans  1  2.0%
(fr) en:eggs,en:lupin,en:milk,en:mustard,en:nuts,en:soybeans  1  2.0%
(fr) en:milk,en:soybeans  1  2.0%
(fr) Lait,Fruits à coque  1  2.0%
(fr) Soja  1  2.0%
(es) en:mustard  1  2.0%
(fr) en:lupin,en:milk,en:mustard,en:sesame-seeds,en:soybeans  1  2.0%
(fr) en:milk,en:nuts,en:sesame-seeds,en:soybeans  1  2.0%
(en) en:milk  1  2.0%
(fr) Lait,Fruits à coque,Graines de sésame,Soja  1  2.0%
(en) en:eggs,en:mustard,en:nuts,en:sesame-seeds,en:soybeans  1  2.0%
(en) en:gluten,en:Amande,en:Arachides,en:Avoine,en:Blé,en:Lait,en:Noisettes,en:Noix,en:Noix de cajou,en:Noix de macadamia,en:Noix de pécan,en:Noix du brésil,en:Orge,en:Pistaches,en:Seigle  1  2.0%
(fr) en:lupin,en:milk,en:mustard,en:soybeans  1  2.0%
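Each value in this column packs two things into one string: a language code in parentheses and a comma-separated list of trace allergens. Splitting them makes the column far less long-tailed. The sketch below assumes every entry follows the `(xx) tag,tag,...` shape seen in the table above; `parse_traces` is a hypothetical helper and raises on anything else.

```python
import re

ENTRY_RE = re.compile(r"^\((?P<lang>[a-z]{2})\)\s*(?P<tags>.*)$")

def parse_traces(entry: str) -> tuple[str, list[str]]:
    """Split '(fr) en:milk,en:nuts' into ('fr', ['en:milk', 'en:nuts'])."""
    match = ENTRY_RE.match(entry.strip())
    if match is None:
        raise ValueError(f"unexpected format: {entry!r}")
    tags = [t.strip() for t in match.group("tags").split(",") if t.strip()]
    return match.group("lang"), tags

print(parse_traces("(fr) en:milk,en:nuts"))  # ('fr', ['en:milk', 'en:nuts'])
print(parse_traces("(en) "))                 # ('en', [])
```

Entries that carry only a language code, which account for 11 of the 50 rows here, parse to an empty tag list and can then be counted as "no traces declared" instead of as distinct categories.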

generic_name_nl categorical

Out[133]:

saturn.columns["generic_name_nl"].stats

stat value
n 50
nulls 38 (76.0%)
unique 4
top_value (blank)
top_rate 0.75
cardinality 4
entropy 1.208
entropy_ratio 0.6038
alert: long_tail (3 singleton categories)
alert: null_rate (76.0% null)
Fig 36.
Top values for generic_name_nl.
Top values for generic_name_nl (4 unique shown, of 4 total).
value  count  share
(blank)  9  18.0%
Extra fijne pure chocolade  1  2.0%
Biscuits bedekt met melkchocolade  1  2.0%
Krokante volkorentoasts  1  2.0%

nutrition_grade_fr categorical

Out[136]:

saturn.columns["nutrition_grade_fr"].stats

stat value
n 50
nulls 0 (0.0%)
unique 6
top_value e
top_rate 0.54
cardinality 6
entropy 1.913
entropy_ratio 0.7399
Fig 37.
Top values for nutrition_grade_fr.
Top values for nutrition_grade_fr (6 unique shown, of 6 total).
value  count  share
e  27  54.0%
d  9  18.0%
c  7  14.0%
a  4  8.0%
b  2  4.0%
unknown  1  2.0%

image_front_thumb_url categorical

Out[139]:

saturn.columns["image_front_thumb_url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.100.jpg
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail (50 singleton categories)
Fig 38.
Top values for image_front_thumb_url.
Top values for image_front_thumb_url (20 unique shown, of 50 total).
value  count  share
https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/762/221/044/9283/front_en.605.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/304/692/002/9759/front_en.492.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/611/103/100/5064/front_fr.56.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/317/568/001/1480/front_en.221.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/000/002/099/5553/front_en.314.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/326/884/000/1008/front_fr.422.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1044/front_fr.50.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/842/519/771/2024/front_en.60.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/762/221/057/8464/front_en.29.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/611/125/934/3108/front_fr.25.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1228/front_fr.38.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/800/050/031/0427/front_fr.488.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/730/040/048/1595/front_fr.242.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2651/front_en.159.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/506/004/264/1000/front_en.179.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/762/221/058/4724/front_en.95.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2606/front_en.102.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/322/982/010/0234/front_fr.246.100.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/000/002/002/2464/front_en.301.100.jpg  1  2.0%

last_editor categorical

Out[142]:

saturn.columns["last_editor"].stats

stat value
n 50
nulls 1 (2.0%)
unique 24
top_value foodless
top_rate 0.4286
cardinality 24
entropy 3.513
entropy_ratio 0.7662
alert: long_tail (19 singleton categories)
Fig 39.
Top values for last_editor.
Top values for last_editor (20 unique shown, of 24 total).
value  count  share
foodless  21  42.0%
municorn-calorie-counter-app  3  6.0%
charlesnepote  2  4.0%
macrofactor  2  4.0%
bodysupport  2  4.0%
moon-rabbit  1  2.0%
gmlaa  1  2.0%
prepperapp  1  2.0%
marmotte73  1  2.0%
laura-chaud  1  2.0%
org-barilla-france-sa  1  2.0%
tom1707  1  2.0%
bubu63  1  2.0%
moncoachigbas  1  2.0%
natrius  1  2.0%
clxtng  1  2.0%
roboto-app  1  2.0%
fgouget  1  2.0%
ludolm  1  2.0%
foodiq  1  2.0%

nutrient_levels_tags unknown

Out[145]:

saturn.columns["nutrient_levels_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

product_name_nb categorical

Out[147]:

saturn.columns["product_name_nb"].stats

stat value
n 50
nulls 48 (96.0%)
unique 2
top_value (blank)
top_rate 0.5
cardinality 2
entropy 1
entropy_ratio 1
alert: long_tail (2 singleton categories)
alert: null_rate (96.0% null)
Fig 40.
Top values for product_name_nb.
Top values for product_name_nb (2 unique shown, of 2 total).
value  count  share
(blank)  1  2.0%
99% mørk sjokolade  1  2.0%

packaging_shapes_tags unknown

Out[150]:

saturn.columns["packaging_shapes_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

_keywords unknown

Out[152]:

saturn.columns["_keywords"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

emb_codes_tags unknown

Out[154]:

saturn.columns["emb_codes_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

images unknown

Out[156]:

saturn.columns["images"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

states_tags unknown

Out[158]:

saturn.columns["states_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

packaging_text_sv categorical other

This column appears to be the Swedish-language packaging text field (the 'sv' suffix indicates the Swedish locale), but it is effectively empty in this sample. A 92% null rate leaves only 4 non-null rows, and all 4 contain an empty string, so there is no usable content in any of the 50 rows. The cardinality of 1 and entropy of 0 confirm that the column carries no signal.

Treatment: Drop. 92% of rows are null and every non-null value is an empty string, so the column yields no usable signal.

anthropic:default · confidence high
Out[160]:

saturn.columns["packaging_text_sv"].stats

stat value
n 50
nulls 46 (92.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (92.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 41.
Top values for packaging_text_sv.
Top values for packaging_text_sv (1 unique shown, of 1 total).
value  count  share
(blank)  4  8.0%

informers_tags unknown

Out[163]:

saturn.columns["informers_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

ingredients_text_pl categorical

Out[165]:

saturn.columns["ingredients_text_pl"].stats

stat value
n 50
nulls 45 (90.0%)
unique 3
top_value (blank)
top_rate 0.6
cardinality 3
entropy 1.371
entropy_ratio 0.865
alert: long_tail (2 singleton categories)
alert: null_rate (90.0% null)
Fig 42.
Top values for ingredients_text_pl.
Top values for ingredients_text_pl (3 unique shown, of 3 total).
value  count  share
(blank)  3  6.0%
Miazga kakaowa, cukier, tłuszcz kakaowy, kakao w proszku o obniżonej zawartości tłuszczu, emulgator: lecytyny (soja); naturalny aromat waniliowy. Czekolada gorzka: masa kakaowa minimum 74 %. Może zawierać orzeszki ziemne, orzechy, mleko i gluten (pszenica, żyt jęczmień, owies, pszenica orkisz i pszenica khorosan).  1  2.0%
Miazga kakaowa, cukier, tłuszcz kakaowy, wanilia.  1  2.0%

labels categorical

Out[168]:

saturn.columns["labels"].stats

stat value
n 50
nulls 1 (2.0%)
unique 42
top_value (blank)
top_rate 0.1633
cardinality 42
entropy 5.125
entropy_ratio 0.9504
alert: long_tail (41 singleton categories)
Fig 43.
Top values for labels.
Top values for labels (20 unique shown, of 42 total).
value  count  share
(blank)  8  16.0%
Distributor labels,Charte LU Harmony,Triman  1  2.0%
Point Vert,Triman  1  2.0%
No preservatives, Made in France, Natural flavors, No colorings, No palm oil, Nutriscore, Nutriscore Grade B, Triman, en:green-dot  1  2.0%
Vegetarian,Fair trade,Fairtrade International,No artificial flavors,Vegan,Fairtrade cocoa,FSC,FSC Mix,Max Havelaar  1  2.0%
Triman,Sans Nitrates  1  2.0%
Green Dot,Made in Spain,Ce  1  2.0%
Commerce équitable,Bio,Végétarien,Bio européen,Fairtrade International,Végétalien,PL-EKO-07,en:Soil Association Organic,The Vegan Society,en:Commerce équitable  1  2.0%
Green Dot  1  2.0%
Vegetariano,fr:Ponto Verde  1  2.0%
Végétarien,Point Vert,Triman  1  2.0%
Sans conservateurs,Fabriqué en France,Triman,Lindt & Sprüngli Cacao Farming Program  1  2.0%
No gluten,Vegetarian,No artificial flavors,Vegan,Assured Food Standards,Green Dot,No artificial colors,No flavour enhancer,No MSG,Triman,Made-in-england,Terracycle  1  2.0%
Commerce équitable,Bio,Végétarien,Bio européen,Fairtrade International,Agriculture non UE,Végétalien,FR-BIO-01,en:FSC,FSC Mix,Point Vert,Max Havelaar,PL-EKO-07,en:Soil Association Organic,The Vegan Society  1  2.0%
Agriculture non UE,Fabriqué en Belgique,Fabriqué en France,Sans huile de palme,Triman  1  2.0%
Organic,EU Organic,Non-EU Agriculture,Certified B Corporation,EU Agriculture,EU/non-EU Agriculture,FR-BIO-01,No palm oil,Pure cocoa butter,AB Agriculture Biologique,fr:Farine de blé français  1  2.0%
Vegetarian,Fair trade,Fairtrade International,Vegan,Fairtrade cocoa,Pure cocoa butter,Rainforest Alliance,Rainforest Alliance Cocoa,Commerce-equitable  1  2.0%
Végétarien,Source de fibres alimentaires,Point Vert,Riche en fibres,Triman,Emballage-recyclable  1  2.0%
Halal  1  2.0%
en:Unknown  1  2.0%

sources unknown

Out[171]:

saturn.columns["sources"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

checkers_tags unknown

Out[173]:

saturn.columns["checkers_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

product_quantity_unit categorical

Out[175]:

saturn.columns["product_quantity_unit"].stats

stat value
n 50
nulls 5 (10.0%)
unique 2
top_value g
top_rate 0.9778
cardinality 2
entropy 0.1537
entropy_ratio 0.1537
alert: imbalance (top value is 97.8% of rows)
Fig 44.
Top values for product_quantity_unit.
Top values for product_quantity_unit (2 unique shown, of 2 total).
value  count  share
g  44  88.0%
ml  1  2.0%

last_modified_by categorical

Out[178]:

saturn.columns["last_modified_by"].stats

statvalue
n50
nulls1 (2.0%)
unique24
top_value foodless
top_rate 0.4286
cardinality 24
entropy 3.513
entropy_ratio 0.7662
alert: long_tail19 singleton categories
Fig 45.
Top values for last_modified_by.
Show data table
Top values for last_modified_by (20 unique shown, of 24 total).
valuecountshare
foodless2142.0%
municorn-calorie-counter-app36.0%
charlesnepote24.0%
macrofactor24.0%
bodysupport24.0%
moon-rabbit12.0%
gmlaa12.0%
prepperapp12.0%
marmotte7312.0%
laura-chaud12.0%
org-barilla-france-sa12.0%
tom170712.0%
bubu6312.0%
moncoachigbas12.0%
natrius12.0%
clxtng12.0%
roboto-app12.0%
fgouget12.0%
ludolm12.0%
foodiq12.0%

image_front_url categorical

Out[181]:

saturn.columns["image_front_url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.400.jpg
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail (50 singleton categories)
Fig 46.
Top values for image_front_url.
Top values for image_front_url (20 unique shown, of 50 total).
value  count  share
https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/762/221/044/9283/front_en.605.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/304/692/002/9759/front_en.492.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/611/103/100/5064/front_fr.56.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/317/568/001/1480/front_en.221.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/000/002/099/5553/front_en.314.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/326/884/000/1008/front_fr.422.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1044/front_fr.50.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/842/519/771/2024/front_en.60.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/762/221/057/8464/front_en.29.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/611/125/934/3108/front_fr.25.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1228/front_fr.38.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/800/050/031/0427/front_fr.488.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/730/040/048/1595/front_fr.242.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2651/front_en.159.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/506/004/264/1000/front_en.179.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/762/221/058/4724/front_en.95.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2606/front_en.102.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/322/982/010/0234/front_fr.246.400.jpg  1  2.0%
https://images.openfoodfacts.org/images/products/000/002/002/2464/front_en.301.400.jpg  1  2.0%

nutrition_data_prepared categorical

Out[184]:

saturn.columns["nutrition_data_prepared"].stats

stat value
n 50
nulls 2 (4.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance (top value is 100.0% of rows)
Fig 47.
Top values for nutrition_data_prepared.
Top values for nutrition_data_prepared (1 unique shown, of 1 total).
value  count  share
(blank)  48  96.0%

packaging_text_fi categorical metadata

This column appears to hold Finnish-language packaging text, but it is almost entirely empty: 90% of the 50 rows (45) are null, and the only value in the 5 populated rows is an empty string. With a cardinality of 1 and an entropy of 0, the column carries no information.

Treatment: Drop this column; it is 90% null, the remaining rows hold only empty strings, and it provides no signal for modelling or analysis.

anthropic:default · confidence high
Out[187]:

saturn.columns["packaging_text_fi"].stats

stat value
n 50
nulls 45 (90.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (90.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 48.
Top values for packaging_text_fi.
Top values for packaging_text_fi (1 unique shown, of 1 total).
value  count  share
(blank)  5  10.0%

interface_version_created categorical

Out[190]:

saturn.columns["interface_version_created"].stats

stat value
n 50
nulls 1 (2.0%)
unique 3
top_value 20120622
top_rate 0.5918
cardinality 3
entropy 1.167
entropy_ratio 0.7363
Fig 49.
Top values for interface_version_created.
Top values for interface_version_created (3 unique shown, of 3 total).
value  count  share
20120622  29  58.0%
20150316.jqm2  18  36.0%
20130323.jqm2  2  4.0%

nutrient_levels unknown

Out[193]:

saturn.columns["nutrient_levels"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

languages_tags unknown

Out[195]:

saturn.columns["languages_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

vitamins_prev_tags unknown

Out[197]:

saturn.columns["vitamins_prev_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

other_nutritional_substances_tags unknown

Out[199]:

saturn.columns["other_nutritional_substances_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (not reported)
alert: skipped (no profiler for kind=unknown)

product_name_de categorical

Out[201]:

saturn.columns["product_name_de"].stats

stat value
n 50
nulls 30 (60.0%)
unique 16
top_value (blank)
top_rate 0.25
cardinality 16
entropy 3.741
entropy_ratio 0.9354
alert: long_tail (15 singleton categories)
alert: null_rate (60.0% null)
Fig 50.
Top values for product_name_de.
Top values for product_name_de (16 unique shown, of 16 total).
value  count  share
(blank)  5  10.0%
Edelbitterschokolade Mild 90%  1  2.0%
Edelbitter mild 85%  1  2.0%
Knusprige Kekse mit einem cremigen Herz aus Nutella®  1  2.0%
Lightly Sea Salted  1  2.0%
85% kraftvoller schwarzer Kakao  1  2.0%
Noir intense 74%cacao  1  2.0%
Tuc Original  1  2.0%
Schokolade Ecuador Edelbitter 70% Cacao  1  2.0%
Nutella  1  2.0%
Bitter Extra Kraftig  1  2.0%
Schokolade (Alpenmilch Schokolade)  1  2.0%
Granatapfel Sauerkirsche Fruchtgummi  1  2.0%
Bio-Bitterschokolade 70%  1  2.0%
Nuss-Frucht-Mix  1  2.0%
Dark Milde Edelbitter Scholade 70%  1  2.0%

nutrition_grades categorical

Out[204]:

saturn.columns["nutrition_grades"].stats

stat value
n 50
nulls 0 (0.0%)
unique 6
top_value e
top_rate 0.54
cardinality 6
entropy 1.913
entropy_ratio 0.7399
Fig 51.
Top values for nutrition_grades.
Top values for nutrition_grades (6 unique shown, of 6 total).
value  count  share
e  27  54.0%
d  9  18.0%
c  7  14.0%
a  4  8.0%
b  2  4.0%
unknown  1  2.0%
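The value counts for nutrition_grades are identical to those reported earlier for nutrition_grade_fr, which suggests one column may duplicate the other. Matching counts are only a hint, since two columns can share a distribution while differing row by row, so the sketch pairs the cheap profile comparison with the row-wise check that would be needed on the raw data. `columns_identical` is a hypothetical helper.

```python
from collections import Counter

# Value counts as reported in this notebook for the two grade columns.
nutrition_grade_fr = Counter({"e": 27, "d": 9, "c": 7, "a": 4, "b": 2, "unknown": 1})
nutrition_grades = Counter({"e": 27, "d": 9, "c": 7, "a": 4, "b": 2, "unknown": 1})

same_profile = nutrition_grade_fr == nutrition_grades
print(same_profile)  # True

def columns_identical(left: list, right: list) -> bool:
    """Row-wise equality check, which is what actually justifies dropping one column."""
    return len(left) == len(right) and all(a == b for a, b in zip(left, right))
```

The same two-step check would apply to last_editor and last_modified_by, whose stats and top-value tables in this report also match exactly.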

countries_beforescanbot categorical

Out[207]:

saturn.columns["countries_beforescanbot"].stats

stat value
n 50
nulls 7 (14.0%)
unique 38
top_value France
top_rate 0.1395
cardinality 38
entropy 5.066
entropy_ratio 0.9653
alert: long_tail (37 singleton categories)
Fig 52.
Top values for countries_beforescanbot.
Top values for countries_beforescanbot (20 unique shown, of 38 total).
value  count  share
France  6  12.0%
Maroc  1  2.0%
Belgique,France,Polynésie française,Guadeloupe,Luxembourg,Portugal,La Réunion  1  2.0%
Argelia,Bélgica,República Checa,Finlandia,Francia,Polinesia Francesa,Alemania,Italia,Mauricio,Marruecos,Países Bajos,Reunión,Singapur,España,Suecia,Suiza,Reino Unido  1  2.0%
en:Morocco  1  2.0%
nl:Duitsland,nl:Slovenië,nl:Spanje,nl:Frankrijk  1  2.0%
Belgique,Côte d'Ivoire,France,Luxembourg,Mali,Martinique,Russie,Suisse,Royaume-Uni  1  2.0%
Algérie, Cameroun, France, Maroc, en:spain  1  2.0%
France,Suède,Royaume-Uni  1  2.0%
France,Allemagne,Italie  1  2.0%
France,Italie,Espagne,Suisse  1  2.0%
Česko,Francie,Německo,Guadeloupe,Itálie,en:algerie,en:espagne,en:la-reunion,en:royaume-uni,en:suisse  1  2.0%
Belgique,France,Royaume-Uni  1  2.0%
Austria,France,Italy,Réunion,Spain,Alemania,Belgica,Francia,Paises-bajos,Suiza  1  2.0%
Finland,France,Germany,Spain  1  2.0%
France,Guadeloupe,La Réunion,Suisse,en:en  1  2.0%
en:fr  1  2.0%
Germany  1  2.0%
Australia, Belgium, Denmark, Estonia, France, Germany, Hungary, Italy, Lebanon, Portugal, Serbia, Spain, Switzerland, United Kingdom, en:nl  1  2.0%
Belgique,France,Pays-Bas,Sénégal  1  2.0%
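Almost every value in this column is a singleton because each cell is a free-text, comma-separated list of countries, sometimes with a two-letter language prefix. Splitting cells into individual entries is a reasonable first step before counting. The sketch below assumes commas never occur inside a country name and that prefixes always have the form `xx:`; `split_countries` is a hypothetical helper.

```python
import re

PREFIX_RE = re.compile(r"^[a-z]{2}:")

def split_countries(cell: str) -> list[str]:
    """Split a comma-separated country cell and drop 'xx:' language prefixes."""
    parts = (p.strip() for p in cell.split(","))
    return [PREFIX_RE.sub("", p) for p in parts if p]

print(split_countries("Algérie, Cameroun, France, Maroc, en:spain"))
# ['Algérie', 'Cameroun', 'France', 'Maroc', 'spain']
print(split_countries("nl:Duitsland,nl:Slovenië,nl:Spanje,nl:Frankrijk"))
# ['Duitsland', 'Slovenië', 'Spanje', 'Frankrijk']
```

Splitting alone does not merge translated names such as France, Francia, Francie and Frankrijk; that step needs a country lookup table, which is outside the scope of this sketch.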

ingredients_text_with_allergens_es categorical

Out[210]:

saturn.columns["ingredients_text_with_allergens_es"].stats

stat value
n 50
nulls 31 (62.0%)
unique 13
top_value (blank)
top_rate 0.3684
cardinality 13
entropy 3.214
entropy_ratio 0.8684
alert: long_tail (12 singleton categories)
alert: null_rate (62.0% null)
Fig 53.
Top values for ingredients_text_with_allergens_es.
Top values for ingredients_text_with_allergens_es (13 unique shown, of 13 total).
value  count  share
(blank)  7  14.0%
Pasta de cacao, manteca de cacao, cacao magro en polvo, azúcar, vainilla.  1  2.0%
Azúcar, Grasa vegetal de palmiste parcialmente hidrogenada, Leche en polvo, Almendras, Cacao desgrasado en polvo, suero lácteo en polvo, Emulgente (lecitina de soja), aroma (vainilla).  1  2.0%
Crema de avellanas y cacao 40% (azúcar, manteca de palma, avellanas 13%, leche desnatada en polvo 8,7%, cacao desgrasado 7.4%, emulgentes (lecitinas (soja), vainillina), harina de trigo 32,5%, grasas vegetales (palma, palmiste), azúcar de caña 8,5% (trigo), lactosa, salvado de trigo, leche entera en polvo, extracto en polvo de malta de cebada y maíz, miel, gasificantes (difosfato disódico, carbonato ácido de sodio, carbonato ácido de amonio), cacao desgrasado, sal, almidón de trigo, harina de cebada, malteada, emulsionantes (lecitinas (soja), vainillina.  1  2.0%
70% pasta de cacao*, azúcar, rnanteca de cacao, cacao desgrasado en polvo, emulgente: lecitlna de girasol (E-322), aroma natural de vainilla. *Pasta de cacao Ralnforest Alliance Certified cocoa. Cacao: 74% mínimo.  1  2.0%
Harina de TRIGO, grasa de palma, extracto de malta de CEBADA, gasificantes (carbonatos de amonio, carbonatos de sodio), sal, HUEVO, aroma, agente de tratamiento de la harina (METABISULFITO sódico).  1  2.0%
Pasta de cacao, azúcar, manteca de cacao, vainilla.  1  2.0%
Pasta de cacao, azúcar, manteca de cacao, emulgente: lecitina de girasol (E-322), extracto de vainilla. Cacao: 70% mínimo.  1  2.0%
Copos de avena integral (60%),azúcar, aceite refinado de girasol, miel (3%), sal, melaza de caña, emulgente (lecitina de girasol), gasificante (carbonato ácido de sodio),  1  2.0%
Pasta de cacao, cacao magro, manteca de cacao, azúcar moreno de caña  1  2.0%
Zucker, Kakaobutter, Magermilchpulver, Kakaomasse, Molkenpulver (Milch), Butterreinfett, Emulgator (Sojalecithin), Haselnusspaste, natürliches Aroma  1  2.0%
pasta de cacao, azúcar, manteca de cacao, emulgente (lecitina de soja), vainilla. Cacao: 70% mínimo.  1  2.0%
Pasta de cacao, cacao desgrasado en polvo, manteca de cacao, azúcar, leche en polvo, pasta de almendras y avellanas, emulgentes (lecitinas de soja, girasol), aroma  1  2.0%

labels_lc categorical

Out[213]:

saturn.columns["labels_lc"].stats

stat value
n 50
nulls 1 (2.0%)
unique 6
top_value en
top_rate 0.449
cardinality 6
entropy 1.57
entropy_ratio 0.6072
Fig 54.
Top values for labels_lc.
Top values for labels_lc (6 unique shown, of 6 total).
value count share
en 22 44.0%
fr 22 44.0%
es 2 4.0%
de 1 2.0%
it 1 2.0%
pl 1 2.0%

nova_group_debug categorical

Out[216]:

saturn.columns["nova_group_debug"].stats

stat value
n 50
nulls 0 (0.0%)
unique 3
top_value (empty)
top_rate 0.96
cardinality 3
entropy 0.2823
entropy_ratio 0.1781
alert: long_tail 2 singleton categories
alert: imbalance top value is 96.0% of rows
Fig 55.
Top values for nova_group_debug.
Top values for nova_group_debug (3 unique shown, of 3 total).
value count share
(empty) 48 96.0%
no nova group if too many ingredients are unknown: 5 out of 5 1 2.0%
no nova group if too many ingredients are unknown: 13 out of 13 1 2.0%

nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value numeric

Out[219]:

saturn.columns["nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value"].stats

stat value
n 50
nulls 4 (8.0%)
unique 6
min 0
max 50
mean 1.652
median 0
std 7.551
q1 0
q3 0
iqr 0
skew 5.932
kurtosis 35.23
n_outliers 5
outlier_rate 0.1087
zero_rate 0.8913
alert: high_skew skew=+5.93
alert: outliers 10.9% rows beyond 1.5 IQR
Fig 56.
Distribution of nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value. Vertical dash marks the median.
Histogram bins for nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients_value (median: 0.0).
bin count
0 – 8.333 44
8.333 – 16.67 1
16.67 – 25 0
25 – 33.33 0
33.33 – 41.67 0
41.67 – 50 1

lc categorical

Out[222]:

saturn.columns["lc"].stats

stat value
n 50
nulls 0 (0.0%)
unique 5
top_value fr
top_rate 0.7
cardinality 5
entropy 1.294
entropy_ratio 0.5572
Fig 57.
Top values for lc.
Top values for lc (5 unique shown, of 5 total).
value count share
fr 35 70.0%
en 10 20.0%
de 3 6.0%
bg 1 2.0%
ro 1 2.0%

allergens_from_user categorical

Out[225]:

saturn.columns["allergens_from_user"].stats

stat value
n 50
nulls 0 (0.0%)
unique 34
top_value (fr)
top_rate 0.16
cardinality 34
entropy 4.636
entropy_ratio 0.9112
alert: long_tail 30 singleton categories
Fig 58.
Top values for allergens_from_user.
Top values for allergens_from_user (20 unique shown, of 34 total).
value count share
(fr) 8 16.0%
(en) 7 14.0%
(fr) en:gluten 3 6.0%
(en) en:soybeans, en:soybeans 2 4.0%
(en) en:banana,en:milk 1 2.0%
(en) Eggs,Gluten,Milk,Soybeans, en:milk 1 2.0%
(fr) Gluten,Lait,Soja, en:gluten 1 2.0%
(en) en:milk,en:nuts,en:soybeans 1 2.0%
(fr) Gluten,Lait 1 2.0%
(es) en:gluten,en:milk,en:nuts,en:soybeans 1 2.0%
(en) en:gluten,en:milk,en:soybeans 1 2.0%
(fr) en:gluten,en:sesame-seeds 1 2.0%
(fr) Gluten 1 2.0%
(fr) en:gluten,en:milk,en:soybeans 1 2.0%
(de) en:eggs,en:gluten,en:sulphur-dioxide-and-sulphites 1 2.0%
(en) en:gluten,en:nuts 1 2.0%
(fr) en:soybeans 1 2.0%
(en) en:milk,en:nuts,en:soybeans, en:soybeans 1 2.0%
(it) en:gluten 1 2.0%
(es) 1 2.0%

debug_param_sorted_langs unknown

Out[228]:

saturn.columns["debug_param_sorted_langs"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ecoscore_tags unknown

Out[230]:

saturn.columns["ecoscore_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

nutriscore_score_opposite numeric

Out[232]:

saturn.columns["nutriscore_score_opposite"].stats

stat value
n 50
nulls 1 (2.0%)
unique 28
min -40
max 0
mean -17.47
median -19
std 9.906
q1 -25
q3 -10
iqr 15
skew 0.1616
kurtosis -0.5337
n_outliers 0
outlier_rate 0
zero_rate 0.08163
Fig 59.
Distribution of nutriscore_score_opposite. Vertical dash marks the median.
Histogram bins for nutriscore_score_opposite (median: -19.0).
bin count
-40 – -34.29 2
-34.29 – -28.57 2
-28.57 – -22.86 12
-22.86 – -17.14 13
-17.14 – -11.43 7
-11.43 – -5.714 5
-5.714 – 0 8

image_small_url categorical

Out[235]:

saturn.columns["image_small_url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.200.jpg
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 60.
Top values for image_small_url.
Top values for image_small_url (20 unique shown, of 50 total).
value count share
https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/762/221/044/9283/front_en.605.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/9759/front_en.492.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/611/103/100/5064/front_fr.56.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/317/568/001/1480/front_en.221.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/000/002/099/5553/front_en.314.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/326/884/000/1008/front_fr.422.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1044/front_fr.50.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/842/519/771/2024/front_en.60.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/762/221/057/8464/front_en.29.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/611/125/934/3108/front_fr.25.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1228/front_fr.38.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/800/050/031/0427/front_fr.488.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/730/040/048/1595/front_fr.242.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2651/front_en.159.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/506/004/264/1000/front_en.179.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/762/221/058/4724/front_en.95.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2606/front_en.102.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/322/982/010/0234/front_fr.246.200.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/000/002/002/2464/front_en.301.200.jpg 1 2.0%

codes_tags unknown

Out[238]:

saturn.columns["codes_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

pnns_groups_2_tags unknown

Out[240]:

saturn.columns["pnns_groups_2_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_analysis_tags unknown

Out[242]:

saturn.columns["ingredients_analysis_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

purchase_places_tags unknown

Out[244]:

saturn.columns["purchase_places_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

unique_scans_n numeric feature

This column counts unique scans per product (most likely barcode scan events), with 50 observations and no nulls. The middle half of the values lies between 362.75 (Q1) and 560.75 (Q3), but a right-skewed tail (skew=3.91, kurtosis=18.71) driven by 4 outliers pulls the mean (525.38) well above the median (432.0); the maximum of 2257.0 is roughly 4× the Q3 value. An outlier rate of 8% in just 50 rows indicates that a small number of products see far higher scan volumes than the rest.

Treatment: Log-transform or apply robust scaling before modelling to reduce the influence of the 4 extreme outliers; investigate those records for data-quality issues.
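
As a rough illustration of the outlier alert and the suggested log transform, the sketch below recomputes the 1.5 × IQR fences from the quartiles in the stats table and compares how far the maximum sits from the median before and after a log1p transform. It assumes that the outlier alert uses the standard Tukey fences, which the report does not state explicitly; the function names are hypothetical and only the summary statistics (not the raw rows) are used.

```python
import math

# Summary statistics copied from the unique_scans_n stats table.
Q1, Q3 = 362.75, 560.75
MEDIAN, MAXIMUM = 432.0, 2257.0


def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) Tukey fences for the k * IQR rule.

    Assumption: this is the rule behind the 'rows beyond 1.5 IQR' alert.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def max_to_median_ratio(maximum, median, transform=lambda x: x):
    """Ratio of the (optionally transformed) maximum to the median."""
    return transform(maximum) / transform(median)


lower_fence, upper_fence = iqr_fences(Q1, Q3)
raw_ratio = max_to_median_ratio(MAXIMUM, MEDIAN)
log_ratio = max_to_median_ratio(MAXIMUM, MEDIAN, math.log1p)

print(f"fences: [{lower_fence}, {upper_fence}]")
print(f"max / median on the raw scale: {raw_ratio:.2f}")
print(f"max / median after log1p:      {log_ratio:.2f}")
```

Under these assumptions the upper fence is 857.75, which is consistent with the four values that fall in the histogram bins starting at 872.7. Robust scaling (subtract the median, divide by the IQR) is an alternative when the original units should stay interpretable.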

anthropic:default · confidence high
Out[246]:

saturn.columns["unique_scans_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 48
min 319
max 2,257
mean 525.4
median 432
std 306.4
q1 362.8
q3 560.8
iqr 198
skew 3.911
kurtosis 18.71
n_outliers 4
outlier_rate 0.08
zero_rate 0
alert: high_skew skew=+3.91
alert: outliers 8.0% rows beyond 1.5 IQR
Fig 61.
Distribution of unique_scans_n. Vertical dash marks the median.
Histogram bins for unique_scans_n (median: 432.0).
bin count
319 – 595.9 39
595.9 – 872.7 7
872.7 – 1150 3
1150 – 1426 0
1426 – 1703 0
1703 – 1980 0
1980 – 2257 1

update_key categorical

Out[249]:

saturn.columns["update_key"].stats

stat value
n 50
nulls 0 (0.0%)
unique 9
top_value brands
top_rate 0.56
cardinality 9
entropy 2.015
entropy_ratio 0.6357
alert: long_tail 5 singleton categories
Fig 62.
Top values for update_key.
Top values for update_key (9 unique shown, of 9 total).
value count share
brands 28 56.0%
sort 10 20.0%
divinfood 5 10.0%
key_1748337248 2 4.0%
nova-yogurts 1 2.0%
key_1744830970 1 2.0%
ingredients20240805 1 2.0%
germany2 1 2.0%
france 1 2.0%

emb_codes_orig categorical

Out[252]:

saturn.columns["emb_codes_orig"].stats

stat value
n 50
nulls 17 (34.0%)
unique 5
top_value (empty)
top_rate 0.8485
cardinality 5
entropy 0.9048
entropy_ratio 0.3897
alert: long_tail 3 singleton categories
alert: null_rate 34.0% null
Fig 63.
Top values for emb_codes_orig.
Top values for emb_codes_orig (5 unique shown, of 5 total).
value count share
(empty) 28 56.0%
EMB 31250 2 4.0%
EMB 44068A 1 2.0%
SOLENT GMBH & CO. KG,SCHWARZ BETEILIGUNGS GMBH 1 2.0%
EMB 64422 1 2.0%

ingredients_text_with_allergens_de categorical

Out[255]:

saturn.columns["ingredients_text_with_allergens_de"].stats

stat value
n 50
nulls 33 (66.0%)
unique 16
top_value (empty)
top_rate 0.1176
cardinality 16
entropy 3.97
entropy_ratio 0.9925
alert: long_tail 15 singleton categories
alert: null_rate 66.0% null
Fig 64.
Top values for ingredients_text_with_allergens_de.
Top values for ingredients_text_with_allergens_de (16 unique shown, of 16 total).
value count share
(empty) 2 4.0%
Kakaomasse, Kakaobutter, fettarmes Kakaopulver, Zucker, Vanille 1 2.0%
Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Zucker, Emulgator: Lecithine (Soja); Vanilleextrakt. 1 2.0%
Nuss-Nugat-Creme 40 % (Zucker, Palmöl, HASELNÜSSE 13 %, MAGERMILCHPULVER 8.7%, fettarmer Kakao 7,4 %, Emulgator Lecithine (SOJA), Vanillin), WEIZENMEHL (32,5 %), pflanzliche Fette (Palm, Palmkern), Rohrzucker 8,5 % (enthält WEIZEN), MILCHZUCKER, WEIZENKLEIE, VOLLMILCHPULVER, GERSTENMALZ - und Maisextraktpulver, Honig, Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, WEIZENSTÄRKE, GERSTENMALZMEHL, Emulgator Lecithine (SOJA), Vanillin 1 2.0%
Kakaomasse, Zucker, Kakaobutter, Vanille 1 2.0%
Kartoffeln, Sonnenblumenöl, Meersalz. 1 2.0%
Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker, Vanille. Kann Schalenfrüchte, Milch, Soja, Sesamsamen und Weizen enthalten. 1 2.0%
kakaomass of*, zucker, kakaobutter, kakaopulver stark entöit, emulgator: sonnenblumenlecithine (e-322), natürliche in vanille-aroma, * rainforest alliance certified, cocoa: 74% mindestens, 1 2.0%
WEIZENMEHL, Palmöl, Glukosesirup, GERSTENMALZEXTRAKT, Backtriebmittel (Ammoniumcarbonate, Natriumcarbonate), Speisesalz 1,4 %, EIER, Aroma, Mehlbehandlungsmittel (NATRIUMMETABISULFIT). 1 2.0%
Kakaomasse, Zucker, Kakaobutter, Emulgator: Lecithine (Soja); Vanilleextrakt. 1 2.0%
Kartoffelpüreepulver, pflanzliche Öle (Sonnenblume, Palm, Mais) in veränderlichen Gewichtsanteilen, Weizenmehl, Maismehl, Reismehl, Maltodextrin, Emulgator (E471), Salz, Farbstoff (Annatto Norbixin). 1 2.0%
Kakaomasse, fettarmes Kakaopulver, Kakaobutter . Kann Schalenfrüchte, Milch und Soja enthalten. 1 2.0%
Alpenmilch Schokolade. Zutaten: Zucker, Kakaobutter, Magermilchpulver, Kakaomasse, Süßmolkenpulver (aus Milch), Butterreinfett, Haselnüsse, Emulgatoren (Sojalecithin, E476), Aroma. Kakao: 30 % mindestens. Kann andere Nüsse und Weizen enthalten. Ohne Farbstoffe** und Konservierungsstoffe** -**Gemäß rechtlicher Vorschriften. 1 2.0%
Kakaomasse¹, Rohrzucker¹, Kakaobutter¹, Emulgator: Lecithine (Soja)¹. ¹aus kontrolliert ökologischem Anbau. 1 2.0%
25% Walnusskerne, 25% Mandeln, 25% Sultaninen geschwefelt (Sultaninen, Sonnenblumenöl, Konservierungsstoff: Schwefeldioxid), 25% Cranberries (Cranberries, Zucker, Sonnenblumenöl). 1 2.0%
Kakaomasse, Zucker, Kakaobutter, Emulgator (Sojalecithin), Vanille. Kann Haselnüsse, Mandeln, Milch enthalten. 1 2.0%

ingredients_without_ecobalyse_ids_n numeric

Out[258]:

saturn.columns["ingredients_without_ecobalyse_ids_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 20
min 0
max 29
mean 8.16
median 6.5
std 5.898
q1 4
q3 11
iqr 7
skew 1.28
kurtosis 1.743
n_outliers 1
outlier_rate 0.02
zero_rate 0.02
Fig 65.
Distribution of ingredients_without_ecobalyse_ids_n. Vertical dash marks the median.
Histogram bins for ingredients_without_ecobalyse_ids_n (median: 6.5).
bin count
0 – 4.143 15
4.143 – 8.286 16
8.286 – 12.43 8
12.43 – 16.57 6
16.57 – 20.71 3
20.71 – 24.86 1
24.86 – 29 1

main_countries_tags unknown

Out[261]:

saturn.columns["main_countries_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_with_allergens_en categorical

Out[263]:

saturn.columns["ingredients_text_with_allergens_en"].stats

stat value
n 50
nulls 8 (16.0%)
unique 36
top_value (empty)
top_rate 0.1667
cardinality 36
entropy 4.924
entropy_ratio 0.9525
alert: long_tail 35 singleton categories
Fig 66.
Top values for ingredients_text_with_allergens_en.
Top values for ingredients_text_with_allergens_en (20 unique shown, of 36 total).
value count share
(empty) 7 14.0%
milk cream, cream, sugar, banana, bacteria 1 2.0%
WHEAT flour 35%, whole WHEAT flour 15.7%, sugar, vegetable oils (palm, rapeseed), low-fat cocoa powder 4.5%, glucose syrup, WHEAT starch, raising agents (ammonium bicarbonate, sodium bicarbonate, disodium diphosphate), emulsifiers (SOY lecithin, sunflower lecithin), salt, skimmed MILK powder, lactose and MILK proteins, flavors, MAY CONTAIN EGG. 1 2.0%
cocoa mass, cocoa butter, fat reduced cocoa, sugar, vanilla 1 2.0%
Wheat flour, brown cane sugar, rapeseed oil, toasted sesame 10.6%, wheat germ 5.4%, whole wheat flour 5.4%, natural flavor, magnesium, emulsifier: lecithins, raising agents (potassium tartrates, sodium carbonates, ammonium carbonates), sea salt, wheat starch, vitamins (E, PP, B6, B1, B9). 1 2.0%
cocoa mass, low-fat cocoa powder, cocoa butter, sugar, emulsifier: lecithin (soy), vanilla extract, may contain traces of nuts and milk, 1 2.0%
Hhhhh 1 2.0%
sugar, cocoa butter, whole milk powder, cocoa mass, almonds, emulsifier (soya lecithin), flavoring 1 2.0%
cocoa mass #, cane sugar #, cocoa butter #, vanilla extract #, may contain nuts, milk, 1 2.0%
wholemeal rye flour (77 g*), rye flour (28 g*), yeast, salt, may contain traces of milk and sesame seeds, *in g per 100 g of product, 1 2.0%
cocoa paste, sugar, cocoa butter, vanilla, 1 2.0%
Potatoes, sunflower oil, sea salt. May contain Milk. 1 2.0%
cocoa mass, cocoa butter, fat-reduced cocoa powder, cane sugar, vanilla extract 1 2.0%
Pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille bourbon naturelle en gousse. 1 2.0%
Wheat flour 39%, dark chocolate 25% (cocoa mass, cane sugar, cocoa butter), unrefined brown cane sugar, wholemeal wheat flour 15%, oleic sunflower oil, natural vanilla flavouring, skimmed milk powder, sea salt, raising agents: ammonium carbonates, sodium carbonates, thickener: acacia gum, antioxidant: rosemary extract. 1 2.0%
cocoa mass, sugar, cocoa butter, fat reduced cocoa powder, emulsifier: lecithins (soya), natural vanilla flavouring, dark chocolate contains: cocoa solids 74% minimum, 1 2.0%
whole rye flour (57 g), wheat bran (27 g), oatmeal (13 g), sesame seeds (7.9 g), wheat germ, salt. 1 2.0%
wheat flour, palm oil, glucose syrup, barley malt extract, raising agents (ammonium carbonates, sodium carbonates), salt, eggs , flavouring, flour treatment agent (sodium metabisulfite ), 1 2.0%
cocoa mass, sugar, cocoa butter, vanilla, 1 2.0%
Farine de maïs* (70%), farine de riz*, sel marin. * K issus de l'agriculture biologique. • sans sucres ajoutés(¹) (contient des sucres naturellement présents. 1 2.0%

nucleotides_tags unknown

Out[266]:

saturn.columns["nucleotides_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_with_allergens_sv categorical

Out[268]:

saturn.columns["ingredients_text_with_allergens_sv"].stats

stat value
n 50
nulls 46 (92.0%)
unique 4
top_value kakaomassa, kakaosmör, fettreducerat kakaopulver, socker, vanilj.
top_rate 0.25
cardinality 4
entropy 2
entropy_ratio 1
alert: long_tail 4 singleton categories
alert: null_rate 92.0% null
Fig 67.
Top values for ingredients_text_with_allergens_sv.
Top values for ingredients_text_with_allergens_sv (4 unique shown, of 4 total).
value count share
kakaomassa, kakaosmör, fettreducerat kakaopulver, socker, vanilj. 1 2.0%
kakaomassa, fettreducerat kakaopulver, kakaosmör, socker, emulgeringsmedel (sojalecitin), vaniljextrakt. Minst 85 % kakao i chokladen. Kan innehålla spår av nötter och mjölk. 1 2.0%
(empty) 1 2.0%
VETEMJÖL/HVEDEMEL, palmolja/-olie, glukossirap, maltextrakt från KORN/BYG, bakpulver/hævemidler (ammoniumkarbonater, natriumkarbonater), salt, ÄGG/ÆG/EGG, arom, mjölbehandlingsmedel/melbehandlingsmiddel (NATRIUMDISULFIT). 1 2.0%

entry_dates_tags unknown

Out[271]:

saturn.columns["entry_dates_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

allergens_from_ingredients categorical

Out[273]:

saturn.columns["allergens_from_ingredients"].stats

stat value
n 50
nulls 0 (0.0%)
unique 35
top_value (empty)
top_rate 0.3
cardinality 35
entropy 4.432
entropy_ratio 0.864
alert: long_tail 33 singleton categories
Fig 68.
Top values for allergens_from_ingredients.
Top values for allergens_from_ingredients (20 unique shown, of 35 total).
value count share
(empty) 15 30.0%
en:gluten, froment 2 4.0%
en:milk, en:milk, cream, banana 1 2.0%
en:milk, en:milk, en:soybeans, en:gluten, en:gluten, en:gluten, blé, blé complet, lécithines de soja, lait 1 2.0%
en:milk, Lécithine de soja, lait, blé, gluten, soja 1 2.0%
en:gluten, en:gluten, en:gluten, en:sesame-seeds, en:gluten, blé 1 2.0%
соеви 1 2.0%
en:soybeans, en:nuts, en:milk, almonds, soya lecithin 1 2.0%
en:milk, en:milk, en:gluten, froment, lait, lactosérum 1 2.0%
en:soybeans, en:gluten, en:gluten, en:gluten, en:milk, en:gluten, en:gluten, en:soybeans, en:milk, en:nuts, NOISETTES , NOISETTES , LAIT , SOJA, FROMENT , BLE, LACTOSE, BLE, LAIT , ORGE , ORGE , FROMENT, SOJA, NOISETTES, NOISETTES, LAIT, SOJA, FROMENT, BLE, LACTOSE, BLE, LAIT, ORGE, ORGE, FROMENT, SOJA 1 2.0%
SEIGLE, SEIGLE, SEIGLE, SEIGLE 1 2.0%
en:milk, en:gluten, en:gluten, blé* 1 2.0%
soya 1 2.0%
en:gluten, en:sesame-seeds, en:gluten, SEIGLE , BLÉ , GRAINES DE SÉSAME , BLÉ, SEIGLE, BLÉ, GRAINES DE SÉSAME, BLÉ 1 2.0%
en:soybeans, en:soybeans, en:gluten, blé, lécithine de soja 1 2.0%
en:eggs, en:gluten, en:gluten, wheat flour, eggs 1 2.0%
en:soybeans, en:soybeans, en:milk, en:milk, en:milk, en:gluten, Poudre de lait, Lécithine de soja 1 2.0%
en:gluten, en:gluten, en:gluten, en:nuts, en:gluten, blé, noisettes, blé, orge, blé 1 2.0%
en:soybeans 1 2.0%
en:milk, en:nuts, ЛЕШНИЦИ, СОЯ 1 2.0%

nova_groups categorical

Out[276]:

saturn.columns["nova_groups"].stats

stat value
n 50
nulls 2 (4.0%)
unique 3
top_value 4
top_rate 0.6875
cardinality 3
entropy 1.006
entropy_ratio 0.635
Fig 69.
Top values for nova_groups.
Top values for nova_groups (3 unique shown, of 3 total).
value count share
4 33 66.0%
3 14 28.0%
1 1 2.0%

product_quantity categorical

Out[279]:

saturn.columns["product_quantity"].stats

stat value
n 50
nulls 3 (6.0%)
unique 27
top_value 100
top_rate 0.234
cardinality 27
entropy 4.287
entropy_ratio 0.9017
alert: long_tail 18 singleton categories
Fig 70.
Top values for product_quantity.
Top values for product_quantity (20 unique shown, of 27 total).
value count share
100 11 22.0%
230 3 6.0%
42 3 6.0%
125 2 4.0%
500 2 4.0%
150 2 4.0%
90 2 4.0%
0 2 4.0%
200 2 4.0%
300 1 2.0%
22 1 2.0%
304 1 2.0%
275 1 2.0%
225 1 2.0%
85 1 2.0%
36 1 2.0%
160 1 2.0%
20 1 2.0%
750 1 2.0%
175 1 2.0%

ingredients_debug unknown

Out[282]:

saturn.columns["ingredients_debug"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

generic_name categorical

Out[284]:

saturn.columns["generic_name"].stats

stat value
n 50
nulls 2 (4.0%)
unique 28
top_value (empty)
top_rate 0.4375
cardinality 28
entropy 3.663
entropy_ratio 0.762
alert: long_tail 27 singleton categories
Fig 71.
Top values for generic_name.
Top values for generic_name (20 unique shown, of 28 total).
value count share
(empty) 21 42.0%
BISCUITS FOURRÉS (35%) PARFUM CHOCOLAT 1 2.0%
Chocolat noir extra-fin traditionnel à 90% de cacao 1 2.0%
Biscuits au sésame 1 2.0%
Eau de source 1 2.0%
Compound Chocolate with MILK AND ALMONDS 1 2.0%
Sablé coco 1 2.0%
Biscuit fourré à la pâte à tartiner aux noisettes et au cacao Nutella® 1 2.0%
Pain croustillant a la farine de seigle 1 2.0%
Chocolat noir extra-fin traditionnel 1 2.0%
Chips de pommes de terre légèrement salées au sel de mer 1 2.0%
Chocolat noir extra fin, traditionnel 1 2.0%
goûters fourrés au chocolat noir 1 2.0%
Pain croustillant à la farine complète de seigle, avoine et sésame. 1 2.0%
Crackers 1 2.0%
Dark Chocolate 70% cocoa 1 2.0%
Biscuits aux pommes et aux noisettes, très pauvres en sel, riches en vitamines B1, B2, B9 et E et source de vitamines PP et B6 1 2.0%
Nuss-Nugat-Creme 1 2.0%
Snack Salé 1 2.0%
Biscuits au son de blé et la figue, riches en fibres, magnesium et phosphore, source de fer, et tres pauvres en sodium. 1 2.0%

origins_tags unknown

Out[287]:

saturn.columns["origins_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

added_countries_tags unknown

Out[289]:

saturn.columns["added_countries_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

categories_lc categorical

Out[291]:

saturn.columns["categories_lc"].stats

stat value
n 50
nulls 0 (0.0%)
unique 6
top_value fr
top_rate 0.5
cardinality 6
entropy 1.628
entropy_ratio 0.6297
Fig 72.
Top values for categories_lc.
Top values for categories_lc (6 unique shown, of 6 total).
value count share
fr 25 50.0%
en 19 38.0%
es 2 4.0%
de 2 4.0%
it 1 2.0%
pl 1 2.0%

image_url categorical

Out[294]:

saturn.columns["image_url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.400.jpg
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 73.
Top values for image_url.
Top values for image_url (20 unique shown, of 50 total).
value count share
https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/762/221/044/9283/front_en.605.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/9759/front_en.492.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/611/103/100/5064/front_fr.56.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/317/568/001/1480/front_en.221.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/000/002/099/5553/front_en.314.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/326/884/000/1008/front_fr.422.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1044/front_fr.50.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/842/519/771/2024/front_en.60.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/762/221/057/8464/front_en.29.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/611/125/934/3108/front_fr.25.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1228/front_fr.38.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/800/050/031/0427/front_fr.488.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/730/040/048/1595/front_fr.242.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2651/front_en.159.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/506/004/264/1000/front_en.179.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/762/221/058/4724/front_en.95.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2606/front_en.102.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/322/982/010/0234/front_fr.246.400.jpg 1 2.0%
https://images.openfoodfacts.org/images/products/000/002/002/2464/front_en.301.400.jpg 1 2.0%

ingredients_sweeteners_n numeric

Out[297]:

saturn.columns["ingredients_sweeteners_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 1
min 0
max 0
mean 0
median 0
std 0
q1 0
q3 0
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 1
alert: constant only one distinct value
Fig 74.
Distribution of ingredients_sweeteners_n. Vertical dash marks the median.
Histogram bins for ingredients_sweeteners_n (median: 0.0).
bin count
-0.5 – -0.3571 0
-0.3571 – -0.2143 0
-0.2143 – -0.07143 0
-0.07143 – 0.07143 50
0.07143 – 0.2143 0
0.2143 – 0.3571 0
0.3571 – 0.5 0

ingredients_text_ja categorical

Out[300]:

saturn.columns["ingredients_text_ja"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 75.
Top values for ingredients_text_ja.
Top values for ingredients_text_ja (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

allergens_tags unknown

Out[303]:

saturn.columns["allergens_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

origin_es categorical other

This column appears to be the Spanish-language origin field ('origin_es'), but it holds no meaningful content: the only observed value is an empty string, which appears 20 times across 50 rows. With a 60% null rate and the remaining 40% empty strings, the column has zero entropy and is effectively blank across the entire sample. This strongly suggests the field was never populated for these products.

Treatment: Drop this column; it contains no usable signal (cardinality 1, top value is empty string, 60% nulls).
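
A null rate alone would understate how empty this column is, because the 20 empty strings count as non-null. The sketch below shows one way such columns could be detected before dropping them: it treats None and whitespace-only strings as equally blank. The helper name and the sample lists are hypothetical; the first list only mirrors the counts reported above (30 nulls, 20 empty strings), not the actual rows.

```python
def blank_rate(values):
    """Fraction of entries that are None or whitespace-only strings.

    An empty input is treated as fully blank (rate 1.0).
    """
    if not values:
        return 1.0
    blanks = sum(
        1
        for v in values
        if v is None or (isinstance(v, str) and not v.strip())
    )
    return blanks / len(values)


# Hypothetical stand-in with the same shape as origin_es in this sample.
origin_es_like = [None] * 30 + [""] * 20
# Hypothetical partially filled column for comparison.
partly_filled = [None, "", "France", "Germany"]

for name, column in [("origin_es_like", origin_es_like),
                     ("partly_filled", partly_filled)]:
    rate = blank_rate(column)
    verdict = "drop" if rate == 1.0 else "keep"
    print(f"{name}: blank rate {rate:.0%} -> {verdict}")
```

Applying the same check to every categorical column would also surface columns whose top value is an empty string without being entirely blank; those deserve a closer look rather than an automatic drop.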

anthropic:default · confidence high
Out[305]:

saturn.columns["origin_es"].stats

stat value
n 50
nulls 30 (60.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 60.0% null
alert: imbalance top value is 100.0% of rows
Fig 76.
Top values for origin_es.
Top values for origin_es (1 unique shown, of 1 total).
value count share
(empty) 20 40.0%

last_updated_t numeric

Out[308]:

saturn.columns["last_updated_t"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
min 1.739e+09
max 1.769e+09
mean 1.763e+09
median 1.767e+09
std 8.037e+06
q1 1.762e+09
q3 1.768e+09
iqr 6.138e+06
skew -1.945
kurtosis 2.892
n_outliers 6
outlier_rate 0.12
zero_rate 0
alert: outliers 12.0% rows beyond 1.5 IQR
Fig 77.
Distribution of last_updated_t. Vertical dash marks the median.
Histogram bins for last_updated_t (median: 1766580948.5).
bin count
1.739e+09 – 1.743e+09 3
1.743e+09 – 1.747e+09 1
1.747e+09 – 1.752e+09 1
1.752e+09 – 1.756e+09 2
1.756e+09 – 1.76e+09 3
1.76e+09 – 1.764e+09 8
1.764e+09 – 1.769e+09 32

origin_fr categorical

Out[311]:

saturn.columns["origin_fr"].stats

stat value
n 50
nulls 4 (8.0%)
unique 7
top_value (empty)
top_rate 0.8696
cardinality 7
entropy 0.8958
entropy_ratio 0.3191
alert: long_tail 6 singleton categories
Fig 78.
Top values for origin_fr.
Top values for origin_fr (7 unique shown, of 7 total).
value count share
(empty) 40 80.0%
Fabriqué par: Aachen Allemagne 1 2.0%
Germe de blé origine ue. Sésame origine non-ue. 1 2.0%
France 1 2.0%
fabriqué en France.pommes origine UE. noisettes origine UE et non UE 1 2.0%
Fabriqué en France par Nutrition et Santé. Farine de blé: France. Figues : non UE 1 2.0%
Pâte de cacao (Afrique de l'Ouest, Amérique du Sud)Afrique, Europe, Madagascar, Amérique du Sud, Afrique de l'Ouest 1 2.0%

nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients_value numeric

Out[314]:

saturn.columns["nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients_value"].stats

stat value
n 50
nulls 5 (10.0%)
unique 13
min 0
max 100
mean 4.532
median 0
std 15.52
q1 0
q3 2.326
iqr 2.326
skew 5.411
kurtosis 30.37
n_outliers 7
outlier_rate 0.1556
zero_rate 0.7111
alert: high_skew skew=+5.41
alert: outliers 15.6% rows beyond 1.5 IQR
Fig 79.
Distribution of nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients_value. Vertical dash marks the median.
Histogram bins for nutrition_score_warning_fruits_vegetables_nuts_estimate_from_ingredients_value (median: 0.0).
bin count
0 – 16.67 43
16.67 – 33.33 1
33.33 – 50 0
50 – 66.67 0
66.67 – 83.33 0
83.33 – 100 1

ingredients_without_ecobalyse_ids unknown

Out[317]:

saturn.columns["ingredients_without_ecobalyse_ids"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_with_allergens_it categorical

Out[319]:

saturn.columns["ingredients_text_with_allergens_it"].stats

stat value
n 50
nulls 34 (68.0%)
unique 12
top_value (empty)
top_rate 0.3125
cardinality 12
entropy 3.274
entropy_ratio 0.9134
alert: long_tail 11 singleton categories
alert: null_rate 68.0% null
Fig 80.
Top values for ingredients_text_with_allergens_it.
Top values for ingredients_text_with_allergens_it (12 unique shown, of 12 total).
value count share
(empty) 5 10.0%
Pasta di cacao, burro di cacao, cacao magro in polvere, zucchero. Può contenere nocciole, mandorle, altra frutta a guscio, latte, soia. 1 2.0%
crema alle NOCCIOLE e al cacao 40% (zucchero, olio di palma, NOCCIOLE 13%, LATTE Scremato in polvere 8.7%, cacao magro 7,4%, emulsionanti: lecitine (SOIA): vanillina), farina di FRUMENTO (32%), grassi vegetali (palma, palmisto), zucchero di canna (9%), LATTOSIO, crusca di FRUMENTO, LATTE intero in polvere, estratto in polvere di malto d'ORZO e mais, miele, agenti lievitanti (difosfato disodico. carbonato acido di ammonio, carbonato acido di sodio), cacao magro, sale, amido di FRUMENTO, farina di ORZO maltato, emulsionanti: lecitine (SOIA), vanillina. 1 2.0%
pasta di cacao, zucchero, burro di cacao, vaniglia 1 2.0%
patate, olio di girasole, sale marino. 1 2.0%
Pasta di cacao, cacao magro, burro di cacao, zucchero grezzo di canna, vaniglia. 1 2.0%
Farina integrale di segale (59 g), crusca di grano (27 g), fiocchi d'avena (12 g), semi di sesamo (7,0 g), germe di grano, sale. Può contenere tracce di latte. 1 2.0%
Farina di FRUMENTO, olio di palma, sciroppo di glucosio, estratto di malto d'ORZO, agenti lievitanti (carbonati di ammonio, carbonati di sodio), sale, UOVA, aroma, agente di trattamento della farina (METABISOLFITO di sodio). 1 2.0%
Pasta di cacao, zucchero, burro di cacao, vaniglia. 1 2.0%
Massa di cacao, zucchero, burro di cacao, emulsionante: lecitine (soia); estratto di vaniglia. Può contenere tracce di frutta a guscio e latte. Il 40% della massa di cacao proviene da piantagioni selezionate dell'Ecuador. 1 2.0%
wdrated potatoes, sunflower oll, wheat flour, corn lour.test NRC b ber otin. Emulgator (E471), Salz, Farbstoff (Annatto Norbirin, k hottom (BB). Packaged in a protective atmosphere, (DE) KNAEF Kam ef s1sel colorant (n0rbixine de rocou). Peut contenir lait, soja. À conse gie vepackt. (FR) SNACK SALE. INGREDIENTS: Pommes de terre disht SNCK SALATO. : Patate disidratate, olio di girasole, (arina d frmu botisiha d annatto). Puo contenere latte, sola. Da consumarsi prelerbilmetp SEL NGREDIENTES: Batatas desidratadas, óleo de girasol, farinha de trigo.(aimha d mh e o, Pode conter leite, soja. Consumir de preferëncia antes de: ver fundo (BB), Enbazhyer OHTS Pttas deshidratadas, aceite de qirasol, harina de trigo, harina de maiz, haia ca rm e eche, soja. Consumir preferentemente antes del: ver parte interior (8B), Enast et 'Releenc itle dn 100 g | RI" /30g| Eectsge/Ayt acuilo medo 84U bole / Prodoth te /30g ji begja /Valor energetico Tpas (Grassi/ Unjdos / Grasas tan eậticte Fetsäuren / dont 2214 kJ 664 kJ 530 kcal 159 kcal adulo medio / 8% 31g 3.0 9 9.3 0.9g 17g 13% Produoad by: see yd Aii dd cassi satui / dos quais Producido por urdes thtrde | Glucites | 5% oidrati / MedaCoyK Sabd 55g 7% Uont sucres /di eui *FRSCAME QNg 1 2.0%
25% noci, 25% mandorle, 25% uva sultanina (99,5% uva sultanina, olio di semi di girasole), 25% mirtilli rossi americani, essiccati e zuccherati (60% mirtilli rossi americani, 39% zucchero, olio di semi di girasole). Può contenere tracce di altra frutta a guscio e arachidi. Confezionato in atmosfera protettiva. 1 2.0%

data_quality_errors_tags unknown

Out[322]:

saturn.columns["data_quality_errors_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)
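The skipped alert means no statistics were computed for this column. The report does not say what the cells hold; in Open Food Facts records, fields ending in _tags are usually arrays of strings, so the sketch below assumes that shape. The helper name tag_counts and the sample tag names are hypothetical placeholders, not part of saturn or of this sample.

```python
from collections import Counter

def tag_counts(rows, column):
    """Count how often each tag occurs in a list-valued column.

    `rows` is a list of dicts (one per product); missing or null cells count as empty.
    """
    counts = Counter()
    for row in rows:
        cell = row.get(column) or []
        if isinstance(cell, str):  # tolerate a single tag stored as a plain string
            cell = [cell]
        counts.update(cell)
    return counts

# Hypothetical rows; the tag names are placeholders, not values from the sample.
rows = [
    {"data_quality_errors_tags": []},
    {"data_quality_errors_tags": ["en:example-error-a"]},
    {"data_quality_errors_tags": ["en:example-error-a", "en:example-error-b"]},
]
counts = tag_counts(rows, "data_quality_errors_tags")
print(counts.most_common())
```

Flattening to per-tag counts would give such columns a categorical-style summary; the same helper could be pointed at any other column flagged kind=unknown, provided it really is list-valued.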

origin_pl categorical metadata

This column appears to be the Polish-language variant of the origin field (the _pl suffix follows the same language-code pattern as origin_en and packaging_text_pl), and it is effectively empty: 45 of its 50 rows (90%) are null, and the 5 non-null rows all hold an empty string. With a cardinality of 1 and an entropy of 0, it carries no information. The high null rate combined with a single blank value strongly suggests the field was never populated in this sample.

Treatment: Drop. With zero variance and 90% nulls, this column is useless for modelling or analysis.

anthropic:default · confidence high
Out[324]:

saturn.columns["origin_pl"].stats

stat value
n 50
nulls 45 (90.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (90.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 81.
Top values for origin_pl.
Top values for origin_pl (1 unique shown, of 1 total).
value | count | share
(blank) | 5 | 10.0%
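A drop rule for columns like this one can be written as a simple check: a column is uninformative when every cell is either null or a blank string. The sketch below is one possible formulation, not saturn's own logic; the function name is hypothetical, and the origin_en list assumes its blank top value is an empty string.

```python
def is_uninformative(values):
    """True when every cell is None or a string containing only whitespace."""
    return all(v is None or (isinstance(v, str) and v.strip() == "") for v in values)

# Rebuilt from the reported counts: 45 nulls and 5 blank strings.
origin_pl = [None] * 45 + [""] * 5
# Assumed shape of origin_en: 7 nulls, 42 blanks, one real value.
origin_en = [None] * 7 + [""] * 42 + ["France"]

print(is_uninformative(origin_pl))
print(is_uninformative(origin_en))
```

Under this rule origin_pl is flagged for dropping, while a column with even one real value is kept for a closer look.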

packaging_text_fr categorical

Out[327]:

saturn.columns["packaging_text_fr"].stats

stat value
n 50
nulls 3 (6.0%)
unique 14
top_value (blank)
top_rate 0.7234
cardinality 14
entropy 1.874
entropy_ratio 0.4923
alert: long_tail (13 singleton categories)
Fig 82.
Top values for packaging_text_fr.
Top values for packaging_text_fr (14 unique shown, of 14 total).
value | count | share
(blank) | 34 | 68.0%
1 film en plastique à recycler 1 étui en papier ondulé à recycler | 1 | 2.0%
carton, plastique | 1 | 2.0%
1 bouchon en plastique à trier 1 bouteille en plastique à trier | 1 | 2.0%
1 étui en carton à recycler 1 feuille en aluminium à recycler | 1 | 2.0%
1 sachet plastique à jeter | 1 | 2.0%
1 étui en carton  à recycler 1 feuille en aluminium à recycler | 1 | 2.0%
LE TRI +FACILE + BAC DE TRI | 1 | 2.0%
4 FILMS PLASTIQUE A JETER 1 ÉTUI CARTON À RECYCLER | 1 | 2.0%
FR LE TRI + FACILE ÉTUI 8+ SACHETS BAC DE TRI A consommer de préférence avant le : en France par et Santé S.A.S. 10:02 11914538 112 eCastelnaudary REVEL 30 04 2024 | 1 | 2.0%
1 étui carton à recycler, 1 film plastique à jeter, 1 barquette plastique à jeter. | 1 | 2.0%
1 FEUILLE PAPIER À RECYCLER, 1 FEUILLE METAL À RECYCLER, 1 FILM PLASTIQUE À JETER | 1 | 2.0%
Sachet, clip à recycler | 1 | 2.0%
2 sachets en plastique à recycler 1 boîte en carton à recycler | 1 | 2.0%
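Two of the singleton values above ("1 étui en carton à recycler 1 feuille en aluminium à recycler") differ only by a doubled space, so they are counted as separate categories. A minimal normalisation sketch, assuming whitespace and letter case carry no meaning in this field; normalize_text is a hypothetical helper, not a saturn function.

```python
import re

def normalize_text(value):
    """Collapse whitespace runs to one space, trim the ends, and casefold."""
    return re.sub(r"\s+", " ", value).strip().casefold()

a = "1 étui en carton à recycler 1 feuille en aluminium à recycler"
b = "1 étui en carton  à recycler 1 feuille en aluminium à recycler"  # doubled space

print(a == b)
print(normalize_text(a) == normalize_text(b))
```

Applying such a step before profiling would merge these two rows and slightly reduce the long_tail count; it would not fix the OCR-style entries, which need manual review.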

debug_tags unknown

Out[330]:

saturn.columns["debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

ingredients_text_sv categorical

Out[332]:

saturn.columns["ingredients_text_sv"].stats

stat value
n 50
nulls 46 (92.0%)
unique 4
top_value kakaomassa, kakaosmör, fettreducerat kakaopulver, socker, vanilj.
top_rate 0.25
cardinality 4
entropy 2
entropy_ratio 1
alert: long_tail (4 singleton categories)
alert: null_rate (92.0% null)
Fig 83.
Top values for ingredients_text_sv.
Top values for ingredients_text_sv (4 unique shown, of 4 total).
value | count | share
kakaomassa, kakaosmör, fettreducerat kakaopulver, socker, vanilj. | 1 | 2.0%
kakaomassa, fettreducerat kakaopulver, kakaosmör, socker, emulgeringsmedel (_sojalecitin_), vaniljextrakt. Minst 85 % kakao i chokladen. Kan innehålla spår av nötter och mjölk. | 1 | 2.0%
(blank) | 1 | 2.0%
_VETEMJÖL_/_HVEDEMEL_, palmolja/-olie, glukossirap, maltextrakt från _KORN_/_BYG_, bakpulver/hævemidler (ammoniumkarbonater, natriumkarbonater), salt, _ÄGG_/_ÆG_/_EGG_, arom, mjölbehandlingsmedel/melbehandlingsmiddel (_NATRIUMDISULFIT_). | 1 | 2.0%
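The entropy figures above can be reproduced by hand. The sketch assumes entropy is Shannon entropy in bits over non-null values and that entropy_ratio divides it by log2(cardinality); the report does not state these definitions, but they match the numbers shown here (four equally frequent values give 2 and 1). The function name is hypothetical.

```python
import math
from collections import Counter

def entropy_stats(values):
    """Return (entropy in bits, entropy / log2(cardinality)) over non-null values."""
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    ratio = entropy / max_entropy if max_entropy else 0.0
    return entropy, ratio

# 46 nulls plus four distinct values seen once each (placeholders for the real texts).
values = [None] * 46 + ["a", "b", "c", "d"]
entropy, ratio = entropy_stats(values)
print(entropy, ratio)
```

A ratio of 1 on only four non-null rows says little about the field itself: with 92% nulls, every observed value being distinct is expected for free-text ingredient lists.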

cities_tags unknown

Out[335]:

saturn.columns["cities_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

ingredients_with_unspecified_percent_n numeric

Out[337]:

saturn.columns["ingredients_with_unspecified_percent_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 18
min 1
max 33
mean 8.8
median 7
std 6.061
q1 5
q3 11
iqr 6
skew 1.645
kurtosis 3.545
n_outliers 2
outlier_rate 0.04
zero_rate 0
Fig 84.
Distribution of ingredients_with_unspecified_percent_n. Vertical dash marks the median.
Histogram bins for ingredients_with_unspecified_percent_n (median: 7.0).
bin | count
1 – 5.571 | 22
5.571 – 10.14 | 13
10.14 – 14.71 | 6
14.71 – 19.29 | 7
19.29 – 23.86 | 1
23.86 – 28.43 | 0
28.43 – 33 | 1
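The report does not say how n_outliers is defined. A common convention is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles; the sketch below assumes that rule and applies it to the reported q1 and q3. The function name is hypothetical.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Reported quartiles for ingredients_with_unspecified_percent_n: q1 = 5, q3 = 11.
lower, upper = tukey_fences(5, 11)
print(lower, upper)
```

With fences at -4 and 20, only values above 20 can be outliers here. The two highest histogram bins hold one value each, which is consistent with n_outliers 2 if the value in the 19.29 – 23.86 bin lies above 20.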

product_name_fr categorical

Out[340]:

saturn.columns["product_name_fr"].stats

stat value
n 50
nulls 1 (2.0%)
unique 47
top_value Henry’s
top_rate 0.04082
cardinality 47
entropy 5.533
entropy_ratio 0.9961
alert: long_tail (45 singleton categories)
Fig 85.
Top values for product_name_fr.
Top values for product_name_fr (20 unique shown, of 47 total).
value | count | share
Henry’s | 2 | 4.0%
Excellence Noir Subtil Doux 70% Cacao | 2 | 4.0%
Perly | 1 | 2.0%
Prince Goût Chocolat | 1 | 2.0%
Excellence Noir Prodigieux 90% Cacao | 1 | 2.0%
Tonik | 1 | 2.0%
Sésame | 1 | 2.0%
Chocolat noir - 85% cacao | 1 | 2.0%
CRISTALINE Eau De Source 0.5L | 1 | 2.0%
Maruja | 1 | 2.0%
Dark chocolate 70% | 1 | 2.0%
KING COOKIES | 1 | 2.0%
Sable coco Henry s 42g | 1 | 2.0%
Biscuits croquants au coeur onctueux de Nutella® | 1 | 2.0%
Tartine croustillante Authentique | 1 | 2.0%
Excellence Noir Intense 70% Cacao | 1 | 2.0%
Lightly sea salted crisps | 1 | 2.0%
Dark chocolate | 1 | 2.0%
Excellence Noir Puissant 85% Cacao | 1 | 2.0%
Fourrés Chocolat Noir | 1 | 2.0%

traces categorical

Out[343]:

saturn.columns["traces"].stats

stat value
n 50
nulls 0 (0.0%)
unique 23
top_value (blank)
top_rate 0.22
cardinality 23
entropy 3.922
entropy_ratio 0.8671
alert: long_tail (16 singleton categories)
Fig 86.
Top values for traces.
Top values for traces (20 unique shown, of 23 total).
value | count | share
(blank) | 11 | 22.0%
en:milk,en:nuts | 7 | 14.0%
en:nuts | 5 | 10.0%
en:milk,en:nuts,en:sesame-seeds,en:soybeans | 4 | 8.0%
en:milk,en:soybeans | 3 | 6.0%
en:soybeans | 2 | 4.0%
en:lupin,en:milk,en:nuts,en:sesame-seeds,en:soybeans | 2 | 4.0%
en:eggs | 1 | 2.0%
en:milk,en:nuts,en:soybeans | 1 | 2.0%
en:eggs,en:lupin,en:milk,en:mustard,en:nuts,en:soybeans | 1 | 2.0%
en:mustard | 1 | 2.0%
en:lupin,en:milk,en:mustard,en:sesame-seeds,en:soybeans | 1 | 2.0%
en:milk | 1 | 2.0%
en:eggs,en:mustard,en:nuts,en:sesame-seeds,en:soybeans | 1 | 2.0%
en:gluten,en:Amande,en:Arachides,en:Avoine,en:Blé,en:Lait,en:Noisettes,en:Noix,en:Noix de cajou,en:Noix de macadamia,en:Noix de pécan,en:Noix du brésil,en:Orge,en:Pistaches,en:Seigle | 1 | 2.0%
en:lupin,en:milk,en:mustard,en:soybeans | 1 | 2.0%
en:gluten,en:milk | 1 | 2.0%
en:gluten,en:nuts,en:peanuts,en:soybeans | 1 | 2.0%
en:nuts,en:peanuts,en:soybeans | 1 | 2.0%
en:gluten,en:nuts | 1 | 2.0%
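Each value in this column is a comma-separated set of tags, so the categorical profile counts combinations rather than individual allergens. The sketch below splits the strings and counts per tag, and also flags tags that do not look like lowercase, hyphenated taxonomy entries (for example en:Amande). The helper names and the canonical-form rule are assumptions made for illustration; the sample list is a small excerpt of the values above.

```python
from collections import Counter

def split_tags(value):
    """Split a comma-separated tag string into a list of trimmed, non-empty tags."""
    return [t.strip() for t in value.split(",") if t.strip()]

def looks_canonical(tag):
    """Assumed rule: 'lang:name' where name is lowercase and contains no spaces."""
    _, _, name = tag.partition(":")
    return bool(name) and name == name.lower() and " " not in name

# A few values copied from the table above (not the full column).
sample = ["en:milk,en:nuts", "en:nuts", "en:gluten,en:Amande,en:Noix de cajou", ""]

per_tag = Counter(tag for value in sample for tag in split_tags(value))
suspect = sorted(tag for tag in per_tag if not looks_canonical(tag))
print(per_tag)
print(suspect)
```

Per-tag counts answer questions such as "how many products may contain nuts" directly, which the combination-level table cannot.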

known_ingredients_n numeric

Out[346]:

saturn.columns["known_ingredients_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 22
min 0
max 36
mean 11.76
median 9
std 8.721
q1 5
q3 18.5
iqr 13.5
skew 0.8598
kurtosis 0.07411
n_outliers 0
outlier_rate 0
zero_rate 0.04
Fig 87.
Distribution of known_ingredients_n. Vertical dash marks the median.
Histogram bins for known_ingredients_n (median: 9.0).
bin | count
0 – 5.143 | 16
5.143 – 10.29 | 12
10.29 – 15.43 | 6
15.43 – 20.57 | 7
20.57 – 25.71 | 5
25.71 – 30.86 | 2
30.86 – 36 | 2

packaging_text_pl categorical other

This column appears to be a Polish-language packaging text field that is almost entirely empty: 45 of its 50 rows (90%) are null, and the remaining 5 rows hold only an empty string. With a cardinality of 1 and an entropy of 0, the column carries no information; a 90% null rate plus a blank top_value means the sample contains no meaningful entry at all.

Treatment: Drop this column; it contains no usable information in the current sample.

anthropic:default · confidence high
Out[349]:

saturn.columns["packaging_text_pl"].stats

stat value
n 50
nulls 45 (90.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate (90.0% null)
alert: imbalance (top value is 100.0% of rows)
Fig 88.
Top values for packaging_text_pl.
Top values for packaging_text_pl (1 unique shown, of 1 total).
value | count | share
(blank) | 5 | 10.0%

image_front_small_url categorical

Out[352]:

saturn.columns["image_front_small_url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.200.jpg
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail (50 singleton categories)
Fig 89.
Top values for image_front_small_url.
Top values for image_front_small_url (20 unique shown, of 50 total).
value | count | share
https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/762/221/044/9283/front_en.605.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/9759/front_en.492.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/611/103/100/5064/front_fr.56.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/317/568/001/1480/front_en.221.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/000/002/099/5553/front_en.314.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/326/884/000/1008/front_fr.422.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1044/front_fr.50.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/842/519/771/2024/front_en.60.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/762/221/057/8464/front_en.29.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/611/125/934/3108/front_fr.25.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1228/front_fr.38.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/800/050/031/0427/front_fr.488.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/730/040/048/1595/front_fr.242.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2651/front_en.159.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/506/004/264/1000/front_en.179.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/762/221/058/4724/front_en.95.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2606/front_en.102.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/322/982/010/0234/front_fr.246.200.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/000/002/002/2464/front_en.301.200.jpg | 1 | 2.0%

origin_en categorical

Out[355]:

saturn.columns["origin_en"].stats

stat value
n 50
nulls 7 (14.0%)
unique 2
top_value (blank)
top_rate 0.9767
cardinality 2
entropy 0.1594
entropy_ratio 0.1594
alert: imbalance (top value is 97.7% of rows)
Fig 90.
Top values for origin_en.
Top values for origin_en (2 unique shown, of 2 total).
value | count | share
(blank) | 42 | 84.0%
France | 1 | 2.0%
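The top value appears with two different percentages here (97.7% in top_rate and the alert, 84.0% in the table). The numbers are consistent if top_rate is computed over non-null rows and share over all rows; that reading is inferred from the figures, not stated by the report. A short check:

```python
n, nulls, top_count = 50, 7, 42  # values reported for origin_en

top_rate = top_count / (n - nulls)  # denominator: non-null rows only
share = top_count / n               # denominator: all rows, nulls included

print(round(top_rate, 4), round(share, 2))
```

Counting nulls and blanks together, 49 of the 50 rows (98%) carry no origin at all, which is the more useful figure when deciding whether to keep the column.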

interface_version_modified categorical

Out[358]:

saturn.columns["interface_version_modified"].stats

stat value
n 50
nulls 0 (0.0%)
unique 2
top_value 20150316.jqm2
top_rate 0.84
cardinality 2
entropy 0.6343
entropy_ratio 0.6343
Fig 91.
Top values for interface_version_modified.
Top values for interface_version_modified (2 unique shown, of 2 total).
value | count | share
20150316.jqm2 | 42 | 84.0%
20190830 | 8 | 16.0%

serving_size categorical

Out[361]:

saturn.columns["serving_size"].stats

stat value
n 50
nulls 6 (12.0%)
unique 37
top_value 100g
top_rate 0.06818
cardinality 37
entropy 5.107
entropy_ratio 0.9803
alert: long_tail (32 singleton categories)
Fig 92.
Top values for serving_size.
Top values for serving_size (20 unique shown, of 37 total).
value | count | share
100g | 3 | 6.0%
10 g | 3 | 6.0%
42 g | 2 | 4.0%
100 g | 2 | 4.0%
30 g | 2 | 4.0%
20g | 1 | 2.0%
1 Square (10 g) | 1 | 2.0%
23g | 1 | 2.0%
11.5g | 1 | 2.0%
25 g | 1 | 2.0%
1 L | 1 | 2.0%
1 portion (100 g) | 1 | 2.0%
13,8 g | 1 | 2.0%
11.4 g (1 tranche) | 1 | 2.0%
1 serving (100 g) | 1 | 2.0%
6 squares (18 g) | 1 | 2.0%
50g | 1 | 2.0%
20 gram | 1 | 2.0%
10 g (1 tranche) | 1 | 2.0%
85g | 1 | 2.0%
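Most of the 37 distinct values above are the same kind of quantity written differently ("100g", "100 g", "1 serving (100 g)", "13,8 g"). The sketch below pulls out the first number that is followed by a mass or volume unit and converts it to grams or millilitres. The regular expression, unit table, and function name are assumptions chosen to cover the spellings visible in this table, not an Open Food Facts or saturn parser.

```python
import re

# Conversion factor and base unit for each accepted spelling (assumed list).
_UNITS = {
    "kg": (1000.0, "g"), "g": (1.0, "g"), "gram": (1.0, "g"), "grams": (1.0, "g"),
    "l": (1000.0, "ml"), "cl": (10.0, "ml"), "ml": (1.0, "ml"),
}
# A number with optional decimal point or comma, then a unit as a whole word.
_QTY = re.compile(r"(\d+(?:[.,]\d+)?)\s*(grams|gram|kg|g|ml|cl|l)\b", re.IGNORECASE)

def parse_serving(text):
    """Return (amount, base_unit) for the first quantity found, or None."""
    match = _QTY.search(text)
    if match is None:
        return None
    amount = float(match.group(1).replace(",", "."))
    factor, base = _UNITS[match.group(2).lower()]
    return amount * factor, base

for raw in ["100g", "13,8 g", "1 Square (10 g)", "1 L", "20 gram"]:
    print(raw, "->", parse_serving(raw))
```

After this step "100g", "100 g", "1 portion (100 g)" and "1 serving (100 g)" would all collapse to the same numeric value, so the field could be profiled as numeric instead of as a long-tailed categorical.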

states categorical

Out[364]:

saturn.columns["states"].stats

stat value
n 50
nulls 0 (0.0%)
unique 26
top_value en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded
top_rate 0.16
cardinality 26
entropy 4.286
entropy_ratio 0.9119
alert: long_tail (16 singleton categories)
Fig 93.
Top values for states.
Top values for states (20 unique shown, of 26 total).
valuecountshare
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded816.0%
en:to-be-checked, en:complete, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded612.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded510.0%
en:to-be-checked, en:complete, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded36.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-to-be-completed, en:quantity-to-be-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded24.0%
en:checked, en:complete, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded24.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded24.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-to-be-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded24.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded24.0%
en:to-be-checked, en:complete, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded24.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-to-be-selected, en:ingredients-photo-selected, en:front-photo-to-be-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-to-be-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-to-be-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:checked, en:complete, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-completed, en:packaging-code-to-be-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-to-be-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-completed, en:characteristics-to-be-completed, en:origins-to-be-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-to-be-validated, en:packaging-photo-to-be-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%
en:checked, en:complete, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed, en:packaging-code-to-be-completed, en:characteristics-completed, en:origins-completed, en:categories-completed, en:brands-completed, en:packaging-completed, en:quantity-completed, en:product-name-completed, en:photos-validated, en:packaging-photo-selected, en:nutrition-photo-selected, en:ingredients-photo-selected, en:front-photo-selected, en:photos-uploaded12.0%

generic_name_fi categorical

Out[367]:

saturn.columns["generic_name_fi"].stats

stat value
n 50
nulls 45 (90.0%)
unique 5
top_value Hieno tumma suklaa jossa 90% kaakaota
top_rate 0.2
cardinality 5
entropy 2.322
entropy_ratio 1
alert: long_tail (5 singleton categories)
alert: null_rate (90.0% null)
Fig 94.
Top values for generic_name_fi.
Top values for generic_name_fi (5 unique shown, of 5 total).
value | count | share
Hieno tumma suklaa jossa 90% kaakaota | 1 | 2.0%
Tumma suklaa | 1 | 2.0%
tumma suklaa | 1 | 2.0%
Keksejä | 1 | 2.0%
(blank) | 1 | 2.0%

schema_version numeric

Out[370]:

saturn.columns["schema_version"].stats

stat value
n 50
nulls 0 (0.0%)
unique 1
min 996
max 996
mean 996
median 996
std 0
q1 996
q3 996
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 0
alert: constant (only one distinct value)
Fig 95.
Distribution of schema_version. Vertical dash marks the median.
Histogram bins for schema_version (median: 996.0).
bin | count
995.5 – 995.6 | 0
995.6 – 995.8 | 0
995.8 – 995.9 | 0
995.9 – 996.1 | 50
996.1 – 996.2 | 0
996.2 – 996.4 | 0
996.4 – 996.5 | 0

packaging_old_before_taxonomization categorical

Out[373]:

saturn.columns["packaging_old_before_taxonomization"].stats

stat value
n 50
nulls 12 (24.0%)
unique 36
top_value plastique
top_rate 0.07895
cardinality 36
entropy 5.123
entropy_ratio 0.9909
alert: long_tail (35 singleton categories)
alert: null_rate (24.0% null)
Fig 96.
Top values for packaging_old_before_taxonomization.
Top values for packaging_old_before_taxonomization (20 unique shown, of 36 total).
value | count | share
plastique | 3 | 6.0%
fr:Film en plastique,paquet,fr:Etui en carton | 1 | 2.0%
Papel de aluminio,Caja de cartón,Carton,Karton,emballage,box cardboard,Aluminium wrap, en:card-box, en:foil-wrapper | 1 | 2.0%
Carton,Sachets,20 biscuits en 4 sachets,packet,paquetes | 1 | 2.0%
sl:PAP,fr:FSC mixte,Produkt,21 PAP | 1 | 2.0%
Papier,aluminium | 1 | 2.0%
Plastic | 1 | 2.0%
Plastique,en:mixed plastic-packet,Enveloppe | 1 | 2.0%
fr:Papier,Package paper,Paper recycling,papier,Enveloppe | 1 | 2.0%
carton,aluminium,Emballage carton | 1 | 2.0%
Sachet,Sous atmosphère protectrice,en:mixed plastic-packet | 1 | 2.0%
paper, foil | 1 | 2.0%
papier aluminium,emballage carton | 1 | 2.0%
fr:film plastique à jeter,fr:étui carton à recycler, fr:Film en plastique | 1 | 2.0%
papier,Enveloppe | 1 | 2.0%
paper | 1 | 2.0%
Kunststoff | 1 | 2.0%
Papel de aluminio, Caja de cartón, Carton, en:card-carton, en:aluminium-wrapper | 1 | 2.0%
Carton,plastique | 1 | 2.0%
4 sachets plastiques de 4 biscuits, Carton, fr:Film en plastique, fr:Etui en carton | 1 | 2.0%

nova_groups_markers unknown

Out[376]:

saturn.columns["nova_groups_markers"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

amino_acids_prev_tags unknown

Out[378]:

saturn.columns["amino_acids_prev_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

product unknown

Out[380]:

saturn.columns["product"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

emb_codes categorical

Out[382]:

saturn.columns["emb_codes"].stats

stat value
n 50
nulls 2 (4.0%)
unique 11
top_value (blank)
top_rate 0.7292
cardinality 11
entropy 1.72
entropy_ratio 0.4972
alert: long_tail (7 singleton categories)
Fig 97.
Top values for emb_codes.
Top values for emb_codes (11 unique shown, of 11 total).
value | count | share
(blank) | 35 | 70.0%
FSC-C021442 | 2 | 4.0%
FSC-C012484 | 2 | 4.0%
EMB 31250 | 2 | 4.0%
LPL.28.01.13 | 1 | 2.0%
EMB 44068A | 1 | 2.0%
SOLENT GMBH & CO. KG,SCHWARZ BETEILIGUNGS GMBH | 1 | 2.0%
200029-N4/7243 | 1 | 2.0%
EMB 64422 | 1 | 2.0%
FSC-C190426 | 1 | 2.0%
C-352-255-22-10 | 1 | 2.0%

labels_tags unknown

Out[385]:

saturn.columns["labels_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

selected_images unknown

Out[387]:

saturn.columns["selected_images"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

nutriscore unknown

Out[389]:

saturn.columns["nutriscore"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

packaging_tags unknown

Out[391]:

saturn.columns["packaging_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

traces_from_ingredients categorical

Out[393]:

saturn.columns["traces_from_ingredients"].stats

stat value
n 50
nulls 0 (0.0%)
unique 12
top_value (blank)
top_rate 0.78
cardinality 12
entropy 1.521
entropy_ratio 0.4243
alert: long_tail (11 singleton categories)
Fig 98.
Top values for traces_from_ingredients.
Top values for traces_from_ingredients (12 unique shown, of 12 total).
value | count | share
(blank) | 39 | 78.0%
œuf | 1 | 2.0%
nuts, milk | 1 | 2.0%
LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME , SOJA, LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME, SOJA | 1 | 2.0%
fruits à coque, lait, soja, sésame | 1 | 2.0%
soja, œufs, fruits à coque, sésame, moutarde | 1 | 2.0%
LUPIN, LAIT, MOUTARDE , SOJA, LUPIN, LAIT, MOUTARDE, SOJA | 1 | 2.0%
Schalenfrüchte, Milch, Soja | 1 | 2.0%
LAIT, FRUITS A COQUE, LAIT, FRUITS A COQUE | 1 | 2.0%
lait, moutarde, soja | 1 | 2.0%
: fruits à coque | 1 | 2.0%
soja, sésame | 1 | 2.0%
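Several values above list the same allergen twice, mix upper and lower case, or carry stray punctuation (": fruits à coque"). A minimal clean-up sketch, assuming order of first appearance is worth keeping and that case is not meaningful; unique_terms is a hypothetical helper, and it does not translate between languages.

```python
def unique_terms(value):
    """Split on commas, trim spaces and stray colons, casefold, drop repeats."""
    seen = []
    for part in value.split(","):
        term = part.strip(" :").casefold()
        if term and term not in seen:
            seen.append(term)
    return seen

print(unique_terms("LUPIN, LAIT, MOUTARDE , SOJA, LUPIN, LAIT, MOUTARDE, SOJA"))
print(unique_terms(": fruits à coque"))
```

This removes duplicates within a row, but French, German, and English terms for the same allergen (lait, Milch, milk) would still need a mapping to a shared vocabulary before rows can be compared.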

nutrition_data_per categorical

Out[396]:

saturn.columns["nutrition_data_per"].stats

stat value
n 50
nulls 0 (0.0%)
unique 2
top_value 100g
top_rate 0.84
cardinality 2
entropy 0.6343
entropy_ratio 0.6343
Fig 99.
Top values for nutrition_data_per.
Top values for nutrition_data_per (2 unique shown, of 2 total).
value | count | share
100g | 42 | 84.0%
serving | 8 | 16.0%

ecoscore_grade categorical

Out[399]:

saturn.columns["ecoscore_grade"].stats

stat value
n 50
nulls 0 (0.0%)
unique 9
top_value e
top_rate 0.24
cardinality 9
entropy 2.808
entropy_ratio 0.8857
Fig 100.
Top values for ecoscore_grade.
Top values for ecoscore_grade (9 unique shown, of 9 total).
value | count | share
e | 12 | 24.0%
d | 9 | 18.0%
b | 8 | 16.0%
c | 8 | 16.0%
unknown | 6 | 12.0%
a | 3 | 6.0%
a-plus | 2 | 4.0%
not-applicable | 1 | 2.0%
f | 1 | 2.0%
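The grade values are ordered, and two of the nine categories (unknown, not-applicable) are really missing-value markers even though the column reports 0 nulls. The sketch below separates the two groups and finds the median grade among graded products. It assumes the ordering a-plus (best) through f (worst); the counts are copied from the table above, and the names are hypothetical.

```python
GRADE_ORDER = ["a-plus", "a", "b", "c", "d", "e", "f"]  # assumed best-to-worst order

def grade_rank(grade):
    """Return 0 for the best grade up to 6 for the worst, or None when ungraded."""
    return GRADE_ORDER.index(grade) if grade in GRADE_ORDER else None

counts = {"e": 12, "d": 9, "b": 8, "c": 8, "unknown": 6,
          "a": 3, "a-plus": 2, "not-applicable": 1, "f": 1}

n_graded = sum(c for g, c in counts.items() if grade_rank(g) is not None)
n_ungraded = sum(counts.values()) - n_graded

# Expand to one rank per product, sort, and take the middle element.
ranks = sorted(grade_rank(g) for g, c in counts.items()
               if grade_rank(g) is not None for _ in range(c))
median_grade = GRADE_ORDER[ranks[len(ranks) // 2]]

print(n_graded, n_ungraded, median_grade)
```

Treating the two marker values as missing gives an effective null rate of 14% for this column, and an ordinal encoding makes the grade usable in comparisons that a plain categorical cannot support.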

packaging_hierarchy unknown

Out[402]:

saturn.columns["packaging_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

nova_group numeric

Out[404]:

saturn.columns["nova_group"].stats

stat value
n 50
nulls 2 (4.0%)
unique 3
min 1
max 4
mean 3.646
median 4
std 0.601
q1 3
q3 4
iqr 1
skew -2.062
kurtosis 5.651
n_outliers 1
outlier_rate 0.02083
zero_rate 0
alert: high_skew (skew=-2.06)
Fig 101.
Distribution of nova_group. Vertical dash marks the median.
Histogram bins for nova_group (median: 4.0).
bin | count
1 – 1.5 | 1
1.5 – 2 | 0
2 – 2.5 | 0
2.5 – 3 | 0
3 – 3.5 | 14
3.5 – 4 | 33
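Because nova_group has only three distinct values, the histogram is enough to rebuild the column and check the summary statistics. The sketch assumes the three occupied bins correspond to the integer groups 1, 3, and 4 (NOVA groups are whole numbers, and the report lists unique 3, min 1, max 4) and that std is the sample standard deviation.

```python
import statistics

# 48 non-null rows rebuilt from the histogram: one 1, fourteen 3s, thirty-three 4s.
values = [1] * 1 + [3] * 14 + [4] * 33

mean = statistics.mean(values)
std = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)

print(round(mean, 3), round(std, 3))
```

Both figures match the reported mean 3.646 and std 0.601. With a four-level ordinal variable, the negative skew and the single "outlier" simply reflect one group-1 product among 47 products in groups 3 and 4, so a frequency table is a more natural summary than moments.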

additives_tags unknown

Out[407]:

saturn.columns["additives_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

emb_codes_20141016 categorical

Out[409]:

saturn.columns["emb_codes_20141016"].stats

stat value
n 50
nulls 29 (58.0%)
unique 7
top_value (blank)
top_rate 0.7143
cardinality 7
entropy 1.602
entropy_ratio 0.5705
alert: long_tail (6 singleton categories)
alert: null_rate (58.0% null)
Fig 102.
Top values for emb_codes_20141016.
Top values for emb_codes_20141016 (7 unique shown, of 7 total).
value | count | share
(blank) | 15 | 30.0%
LINDT & SPRÜNGLI SAS,CHOCOLADEFABRIKEN LINDT & SPRÜNGLI AG | 1 | 2.0%
EMB 44068A | 1 | 2.0%
//HERSTELLER UND VERPACKER://,SOLENT GMBH & CO. KG,//DIE ZUGEHÖRIGKEIT ZU://,SCHWARZ BETEILIGUNGS GMBH | 1 | 2.0%
//FABRICANTE Y ENVASADOR://,LINDT & SPRÜNGLI SAS,//PERTENECIENTE A://,CHOCOLADEFABRIKEN LINDT & SPRÜNGLI AG | 1 | 2.0%
//FABRICANTE Y ENVASADOR://,RAUSCH SCHOKOLADEN GMBH | 1 | 2.0%
EMB 64422 | 1 | 2.0%

ingredients_without_ciqual_codes unknown

Out[412]:

saturn.columns["ingredients_without_ciqual_codes"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

categories_tags unknown

Out[414]:

saturn.columns["categories_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

category_properties unknown

Out[416]:

saturn.columns["category_properties"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

packagings unknown

Out[418]:

saturn.columns["packagings"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

languages_codes unknown

Out[420]:

saturn.columns["languages_codes"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

ingredients_text_with_allergens_fi categorical

Out[422]:

saturn.columns["ingredients_text_with_allergens_fi"].stats

stat value
n 50
nulls 45 (90.0%)
unique 4
top_value (blank)
top_rate 0.4
cardinality 4
entropy 1.922
entropy_ratio 0.961
alert: long_tail (3 singleton categories)
alert: null_rate (90.0% null)
Fig 103.
Top values for ingredients_text_with_allergens_fi.
Top values for ingredients_text_with_allergens_fi (4 unique shown, of 4 total).
value | count | share
(blank) | 2 | 4.0%
kaakaomassa, kaakaovoi, vähärasvainen kaakaojauhe, sokeri, vanilja. Saattaa sisältää hasselpähkinää, muita pähkinöitä, maitoa, soijaa. Tummassa suklaassa kaakaota vähintään 90%. | 1 | 2.0%
kaakaomassa, vähärasvainen kaakaojauhe, kaakaovoi, sokeri, emulgointiaine (soijalesitiini), vaniljauute. Suklaassa kaakaota vähintään 85 %. Saattaa sisältää pieniä määriä pähkinää ja maitoa. | 1 | 2.0%
VEHNÄJAUHO, palmuöljy, tärkkelyssiirappi, OHRAMALLASUUTE, nostatusaineet ammoniumkarbonaatit, natriumkarbonaatit), suola, KANANMUNAT, aromi, jauhonparanne (NATRIUMDISULFIITTI). | 1 | 2.0%

ciqual_food_name_tags unknown

Out[425]:

saturn.columns["ciqual_food_name_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped (no profiler for kind=unknown)

complete numeric

Out[427]:

saturn.columns["complete"].stats

stat value
n 50
nulls 0 (0.0%)
unique 2
min 0
max 1
mean 0.32
median 0
std 0.4712
q1 0
q3 1
iqr 1
skew 0.7717
kurtosis -1.404
n_outliers 0
outlier_rate 0
zero_rate 0.68
Fig 104.
Distribution of complete. Vertical dash marks the median.
Histogram bins for complete (median: 0.0).
bin | count
0 – 0.1429 | 34
0.1429 – 0.2857 | 0
0.2857 – 0.4286 | 0
0.4286 – 0.5714 | 0
0.5714 – 0.7143 | 0
0.7143 – 0.8571 | 0
0.8571 – 1 | 16

ingredients_text_with_allergens_pl categorical

Out[430]:

saturn.columns["ingredients_text_with_allergens_pl"].stats

stat value
n 50
nulls 46 (92.0%)
unique 3
top_value (blank)
top_rate 0.5
cardinality 3
entropy 1.5
entropy_ratio 0.9464
alert: long_tail 2 singleton categories
alert: null_rate 92.0% null
Fig 105.
Top values for ingredients_text_with_allergens_pl.
Top values for ingredients_text_with_allergens_pl (3 unique shown, of 3 total).
value count share
(blank) 2 4.0%
Miazga kakaowa, cukier, tłuszcz kakaowy, kakao w proszku o obniżonej zawartości tłuszczu, emulgator: lecytyny (soja); naturalny aromat waniliowy. Czekolada gorzka: masa kakaowa minimum 74 %. Może zawierać orzeszki ziemne, orzechy, mleko i gluten (pszenica, żyt jęczmień, owies, pszenica orkisz i pszenica khorosan). 1 2.0%
Miazga kakaowa, cukier, tłuszcz kakaowy, wanilia. 1 2.0%

allergens_hierarchy unknown

Out[433]:

saturn.columns["allergens_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

languages_hierarchy unknown

Out[435]:

saturn.columns["languages_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

nova_groups_tags unknown

Out[437]:

saturn.columns["nova_groups_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_tags unknown

Out[439]:

saturn.columns["ingredients_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_it categorical

Out[441]:

saturn.columns["ingredients_text_it"].stats

stat value
n 50
nulls 34 (68.0%)
unique 12
top_value (blank)
top_rate 0.3125
cardinality 12
entropy 3.274
entropy_ratio 0.9134
alert: long_tail 11 singleton categories
alert: null_rate 68.0% null
Fig 106.
Top values for ingredients_text_it.
Top values for ingredients_text_it (12 unique shown, of 12 total).
value count share
(blank) 5 10.0%
Pasta di cacao, burro di cacao, cacao magro in polvere, zucchero. Può contenere nocciole, mandorle, altra frutta a guscio, latte, soia.12.0%
crema alle NOCCIOLE e al cacao 40% (zucchero, olio di palma, NOCCIOLE 13%, LATTE Scremato in polvere 8.7%, cacao magro 7,4%, emulsionanti: lecitine (SOIA): vanillina), farina di FRUMENTO (32%), grassi vegetali (palma, palmisto), zucchero di canna (9%), LATTOSIO, crusca di FRUMENTO, LATTE intero in polvere, estratto in polvere di malto d'ORZO e mais, miele, agenti lievitanti (difosfato disodico. carbonato acido di ammonio, carbonato acido di sodio), cacao magro, sale, amido di FRUMENTO, farina di ORZO maltato, emulsionanti: lecitine (SOIA), vanillina.12.0%
pasta di cacao, zucchero, burro di cacao, vaniglia12.0%
patate, olio di girasole, sale marino.12.0%
Pasta di cacao, cacao magro, burro di cacao, zucchero grezzo di canna, vaniglia.12.0%
Farina integrale di _segale_ (59 g), crusca di _grano_ (27 g), fiocchi d'_avena_ (12 g), semi di _sesamo_ (7,0 g), germe di _grano_, sale. Può contenere tracce di _latte_.12.0%
Farina di _FRUMENTO_, olio di palma, sciroppo di glucosio, estratto di malto d'_ORZO_, agenti lievitanti (carbonati di ammonio, carbonati di sodio), sale, _UOVA_, aroma, agente di trattamento della farina (_METABISOLFITO_ di sodio).12.0%
Pasta di cacao, zucchero, burro di cacao, vaniglia.12.0%
Massa di cacao, zucchero, burro di cacao, emulsionante: lecitine (soia); estratto di vaniglia. Può contenere tracce di frutta a guscio e latte. Il 40% della massa di cacao proviene da piantagioni selezionate dell'Ecuador.12.0%
wdrated potatoes, sunflower oll, wheat flour, corn lour.test NRC b ber otin. Emulgator (E471), Salz, Farbstoff (Annatto Norbirin, k hottom (BB). Packaged in a protective atmosphere, (DE) KNAEF Kam ef s1sel colorant (n0rbixine de rocou). Peut contenir lait, soja. À conse gie vepackt. (FR) SNACK SALE. INGREDIENTS: Pommes de terre disht SNCK SALATO. : Patate disidratate, olio di girasole, (arina d frmu botisiha d annatto). Puo contenere latte, sola. Da consumarsi prelerbilmetp SEL NGREDIENTES: Batatas desidratadas, óleo de girasol, farinha de trigo.(aimha d mh e o, Pode conter leite, soja. Consumir de preferëncia antes de: ver fundo (BB), Enbazhyer OHTS Pttas deshidratadas, aceite de qirasol, harina de trigo, harina de maiz, haia ca rm e eche, soja. Consumir preferentemente antes del: ver parte interior (8B), Enast et 'Releenc itle dn 100 g | RI" /30g| Eectsge/Ayt acuilo medo 84U bole / Prodoth te /30g ji begja /Valor energetico Tpas (Grassi/ Unjdos / Grasas tan eậticte Fetsäuren / dont 2214 kJ 664 kJ 530 kcal 159 kcal adulo medio / 8% 31g 3.0 9 9.3 0.9g 17g 13% Produoad by: see yd Aii dd cassi satui / dos quais Producido por urdes thtrde | Glucites | 5% oidrati / MedaCoyK Sabd 55g 7% Uont sucres /di eui *FRSCAME QNg12.0%
25% noci, 25% mandorle, 25% uva sultanina (99,5% uva sultanina, olio di semi di girasole), 25% mirtilli rossi americani, essiccati e zuccherati (60% mirtilli rossi americani, 39% zucchero, olio di semi di girasole). Può contenere tracce di altra frutta a guscio e arachidi. Confezionato in atmosfera protettiva.12.0%

informers unknown

Out[444]:

saturn.columns["informers"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

origin_nb categorical

Out[446]:

saturn.columns["origin_nb"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 107.
Top values for origin_nb.
Top values for origin_nb (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

creator categorical

Out[449]:

saturn.columns["creator"].stats

stat value
n 50
nulls 0 (0.0%)
unique 13
top_value openfoodfacts-contributors
top_rate 0.46
cardinality 13
entropy 2.351
entropy_ratio 0.6353
alert: long_tail 10 singleton categories
Fig 108.
Top values for creator.
Top values for creator (13 unique shown, of 13 total).
value count share
openfoodfacts-contributors 23 46.0%
kiliweb 15 30.0%
javichu 2 4.0%
meryemali 1 2.0%
vichenze 1 2.0%
mllep 1 2.0%
andre 1 2.0%
sqoia 1 2.0%
shaolan 1 2.0%
tacite 1 2.0%
mambl 1 2.0%
norbert45fr 1 2.0%
date-limite-app 1 2.0%

packaging_text_ja categorical

Out[452]:

saturn.columns["packaging_text_ja"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 109.
Top values for packaging_text_ja.
Top values for packaging_text_ja (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

sortkey numeric

Out[455]:

saturn.columns["sortkey"].stats

stat value
n 50
nulls 6 (12.0%)
unique 44
min 1.568e+09
max 1.611e+09
mean 1.605e+09
median 1.608e+09
std 8.692e+06
q1 1.604e+09
q3 1.61e+09
iqr 6.16e+06
skew -2.782
kurtosis 8.091
n_outliers 4
outlier_rate 0.09091
zero_rate 0
alert: high_skew skew=-2.78
alert: outliers 9.1% rows beyond 1.5 IQR
Fig 110.
Distribution of sortkey. Vertical dash marks the median.
Histogram bins for sortkey (median: 1608147866.0).
bin count
1.568e+09 – 1.575e+09 1
1.575e+09 – 1.582e+09 1
1.582e+09 – 1.589e+09 1
1.589e+09 – 1.596e+09 1
1.596e+09 – 1.604e+09 5
1.604e+09 – 1.611e+09 35

packagings_materials_main categorical

Out[458]:

saturn.columns["packagings_materials_main"].stats

stat value
n 50
nulls 31 (62.0%)
unique 3
top_value en:paper-or-cardboard
top_rate 0.6842
cardinality 3
entropy 1.105
entropy_ratio 0.6972
alert: null_rate 62.0% null
Fig 111.
Top values for packagings_materials_main.
Top values for packagings_materials_main (3 unique shown, of 3 total).
value count share
en:paper-or-cardboard 13 26.0%
en:plastic 5 10.0%
en:unknown 1 2.0%

ingredients_percent_analysis numeric feature

This column appears to be a binary flag or pass/fail indicator for ingredient percentage analysis, taking only two distinct values across all 50 rows: 1.0 (46 rows) and -1.0 (4 rows). Q1, median, and Q3 all equal 1.0, and because a column coded +1/-1 has mean 2p − 1, the mean of 0.84 implies p = 0.92: 92% of records are coded 1.0 and the remaining 8% are -1.0. Those 4 rows are the flagged outliers (8% outlier rate), since an IQR of 0 places any value other than 1.0 beyond the 1.5 IQR fences. The extreme skew (−3.10) and kurtosis (7.59) are entirely explained by this near-constant binary distribution, not by a continuous numeric spread.

Treatment: Recode as a binary categorical (1 / -1 → 1 / 0) before modelling; verify whether -1.0 encodes 'fail' or 'missing' to avoid misinterpretation.

anthropic:default · confidence high
Out[461]:

saturn.columns["ingredients_percent_analysis"].stats

stat value
n 50
nulls 0 (0.0%)
unique 2
min -1
max 1
mean 0.84
median 1
std 0.5481
q1 1
q3 1
iqr 0
skew -3.096
kurtosis 7.587
n_outliers 4
outlier_rate 0.08
zero_rate 0
alert: high_skew skew=-3.10
alert: outliers 8.0% rows beyond 1.5 IQR
Fig 112.
Distribution of ingredients_percent_analysis. Vertical dash marks the median.
Histogram bins for ingredients_percent_analysis (median: 1.0).
bin count
-1 – -0.7143 4
-0.7143 – -0.4286 0
-0.4286 – -0.1429 0
-0.1429 – 0.1429 0
0.1429 – 0.4286 0
0.4286 – 0.7143 0
0.7143 – 1 46
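
The recoding suggested in the treatment for ingredients_percent_analysis can be sketched as follows. This is a minimal illustration, not Saturn's own code: the list `values` is rebuilt from the histogram counts (46 rows at 1, 4 rows at -1), and `recode_flag` is a hypothetical helper name. The stats do not establish whether -1 means "fail" or "missing", so mapping it to 0 is an assumption to verify first.

```python
from statistics import mean

# Rebuilt from the histogram counts: 46 rows coded 1, 4 rows coded -1.
values = [1.0] * 46 + [-1.0] * 4


def recode_flag(v: float) -> int:
    """Map the original 1 / -1 coding to 1 / 0 (hypothetical helper).

    Assumes -1 is the negative state; confirm against the field
    definition before treating it as 'fail' rather than 'missing'.
    """
    if v == 1.0:
        return 1
    if v == -1.0:
        return 0
    raise ValueError(f"unexpected value: {v!r}")


recoded = [recode_flag(v) for v in values]

# For a +1/-1 column, mean = 2p - 1, so the share of ones is (mean + 1) / 2.
share_of_ones = (mean(values) + 1) / 2
```

Raising on any unexpected value keeps the recode from silently absorbing a third state if the full dataset contains one.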

amino_acids_tags unknown

Out[464]:

saturn.columns["amino_acids_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

categories_properties_tags unknown

Out[466]:

saturn.columns["categories_properties_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

environment_impact_level categorical other

This column is intended to capture an environmental impact level category, but it is effectively empty: 56% of the 50 rows (28 rows) are null and the remaining 44% (22 rows) contain only a blank string, yielding a single unique value and zero entropy. In its current state the column carries no usable information for modelling or analysis.

Treatment: Drop this column; all non-null values are blank strings and it contains zero informational signal.

anthropic:default · confidence high
Out[468]:

saturn.columns["environment_impact_level"].stats

stat value
n 50
nulls 28 (56.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 56.0% null
alert: imbalance top value is 100.0% of rows
Fig 113.
Top values for environment_impact_level.
Top values for environment_impact_level (1 unique shown, of 1 total).
value count share
(blank) 22 44.0%
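
A column like environment_impact_level can be found programmatically before dropping it. The sketch below is one possible check, not part of Saturn: `is_blank_only` is a hypothetical helper, and the toy rows only mirror the reported counts (28 nulls, 22 blank strings); the `lang` values paired with them are made up for contrast.

```python
def is_blank_only(column: list) -> bool:
    """True if the column has non-null entries and all of them are
    empty or whitespace-only strings."""
    non_null = [v for v in column if v is not None]
    return bool(non_null) and all(
        isinstance(v, str) and v.strip() == "" for v in non_null
    )


# Toy rows mirroring the reported counts: 28 nulls and 22 blank strings.
# The lang values are illustrative only.
rows = [{"environment_impact_level": None, "lang": "fr"}] * 28 + [
    {"environment_impact_level": "", "lang": "en"}
] * 22

# Pivot rows into columns, then list the columns that are safe to drop.
columns = {name: [row[name] for row in rows] for name in rows[0]}
drop_candidates = [name for name, col in columns.items() if is_blank_only(col)]
```

An all-null column returns False here on purpose, so that fully missing columns can be reported separately from blank-string ones.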

expiration_date categorical

Out[471]:

saturn.columns["expiration_date"].stats

stat value
n 50
nulls 2 (4.0%)
unique 34
top_value (blank)
top_rate 0.3125
cardinality 34
entropy 4.364
entropy_ratio 0.8578
alert: long_tail 33 singleton categories
Fig 114.
Top values for expiration_date.
Top values for expiration_date (20 unique shown, of 34 total).
value count share
(blank) 15 30.0%
30days 1 2.0%
31/07/2020 1 2.0%
28/02/24 1 2.0%
30/06/2025 1 2.0%
25.11.2025 1 2.0%
12.12.2018 1 2.0%
01/2018 1 2.0%
12/06/2021 1 2.0%
19-10-2023 1 2.0%
31 jul. 2019 1 2.0%
30-04-2024 1 2.0%
11/10/2025 1 2.0%
30 jun. 2020 1 2.0%
2024-04-01 1 2.0%
31 mai 2019 1 2.0%
31-01-2025 1 2.0%
05 2026 1 2.0%
2021-11-15 1 2.0%
31/12/2024 1 2.0%

ingredients_from_or_that_may_be_from_palm_oil_n numeric

Out[474]:

saturn.columns["ingredients_from_or_that_may_be_from_palm_oil_n"].stats

stat value
n 50
nulls 3 (6.0%)
unique 3
min 0
max 2
mean 0.3404
median 0
std 0.5625
q1 0
q3 1
iqr 1
skew 1.393
kurtosis 0.969
n_outliers 0
outlier_rate 0
zero_rate 0.7021
Fig 115.
Distribution of ingredients_from_or_that_may_be_from_palm_oil_n. Vertical dash marks the median.
Histogram bins for ingredients_from_or_that_may_be_from_palm_oil_n (median: 0.0).
bin count
0 – 0.3333 33
0.3333 – 0.6667 0
0.6667 – 1 0
1 – 1.333 12
1.333 – 1.667 0
1.667 – 2 2

nutriscore_score numeric

Out[477]:

saturn.columns["nutriscore_score"].stats

stat value
n 50
nulls 1 (2.0%)
unique 28
min 0
max 40
mean 17.47
median 19
std 9.906
q1 10
q3 25
iqr 15
skew -0.1616
kurtosis -0.5337
n_outliers 0
outlier_rate 0
zero_rate 0.08163
Fig 116.
Distribution of nutriscore_score. Vertical dash marks the median.
Histogram bins for nutriscore_score (median: 19.0).
bin count
0 – 5.714 8
5.714 – 11.43 5
11.43 – 17.14 7
17.14 – 22.86 13
22.86 – 28.57 12
28.57 – 34.29 2
34.29 – 40 2

ingredients_text_with_allergens categorical

Out[480]:

saturn.columns["ingredients_text_with_allergens"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value milk cream, cream, sugar, banana, bacteria
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 117.
Top values for ingredients_text_with_allergens.
Top values for ingredients_text_with_allergens (20 unique shown, of 50 total).
value count share
milk cream, cream, sugar, banana, bacteria12.0%
Céréale 50 % (Farine de blé 34,8 %, farine de blé complet 15,2 %), sucre, huiles végétales (palme, colza), cacao maigre en poudre 4,5 %, sirop de glucose, amidon de blé, poudres à lever (carbonates d'ammonium, carbonates de sodium), émulsifiant (lécithines de soja), sel, lait écrémé en poudre, perméat de lactosérum (de lait), arômes. Peut contenir œuf.12.0%
Pâte de cacao, beurre de cacao, cacao maigre, sucre, vanille.12.0%
Coffret fourré au cacao (41,6%) et à la vanille (208) - Ingrédients Farine de blé, sucre, huile végétale non hydrogénée (huile de palme), filtrat de lait, poudre de cacao Émulsifiant à faible teneur en cacao (322) Lécithine de soja) Agent levant (5000) Sucre artificiel (vanilline) Sel Contient du lait, du blé (gluten) du soja12.0%
Farine de blé 57%, sucre de canne roux, huile de colza, sésame toasté 10,6%, germe de blé 5,4%, farine complète de blé 5,4%, arôme naturel, magnésium, émulsifiant : lécithines, poudres à lever (tartrates de potassium, carbonates de sodium, carbonates d'ammonium), sel de mer, amidon de blé, vitamines (E, PP, B6, B1, B9).12.0%
Какаова маса, нискомаслено какао на прах, какаово масло, захар, емулгатор: лецитин (соеви), екстракт от ванилия, Може да съдържа следи от ядки и мляко,12.0%
Eau de source12.0%
Farine de froment, sucre, graisse végétale, sucre inverti, agents levants ( bicarbonate d'ammonium - bicarbonate de sodium), sel, arome.12.0%
sugar, cocoa butter, whole milk powder, cocoa mass, almonds, emulsifier (soya lecithin), flavoring12.0%
cocoa mass #, cane sugar #, cocoa butter #, vanilla extract #, may contain nuts, milk,12.0%
دقيقالقمح،رقائق الشوكولاته20%[عجينة زيت النخلة.الكاكاو،سكر،دكستروز و مستحلب12.0%
Farine de froment, sucre, graisse végétale, noix de coco râpée, poudre de lait, poudre de lactosérum, sucre inverti, agents levants (bicarbonate d'ammonium - bicarbonate de Sodium), sel, arômes.12.0%
Pâte à tartiner aux NOISETTES et au cacao 40% (sucre, huile de palme, NOISETTES 13%**, LAIT écrémé en poudre 8,7%**, cacao maigre 7,4%**, émulsifiants : lécithines [SOJA]; vanilline), farine de FROMENT 32,5%, graisses végétales (palme, palmiste), sucre de canne (contient BLE) 8,5%, LACTOSE, son de BLE, LAIT en poudre, miel, poudres à lever (diphosphate disodique, carbonate acide de sodium, carbonate acide d'ammonium), farine d'ORGE malté, cacao maigre en poudre, sel, extrait en poudre de malt d'ORGE et de maïs, amidon de FROMENT, émulsifiants: lécithines [SOJA]; vanilline.12.0%
Farine complète de SEIGLE (77 g*), farine de SEIGLE (28 g*), levure, sel. Peut contenir des traces de LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME et SOJA. *en g pour 100 g de produit.12.0%
Pâte de cacao, sucre, beurre de cacao, vanille. Peut contenir des fruits à coque, du lait, du soja et des graines de sésame.12.0%
Kartoffeln, Sonnenblumenöl, Meersalz.12.0%
pâte de cacao*, beurre de cacao*, cacao maigre en poudre*, sucre de canne*, extrait de vanille*, * ingrédients issus de l'agriculture biologique12.0%
Pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille12.0%
Farine de blé* 41%, Chocolat noir* 22% (pâte de cacao*, sucre de canne", beurre de cacao"), Sucre de canne* roux non raffiné, Farine complète de blé* 16%, Huile de tournesol oléique*, Arôme naturel de vanille, Lait écrémé en poudre, Sel de mer, carbonates d'ammonium, carbonates de sodium, gomme d'acacia*, extraits de romarin* Peut contenir du soja, des œufs, des fruits à coque, des graines de sésame et de la moutarde. *Ingrédients biologiques.12.0%
cocoa mass, sugar, cocoa butter, fat reduced cocoa powder, emulsifier: lecithins (soya), natural vanilla flavouring, dark chocolate contains: cocoa solids 74% minimum,12.0%

ingredients_with_specified_percent_sum numeric

Out[483]:

saturn.columns["ingredients_with_specified_percent_sum"].stats

stat value
n 50
nulls 0 (0.0%)
unique 22
min 0
max 99.6
mean 22.74
median 0
std 32.88
q1 0
q3 52.25
iqr 52.25
skew 0.9979
kurtosis -0.5856
n_outliers 0
outlier_rate 0
zero_rate 0.58
Fig 118.
Distribution of ingredients_with_specified_percent_sum. Vertical dash marks the median.
Histogram bins for ingredients_with_specified_percent_sum (median: 0.0).
bin count
0 – 14.23 33
14.23 – 28.46 0
28.46 – 42.69 2
42.69 – 56.91 4
56.91 – 71.14 5
71.14 – 85.37 4
85.37 – 99.6 2

nutriscore_version categorical

Out[486]:

saturn.columns["nutriscore_version"].stats

stat value
n 50
nulls 0 (0.0%)
unique 1
top_value 2023
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 119.
Top values for nutriscore_version.
Top values for nutriscore_version (1 unique shown, of 1 total).
value count share
2023 50 100.0%

lang categorical

Out[489]:

saturn.columns["lang"].stats

stat value
n 50
nulls 0 (0.0%)
unique 5
top_value fr
top_rate 0.7
cardinality 5
entropy 1.294
entropy_ratio 0.5572
Fig 120.
Top values for lang.
Top values for lang (5 unique shown, of 5 total).
value count share
fr 35 70.0%
en 10 20.0%
de 3 6.0%
bg 1 2.0%
ro 1 2.0%
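
The entropy and entropy_ratio reported for lang are consistent with Shannon entropy in bits and its ratio to log2(cardinality). That reading is inferred from the numbers lining up, not taken from Saturn's documentation, so treat the sketch below as a cross-check under that assumption. The counts come from the lang table (fr 35, en 10, de 3, bg 1, ro 1).

```python
from math import log2

# Counts from the lang top-values table.
counts = {"fr": 35, "en": 10, "de": 3, "bg": 1, "ro": 1}

total = sum(counts.values())
probs = [c / total for c in counts.values()]

# Shannon entropy in bits.
entropy = -sum(p * log2(p) for p in probs)

# Normalise by the maximum entropy possible with this many categories.
entropy_ratio = entropy / log2(len(counts))

print(round(entropy, 3), round(entropy_ratio, 4))
```

Rounded to the report's precision, this gives 1.294 and 0.5572, matching the stats for lang. A ratio near 1 means the categories are close to evenly used; here the dominance of fr pulls it down to roughly 0.56.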

origins_hierarchy unknown

Out[492]:

saturn.columns["origins_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

origins_lc categorical

Out[494]:

saturn.columns["origins_lc"].stats

stat value
n 50
nulls 2 (4.0%)
unique 6
top_value fr
top_rate 0.4792
cardinality 6
entropy 1.575
entropy_ratio 0.6093
Fig 121.
Top values for origins_lc.
Top values for origins_lc (6 unique shown, of 6 total).
value count share
fr 23 46.0%
en 20 40.0%
es 2 4.0%
de 1 2.0%
it 1 2.0%
pl 1 2.0%

origin_it categorical other

This column appears to be the Italian-language variant of the origin field rather than a flag (the _it suffix matches other per-language columns in this dataset, such as ingredients_text_it and generic_name_it), but it is effectively empty: 68% of its 50 rows are null, and the sole non-null value is an empty string appearing 16 times. With a cardinality of 1 and entropy of 0, the column carries zero information. The combination of high nulls and a blank-string-only value suggests the field was never populated in this dataset slice.

Treatment: Drop. The column has zero variance and is entirely unpopulated (null or empty string), so it contributes no signal to any downstream task.

anthropic:default · confidence high
Out[497]:

saturn.columns["origin_it"].stats

stat value
n 50
nulls 34 (68.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 68.0% null
alert: imbalance top value is 100.0% of rows
Fig 122.
Top values for origin_it.
Top values for origin_it (1 unique shown, of 1 total).
value count share
(blank) 16 32.0%
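
The reported null rate of 68% understates how empty origin_it is, because the 16 empty strings count as values. A hedged sketch of an "effective missing rate" that treats blanks as missing is shown below; `is_missing` is a hypothetical helper, and the column is rebuilt from the reported counts rather than read from the data.

```python
# Rebuilt from the reported counts: 34 nulls and 16 empty strings in 50 rows.
column = [None] * 34 + [""] * 16


def is_missing(value) -> bool:
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


# Null rate as the profiler reports it (None only).
null_rate = sum(v is None for v in column) / len(column)

# Missing rate once blank strings are counted as missing too.
effective_missing_rate = sum(is_missing(v) for v in column) / len(column)
```

Under this definition the column is 100% missing, which is the basis for the drop recommendation.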

serving_quantity categorical

Out[500]:

saturn.columns["serving_quantity"].stats

stat value
n 50
nulls 6 (12.0%)
unique 27
top_value 100
top_rate 0.1591
cardinality 27
entropy 4.322
entropy_ratio 0.9089
alert: long_tail 21 singleton categories
Fig 123.
Top values for serving_quantity.
Top values for serving_quantity (20 unique shown, of 27 total).
value count share
100 7 14.0%
10 7 14.0%
20 3 6.0%
25 2 4.0%
42 2 4.0%
30 2 4.0%
23 1 2.0%
11.5 1 2.0%
1000 1 2.0%
13.8 1 2.0%
11.4 1 2.0%
18 1 2.0%
50 1 2.0%
85 1 2.0%
36 1 2.0%
40 1 2.0%
45 1 2.0%
8.4 1 2.0%
7.143 1 2.0%
58 1 2.0%

checkers unknown

Out[503]:

saturn.columns["checkers"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

editors_tags unknown

Out[505]:

saturn.columns["editors_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

stores categorical

Out[507]:

saturn.columns["stores"].stats

stat value
n 50
nulls 2 (4.0%)
unique 31
top_value (blank)
top_rate 0.2917
cardinality 31
entropy 4.233
entropy_ratio 0.8543
alert: long_tail 29 singleton categories
Fig 124.
Top values for stores.
Top values for stores (20 unique shown, of 31 total).
value count share
(blank) 14 28.0%
Lidl510.0%
Carrefour Market,Magasins U,Auchan,Intermarché,Carrefour,Casino,Cora,Bi1,carrefour.fr,Netto,bannete,E.Leclerc12.0%
Carrefour,Géant,kupsch,Magasins U,Esselunga,Lindt,carrefour.fr,COOP,El Corte Inglés,Consum,Meny,Walmart12.0%
E.Leclerc,Carrefour,Auchan,Monoprix,carrefour.fr,Lidl,Intermarché12.0%
Sogeres,Holyday Inn Toulon12.0%
Leclerc,Magasins U,carrefour.fr,Intermarché12.0%
Magasins U,Carrefour,carrefour.fr,Carrefour Market,E.leclerc,Carrefour City,Intermarché12.0%
Carrefour,Magasins U,Sainsbury's,carrefour.fr,Plus,Albert Heijn,Asda,El Corte Inglés12.0%
Tesco12.0%
Magasins U,Carrefour,Auchan,carrefour.fr,E.leclerc,Carrefour Market,Carrefour City12.0%
LIDL,Monoprix,Carrefour,Auchan,Intermarché,Carrefour Market,Leclerc12.0%
Dia,Auchan,Magasins U,carrefour.fr,monoprix,Centre Commercial E.Leclerc12.0%
private shops,groceries,Marjane12.0%
Carrefour,E.Leclerc,REWE12.0%
biocoop12.0%
Franprix,Magasins U,Leclerc,E Leclerc,Delhaize,carrefour.fr,Carrefour,Auchan,Carrefour Market12.0%
Sainsbury's,Coop12.0%
E.leclerc12.0%
Franprix,Magasins U,Carrefour,carrefour.fr,Carrefour City12.0%

product_name_pl categorical

Out[510]:

saturn.columns["product_name_pl"].stats

stat value
n 50
nulls 45 (90.0%)
unique 3
top_value (blank)
top_rate 0.6
cardinality 3
entropy 1.371
entropy_ratio 0.865
alert: long_tail 2 singleton categories
alert: null_rate 90.0% null
Fig 125.
Top values for product_name_pl.
Top values for product_name_pl (3 unique shown, of 3 total).
value count share
(blank) 3 6.0%
Czekolada gorzka 74% 1 2.0%
Excellence 70% Cocoa Intense Dark 1 2.0%

weighters_tags unknown

Out[513]:

saturn.columns["weighters_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ecoscore_score numeric

Out[515]:

saturn.columns["ecoscore_score"].stats

stat value
n 50
nulls 7 (14.0%)
unique 31
min 13
max 94
mean 47.74
median 44
std 21.19
q1 27.5
q3 64
iqr 36.5
skew 0.3069
kurtosis -0.7946
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 126.
Distribution of ecoscore_score. Vertical dash marks the median.
Histogram bins for ecoscore_score (median: 44.0).
bin count
13 – 26.5 10
26.5 – 40 4
40 – 53.5 12
53.5 – 67 8
67 – 80.5 6
80.5 – 94 3

generic_name_it categorical

Out[518]:

saturn.columns["generic_name_it"].stats

stat value
n 50
nulls 34 (68.0%)
unique 5
top_value (blank)
top_rate 0.6875
cardinality 5
entropy 1.497
entropy_ratio 0.6446
alert: long_tail 3 singleton categories
alert: null_rate 68.0% null
Fig 127.
Top values for generic_name_it.
Top values for generic_name_it (5 unique shown, of 5 total).
value count share
(blank) 11 22.0%
Cioccolato extra fondente 2 4.0%
Cioccolato fondente 90% 1 2.0%
Prodotto da forno con segale ricco di fibre alimentari 1 2.0%
Crackers 1 2.0%

obsolete categorical

Out[521]:

saturn.columns["obsolete"].stats

stat value
n 50
nulls 6 (12.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 128.
Top values for obsolete.
Top values for obsolete (1 unique shown, of 1 total).
value count share
(blank) 44 88.0%

other_nutritional_substances_prev_tags unknown

Out[524]:

saturn.columns["other_nutritional_substances_prev_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

compared_to_category categorical

Out[526]:

saturn.columns["compared_to_category"].stats

stat value
n 50
nulls 0 (0.0%)
unique 35
top_value en:dark-chocolate-bar-with-more-than-70-cocoa
top_rate 0.1
cardinality 35
entropy 4.886
entropy_ratio 0.9526
alert: long_tail 28 singleton categories
Fig 129.
Top values for compared_to_category.
Top values for compared_to_category (20 unique shown, of 35 total).
value count share
en:dark-chocolate-bar-with-more-than-70-cocoa 5 10.0%
en:biscuits 4 8.0%
en:extra-fine-dark-chocolates 3 6.0%
en:dark-chocolates 3 6.0%
en:snacks-sucres 3 6.0%
en:sandwich-biscuits 2 4.0%
en:extruded-crispbreads 2 4.0%
en:plain-fermented-dairy-desserts-with-cream 1 2.0%
en:chocolate-stuffed-wafers 1 2.0%
en:spring-waters 1 2.0%
en:food 1 2.0%
en:drop-cookies 1 2.0%
en:shortbread-cookie-with-coconut 1 2.0%
en:biscuits-cookies-shelf-stable 1 2.0%
en:crispbreads 1 2.0%
fr:chips-de-pommes-de-terre-classiques 1 2.0%
en:dark-chocolate-bar 1 2.0%
en:cacao-et-derives 1 2.0%
en:crispbreads-wholemeal 1 2.0%
en:biscuit-snack-with-chocolate-filling 1 2.0%

generic_name_es categorical

Out[529]:

saturn.columns["generic_name_es"].stats

stat value
n 50
nulls 30 (60.0%)
unique 7
top_value (blank)
top_rate 0.65
cardinality 7
entropy 1.817
entropy_ratio 0.6471
alert: long_tail 5 singleton categories
alert: null_rate 60.0% null
Fig 130.
Top values for generic_name_es.
Top values for generic_name_es (7 unique shown, of 7 total).
value count share
(blank) 13 26.0%
Chocolate negro 2 4.0%
Chocolate negro con un 74% de cacao mínimo 1 2.0%
Crackers 1 2.0%
Tableta de chocolate negro extrafino con 70% de cacao 1 2.0%
Tableta de chocolate negro Ecuador con un 70% de cacao mínimo 1 2.0%
Chocolate Negro 99% 1 2.0%

correctors unknown

Out[532]:

saturn.columns["correctors"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

additives_n numeric

Out[534]:

saturn.columns["additives_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 8
min 0
max 8
mean 1.52
median 1
std 1.821
q1 0
q3 2
iqr 2
skew 1.473
kurtosis 2.105
n_outliers 2
outlier_rate 0.04
zero_rate 0.4
Fig 131.
Distribution of additives_n. Vertical dash marks the median.
Histogram bins for additives_n (median: 1.0).
bin count
0 – 1.143 29
1.143 – 2.286 11
2.286 – 3.429 3
3.429 – 4.571 3
4.571 – 5.714 2
5.714 – 6.857 1
6.857 – 8 1
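
The n_outliers and outlier_rate figures for additives_n can be reproduced with the 1.5 IQR rule that the report's outlier alerts refer to. The sketch below rebuilds the values from the histogram, which only works under the assumption that additives_n holds integer counts: zero_rate 0.4 gives 20 zeros, the remaining 9 rows of the first bin are ones, and the top bin's single row is the maximum, 8. Saturn's quantile method is not stated; `method="inclusive"` is an assumption, though for this data the default method yields the same quartiles.

```python
from statistics import mean, quantiles

# Values rebuilt from the histogram, assuming integer additive counts.
values = [0] * 20 + [1] * 9 + [2] * 11 + [3] * 3 + [4] * 3 + [5] * 2 + [6, 8]

# Sanity check on the reconstruction: the report gives mean 1.52.
reconstructed_mean = mean(values)

q1, _median, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Tukey fences: values beyond 1.5 * IQR from the quartiles are outliers.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < lower or v > upper]
outlier_rate = len(outliers) / len(values)
```

With q1 = 0 and q3 = 2 the fences are -3 and 5, so the two products with 5 additives sit exactly on the upper fence and are not flagged, while the products with 6 and 8 additives are.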

ingredients_text_nb categorical

Out[537]:

saturn.columns["ingredients_text_nb"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 132.
Top values for ingredients_text_nb.
Top values for ingredients_text_nb (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

ingredients_text_es categorical

Out[540]:

saturn.columns["ingredients_text_es"].stats

stat value
n 50
nulls 30 (60.0%)
unique 13
top_value (blank)
top_rate 0.4
cardinality 13
entropy 3.122
entropy_ratio 0.8437
alert: long_tail 12 singleton categories
alert: null_rate 60.0% null
Fig 133.
Top values for ingredients_text_es.
Top values for ingredients_text_es (13 unique shown, of 13 total).
value count share
(blank) 8 16.0%
Pasta de cacao, manteca de cacao, cacao magro en polvo, azúcar, vainilla.12.0%
Azúcar, Grasa vegetal de palmiste parcialmente hidrogenada, Leche en polvo, Almendras, Cacao desgrasado en polvo, suero lácteo en polvo, Emulgente (lecitina de soja), aroma (vainilla).12.0%
Crema de avellanas y cacao 40% (azúcar, manteca de palma, avellanas 13%, leche desnatada en polvo 8,7%, cacao desgrasado 7.4%, emulgentes (lecitinas (soja), vainillina), harina de trigo 32,5%, grasas vegetales (palma, palmiste), azúcar de caña 8,5% (trigo), lactosa, salvado de trigo, leche entera en polvo, extracto en polvo de malta de cebada y maíz, miel, gasificantes (difosfato disódico, carbonato ácido de sodio, carbonato ácido de amonio), cacao desgrasado, sal, almidón de trigo, harina de cebada, malteada, emulsionantes (lecitinas (soja), vainillina.12.0%
70% pasta de cacao*, azúcar, rnanteca de cacao, cacao desgrasado en polvo, emulgente: lecitlna de girasol (E-322), aroma natural de vainilla. *Pasta de cacao Ralnforest Alliance Certified cocoa. Cacao: 74% mínimo.12.0%
Harina de _TRIGO_, grasa de palma, extracto de malta de _CEBADA_, gasificantes (carbonatos de amonio, carbonatos de sodio), sal, _HUEVO_, aroma, agente de tratamiento de la harina (_METABISULFITO_ sódico).12.0%
Pasta de cacao, azúcar, manteca de cacao, vainilla.12.0%
Pasta de cacao, azúcar, manteca de cacao, emulgente: lecitina de girasol (E-322), extracto de vainilla. Cacao: 70% mínimo.12.0%
Copos de avena integral (60%),azúcar, aceite refinado de girasol, miel (3%), sal, melaza de caña, emulgente (lecitina de girasol), gasificante (carbonato ácido de sodio),12.0%
Pasta de cacao, cacao magro, manteca de cacao, azúcar moreno de caña12.0%
Zucker, Kakaobutter, Magermilchpulver, Kakaomasse, Molkenpulver (Milch), Butterreinfett, Emulgator (Sojalecithin), Haselnusspaste, natürliches Aroma12.0%
pasta de cacao, azúcar, manteca de cacao, emulgente (lecitina de _soja_), vainilla. Cacao: 70% mínimo.12.0%
Pasta de cacao, cacao desgrasado en polvo, manteca de cacao, azúcar, leche en polvo, pasta de almendras y avellanas, emulgentes (lecitinas de soja, girasol), aroma12.0%

manufacturing_places_tags unknown

Out[543]:

saturn.columns["manufacturing_places_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

origin categorical

Out[545]:

saturn.columns["origin"].stats

stat value
n 50
nulls 3 (6.0%)
unique 6
top_value (blank)
top_rate 0.8936
cardinality 6
entropy 0.7359
entropy_ratio 0.2847
alert: long_tail 5 singleton categories
Fig 134.
Top values for origin.
Top values for origin (6 unique shown, of 6 total).
value count share
(blank) 42 84.0%
Fabriqué par: Aachen Allemagne 1 2.0%
Germe de blé origine ue. Sésame origine non-ue. 1 2.0%
France 1 2.0%
fabriqué en France.pommes origine UE. noisettes origine UE et non UE 1 2.0%
Fabriqué en France par Nutrition et Santé. Farine de blé: France. Figues : non UE 1 2.0%

origins_old categorical

Out[548]:

saturn.columns["origins_old"].stats

stat value
n 50
nulls 11 (22.0%)
unique 9
top_value (blank)
top_rate 0.7949
cardinality 9
entropy 1.347
entropy_ratio 0.4251
alert: long_tail 8 singleton categories
alert: null_rate 22.0% null
Fig 135.
Top values for origins_old.
Top values for origins_old (9 unique shown, of 9 total).
value count share
(blank) 31 62.0%
France 1 2.0%
Chambon-la-Forêt,France,Cairanne,Provence-Alpes-Côte d'Azur,Vaucluse,Italie,Source Sainte Cécile,Source Ofélia,Source Éléonore,Source Emma,Source Éléna 1 2.0%
United Kingdom 1 2.0%
biologique 1 2.0%
Morocco 1 2.0%
[KAKAO],Los Ríos (Provinz),Ecuador 1 2.0%
Farine de blé: France 1 2.0%
Afrique de l'Ouest,Amérique du Sud,Madagascar 1 2.0%

packaging_text_de categorical

Out[551]:

saturn.columns["packaging_text_de"].stats

stat value
n 50
nulls 30 (60.0%)
unique 2
top_value (blank)
top_rate 0.95
cardinality 2
entropy 0.2864
entropy_ratio 0.2864
alert: null_rate 60.0% null
Fig 136.
Top values for packaging_text_de.
Top values for packaging_text_de (2 unique shown, of 2 total).
value count share
(blank) 19 38.0%
1 Folie aus 22 PAP zum Recyclen 1 2.0%

languages unknown

Out[554]:

saturn.columns["languages"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

categories_old categorical

Out[556]:

saturn.columns["categories_old"].stats

stat value
n 50
nulls 1 (2.0%)
unique 45
top_value Snacks, Snacks sucrés, Biscuits et gâteaux, Biscuits, Biscuits secs
top_rate 0.04082
cardinality 45
entropy 5.451
entropy_ratio 0.9926
alert: long_tail 41 singleton categories
Fig 137.
Top values for categories_old.
Top values for categories_old (20 unique shown, of 45 total).
value count share
Snacks, Snacks sucrés, Biscuits et gâteaux, Biscuits, Biscuits secs24.0%
Snacks, Snacks sucrés, Biscuits et gâteaux, Biscuits24.0%
Aliments et boissons à base de végétaux, Aliments d'origine végétale, Snacks, Céréales et pommes de terre, Pains, Tartines craquantes extrudées, Pains croustillants24.0%
Snacks, Sweet snacks, Cocoa and its products, Chocolates, Dark chocolates24.0%
Dairies, Fermented foods, Fermented milk products, Cheeses, Cream cheeses, fr:Fromages-frais-sucres, en:yogurts12.0%
Snacks, Snacks sucrés, Cacao et dérivés12.0%
Przekąski, Słodkie przekąski, Kakao i produkty na bazie kakao, Czekolada, Czekolada deserowa, Czekolada gorzka12.0%
Закуски, Сладки закуски, Какаови изделия, Шоколади, Тъмен шоколад12.0%
Boissons et préparations de boissons, Boissons, Eaux, Eaux de sources, Boissons sans sucre ajouté12.0%
Snacks, Snacks sucrés, Confiseries, Succédanés du chocolat, en:Vegecaos12.0%
Snacks, Sweet snacks, Cocoa and its products, Confectioneries, Chocolates, Dark chocolates12.0%
Snacks, Snacks sucrés, Biscuits et gâteaux, Biscuits, en:Biscuits et gâteaux, en:Snacks sucrés12.0%
Snacks, Snacks sucrés, Biscuits et gâteaux, Biscuits, Biscuits sablés, Sablés à la noix de coco12.0%
Botanas,Snacks dulces,Galletas y pasteles,Galletas,Galletas rellenas12.0%
Produits laitiers, Produits fermentés, Produits laitiers fermentés, Snacks, Fromages, Snacks sucrés, Cacao et dérivés, Chocolats, Chocolats noirs, Chocolats noirs en tablette, Chocolat noir en tablette extra dégustation à 70% de cacao minimum12.0%
Aliments et boissons à base de végétaux, Aliments d'origine végétale, Snacks, Céréales et pommes de terre, Snacks salés, Amuse-gueules, Chips et frites, Chips, Chips de pommes de terre, Chips de pommes de terre à l'huile de tournesol, en:Aliments d'origine végétale, en:Aliments et boissons à base de végétaux, en:Amuse-gueules, en:Chips, en:Chips de pommes de terre, en:Chips de pommes de terre classiques, en:Chips de pommes de terre à l'huile de tournesol, en:Chips et frites, en:Céréales et pommes de terre, en:Snacks salés12.0%
Snacks, Snacks sucrés, Cacao et dérivés, Chocolats, Chocolats noirs, Chocolats noirs en tablette12.0%
Snacks,Sweet snacks,Biscuits and cakes,Biscuits,Chocolate biscuits,Filled biscuits,Dark chocolate biscuits12.0%
Snacks, Sweet snacks, Cocoa and its products, Chocolates, Dark chocolates, Cacao-et-derives, Chocolats, Chocolats-noirs, Chocolats-noirs-extra-fin12.0%
Aliments et boissons à base de végétaux,Aliments d'origine végétale,Céréales et pommes de terre,Pains,Pains croustillants12.0%

ingredients_from_palm_oil_tags unknown

Out[559]:

saturn.columns["ingredients_from_palm_oil_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

minerals_prev_tags unknown

Out[561]:

saturn.columns["minerals_prev_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

origin_fi categorical other

This column holds the Finnish-language origin text (the _fi suffix is a language code, as in ingredients_text_fi) and is almost entirely empty: 45 of 50 rows (90%) are null. All 5 non-null values are empty strings, so the column carries no usable information: cardinality is 1, entropy is 0, and the only value is a blank.

Treatment: Drop this column; every one of the 50 rows is either null or an empty string.

anthropic:default · confidence high
Out[563]:

saturn.columns["origin_fi"].stats

stat value
n 50
nulls 45 (90.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 90.0% null
alert: imbalance top value is 100.0% of rows
Fig 138.
Top values for origin_fi.
Top values for origin_fi (1 unique shown, of 1 total).
value count share
(blank) 5 10.0%
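
The null rate alone (90%) understates how empty origin_fi is, because the remaining rows hold empty strings. A minimal sketch of a check that counts both, using only the standard library; the helper name effective_empty_rate and the stand-in list are illustrative and not part of Saturn.

```python
def effective_empty_rate(values):
    """Share of rows that are None or whitespace-only strings."""
    if not values:
        return 0.0
    empty = sum(
        1 for v in values
        if v is None or (isinstance(v, str) and v.strip() == "")
    )
    return empty / len(values)

# Stand-in for origin_fi built from the reported counts: 45 nulls, 5 empty strings.
origin_fi = [None] * 45 + [""] * 5
null_rate = sum(v is None for v in origin_fi) / len(origin_fi)
print(null_rate, effective_empty_rate(origin_fi))
```

The same check applies to other columns in this report whose top value is blank.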

packaging_old categorical

Out[566]:

saturn.columns["packaging_old"].stats

stat value
n 50
nulls 7 (14.0%)
unique 40
top_value Plastique
top_rate 0.06977
cardinality 40
entropy 5.269
entropy_ratio 0.9901
alert: long_tail 38 singleton categories
Fig 139.
Top values for packaging_old.
Top values for packaging_old (20 unique shown, of 40 total).
value count share
Plastique 3 6.0%
(blank) 2 4.0%
Paquet, Etui en carton, Film en plastique 1 2.0%
Cardboard, Container, Packaging, Paperboard, Aluminium wrap, Caja de cartón, Box cardboard, Card-box, Foil-wrapper, pt:Papel de aluminio 1 2.0%
Sachet, Carton, Paquet, 20 biscuits en 4 sachets 1 2.0%
Cardboard, Non-corrugated cardboard, Produkt, fr:FSC mixte, sl:PAP 1 2.0%
fr:Point vert,fr:Triman,fr:Bouteille et bouchon 100% recyclable,fr:PET,en:Bottle 1 2.0%
Métal, Papier, en:Recyclable Metals, Aluminium 1 2.0%
Plastic, Envelope, Mixed plastic-packet 1 2.0%
Papier, Enveloppe, en:Package paper, en:Paper recycling 1 2.0%
Métal, en:Recyclable Metals, Aluminium, Carton, Emballage carton 1 2.0%
Sachet, Sous atmosphère protectrice, en:mixed plastic-packet 1 2.0%
Paper, Film 1 2.0%
fr:emballage carton, fr:papier aluminium 1 2.0%
Film en plastique, Film plastique à jeter, Étui carton à recycler 1 2.0%
fr:Plastique,fr:Sachet plastique de 3g,en:mixed plastic-packet 1 2.0%
Papier, Enveloppe 1 2.0%
Papier 1 2.0%
Plastic 1 2.0%
Container, Caja de cartón, Aluminium-wrapper, Card-carton, pt:Papel de aluminio 1 2.0%

ingredients_text_fi categorical

Out[569]:

saturn.columns["ingredients_text_fi"].stats

stat value
n 50
nulls 45 (90.0%)
unique 4
top_value (blank)
top_rate 0.4
cardinality 4
entropy 1.922
entropy_ratio 0.961
alert: long_tail 3 singleton categories
alert: null_rate 90.0% null
Fig 140.
Top values for ingredients_text_fi.
Top values for ingredients_text_fi (4 unique shown, of 4 total).
value count share
(blank) 2 4.0%
kaakaomassa, kaakaovoi, vähärasvainen kaakaojauhe, sokeri, vanilja. Saattaa sisältää hasselpähkinää, muita pähkinöitä, maitoa, soijaa. Tummassa suklaassa kaakaota vähintään 90%. 1 2.0%
kaakaomassa, vähärasvainen kaakaojauhe, kaakaovoi, sokeri, emulgointiaine (_soijalesitiini_), vaniljauute. Suklaassa kaakaota vähintään 85 %. Saattaa sisältää pieniä määriä pähkinää ja maitoa. 1 2.0%
_VEHNÄJAUHO_, palmuöljy, tärkkelyssiirappi, _OHRAMALLASUUTE_, nostatusaineet ammoniumkarbonaatit, natriumkarbonaatit), suola, _KANANMUNAT_, aromi, jauhonparanne (_NATRIUMDISULFIITTI_). 1 2.0%

product_type categorical

Out[572]:

saturn.columns["product_type"].stats

stat value
n 50
nulls 0 (0.0%)
unique 1
top_value food
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 141.
Top values for product_type.
Top values for product_type (1 unique shown, of 1 total).
value count share
food 50 100.0%

ingredients_hierarchy unknown

Out[575]:

saturn.columns["ingredients_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

removed_countries_tags unknown

Out[577]:

saturn.columns["removed_countries_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

unknown_nutrients_tags unknown

Out[579]:

saturn.columns["unknown_nutrients_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

no_nutrition_data categorical

Out[581]:

saturn.columns["no_nutrition_data"].stats

stat value
n 50
nulls 2 (4.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 142.
Top values for no_nutrition_data.
Top values for no_nutrition_data (1 unique shown, of 1 total).
value count share
(blank) 48 96.0%

ingredients_analysis unknown

Out[584]:

saturn.columns["ingredients_analysis"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

packagings_materials unknown

Out[586]:

saturn.columns["packagings_materials"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

serving_quantity_unit categorical

Out[588]:

saturn.columns["serving_quantity_unit"].stats

stat value
n 50
nulls 4 (8.0%)
unique 2
top_value g
top_rate 0.9783
cardinality 2
entropy 0.1511
entropy_ratio 0.1511
alert: imbalance top value is 97.8% of rows
Fig 143.
Top values for serving_quantity_unit.
Top values for serving_quantity_unit (2 unique shown, of 2 total).
value count share
g 45 90.0%
ml 1 2.0%

product_name categorical

Out[591]:

saturn.columns["product_name"].stats

stat value
n 50
nulls 0 (0.0%)
unique 49
top_value Henry’s
top_rate 0.04
cardinality 49
entropy 5.604
entropy_ratio 0.9981
alert: long_tail 48 singleton categories
Fig 144.
Top values for product_name.
Top values for product_name (20 unique shown, of 49 total).
value count share
Henry’s 2 4.0%
Perly 1 2.0%
Prince Goût Chocolat 1 2.0%
Excellence Noir Prodigieux 90% Cacao 1 2.0%
Tonik 1 2.0%
Sésame 1 2.0%
Шоколад 85% какаова маса 1 2.0%
CRISTALINE Eau De Source 0.5L 1 2.0%
(blank) 1 2.0%
Organic 70% Dark Chocolate Bar 1 2.0%
KING COOKIES 1 2.0%
Sable coco Henry s 42g 1 2.0%
Biscuits croquants au coeur onctueux de Nutella® 1 2.0%
Tartine croustillante Authentique 1 2.0%
Excellence Noir Intense 70% Cacao 1 2.0%
Lightly sea salted crisps 1 2.0%
Dark chocolate 1 2.0%
Excellence Noir Puissant 85% Cacao 1 2.0%
Fourrés Chocolat Noir 1 2.0%
Extra dark 74% Cocoa 1 2.0%

id categorical

Out[594]:

saturn.columns["id"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value 6111242100992
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 145.
Top values for id.
Top values for id (20 unique shown, of 50 total).
value count share
6111242100992 1 2.0%
7622210449283 1 2.0%
3046920029759 1 2.0%
6111031005064 1 2.0%
3175680011480 1 2.0%
20995553 1 2.0%
3268840001008 1 2.0%
3362600011044 1 2.0%
8425197712024 1 2.0%
7622210578464 1 2.0%
6111259343108 1 2.0%
3362600011228 1 2.0%
8000500310427 1 2.0%
7300400481595 1 2.0%
3046920022651 1 2.0%
5060042641000 1 2.0%
7622210584724 1 2.0%
3046920022606 1 2.0%
3229820100234 1 2.0%
20022464 1 2.0%

ingredients_text_with_allergens_nl categorical

Out[597]:

saturn.columns["ingredients_text_with_allergens_nl"].stats

stat value
n 50
nulls 39 (78.0%)
unique 9
top_value (blank)
top_rate 0.2727
cardinality 9
entropy 3.027
entropy_ratio 0.955
alert: long_tail 8 singleton categories
alert: null_rate 78.0% null
Fig 146.
Top values for ingredients_text_with_allergens_nl.
Top values for ingredients_text_with_allergens_nl (9 unique shown, of 9 total).
value count share
(blank) 3 6.0%
Cacaomassa, cacaoboter, magere cacaopoeder, suiker. 1 2.0%
Aardappelen, zonnebloemolie, zeezout. 1 2.0%
Cacaomassa, magere cacao, cacaoboter, bruine suiker, vanille. Kan noten, melk, soja, sesamzaad en tarwe bevatten. 1 2.0%
Cacaomassa, suiker, cacaoboter, vanille. 1 2.0%
Cacaomassa, magere cacaopoeder, cacaoboter, bruine suiker. 1 2.0%
*Referentie inname van een gemiddelde volwassehe (8400 kJ/ 2000 ReJI), 16,7 g 46x4, www,snackmindful,com Milka www,milka,com ER Mondelez France SAS, 6 avenue Réaumur, CS 50014, 92142 Clamart Cedex, Service Consommateurs Nº Cristal:09,69,39,79,79 BE Mondelez Belgium, Stationsstraat 100, 2800 Mechelen, ND Mondelez Nederland, Verlengde Poolseweg 34, 4818 CL Breda, eu mondelezinternational,com e 100 g COCOA LIFE www,cocoalife,org 8 FR FRANCE ONLY 05 pp 3 045140 105502 1 2.0%
tarwebloem 47%, melkchocolade 29% (suiker, cacaomassa, cacaoboter, weipoeder (van melk), magere melkpoeder, plantaardige vetten (shea, palm in wisselende verhoudingen), melkvet, emulgatoren (sojalecithine, E476), lactose (van melk), aroma), plantaardige oliën (palm, kokos), suiker, suikerstroop, tarwezemelen, rijsmiddelen (natriumwaterstofcarbonaat, ammoniumwaterstofcarbonaat), zout, tarwekiemen, voedingszuur (citroenzuur) 1 2.0%
granen 98.3% (volkorentarwemeel 65.8%, roggebloem, tarwebloem 10.2%, rijstbloem, gemoute tarwebloem, tarwegriesmeel, boekweitbloem, gerstebloem), suiker, magere melkpoeder, zout, palmolie, tarwekiemen, emulgator (zonnebloemlecithine) 1 2.0%

categories categorical

Out[600]:

saturn.columns["categories"].stats

stat value
n 50
nulls 0 (0.0%)
unique 46
top_value Snacks,Snacks sucrés,Cacao et dérivés,Chocolats,Chocolats noirs,Chocolat noir en tablette extra dégustation à 70% de cacao minimum
top_rate 0.06
cardinality 46
entropy 5.469
entropy_ratio 0.9901
alert: long_tail 43 singleton categories
Fig 147.
Top values for categories.
Top values for categories (20 unique shown, of 46 total).
value count share
Snacks,Snacks sucrés,Cacao et dérivés,Chocolats,Chocolats noirs,Chocolat noir en tablette extra dégustation à 70% de cacao minimum 3 6.0%
Snacks,Snacks sucrés,Biscuits et gâteaux,Biscuits sucrés & biscuits apéritifs,Biscuits 2 4.0%
Snacks,Sweet snacks,Cocoa and its products,Chocolates,Dark chocolates 2 4.0%
Dairies,Fermented foods,Fermented milk products,Snacks,Desserts,Dairy desserts,Fermented dairy desserts,Plain fermented dairy desserts,Plain fermented dairy desserts with cream 1 2.0%
Snacks,Breakfasts,Sweet snacks,Biscuits and cakes,Biscuits and crackers,Sandwich biscuits 1 2.0%
Snacks,Snacks sucrés,Cacao et dérivés,Chocolats,Chocolats noirs,Chocolats noirs en tablette,Chocolats noirs extra fin 1 2.0%
Snacks sucrés,Biscuits et gâteaux,Gaufrettes fourrées au chocolat 1 2.0%
Boissons et préparations de boissons,Boissons,Snacks,Eaux,Eaux de sources 1 2.0%
Snacks,Snacks sucrés,Biscuits et gâteaux,Biscuits 1 2.0%
Snacks,Sweet snacks,Cocoa and its products,Confectioneries,Chocolates,Compound chocolates,Food 1 2.0%
Snacks,Sweet snacks,Biscuits and cakes,Biscuits and crackers,Biscuits,Drop cookies 1 2.0%
Snacks,Snacks sucrés,Biscuits et gâteaux,Biscuits,Biscuits sablés,Sablés à la noix de coco 1 2.0%
Botanas,Snacks dulces,Galletas y pasteles,en:Biscuits and crackers,Galletas,en:Biscuits/Cookies (Shelf Stable),fr:Biscoitos recheados 1 2.0%
Aliments d'origine végétale,Snacks,Céréales et pommes de terre,Pains,Pains croustillants,Petit-déjeuners 1 2.0%
Produits fermentés,Snacks,Snacks sucrés,Cacao et dérivés,Chocolats,Chocolats noirs,Chocolats noirs en tablette,Chocolat noir en tablette extra dégustation à 70% de cacao minimum 1 2.0%
Plant-based foods and beverages,Plant-based foods,Snacks,Cereals and potatoes,Salty snacks,Appetizers,Chips and fries,Crisps,Potato crisps,Potato crisps in sunflower oil,fr:Chips de pommes de terre classiques 1 2.0%
Snacks,Snacks sucrés,Cacao et dérivés,Confiseries,Confiseries chocolatées,Chocolats,Chocolats noirs 1 2.0%
Snacks,Snacks sucrés,Cacao et dérivés,Chocolats,Chocolats noirs,Chocolats noirs en tablette 1 2.0%
Snacks, Sweet snacks, Biscuits and cakes, Biscuits and crackers, Biscuits, Chocolate biscuits, Filled biscuits, Dark chocolate biscuits, Sandwich biscuits 1 2.0%
Snacks,Sweet snacks,Cocoa and its products,Chocolates,Dark chocolates,Extra fine dark chocolates,Cacao-et-derives 1 2.0%

nutrition_grades_tags unknown

Out[603]:

saturn.columns["nutrition_grades_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

nutriscore_2023_tags unknown

Out[605]:

saturn.columns["nutriscore_2023_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

origin_ja categorical

Out[607]:

saturn.columns["origin_ja"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 148.
Top values for origin_ja.
Top values for origin_ja (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

nutrition_score_debug categorical

Out[610]:

saturn.columns["nutrition_score_debug"].stats

stat value
n 50
nulls 0 (0.0%)
unique 2
top_value (blank)
top_rate 0.98
cardinality 2
entropy 0.1414
entropy_ratio 0.1414
alert: imbalance top value is 98.0% of rows
Fig 149.
Top values for nutrition_score_debug.
Top values for nutrition_score_debug (2 unique shown, of 2 total).
value count share
(blank) 49 98.0%
missing saturated-fat_100g - missing sugars_100g - missing sodium_100g 1 2.0%

teams categorical

Out[613]:

saturn.columns["teams"].stats

stat value
n 50
nulls 4 (8.0%)
unique 39
top_value pain-au-chocolat
top_rate 0.1087
cardinality 39
entropy 5.124
entropy_ratio 0.9695
alert: long_tail 36 singleton categories
Fig 150.
Top values for teams.
Top values for teams (20 unique shown, of 39 total).
value count share
pain-au-chocolat 5 10.0%
stakano,chocolatine 3 6.0%
swipe-studio,pain-au-chocolat 2 4.0%
stakano,chocolatine,la-robe-est-bleue 1 2.0%
pain-au-chocolat,shark-attack,chocolatine,la-robe-est-bleue,stakano,dietreflux,m,b,c,swipe-studio,gmlaa,heathy-app-cross-eat,specialtiz 1 2.0%
stakano,chocolatine,swipe-studio,pain-au-chocolat 1 2.0%
chocolatine,la-robe-est-bleue,scaneco,feat,stakano,specialtiz 1 2.0%
gmlaa,pain-au-chocolat 1 2.0%
stakano,chocolatine,scaneco,gmlaa,pain-au-chocolat 1 2.0%
houda,chocolatine,la-robe-est-bleue,stakano 1 2.0%
pain-au-chocolat,specialtiz,gmlaa 1 2.0%
stakano,chocolatine,pain-au-chocolat 1 2.0%
chocolatine,la-robe-est-bleue,pain-au-chocolat,stakano 1 2.0%
pain-au-chocolat,shark-attack,swipe-studio,stakano,chocolatine,italy,feat 1 2.0%
chocolatine,la-robe-est-bleue,pain-au-chocolat,shark-attack,feat 1 2.0%
vendredi,pain-au-chocolat,stakano,chocolatine,gmlaa,italy 1 2.0%
swipe-studio,pain-au-chocolat,chocolatine,la-robe-est-bleue,gmlaa 1 2.0%
pain-au-chocolat,chocolatine,la-robe-est-bleue,vegan,specialtiz 1 2.0%
chocolatine,la-robe-est-bleue,pain-au-chocolat,feat,stakano 1 2.0%
swipe-studio,feat,bodysupport,pain-au-chocolat 1 2.0%

unknown_ingredients_n numeric

Out[616]:

saturn.columns["unknown_ingredients_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 6
min 0
max 13
mean 0.66
median 0
std 2.255
q1 0
q3 0
iqr 0
skew 4.236
kurtosis 18.32
n_outliers 8
outlier_rate 0.16
zero_rate 0.84
alert: high_skew skew=+4.24
alert: outliers 16.0% rows beyond 1.5 IQR
Fig 151.
Distribution of unknown_ingredients_n. Vertical dash marks the median.
Histogram bins for unknown_ingredients_n (median: 0.0).
bin count
0 – 1.857 46
1.857 – 3.714 1
3.714 – 5.571 1
5.571 – 7.429 0
7.429 – 9.286 1
9.286 – 11.14 0
11.14 – 13 1
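
Because q1 and q3 are both 0, the IQR is 0 and the 1.5 IQR fences collapse to [0, 0], so every non-zero row counts as an outlier; that is why outlier_rate (0.16) equals 1 minus zero_rate (0.84). A minimal sketch of that rule, assuming Tukey fences and the default quartile method of Python's statistics module (the report does not say which quartile method Saturn uses). The sample values are hypothetical, chosen only to be consistent with the reported zero_rate and max; the real rows are not shown in the report.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values outside [q1 - k*iqr, q3 + k*iqr]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sample: 42 zeros (zero_rate 0.84) and 8 non-zero values, max 13.
sample = [0] * 42 + [1] * 4 + [3, 5, 8, 13]
outliers = tukey_outliers(sample)
print(len(outliers), len(outliers) / len(sample))
```

For zero-inflated counts like this one, the outlier alert mostly restates the zero rate, so the high_skew alert is the more informative of the two.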

url categorical

Out[619]:

saturn.columns["url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://world.openfoodfacts.org/product/6111242100992/perly
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 152.
Top values for url.
Top values for url (20 unique shown, of 50 total).
value count share
https://world.openfoodfacts.org/product/6111242100992/perly 1 2.0%
https://world.openfoodfacts.org/product/7622210449283/prince-gout-chocolat-lu 1 2.0%
https://world.openfoodfacts.org/product/3046920029759/edelbitter-schokolade-lindt 1 2.0%
https://world.openfoodfacts.org/product/6111031005064/tonik-%D8%B9%D8%B1%D8%A8%D9%8A 1 2.0%
https://world.openfoodfacts.org/product/3175680011480/gerble-sesame-cookie-230g-8-2oz 1 2.0%
https://world.openfoodfacts.org/product/20995553/chocolat-noir-85-cacao-j-d-gross 1 2.0%
https://world.openfoodfacts.org/product/3268840001008/hhhhh-cristaline 1 2.0%
https://world.openfoodfacts.org/product/3362600011044/henry-s 1 2.0%
https://world.openfoodfacts.org/product/8425197712024/compound-chocolate-with-milk-and-almonds-maruja 1 2.0%
https://world.openfoodfacts.org/product/7622210578464/organic-70-dark-chocolate-bar-green-black-s 1 2.0%
https://world.openfoodfacts.org/product/6111259343108/king-cookies-excelo 1 2.0%
https://world.openfoodfacts.org/product/3362600011228/sable-coco-henry-s-42g 1 2.0%
https://world.openfoodfacts.org/product/8000500310427/biscuits-nutella 1 2.0%
https://world.openfoodfacts.org/product/7300400481595/authentique-wasa 1 2.0%
https://world.openfoodfacts.org/product/3046920022651/excellence-noir-intense-70-cacao-lindt 1 2.0%
https://world.openfoodfacts.org/product/5060042641000/tyrell-s-lightly-sea-salted-tyrrell-s 1 2.0%
https://world.openfoodfacts.org/product/7622210584724/intense-dark-chocolate-green-and-black 1 2.0%
https://world.openfoodfacts.org/product/3046920022606/excellence-85-cacao-chocolat-noir-puissant-lindt-lindt 1 2.0%
https://world.openfoodfacts.org/product/3229820100234/filled-dark-chocolate-bjorg 1 2.0%
https://world.openfoodfacts.org/product/20022464/extra-dark-74-cocoa-fin-carre 1 2.0%

data_quality_completeness_tags unknown

Out[622]:

saturn.columns["data_quality_completeness_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ecoscore_data unknown

Out[624]:

saturn.columns["ecoscore_data"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

generic_name_pl categorical

Out[626]:

saturn.columns["generic_name_pl"].stats

stat value
n 50
nulls 45 (90.0%)
unique 2
top_value (blank)
top_rate 0.8
cardinality 2
entropy 0.7219
entropy_ratio 0.7219
alert: null_rate 90.0% null
Fig 153.
Top values for generic_name_pl.
Top values for generic_name_pl (2 unique shown, of 2 total).
value count share
(blank) 4 8.0%
Wyśmienita czkolada gorzka 70% kakao 1 2.0%

nutrition_data categorical

Out[629]:

saturn.columns["nutrition_data"].stats

stat value
n 50
nulls 1 (2.0%)
unique 1
top_value on
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 154.
Top values for nutrition_data.
Top values for nutrition_data (1 unique shown, of 1 total).
value count share
on 49 98.0%

generic_name_ja categorical

Out[632]:

saturn.columns["generic_name_ja"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 155.
Top values for generic_name_ja.
Top values for generic_name_ja (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

nutriments unknown

Out[635]:

saturn.columns["nutriments"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

last_image_dates_tags unknown

Out[637]:

saturn.columns["last_image_dates_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

brands categorical

Out[639]:

saturn.columns["brands"].stats

stat value
n 50
nulls 0 (0.0%)
unique 41
top_value Lindt
top_rate 0.08
cardinality 41
entropy 5.214
entropy_ratio 0.9731
alert: long_tail 36 singleton categories
Fig 156.
Top values for brands.
Top values for brands (20 unique shown, of 41 total).
value count share
Lindt 4 8.0%
Gerblé 3 6.0%
Excelo 3 6.0%
Henry's 2 4.0%
Pringles 2 4.0%
Perly 1 2.0%
LU 1 2.0%
عربي 1 2.0%
J. D. Gross 1 2.0%
Cristaline 1 2.0%
Maruja 1 2.0%
Green & Black's 1 2.0%
Nutella 1 2.0%
wasa 1 2.0%
Tyrrell's 1 2.0%
Green and black 1 2.0%
Bjorg 1 2.0%
fin CARRÉ 1 2.0%
Wasa 1 2.0%
Henry’s 1 2.0%
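
Several of the 41 distinct brands differ only in case or apostrophe style (wasa and Wasa; Henry's and Henry’s), which inflates the cardinality and the long_tail count. A minimal sketch of one possible normalisation, assuming that case and typographic apostrophes carry no meaning for brand identity; normalize_brand is a hypothetical helper, not part of Saturn, and it does not attempt to merge wording variants such as Green & Black's and Green and black.

```python
import unicodedata
from collections import Counter

def normalize_brand(name):
    """Casefold, unify curly apostrophes, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", name)
    text = text.replace("\u2019", "'")  # right single quotation mark -> ASCII apostrophe
    return " ".join(text.casefold().split())

# Counts copied from the brands table above (spelling variants only).
brand_counts = {"Henry's": 2, "Henry\u2019s": 1, "wasa": 1, "Wasa": 1}
merged = Counter()
for brand, count in brand_counts.items():
    merged[normalize_brand(brand)] += count
print(dict(merged))
```

Under this normalisation Henry's would have 3 rows and Wasa 2, so the singleton count for brands would drop accordingly.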

minerals_tags unknown

Out[642]:

saturn.columns["minerals_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

nutrition_data_prepared_per categorical

Out[644]:

saturn.columns["nutrition_data_prepared_per"].stats

stat value
n 50
nulls 0 (0.0%)
unique 1
top_value 100g
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 157.
Top values for nutrition_data_prepared_per.
Top values for nutrition_data_prepared_per (1 unique shown, of 1 total).
value count share
100g 50 100.0%

popularity_tags unknown

Out[647]:

saturn.columns["popularity_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

packaging_text_es categorical

Out[649]:

saturn.columns["packaging_text_es"].stats

stat value
n 50
nulls 30 (60.0%)
unique 2
top_value (blank)
top_rate 0.95
cardinality 2
entropy 0.2864
entropy_ratio 0.2864
alert: null_rate 60.0% null
Fig 158.
Top values for packaging_text_es.
Top values for packaging_text_es (2 unique shown, of 2 total).
value count share
(blank) 19 38.0%
1 caja de cartón para reciclar, 1 bandeja de plástico para reciclar 1 2.0%

manufacturing_places categorical

Out[652]:

saturn.columns["manufacturing_places"].stats

stat value
n 50
nulls 1 (2.0%)
unique 20
top_value (blank)
top_rate 0.4082
cardinality 20
entropy 3.187
entropy_ratio 0.7374
alert: long_tail 16 singleton categories
Fig 159.
Top values for manufacturing_places.
Top values for manufacturing_places (20 unique shown, of 20 total).
value count share
(blank) 20 40.0%
France 9 18.0%
Maroc 2 4.0%
Espagne 2 4.0%
Aachen 1 2.0%
France,Italie 1 2.0%
Barilla Sverige AB,682 82,Filipstad,Zweden 1 2.0%
United Kingdom 1 2.0%
France,Oloron-sainte-marie 64400 1 2.0%
Übach-Palenberg,Heinsberg (Kreis),Köln (Regierungsbezirk),Nordrhein-Westfalen,Deutschland 1 2.0%
Barilla Deutschland GmbH,Wasastrasze 10,29229,Celle,Allemagne 1 2.0%
Biscuits 1 2.0%
maroc 1 2.0%
Peaugres 07340 1 2.0%
Tanger,Maroc 1 2.0%
Rausch Schokoladen GmbH,Peine (Landkreis),Niedersachsen,Deutschland 1 2.0%
Revel (31250),Annoray,France 1 2.0%
Allemagne 1 2.0%
85150,Vendée,France,Pays de la Loire,La Mothe Achard 1 2.0%
Belgique 1 2.0%

generic_name_nb categorical

Out[655]:

saturn.columns["generic_name_nb"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 160.
Top values for generic_name_nb.
Top values for generic_name_nb (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

last_modified_t numeric

Out[658]:

saturn.columns["last_modified_t"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
min 1.738e+09
max 1.769e+09
mean 1.763e+09
median 1.767e+09
std 8.093e+06
q1 1.762e+09
q3 1.768e+09
iqr 6.138e+06
skew -1.961
kurtosis 2.972
n_outliers 6
outlier_rate 0.12
zero_rate 0
alert: outliers 12.0% rows beyond 1.5 IQR
Fig 161.
Distribution of last_modified_t. Vertical dash marks the median.
Histogram bins for last_modified_t (median: 1766580948.5).
bin count
1.738e+09 – 1.742e+09 3
1.742e+09 – 1.747e+09 1
1.747e+09 – 1.751e+09 1
1.751e+09 – 1.755e+09 2
1.755e+09 – 1.76e+09 3
1.76e+09 – 1.764e+09 8
1.764e+09 – 1.769e+09 32
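
The statistics above are easier to read as dates. A minimal sketch, assuming (as the _t suffix and the magnitudes around 1.7e9 suggest) that last_modified_t is a Unix timestamp in seconds; only the median is converted because it is the one value the report gives at full precision. The helper name epoch_to_utc is illustrative.

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# Median reported in the histogram caption for last_modified_t.
median_modified = epoch_to_utc(1766580948.5)
print(median_modified.date().isoformat())
```

Under that assumption the median edit falls on 2025-12-24, so the negative skew and the 6 flagged outliers correspond to a handful of products last edited months before the rest.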

vitamins_tags unknown

Out[661]:

saturn.columns["vitamins_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

_id categorical

Out[663]:

saturn.columns["_id"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value 6111242100992
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 162.
Top values for _id.
Top values for _id (20 unique shown, of 50 total).
value count share
6111242100992 1 2.0%
7622210449283 1 2.0%
3046920029759 1 2.0%
6111031005064 1 2.0%
3175680011480 1 2.0%
20995553 1 2.0%
3268840001008 1 2.0%
3362600011044 1 2.0%
8425197712024 1 2.0%
7622210578464 1 2.0%
6111259343108 1 2.0%
3362600011228 1 2.0%
8000500310427 1 2.0%
7300400481595 1 2.0%
3046920022651 1 2.0%
5060042641000 1 2.0%
7622210584724 1 2.0%
3046920022606 1 2.0%
3229820100234 1 2.0%
20022464 1 2.0%

teams_tags unknown

Out[666]:

saturn.columns["teams_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

countries categorical

Out[668]:

saturn.columns["countries"].stats

stat value
n 50
nulls 0 (0.0%)
unique 43
top_value Maroc
top_rate 0.1
cardinality 43
entropy 5.252
entropy_ratio 0.9678
alert: long_tail 41 singleton categories
Fig 163.
Top values for countries.
Top values for countries (20 unique shown, of 43 total).
value count share
Maroc 5 10.0%
Morocco 4 8.0%
Morocco,United States 1 2.0%
Algeria,Belgium,France,French Polynesia,Germany,Guadeloupe,Hungary,Luxembourg,Martinique,Morocco,New Caledonia,Réunion,Spain,Switzerland,United States 1 2.0%
Algérie,Autriche,Belgique,Bulgarie,Canada,République tchèque,Finlande,France,Polynésie française,Allemagne,Irlande,Italie,Maurice,Maroc,Pays-Bas,Norvège,La Réunion,Roumanie,Singapour,Espagne,Suède,Suisse,Tunisie,Royaume-Uni 1 2.0%
Belgium, Bulgaria, France, en:switzerland 1 2.0%
Austria,Belgium,Bulgaria,Estonia,Finland,France,Germany,Italy,Lithuania,Slovakia,Slovenia,Spain,United Kingdom 1 2.0%
Belgique,Côte d'Ivoire,France,Allemagne,Luxembourg,Mali,Martinique,Russie,Suisse,Royaume-Uni 1 2.0%
Algeria,Cameroon,France,Morocco,Spain 1 2.0%
France,Irlande,Suède,Royaume-Uni 1 2.0%
Francia,Alemania,Italia,Marruecos,Portugal,Rumania,España,Suiza 1 2.0%
France, Italy, Spain, Switzerland, en:reunion 1 2.0%
Algérie,Belgique,République tchèque,France,Allemagne,Guadeloupe,Italie,Maroc,La Réunion,Espagne,Suisse 1 2.0%
France,Germany,Spain,United Kingdom 1 2.0%
Belgium, France, United Kingdom, en:ireland 1 2.0%
Autriche,Belgique,France,Allemagne,Italie,Maroc,Pays-Bas,La Réunion,Espagne,Suisse 1 2.0%
France,Luxembourg,Switzerland 1 2.0%
Belgium,Bulgaria,Czech Republic,Finland,Germany,Netherlands,Poland,Spain 1 2.0%
Belgique,France,Guadeloupe,Italie,La Réunion,Espagne,Suisse 1 2.0%
Österreich,Belgien,Dänemark,Estland,Finnland,Frankreich,Deutschland,Italien,Luxemburg,Malta,Marokko,Niederlande,Portugal,Spanien,Schweden,Schweiz 1 2.0%

pnns_groups_2 categorical

Out[671]:

saturn.columns["pnns_groups_2"].stats

stat value
n 50
nulls 0 (0.0%)
unique 11
top_value Biscuits and cakes
top_rate 0.34
cardinality 11
entropy 2.599
entropy_ratio 0.7513
Fig 164.
Top values for pnns_groups_2.
Top values for pnns_groups_2 (11 unique shown, of 11 total).
value count share
Biscuits and cakes 17 34.0%
Chocolate products 16 32.0%
Appetizers 4 8.0%
Pastries 3 6.0%
Bread 2 4.0%
unknown 2 4.0%
Sweets 2 4.0%
Dairy desserts 1 2.0%
Waters and flavored waters 1 2.0%
Cereals 1 2.0%
Dried fruits 1 2.0%

states_hierarchy unknown

Out[674]:

saturn.columns["states_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

code categorical

Out[676]:

saturn.columns["code"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value 6111242100992
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 165.
Top values for code.
Top values for code (20 unique shown, of 50 total).
value count share
6111242100992 1 2.0%
7622210449283 1 2.0%
3046920029759 1 2.0%
6111031005064 1 2.0%
3175680011480 1 2.0%
20995553 1 2.0%
3268840001008 1 2.0%
3362600011044 1 2.0%
8425197712024 1 2.0%
7622210578464 1 2.0%
6111259343108 1 2.0%
3362600011228 1 2.0%
8000500310427 1 2.0%
7300400481595 1 2.0%
3046920022651 1 2.0%
5060042641000 1 2.0%
7622210584724 1 2.0%
3046920022606 1 2.0%
3229820100234 1 2.0%
20022464 1 2.0%

countries_lc categorical

Out[679]:

saturn.columns["countries_lc"].stats

stat value
n 50
nulls 1 (2.0%)
unique 6
top_value en
top_rate 0.5714
cardinality 6
entropy 1.521
entropy_ratio 0.5883
Fig 166.
Top values for countries_lc.
Top values for countries_lc (6 unique shown, of 6 total).
value count share
en 28 56.0%
fr 16 32.0%
es 2 4.0%
de 1 2.0%
it 1 2.0%
pl 1 2.0%

stores_tags unknown

Out[682]:

saturn.columns["stores_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

generic_name_de categorical

Out[684]:

saturn.columns["generic_name_de"].stats

stat value
n 50
nulls 30 (60.0%)
unique 9
top_value (blank)
top_rate 0.6
cardinality 9
entropy 2.171
entropy_ratio 0.6849
alert: long_tail 8 singleton categories
alert: null_rate 60.0% null
Fig 167.
Top values for generic_name_de.
Top values for generic_name_de (9 unique shown, of 9 total).
value count share
(blank) 12 24.0%
Edelbitterschokolade 90% Kakao 1 2.0%
Kekse mit Nuss-Nougat-Creme-Füllung 1 2.0%
Extra feine dunkle Schokolade 1 2.0%
Edelbitter-Schokolade 74% Kakao 1 2.0%
Kräcker 1 2.0%
Edel-Bitter-Schokolade. Ecuador 70% Kakao 1 2.0%
Nuss-Nugat-Creme 1 2.0%
Alpenmilch-Schokolade 1 2.0%

ingredients_n_tags unknown

Out[687]:

saturn.columns["ingredients_n_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

allergens categorical

Out[689]:

saturn.columns["allergens"].stats

stat value
n 50
nulls 0 (0.0%)
unique 16
top_value (blank)
top_rate 0.32
cardinality 16
entropy 3.364
entropy_ratio 0.8411
Fig 168.
Top values for allergens.
Top values for allergens (16 unique shown, of 16 total).
value count share
(blank) 16 32.0%
en:soybeans 5 10.0%
en:gluten 5 10.0%
en:gluten,en:milk,en:soybeans 4 8.0%
en:milk,en:nuts,en:soybeans 4 8.0%
en:gluten,en:milk 3 6.0%
en:eggs,en:gluten,en:milk,en:soybeans 2 4.0%
en:milk 2 4.0%
en:eggs,en:gluten,en:milk 2 4.0%
en:banana,en:milk 1 2.0%
en:gluten,en:milk,en:nuts,en:soybeans 1 2.0%
en:gluten,en:sesame-seeds 1 2.0%
en:eggs,en:gluten,en:sulphur-dioxide-and-sulphites 1 2.0%
en:gluten,en:nuts 1 2.0%
en:eggs,en:gluten 1 2.0%
en:nuts,en:sulphur-dioxide-and-sulphites 1 2.0%
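
Each allergens value is a comma-separated list of tags, so the cardinality of 16 counts tag combinations rather than allergens. A minimal sketch of per-tag counts, built from the (value, count) pairs in the table above, which lists all 16 values; the helper name tag_counts is illustrative and not part of Saturn.

```python
from collections import Counter

# (allergens value, row count) pairs copied from the table above.
rows = [
    ("", 16),
    ("en:soybeans", 5),
    ("en:gluten", 5),
    ("en:gluten,en:milk,en:soybeans", 4),
    ("en:milk,en:nuts,en:soybeans", 4),
    ("en:gluten,en:milk", 3),
    ("en:eggs,en:gluten,en:milk,en:soybeans", 2),
    ("en:milk", 2),
    ("en:eggs,en:gluten,en:milk", 2),
    ("en:banana,en:milk", 1),
    ("en:gluten,en:milk,en:nuts,en:soybeans", 1),
    ("en:gluten,en:sesame-seeds", 1),
    ("en:eggs,en:gluten,en:sulphur-dioxide-and-sulphites", 1),
    ("en:gluten,en:nuts", 1),
    ("en:eggs,en:gluten", 1),
    ("en:nuts,en:sulphur-dioxide-and-sulphites", 1),
]

def tag_counts(pairs):
    """Count how many rows mention each tag, skipping blank values."""
    counts = Counter()
    for value, n in pairs:
        for tag in filter(None, (t.strip() for t in value.split(","))):
            counts[tag] += n
    return counts

counts = tag_counts(rows)
for tag, n in counts.most_common():
    print(tag, n)
```

Counted this way, gluten appears in 21 of the 50 products, milk in 19, and soybeans in 16, which the combination-level table does not show directly.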

allergens_lc categorical

Out[692]:

saturn.columns["allergens_lc"].stats

stat value
n 50
nulls 2 (4.0%)
unique 6
top_value en
top_rate 0.4583
cardinality 6
entropy 1.578
entropy_ratio 0.6104
Fig 169.
Top values for allergens_lc.
Top values for allergens_lc (6 unique shown, of 6 total).
value count share
en 22 44.0%
fr 21 42.0%
es 2 4.0%
de 1 2.0%
it 1 2.0%
pl 1 2.0%

ingredients_text_en categorical

Out[695]:

saturn.columns["ingredients_text_en"].stats

statvalue
n50
nulls6 (12.0%)
unique36
top_value
top_rate 0.2045
cardinality 36
entropy 4.811
entropy_ratio 0.9306
alert: long_tail 35 singleton categories
Fig 170.
Top values for ingredients_text_en.
Top values for ingredients_text_en (20 unique shown, of 36 total).
value | count | share
(empty string) | 9 | 18.0%
milk cream, cream, sugar, banana, bacteria12.0%
WHEAT flour 35%, whole WHEAT flour 15.7%, sugar, vegetable oils (palm, rapeseed), low-fat cocoa powder 4.5%, glucose syrup, WHEAT starch, raising agents (ammonium bicarbonate, sodium bicarbonate, disodium diphosphate), emulsifiers (SOY lecithin, sunflower lecithin), salt, skimmed MILK powder, lactose and MILK proteins, flavors, MAY CONTAIN EGG.12.0%
cocoa mass, cocoa butter, fat reduced cocoa, sugar, vanilla12.0%
Wheat flour, brown cane sugar, rapeseed oil, toasted sesame 10.6%, wheat germ 5.4%, whole wheat flour 5.4%, natural flavor, magnesium, emulsifier: lecithins, raising agents (potassium tartrates, sodium carbonates, ammonium carbonates), sea salt, wheat starch, vitamins (E, PP, B6, B1, B9).12.0%
cocoa mass, low-fat cocoa powder, cocoa butter, sugar, emulsifier: lecithin (soy), vanilla extract, may contain traces of nuts and milk,12.0%
Hhhhh12.0%
sugar, cocoa butter, whole milk powder, cocoa mass, almonds, emulsifier (soya lecithin), flavoring12.0%
cocoa mass #, cane sugar #, cocoa butter #, vanilla extract #, may contain nuts, milk,12.0%
wholemeal rye flour (77 g*), rye flour (28 g*), yeast, salt, may contain traces of milk and sesame seeds, *in g per 100 g of product,12.0%
cocoa paste, sugar, cocoa butter, vanilla,12.0%
Potatoes, sunflower oil, sea salt. May contain Milk.12.0%
cocoa mass, cocoa butter, fat-reduced cocoa powder, cane sugar, vanilla extract12.0%
Pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille bourbon naturelle en gousse.12.0%
_Wheat_ flour 39%, dark chocolate 25% (cocoa mass, cane sugar, cocoa butter), unrefined brown cane sugar, wholemeal _wheat_ flour 15%, oleic sunflower oil, natural vanilla flavouring, skimmed _milk_ powder, sea salt, raising agents: ammonium carbonates, sodium carbonates, thickener: acacia gum, antioxidant: rosemary extract.12.0%
cocoa mass, sugar, cocoa butter, fat reduced cocoa powder, emulsifier: lecithins (soya), natural vanilla flavouring, dark chocolate contains: cocoa solids 74% minimum,12.0%
whole rye flour (57 g), wheat bran (27 g), oatmeal (13 g), sesame seeds (7.9 g), wheat germ, salt.12.0%
wheat flour, palm oil, glucose syrup, barley malt extract, raising agents (ammonium carbonates, sodium carbonates), salt, eggs , flavouring, flour treatment agent (sodium metabisulfite ),12.0%
cocoa mass, sugar, cocoa butter, vanilla,12.0%
Farine de maïs* (70%), farine de riz*, sel marin. * K issus de l'agriculture biologique. • sans sucres ajoutés(¹) (contient des sucres naturellement présents.12.0%

misc_tags unknown

Out[698]:

saturn.columns["misc_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

photographers_tags unknown

Out[700]:

saturn.columns["photographers_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

packaging_materials_tags unknown

Out[702]:

saturn.columns["packaging_materials_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

product_name_nl categorical

Out[704]:

saturn.columns["product_name_nl"].stats

stat value
n 50
nulls 38 (76.0%)
unique 7
top_value (empty string)
top_rate 0.5
cardinality 7
entropy 2.292
entropy_ratio 0.8166
alert: long_tail 6 singleton categories
alert: null_rate 76.0% null
Fig 171.
Top values for product_name_nl.
Top values for product_name_nl (7 unique shown, of 7 total).
value | count | share
(empty string) | 6 | 12.0%
Excellence 70% Cocoa Intense Dark | 1 | 2.0%
Tartines craquantes multi-céréales | 1 | 2.0%
Dark absolute | 1 | 2.0%
Nuts & Fruits Mix | 1 | 2.0%
Granola | 1 | 2.0%
Volkoren cracotte | 1 | 2.0%

nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients numeric

Out[707]:

saturn.columns["nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients"].stats

stat value
n 50
nulls 4 (8.0%)
unique 1
min 1
max 1
mean 1
median 1
std 0
q1 1
q3 1
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 0
alert: constant only one distinct value
Fig 172.
Distribution of nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients. Vertical dash marks the median.
Histogram bins for nutrition_score_warning_fruits_vegetables_legumes_estimate_from_ingredients (median: 1.0).
bin | count
0.5 – 0.6667 | 0
0.6667 – 0.8333 | 0
0.8333 – 1 | 0
1 – 1.167 | 46
1.167 – 1.333 | 0
1.333 – 1.5 | 0

product_name_sv categorical

Out[710]:

saturn.columns["product_name_sv"].stats

stat value
n 50
nulls 46 (92.0%)
unique 4
top_value 90% Cocoa
top_rate 0.25
cardinality 4
entropy 2
entropy_ratio 1
alert: long_tail 4 singleton categories
alert: null_rate 92.0% null
Fig 173.
Top values for product_name_sv.
Top values for product_name_sv (4 unique shown, of 4 total).
value | count | share
90% Cocoa | 1 | 2.0%
Arriba 85% Cacao Dark Chocolate | 1 | 2.0%
Dark 70% | 1 | 2.0%
Original | 1 | 2.0%

food_groups_tags unknown

Out[713]:

saturn.columns["food_groups_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

completeness numeric

Out[715]:

saturn.columns["completeness"].stats

stat value
n 50
nulls 0 (0.0%)
unique 14
min 0.575
max 1.1
mean 0.91
median 0.9
std 0.1358
q1 0.8875
q3 1
iqr 0.1125
skew -0.6678
kurtosis 0.32
n_outliers 6
outlier_rate 0.12
zero_rate 0
alert: outliers 12.0% rows beyond 1.5 IQR
Fig 174.
Distribution of completeness. Vertical dash marks the median.
Histogram bins for completeness (median: 0.9).
bin | count
0.575 – 0.65 | 3
0.65 – 0.725 | 3
0.725 – 0.8 | 2
0.8 – 0.875 | 2
0.875 – 0.95 | 22
0.95 – 1.025 | 9
1.025 – 1.1 | 9

pnns_groups_1_tags unknown

Out[718]:

saturn.columns["pnns_groups_1_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_with_specified_percent_n numeric

Out[720]:

saturn.columns["ingredients_with_specified_percent_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 7
min 0
max 8
mean 1.1
median 0
std 1.729
q1 0
q3 2
iqr 2
skew 1.878
kurtosis 3.676
n_outliers 1
outlier_rate 0.02
zero_rate 0.58
Fig 175.
Distribution of ingredients_with_specified_percent_n. Vertical dash marks the median.
Histogram bins for ingredients_with_specified_percent_n (median: 0.0).
bin | count
0 – 1.143 | 36
1.143 – 2.286 | 5
2.286 – 3.429 | 3
3.429 – 4.571 | 4
4.571 – 5.714 | 1
5.714 – 6.857 | 0
6.857 – 8 | 1

origin_nl categorical other

This column ('origin_nl') is a categorical field, likely a Dutch-language origin label or description, but it is effectively empty: 76% of the 50 rows are null, and the sole non-null value present is an empty string (''), appearing 12 times. With cardinality of 1, zero entropy, and a top_rate of 1.0 across only 12 non-null rows, the column carries no information whatsoever.

Treatment: Drop this column; it contains no usable signal (38 nulls plus 12 empty strings account for all 50 rows).

anthropic:default · confidence high
Out[723]:

saturn.columns["origin_nl"].stats

stat value
n 50
nulls 38 (76.0%)
unique 1
top_value (empty string)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 76.0% null
alert: imbalance top value is 100.0% of rows
Fig 176.
Top values for origin_nl.
Top values for origin_nl (1 unique shown, of 1 total).
value | count | share
(empty string) | 12 | 24.0%
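
The drop recommendation for origin_nl rests on every value being either null or an empty string. A minimal sketch of that check over a list of row dictionaries follows; the helper names and the three sample rows are hypothetical illustrations, not saturn functions or actual rows from the sample.

```python
def is_effectively_empty(values):
    """True when every value is None or a whitespace-only string."""
    return all(
        v is None or (isinstance(v, str) and v.strip() == "")
        for v in values
    )


def effectively_empty_columns(rows):
    """Return the sorted names of columns holding no usable value."""
    columns = {key for row in rows for key in row}
    return sorted(
        col
        for col in columns
        if is_effectively_empty([row.get(col) for row in rows])
    )


# Made-up rows shaped like the profiled data.
rows = [
    {"origin_nl": None, "nutriscore_grade": "e"},
    {"origin_nl": "", "nutriscore_grade": "d"},
    {"origin_nl": None, "nutriscore_grade": "e"},
]
empty_cols = effectively_empty_columns(rows)
print(empty_cols)
```

Treating empty strings the same as nulls matters here: a null-rate check alone would report origin_nl as 76% null and miss that the remaining 24% carry nothing either.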

fruits-vegetables-nuts_100g_estimate numeric

Out[726]:

saturn.columns["fruits-vegetables-nuts_100g_estimate"].stats

stat value
n 50
nulls 23 (46.0%)
unique 2
min 0
max 85
mean 3.148
median 0
std 16.36
q1 0
q3 0
iqr 0
skew 4.903
kurtosis 22.04
n_outliers 1
outlier_rate 0.03704
zero_rate 0.963
alert: null_rate 46.0% null
alert: high_skew skew=+4.90
Fig 177.
Distribution of fruits-vegetables-nuts_100g_estimate. Vertical dash marks the median.
Histogram bins for fruits-vegetables-nuts_100g_estimate (median: 0.0).
bin | count
0 – 17 | 26
17 – 34 | 0
34 – 51 | 0
51 – 68 | 0
68 – 85 | 1

brands_old categorical

Out[729]:

saturn.columns["brands_old"].stats

stat value
n 50
nulls 16 (32.0%)
unique 29
top_value Gerblé
top_rate 0.08824
cardinality 29
entropy 4.749
entropy_ratio 0.9776
alert: long_tail 26 singleton categories
alert: null_rate 32.0% null
Fig 178.
Top values for brands_old.
Top values for brands_old (20 unique shown, of 29 total).
value | count | share
Gerblé | 3 | 6.0%
Lindt | 3 | 6.0%
Green & Black's | 2 | 4.0%
LuMondelez | 1 | 2.0%
Lindt & sprüngli (nordic) | 1 | 2.0%
J.D. Gross | 1 | 2.0%
Cristaline | 1 | 2.0%
Maruja | 1 | 2.0%
Wasa,Barilla | 1 | 2.0%
Tyrrell's | 1 | 2.0%
Bjorg | 1 | 2.0%
Fin Carré | 1 | 2.0%
Wasa | 1 | 2.0%
Le pain des Fleurs,Ekibio | 1 | 2.0%
Aperitivos company | 1 | 2.0%
Lidl,J.D. Gross | 1 | 2.0%
Nutella,Ferrero | 1 | 2.0%
Pringles | 1 | 2.0%
Nature Valley | 1 | 2.0%
Lindt,ลินด์ | 1 | 2.0%

generic_name_fr categorical

Out[732]:

saturn.columns["generic_name_fr"].stats

stat value
n 50
nulls 3 (6.0%)
unique 34
top_value (empty string)
top_rate 0.2979
cardinality 34
entropy 4.42
entropy_ratio 0.8689
alert: long_tail 33 singleton categories
Fig 179.
Top values for generic_name_fr.
Top values for generic_name_fr (20 unique shown, of 34 total).
value | count | share
(empty string) | 14 | 28.0%
Perly fromage frais | 1 | 2.0%
BISCUITS FOURRÉS (35%) PARFUM CHOCOLAT | 1 | 2.0%
Chocolat noir extra-fin traditionnel à 90% de cacao | 1 | 2.0%
Biscuits au sésame | 1 | 2.0%
Chocolat noir, 85% de cacao | 1 | 2.0%
Eau de source | 1 | 2.0%
Succédané de chocolat au lait avec amandes | 1 | 2.0%
Sablé coco | 1 | 2.0%
Biscuit fourré à la pâte à tartiner aux noisettes et au cacao Nutella® | 1 | 2.0%
Pain croustillant a la farine de seigle | 1 | 2.0%
Chocolat noir extra-fin traditionnel | 1 | 2.0%
Chips de pommes de terre légèrement salées au sel de mer | 1 | 2.0%
Chocolat noir extra fin, traditionnel | 1 | 2.0%
goûters fourrés au chocolat noir | 1 | 2.0%
Edelbitter-Schokolade 74% Kakao | 1 | 2.0%
Pain croustillant à la farine complète de seigle, avoine et sésame. | 1 | 2.0%
Crackers | 1 | 2.0%
Chocolat noir extra-fin | 1 | 2.0%
Biscuits aux pommes et aux noisettes, très pauvres en sel, riches en vitamines B1, B2, B9 et E et source de vitamines PP et B6 | 1 | 2.0%

ingredients unknown

Out[735]:

saturn.columns["ingredients"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

countries_tags unknown

Out[737]:

saturn.columns["countries_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_original_tags unknown

Out[739]:

saturn.columns["ingredients_original_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_de categorical

Out[741]:

saturn.columns["ingredients_text_de"].stats

stat value
n 50
nulls 30 (60.0%)
unique 16
top_value (empty string)
top_rate 0.25
cardinality 16
entropy 3.741
entropy_ratio 0.9354
alert: long_tail 15 singleton categories
alert: null_rate 60.0% null
Fig 180.
Top values for ingredients_text_de.
Top values for ingredients_text_de (16 unique shown, of 16 total).
value | count | share
(empty string) | 5 | 10.0%
Kakaomasse, Kakaobutter, fettarmes Kakaopulver, Zucker, Vanille12.0%
Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Zucker, Emulgator: Lecithine (Soja); Vanilleextrakt.12.0%
Nuss-Nugat-Creme 40 % (Zucker, Palmöl, _HASELNÜSSE_ 13 %, _MAGERMILCHPULVER_ 8.7%, fettarmer Kakao 7,4 %, Emulgator Lecithine (_SOJA_), Vanillin), _WEIZENMEHL_ (32,5 %), pflanzliche Fette (Palm, Palmkern), Rohrzucker 8,5 % (enthält _WEIZEN_), _MILCHZUCKER_, _WEIZENKLEIE_, _VOLLMILCHPULVER_, _GERSTENMALZ_ - und Maisextraktpulver, Honig, Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, _WEIZENSTÄRKE_, _GERSTENMALZMEHL_, Emulgator Lecithine (_SOJA_), Vanillin12.0%
Kakaomasse, Zucker, Kakaobutter, Vanille12.0%
Kartoffeln, Sonnenblumenöl, Meersalz.12.0%
Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker, Vanille. Kann Schalenfrüchte, Milch, Soja, Sesamsamen und Weizen enthalten.12.0%
kakaomass of*, zucker, kakaobutter, kakaopulver stark entöit, emulgator: sonnenblumenlecithine (e-322), natürliche in vanille-aroma, * rainforest alliance certified, cocoa: 74% mindestens,12.0%
_WEIZENMEHL_, Palmöl, Glukosesirup, _GERSTENMALZEXTRAKT_, Backtriebmittel (Ammoniumcarbonate, Natriumcarbonate), Speisesalz 1,4 %, _EIER_, Aroma, Mehlbehandlungsmittel (_NATRIUMMETABISULFIT_).12.0%
Kakaomasse, Zucker, Kakaobutter, Emulgator: Lecithine (_Soja_); Vanilleextrakt.12.0%
Kartoffelpüreepulver, pflanzliche Öle (Sonnenblume, Palm, Mais) in veränderlichen Gewichtsanteilen, Weizenmehl, Maismehl, Reismehl, Maltodextrin, Emulgator (E471), Salz, Farbstoff (Annatto Norbixin).12.0%
Kakaomasse, fettarmes Kakaopulver, Kakaobutter . Kann Schalenfrüchte, Milch und Soja enthalten.12.0%
Alpenmilch Schokolade. Zutaten: Zucker, Kakaobutter, Magermilchpulver, Kakaomasse, Süßmolkenpulver (aus Milch), Butterreinfett, Haselnüsse, Emulgatoren (Sojalecithin, E476), Aroma. Kakao: 30 % mindestens. Kann andere Nüsse und Weizen enthalten. Ohne Farbstoffe** und Konservierungsstoffe** -**Gemäß rechtlicher Vorschriften.12.0%
Kakaomasse¹, Rohrzucker¹, Kakaobutter¹, Emulgator: Lecithine (_Soja_)¹. ¹aus kontrolliert ökologischem Anbau.12.0%
25% _Walnusskerne_, 25% _Mandeln_, 25% Sultaninen geschwefelt (Sultaninen, Sonnenblumenöl, Konservierungsstoff: _Schwefeldioxid_), 25% Cranberries (Cranberries, Zucker, Sonnenblumenöl).12.0%
Kakaomasse, Zucker, Kakaobutter, Emulgator (Sojalecithin), Vanille. Kann Haselnüsse, Mandeln, Milch enthalten.12.0%

nutriscore_grade categorical

Out[744]:

saturn.columns["nutriscore_grade"].stats

stat value
n 50
nulls 0 (0.0%)
unique 6
top_value e
top_rate 0.54
cardinality 6
entropy 1.913
entropy_ratio 0.7399
Fig 181.
Top values for nutriscore_grade.
Top values for nutriscore_grade (6 unique shown, of 6 total).
value | count | share
e | 27 | 54.0%
d | 9 | 18.0%
c | 7 | 14.0%
a | 4 | 8.0%
b | 2 | 4.0%
unknown | 1 | 2.0%
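
The entropy (1.913) and entropy_ratio (0.7399) reported for nutriscore_grade can be reproduced from the counts above if entropy is Shannon entropy in bits and entropy_ratio divides it by log2(cardinality). That reading is an inference from the reported numbers rather than a documented saturn formula, and the function names below are hypothetical.

```python
import math


def shannon_entropy_bits(counts):
    """Shannon entropy in bits of a categorical count vector."""
    total = sum(counts)
    return -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )


def entropy_ratio(counts):
    """Entropy divided by its maximum, log2(number of categories).

    Assumed normalisation; returns 0.0 for a single category.
    """
    k = sum(1 for c in counts if c > 0)
    if k <= 1:
        return 0.0
    return shannon_entropy_bits(counts) / math.log2(k)


# Counts for grades e, d, c, a, b, unknown from the table above.
nutriscore_counts = [27, 9, 7, 4, 2, 1]
entropy = shannon_entropy_bits(nutriscore_counts)
ratio = entropy_ratio(nutriscore_counts)
print(round(entropy, 3), round(ratio, 4))
```

Under the same assumption, a column of four singletons such as product_name_sv gives log2(4) = 2 bits and a ratio of 1, matching its reported stats.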

image_thumb_url categorical

Out[747]:

saturn.columns["image_thumb_url"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.100.jpg
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 182.
Top values for image_thumb_url.
Top values for image_thumb_url (20 unique shown, of 50 total).
value | count | share
https://images.openfoodfacts.org/images/products/611/124/210/0992/front_fr.172.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/762/221/044/9283/front_en.605.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/9759/front_en.492.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/611/103/100/5064/front_fr.56.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/317/568/001/1480/front_en.221.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/000/002/099/5553/front_en.314.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/326/884/000/1008/front_fr.422.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1044/front_fr.50.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/842/519/771/2024/front_en.60.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/762/221/057/8464/front_en.29.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/611/125/934/3108/front_fr.25.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/336/260/001/1228/front_fr.38.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/800/050/031/0427/front_fr.488.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/730/040/048/1595/front_fr.242.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2651/front_en.159.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/506/004/264/1000/front_en.179.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/762/221/058/4724/front_en.95.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/304/692/002/2606/front_en.102.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/322/982/010/0234/front_fr.246.100.jpg | 1 | 2.0%
https://images.openfoodfacts.org/images/products/000/002/002/2464/front_en.301.100.jpg | 1 | 2.0%

packaging_text_en categorical

Out[750]:

saturn.columns["packaging_text_en"].stats

stat value
n 50
nulls 7 (14.0%)
unique 5
top_value (empty string)
top_rate 0.907
cardinality 5
entropy 0.6325
entropy_ratio 0.2724
alert: long_tail 4 singleton categories
Fig 183.
Top values for packaging_text_en.
Top values for packaging_text_en (5 unique shown, of 5 total).
value | count | share
(empty string) | 39 | 78.0%
1 plastic bottle to recycle 1 plastic cap to recycle | 1 | 2.0%
1 cardboard sleeve recyclable, 1 sheet of aluminium recyclable | 1 | 2.0%
Terracycle. Please dispose of this pack responsibly. Find out more at www.terracycle.co.uk. | 1 | 2.0%
cardboard (to recycle) foil paper (to throw away) | 1 | 2.0%

packaging_text_it categorical

Out[753]:

saturn.columns["packaging_text_it"].stats

stat value
n 50
nulls 34 (68.0%)
unique 3
top_value (empty string)
top_rate 0.875
cardinality 3
entropy 0.6686
entropy_ratio 0.4218
alert: long_tail 2 singleton categories
alert: null_rate 68.0% null
Fig 184.
Top values for packaging_text_it.
Top values for packaging_text_it (3 unique shown, of 3 total).
value | count | share
(empty string) | 14 | 28.0%
Incarto esterno in carta da riciclare, Incarto interno in alluminio da riciclare. | 1 | 2.0%
1 tubo C/PAP 85 da indifferenziata, 1 sigillo C/PAP 84 da indifferenziata, 1 tappo di plastica PP5 da riciclare. | 1 | 2.0%

traces_tags unknown

Out[756]:

saturn.columns["traces_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

brands_tags unknown

Out[758]:

saturn.columns["brands_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

nutriscore_2021_tags unknown

Out[760]:

saturn.columns["nutriscore_2021_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

packaging_text categorical

Out[762]:

saturn.columns["packaging_text"].stats

stat value
n 50
nulls 2 (4.0%)
unique 13
top_value (empty string)
top_rate 0.75
cardinality 13
entropy 1.708
entropy_ratio 0.4614
alert: long_tail 12 singleton categories
Fig 185.
Top values for packaging_text.
Top values for packaging_text (13 unique shown, of 13 total).
value | count | share
(empty string) | 36 | 72.0%
1 film en plastique à recycler 1 étui en papier ondulé à recycler | 1 | 2.0%
carton, plastique | 1 | 2.0%
1 bouchon en plastique à trier 1 bouteille en plastique à trier | 1 | 2.0%
1 étui en carton à recycler 1 feuille en aluminium à recycler | 1 | 2.0%
1 sachet plastique à jeter | 1 | 2.0%
1 étui en carton  à recycler 1 feuille en aluminium à recycler | 1 | 2.0%
LE TRI +FACILE + BAC DE TRI | 1 | 2.0%
4 FILMS PLASTIQUE A JETER 1 ÉTUI CARTON À RECYCLER | 1 | 2.0%
cardboard (to recycle) foil paper (to throw away) | 1 | 2.0%
FR LE TRI + FACILE ÉTUI 8+ SACHETS BAC DE TRI A consommer de préférence avant le : en France par et Santé S.A.S. 10:02 11914538 112 eCastelnaudary REVEL 30 04 2024 | 1 | 2.0%
Sachet, clip à recycler | 1 | 2.0%
2 sachets en plastique à recycler 1 boîte en carton à recycler | 1 | 2.0%

popularity_key numeric identifier

This column appears to be a synthetic or encoded identifier rather than a true popularity metric: values cluster tightly in the 23.9–24.0 billion range, with a median of ~23,999,500,422 and a max of ~23,999,992,269, suggesting a fixed-prefix integer key scheme. The strong negative skew (−2.67) and high kurtosis (5.11) are driven by 5 outlier values that fall far below the cluster, near the minimum of ~22,999,500,355, about 1 billion lower than the bulk of records. Despite the name 'popularity_key', the distribution does not look like an organic popularity signal and is more likely a generated or composite key.

Treatment: Treat as an opaque identifier and do not use it as a numeric feature. Investigate the 5 outlier records (10% of rows) for data integrity issues before joining or filtering.

anthropic:default · confidence medium
Out[765]:

saturn.columns["popularity_key"].stats

stat value
n 50
nulls 0 (0.0%)
unique 49
min 2.3e+10
max 2.4e+10
mean 2.39e+10
median 2.4e+10
std 3.03e+08
q1 2.4e+10
q3 2.4e+10
iqr 4.002e+05
skew -2.667
kurtosis 5.111
n_outliers 5
outlier_rate 0.1
zero_rate 0
alert: high_skew skew=-2.67
alert: outliers 10.0% rows beyond 1.5 IQR
Fig 186.
Distribution of popularity_key. Vertical dash marks the median.
Histogram bins for popularity_key (median: 23999500422.0).
bin | count
2.3e+10 – 2.314e+10 | 5
2.314e+10 – 2.329e+10 | 0
2.329e+10 – 2.343e+10 | 0
2.343e+10 – 2.357e+10 | 0
2.357e+10 – 2.371e+10 | 0
2.371e+10 – 2.386e+10 | 0
2.386e+10 – 2.4e+10 | 45
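
The outliers alert for popularity_key refers to rows beyond 1.5 IQR. A minimal sketch of that rule follows, using the standard library's quartiles. The quantile method saturn uses is not stated in the report, so the "inclusive" method is an assumption, and the 50 values are synthetic stand-ins shaped like the histogram (45 clustered, 5 about one billion lower), not the real keys.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside [q1, q3].

    Quartiles use statistics.quantiles with method="inclusive",
    which is an assumed choice, not necessarily saturn's.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Synthetic data: a tight cluster plus five far-low values.
cluster = [23_999_500_000 + i * 1_000 for i in range(45)]
low_tail = [22_999_500_355] * 5
outliers = iqr_outliers(cluster + low_tail)
print(len(outliers), len(outliers) / 50)
```

Because the IQR of the cluster is tiny relative to the one-billion gap, the five low values are flagged regardless of which common quantile method is chosen.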

ingredients_text categorical

Out[768]:

saturn.columns["ingredients_text"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
top_value milk cream, cream, sugar, banana, bacteria
top_rate 0.02
cardinality 50
entropy 5.644
entropy_ratio 1
alert: long_tail 50 singleton categories
Fig 187.
Top values for ingredients_text.
Top values for ingredients_text (20 unique shown, of 50 total).
value | count | share
milk cream, cream, sugar, banana, bacteria | 1 | 2.0%
Céréale 50 % (Farine de blé 34,8 %, farine de blé complet 15,2 %), sucre, huiles végétales (palme, colza), cacao maigre en poudre 4,5 %, sirop de glucose, amidon de blé, poudres à lever (carbonates d'ammonium, carbonates de sodium), émulsifiant (lécithines de soja), sel, lait écrémé en poudre, perméat de lactosérum (de lait), arômes. Peut contenir œuf.12.0%
Pâte de cacao, beurre de cacao, cacao maigre, sucre, vanille.12.0%
Coffret fourré au cacao (41,6%) et à la vanille (208) - Ingrédients Farine de blé, sucre, huile végétale non hydrogénée (huile de palme), filtrat de lait, poudre de cacao Émulsifiant à faible teneur en cacao (322) Lécithine de soja) Agent levant (5000) Sucre artificiel (vanilline) Sel Contient du lait, du blé (gluten) du soja12.0%
Farine de blé 57%, sucre de canne roux, huile de colza, sésame toasté 10,6%, germe de blé 5,4%, farine complète de blé 5,4%, arôme naturel, magnésium, émulsifiant : lécithines, poudres à lever (tartrates de potassium, carbonates de sodium, carbonates d'ammonium), sel de mer, amidon de blé, vitamines (E, PP, B6, B1, B9).12.0%
Какаова маса, нискомаслено какао на прах, какаово масло, захар, емулгатор: лецитин (соеви), екстракт от ванилия, Може да съдържа следи от ядки и мляко,12.0%
Eau de source12.0%
Farine de froment, sucre, graisse végétale, sucre inverti, agents levants ( bicarbonate d'ammonium - bicarbonate de sodium), sel, arome.12.0%
sugar, cocoa butter, whole milk powder, cocoa mass, almonds, emulsifier (soya lecithin), flavoring12.0%
cocoa mass #, cane sugar #, cocoa butter #, vanilla extract #, may contain nuts, milk,12.0%
دقيقالقمح،رقائق الشوكولاته20%[عجينة زيت النخلة.الكاكاو،سكر،دكستروز و مستحلب12.0%
Farine de _froment_, sucre, graisse végétale, noix de coco râpée, poudre de _lait_, poudre de _lactosérum_, sucre inverti, agents levants (bicarbonate d'ammonium - bicarbonate de Sodium), sel, arômes.12.0%
Pâte à tartiner aux NOISETTES et au cacao 40% (sucre, huile de palme, NOISETTES 13%**, LAIT écrémé en poudre 8,7%**, cacao maigre 7,4%**, émulsifiants : lécithines [SOJA]; vanilline), farine de FROMENT 32,5%, graisses végétales (palme, palmiste), sucre de canne (contient BLE) 8,5%, LACTOSE, son de BLE, LAIT en poudre, miel, poudres à lever (diphosphate disodique, carbonate acide de sodium, carbonate acide d'ammonium), farine d'ORGE malté, cacao maigre en poudre, sel, extrait en poudre de malt d'ORGE et de maïs, amidon de FROMENT, émulsifiants: lécithines [SOJA]; vanilline.12.0%
Farine complète de SEIGLE (77 g*), farine de SEIGLE (28 g*), levure, sel. Peut contenir des traces de LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME et SOJA. *en g pour 100 g de produit.12.0%
Pâte de cacao, sucre, beurre de cacao, vanille. Peut contenir des fruits à coque, du lait, du soja et des graines de sésame.12.0%
Kartoffeln, Sonnenblumenöl, Meersalz.12.0%
pâte de cacao*, beurre de cacao*, cacao maigre en poudre*, sucre de canne*, extrait de vanille*, * ingrédients issus de l'agriculture biologique12.0%
Pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille12.0%
Farine de blé* 41%, Chocolat noir* 22% (pâte de cacao*, sucre de canne", beurre de cacao"), Sucre de canne* roux non raffiné, Farine complète de blé* 16%, Huile de tournesol oléique*, Arôme naturel de vanille, Lait écrémé en poudre, Sel de mer, carbonates d'ammonium, carbonates de sodium, gomme d'acacia*, extraits de romarin* Peut contenir du soja, des œufs, des fruits à coque, des graines de sésame et de la moutarde. *Ingrédients biologiques.12.0%
cocoa mass, sugar, cocoa butter, fat reduced cocoa powder, emulsifier: lecithins (soya), natural vanilla flavouring, dark chocolate contains: cocoa solids 74% minimum,12.0%

ingredients_text_with_allergens_fr categorical

Out[771]:

saturn.columns["ingredients_text_with_allergens_fr"].stats

stat value
n 50
nulls 2 (4.0%)
unique 47
top_value (empty string)
top_rate 0.04167
cardinality 47
entropy 5.543
entropy_ratio 0.998
alert: long_tail 46 singleton categories
Fig 188.
Top values for ingredients_text_with_allergens_fr.
Top values for ingredients_text_with_allergens_fr (20 unique shown, of 47 total).
value | count | share
(empty string) | 2 | 4.0%
Lait écrémé, crème, SUcre, ferments laciques12.0%
Céréale 50 % (Farine de blé 34,8 %, farine de blé complet 15,2 %), sucre, huiles végétales (palme, colza), cacao maigre en poudre 4,5 %, sirop de glucose, amidon de blé, poudres à lever (carbonates d'ammonium, carbonates de sodium), émulsifiant (lécithines de soja), sel, lait écrémé en poudre, perméat de lactosérum (de lait), arômes. Peut contenir œuf.12.0%
Pâte de cacao, beurre de cacao, cacao maigre, sucre, vanille.12.0%
Coffret fourré au cacao (41,6%) et à la vanille (208) - Ingrédients Farine de blé, sucre, huile végétale non hydrogénée (huile de palme), filtrat de lait, poudre de cacao Émulsifiant à faible teneur en cacao (322) Lécithine de soja) Agent levant (5000) Sucre artificiel (vanilline) Sel Contient du lait, du blé (gluten) du soja12.0%
Farine de blé 57%, sucre de canne roux, huile de colza, sésame toasté 10,6%, germe de blé 5,4%, farine complète de blé 5,4%, arôme naturel, magnésium, émulsifiant : lécithines, poudres à lever (tartrates de potassium, carbonates de sodium, carbonates d'ammonium), sel de mer, amidon de blé, vitamines (E, PP, B6, B1, B9).12.0%
Pâte de cacao, cacao maigre en poudre, beurre de cacao, sucre, émulsifiant : lécithines (soja) ; extrait de vanille. Traces éventuelles de fruits à coque et de lait.12.0%
Eau de source12.0%
Farine de froment, sucre, graisse végétale, sucre inverti, agents levants ( bicarbonate d'ammonium - bicarbonate de sodium), sel, arome.12.0%
Sucre, graisse vegetale de palmiste hidrogenée, Lait Enteir en poudre, Amandes, Cacao Dégraissé en poudre, lactoserum en poudre, Emulsifiant Lécithine de soja, Arômes (Vainilline).12.0%
دقيقالقمح،رقائق الشوكولاته20%[عجينة زيت النخلة.الكاكاو،سكر،دكستروز و مستحلب12.0%
Farine de froment, sucre, graisse végétale, noix de coco râpée, poudre de lait, poudre de lactosérum, sucre inverti, agents levants (bicarbonate d'ammonium - bicarbonate de Sodium), sel, arômes.12.0%
Pâte à tartiner aux NOISETTES et au cacao 40% (sucre, huile de palme, NOISETTES 13%**, LAIT écrémé en poudre 8,7%**, cacao maigre 7,4%**, émulsifiants : lécithines [SOJA]; vanilline), farine de FROMENT 32,5%, graisses végétales (palme, palmiste), sucre de canne (contient BLE) 8,5%, LACTOSE, son de BLE, LAIT en poudre, miel, poudres à lever (diphosphate disodique, carbonate acide de sodium, carbonate acide d'ammonium), farine d'ORGE malté, cacao maigre en poudre, sel, extrait en poudre de malt d'ORGE et de maïs, amidon de FROMENT, émulsifiants: lécithines [SOJA]; vanilline.12.0%
Farine complète de SEIGLE (77 g*), farine de SEIGLE (28 g*), levure, sel. Peut contenir des traces de LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME et SOJA. *en g pour 100 g de produit.12.0%
Pâte de cacao, sucre, beurre de cacao, vanille. Peut contenir des fruits à coque, du lait, du soja et des graines de sésame.12.0%
pâte de cacao*, beurre de cacao*, cacao maigre en poudre*, sucre de canne*, extrait de vanille*, * ingrédients issus de l'agriculture biologique12.0%
Pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille12.0%
Farine de blé* 41%, Chocolat noir* 22% (pâte de cacao*, sucre de canne", beurre de cacao"), Sucre de canne* roux non raffiné, Farine complète de blé* 16%, Huile de tournesol oléique*, Arôme naturel de vanille, Lait écrémé en poudre, Sel de mer, carbonates d'ammonium, carbonates de sodium, gomme d'acacia*, extraits de romarin* Peut contenir du soja, des œufs, des fruits à coque, des graines de sésame et de la moutarde. *Ingrédients biologiques.12.0%
Pâte de cacao, sucre, beurre de cacao, cacao maigre en poudre, émulsifiant : lécithines (soja), arôme naturel de vanille.12.0%
Farine complète de SEIGLE 59 g*, son de BLÉ 27 g*, flocons d'AVOINE 12 g*, GRAINES DE SÉSAME 7,0 g*, germe de BLÉ, sel. *en g pour 100 g de produit fini. Peut contenir des traces de LUPIN, LAIT, MOUTARDE et SOJA.12.0%

ingredients_text_nl categorical

Out[774]:

saturn.columns["ingredients_text_nl"].stats

stat value
n 50
nulls 38 (76.0%)
unique 9
top_value (empty string)
top_rate 0.3333
cardinality 9
entropy 2.918
entropy_ratio 0.9206
alert: long_tail 8 singleton categories
alert: null_rate 76.0% null
Fig 189.
Top values for ingredients_text_nl.
Top values for ingredients_text_nl (9 unique shown, of 9 total).
value | count | share
(empty string) | 4 | 8.0%
Cacaomassa, cacaoboter, magere cacaopoeder, suiker.12.0%
Aardappelen, zonnebloemolie, zeezout.12.0%
Cacaomassa, magere cacao, cacaoboter, bruine suiker, vanille. Kan noten, melk, soja, sesamzaad en tarwe bevatten.12.0%
Cacaomassa, suiker, cacaoboter, vanille.12.0%
Cacaomassa, magere cacaopoeder, cacaoboter, bruine suiker.12.0%
*Referentie inname van een gemiddelde volwassehe (8400 kJ/ 2000 ReJI), 16,7 g 46x4, www,snackmindful,com Milka www,milka,com ER Mondelez France SAS, 6 avenue Réaumur, CS 50014, 92142 Clamart Cedex, Service Consommateurs Nº Cristal:09,69,39,79,79 BE Mondelez Belgium, Stationsstraat 100, 2800 Mechelen, ND Mondelez Nederland, Verlengde Poolseweg 34, 4818 CL Breda, eu mondelezinternational,com e 100 g COCOA LIFE www,cocoalife,org 8 FR FRANCE ONLY 05 pp 3 045140 10550212.0%
_tarwebloem_ 47%, _melkchocolade_ 29% (suiker, cacaomassa, cacaoboter, weipoeder (van _melk_), magere _melkpoeder_, plantaardige vetten (shea, palm in wisselende verhoudingen), _melkvet_, emulgatoren (_sojalecithine_, E476), lactose (van _melk_), aroma), plantaardige oliën (palm, kokos), suiker, suikerstroop, _tarwezemelen_, rijsmiddelen (natriumwaterstofcarbonaat, ammoniumwaterstofcarbonaat), zout, _tarwekiemen_, voedingszuur (citroenzuur)12.0%
granen 98.3% (_volkorentarwemeel_ 65.8%, _roggebloem_, _tarwebloem_ 10.2%, rijstbloem, gemoute _tarwebloem_, _tarwegriesmeel_, boekweitbloem, _gerstebloem_), suiker, magere _melkpoeder_, zout, palmolie, _tarwekiemen_, emulgator (zonnebloemlecithine)12.0%

product_name_es categorical

Out[777]:

saturn.columns["product_name_es"].stats

stat value
n 50
nulls 30 (60.0%)
unique 17
top_value (empty string)
top_rate 0.2
cardinality 17
entropy 3.922
entropy_ratio 0.9595
alert: long_tail 16 singleton categories
alert: null_rate 60.0% null
Fig 190.
Top values for product_name_es.
Top values for product_name_es (17 unique shown, of 17 total).
value | count | share
(empty string) | 4 | 8.0%
Príncipe Galletas de Chocolate | 1 | 2.0%
Excellence chocolate 90% cacao | 1 | 2.0%
Chocolate negro 85% cacao | 1 | 2.0%
Nutella Biscuits | 1 | 2.0%
Biscotes integrales original | 1 | 2.0%
Excellence 85% cacao | 1 | 2.0%
Chocolate negro 74% cacao | 1 | 2.0%
Tostadas crujientes de fibra | 1 | 2.0%
Original | 1 | 2.0%
Excellence 70% Cocoa Intense Dark | 1 | 2.0%
Chocolate negro Ecuador 70% cacao | 1 | 2.0%
Nutella | 1 | 2.0%
Crunchy Oats & Honey | 1 | 2.0%
Excellence 99% Cacao Noir Absolu | 1 | 2.0%
Chocolate Con Leche Milka | 1 | 2.0%
Excellence suave 70% cacao | 1 | 2.0%

data_sources_tags unknown

Out[780]:

saturn.columns["data_sources_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

data_quality_bugs_tags unknown

Out[782]:

saturn.columns["data_quality_bugs_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

obsolete_since_date categorical

Out[784]:

saturn.columns["obsolete_since_date"].stats

stat value
n 50
nulls 6 (12.0%)
unique 1
top_value (empty string)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: imbalance top value is 100.0% of rows
Fig 191.
Top values for obsolete_since_date.
Top values for obsolete_since_date (1 unique shown, of 1 total).
value | count | share
(empty string) | 44 | 88.0%

weighers_tags unknown

Out[787]:

saturn.columns["weighers_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

ingredients_text_debug categorical

Out[789]:

saturn.columns["ingredients_text_debug"].stats

stat value
n 50
nulls 14 (28.0%)
unique 35
top_value (blank)
top_rate 0.05556
cardinality 35
entropy 5.114
entropy_ratio 0.9971
alert: long_tail 34 singleton categories
alert: null_rate 28.0% null
Fig 192.
Top values for ingredients_text_debug.
Top values for ingredients_text_debug (20 unique shown, of 35 total).
value count share
(blank) 2 4.0%
Lait écrémé, créme, sucre, ferments lactiques. matière grosse 3% , sa première date de publication au maroc 01/10/1993 le changement du packaging 10 ans par 10 ans depuis vingt-cinq ans de l’offre 1 2.0%
Céréale 50,7 % (farine de blé 35 %, farine de blé complète 15,7 %), sucre, huiles végétales (palme, colza), cacao maigre en poudre 4,5 %, sirop de glucose, amidon de blé, poudre à lever : (carbonate acide d'ammonium, carbonate acide de sodium, diphosphate disodique), émulsifiants : (lécithine de soja, lécithine de tournesol), sel, lait écrémé en poudre, lactose et protéines de lait, arômes. 1 2.0%
Pâte de cacao, beurre de cacao, cacao maige, sucre, vanille. Cacao: 90% minimum. 1 2.0%
Farine de blé 55,1%, sucre de canne roux, huile de colza 14,3%, sésame toasté 11,6%, germe de blé 5,2%, levain de seigle dévitalisé en poudre, fibres d'avoine, calcium, sel de mer, arôme naturel, magnésium, émulsifiant : lécithines de colza, poudres à lever : (tartrates de potassium, carbonates de sodium, carbonates d'ammonium), acidifiant : acide malique, protéines de lait, amidon de blé, vitamines B1, B6, B9, PP et E (lactose, protéines de lait). 1 2.0%
Eau de source 1 2.0%
Farine de froment sucre, graisse végétale ,sucre inverti, agents levants ( bicarbonate d'ammonium-bicarbonate de sodium, sel , arome. Contient du gluten Peut contenir traces de lait et soja. Conserver dans un endroit frais et sec 1 2.0%
Sucre, graisse végétale de palmiste hydrogénée, _Lait_ entier en poudre, Amandes, Cacao dégraissé en poudre, _lactosérum_ en poudre, Émulsifiant : Lécithine de _soja_, Arômes (Vanilline). 1 2.0%
Pâte à tartiner aux _noisettes_ et au cacao 40% (sucre, huile de palme, _noisettes_ 13%, _lait_ écrémé en poudre 8,7%, cacao maigre 7,4%, émulsifiants : lécithines _soja_ ; vanilline), farine de _froment_ 32%, graisses végétales (palme, palmiste), sucre de canne 9%, _lactose_, son de _blé_, _lait_ en poudre, extrait en poudre de malt d'orge et de maïs, miel, poudres à lever : (disphosfate disodique, carbonate acide d'ammonium, carbonate acide de sodium), cacao maigre, sel, amidon de _froment_, farine d'_orge_ malté, lécithines _soja_ ; vanilline. 1 2.0%
Farine complète de _seigle_, farine de _seigle_ 29%, levure, sel. 1 2.0%
Pâte de cacao, sucre, beurre de cacao, vanille. 1 2.0%
Pomme de terre, huile de tournesol, sel de mer. 1 2.0%
pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille. 1 2.0%
Céréales 54%(*farine de _blé_, *farine complète de _blé_ (15%)), *chocolat noir (25%) (*pâte de cacao, *sucre de canne non raffiné, *beurre de cacao), *sucre de canne roux non raffiné, *huile de tournesol oléique (9,7%), arôme naturel de vanille, *_lait_ écrémé en poudre, sel de mer non raffiné, poudres à lever : carbonates d'ammonium et de sodium, épaississant : *gomme d'acacia, antioxydant : , *extraits de romarin. 1 2.0%
Kakaomasse*, Zucker, Kakaobutter, Kakaopulver stark entöit, Emulgator: Sonnenblumenlecithine ( - e322 - ), natürliches Vanille-Aroma. * Rainforest Alliance Certified. Kakao: 74% mindestens. 1 2.0%
en gras peuvent provoquer yne réaction chez tes personnes souffrant d'allergies d'intolérahces alimentaires. en g pour 100g de produit. ou 4 1 2.0%
farine de froment, sucre, Graisse végétale , Sucre inverti, Agents levants (Bicarbonate d'ammonium, Bicarbonate de sodium), arôme vanille 1 2.0%
Farine de _Blé_ 73.5 %, matière grasse végétale,extrait de malt d'_orge_, sirop de glucose, sel, poudre à lever : (carbonate acide d’ammonium, carbonate acide de sodium), _œufs_, agent de traitement de la farine : (_sulfite_ de sodium_), arôme 1 2.0%
Pasta de cacao, azúcar, manteca de cacao, vainilla Bourbon natural. (Cacao: 70% mínimo) 1 2.0%
Farine de _blé_ 68,4%, huile de colza, sirop de sucres issu de fruits, jus concentré de pomme 5,3%, _noisettes_ torréfiées 5,3%, germe de _blé_ 5,2%, fibres de chicorée : fructo-oligosaccharides, extrait de malt d'_orge_, arôme naturel de pomme, émulsifiant : lécithines de colza, amidon de _blé_, poudres à lever : (tartrates de potassium, carbonates de potassium, carbonates d‘ammonium), protéines de _lait_, vitamines B1, B2, B6, B9, PP et E (_lactose_, protéines de _lait_). 1 2.0%

link categorical

Out[792]:

saturn.columns["link"].stats

stat value
n 50
nulls 2 (4.0%)
unique 28
top_value (blank)
top_rate 0.4375
cardinality 28
entropy 3.663
entropy_ratio 0.762
alert: long_tail 27 singleton categories
Fig 193.
Top values for link.
Top values for link (20 unique shown, of 28 total).
value count share
(blank) 21 42.0%
www.copag.ma 1 2.0%
https://www.lu.fr/prince 1 2.0%
http://www.lindt.es/swf/spa/productos/excellence/altos-porcentajes/excellence-90/www.lindt.com 1 2.0%
https://www.gerble.fr/vitalite/biscuit-sesame 1 2.0%
https://www.nutella.com/de/de/produkte/nutella-biscuits 1 2.0%
http://www.wasa.fr/produits/tartines-croustillantes/authentique/pack/ 1 2.0%
https://www.lindt.fr/excellence-noir-70 1 2.0%
https://www.tyrrellscrisps.co.uk/range/potato-crisps/lightly-sea-salted/ 1 2.0%
https://www.lindt.fr/excellence-noir-85 1 2.0%
www.bjorg.fr 1 2.0%
https://www.wasa.fr 1 2.0%
www.henrys.ma 1 2.0%
https://www.tuc.eu/produkte_de_at#tuc-prod-4 1 2.0%
http://www.lindt.es/swf/spa/productos/excellence/altos-porcentajes/excellence-70/ 1 2.0%
https://www.lepaindesfleurs.fr/la-marque 1 2.0%
https://www.gerble.fr/teneur-reduite/biscuit-pomme-noisette 1 2.0%
https://www.pringles.com/de/products/flavours/pringles-original-product.html 1 2.0%
http://www.lindt.ca/swf/fra/produits/excellence/barres/excellence-99-cacao/ 1 2.0%
www.nestledessert.fr 1 2.0%

created_t numeric

Out[795]:

saturn.columns["created_t"].stats

stat value
n 50
nulls 0 (0.0%)
unique 50
min 1.338e+09
max 1.724e+09
mean 1.483e+09
median 1.476e+09
std 1.043e+08
q1 1.386e+09
q3 1.555e+09
iqr 1.694e+08
skew 0.3311
kurtosis -0.8095
n_outliers 0
outlier_rate 0
zero_rate 0
Fig 194.
Distribution of created_t. Vertical dash marks the median.
Histogram bins for created_t (median: 1475927880.5).
bin count
1.338e+09 – 1.393e+09 13
1.393e+09 – 1.448e+09 8
1.448e+09 – 1.503e+09 8
1.503e+09 – 1.558e+09 9
1.558e+09 – 1.614e+09 7
1.614e+09 – 1.669e+09 3
1.669e+09 – 1.724e+09 2

ingredients_text_fr categorical

Out[798]:

saturn.columns["ingredients_text_fr"].stats

stat value
n 50
nulls 2 (4.0%)
unique 47
top_value (blank)
top_rate 0.04167
cardinality 47
entropy 5.543
entropy_ratio 0.998
alert: long_tail 46 singleton categories
Fig 195.
Top values for ingredients_text_fr.
Top values for ingredients_text_fr (20 unique shown, of 47 total).
value count share
(blank) 2 4.0%
Lait écrémé, crème, SUcre, ferments laciques 1 2.0%
Céréale 50 % (Farine de blé 34,8 %, farine de blé complet 15,2 %), sucre, huiles végétales (palme, colza), cacao maigre en poudre 4,5 %, sirop de glucose, amidon de blé, poudres à lever (carbonates d'ammonium, carbonates de sodium), émulsifiant (lécithines de soja), sel, lait écrémé en poudre, perméat de lactosérum (de lait), arômes. Peut contenir œuf. 1 2.0%
Pâte de cacao, beurre de cacao, cacao maigre, sucre, vanille. 1 2.0%
Coffret fourré au cacao (41,6%) et à la vanille (208) - Ingrédients Farine de blé, sucre, huile végétale non hydrogénée (huile de palme), filtrat de lait, poudre de cacao Émulsifiant à faible teneur en cacao (322) Lécithine de soja) Agent levant (5000) Sucre artificiel (vanilline) Sel Contient du lait, du blé (gluten) du soja 1 2.0%
Farine de blé 57%, sucre de canne roux, huile de colza, sésame toasté 10,6%, germe de blé 5,4%, farine complète de blé 5,4%, arôme naturel, magnésium, émulsifiant : lécithines, poudres à lever (tartrates de potassium, carbonates de sodium, carbonates d'ammonium), sel de mer, amidon de blé, vitamines (E, PP, B6, B1, B9). 1 2.0%
Pâte de cacao, cacao maigre en poudre, beurre de cacao, sucre, émulsifiant : lécithines (soja) ; extrait de vanille. Traces éventuelles de fruits à coque et de lait. 1 2.0%
Eau de source 1 2.0%
Farine de froment, sucre, graisse végétale, sucre inverti, agents levants ( bicarbonate d'ammonium - bicarbonate de sodium), sel, arome. 1 2.0%
Sucre, graisse vegetale de palmiste hidrogenée, Lait Enteir en poudre, Amandes, Cacao Dégraissé en poudre, lactoserum en poudre, Emulsifiant Lécithine de soja, Arômes (Vainilline). 1 2.0%
دقيقالقمح،رقائق الشوكولاته20%[عجينة زيت النخلة.الكاكاو،سكر،دكستروز و مستحلب 1 2.0%
Farine de _froment_, sucre, graisse végétale, noix de coco râpée, poudre de _lait_, poudre de _lactosérum_, sucre inverti, agents levants (bicarbonate d'ammonium - bicarbonate de Sodium), sel, arômes. 1 2.0%
Pâte à tartiner aux NOISETTES et au cacao 40% (sucre, huile de palme, NOISETTES 13%**, LAIT écrémé en poudre 8,7%**, cacao maigre 7,4%**, émulsifiants : lécithines [SOJA]; vanilline), farine de FROMENT 32,5%, graisses végétales (palme, palmiste), sucre de canne (contient BLE) 8,5%, LACTOSE, son de BLE, LAIT en poudre, miel, poudres à lever (diphosphate disodique, carbonate acide de sodium, carbonate acide d'ammonium), farine d'ORGE malté, cacao maigre en poudre, sel, extrait en poudre de malt d'ORGE et de maïs, amidon de FROMENT, émulsifiants: lécithines [SOJA]; vanilline. 1 2.0%
Farine complète de SEIGLE (77 g*), farine de SEIGLE (28 g*), levure, sel. Peut contenir des traces de LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME et SOJA. *en g pour 100 g de produit. 1 2.0%
Pâte de cacao, sucre, beurre de cacao, vanille. Peut contenir des fruits à coque, du lait, du soja et des graines de sésame. 1 2.0%
pâte de cacao*, beurre de cacao*, cacao maigre en poudre*, sucre de canne*, extrait de vanille*, * ingrédients issus de l'agriculture biologique 1 2.0%
Pâte de cacao, cacao maigre, beurre de cacao, cassonade, vanille 1 2.0%
Farine de blé* 41%, Chocolat noir* 22% (pâte de cacao*, sucre de canne", beurre de cacao"), Sucre de canne* roux non raffiné, Farine complète de blé* 16%, Huile de tournesol oléique*, Arôme naturel de vanille, Lait écrémé en poudre, Sel de mer, carbonates d'ammonium, carbonates de sodium, gomme d'acacia*, extraits de romarin* Peut contenir du soja, des œufs, des fruits à coque, des graines de sésame et de la moutarde. *Ingrédients biologiques. 1 2.0%
Pâte de cacao, sucre, beurre de cacao, cacao maigre en poudre, émulsifiant : lécithines (_soja_), arôme naturel de vanille. 1 2.0%
Farine complète de SEIGLE 59 g*, son de BLÉ 27 g*, flocons d'AVOINE 12 g*, GRAINES DE SÉSAME 7,0 g*, germe de BLÉ, sel. *en g pour 100 g de produit fini. Peut contenir des traces de LUPIN, LAIT, MOUTARDE et SOJA. 1 2.0%

labels_hierarchy unknown

Out[801]:

saturn.columns["labels_hierarchy"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

ingredients_non_nutritive_sweeteners_n numeric

Out[803]:

saturn.columns["ingredients_non_nutritive_sweeteners_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 1
min 0
max 0
mean 0
median 0
std 0
q1 0
q3 0
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 1
alert: constant only one distinct value
Fig 196.
Distribution of ingredients_non_nutritive_sweeteners_n. Vertical dash marks the median.
Histogram bins for ingredients_non_nutritive_sweeteners_n (median: 0.0).
bin count
-0.5 – -0.3571 0
-0.3571 – -0.2143 0
-0.2143 – -0.07143 0
-0.07143 – 0.07143 50
0.07143 – 0.2143 0
0.2143 – 0.3571 0
0.3571 – 0.5 0

last_edit_dates_tags unknown

Out[806]:

saturn.columns["last_edit_dates_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

packaging_text_nb categorical

Out[808]:

saturn.columns["packaging_text_nb"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 197.
Top values for packaging_text_nb.
Top values for packaging_text_nb (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

packagings_complete numeric

Out[811]:

saturn.columns["packagings_complete"].stats

stat value
n 50
nulls 2 (4.0%)
unique 2
min 0
max 1
mean 0.5208
median 1
std 0.5049
q1 0
q3 1
iqr 1
skew -0.08341
kurtosis -1.993
n_outliers 0
outlier_rate 0
zero_rate 0.4792
Fig 198.
Distribution of packagings_complete. Vertical dash marks the median.
Histogram bins for packagings_complete (median: 1.0).
bin count
0 – 0.1667 23
0.1667 – 0.3333 0
0.3333 – 0.5 0
0.5 – 0.6667 0
0.6667 – 0.8333 0
0.8333 – 1 25

data_sources categorical

Out[814]:

saturn.columns["data_sources"].stats

stat value
n 50
nulls 0 (0.0%)
unique 43
top_value App - yuka, Apps, App - Open Food Facts, App - smoothie-openfoodfacts
top_rate 0.08
cardinality 43
entropy 5.309
entropy_ratio 0.9783
alert: long_tail 39 singleton categories
Fig 199.
Top values for data_sources.
Top values for data_sources (20 unique shown, of 43 total).
value count share
App - yuka, Apps, App - Open Food Facts, App - smoothie-openfoodfacts 4 8.0%
App - yuka, Apps, App - smoothie-openfoodfacts 3 6.0%
App - yuka, Apps, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - macrofactor 2 4.0%
App - Yuka, Apps, App - smoothie-openfoodfacts 2 4.0%
App - yuka, Apps, App - Open Food Facts, App - smoothie-openfoodfacts, App - allergytracker, App - openfoodfactsflutterapp 1 2.0%
App - yuka, Apps, App - InFood, App - Open Food Facts, App - Horizon, App - smoothie-openfoodfacts, App - halal-healthy, App - foodwasteieee, App - mon-coach-ig-bas, App - intolerapp, App - fooducate 1 2.0%
Database - FoodRepo / openfood.ch, Databases, App - yuka, Apps, App - Horizon, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - mon-coach-ig-bas, App - macrofactor, App - caloriecounterapp, App - Speisekammer 1 2.0%
App - smoothie-openfoodfacts, Apps 1 2.0%
App - yuka, Apps, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - Waistline 1 2.0%
App - elcoco, App - yuka, Apps, App - off, App - El CoCo, App - InFood, App - Open Food Facts, App - Speisekammer, App - smoothie-openfoodfacts, App - macrofactor, App - mon-coach-ig-bas, App - caloriecounterapp 1 2.0%
App - yuka, Apps, App - ethic-advisor, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, Producers, Producer - gie-sources-alma, Database - Equadis, Database - GDSN, Databases 1 2.0%
App - yuka, Apps, App - InFood, App - Open Food Facts, App - halal-healthy, App - smoothie-openfoodfacts 1 2.0%
Producer - Ferrero, Producers, App - off, App - yuka, Apps, Producer - ferrero-france-commerciale, Database - Equadis, Database - GDSN, Databases, App - Horizon, App - InFood, App - Open Food Facts, App - Speisekammer, App - smoothie-openfoodfacts, App - El CoCo, App - mon-coach-ig-bas, App - intolerapp, App - macrofactor, App - caloriecounterapp 1 2.0%
Database - FoodRepo / openfood.ch, Databases, App - yuka, Apps, App - ethic-advisor, Producers, Producer - barilla, Producer - barilla-france-sa, Database - Equadis, Database - GDSN, App - Open Food Facts, App - smoothie-openfoodfacts, App - mon-coach-ig-bas, App - InFood, App - caloriecounterapp 1 2.0%
Database - FoodRepo / openfood.ch, Databases, App - off, Apps, App - InFood, App - Open Food Facts, App - Yuka, App - smoothie-openfoodfacts, App - mon-coach-ig-bas, App - macrofactor 1 2.0%
Database - FoodRepo / openfood.ch, Databases, App - yuka, Apps, App - Horizon, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - macrofactor, App - Speisekammer 1 2.0%
App - yuka, Apps, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - caloriecounterapp, App - macrofactor 1 2.0%
Database - FoodRepo / openfood.ch, Databases, App - yuka, Apps, app-elcoco, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - mon-coach-ig-bas 1 2.0%
App - yuka, Apps, App - Open Food Facts, App - InFood, App - smoothie-openfoodfacts 1 2.0%
App - yuka, Apps, App - Horizon, App - InFood, App - Open Food Facts, App - smoothie-openfoodfacts, App - macrofactor, App - caloriecounterapp 1 2.0%

labels_old categorical

Out[817]:

saturn.columns["labels_old"].stats

stat value
n 50
nulls 4 (8.0%)
unique 38
top_value (blank)
top_rate 0.1957
cardinality 38
entropy 4.903
entropy_ratio 0.9343
alert: long_tail 37 singleton categories
Fig 200.
Top values for labels_old.
Top values for labels_old (20 unique shown, of 38 total).
value count share
(blank) 9 18.0%
Triman, en:Sin gluten 1 2.0%
Bezglutenowy, Triman 1 2.0%
Point Vert, Fabriqué en France, Arômes naturels, Sans colorants, Sans huile de palme, Nutriscore, Nutriscore B, Triman 1 2.0%
Справедлива търговия, Вегетарианско, Веган, Fairtrade cocoa, FSC, FSC Mix 1 2.0%
Triman, Sans Nitrates 1 2.0%
Point Vert, Fabriqué en Espagne, en:CE 1 2.0%
Fair trade, Organic, Vegetarian, EU Organic, Fairtrade International, Vegan, Soil Association Organic, The Vegan Society, Commerce équitable 1 2.0%
Point Vert, Non-bio, Triman 1 2.0%
Sans conservateurs, Fabriqué en France, Triman 1 2.0%
Sans gluten, Végétarien, Sans arômes artificiels, Végétalien, Assured Food Standards, Point Vert, Sans colorants artificiels, Sans exhausteur de goût, Sans glutamate, en:Made-in-england, en:Terracycle 1 2.0%
Organic, Vegetarian, EU Organic, Fair trade, Non-EU Agriculture, Vegan, Fairtrade International, FR-BIO-01, FSC, FSC Mix, Green Dot, Max Havelaar, PL-EKO-07, Soil Association Organic, The Vegan Society 1 2.0%
Agriculture non UE, Fabriqué en Belgique, Fabriqué en France, Sans huile de palme, Triman 1 2.0%
Organic,EU Organic,Non-EU Agriculture,Certified B Corporation,EU Agriculture,EU/non-EU Agriculture,FR-BIO-01,No palm oil,Nutriscore,Nutriscore Grade D,Pure cocoa butter,AB Agriculture Biologique 1 2.0%
Fair trade, Vegetarian, Fairtrade International, Vegan, Pure cocoa butter, Rainforest Alliance, Commerce-equitable, Pur-beurre-de-cacao 1 2.0%
Source de fibres alimentaires,Point Vert,Riche en fibres,Triman,Emballage-recyclable 1 2.0%
Halal 1 2.0%
Vegetariano,Vegano,Punto Verde 1 2.0%
Commerce équitable, Sans gluten, Bio, Végétarien, Épi barré, Bio européen, Kascher, Végétalien, Point Vert, Fabriqué en France, Nutriscore, Nutriscore A, The Vegan Society, AB Agriculture Biologique, Afdiag 1 2.0%
Peu ou pas de sucre, Peu de sucre, Pauvre ou sans sodium, Sans conservateurs, Agriculture non UE, Allégé en sucre, Riche en vitamine E, Source de fibres alimentaires, Agriculture durable, Enrichi en vitamines, Agriculture UE, Agriculture UE/Non UE, Riche en fibres, Faible teneur en sodium, Fabriqué en France, Arômes naturels, Sans colorants, Sans colorants ou conservateurs, Sans huile de palme, Nutriscore, Nutriscore A, Riche en vitamine B1, Riche en vitamine B9, Source de vitamine B6, Sans édulcorants, Farine de blé français, Triman 1 2.0%

data_quality_info_tags unknown

Out[820]:

saturn.columns["data_quality_info_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

ingredients_from_palm_oil_n numeric

Out[822]:

saturn.columns["ingredients_from_palm_oil_n"].stats

stat value
n 50
nulls 4 (8.0%)
unique 2
min 0
max 1
mean 0.1522
median 0
std 0.3632
q1 0
q3 0
iqr 0
skew 1.937
kurtosis 1.751
n_outliers 7
outlier_rate 0.1522
zero_rate 0.8478
alert: outliers 15.2% rows beyond 1.5 IQR
Fig 201.
Distribution of ingredients_from_palm_oil_n. Vertical dash marks the median.
Histogram bins for ingredients_from_palm_oil_n (median: 0.0).
bin count
0 – 0.1667 39
0.1667 – 0.3333 0
0.3333 – 0.5 0
0.5 – 0.6667 0
0.6667 – 0.8333 0
0.8333 – 1 7

ingredients_text_with_allergens_ja categorical

Out[825]:

saturn.columns["ingredients_text_with_allergens_ja"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 202.
Top values for ingredients_text_with_allergens_ja.
Top values for ingredients_text_with_allergens_ja (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_lc categorical

Out[828]:

saturn.columns["ingredients_lc"].stats

stat value
n 50
nulls 0 (0.0%)
unique 4
top_value fr
top_rate 0.7
cardinality 4
entropy 1.212
entropy_ratio 0.6061
Fig 203.
Top values for ingredients_lc.
Top values for ingredients_lc (4 unique shown, of 4 total).
value count share
fr 35 70.0%
en 11 22.0%
bg 2 4.0%
de 2 4.0%

origins categorical

Out[831]:

saturn.columns["origins"].stats

stat value
n 50
nulls 2 (4.0%)
unique 20
top_value (blank)
top_rate 0.5
cardinality 20
entropy 3.027
entropy_ratio 0.7003
alert: long_tail 17 singleton categories
Fig 204.
Top values for origins.
Top values for origins (20 unique shown, of 20 total).
value count share
(blank) 24 48.0%
France 4 8.0%
Maroc 3 6.0%
Morocco 1 2.0%
France,Union européenne,Non Union Européenne 1 2.0%
France,Provence-Alpes-Côte d'Azur,Italie,Vaucluse,en:Cairanne,en:Chambon-la-Forêt,en:Source Emma,en:Source Ofélia,en:Source Sainte Cécile,en:Source Éléna,en:Source Éléonore 1 2.0%
United Kingdom 1 2.0%
en:Madagarcar vanilla 1 2.0%
France,European Union and Non European Union 1 2.0%
Germany,Ludwig Weinrich,Ludwig Weinrich in Germany 1 2.0%
Suède,Allemagne,Biélorussie,Estonie,Lettonie,Pologne,Seigle 1 2.0%
European Union and Non European Union 1 2.0%
Équateur 1 2.0%
España 1 2.0%
France,Non Union Européenne,Non indiqué 1 2.0%
madagascar, fr:afrique, amérique-du-sud 1 2.0%
fr:maroc 1 2.0%
Unspecified 1 2.0%
Farine œuf France 1 2.0%
European Union 1 2.0%

nutriscore_data unknown

Out[834]:

saturn.columns["nutriscore_data"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

scans_n numeric feature

This column likely represents a count of scans per product record (most plausibly barcode scans, given the Open Food Facts source), with 50 records and no nulls. The bulk of values sit in a moderate range (Q1=387, median=492, Q3=604), but strong positive skew (3.90) and very high kurtosis (18.72) are driven by 4 outliers (8% of rows), the largest of which reaches 2,523, just over 5× the median (2,523 / 492 ≈ 5.1). The minimum of 333 suggests a floor, possibly a minimum scan threshold applied when the sample was selected, or a truncation artefact.

Treatment: Investigate the 4 outliers before modelling; apply log-transform or robust scaling to reduce skew impact in regression or distance-based models.

anthropic:default · confidence medium
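
The outlier alert for scans_n refers to rows "beyond 1.5 IQR". A minimal sketch of that rule, assuming the conventional Tukey fences (Q1 − 1.5·IQR and Q3 + 1.5·IQR) and using only the quartiles reported for this column; whether saturn computes its fences exactly this way is an assumption. The last lines illustrate why a log transform is a reasonable treatment for the long right tail.

```python
import math

# Quartiles reported for scans_n in the stats table.
q1, q3 = 387, 604
iqr = q3 - q1

# Assumption: conventional Tukey fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value):
    """Return True when a value falls outside the Tukey fences."""
    return value < lower_fence or value > upper_fence


# The reported minimum, median and maximum of scans_n.
for value in (333, 492, 2523):
    print(value, is_outlier(value), round(math.log1p(value), 3))

# On the raw scale the max is about 5.1x the median; after log1p the
# gap between them is far smaller, which reduces the pull of the tail.
raw_ratio = 2523 / 492
log_ratio = math.log1p(2523) / math.log1p(492)
```

With these quartiles the upper fence is 929.5, so only the maximum of the three values shown is flagged.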
Out[836]:

saturn.columns["scans_n"].stats

stat value
n 50
nulls 0 (0.0%)
unique 49
min 333
max 2,523
mean 577.9
median 492
std 343.9
q1 387
q3 604
iqr 217
skew 3.899
kurtosis 18.72
n_outliers 4
outlier_rate 0.08
zero_rate 0
alert: high_skew skew=+3.90
alert: outliers 8.0% rows beyond 1.5 IQR
Fig 205.
Distribution of scans_n. Vertical dash marks the median.
Histogram bins for scans_n (median: 492.0).
bin count
333 – 645.9 39
645.9 – 958.7 7
958.7 – 1272 3
1272 – 1584 0
1584 – 1897 0
1897 – 2210 0
2210 – 2523 1

ingredients_that_may_be_from_palm_oil_tags unknown

Out[839]:

saturn.columns["ingredients_that_may_be_from_palm_oil_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

generic_name_ar categorical

Out[841]:

saturn.columns["generic_name_ar"].stats

stat value
n 50
nulls 40 (80.0%)
unique 2
top_value (blank)
top_rate 0.9
cardinality 2
entropy 0.469
entropy_ratio 0.469
alert: null_rate 80.0% null
Fig 206.
Top values for generic_name_ar.
Top values for generic_name_ar (2 unique shown, of 2 total).
value count share
(blank) 9 18.0%
الامير 1 2.0%

product_name_uk categorical

Out[844]:

saturn.columns["product_name_uk"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 207.
Top values for product_name_uk.
Top values for product_name_uk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

last_checked_t numeric

Out[847]:

saturn.columns["last_checked_t"].stats

stat value
n 50
nulls 43 (86.0%)
unique 7
min 1.541e+09
max 1.73e+09
mean 1.607e+09
median 1.565e+09
std 7.772e+07
q1 1.556e+09
q3 1.652e+09
iqr 9.601e+07
skew 0.8106
kurtosis -1.103
n_outliers 0
outlier_rate 0
zero_rate 0
alert: null_rate 86.0% null
Fig 208.
Distribution of last_checked_t. Vertical dash marks the median.
Histogram bins for last_checked_t (median: 1564679969.0).
bin count
1.541e+09 – 1.579e+09 4
1.579e+09 – 1.617e+09 1
1.617e+09 – 1.655e+09 0
1.655e+09 – 1.692e+09 0
1.692e+09 – 1.73e+09 2

last_check_dates_tags unknown

Out[850]:

saturn.columns["last_check_dates_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique (blank)
alert: skipped no profiler for kind=unknown

ingredients_text_uk categorical

Out[852]:

saturn.columns["ingredients_text_uk"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 209.
Top values for ingredients_text_uk.
Top values for ingredients_text_uk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

carbon_footprint_from_known_ingredients_debug categorical

Out[855]:

saturn.columns["carbon_footprint_from_known_ingredients_debug"].stats

stat value
n 50
nulls 36 (72.0%)
unique 14
top_value en:cereal 50% x 0.3 = 15 g -
top_rate 0.07143
cardinality 14
entropy 3.807
entropy_ratio 1
alert: long_tail 14 singleton categories
alert: null_rate 72.0% null
Fig 210.
Top values for carbon_footprint_from_known_ingredients_debug.
Top values for carbon_footprint_from_known_ingredients_debug (14 unique shown, of 14 total).
value count share
en:cereal 50% x 0.3 = 15 g - 1 2.0%
en:wheat-flour 55.1% x 1.2 = 66.12 g - 1 2.0%
en:wheat-flour 32% x 1.2 = 38.4 g - en:cane-sugar 9% x 1.3 = 11.7 g - 1 2.0%
en:wholemeal-rye-flour 77% x 1.2 = 92.4 g - en:rye-flour 28% x 1.2 = 33.6 g - 1 2.0%
en:wheat-flour 39% x 1.2 = 46.8 g - en:dark-chocolate 25% x 4.9 = 122.5 g - en:whole-wheat-flour 15% x 1.2 = 18 g - 1 2.0%
en:wholemeal-rye-flour 59% x 1.2 = 70.8 g - en:wheat-bran 27% x 0.6 = 16.2 g - en:oat-flakes 12% x 0.3 = 3.6 g - 1 2.0%
en:wheat-flour 68.5% x 1.2 = 82.2 g - en:wheat-germ 5.2% x 0.6 = 3.12 g - 1 2.0%
en:hazelnut-oil 13% x 2.6 = 33.8 g - 1 2.0%
en:whole-wheat-flour 26.5% x 1.2 = 31.8 g - en:wheat-flour 26.1% x 1.2 = 31.32 g - en:wheat-bran 19.9% x 0.6 = 11.94 g - en:fig-paste 5.1% x 0.3 = 1.53 g - 1 2.0%
en:wheat-flour 41% x 1.2 = 49.2 g - en:fresh-egg 11% x 2.6 = 28.6 g - 1 2.0%
en:walnut-kernel 25% x 1.3 = 32.5 g - en:almond 25% x 5.9 = 147.5 g - en:cranberry 25% x 0.3 = 7.5 g - 1 2.0%
en:whole-fresh-eggs 8% x 2.6 = 20.8 g - 1 2.0%
en:wheat-flour 37% x 1.2 = 44.4 g - en:milk-chocolate 27% x 5.9 = 159.3 g - en:whole-wheat-flour 12% x 1.2 = 14.4 g - 1 2.0%
en:cereal 98.3% x 0.3 = 29.49 g - 1 2.0%

packaging_text_ar categorical metadata

This column appears to hold Arabic-language packaging text, but it is effectively empty: 80% of the 50 rows are null, and the remaining 10 non-null rows contain only an empty string, giving a single unique value with a top_rate of 1.0 and zero entropy. The column carries no information in this dataset snapshot.

Treatment: Drop this column; it contains no usable signal (every row is either null or an empty string).

anthropic:default · confidence high
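
The drop recommendation can be applied mechanically to any column of this kind. A minimal sketch, assuming the sample is loaded as a list of row dictionaries (as a JSON file typically would be); the rows below are hypothetical stand-ins, `is_degenerate` is an illustrative helper rather than part of saturn, and treating whitespace-only strings as empty is an added assumption.

```python
def is_degenerate(values):
    """True when every value is None or an empty/whitespace-only string."""
    return all(
        v is None or (isinstance(v, str) and v.strip() == "")
        for v in values
    )


# Hypothetical rows standing in for the real 50-row sample.
rows = [
    {"code": "1", "packaging_text_ar": "", "origin_ar": None},
    {"code": "2", "packaging_text_ar": None, "origin_ar": ""},
]

# Collect every column name, then flag those with no usable value.
columns = sorted({key for row in rows for key in row})
to_drop = [c for c in columns if is_degenerate(row.get(c) for row in rows)]

# Rebuild the rows without the flagged columns.
cleaned = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]

print(to_drop)
print(cleaned)
```

Checking for empty strings as well as nulls matters here: a null-rate threshold alone would keep this column, because 20% of its rows are non-null yet still carry nothing.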
Out[858]:

saturn.columns["packaging_text_ar"].stats

stat value
n 50
nulls 40 (80.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 80.0% null
alert: imbalance top value is 100.0% of rows
Fig 211.
Top values for packaging_text_ar.
Top values for packaging_text_ar (1 unique shown, of 1 total).
value count share
(blank) 10 20.0%

generic_name_uk categorical

Out[861]:

saturn.columns["generic_name_uk"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 212.
Top values for generic_name_uk.
Top values for generic_name_uk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

last_checker categorical

Out[864]:

saturn.columns["last_checker"].stats

stat value
n 50
nulls 43 (86.0%)
unique 4
top_value aleene
top_rate 0.4286
cardinality 4
entropy 1.842
entropy_ratio 0.9212
alert: null_rate 86.0% null
Fig 213.
Top values for last_checker.
Top values for last_checker (4 unique shown, of 4 total).
value count share
aleene 3 6.0%
moon-rabbit 2 4.0%
beniben 1 2.0%
sebleouf 1 2.0%

checked categorical feature

This column appears to be a checkbox field stored HTML-style: a checked box submits the value 'on' and an unchecked box submits nothing. Only 'on' is ever recorded; cardinality is 1, with 'on' appearing in all 7 non-null rows. The 86% null rate is the dominant signal: nulls most likely represent the unchecked state rather than missing data, so the column encodes a boolean with a null-as-false convention. The last_checker and last_checked_t columns have the same 43 nulls (86.0%), which is consistent with that reading. Zero entropy confirms that the non-null values do not vary.

Treatment: Recode nulls as 0 and 'on' as 1 to produce a proper boolean/integer column before modelling.

anthropic:default · confidence high
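
A minimal sketch of the recoding described in the treatment, assuming the sample is held as a list of row dictionaries and that a missing or null `checked` value means "not checked". The rows and the `checked_flag` field name are hypothetical.

```python
# Hypothetical rows: 'on' where the product was checked, null or absent otherwise.
rows = [
    {"code": "a", "checked": "on"},
    {"code": "b", "checked": None},
    {"code": "c"},
]

# Assumption: anything other than the literal 'on' counts as unchecked (0).
for row in rows:
    row["checked_flag"] = 1 if row.get("checked") == "on" else 0

print([row["checked_flag"] for row in rows])
```

Writing the result to a new field keeps the original value available, which is useful if some nulls later turn out to be genuinely missing rather than unchecked.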
Out[867]:

saturn.columns["checked"].stats

stat value
n 50
nulls 43 (86.0%)
unique 1
top_value on
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 86.0% null
alert: imbalance top value is 100.0% of rows
Fig 214.
Top values for checked.
Top values for checked (1 unique shown, of 1 total).
value count share
on 7 14.0%

product_name_ar categorical

Out[870]:

saturn.columns["product_name_ar"].stats

stat value
n 50
nulls 39 (78.0%)
unique 6
top_value (blank)
top_rate 0.5455
cardinality 6
entropy 2.049
entropy_ratio 0.7928
alert: long_tail 5 singleton categories
alert: null_rate 78.0% null
Fig 215.
Top values for product_name_ar.
Top values for product_name_ar (6 unique shown, of 6 total).
value count share
(blank) 6 12.0%
برنس 1 2.0%
Tonjik 1 2.0%
Leche Y Almendras 1 2.0%
Eyoo cover 1 2.0%
Chocolate Negro 92% Cacao 1 2.0%

ingredients_text_with_allergens_uk categorical

Out[873]:

saturn.columns["ingredients_text_with_allergens_uk"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 216.
Top values for ingredients_text_with_allergens_uk.
Top values for ingredients_text_with_allergens_uk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

origin_uk categorical

Out[876]:

saturn.columns["origin_uk"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 217.
Top values for origin_uk.
Top values for origin_uk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

packaging_text_uk categorical

Out[879]:

saturn.columns["packaging_text_uk"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 218.
Top values for packaging_text_uk.
Top values for packaging_text_uk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_ar categorical

Out[882]:

saturn.columns["ingredients_text_ar"].stats

stat value
n 50
nulls 39 (78.0%)
unique 2
top_value (blank)
top_rate 0.9091
cardinality 2
entropy 0.4395
entropy_ratio 0.4395
alert: null_rate 78.0% null
Fig 219.
Top values for ingredients_text_ar.
Top values for ingredients_text_ar (2 unique shown, of 2 total).
value count share
(blank) 10 20.0%
سكر،دقيق،دهون نباتية (نخيل،شيا)،مسحوق كاكاو،شراب جلوكوز،نشا الذرة،مسحوق حليب،مسحوق مصل اللبن،مسحوق حليب كامل الدسم،عجينة الكاكاو،مواد رافعة(بكربونات الصوديوم و الأمونيوم)،ملح،مستحلب(لسيتين الصويا(E322)وڤانيلين 1 2.0%

ingredients_text_with_allergens_ar categorical

Out[885]:

saturn.columns["ingredients_text_with_allergens_ar"].stats

statvalue
n50
nulls41 (82.0%)
unique2
top_value
top_rate 0.8889
cardinality 2
entropy 0.5033
entropy_ratio 0.5033
alert: null_rate82.0% null
Fig 220.
Top values for ingredients_text_with_allergens_ar.
Show data table
Top values for ingredients_text_with_allergens_ar (2 unique shown, of 2 total).
value count share
(blank) 8 16.0%
سكر،دقيق،دهون نباتية (نخيل،شيا)،مسحوق كاكاو،شراب جلوكوز،نشا الذرة،مسحوق حليب،مسحوق مصل اللبن،مسحوق حليب كامل الدسم،عجينة الكاكاو،مواد رافعة(بكربونات الصوديوم و الأمونيوم)،ملح،مستحلب(لسيتين الصويا(E322)وڤانيلين 1 2.0%

carbon_footprint_percent_of_known_ingredients numeric

Out[888]:

saturn.columns["carbon_footprint_percent_of_known_ingredients"].stats

statvalue
n50
nulls31 (62.0%)
unique19
min 8
max 105
mean 61.79
median 70
std 28.98
q1 45.5
q3 78.3
iqr 32.8
skew -0.4493
kurtosis -0.8083
n_outliers 0
outlier_rate 0
zero_rate 0
alert: null_rate62.0% null
Fig 221.
Distribution of carbon_footprint_percent_of_known_ingredients. Vertical dash marks the median.
Show data table
Histogram bins for carbon_footprint_percent_of_known_ingredients (median: 70.0).
bin count
8 – 27.4 3
27.4 – 46.8 2
46.8 – 66.2 3
66.2 – 85.6 8
85.6 – 105 3
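
The bin edges in this table are consistent with five equal-width bins between the column's min and max. A minimal sketch of that computation follows. The five-bin default and the widening of a constant column by 0.5 on each side are assumptions inferred from the histograms in this report (the widening also matches NumPy's default behaviour), not a statement about saturn's internals.

```python
def equal_width_edges(lo, hi, bins=5):
    """Return bins + 1 equally spaced edges covering [lo, hi]."""
    if lo == hi:
        # Constant column: widen the range so the bins have non-zero width.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


# carbon_footprint_percent_of_known_ingredients: min 8, max 105
edges = equal_width_edges(8, 105)
print([round(e, 1) for e in edges])

# A constant column such as nutrition_score_warning_no_fiber (min = max = 1)
constant_edges = equal_width_edges(1, 1)
print([round(e, 1) for e in constant_edges])
```

The counts per bin sum to 19, the number of non-null rows (50 minus 31 nulls), which is a quick sanity check when reading these tables.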

origin_ar categorical other

This column is the Arabic-language origin field ('origin_ar'), and it is effectively empty. 40 of the 50 rows are null (80%), and the remaining 10 rows all hold the same value, an empty string, which is why cardinality is 1 and entropy is 0. The column carries no information in this sample.

Treatment: Drop. Every row is either null (80%) or an empty string (20%), so the column carries no information (entropy 0.0).

anthropic:default · confidence high
Out[891]:

saturn.columns["origin_ar"].stats

statvalue
n50
nulls40 (80.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate80.0% null
alert: imbalancetop value is 100.0% of rows
Fig 222.
Top values for origin_ar.
Show data table
Top values for origin_ar (1 unique shown, of 1 total).
value count share
(blank) 10 20.0%
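
Columns like this one can be found programmatically before modelling. A minimal sketch follows, assuming the sample is loaded as a list of dicts (one per product) and treating a value as empty when it is None or a whitespace-only string. The helper name and the toy rows are hypothetical.

```python
def degenerate_columns(rows):
    """Return the sorted names of columns whose every value is None or blank."""
    columns = {key for row in rows for key in row}

    def is_empty(value):
        return value is None or (isinstance(value, str) and value.strip() == "")

    # row.get() treats a key that is missing from a row as None.
    return sorted(c for c in columns if all(is_empty(row.get(c)) for row in rows))


# Toy rows standing in for the 50-product sample.
rows = [
    {"origin_ar": "", "product_name_sl": None, "lc_imported": "fr"},
    {"origin_ar": None, "product_name_sl": "ARRIBA 85% cacao", "lc_imported": None},
]
to_drop = degenerate_columns(rows)
cleaned = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
print(to_drop)
```

On a 50-row sample, a column that is empty here may still be populated elsewhere in the full database, so the result is best read as a candidate list rather than a final decision.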

nutrition_score_warning_no_fiber numeric

Out[894]:

saturn.columns["nutrition_score_warning_no_fiber"].stats

statvalue
n50
nulls35 (70.0%)
unique1
min 1
max 1
mean 1
median 1
std 0
q1 1
q3 1
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 0
alert: null_rate70.0% null
alert: constantonly one distinct value
Fig 223.
Distribution of nutrition_score_warning_no_fiber. Vertical dash marks the median.
Show data table
Histogram bins for nutrition_score_warning_no_fiber (median: 1.0).
bin count
0.5 – 0.7 0
0.7 – 0.9 0
0.9 – 1.1 15
1.1 – 1.3 0
1.3 – 1.5 0

ingredients_text_debug_tags unknown

Out[897]:

saturn.columns["ingredients_text_debug_tags"].stats

statvalue
n50
nulls0 (0.0%)
unique
alert: skippedno profiler for kind=unknown

nutriments_estimated unknown

Out[899]:

saturn.columns["nutriments_estimated"].stats

statvalue
n50
nulls0 (0.0%)
unique
alert: skippedno profiler for kind=unknown

completed_t numeric

Out[901]:

saturn.columns["completed_t"].stats

statvalue
n50
nulls34 (68.0%)
unique16
min 1.628e+09
max 1.763e+09
mean 1.7e+09
median 1.703e+09
std 4.07e+07
q1 1.663e+09
q3 1.74e+09
iqr 7.618e+07
skew 0.001247
kurtosis -1.155
n_outliers 0
outlier_rate 0
zero_rate 0
alert: null_rate68.0% null
Fig 224.
Distribution of completed_t. Vertical dash marks the median.
Show data table
Histogram bins for completed_t (median: 1703093252.0).
bin count
1.628e+09 – 1.655e+09 1
1.655e+09 – 1.682e+09 5
1.682e+09 – 1.709e+09 4
1.709e+09 – 1.736e+09 1
1.736e+09 – 1.763e+09 5
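
The magnitudes here (around 1.7e9) are consistent with Unix timestamps in seconds, which would place the completed products between roughly mid-2021 and late 2025. A small sketch of the conversion follows, assuming seconds since the epoch in UTC; the report itself does not state the unit.

```python
from datetime import datetime, timezone


def to_utc(timestamp_seconds):
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_seconds, tz=timezone.utc)


# Median taken from the histogram caption (median: 1703093252.0).
median_completed = to_utc(1703093252)
print(median_completed.isoformat())
```

Read this way, summary statistics such as mean and std describe calendar time, so they are easier to interpret after conversion than as raw numbers.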

taxonomies_enhancer_tags unknown

Out[904]:

saturn.columns["taxonomies_enhancer_tags"].stats

statvalue
n50
nulls0 (0.0%)
unique
alert: skippedno profiler for kind=unknown

ingredients_text_with_allergens_sl categorical

Out[906]:

saturn.columns["ingredients_text_with_allergens_sl"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value Kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (sojin lecitin); ekstrakt vanilije. Lahko vsebuje sledi oreškov (lešniki, mandlji, pistacija) in mleka. Uporabno najmanj do: glej odtis na zadnji strani embalaže.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 225.
Top values for ingredients_text_with_allergens_sl.
Show data table
Top values for ingredients_text_with_allergens_sl (1 unique shown, of 1 total).
value count share
Kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (sojin lecitin); ekstrakt vanilije. Lahko vsebuje sledi oreškov (lešniki, mandlji, pistacija) in mleka. Uporabno najmanj do: glej odtis na zadnji strani embalaže. 1 2.0%

packaging_text_sk categorical

Out[909]:

saturn.columns["packaging_text_sk"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 226.
Top values for packaging_text_sk.
Show data table
Top values for packaging_text_sk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_with_allergens_bg categorical

Out[912]:

saturn.columns["ingredients_text_with_allergens_bg"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value Какаова маса, нискомаслено какао на прах, какаово масло, захар, емулгатор: лецитин (соеви), екстракт от ванилия, Може да съдържа следи от ядки и мляко,
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 227.
Top values for ingredients_text_with_allergens_bg.
Show data table
Top values for ingredients_text_with_allergens_bg (3 unique shown, of 3 total).
value count share
Какаова маса, нискомаслено какао на прах, какаово масло, захар, емулгатор: лецитин (соеви), екстракт от ванилия, Може да съдържа следи от ядки и мляко, 1 2.0%
(blank) 1 2.0%
Захар, палмово масло, ЛЕШНИЦИ (13%), обезмаслено МЛЯКО на прах (8,7%), нискомаслено какао на прах (7,4%), емулгатор: лецитини (СОЯ), ванилин. 1 2.0%

ingredients_text_pt categorical

Out[915]:

saturn.columns["ingredients_text_pt"].stats

statvalue
n50
nulls40 (80.0%)
unique4
top_value
top_rate 0.7
cardinality 4
entropy 1.357
entropy_ratio 0.6784
alert: long_tail3 singleton categories
alert: null_rate80.0% null
Fig 228.
Top values for ingredients_text_pt.
Show data table
Top values for ingredients_text_pt (4 unique shown, of 4 total).
value count share
(blank) 7 14.0%
Creme para barrar de AVELAS e cacau 40% (açúcar, gordura de palma, AVELAS (13%), LEITE desnatado em pó (8,7%), cacau magro (7,4%), emulsionantes: lecitinas (SOJA), vanilina), farinha de TRIGO (32,5%), gorduras vegetais (palma, palmiste), açúcar de cana (contém TRIGO) (8,5%), LACTOSE, farelo de TRIGO, LEITE inteiro em pó, mel, levedantes químicos (difosfato dissódico, hidrogenocarbonato de sódio, hidrogenocarbonato de amónio), farinha de CEVADA maltada, cacau magro, sal, extrato em pó de malte de CEVADA e milho, amido de TRIGO, emulsionantes: lecitinas (SOJA), vanilina. 1 2.0%
Farinha de _TRIGO_, gordura de palma, xarope de glucose, extrato de _CEVADA_ malteada, levedantes (carbonatos de amónio, carbonatos de sódio), sal, _OVOS_, aroma, agente de tratamento da farinha (_METABISSULFITO_ de sódio). 1 2.0%
Pasta de cacau, açúcar, manteiga de cacau, baunilha. 1 2.0%

ingredients_text_dz categorical

Out[918]:

saturn.columns["ingredients_text_dz"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 229.
Top values for ingredients_text_dz.
Show data table
Top values for ingredients_text_dz (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

generic_name_ca categorical

Out[921]:

saturn.columns["generic_name_ca"].stats

statvalue
n50
nulls48 (96.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate96.0% null
alert: imbalancetop value is 100.0% of rows
Fig 230.
Top values for generic_name_ca.
Show data table
Top values for generic_name_ca (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

generic_name_bg categorical label

This column appears to be the Bulgarian-language generic name field (a localized generic description of the food product), but it is effectively empty: 94% of rows are null and the remaining 3 non-null rows contain only an empty string. With a cardinality of 1 and entropy of 0, the column carries no information.

Treatment: Drop this column; it is 94% null and the only observed value is an empty string, making it analytically useless.

anthropic:default · confidence high
Out[924]:

saturn.columns["generic_name_bg"].stats

statvalue
n50
nulls47 (94.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate94.0% null
alert: imbalancetop value is 100.0% of rows
Fig 231.
Top values for generic_name_bg.
Show data table
Top values for generic_name_bg (1 unique shown, of 1 total).
value count share
(blank) 3 6.0%

origin_sl categorical

Out[927]:

saturn.columns["origin_sl"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 232.
Top values for origin_sl.
Show data table
Top values for origin_sl (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

product_name_et categorical

Out[930]:

saturn.columns["product_name_et"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value Chocolat noir - 85% cacao
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 233.
Top values for product_name_et.
Show data table
Top values for product_name_et (3 unique shown, of 3 total).
value count share
Chocolat noir - 85% cacao 1 2.0%
(blank) 1 2.0%
Excellence 70% Cocoa Intense Dark 1 2.0%

origin_et categorical metadata

This column appears to be the Estonian-language origin field ('et' is the ISO 639-1 code for Estonian, matching the other '_et' columns in this sample), but it is effectively empty: 94% of the 50 rows are null, and the sole non-null value is an empty string appearing 3 times. With a cardinality of 1 and entropy of 0.0, the column carries no information. This is likely an unfilled localization field.

Treatment: Drop this column; it contains no usable signal (94% null, sole value is empty string).

anthropic:default · confidence high
Out[933]:

saturn.columns["origin_et"].stats

statvalue
n50
nulls47 (94.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate94.0% null
alert: imbalancetop value is 100.0% of rows
Fig 234.
Top values for origin_et.
Show data table
Top values for origin_et (1 unique shown, of 1 total).
value count share
(blank) 3 6.0%
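
Two-letter suffixes such as '_et' appear to follow ISO 639-1 language codes, so the language can be looked up rather than guessed from the column name. A minimal sketch follows, with a hand-written mapping that covers only the codes appearing in this part of the report; the helper name is hypothetical.

```python
# ISO 639-1 codes seen in this part of the report (not an exhaustive list).
LANGUAGE_NAMES = {
    "ar": "Arabic",
    "bg": "Bulgarian",
    "ca": "Catalan",
    "dz": "Dzongkha",
    "et": "Estonian",
    "pt": "Portuguese",
    "sk": "Slovak",
    "sl": "Slovenian",
    "uk": "Ukrainian",
    "zh": "Chinese",
}


def split_locale(column):
    """Split 'origin_et' into ('origin', 'Estonian').

    Columns without a known language suffix are returned unchanged with None.
    """
    base, _, code = column.rpartition("_")
    if base and code in LANGUAGE_NAMES:
        return base, LANGUAGE_NAMES[code]
    return column, None


for name in ["origin_et", "origin_pt", "generic_name_bg", "completed_t"]:
    print(name, "->", split_locale(name))
```

Grouping columns by the base field in this way also makes it easy to compare fill rates across languages for the same field.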

ingredients_text_with_allergens_sk categorical

Out[936]:

saturn.columns["ingredients_text_with_allergens_sk"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 235.
Top values for ingredients_text_with_allergens_sk.
Show data table
Top values for ingredients_text_with_allergens_sk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_with_allergens_et categorical

Out[939]:

saturn.columns["ingredients_text_with_allergens_et"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (sojin lecitin); ekstrakt vanilije.
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 236.
Top values for ingredients_text_with_allergens_et.
Show data table
Top values for ingredients_text_with_allergens_et (3 unique shown, of 3 total).
value count share
kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (sojin lecitin); ekstrakt vanilije. 1 2.0%
Kakaomasse*, Zucker, Kakaobutter, Kakaopulver stark entöit, Emulgator: Sonnenblumenlecithine (E-322), natürliches Vanille-Aroma, * Rainforest Alliance Certified, Kakao: 74% mindestens, 1 2.0%
Kakaomass, suhkur, kakaovoi, vanill. 1 2.0%

nutrition_score_warning_nutriments_estimated numeric

Out[942]:

saturn.columns["nutrition_score_warning_nutriments_estimated"].stats

statvalue
n50
nulls48 (96.0%)
unique1
min 1
max 1
mean 1
median 1
std 0
q1 1
q3 1
iqr 0
skew 0
kurtosis 0
n_outliers 0
outlier_rate 0
zero_rate 0
alert: null_rate96.0% null
alert: constantonly one distinct value
Fig 237.
Distribution of nutrition_score_warning_nutriments_estimated. Vertical dash marks the median.
Show data table
Histogram bins for nutrition_score_warning_nutriments_estimated (median: 1.0).
bin count
0.5 – 0.7 0
0.7 – 0.9 0
0.9 – 1.1 2
1.1 – 1.3 0
1.3 – 1.5 0

ingredients_text_sk categorical

Out[945]:

saturn.columns["ingredients_text_sk"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 238.
Top values for ingredients_text_sk.
Show data table
Top values for ingredients_text_sk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

generic_name_pt categorical

Out[948]:

saturn.columns["generic_name_pt"].stats

statvalue
n50
nulls40 (80.0%)
unique3
top_value
top_rate 0.8
cardinality 3
entropy 0.9219
entropy_ratio 0.5817
alert: long_tail2 singleton categories
alert: null_rate80.0% null
Fig 239.
Top values for generic_name_pt.
Show data table
Top values for generic_name_pt (3 unique shown, of 3 total).
value count share
(blank) 8 16.0%
Bolachas recheadas de creme para barrar de avelãs e cacau NUTELLA® 1 2.0%
Chocolate extrafino com 70% de cacau 1 2.0%

ingredients_text_bg categorical

Out[951]:

saturn.columns["ingredients_text_bg"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value Какаова маса, нискомаслено какао на прах, какаово масло, захар, емулгатор: лецитин (соеви), екстракт от ванилия, Може да съдържа следи от ядки и мляко,
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 240.
Top values for ingredients_text_bg.
Show data table
Top values for ingredients_text_bg (3 unique shown, of 3 total).
value count share
Какаова маса, нискомаслено какао на прах, какаово масло, захар, емулгатор: лецитин (соеви), екстракт от ванилия, Може да съдържа следи от ядки и мляко, 1 2.0%
(blank) 1 2.0%
Захар, палмово масло, ЛЕШНИЦИ (13%), обезмаслено МЛЯКО на прах (8,7%), нискомаслено какао на прах (7,4%), емулгатор: лецитини (СОЯ), ванилин. 1 2.0%

packaging_text_et categorical free_text

This column contains Estonian-language packaging text (`_et` locale suffix), but is effectively empty: 94% of its 50 rows are null, and the sole non-null value across all 3 populated rows is an empty string. With a cardinality of 1 and entropy of 0.0, the column carries no information; it is not populated for any product in this 50-row sample.

Treatment: Drop — 94% null rate and only empty-string values provide no usable signal.

anthropic:default · confidence high
Out[954]:

saturn.columns["packaging_text_et"].stats

statvalue
n50
nulls47 (94.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate94.0% null
alert: imbalancetop value is 100.0% of rows
Fig 241.
Top values for packaging_text_et.
Show data table
Top values for packaging_text_et (1 unique shown, of 1 total).
value count share
(blank) 3 6.0%

product_name_sk categorical

Out[957]:

saturn.columns["product_name_sk"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 242.
Top values for product_name_sk.
Show data table
Top values for product_name_sk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_ca categorical

Out[960]:

saturn.columns["ingredients_text_ca"].stats

statvalue
n50
nulls48 (96.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate96.0% null
alert: imbalancetop value is 100.0% of rows
Fig 243.
Top values for ingredients_text_ca.
Show data table
Top values for ingredients_text_ca (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

ingredients_text_with_allergens_ca categorical

Out[963]:

saturn.columns["ingredients_text_with_allergens_ca"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 244.
Top values for ingredients_text_with_allergens_ca.
Show data table
Top values for ingredients_text_with_allergens_ca (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

product_name_dz categorical

Out[966]:

saturn.columns["product_name_dz"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 245.
Top values for product_name_dz.
Show data table
Top values for product_name_dz (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

product_name_sl categorical

Out[969]:

saturn.columns["product_name_sl"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value ARRIBA 85% cacao
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 246.
Top values for product_name_sl.
Show data table
Top values for product_name_sl (1 unique shown, of 1 total).
value count share
ARRIBA 85% cacao 1 2.0%

origin_sk categorical

Out[972]:

saturn.columns["origin_sk"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 247.
Top values for origin_sk.
Show data table
Top values for origin_sk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

generic_name_et categorical label

This column appears to be an Estonian-language generic name field ('et' locale suffix), but it is effectively empty: 94% of its 50 rows are null, and the sole non-null value is a blank string appearing 3 times, giving a cardinality of 1. The column carries zero information — entropy is 0.0 and top_rate is 1.0 across a single empty token.

Treatment: Drop this column; it contains no usable data (94% null, remaining values are blank strings).

anthropic:default · confidence high
Out[975]:

saturn.columns["generic_name_et"].stats

statvalue
n50
nulls47 (94.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate94.0% null
alert: imbalancetop value is 100.0% of rows
Fig 248.
Top values for generic_name_et.
Show data table
Top values for generic_name_et (1 unique shown, of 1 total).
value count share
(blank) 3 6.0%

ingredients_text_et categorical

Out[978]:

saturn.columns["ingredients_text_et"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (_sojin_ lecitin); ekstrakt vanilije.
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 249.
Top values for ingredients_text_et.
Show data table
Top values for ingredients_text_et (3 unique shown, of 3 total).
value count share
kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (_sojin_ lecitin); ekstrakt vanilije. 1 2.0%
Kakaomasse*, Zucker, Kakaobutter, Kakaopulver stark entöit, Emulgator: Sonnenblumenlecithine (E-322), natürliches Vanille-Aroma, * Rainforest Alliance Certified, Kakao: 74% mindestens, 1 2.0%
Kakaomass, suhkur, kakaovoi, vanill. 1 2.0%

packaging_text_ca categorical

Out[981]:

saturn.columns["packaging_text_ca"].stats

statvalue
n50
nulls48 (96.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate96.0% null
alert: imbalancetop value is 100.0% of rows
Fig 250.
Top values for packaging_text_ca.
Show data table
Top values for packaging_text_ca (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

packaging_text_sl categorical

Out[984]:

saturn.columns["packaging_text_sl"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 251.
Top values for packaging_text_sl.
Show data table
Top values for packaging_text_sl (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

generic_name_dz categorical

Out[987]:

saturn.columns["generic_name_dz"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 252.
Top values for generic_name_dz.
Show data table
Top values for generic_name_dz (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

origin_ca categorical

Out[990]:

saturn.columns["origin_ca"].stats

statvalue
n50
nulls48 (96.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate96.0% null
alert: imbalancetop value is 100.0% of rows
Fig 253.
Top values for origin_ca.
Show data table
Top values for origin_ca (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

product_name_ca categorical

Out[993]:

saturn.columns["product_name_ca"].stats

statvalue
n50
nulls48 (96.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate96.0% null
alert: imbalancetop value is 100.0% of rows
Fig 254.
Top values for product_name_ca.
Show data table
Top values for product_name_ca (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

packaging_text_pt categorical free_text

This column appears to be a Portuguese-language packaging text field, almost certainly intended to carry product label or packaging descriptions. With an 80% null rate and the sole non-null value being an empty string appearing 10 times, the column contains zero usable information across all 50 rows. The effective data-present rate is 0%, making this column entirely empty in practice.

Treatment: Drop this column; it carries no information and all present values are empty strings.

anthropic:default · confidence high
Out[996]:

saturn.columns["packaging_text_pt"].stats

statvalue
n50
nulls40 (80.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate80.0% null
alert: imbalancetop value is 100.0% of rows
Fig 255.
Top values for packaging_text_pt.
Show data table
Top values for packaging_text_pt (1 unique shown, of 1 total).
value count share
(blank) 10 20.0%

origin_bg categorical other

This column ('origin_bg') is a categorical field with 50 rows, but 94% of values are null and the sole non-null value is an empty string appearing 3 times — making it entirely devoid of usable information. Cardinality is 1, entropy is 0, and top_rate is 1.0, confirming complete uniformity across non-null entries. Both alerts (null_rate and imbalance) are triggered, which is expected given the near-total absence of data.

Treatment: Drop this column; it carries zero information with 94% nulls and only empty strings remaining.

anthropic:default · confidence high
Out[999]:

saturn.columns["origin_bg"].stats

statvalue
n50
nulls47 (94.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate94.0% null
alert: imbalancetop value is 100.0% of rows
Fig 256.
Top values for origin_bg.
Show data table
Top values for origin_bg (1 unique shown, of 1 total).
value count share
(blank) 3 6.0%

packaging_text_bg categorical free_text

This column contains Bulgarian-language packaging text for products, but it is almost entirely empty: 94% of the 50 rows are null, and the sole non-null value observed is an empty string appearing 3 times (top_rate 1.0). With cardinality of 1 and entropy of 0.0, the column carries zero information in its current state.

Treatment: Drop from modelling; re-evaluate only if Bulgarian market data is backfilled, otherwise exclude as zero-variance.

anthropic:default · confidence high
Out[1002]:

saturn.columns["packaging_text_bg"].stats

statvalue
n50
nulls47 (94.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate94.0% null
alert: imbalancetop value is 100.0% of rows
Fig 257.
Top values for packaging_text_bg.
Show data table
Top values for packaging_text_bg (1 unique shown, of 1 total).
value count share
(blank) 3 6.0%

origin_pt categorical other

This column is most likely the Portuguese-language origin field (the '_pt' suffix matches the other Portuguese columns such as 'ingredients_text_pt'). It is effectively empty: 80% of its 50 rows are null, and the only non-null value is an empty string appearing 10 times, so the column contains no actual information. With a cardinality of 1 and entropy of 0.0, it is completely invariant. The combination of a high null rate and a sole empty-string value suggests the field was not populated for any product in this sample.

Treatment: Drop — column carries zero information due to 80% nulls and a single empty-string value across all remaining rows.

anthropic:default · confidence high
Out[1005]:

saturn.columns["origin_pt"].stats

statvalue
n50
nulls40 (80.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate80.0% null
alert: imbalancetop value is 100.0% of rows
Fig 258.
Top values for origin_pt.
Show data table
Top values for origin_pt (1 unique shown, of 1 total).
value count share
(blank) 10 20.0%

ingredients_text_with_allergens_pt categorical

Out[1008]:

saturn.columns["ingredients_text_with_allergens_pt"].stats

statvalue
n50
nulls42 (84.0%)
unique4
top_value
top_rate 0.625
cardinality 4
entropy 1.549
entropy_ratio 0.7744
alert: long_tail3 singleton categories
alert: null_rate84.0% null
Fig 259.
Top values for ingredients_text_with_allergens_pt.
Show data table
Top values for ingredients_text_with_allergens_pt (4 unique shown, of 4 total).
value count share
(blank) 5 10.0%
Creme para barrar de AVELAS e cacau 40% (açúcar, gordura de palma, AVELAS (13%), LEITE desnatado em pó (8,7%), cacau magro (7,4%), emulsionantes: lecitinas (SOJA), vanilina), farinha de TRIGO (32,5%), gorduras vegetais (palma, palmiste), açúcar de cana (contém TRIGO) (8,5%), LACTOSE, farelo de TRIGO, LEITE inteiro em pó, mel, levedantes químicos (difosfato dissódico, hidrogenocarbonato de sódio, hidrogenocarbonato de amónio), farinha de CEVADA maltada, cacau magro, sal, extrato em pó de malte de CEVADA e milho, amido de TRIGO, emulsionantes: lecitinas (SOJA), vanilina. 1 2.0%
Farinha de TRIGO, gordura de palma, xarope de glucose, extrato de CEVADA malteada, levedantes (carbonatos de amónio, carbonatos de sódio), sal, OVOS, aroma, agente de tratamento da farinha (METABISSULFITO de sódio). 1 2.0%
Pasta de cacau, açúcar, manteiga de cacau, baunilha. 1 2.0%
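
The two rates reported for this column use different denominators, which is easy to misread: top_rate (0.625) is the blank value's share of the 8 non-null rows, while the share column in the table (10.0%) is computed over all 50 rows. A minimal sketch of that arithmetic follows, using only the counts shown above; the function name is illustrative and not part of saturn.

```python
def top_value_rates(top_count, n_rows, n_nulls):
    """Return (rate among non-null rows, share of all rows) for the top value."""
    non_null = n_rows - n_nulls
    return top_count / non_null, top_count / n_rows


# ingredients_text_with_allergens_pt: n 50, nulls 42, top value seen 5 times
top_rate, share = top_value_rates(top_count=5, n_rows=50, n_nulls=42)
print(f"top_rate={top_rate:.3f}, share={share:.1%}")
```

The imbalance alert on other columns ("top value is 100.0% of rows") appears to use the non-null denominator as well, since it fires on columns where the top value covers only 2% to 20% of all 50 rows.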

product_name_bg categorical

Out[1011]:

saturn.columns["product_name_bg"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value Шоколад 85% какаова маса
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 260.
Top values for product_name_bg.
Show data table
Top values for product_name_bg (3 unique shown, of 3 total).
value count share
Шоколад 85% какаова маса 1 2.0%
Тъмен шоколад 74% какао 1 2.0%
Лешниково-какаов крем 1 2.0%

ingredients_text_sl categorical

Out[1014]:

saturn.columns["ingredients_text_sl"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value Kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (sojin lecitin); ekstrakt vanilije. Lahko vsebuje sledi oreškov (lešniki, mandlji, pistacija) in mleka. Uporabno najmanj do: glej odtis na zadnji strani embalaže.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 261.
Top values for ingredients_text_sl.
Show data table
Top values for ingredients_text_sl (1 unique shown, of 1 total).
value count share
Kakavova masa, manjmasten kakavov prah, kakavovo maslo, sladkor, emulgator: lecitini (sojin lecitin); ekstrakt vanilije. Lahko vsebuje sledi oreškov (lešniki, mandlji, pistacija) in mleka. Uporabno najmanj do: glej odtis na zadnji strani embalaže. 1 2.0%

generic_name_sl categorical

Out[1017]:

saturn.columns["generic_name_sl"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 262.
Top values for generic_name_sl.
Show data table
Top values for generic_name_sl (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

generic_name_sk categorical

Out[1020]:

saturn.columns["generic_name_sk"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 263.
Top values for generic_name_sk.
Show data table
Top values for generic_name_sk (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

product_name_pt categorical

Out[1023]:

saturn.columns["product_name_pt"].stats

statvalue
n50
nulls40 (80.0%)
unique7
top_value
top_rate 0.4
cardinality 7
entropy 2.522
entropy_ratio 0.8983
alert: long_tail6 singleton categories
alert: null_rate80.0% null
Fig 264.
Top values for product_name_pt.
Show data table
Top values for product_name_pt (7 unique shown, of 7 total).
value count share
(blank) 4 8.0%
Cioccolato Fondente 85% Cacao 1 2.0%
Crocantes bolachas com um coração cremoso de Nutella® 1 2.0%
70% Cacao noir intense 1 2.0%
Excellence 70% Cocoa Intense Dark 1 2.0%
Original 1 2.0%
Mix com sultanas e arandos 1 2.0%

lc_imported categorical

Out[1026]:

saturn.columns["lc_imported"].stats

statvalue
n50
nulls42 (84.0%)
unique2
top_value fr
top_rate 0.875
cardinality 2
entropy 0.5436
entropy_ratio 0.5436
alert: null_rate84.0% null
Fig 265.
Top values for lc_imported.
Show data table
Top values for lc_imported (2 unique shown, of 2 total).
value count share
fr 7 14.0%
es 1 2.0%

abbreviated_product_name_fr_imported categorical

Out[1029]:

saturn.columns["abbreviated_product_name_fr_imported"].stats

statvalue
n50
nulls43 (86.0%)
unique7
top_value CRISTALINE Eau De Source 0.5L
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail7 singleton categories
alert: null_rate86.0% null
Fig 266.
Top values for abbreviated_product_name_fr_imported.
Show data table
Top values for abbreviated_product_name_fr_imported (7 unique shown, of 7 total).
value count share
CRISTALINE Eau De Source 0.5L 1 2.0%
Nutella biscuits t22 1 2.0%
Authentique 275g, fr 1 2.0%
Fibres 230g, fr 1 2.0%
ORG Original 175g 1 2.0%
NESTLE DESSERT Noir 205g 1 2.0%
BRIOCHE TRANCHEE BIO 400g 1 2.0%

generic_name_zh categorical

Out[1032]:

saturn.columns["generic_name_zh"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 267.
Top values for generic_name_zh.
Show data table
Top values for generic_name_zh (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

obsolete_imported categorical other

This column appears to be a boolean or flag field. Like the other '_imported' columns in this sample (for example 'lc_imported' and 'owner_imported'), it most likely holds the imported value of the product's 'obsolete' flag rather than marking the column itself as obsolete. It contains only the value '0' across all 7 non-null rows. With an 86% null rate and a cardinality of 1, the column carries no information: entropy is exactly 0.0 and the single observed value covers 100% of non-null records. Both the high null rate and the complete value imbalance are flagged as alerts.

Treatment: Drop for modelling. Zero variance and 86% nulls make this column uninformative in this sample.

anthropic:default · confidence high
Out[1035]:

saturn.columns["obsolete_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 1
top_value 0
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 86.0% null
alert: imbalance top value is 100.0% of rows
Fig 268.
Top values for obsolete_imported.
Top values for obsolete_imported (1 unique shown, of 1 total).
value count share
0 7 14.0%
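
The "drop" treatment recommended for columns like this one can be applied mechanically. The following is a minimal sketch, assuming the sample has been loaded as a list of dicts (one per product) with missing values represented as None. The helper name `constant_sparse_columns` and the 0.8 null-rate threshold are illustrative choices, not part of Saturn.

```python
def constant_sparse_columns(rows, columns, max_null_rate=0.8):
    """Return columns that are mostly null and constant where populated."""
    flagged = []
    for col in columns:
        values = [row.get(col) for row in rows]
        if not values:
            continue
        non_null = [v for v in values if v is not None]
        null_rate = 1 - len(non_null) / len(values)
        # At most one distinct non-null value means zero variance.
        if null_rate >= max_null_rate and len(set(non_null)) <= 1:
            flagged.append(col)
    return flagged


# Toy rows shaped like the profile: 7 rows hold "0", 43 are null.
rows = [
    {"code": str(i), "obsolete_imported": "0" if i < 7 else None}
    for i in range(50)
]
flagged = constant_sparse_columns(rows, ["code", "obsolete_imported"])
cleaned = [{k: v for k, v in row.items() if k not in flagged} for row in rows]
print(flagged)  # ['obsolete_imported']
```

Requiring both conditions keeps sparse but varied columns (such as owners_tags) out of the drop list, since those may still be useful for the rows where they are populated.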

generic_name_fr_imported categorical

Out[1038]:

saturn.columns["generic_name_fr_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 7
top_value Eau De Source
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail 7 singleton categories
alert: null_rate 86.0% null
Fig 269.
Top values for generic_name_fr_imported.
Top values for generic_name_fr_imported (7 unique shown, of 7 total).
value count share
Eau De Source 1 2.0%
Biscuit fourré à la pâte à tartiner aux noisettes et au cacao Nutella® 1 2.0%
Pain croustillant a la farine de seigle 1 2.0%
Pain croustillant à la farine complète de seigle, avoine et sésame. 1 2.0%
Snack salé 1 2.0%
Chocolat noir supérieur 1 2.0%
Brioche tranchée issue de l'agriculture biologique 1 2.0%

owners_tags categorical

Out[1041]:

saturn.columns["owners_tags"].stats

stat value
n 50
nulls 43 (86.0%)
unique 6
top_value org-barilla-france-sa
top_rate 0.2857
cardinality 6
entropy 2.522
entropy_ratio 0.9755
alert: long_tail 5 singleton categories
alert: null_rate 86.0% null
Fig 270.
Top values for owners_tags.
Top values for owners_tags (6 unique shown, of 6 total).
value count share
org-barilla-france-sa 2 4.0%
org-gie-sources-alma 1 2.0%
org-ferrero-france-commerciale 1 2.0%
org-kellogg-s 1 2.0%
org-nestle-france 1 2.0%
org-la-boulangere-co 1 2.0%

owner_imported categorical

Out[1044]:

saturn.columns["owner_imported"].stats

stat value
n 50
nulls 44 (88.0%)
unique 5
top_value org-barilla-france-sa
top_rate 0.3333
cardinality 5
entropy 2.252
entropy_ratio 0.9697
alert: long_tail 4 singleton categories
alert: null_rate 88.0% null
Fig 271.
Top values for owner_imported.
Top values for owner_imported (5 unique shown, of 5 total).
value count share
org-barilla-france-sa 2 4.0%
org-gie-sources-alma 1 2.0%
org-ferrero-france-commerciale 1 2.0%
org-nestle-france 1 2.0%
org-la-boulangere-co 1 2.0%

customer_service categorical

Out[1047]:

saturn.columns["customer_service"].stats

statvalue
n50
nulls43 (86.0%)
unique6
top_value Service Consommateurs, : www.wasa.com/fr-fr/contact (depuis la France), www.wasa.com/fr-be/contact (depuis la Belgique)
top_rate 0.2857
cardinality 6
entropy 2.522
entropy_ratio 0.9755
alert: long_tail5 singleton categories
alert: null_rate86.0% null
Fig 272.
Top values for customer_service.
Top values for customer_service (6 unique shown, of 6 total).
valuecountshare
Service Consommateurs, : www.wasa.com/fr-fr/contact (depuis la France), www.wasa.com/fr-be/contact (depuis la Belgique)24.0%
Service Consommateurs Cristaline, 70 avenue des Sources 03270 SAINT YORRE12.0%
FERRERO FRANCE COMMERCIALE - Service Consommateurs, CS 90058 - 76136 MONT SAINT AIGNAN Cedex12.0%
Service Conseil Consommateurs, Kellogg's Produits Alimentaires S.A.S. - Immeuble Neptune - 1 rue Galilée 93160 Noisy-le-Grand (France)12.0%
Nestlé France, BP 900 Noisiel 77446 Marne la Vallée Cedex 212.0%
Service consommateurs La Boulangère & Co, La Boulangère & Co 1 rue du petit bocage CS 40 201 85140 ESSARTS12.0%

ingredients_text_zh_debug_tags unknown

Out[1050]:

saturn.columns["ingredients_text_zh_debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

countries_imported categorical

Out[1052]:

saturn.columns["countries_imported"].stats

stat value
n 50
nulls 42 (84.0%)
unique 2
top_value France
top_rate 0.875
cardinality 2
entropy 0.5436
entropy_ratio 0.5436
alert: null_rate 84.0% null
Fig 273.
Top values for countries_imported.
Top values for countries_imported (2 unique shown, of 2 total).
value count share
France 7 14.0%
España 1 2.0%

data_sources_imported categorical

Out[1055]:

saturn.columns["data_sources_imported"].stats

statvalue
n50
nulls42 (84.0%)
unique8
top_value Producers, Producer - gie-sources-alma, Database - Equadis, Database - GDSN, Databases, Producers, Producer - gie-sources-alma
top_rate 0.125
cardinality 8
entropy 3
entropy_ratio 1
alert: long_tail8 singleton categories
alert: null_rate84.0% null
Fig 274.
Top values for data_sources_imported.
Top values for data_sources_imported (8 unique shown, of 8 total).
valuecountshare
Producers, Producer - gie-sources-alma, Database - Equadis, Database - GDSN, Databases, Producers, Producer - gie-sources-alma12.0%
Producers, Producer - ferrero-france-commerciale, Database - Equadis, Database - GDSN, Databases, Producers, Producer - ferrero-france-commerciale12.0%
Database - Equadis, Database - GDSN, Databases, Producers, Producer - barilla-france-sa, Producers, Producer - barilla-france-sa12.0%
Apps, app-elcoco12.0%
Producers, Producer - barilla-france-sa, Database - Equadis, Database - GDSN, Databases, Producers, Producer - barilla-france-sa12.0%
Database - CodeOnline, Database - GDSN, Databases12.0%
Database - Equadis, Database - GDSN, Databases, Producers, Producer - nestle-france, Producers, Producer - nestle-france12.0%
Producers, Producer - la-boulangere-co, Database - Equadis, Database - GDSN, Databases, Producers, Producer - la-boulangere-co12.0%

product_name_zh categorical

Out[1058]:

saturn.columns["product_name_zh"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 275.
Top values for product_name_zh.
Top values for product_name_zh (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

categories_imported categorical

Out[1061]:

saturn.columns["categories_imported"].stats

statvalue
n50
nulls44 (88.0%)
unique5
top_value Snacks, Snacks salés, Amuse-gueules, Chips et frites, Chips
top_rate 0.3333
cardinality 5
entropy 2.252
entropy_ratio 0.9697
alert: long_tail4 singleton categories
alert: null_rate88.0% null
Fig 276.
Top values for categories_imported.
Top values for categories_imported (5 unique shown, of 5 total).
valuecountshare
Snacks, Snacks salés, Amuse-gueules, Chips et frites, Chips24.0%
Boissons et préparations de boissons, Boissons, Eaux, Eaux de sources12.0%
Snacks, Snacks sucrés, Biscuits et gâteaux, Biscuits sucrés & biscuits apéritifs, Biscuits, en:Biscuits/Cookies (Shelf Stable)12.0%
Snacks, Snacks sucrés, Cacao et dérivés, Chocolats, Chocolats noirs, Chocolat noir pâtissier en tablette à 40% de cacao minimum12.0%
Snacks, Snacks sucrés, en:Sweet pastries and pies, Viennoiseries12.0%

quantity_imported categorical

Out[1064]:

saturn.columns["quantity_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 7
top_value 500 ml
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail 7 singleton categories
alert: null_rate 86.0% null
Fig 277.
Top values for quantity_imported.
Top values for quantity_imported (7 unique shown, of 7 total).
value count share
500 ml 1 2.0%
304 g 1 2.0%
275 g 1 2.0%
230 g 1 2.0%
175 g 1 2.0%
205 g 1 2.0%
400 g 1 2.0%

ingredients_text_zh categorical

Out[1067]:

saturn.columns["ingredients_text_zh"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 278.
Top values for ingredients_text_zh.
Top values for ingredients_text_zh (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

emb_code categorical

Out[1070]:

saturn.columns["emb_code"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value EMB 44068 A
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 279.
Top values for emb_code.
Top values for emb_code (1 unique shown, of 1 total).
value count share
EMB 44068 A 1 2.0%

origins_fr categorical

Out[1073]:

saturn.columns["origins_fr"].stats

statvalue
n50
nulls48 (96.0%)
unique2
top_value Chambon-la-Forêt,France,Cairanne,Provence-Alpes-Côte d'Azur,Vaucluse,Italie,Source Sainte Cécile,Source Ofélia,Source Éléonore,Source Emma,Source Éléna
top_rate 0.5
cardinality 2
entropy 1
entropy_ratio 1
alert: long_tail2 singleton categories
alert: null_rate96.0% null
Fig 280.
Top values for origins_fr.
Top values for origins_fr (2 unique shown, of 2 total).
valuecountshare
Chambon-la-Forêt,France,Cairanne,Provence-Alpes-Côte d'Azur,Vaucluse,Italie,Source Sainte Cécile,Source Ofélia,Source Éléonore,Source Emma,Source Éléna12.0%
12.0%

nutrition_data_prepared_per_imported categorical metadata

This column captures the unit basis for imported nutrition data on the prepared product (e.g., 'per 100g'), and it is effectively a constant: the only observed value is '100g' across all 7 non-null rows. With an 86% null rate and a cardinality of 1, it carries no discriminative information. The combination of high missingness and zero entropy suggests the field was either sparsely populated at ingestion or serves as a fixed schema placeholder.

Treatment: Drop before modelling; the column is a zero-variance constant with 86% nulls and provides no analytical value in this sample.

anthropic:default · confidence high
Out[1076]:

saturn.columns["nutrition_data_prepared_per_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 1
top_value 100g
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 86.0% null
alert: imbalance top value is 100.0% of rows
Fig 281.
Top values for nutrition_data_prepared_per_imported.
Top values for nutrition_data_prepared_per_imported (1 unique shown, of 1 total).
value count share
100g 7 14.0%

product_name_zh_debug_tags unknown

Out[1079]:

saturn.columns["product_name_zh_debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

sources_fields unknown

Out[1081]:

saturn.columns["sources_fields"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

customer_service_fr categorical

Out[1083]:

saturn.columns["customer_service_fr"].stats

statvalue
n50
nulls43 (86.0%)
unique6
top_value Service Consommateurs, : www.wasa.com/fr-fr/contact (depuis la France), www.wasa.com/fr-be/contact (depuis la Belgique)
top_rate 0.2857
cardinality 6
entropy 2.522
entropy_ratio 0.9755
alert: long_tail5 singleton categories
alert: null_rate86.0% null
Fig 282.
Top values for customer_service_fr.
Top values for customer_service_fr (6 unique shown, of 6 total).
valuecountshare
Service Consommateurs, : www.wasa.com/fr-fr/contact (depuis la France), www.wasa.com/fr-be/contact (depuis la Belgique)24.0%
Service Consommateurs Cristaline, 70 avenue des Sources 03270 SAINT YORRE12.0%
FERRERO FRANCE COMMERCIALE - Service Consommateurs, CS 90058 - 76136 MONT SAINT AIGNAN Cedex12.0%
Service Conseil Consommateurs, Kellogg's Produits Alimentaires S.A.S. - Immeuble Neptune - 1 rue Galilée 93160 Noisy-le-Grand (France)12.0%
Nestlé France, 34-40 rue Guynemer 92130 Issy-les-Moulineaux12.0%
Service consommateurs La Boulangère & Co, La Boulangère & Co 1 rue du petit bocage CS 40 201 85140 ESSARTS12.0%

nutrition_data_per_imported categorical metadata

This column represents the unit basis for imported nutrition data, and every non-null value is '100g', giving it a cardinality of 1 and an entropy of 0. With an 84% null rate across 50 rows, only 8 rows carry a value at all. The combination of high nullity and zero variance means this column provides no discriminating information in this sample.

Treatment: Drop. An 84% null rate with a single constant value ('100g') offers no predictive or analytical signal.

anthropic:default · confidence high
Out[1086]:

saturn.columns["nutrition_data_per_imported"].stats

stat value
n 50
nulls 42 (84.0%)
unique 1
top_value 100g
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 84.0% null
alert: imbalance top value is 100.0% of rows
Fig 283.
Top values for nutrition_data_per_imported.
Top values for nutrition_data_per_imported (1 unique shown, of 1 total).
value count share
100g 8 16.0%
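
The stats block and the value table use different denominators, which is easy to miss: top_rate of 1 sits next to a share of 16.0% for the same value. A short worked sketch follows, assuming (from the reported numbers, not from Saturn's documentation) that top_rate divides by non-null rows while share divides by all rows. `top_value_rates` is a hypothetical helper name.

```python
def top_value_rates(n_rows, counts):
    """Summarise null rate, top_rate (non-null base) and share (all-rows base)."""
    non_null = sum(counts.values())
    top_value, top_count = max(counts.items(), key=lambda item: item[1])
    return {
        "nulls": n_rows - non_null,
        "null_rate": (n_rows - non_null) / n_rows,
        "top_value": top_value,
        "top_rate": top_count / non_null,  # denominator: non-null rows
        "share": top_count / n_rows,       # denominator: all rows
    }


# Counts from the nutrition_data_per_imported table: '100g' x8 out of 50 rows.
summary = top_value_rates(50, {"100g": 8})
print(summary)
```

Under this reading, the imbalance alert's "100.0% of rows" refers to non-null rows only; across all 50 rows the value appears in 8.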

owner categorical

Out[1089]:

saturn.columns["owner"].stats

stat value
n 50
nulls 43 (86.0%)
unique 6
top_value org-barilla-france-sa
top_rate 0.2857
cardinality 6
entropy 2.522
entropy_ratio 0.9755
alert: long_tail 5 singleton categories
alert: null_rate 86.0% null
Fig 284.
Top values for owner.
Top values for owner (6 unique shown, of 6 total).
value count share
org-barilla-france-sa 2 4.0%
org-gie-sources-alma 1 2.0%
org-ferrero-france-commerciale 1 2.0%
org-kellogg-s 1 2.0%
org-nestle-france 1 2.0%
org-la-boulangere-co 1 2.0%

abbreviated_product_name categorical

Out[1092]:

saturn.columns["abbreviated_product_name"].stats

stat value
n 50
nulls 43 (86.0%)
unique 7
top_value CRISTALINE Eau De Source 0.5L
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail 7 singleton categories
alert: null_rate 86.0% null
Fig 285.
Top values for abbreviated_product_name.
Top values for abbreviated_product_name (7 unique shown, of 7 total).
value count share
CRISTALINE Eau De Source 0.5L 1 2.0%
Nutella biscuits t22 1 2.0%
Authentique 275g, fr 1 2.0%
Fibres 230g, fr 1 2.0%
ORG Original 175g 1 2.0%
NESTLE DESSERT Noir 205g 1 2.0%
BRIOCHE TRANCHEE BIO 400g 1 2.0%

conservation_conditions_fr categorical

Out[1095]:

saturn.columns["conservation_conditions_fr"].stats

statvalue
n50
nulls43 (86.0%)
unique7
top_value A conserver de préférence à l'abri du soleil, dans un endroit propre, frais et sans odeur.
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail7 singleton categories
alert: null_rate86.0% null
Fig 286.
Top values for conservation_conditions_fr.
Top values for conservation_conditions_fr (7 unique shown, of 7 total).
valuecountshare
A conserver de préférence à l'abri du soleil, dans un endroit propre, frais et sans odeur.12.0%
A conserver au sec et à l'abri de la chaleur. Ne pas mettre au réfrigérateur.12.0%
A conserver dans un endroit sec à l'abri de la lumière.12.0%
Conserver dans un endroit frais et sec.12.0%
À conserver dans un endroit sec12.0%
A conserver au frais et au sec.12.0%
À conserver dans son emballage fermé, dans un endroit sec, à température ambiante.12.0%

brands_imported categorical

Out[1098]:

saturn.columns["brands_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 6
top_value Wasa
top_rate 0.2857
cardinality 6
entropy 2.522
entropy_ratio 0.9755
alert: long_tail 5 singleton categories
alert: null_rate 86.0% null
Fig 287.
Top values for brands_imported.
Top values for brands_imported (6 unique shown, of 6 total).
value count share
Wasa 2 4.0%
Cristaline 1 2.0%
Nutella biscuits 1 2.0%
Pringles 1 2.0%
NESTLE DESSERT,Tablettes 1 2.0%
La boulangere 1 2.0%

owner_fields unknown

Out[1101]:

saturn.columns["owner_fields"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

conservation_conditions_fr_imported categorical

Out[1103]:

saturn.columns["conservation_conditions_fr_imported"].stats

statvalue
n50
nulls43 (86.0%)
unique7
top_value A conserver de préférence à l'abri du soleil, dans un endroit propre, frais et sans odeur.
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail7 singleton categories
alert: null_rate86.0% null
Fig 288.
Top values for conservation_conditions_fr_imported.
Top values for conservation_conditions_fr_imported (7 unique shown, of 7 total).
valuecountshare
A conserver de préférence à l'abri du soleil, dans un endroit propre, frais et sans odeur.12.0%
A conserver au sec et à l'abri de la chaleur. Ne pas mettre au réfrigérateur.12.0%
A conserver dans un endroit sec à l'abri de la lumière.12.0%
Conserver dans un endroit frais et sec.12.0%
À conserver dans un endroit sec12.0%
A conserver au frais et au sec.12.0%
À conserver dans son emballage fermé, dans un endroit sec, à température ambiante.12.0%

origin_fr_imported categorical

Out[1106]:

saturn.columns["origin_fr_imported"].stats

statvalue
n50
nulls48 (96.0%)
unique2
top_value France
top_rate 0.5
cardinality 2
entropy 1
entropy_ratio 1
alert: long_tail2 singleton categories
alert: null_rate96.0% null
Fig 289.
Top values for origin_fr_imported.
Top values for origin_fr_imported (2 unique shown, of 2 total).
valuecountshare
France12.0%
Pâte de cacao (Afrique de l'Ouest, Amérique du Sud) Afrique, Europe, Madagascar, Amérique du Sud, Afrique de l'Ouest12.0%

customer_service_fr_imported categorical

Out[1109]:

saturn.columns["customer_service_fr_imported"].stats

statvalue
n50
nulls43 (86.0%)
unique6
top_value Service Consommateurs, : www.wasa.com/fr-fr/contact (depuis la France), www.wasa.com/fr-be/contact (depuis la Belgique)
top_rate 0.2857
cardinality 6
entropy 2.522
entropy_ratio 0.9755
alert: long_tail5 singleton categories
alert: null_rate86.0% null
Fig 290.
Top values for customer_service_fr_imported.
Top values for customer_service_fr_imported (6 unique shown, of 6 total).
valuecountshare
Service Consommateurs, : www.wasa.com/fr-fr/contact (depuis la France), www.wasa.com/fr-be/contact (depuis la Belgique)24.0%
Service Consommateurs Cristaline, 70 avenue des Sources 03270 SAINT YORRE12.0%
FERRERO FRANCE COMMERCIALE - Service Consommateurs, CS 90058 - 76136 MONT SAINT AIGNAN Cedex12.0%
Service Conseil Consommateurs, Kellogg's Produits Alimentaires S.A.S. - Immeuble Neptune - 1 rue Galilée 93160 Noisy-le-Grand (France)12.0%
Nestlé France, 34-40 rue Guynemer 92130 Issy-les-Moulineaux12.0%
Service consommateurs La Boulangère & Co, La Boulangère & Co 1 rue du petit bocage CS 40 201 85140 ESSARTS12.0%

generic_name_zh_debug_tags unknown

Out[1112]:

saturn.columns["generic_name_zh_debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

product_name_fr_imported categorical

Out[1114]:

saturn.columns["product_name_fr_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 7
top_value CRISTALINE Eau De Source 0.5L
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail 7 singleton categories
alert: null_rate 86.0% null
Fig 291.
Top values for product_name_fr_imported.
Top values for product_name_fr_imported (7 unique shown, of 7 total).
value count share
CRISTALINE Eau De Source 0.5L 1 2.0%
Biscuits Nutella x22 biscuits fourrés - 304g 1 2.0%
Wasa tartine croustillante authentique au seigle 275g 1 2.0%
Wasa tartine croustillante fibres 230g 1 2.0%
Chips Pringles Original 1 2.0%
NESTLE DESSERT Noir 205g 1 2.0%
Brioche Tranchée Bio 400g 1 2.0%

lang_imported categorical metadata

This column records the language code attached to imported records, and every non-null value in the 50-row sample is 'fr' (French): a single unique value with zero entropy. With an 86% null rate, only 7 of 50 rows carry a value, so the column is mostly empty and constant where populated. Both the high null rate and the complete imbalance are flagged as alerts, suggesting this field is partially populated metadata from an import pipeline rather than a reliable feature.

Treatment: Drop or impute cautiously. With 86% nulls and zero variance the column is uninformative for modelling; investigate the import pipeline to find out why values are absent.

anthropic:default · confidence high
Out[1117]:

saturn.columns["lang_imported"].stats

stat value
n 50
nulls 43 (86.0%)
unique 1
top_value fr
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 86.0% null
alert: imbalance top value is 100.0% of rows
Fig 292.
Top values for lang_imported.
Top values for lang_imported (1 unique shown, of 1 total).
value count share
fr 7 14.0%
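
One way to start investigating why import fields are absent is to check whether the `_imported` columns go missing together on the same rows. The sketch below counts present/absent patterns across a set of columns. It assumes rows are dicts with None for missing values. The helper `missingness_patterns` is hypothetical, and the toy rows are invented to match only the column-level counts in this profile (7 and 8 non-null); which rows actually overlap in the real sample is not known from the stats.

```python
from collections import Counter


def missingness_patterns(rows, columns):
    """Count each present/absent pattern across the given columns."""
    patterns = Counter()
    for row in rows:
        pattern = tuple(row.get(col) is not None for col in columns)
        patterns[pattern] += 1
    return patterns


# Toy data (an assumption): 7 rows with both fields, 1 row with only
# lc_imported, 42 rows with neither.
columns = ["lang_imported", "lc_imported"]
rows = (
    [{"lang_imported": "fr", "lc_imported": "fr"} for _ in range(7)]
    + [{"lang_imported": None, "lc_imported": "es"}]
    + [{"lang_imported": None, "lc_imported": None} for _ in range(42)]
)

patterns = missingness_patterns(rows, columns)
for pattern, count in patterns.most_common():
    print(dict(zip(columns, pattern)), count)
```

If nearly all rows fall into the all-present or all-absent patterns, the nulls most likely mark products that never went through the import path, rather than fields lost during import.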

abbreviated_product_name_fr categorical

Out[1120]:

saturn.columns["abbreviated_product_name_fr"].stats

stat value
n 50
nulls 43 (86.0%)
unique 7
top_value CRISTALINE Eau De Source 0.5L
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail 7 singleton categories
alert: null_rate 86.0% null
Fig 293.
Top values for abbreviated_product_name_fr.
Top values for abbreviated_product_name_fr (7 unique shown, of 7 total).
value count share
CRISTALINE Eau De Source 0.5L 1 2.0%
Nutella biscuits t22 1 2.0%
Authentique 275g, fr 1 2.0%
Fibres 230g, fr 1 2.0%
ORG Original 175g 1 2.0%
NESTLE DESSERT Noir 205g 1 2.0%
BRIOCHE TRANCHEE BIO 400g 1 2.0%

ingredients_text_fr_imported categorical

Out[1123]:

saturn.columns["ingredients_text_fr_imported"].stats

statvalue
n50
nulls43 (86.0%)
unique7
top_value Eau de Source
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail7 singleton categories
alert: null_rate86.0% null
Fig 294.
Top values for ingredients_text_fr_imported.
Top values for ingredients_text_fr_imported (7 unique shown, of 7 total).
valuecountshare
Eau de Source12.0%
Pâte à tartiner aux NOISETTES et au cacao 40% (sucre, huile de palme, NOISETTES 13%, LAIT écrémé en poudre 8,7%, cacao maigre 7,4%, émulsifiants : lécithines [SOJA] ; vanilline), farine de FROMENT 32%, graisses végétales (palme, palmiste), sucre de canne 8,5%, LACTOSE, son de BLE, LAIT en poudre, extrait en poudre de malt d'ORGE et de maïs, miel, poudres à lever (disphosphate disodique, carbonate acide d'ammonium, carbonate acide de sodium), cacao maigre, sel, amidon de FROMENT, farine d'ORGE malté, émulsifiants : lécithines [SOJA] ; vanilline.12.0%
Farine complète de SEIGLE (77 g*), farine de SEIGLE (28 g*), levure, sel. Peut contenir des traces de LUPIN, LAIT, MOUTARDE, GRAINES DE SÉSAME et SOJA. *en g pour 100 g de produit.12.0%
Farine complète de SEIGLE 59 g*, son de BLÉ 27 g*, flocons d'AVOINE 12 g*, GRAINES DE SÉSAME 7,0 g*, germe de BLÉ, sel. *en g pour 100 g de produit fini. Peut contenir des traces de LUPIN, LAIT, MOUTARDE et SOJA.12.0%
Pommes de terre déshydratées, huiles végétales (tournesol, maïs), farine de riz, amidon de BLÉ, farine de maïs, émulsifiant (E471), maltodextrine, sel, extrait de levure, levure en poudre, colorant (rocou).12.0%
Sucre, pâte de cacao (Afrique de l'Ouest, Amérique du Sud), beurre de cacao, émulsifiant (lécithine), arôme naturel de vanille de Madagascar. Cacao : 53% minimum. Peut contenir : LAIT, FRUITS A COQUE.12.0%
Farine de BLÉ*/** 54%, ŒUFS entiers*/** 14%, sucre de canne roux*, huile de tournesol*/** 8%, levain* (eau, farines de BLÉ*/** 2% et de SEIGLE*, levures), GLUTEN DE BLÉ*, sel, levure, arôme naturel de vanille* (contient alcool*), extrait de vanille*, levure désactivée. Traces éventuelles de lait, moutarde et soja. *Ingrédients issus de l'Agriculture Biologique. **Ingrédients issus du commerce équitable français.12.0%

conservation_conditions categorical

Out[1126]:

saturn.columns["conservation_conditions"].stats

statvalue
n50
nulls43 (86.0%)
unique7
top_value A conserver de préférence à l'abri du soleil, dans un endroit propre, frais et sans odeur.
top_rate 0.1429
cardinality 7
entropy 2.807
entropy_ratio 1
alert: long_tail7 singleton categories
alert: null_rate86.0% null
Fig 295.
Top values for conservation_conditions.
Top values for conservation_conditions (7 unique shown, of 7 total).
valuecountshare
A conserver de préférence à l'abri du soleil, dans un endroit propre, frais et sans odeur.12.0%
A conserver au sec et à l'abri de la chaleur. Ne pas mettre au réfrigérateur.12.0%
A conserver dans un endroit sec à l'abri de la lumière.12.0%
Conserver dans un endroit frais et sec.12.0%
À conserver dans un endroit sec12.0%
A conserver au frais et au sec.12.0%
À conserver dans son emballage fermé, dans un endroit sec, à température ambiante.12.0%

nova_group_error categorical

Out[1129]:

saturn.columns["nova_group_error"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value too_many_unknown_ingredients
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 296.
Top values for nova_group_error.
Top values for nova_group_error (1 unique shown, of 1 total).
value count share
too_many_unknown_ingredients 2 4.0%

producer_version_id_imported categorical

Out[1132]:

saturn.columns["producer_version_id_imported"].stats

stat value
n 50
nulls 46 (92.0%)
unique 3
top_value 1
top_rate 0.5
cardinality 3
entropy 1.5
entropy_ratio 0.9464
alert: long_tail 2 singleton categories
alert: null_rate 92.0% null
Fig 297.
Top values for producer_version_id_imported.
Top values for producer_version_id_imported (3 unique shown, of 3 total).
value count share
1 2 4.0%
2021-01-25T13:53:49+01:00 1 2.0%
44217063 1 2.0%

ingredients_text_de_ocr_1648990410 categorical

Out[1135]:

saturn.columns["ingredients_text_de_ocr_1648990410"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value Kekse mit Nuss - Nugat- Creme - Füllung: Nuss-Nugat-Creme 40% (Zucker, Palmöl, HASELNÜSSE Magermilchpulver, fettarmer Kakao, Emulgator Lecithine (S0JA), Vanillin, Weizenmehl, pflanzliche Fette ( Palm, Palmkern), Rohrzucker, Milchzucker, Weizenkleie, VOLLMILCHPULVER, GERSTENMALZ-und Maisextraktpulver, Honig. Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, Weizenstärke, Gerstenmalzmehl, Emulgator Lecithine (Soja), Vanillin
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 298.
Top values for ingredients_text_de_ocr_1648990410.
Top values for ingredients_text_de_ocr_1648990410 (1 unique shown, of 1 total).
valuecountshare
Kekse mit Nuss - Nugat- Creme - Füllung: Nuss-Nugat-Creme 40% (Zucker, Palmöl, HASELNÜSSE Magermilchpulver, fettarmer Kakao, Emulgator Lecithine (S0JA), Vanillin, Weizenmehl, pflanzliche Fette ( Palm, Palmkern), Rohrzucker, Milchzucker, Weizenkleie, VOLLMILCHPULVER, GERSTENMALZ-und Maisextraktpulver, Honig. Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, Weizenstärke, Gerstenmalzmehl, Emulgator Lecithine (Soja), Vanillin12.0%

product_name_ro categorical

Out[1138]:

saturn.columns["product_name_ro"].stats

stat value
n 50
nulls 48 (96.0%)
unique 2
top_value (blank)
top_rate 0.5
cardinality 2
entropy 1
entropy_ratio 1
alert: long_tail 2 singleton categories
alert: null_rate 96.0% null
Fig 299.
Top values for product_name_ro.
Top values for product_name_ro (2 unique shown, of 2 total).
value count share
(blank) 1 2.0%
Sour Cream & Onion 1 2.0%

packaging_imported categorical

Out[1141]:

saturn.columns["packaging_imported"].stats

stat value
n 50
nulls 46 (92.0%)
unique 2
top_value Enveloppe
top_rate 0.75
cardinality 2
entropy 0.8113
entropy_ratio 0.8113
alert: null_rate 92.0% null
Fig 300.
Top values for packaging_imported.
Top values for packaging_imported (2 unique shown, of 2 total).
value count share
Enveloppe 3 6.0%
Boîte, Barquette 1 2.0%

ingredients_text_de_ocr_1648990410_result categorical

Out[1144]:

saturn.columns["ingredients_text_de_ocr_1648990410_result"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value Kekse mit Nuss - Nugat - Creme - Füllung: Nuss-Nugat-Creme 40% (Zucker, Palmöl, HASELNÜSSE Magermilchpulver, fettarmer Kakao, Emulgator Lecithine (S0JA), Vanillin, Weizenmehl, pflanzliche Fette ( Palm, Palmkern), Rohrzucker, Milchzucker, Weizenkleie, VOLLMILCHPULVER, GERSTENMALZ-und Maisextraktpulver, Honig. Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, Weizenstärke, Gerstenmalzmehl, Emulgator Lecithine (Soja), Vanillin
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 301.
Top values for ingredients_text_de_ocr_1648990410_result.
Top values for ingredients_text_de_ocr_1648990410_result (1 unique shown, of 1 total).
valuecountshare
Kekse mit Nuss - Nugat - Creme - Füllung: Nuss-Nugat-Creme 40% (Zucker, Palmöl, HASELNÜSSE Magermilchpulver, fettarmer Kakao, Emulgator Lecithine (S0JA), Vanillin, Weizenmehl, pflanzliche Fette ( Palm, Palmkern), Rohrzucker, Milchzucker, Weizenkleie, VOLLMILCHPULVER, GERSTENMALZ-und Maisextraktpulver, Honig. Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, Weizenstärke, Gerstenmalzmehl, Emulgator Lecithine (Soja), Vanillin12.0%

ingredients_text_ro categorical

Out[1147]:

saturn.columns["ingredients_text_ro"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 302.
Top values for ingredients_text_ro.
Top values for ingredients_text_ro (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

producer_version_id categorical

Out[1150]:

saturn.columns["producer_version_id"].stats

stat value
n 50
nulls 46 (92.0%)
unique 3
top_value 1
top_rate 0.5
cardinality 3
entropy 1.5
entropy_ratio 0.9464
alert: long_tail 2 singleton categories
alert: null_rate 92.0% null
Fig 303.
Top values for producer_version_id.
Top values for producer_version_id (3 unique shown, of 3 total).
value count share
1 2 4.0%
2021-01-25T13:53:49+01:00 1 2.0%
44217063 1 2.0%

labels_imported categorical

Out[1153]:

saturn.columns["labels_imported"].stats

stat value
n 50
nulls 45 (90.0%)
unique 3
top_value Végétarien
top_rate 0.6
cardinality 3
entropy 1.371
entropy_ratio 0.865
alert: long_tail 2 singleton categories
alert: null_rate 90.0% null
Fig 304.
Top values for labels_imported.
Top values for labels_imported (3 unique shown, of 3 total).
value count share
Végétarien 3 6.0%
Point Vert, Rainforest Alliance, Triman 1 2.0%
Commerce équitable, Bio, Bio européen, en:organic 1 2.0%

allergens_imported categorical

Out[1156]:

saturn.columns["allergens_imported"].stats

stat value
n 50
nulls 45 (90.0%)
unique 4
top_value Gluten
top_rate 0.4
cardinality 4
entropy 1.922
entropy_ratio 0.961
alert: long_tail 3 singleton categories
alert: null_rate 90.0% null
Fig 305.
Top values for allergens_imported.
Top values for allergens_imported (4 unique shown, of 4 total).
value count share
Gluten 2 4.0%
Gluten, Lait, Fruits à coque, Soja, Gs1:T4078:ML 1 2.0%
Gluten, Graines de sésame 1 2.0%
Œufs, Gluten 1 2.0%

origin_ro categorical

Out[1159]:

saturn.columns["origin_ro"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (blank)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 306.
Top values for origin_ro.
Top values for origin_ro (1 unique shown, of 1 total).
value count share
(blank) 2 4.0%

no_nutrition_data_imported categorical feature

This column is a boolean flag indicating whether nutrition data was absent for a record. It has a 92% null rate across 50 rows, and all 4 non-null rows hold the value 'false', giving it zero entropy and a cardinality of 1. The high null rate combined with complete uniformity among non-nulls means this column carries no predictive signal in this sample.

Treatment: Drop. Zero variance and 92% nulls make this column useless for modelling or analysis.

anthropic:default · confidence high
Out[1162]:

saturn.columns["no_nutrition_data_imported"].stats

stat              value
n                 50
nulls             46 (92.0%)
unique            1
top_value         false
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  92.0% null
alert: imbalance  top value is 100.0% of rows
Fig 307. Top values for no_nutrition_data_imported.
Top values for no_nutrition_data_imported (1 unique shown, of 1 total).
value  count  share
false  4      8.0%
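
The drop recommendation for no_nutrition_data_imported rests on two conditions: a high null rate and at most one distinct non-null value. A minimal sketch of that check follows. The helper name `is_uninformative` and the 0.9 threshold are assumptions for illustration, not saturn settings.

```python
def is_uninformative(values, null_threshold=0.9):
    """True when the column is mostly null and has at most one distinct non-null value."""
    if not values:
        return True
    non_null = [v for v in values if v is not None]
    null_rate = (len(values) - len(non_null)) / len(values)
    return null_rate >= null_threshold and len(set(non_null)) <= 1


# Shape of no_nutrition_data_imported in this sample: 46 nulls, 4 x 'false'.
no_nutrition_data_imported = [None] * 46 + ["false"] * 4
print(is_uninformative(no_nutrition_data_imported))  # True
```

A column such as allergens_imported (90% null but 4 distinct values) would not be flagged by this rule, so it needs a separate judgment.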

serving_size_imported categorical

Out[1165]:

saturn.columns["serving_size_imported"].stats

stat              value
n                 50
nulls             44 (88.0%)
unique            6
top_value         13.8 g (1)
top_rate          0.1667
cardinality       6
entropy           2.585
entropy_ratio     1
alert: long_tail  6 singleton categories
alert: null_rate  88.0% null
Fig 308. Top values for serving_size_imported.
Top values for serving_size_imported (6 unique shown, of 6 total).
value  count  share
13.8 g (1)  1  2.0%
11.4 g (1 tranche)  1  2.0%
10 g (1 tranche)  1  2.0%
30 g  1  2.0%
25.6 g (5 carrés (25,6 g))  1  2.0%
26.7 g (1 tranche de 26.7 g environ)  1  2.0%
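
Every serving_size_imported value is a singleton because the field is free text, yet all six observed values begin with a quantity in grams. One possible way to recover a numeric column is sketched below. It assumes the leading number is always the serving weight in grams, which holds for these six values but is not guaranteed for the full database. `serving_grams` is a hypothetical helper.

```python
import re

# Leading number (dot or comma decimal) followed by the unit "g".
_GRAMS = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*g\b")


def serving_grams(text):
    """Return the leading gram quantity as a float, or None if the text does not start with one."""
    match = _GRAMS.match(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))


# The six values observed in this sample.
serving_sizes = [
    "13.8 g (1)",
    "11.4 g (1 tranche)",
    "10 g (1 tranche)",
    "30 g",
    "25.6 g (5 carrés (25,6 g))",
    "26.7 g (1 tranche de 26.7 g environ)",
]
grams = [serving_grams(s) for s in serving_sizes]
print(grams)  # [13.8, 11.4, 10.0, 30.0, 25.6, 26.7]
```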

generic_name_ro categorical

Out[1168]:

saturn.columns["generic_name_ro"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 309. Top values for generic_name_ro.
Top values for generic_name_ro (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

ingredients_text_de_ocr_1648897071 categorical

Out[1171]:

saturn.columns["ingredients_text_de_ocr_1648897071"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value Nuss-Nougat-Creme 40% (Zucker, Palmöl, _Haselnüsse_ 13%, _Magermilchpulver_ 8,7%, fettarmer Kakao 7,4%, Emulgator Lecithine (_Soja_), Vanillin), _Weizenmehl_ 32,5%, pflanzliche Fette (Palm, Palmkern), Rohrzucker 8,5% (enthält _Weizen_), _Milchzucker_, _Weizenkleie_, _Vollmilchpulver_, _Gerstenmalz_- und Maisextraktpulver, Honig, Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, _Weizenstärke_, _Gerstenmalzmehl_, Emulgator Lecithine (_Soja_), Vanillin
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 310.
Top values for ingredients_text_de_ocr_1648897071.
Show data table
Top values for ingredients_text_de_ocr_1648897071 (1 unique shown, of 1 total).
valuecountshare
Nuss-Nougat-Creme 40% (Zucker, Palmöl, _Haselnüsse_ 13%, _Magermilchpulver_ 8,7%, fettarmer Kakao 7,4%, Emulgator Lecithine (_Soja_), Vanillin), _Weizenmehl_ 32,5%, pflanzliche Fette (Palm, Palmkern), Rohrzucker 8,5% (enthält _Weizen_), _Milchzucker_, _Weizenkleie_, _Vollmilchpulver_, _Gerstenmalz_- und Maisextraktpulver, Honig, Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, _Weizenstärke_, _Gerstenmalzmehl_, Emulgator Lecithine (_Soja_), Vanillin12.0%

ingredients_text_de_ocr_1648897071_result categorical

Out[1174]:

saturn.columns["ingredients_text_de_ocr_1648897071_result"].stats

statvalue
n50
nulls49 (98.0%)
unique1
top_value Nuss-Nougat-Creme 40% (Zucker, Palmöl, _Haselnüsse_ 13%, _Magermilchpulver_ 8,7%, fettarmer Kakao 7,4%, Emulgator Lecithine (_Soja_), Vanillin), _Weizenmehl_ 32,5%, pflanzliche Fette (Palm, Palmkern), Rohrzucker 8,5% (enthält _Weizen_), _Milchzucker_, _Weizenkleie_, _Vollmilchpulver_, _Gerstenmalz_ - und Maisextraktpulver, Honig, Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, _Weizenstärke_, _Gerstenmalzmehl_, Emulgator Lecithine (_Soja_), Vanillin
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail1 singleton categories
alert: null_rate98.0% null
alert: imbalancetop value is 100.0% of rows
Fig 311.
Top values for ingredients_text_de_ocr_1648897071_result.
Show data table
Top values for ingredients_text_de_ocr_1648897071_result (1 unique shown, of 1 total).
valuecountshare
Nuss-Nougat-Creme 40% (Zucker, Palmöl, _Haselnüsse_ 13%, _Magermilchpulver_ 8,7%, fettarmer Kakao 7,4%, Emulgator Lecithine (_Soja_), Vanillin), _Weizenmehl_ 32,5%, pflanzliche Fette (Palm, Palmkern), Rohrzucker 8,5% (enthält _Weizen_), _Milchzucker_, _Weizenkleie_, _Vollmilchpulver_, _Gerstenmalz_ - und Maisextraktpulver, Honig, Backtriebmittel: Dinatriumdiphosphat, Natriumhydrogencarbonat, Ammoniumhydrogencarbonat; fettarmer Kakao, Salz, _Weizenstärke_, _Gerstenmalzmehl_, Emulgator Lecithine (_Soja_), Vanillin12.0%

packaging_text_ro categorical

Out[1177]:

saturn.columns["packaging_text_ro"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 312. Top values for packaging_text_ro.
Top values for packaging_text_ro (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

abbreviated_product_name_imported categorical

Out[1180]:

saturn.columns["abbreviated_product_name_imported"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            3
top_value         Authentique 275g, fr
top_rate          0.3333
cardinality       3
entropy           1.585
entropy_ratio     1
alert: long_tail  3 singleton categories
alert: null_rate  94.0% null
Fig 313. Top values for abbreviated_product_name_imported.
Top values for abbreviated_product_name_imported (3 unique shown, of 3 total).
value  count  share
Authentique 275g, fr  1  2.0%
Fibres 230g, fr  1  2.0%
DESSERT Noir 205g  1  2.0%

traces_imported categorical

Out[1183]:

saturn.columns["traces_imported"].stats

stat              value
n                 50
nulls             46 (92.0%)
unique            4
top_value         Lupin, Lait, Moutarde, Graines de sésame, Soja
top_rate          0.25
cardinality       4
entropy           2
entropy_ratio     1
alert: long_tail  4 singleton categories
alert: null_rate  92.0% null
Fig 314. Top values for traces_imported.
Top values for traces_imported (4 unique shown, of 4 total).
value  count  share
Lupin, Lait, Moutarde, Graines de sésame, Soja  1  2.0%
Lupin, Lait, Moutarde, Soja  1  2.0%
Lait, Fruits à coque  1  2.0%
Lait, Moutarde, Soja  1  2.0%
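
The long_tail alert on traces_imported partly reflects that each cell is a comma-separated list, so every distinct combination counts as its own category. A rough sketch of counting the individual trace allergens instead is shown below. It assumes a plain comma is the only separator, and `split_tags` is a hypothetical helper.

```python
from collections import Counter


def split_tags(cell):
    """Split a comma-separated cell into trimmed, non-empty tags."""
    return [tag.strip() for tag in cell.split(",") if tag.strip()]


# The four non-null traces_imported values observed in this sample.
traces = [
    "Lupin, Lait, Moutarde, Graines de sésame, Soja",
    "Lupin, Lait, Moutarde, Soja",
    "Lait, Fruits à coque",
    "Lait, Moutarde, Soja",
]
tag_counts = Counter(tag for cell in traces for tag in split_tags(cell))
print(tag_counts.most_common(3))  # [('Lait', 4), ('Moutarde', 3), ('Soja', 3)]
```

Counted this way, the four singleton combinations reduce to six distinct tags, with Lait present in all four populated rows.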

specific_ingredients unknown

Out[1186]:

saturn.columns["specific_ingredients"].stats

stat              value
n                 50
nulls             0 (0.0%)
unique
alert: skipped    no profiler for kind=unknown

packaging_text_ru categorical metadata

This column holds Russian-language packaging text, but is almost entirely empty: 94% of the 50 rows are null, and the sole non-null value appearing 3 times is an empty string — giving a cardinality of 1 and zero entropy. In practice the column carries no information whatsoever across the observed sample.

Treatment: Drop this column; it is effectively unpopulated (94% null, remaining values are empty strings) and provides no signal for modelling or analysis.

anthropic:default · confidence high
Out[1188]:

saturn.columns["packaging_text_ru"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  94.0% null
alert: imbalance  top value is 100.0% of rows
Fig 315. Top values for packaging_text_ru.
Top values for packaging_text_ru (1 unique shown, of 1 total).
value    count  share
(blank)  3      6.0%
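
The reported null rate for packaging_text_ru (94%) understates how empty the column is, because the 3 remaining values are empty strings. A small sketch of an "effective" null rate that also counts empty or whitespace-only strings as missing is given below. `effective_null_rate` is a hypothetical helper, not a saturn statistic.

```python
def effective_null_rate(values):
    """Share of values that are None or an empty/whitespace-only string."""
    if not values:
        return 0.0
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, str) and v.strip() == "")
    )
    return missing / len(values)


# Shape of packaging_text_ru in this sample: 47 nulls, 3 empty strings.
packaging_text_ru = [None] * 47 + [""] * 3
raw_null_rate = sum(v is None for v in packaging_text_ru) / len(packaging_text_ru)
print(raw_null_rate, effective_null_rate(packaging_text_ru))  # 0.94 1.0
```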

origin_ru categorical other

This column appears to be a Russian-language origin field that is almost entirely unpopulated: 94% of the 50 rows are null, and the sole non-null value is an empty string appearing 3 times. With a cardinality of 1, zero entropy, and a top_rate of 1.0, the column carries no information. It was likely intended to capture Russian-locale origin metadata but is unpopulated in this sample.

Treatment: Drop this column — it contains no usable signal (94% null, remaining values are empty strings).

anthropic:default · confidence high
Out[1191]:

saturn.columns["origin_ru"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  94.0% null
alert: imbalance  top value is 100.0% of rows
Fig 316. Top values for origin_ru.
Top values for origin_ru (1 unique shown, of 1 total).
value    count  share
(blank)  3      6.0%

ingredients_text_with_allergens_ru categorical metadata

This column is intended to store Russian-language ingredients text with allergen information for food products. It is effectively empty: 94% of the 50 rows are null, and the sole non-null value present is an empty string (''), giving a cardinality of 1 and entropy of 0. There is no usable signal whatsoever in this column for the sampled data.

Treatment: Drop this column; it carries no information (94% null, remaining values are empty strings).

anthropic:default · confidence high
Out[1194]:

saturn.columns["ingredients_text_with_allergens_ru"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  94.0% null
alert: imbalance  top value is 100.0% of rows
Fig 317. Top values for ingredients_text_with_allergens_ru.
Top values for ingredients_text_with_allergens_ru (1 unique shown, of 1 total).
value    count  share
(blank)  3      6.0%

product_name_ru categorical

Out[1197]:

saturn.columns["product_name_ru"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            2
top_value         (blank)
top_rate          0.6667
cardinality       2
entropy           0.9183
entropy_ratio     0.9183
alert: null_rate  94.0% null
Fig 318. Top values for product_name_ru.
Top values for product_name_ru (2 unique shown, of 2 total).
value  count  share
(blank)  2  4.0%
Экселенс 99% какао  1  2.0%

generic_name_ru categorical

Out[1200]:

saturn.columns["generic_name_ru"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            2
top_value         (blank)
top_rate          0.6667
cardinality       2
entropy           0.9183
entropy_ratio     0.9183
alert: null_rate  94.0% null
Fig 319. Top values for generic_name_ru.
Top values for generic_name_ru (2 unique shown, of 2 total).
value  count  share
(blank)  2  4.0%
Плитка горького шоколада (99% какао)  1  2.0%

ingredients_text_ru categorical other

This column is a Russian-language ingredients text field for food/product records, almost certainly a localized variant of a broader ingredients column. It is 94% null across 50 rows, and the only non-null value observed is an empty string (appearing 3 times), meaning there is effectively zero usable content in this column. Cardinality of 1 and entropy of 0.0 confirm complete absence of informational signal.

Treatment: Drop; 94% null with only empty-string values provides no modelling or analytical value.

anthropic:default · confidence high
Out[1203]:

saturn.columns["ingredients_text_ru"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  94.0% null
alert: imbalance  top value is 100.0% of rows
Fig 320. Top values for ingredients_text_ru.
Top values for ingredients_text_ru (1 unique shown, of 1 total).
value    count  share
(blank)  3      6.0%

packaging_text_da categorical

Out[1206]:

saturn.columns["packaging_text_da"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 321. Top values for packaging_text_da.
Top values for packaging_text_da (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

generic_name_da categorical

Out[1209]:

saturn.columns["generic_name_da"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         Kiks
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 322. Top values for generic_name_da.
Top values for generic_name_da (2 unique shown, of 2 total).
value    count  share
Kiks     1      2.0%
(blank)  1      2.0%

forest_footprint_data unknown

Out[1212]:

saturn.columns["forest_footprint_data"].stats

stat              value
n                 50
nulls             0 (0.0%)
unique
alert: skipped    no profiler for kind=unknown

product_name_da categorical

Out[1214]:

saturn.columns["product_name_da"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         Original
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 323. Top values for product_name_da.
Top values for product_name_da (2 unique shown, of 2 total).
value        count  share
Original     1      2.0%
Alpine Milk  1      2.0%

ingredients_text_with_allergens_da categorical

Out[1217]:

saturn.columns["ingredients_text_with_allergens_da"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         VETEMJÖL/HVEDEMEL, palmolja/-olie, glukossirap, maltextrakt från KORN/BYG, bakpulver/hævemidler (ammoniumkarbonater, natriumkarbonater), salt, ÄGG/ÆG/EGG, arom, mjölbehandlingsmedel/melbehandlingsmiddel (NATRIUMDISULFIT).
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 324. Top values for ingredients_text_with_allergens_da.
Top values for ingredients_text_with_allergens_da (2 unique shown, of 2 total).
value  count  share
VETEMJÖL/HVEDEMEL, palmolja/-olie, glukossirap, maltextrakt från KORN/BYG, bakpulver/hævemidler (ammoniumkarbonater, natriumkarbonater), salt, ÄGG/ÆG/EGG, arom, mjölbehandlingsmedel/melbehandlingsmiddel (NATRIUMDISULFIT).  1  2.0%
(blank)  1  2.0%

origin_da categorical

Out[1220]:

saturn.columns["origin_da"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 325. Top values for origin_da.
Top values for origin_da (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

ingredients_text_da categorical

Out[1223]:

saturn.columns["ingredients_text_da"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         _VETEMJÖL_/_HVEDEMEL_, palmolja/-olie, glukossirap, maltextrakt från _KORN_/_BYG_, bakpulver/hævemidler (ammoniumkarbonater, natriumkarbonater), salt, _ÄGG_/_ÆG_/_EGG_, arom, mjölbehandlingsmedel/melbehandlingsmiddel (_NATRIUMDISULFIT_).
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 326. Top values for ingredients_text_da.
Top values for ingredients_text_da (2 unique shown, of 2 total).
value  count  share
_VETEMJÖL_/_HVEDEMEL_, palmolja/-olie, glukossirap, maltextrakt från _KORN_/_BYG_, bakpulver/hævemidler (ammoniumkarbonater, natriumkarbonater), salt, _ÄGG_/_ÆG_/_EGG_, arom, mjölbehandlingsmedel/melbehandlingsmiddel (_NATRIUMDISULFIT_).  1  2.0%
(blank)  1  2.0%

ingredients_text_cs categorical

Out[1226]:

saturn.columns["ingredients_text_cs"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            2
top_value         (blank)
top_rate          0.6667
cardinality       2
entropy           0.9183
entropy_ratio     0.9183
alert: null_rate  94.0% null
Fig 327. Top values for ingredients_text_cs.
Top values for ingredients_text_cs (2 unique shown, of 2 total).
value  count  share
(blank)  2  4.0%
Kakaová hmota, cukr, kakaové máslo, vanilka.  1  2.0%

ingredients_text_nl_ocr_1675675383_result categorical

Out[1229]:

saturn.columns["ingredients_text_nl_ocr_1675675383_result"].stats

stat              value
n                 50
nulls             49 (98.0%)
unique            1
top_value         Cacaomassa, suiker, cacaoboter, natuurlijk Bourbon vanille - stokje.
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: long_tail  1 singleton categories
alert: null_rate  98.0% null
alert: imbalance  top value is 100.0% of rows
Fig 328. Top values for ingredients_text_nl_ocr_1675675383_result.
Top values for ingredients_text_nl_ocr_1675675383_result (1 unique shown, of 1 total).
value  count  share
Cacaomassa, suiker, cacaoboter, natuurlijk Bourbon vanille - stokje.  1  2.0%

product_name_cs categorical

Out[1232]:

saturn.columns["product_name_cs"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            2
top_value         (blank)
top_rate          0.6667
cardinality       2
entropy           0.9183
entropy_ratio     0.9183
alert: null_rate  94.0% null
Fig 329. Top values for product_name_cs.
Top values for product_name_cs (2 unique shown, of 2 total).
value  count  share
(blank)  2  4.0%
Excellence 70% Cocoa Intense Dark  1  2.0%

ingredients_text_hu_ocr_1571428260_result categorical

Out[1235]:

saturn.columns["ingredients_text_hu_ocr_1571428260_result"].stats

stat              value
n                 50
nulls             49 (98.0%)
unique            1
top_value         kakaómassza, cukor, kakaó - vaj, természetes bourbon vanília. Nyomokban egyéb dióféléket, tejet, szóját, szezámmagot es búzát tartalmazhat.
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: long_tail  1 singleton categories
alert: null_rate  98.0% null
alert: imbalance  top value is 100.0% of rows
Fig 330. Top values for ingredients_text_hu_ocr_1571428260_result.
Top values for ingredients_text_hu_ocr_1571428260_result (1 unique shown, of 1 total).
value  count  share
kakaómassza, cukor, kakaó - vaj, természetes bourbon vanília. Nyomokban egyéb dióféléket, tejet, szóját, szezámmagot es búzát tartalmazhat.  1  2.0%

packaging_text_cs categorical metadata

This column appears to be Czech-language packaging text (`_cs` locale suffix), but it is almost entirely empty: 94% null rate across 50 rows, and the only observed non-null value is an empty string appearing 3 times. With cardinality of 1 and entropy of 0, the column carries zero information — it is effectively unpopulated for this dataset slice.

Treatment: Drop this column; it contains no usable signal (94% nulls, sole value is empty string).

anthropic:default · confidence high
Out[1238]:

saturn.columns["packaging_text_cs"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  94.0% null
alert: imbalance  top value is 100.0% of rows
Fig 331. Top values for packaging_text_cs.
Top values for packaging_text_cs (1 unique shown, of 1 total).
value    count  share
(blank)  3      6.0%
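
Many of the columns in this part of the report share a base name and differ only by a two-letter locale suffix such as `_cs`, sometimes followed by an OCR timestamp. A rough sketch for grouping them by locale is shown below. The pattern is inferred from the column names in this report and is an assumption: it would misread any unrelated column that happens to end in an underscore plus two letters, and `xx` appears here as a placeholder code rather than a real language.

```python
import re
from collections import defaultdict

# base name, 2-letter locale, optional "_ocr_<timestamp>" and optional "_result".
LOCALE_COLUMN = re.compile(
    r"^(?P<base>.+?)_(?P<lang>[a-z]{2})(?:_ocr_(?P<ts>\d+)(?P<result>_result)?)?$"
)


def parse_locale_column(name):
    """Return (base, lang) for a locale-suffixed column name, else None."""
    match = LOCALE_COLUMN.match(name)
    if match is None:
        return None
    return match.group("base"), match.group("lang")


# A few column names taken from this report.
columns = [
    "packaging_text_cs",
    "origin_hu",
    "ingredients_text_with_allergens_ru",
    "ingredients_text_de_ocr_1648897071_result",
    "no_nutrition_data_imported",
    "forest_footprint_data",
]
by_lang = defaultdict(list)
for column in columns:
    parsed = parse_locale_column(column)
    if parsed is not None:
        by_lang[parsed[1]].append(column)

print(sorted(by_lang))  # ['cs', 'de', 'hu', 'ru']
```

Grouping this way makes it easier to review or drop all columns for a sparsely populated locale together instead of one at a time.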

ingredients_text_sr categorical

Out[1241]:

saturn.columns["ingredients_text_sr"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         Šećer, kakao masa, kakao buter, vanile.
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 332. Top values for ingredients_text_sr.
Top values for ingredients_text_sr (2 unique shown, of 2 total).
value  count  share
Šećer, kakao masa, kakao buter, vanile.  1  2.0%
(blank)  1  2.0%

origin_sr categorical

Out[1244]:

saturn.columns["origin_sr"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 333. Top values for origin_sr.
Top values for origin_sr (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

ingredients_text_hu_ocr_1571428260 categorical

Out[1247]:

saturn.columns["ingredients_text_hu_ocr_1571428260"].stats

stat              value
n                 50
nulls             49 (98.0%)
unique            1
top_value         kakaómassza, cukor, kakaó- vaj, természetes bourbon vanília. Nyomokban egyéb dióféléket, tejet, szóját, szezámmagot es búzát tartalmazhat.
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: long_tail  1 singleton categories
alert: null_rate  98.0% null
alert: imbalance  top value is 100.0% of rows
Fig 334. Top values for ingredients_text_hu_ocr_1571428260.
Top values for ingredients_text_hu_ocr_1571428260 (1 unique shown, of 1 total).
value  count  share
kakaómassza, cukor, kakaó- vaj, természetes bourbon vanília. Nyomokban egyéb dióféléket, tejet, szóját, szezámmagot es búzát tartalmazhat.  1  2.0%

packaging_text_hu categorical feature

This column contains Hungarian-language packaging text, but is almost entirely empty: 92% null rate across 50 rows, and the only non-null value observed is an empty string appearing 4 times. With cardinality of 1 and entropy of 0.0, the column carries zero information — it is effectively unpopulated.

Treatment: Drop — 92% nulls and a single empty-string value provide no modelling or analytical signal.

anthropic:default · confidence high
Out[1250]:

saturn.columns["packaging_text_hu"].stats

stat              value
n                 50
nulls             46 (92.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  92.0% null
alert: imbalance  top value is 100.0% of rows
Fig 335. Top values for packaging_text_hu.
Top values for packaging_text_hu (1 unique shown, of 1 total).
value    count  share
(blank)  4      8.0%

origin_cs categorical

Out[1253]:

saturn.columns["origin_cs"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 336. Top values for origin_cs.
Top values for origin_cs (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

ingredients_text_nl_ocr_1675675383 categorical

Out[1256]:

saturn.columns["ingredients_text_nl_ocr_1675675383"].stats

stat              value
n                 50
nulls             49 (98.0%)
unique            1
top_value         Cacaomassa, suiker, cacaoboter, natuurlijk Bourbon vanille- stokje.
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: long_tail  1 singleton categories
alert: null_rate  98.0% null
alert: imbalance  top value is 100.0% of rows
Fig 337. Top values for ingredients_text_nl_ocr_1675675383.
Top values for ingredients_text_nl_ocr_1675675383 (1 unique shown, of 1 total).
value  count  share
Cacaomassa, suiker, cacaoboter, natuurlijk Bourbon vanille- stokje.  1  2.0%

product_name_sr categorical

Out[1259]:

saturn.columns["product_name_sr"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         Excellence 70% Cocoa Intense Dark
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 338. Top values for product_name_sr.
Top values for product_name_sr (2 unique shown, of 2 total).
value  count  share
Excellence 70% Cocoa Intense Dark  1  2.0%
Течен Шоколад Нутела  1  2.0%

generic_name_hu categorical

Out[1262]:

saturn.columns["generic_name_hu"].stats

stat              value
n                 50
nulls             46 (92.0%)
unique            2
top_value         (blank)
top_rate          0.75
cardinality       2
entropy           0.8113
entropy_ratio     0.8113
alert: null_rate  92.0% null
Fig 339. Top values for generic_name_hu.
Top values for generic_name_hu (2 unique shown, of 2 total).
value    count  share
(blank)  3      6.0%
Finom    1      2.0%

packaging_text_sr categorical

Out[1265]:

saturn.columns["packaging_text_sr"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 340. Top values for packaging_text_sr.
Top values for packaging_text_sr (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

ingredients_text_with_allergens_cs categorical

Out[1268]:

saturn.columns["ingredients_text_with_allergens_cs"].stats

stat              value
n                 50
nulls             49 (98.0%)
unique            1
top_value         Kakaová hmota, cukr, kakaové máslo, vanilka.
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: long_tail  1 singleton categories
alert: null_rate  98.0% null
alert: imbalance  top value is 100.0% of rows
Fig 341. Top values for ingredients_text_with_allergens_cs.
Top values for ingredients_text_with_allergens_cs (1 unique shown, of 1 total).
value  count  share
Kakaová hmota, cukr, kakaové máslo, vanilka.  1  2.0%

ingredients_text_with_allergens_sr categorical

Out[1271]:

saturn.columns["ingredients_text_with_allergens_sr"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         Šećer, kakao masa, kakao buter, vanile.
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 342. Top values for ingredients_text_with_allergens_sr.
Top values for ingredients_text_with_allergens_sr (2 unique shown, of 2 total).
value  count  share
Šećer, kakao masa, kakao buter, vanile.  1  2.0%
(blank)  1  2.0%

ingredients_text_hu categorical

Out[1274]:

saturn.columns["ingredients_text_hu"].stats

statvalue
n50
nulls46 (92.0%)
unique4
top_value Kakaómassza, cukor, kakaó - vaj, vanília.
top_rate 0.25
cardinality 4
entropy 2
entropy_ratio 1
alert: long_tail4 singleton categories
alert: null_rate92.0% null
Fig 343.
Top values for ingredients_text_hu.
Show data table
Top values for ingredients_text_hu (4 unique shown, of 4 total).
valuecountshare
Kakaómassza, cukor, kakaó - vaj, vanília.12.0%
HU Étcsokoládé. Kakaó szárazanyag legalább 70% . ÖSszetevők: kakaómassza, cukor, kakaóvaj, emulgeálószerek: lecitinek (szójából); vanília kivonat. Nyomokban dióféléket és tejet tartalmazhat. Bontatlan csomagolásban tárolva minőségét megórzi (nap/hónap/év): a csomagolás hátoldalán feltüntetett időpontig. Száraz, hűvös helyen tárolandó! Készült: Németországban. A kakaóbab származási helye: Ecuador, Elefántcsontpart, Ghána, Kamerun és Nigeria. A Fairtrade Cocoa Program (Fairtrade Kakaó Program) előnyökhöz juttatja a kistermelőket azáltal, hogy több kakaót értékesítenek Fairtrade termékként. Látogasson el a www.info.fairtrade.net/program oldalra. RO Ciocolată amăruie. Substantă uscată de cacao minimum 70% Ingrediente: masă de cacao, zahăr, unt de cacao, emulsifiant: lecitine din soia; extract din vanilie. Cu ingrediente din tări UE şi non UE. Poate contine urme de fructe cu coajă lemnoasă şi lapte. A se consuma de preferintă înainte de/Nr. Lot: vezi spate ambalaj. A se păstra la loc uscat şi răcoros, ferit de razele soarelui și de înghet, atât înainte, cât şi după deschidere. A se consuma în cel mai scurt timp după deschidere. Fairtrade Cocoa Program (Programul Fairtrade de Cacao) permite micilor agricultori să beneficieze de vânzarea propriei cacao ca Fairtrade. Vizitați www.info.fairtrade.net/program. Produs in U.E. pentru S.C. Lidl Discount SRL, Sat Nedelea, Comuna Ariceştii Rahtivani, DN 72, Crângul lui Bot, KM 73+810, județul Prahova, România. BG Натурален шоколад. Съдържа мин. 70% какаова маса. Съставки: какаова маса, захар, какаово масло, емулгатор: лецитин (соев); екстракт от ванилия. Може да съдържа следи от ядки и мляко. Неотворен най-добър до:/ Партида: виж задната страна. Да се съхранява на сухо и хладно място. Програмата за сертифициране на какао Fairtrade Сосоа Program дава възможност на малките производители да продават повече какао при справедливи условия на търговия. 
Повече информация на www.info.fairtrade.net/program Произведено в Германия за Лидл Щифтунг енд Ко. КГ, Щифтсбергщрасе 1, 74167 Некарзулм, Германия. LIDL12.0%
Cukor, pálmaolaj, _MOGYORÓ_ (13%), zsírszegény kakaópor (7,4%), sovány _TEJPOR_ (6,6%), _TEJSAVÓPOR_, emulgeálószer: lecitinek (_SZÓJA_); aroma (vanillin).12.0%
12.0%

product_name_hu categorical

Out[1277]:

saturn.columns["product_name_hu"].stats

stat              value
n                 50
nulls             46 (92.0%)
unique            3
top_value         (blank)
top_rate          0.5
cardinality       3
entropy           1.5
entropy_ratio     0.9464
alert: long_tail  2 singleton categories
alert: null_rate  92.0% null
Fig 344. Top values for product_name_hu.
Top values for product_name_hu (3 unique shown, of 3 total).
value  count  share
(blank)  2  4.0%
Excellence 70% Cocoa Intense Dark  1  2.0%
Dark Chocolate 70% Cacao  1  2.0%

generic_name_sr categorical

Out[1280]:

saturn.columns["generic_name_sr"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            2
top_value         Tamna čokolada sa 70% kakaa
top_rate          0.5
cardinality       2
entropy           1
entropy_ratio     1
alert: long_tail  2 singleton categories
alert: null_rate  96.0% null
Fig 345. Top values for generic_name_sr.
Top values for generic_name_sr (2 unique shown, of 2 total).
value  count  share
Tamna čokolada sa 70% kakaa  1  2.0%
(blank)  1  2.0%

origin_hu categorical other

This column appears to be a Hungarian-language origin field (the `_hu` locale suffix, as in packaging_text_hu) that is almost entirely unpopulated: 92% of its 50 rows are null, and the sole non-null value is an empty string appearing 4 times. With a cardinality of 1, zero entropy, and a top_rate of 1.0 across non-null values, the column carries no discriminative information. It is effectively a blank field in the current dataset snapshot.

Treatment: Drop. At 92% null with a single empty-string value, the column provides no signal for any downstream task.

anthropic:default · confidence high
Out[1283]:

saturn.columns["origin_hu"].stats

stat              value
n                 50
nulls             46 (92.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  92.0% null
alert: imbalance  top value is 100.0% of rows
Fig 346. Top values for origin_hu.
Top values for origin_hu (1 unique shown, of 1 total).
value    count  share
(blank)  4      8.0%

ingredients_text_with_allergens_hu categorical

Out[1286]:

saturn.columns["ingredients_text_with_allergens_hu"].stats

statvalue
n50
nulls47 (94.0%)
unique3
top_value Kakaómassza, cukor, kakaó - vaj, vanília.
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail3 singleton categories
alert: null_rate94.0% null
Fig 347.
Top values for ingredients_text_with_allergens_hu.
Show data table
Top values for ingredients_text_with_allergens_hu (3 unique shown, of 3 total).
valuecountshare
Kakaómassza, cukor, kakaó - vaj, vanília.12.0%
HU Étcsokoládé. Kakaó szárazanyag legalább 70% . ÖSszetevők: kakaómassza, cukor, kakaóvaj, emulgeálószerek: lecitinek (szójából); vanília kivonat. Nyomokban dióféléket és tejet tartalmazhat. Bontatlan csomagolásban tárolva minőségét megórzi (nap/hónap/év): a csomagolás hátoldalán feltüntetett időpontig. Száraz, hűvös helyen tárolandó! Készült: Németországban. A kakaóbab származási helye: Ecuador, Elefántcsontpart, Ghána, Kamerun és Nigeria. A Fairtrade Cocoa Program (Fairtrade Kakaó Program) előnyökhöz juttatja a kistermelőket azáltal, hogy több kakaót értékesítenek Fairtrade termékként. Látogasson el a www.info.fairtrade.net/program oldalra. RO Ciocolată amăruie. Substantă uscată de cacao minimum 70% Ingrediente: masă de cacao, zahăr, unt de cacao, emulsifiant: lecitine din soia; extract din vanilie. Cu ingrediente din tări UE şi non UE. Poate contine urme de fructe cu coajă lemnoasă şi lapte. A se consuma de preferintă înainte de/Nr. Lot: vezi spate ambalaj. A se păstra la loc uscat şi răcoros, ferit de razele soarelui și de înghet, atât înainte, cât şi după deschidere. A se consuma în cel mai scurt timp după deschidere. Fairtrade Cocoa Program (Programul Fairtrade de Cacao) permite micilor agricultori să beneficieze de vânzarea propriei cacao ca Fairtrade. Vizitați www.info.fairtrade.net/program. Produs in U.E. pentru S.C. Lidl Discount SRL, Sat Nedelea, Comuna Ariceştii Rahtivani, DN 72, Crângul lui Bot, KM 73+810, județul Prahova, România. BG Натурален шоколад. Съдържа мин. 70% какаова маса. Съставки: какаова маса, захар, какаово масло, емулгатор: лецитин (соев); екстракт от ванилия. Може да съдържа следи от ядки и мляко. Неотворен най-добър до:/ Партида: виж задната страна. Да се съхранява на сухо и хладно място. Програмата за сертифициране на какао Fairtrade Сосоа Program дава възможност на малките производители да продават повече какао при справедливи условия на търговия. 
Повече информация на www.info.fairtrade.net/program Произведено в Германия за Лидл Щифтунг енд Ко. КГ, Щифтсбергщрасе 1, 74167 Некарзулм, Германия. LIDL12.0%
Cukor, pálmaolaj, MOGYORÓ (13%), zsírszegény kakaópor (7,4%), sovány TEJPOR (6,6%), TEJSAVÓPOR, emulgeálószer: lecitinek (SZÓJA); aroma (vanillin).12.0%

generic_name_cs categorical label

This column appears to be a Czech-language generic name field (indicated by the '_cs' suffix) that is almost entirely empty: 94% of its 50 rows are null, and the sole non-null value is an empty string appearing 3 times. With cardinality of 1 and entropy of 0, the column carries zero information — it is effectively unpopulated.

Treatment: Drop this column; it contains no usable signal with a 94% null rate and only empty-string values in the remainder.

anthropic:default · confidence high
Out[1289]:

saturn.columns["generic_name_cs"].stats

stat              value
n                 50
nulls             47 (94.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  94.0% null
alert: imbalance  top value is 100.0% of rows
Fig 348. Top values for generic_name_cs.
Top values for generic_name_cs (1 unique shown, of 1 total).
value    count  share
(blank)  3      6.0%

ingredients_text_xx categorical

Out[1292]:

saturn.columns["ingredients_text_xx"].stats

stat              value
n                 50
nulls             48 (96.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: null_rate  96.0% null
alert: imbalance  top value is 100.0% of rows
Fig 349. Top values for ingredients_text_xx.
Top values for ingredients_text_xx (1 unique shown, of 1 total).
value    count  share
(blank)  2      4.0%

origin_xx categorical

Out[1295]:

saturn.columns["origin_xx"].stats

stat              value
n                 50
nulls             49 (98.0%)
unique            1
top_value         (blank)
top_rate          1
cardinality       1
entropy           0
entropy_ratio     0
alert: long_tail  1 singleton categories
alert: null_rate  98.0% null
alert: imbalance  top value is 100.0% of rows
Fig 350. Top values for origin_xx.
Top values for origin_xx (1 unique shown, of 1 total).
value    count  share
(blank)  1      2.0%

product_name_xx categorical

Out[1298]:

saturn.columns["product_name_xx"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 351.
Top values for product_name_xx.
Top values for product_name_xx (1 unique shown, of 1 total).
value count share
(empty) 2 4.0%

packaging_text_xx categorical

Out[1301]:

saturn.columns["packaging_text_xx"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 352.
Top values for packaging_text_xx.
Top values for packaging_text_xx (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

generic_name_xx categorical

Out[1304]:

saturn.columns["generic_name_xx"].stats

stat value
n 50
nulls 48 (96.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: null_rate 96.0% null
alert: imbalance top value is 100.0% of rows
Fig 353.
Top values for generic_name_xx.
Top values for generic_name_xx (1 unique shown, of 1 total).
value count share
(empty) 2 4.0%

ingredients_text_es_ocr_1548767061 categorical

Out[1307]:

saturn.columns["ingredients_text_es_ocr_1548767061"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Pasta de cacao, azúcar, manteca de cacao, emulgente: lecitina de girasol (E-322), extracto de vainilla. Cacao: 70% mínimo.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 354.
Top values for ingredients_text_es_ocr_1548767061.
Top values for ingredients_text_es_ocr_1548767061 (1 unique shown, of 1 total).
value count share
Pasta de cacao, azúcar, manteca de cacao, emulgente: lecitina de girasol (E-322), extracto de vainilla. Cacao: 70% mínimo. 1 2.0%

ingredients_text_es_ocr_1548767061_result categorical

Out[1310]:

saturn.columns["ingredients_text_es_ocr_1548767061_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Pasta de cacao, azúcar, manteca de cacao, emulgente: lecitina de girasol (E-322), extracto de vainilla. Cacao: 70% mínimo.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 355.
Top values for ingredients_text_es_ocr_1548767061_result.
Top values for ingredients_text_es_ocr_1548767061_result (1 unique shown, of 1 total).
value count share
Pasta de cacao, azúcar, manteca de cacao, emulgente: lecitina de girasol (E-322), extracto de vainilla. Cacao: 70% mínimo. 1 2.0%

origin_ur categorical

Out[1313]:

saturn.columns["origin_ur"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 356.
Top values for origin_ur.
Top values for origin_ur (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

product_name_he categorical

Out[1316]:

saturn.columns["product_name_he"].stats

stat value
n 50
nulls 48 (96.0%)
unique 2
top_value נוטלה
top_rate 0.5
cardinality 2
entropy 1
entropy_ratio 1
alert: long_tail 2 singleton categories
alert: null_rate 96.0% null
Fig 357.
Top values for product_name_he.
Top values for product_name_he (2 unique shown, of 2 total).
value count share
נוטלה 1 2.0%
תפוציפס שמנת בצל 1 2.0%
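
The share column and the long_tail alert in these tables follow from simple counting: each of the two product names occurs once, which is 1 of 50 rows, or 2.0%, and both are singletons. The sketch below reproduces that arithmetic under two assumptions that match the figures in this report but are not confirmed by Saturn's documentation: shares are computed over all rows (nulls included), and a "singleton category" is a value that occurs exactly once. The function name `top_values` and the placeholder strings are hypothetical.

```python
from collections import Counter


def top_values(values):
    """Build (value, count, share %) rows and count singleton categories.

    Shares are taken over all rows, nulls included (an assumption that
    matches the tables in this report).
    """
    n = len(values)
    counts = Counter(v for v in values if v is not None)
    table = [
        (value, count, round(100 * count / n, 1))
        for value, count in counts.most_common()
    ]
    singletons = sum(1 for count in counts.values() if count == 1)
    return table, singletons


# Stand-in for product_name_he: 48 nulls and two distinct names.
he_values = [None] * 48 + ["name_a", "name_b"]
he_table, he_singletons = top_values(he_values)

# Stand-in for generic_name_cs: 47 nulls and the empty string three times.
cs_values = [None] * 47 + ["", "", ""]
cs_table, cs_singletons = top_values(cs_values)

print(he_table, he_singletons)
print(cs_table, cs_singletons)
```

Under these assumptions a value that repeats, such as the empty string in generic_name_cs, produces no singleton and therefore no long_tail alert, which is consistent with the alerts listed for that column.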

ingredients_text_he categorical

Out[1319]:

saturn.columns["ingredients_text_he"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 358.
Top values for ingredients_text_he.
Top values for ingredients_text_he (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

product_name_ur categorical

Out[1322]:

saturn.columns["product_name_ur"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 359.
Top values for product_name_ur.
Top values for product_name_ur (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

generic_name_he categorical

Out[1325]:

saturn.columns["generic_name_he"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value ממרח אגוזי לוז עם קקאו
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 360.
Top values for generic_name_he.
Top values for generic_name_he (1 unique shown, of 1 total).
value count share
ממרח אגוזי לוז עם קקאו 1 2.0%

packaging_text_he categorical

Out[1328]:

saturn.columns["packaging_text_he"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 361.
Top values for packaging_text_he.
Top values for packaging_text_he (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_ur categorical

Out[1331]:

saturn.columns["ingredients_text_ur"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 362.
Top values for ingredients_text_ur.
Top values for ingredients_text_ur (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

origin_he categorical

Out[1334]:

saturn.columns["origin_he"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 363.
Top values for origin_he.
Top values for origin_he (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

generic_name_ur categorical

Out[1337]:

saturn.columns["generic_name_ur"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 364.
Top values for generic_name_ur.
Top values for generic_name_ur (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

packaging_text_ur categorical

Out[1340]:

saturn.columns["packaging_text_ur"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 365.
Top values for packaging_text_ur.
Top values for packaging_text_ur (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_with_allergens_he categorical

Out[1343]:

saturn.columns["ingredients_text_with_allergens_he"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 366.
Top values for ingredients_text_with_allergens_he.
Top values for ingredients_text_with_allergens_he (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

nutriscore_grade_producer_imported categorical

Out[1346]:

saturn.columns["nutriscore_grade_producer_imported"].stats

stat value
n 50
nulls 47 (94.0%)
unique 3
top_value c
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail 3 singleton categories
alert: null_rate 94.0% null
Fig 367.
Top values for nutriscore_grade_producer_imported.
Top values for nutriscore_grade_producer_imported (3 unique shown, of 3 total).
value count share
c 1 2.0%
e 1 2.0%
b 1 2.0%
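
The entropy figures in this table can be checked by hand: three grades that each occur once have probability 1/3, so the Shannon entropy is log2(3) ≈ 1.585 bits. The sketch below recomputes the reported stats under assumptions that reproduce the numbers shown here but are not taken from Saturn's documentation: entropy is measured in bits over non-null values, entropy_ratio divides it by log2(cardinality) and is 0 when there is a single category, and top_rate is the top value's share of non-null rows. The function name `categorical_stats` is hypothetical.

```python
import math
from collections import Counter


def categorical_stats(values):
    """Recompute null rate, cardinality, top rate and entropy for a column."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    total = len(non_null)
    cardinality = len(counts)
    probs = [count / total for count in counts.values()]
    # Shannon entropy in bits; a single category contributes 1 * log2(1) = 0.
    entropy = sum(p * math.log2(1 / p) for p in probs)
    # Maximum entropy for this many categories; 0 when there is only one.
    max_entropy = math.log2(cardinality) if cardinality > 1 else 0.0
    return {
        "n": len(values),
        "nulls": len(values) - total,
        "null_rate": (len(values) - total) / len(values) if values else 0.0,
        "cardinality": cardinality,
        "top_rate": max(probs) if probs else 0.0,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }


# 47 nulls plus one occurrence each of grades c, e and b, as in the table.
grades = [None] * 47 + ["c", "e", "b"]
stats = categorical_stats(grades)
print(round(stats["entropy"], 3), round(stats["top_rate"], 4))
```

An entropy_ratio of 1 only says the three observed grades are evenly spread; with three non-null rows out of 50 it says little about how producer-supplied grades are distributed in the full database.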

nutriscore_grade_producer categorical

Out[1349]:

saturn.columns["nutriscore_grade_producer"].stats

stat value
n 50
nulls 47 (94.0%)
unique 3
top_value c
top_rate 0.3333
cardinality 3
entropy 1.585
entropy_ratio 1
alert: long_tail 3 singleton categories
alert: null_rate 94.0% null
Fig 368.
Top values for nutriscore_grade_producer.
Top values for nutriscore_grade_producer (3 unique shown, of 3 total).
value count share
c 1 2.0%
e 1 2.0%
b 1 2.0%

ingredients_text_el categorical

Out[1352]:

saturn.columns["ingredients_text_el"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 369.
Top values for ingredients_text_el.
Top values for ingredients_text_el (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_with_allergens_el categorical

Out[1355]:

saturn.columns["ingredients_text_with_allergens_el"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 370.
Top values for ingredients_text_with_allergens_el.
Top values for ingredients_text_with_allergens_el (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

packaging_text_el categorical

Out[1358]:

saturn.columns["packaging_text_el"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 371.
Top values for packaging_text_el.
Top values for packaging_text_el (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

origin_el categorical

Out[1361]:

saturn.columns["origin_el"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 372.
Top values for origin_el.
Top values for origin_el (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

product_name_el categorical

Out[1364]:

saturn.columns["product_name_el"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 373.
Top values for product_name_el.
Top values for product_name_el (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

generic_name_el categorical

Out[1367]:

saturn.columns["generic_name_el"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 374.
Top values for generic_name_el.
Top values for generic_name_el (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_it_ocr_1559410715 categorical

Out[1370]:

saturn.columns["ingredients_text_it_ocr_1559410715"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Cioccolato amaro extra. Cacao: 99% minimo. Ingredienti: pasta di cacao, cacao magro, burro di cacao, zucchero grezzo di canna. Può contenere frutta a guscio, latte e soia.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 375.
Top values for ingredients_text_it_ocr_1559410715.
Top values for ingredients_text_it_ocr_1559410715 (1 unique shown, of 1 total).
value count share
Cioccolato amaro extra. Cacao: 99% minimo. Ingredienti: pasta di cacao, cacao magro, burro di cacao, zucchero grezzo di canna. Può contenere frutta a guscio, latte e soia. 1 2.0%

ingredients_text_de_ocr_1559410715 categorical

Out[1373]:

saturn.columns["ingredients_text_de_ocr_1559410715"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Extra feine dunkle Schokolade. Schokolade enthält: Kakao: mind. 99%. Zutaten: Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 376.
Top values for ingredients_text_de_ocr_1559410715.
Top values for ingredients_text_de_ocr_1559410715 (1 unique shown, of 1 total).
value count share
Extra feine dunkle Schokolade. Schokolade enthält: Kakao: mind. 99%. Zutaten: Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten. 1 2.0%

product_name_th categorical

Out[1376]:

saturn.columns["product_name_th"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value ลินด์ เอ็กเซอร์แลนซ์ ดาร์ก 99% โกโก้ ดาร์ก แอปโซลูท ช็อกโกแลต
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 377.
Top values for product_name_th.
Top values for product_name_th (1 unique shown, of 1 total).
value count share
ลินด์ เอ็กเซอร์แลนซ์ ดาร์ก 99% โกโก้ ดาร์ก แอปโซลูท ช็อกโกแลต 1 2.0%

ingredients_text_th categorical

Out[1379]:

saturn.columns["ingredients_text_th"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Cocoa solids 99%, Cocoa paste, fat-reduced cocoa, cocoa butter, demerara sugar. May contain nuts, milk and soya.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 378.
Top values for ingredients_text_th.
Top values for ingredients_text_th (1 unique shown, of 1 total).
value count share
Cocoa solids 99%, Cocoa paste, fat-reduced cocoa, cocoa butter, demerara sugar. May contain nuts, milk and soya. 1 2.0%

ingredients_text_de_ocr_1548767354 categorical

Out[1382]:

saturn.columns["ingredients_text_de_ocr_1548767354"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Extra feine dunkle Schokolade. Schokolade enthält: Kakao: mind. 99%. Zutaten: Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 379.
Top values for ingredients_text_de_ocr_1548767354.
Top values for ingredients_text_de_ocr_1548767354 (1 unique shown, of 1 total).
value count share
Extra feine dunkle Schokolade. Schokolade enthält: Kakao: mind. 99%. Zutaten: Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten. 1 2.0%

ingredients_text_de_ocr_1548767354_result categorical

Out[1385]:

saturn.columns["ingredients_text_de_ocr_1548767354_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Extra feine dunkle Schokolade. Schokolade enthält: Kakao: mind. 99%. Zutaten: Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 380.
Top values for ingredients_text_de_ocr_1548767354_result.
Top values for ingredients_text_de_ocr_1548767354_result (1 unique shown, of 1 total).
value count share
Extra feine dunkle Schokolade. Schokolade enthält: Kakao: mind. 99%. Zutaten: Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten. 1 2.0%

generic_name_th categorical

Out[1388]:

saturn.columns["generic_name_th"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 381.
Top values for generic_name_th.
Top values for generic_name_th (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_it_ocr_1559410715_result categorical

Out[1391]:

saturn.columns["ingredients_text_it_ocr_1559410715_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value pasta di cacao, cacao magro, burro di cacao, zucchero grezzo di canna. Può contenere frutta a guscio, latte e soia.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 382.
Top values for ingredients_text_it_ocr_1559410715_result.
Top values for ingredients_text_it_ocr_1559410715_result (1 unique shown, of 1 total).
value count share
pasta di cacao, cacao magro, burro di cacao, zucchero grezzo di canna. Può contenere frutta a guscio, latte e soia. 1 2.0%

packaging_text_th categorical

Out[1394]:

saturn.columns["packaging_text_th"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 383.
Top values for packaging_text_th.
Top values for packaging_text_th (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

origin_th categorical

Out[1397]:

saturn.columns["origin_th"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 384.
Top values for origin_th.
Top values for origin_th (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_with_allergens_th categorical

Out[1400]:

saturn.columns["ingredients_text_with_allergens_th"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Cocoa solids 99%, Cocoa paste, fat-reduced cocoa, cocoa butter, demerara sugar. May contain nuts, milk and soya.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 385.
Top values for ingredients_text_with_allergens_th.
Top values for ingredients_text_with_allergens_th (1 unique shown, of 1 total).
value count share
Cocoa solids 99%, Cocoa paste, fat-reduced cocoa, cocoa butter, demerara sugar. May contain nuts, milk and soya. 1 2.0%

ingredients_text_de_ocr_1559410715_result categorical

Out[1403]:

saturn.columns["ingredients_text_de_ocr_1559410715_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 386.
Top values for ingredients_text_de_ocr_1559410715_result.
Top values for ingredients_text_de_ocr_1559410715_result (1 unique shown, of 1 total).
value count share
Kakaomasse, fettarmes Kakaopulver, Kakaobutter, Rohrzucker. Kann Schalenfrüchte, Milch und Soja enthalten. 1 2.0%

packaging_text_fr_imported categorical

Out[1406]:

saturn.columns["packaging_text_fr_imported"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value 1 FEUILLE PAPIER À RECYCLER, 1 FEUILLE METAL À RECYCLER.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 387.
Top values for packaging_text_fr_imported.
Top values for packaging_text_fr_imported (1 unique shown, of 1 total).
value count share
1 FEUILLE PAPIER À RECYCLER, 1 FEUILLE METAL À RECYCLER. 1 2.0%

preparation categorical

Out[1409]:

saturn.columns["preparation"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Produit prêt à consommer
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 388.
Top values for preparation.
Top values for preparation (1 unique shown, of 1 total).
value count share
Produit prêt à consommer 1 2.0%

preparation_fr_imported categorical

Out[1412]:

saturn.columns["preparation_fr_imported"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Produit prêt à consommer
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 389.
Top values for preparation_fr_imported.
Top values for preparation_fr_imported (1 unique shown, of 1 total).
value count share
Produit prêt à consommer 1 2.0%

preparation_fr categorical

Out[1415]:

saturn.columns["preparation_fr"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Produit prêt à consommer
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 390.
Top values for preparation_fr.
Top values for preparation_fr (1 unique shown, of 1 total).
value count share
Produit prêt à consommer 1 2.0%

generic_name_lc categorical

Out[1418]:

saturn.columns["generic_name_lc"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 391.
Top values for generic_name_lc.
Top values for generic_name_lc (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

product_name_lc categorical

Out[1421]:

saturn.columns["product_name_lc"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 392.
Top values for product_name_lc.
Top values for product_name_lc (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_lc categorical

Out[1424]:

saturn.columns["ingredients_text_lc"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 393.
Top values for ingredients_text_lc.
Top values for ingredients_text_lc (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

ingredients_text_with_allergens_lc categorical

Out[1427]:

saturn.columns["ingredients_text_with_allergens_lc"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value (empty)
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 394.
Top values for ingredients_text_with_allergens_lc.
Top values for ingredients_text_with_allergens_lc (1 unique shown, of 1 total).
value count share
(empty) 1 2.0%

generic_name_xx_debug_tags unknown

Out[1430]:

saturn.columns["generic_name_xx_debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_xx_debug_tags unknown

Out[1432]:

saturn.columns["ingredients_text_xx_debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

product_name_xx_debug_tags unknown

Out[1434]:

saturn.columns["product_name_xx_debug_tags"].stats

stat value
n 50
nulls 0 (0.0%)
unique
alert: skipped no profiler for kind=unknown

ingredients_text_fr_ocr_1561814324_result categorical

Out[1436]:

saturn.columns["ingredients_text_fr_ocr_1561814324_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value 25 % cerneaux de noix, 25 % amandes décortiquées 25 % raisins secs sultanines (raisins secs,huile de tournesol. antioxydant: anhydride lfureux), 15% canneberges, 9,8% sucre, huile de tournesol. Traces éventuelles d'autres fruits à coque et d'arachides.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 395.
Top values for ingredients_text_fr_ocr_1561814324_result.
Top values for ingredients_text_fr_ocr_1561814324_result (1 unique shown, of 1 total).
value count share
25 % cerneaux de noix, 25 % amandes décortiquées 25 % raisins secs sultanines (raisins secs,huile de tournesol. antioxydant: anhydride lfureux), 15% canneberges, 9,8% sucre, huile de tournesol. Traces éventuelles d'autres fruits à coque et d'arachides. 1 2.0%

ingredients_text_fr_ocr_1561814324 categorical

Out[1439]:

saturn.columns["ingredients_text_fr_ocr_1561814324"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value 25 % cerneaux de noix, 25 % amandes décortiquées 25 % raisins secs sultanines (raisins secs,huile de tournesol. antioxydant: anhydride lfureux), 15% canneberges, 9,8% sucre, huile de tournesol. Traces éventuelles d'autres fruits à coque et d'arachides. Conditionné sous atmosphère protectrice.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 396.
Top values for ingredients_text_fr_ocr_1561814324.
Top values for ingredients_text_fr_ocr_1561814324 (1 unique shown, of 1 total).
value count share
25 % cerneaux de noix, 25 % amandes décortiquées 25 % raisins secs sultanines (raisins secs,huile de tournesol. antioxydant: anhydride lfureux), 15% canneberges, 9,8% sucre, huile de tournesol. Traces éventuelles d'autres fruits à coque et d'arachides. Conditionné sous atmosphère protectrice. 1 2.0%

ingredients_text_fr_ocr_1624039072 categorical

Out[1442]:

saturn.columns["ingredients_text_fr_ocr_1624039072"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value ingrédients : cacao, émulsifiant (lécithine de _soja_), vanille.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 397.
Top values for ingredients_text_fr_ocr_1624039072.
Top values for ingredients_text_fr_ocr_1624039072 (1 unique shown, of 1 total).
value count share
ingrédients : cacao, émulsifiant (lécithine de _soja_), vanille. 1 2.0%

ingredients_text_fr_ocr_1624039072_result categorical

Out[1445]:

saturn.columns["ingredients_text_fr_ocr_1624039072_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Cacao, émulsifiant (lécithine de _soja_), vanille.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 398.
Top values for ingredients_text_fr_ocr_1624039072_result.
Top values for ingredients_text_fr_ocr_1624039072_result (1 unique shown, of 1 total).
value count share
Cacao, émulsifiant (lécithine de _soja_), vanille. 1 2.0%

ingredients_text_fr_ocr_1573108349 categorical

Out[1448]:

saturn.columns["ingredients_text_fr_ocr_1573108349"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 399.
Top values for ingredients_text_fr_ocr_1573108349.
Top values for ingredients_text_fr_ocr_1573108349 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573108349_result categorical

Out[1451]:

saturn.columns["ingredients_text_fr_ocr_1573108349_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 400.
Top values for ingredients_text_fr_ocr_1573108349_result.
Top values for ingredients_text_fr_ocr_1573108349_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573107560_result categorical

Out[1454]:

saturn.columns["ingredients_text_fr_ocr_1573107560_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 401.
Top values for ingredients_text_fr_ocr_1573107560_result.
Top values for ingredients_text_fr_ocr_1573107560_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573108360 categorical

Out[1457]:

saturn.columns["ingredients_text_fr_ocr_1573108360"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 402.
Top values for ingredients_text_fr_ocr_1573108360.
Top values for ingredients_text_fr_ocr_1573108360 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573107556_result categorical

Out[1460]:

saturn.columns["ingredients_text_fr_ocr_1573107556_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 403.
Top values for ingredients_text_fr_ocr_1573107556_result.
Top values for ingredients_text_fr_ocr_1573107556_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573109955 categorical

Out[1463]:

saturn.columns["ingredients_text_fr_ocr_1573109955"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 404.
Top values for ingredients_text_fr_ocr_1573109955.
Top values for ingredients_text_fr_ocr_1573109955 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1566920858 categorical

Out[1466]:

saturn.columns["ingredients_text_fr_ocr_1566920858"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , oeufs entiers frais, crème fraîche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- lactylate de sodium, Esters et mono et diacétyltartriques des mono et diglycérides d'acides gras), protéines de lait, levure désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 405.
Top values for ingredients_text_fr_ocr_1566920858.
Top values for ingredients_text_fr_ocr_1566920858 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , oeufs entiers frais, crème fraîche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- lactylate de sodium, Esters et mono et diacétyltartriques des mono et diglycérides d'acides gras), protéines de lait, levure désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. 1 2.0%

ingredients_text_fr_ocr_1573107560 categorical

Out[1469]:

saturn.columns["ingredients_text_fr_ocr_1573107560"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 406.
Top values for ingredients_text_fr_ocr_1573107560.
Top values for ingredients_text_fr_ocr_1573107560 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573108346 categorical

Out[1472]:

saturn.columns["ingredients_text_fr_ocr_1573108346"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 407.
Top values for ingredients_text_fr_ocr_1573108346.
Top values for ingredients_text_fr_ocr_1573108346 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573108346_result categorical

Out[1475]:

saturn.columns["ingredients_text_fr_ocr_1573108346_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 408.
Top values for ingredients_text_fr_ocr_1573108346_result.
Top values for ingredients_text_fr_ocr_1573108346_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%
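
The top values of ingredients_text_fr_ocr_1573108346 and ingredients_text_fr_ocr_1573108346_result appear to differ only in the spacing around one hyphen ("Stéaroyl-2- actylate" versus "Stéaroyl-2 - actylate"). A minimal sketch of how such near-duplicate OCR columns could be compared follows; the helper `normalize_ocr` and its normalisation rules are assumptions made for illustration and are not part of saturn or Open Food Facts, and only the short excerpts quoted above are used.

```python
import re


def normalize_ocr(text):
    """Hypothetical normaliser: collapse whitespace and drop spaces before hyphens."""
    collapsed = re.sub(r"\s+", " ", text.strip())
    return re.sub(r"\s+-", "-", collapsed)


# Excerpts taken from the two top values shown in this report.
raw_excerpt = "Stéaroyl-2- actylate de sodium"
result_excerpt = "Stéaroyl-2 - actylate de sodium"

differs_as_stored = raw_excerpt != result_excerpt
same_after_normalizing = normalize_ocr(raw_excerpt) == normalize_ocr(result_excerpt)
print(differs_as_stored, same_after_normalizing)
```

If the full strings also match after normalisation, the paired columns carry the same text and one of each pair could be dropped before further analysis.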

ingredients_text_fr_ocr_1573109955_result categorical

Out[1478]:

saturn.columns["ingredients_text_fr_ocr_1573109955_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 409.
Top values for ingredients_text_fr_ocr_1573109955_result.
Top values for ingredients_text_fr_ocr_1573109955_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1566920858_result categorical

Out[1481]:

saturn.columns["ingredients_text_fr_ocr_1566920858_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , oeufs entiers frais, crème fraîche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - lactylate de sodium, Esters et mono et diacétyltartriques des mono et diglycérides d'acides gras), protéines de lait, levure désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 410.
Top values for ingredients_text_fr_ocr_1566920858_result.
Top values for ingredients_text_fr_ocr_1566920858_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , oeufs entiers frais, crème fraîche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - lactylate de sodium, Esters et mono et diacétyltartriques des mono et diglycérides d'acides gras), protéines de lait, levure désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. 1 2.0%

ingredients_text_fr_ocr_1573107556 categorical

Out[1484]:

saturn.columns["ingredients_text_fr_ocr_1573107556"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 411.
Top values for ingredients_text_fr_ocr_1573107556.
Top values for ingredients_text_fr_ocr_1573107556 (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2- actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_fr_ocr_1573108360_result categorical

Out[1487]:

saturn.columns["ingredients_text_fr_ocr_1573108360_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 412.
Top values for ingredients_text_fr_ocr_1573108360_result.
Top values for ingredients_text_fr_ocr_1573108360_result (1 unique shown, of 1 total).
value count share
Farine de blé, sucre, beurre frais 9,5 % , aeufs entiers frais, crème fraiche 5,5% , levure, sel, arômes naturels (contient alcool), gluten de blé, poudre de lait écrémé, eau de vie, émulsifiants (Mono et diglycérides d'acides gras, Stéaroyl-2 - actylate de sodium, diacétyltartriques des mono et diglycérides d'acides désactivée, colorant (béta carotène) Traces éventuelles de fruits à coque. Esters et mono gras), protéines de lait levure 1 2.0%

ingredients_text_with_allergens_ro categorical

Out[1490]:

saturn.columns["ingredients_text_with_allergens_ro"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 413.
Top values for ingredients_text_with_allergens_ro.
Top values for ingredients_text_with_allergens_ro (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

origin_lt categorical

Out[1493]:

saturn.columns["origin_lt"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 414.
Top values for origin_lt.
Top values for origin_lt (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_with_allergens_lt categorical

Out[1496]:

saturn.columns["ingredients_text_with_allergens_lt"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 415.
Top values for ingredients_text_with_allergens_lt.
Top values for ingredients_text_with_allergens_lt (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

product_name_lt categorical

Out[1499]:

saturn.columns["product_name_lt"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 416.
Top values for product_name_lt.
Top values for product_name_lt (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_lt categorical

Out[1502]:

saturn.columns["ingredients_text_lt"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 417.
Top values for ingredients_text_lt.
Top values for ingredients_text_lt (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

packaging_text_lt categorical

Out[1505]:

saturn.columns["packaging_text_lt"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 418.
Top values for packaging_text_lt.
Top values for packaging_text_lt (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

generic_name_lt categorical

Out[1508]:

saturn.columns["generic_name_lt"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 419.
Top values for generic_name_lt.
Top values for generic_name_lt (1 unique shown, of 1 total).
value count share
(blank) 1 2.0%

ingredients_text_fr_ocr_1713713129_result categorical

Out[1511]:

saturn.columns["ingredients_text_fr_ocr_1713713129_result"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Pâte de cacao, cacao en poudre dégraissé, beurre de cacao, sucre, lait en poudre, pâte de amandes et de noisettes, émulsifiants (lécithines (soja, toumesol)) et arôme. Cacao 92% minimum. Peut contenir des traces d'autres fruits à coque.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 420.
Top values for ingredients_text_fr_ocr_1713713129_result.
Top values for ingredients_text_fr_ocr_1713713129_result (1 unique shown, of 1 total).
value count share
Pâte de cacao, cacao en poudre dégraissé, beurre de cacao, sucre, lait en poudre, pâte de amandes et de noisettes, émulsifiants (lécithines (soja, toumesol)) et arôme. Cacao 92% minimum. Peut contenir des traces d'autres fruits à coque. 1 2.0%

ingredients_text_fr_ocr_1713713129 categorical

Out[1514]:

saturn.columns["ingredients_text_fr_ocr_1713713129"].stats

stat value
n 50
nulls 49 (98.0%)
unique 1
top_value Ingrédients : Pâte de cacao, cacao en poudre dégraissé, beurre de cacao, sucre, lait en poudre, pâte de amandes et de noisettes, émulsifiants (lécithines (soja, toumesol)) et arôme. Cacao 92% minimum. Peut contenir des traces d'autres fruits à coque.
top_rate 1
cardinality 1
entropy 0
entropy_ratio 0
alert: long_tail 1 singleton categories
alert: null_rate 98.0% null
alert: imbalance top value is 100.0% of rows
Fig 421.
Top values for ingredients_text_fr_ocr_1713713129.
Top values for ingredients_text_fr_ocr_1713713129 (1 unique shown, of 1 total).
value count share
Ingrédients : Pâte de cacao, cacao en poudre dégraissé, beurre de cacao, sucre, lait en poudre, pâte de amandes et de noisettes, émulsifiants (lécithines (soja, toumesol)) et arôme. Cacao 92% minimum. Peut contenir des traces d'autres fruits à coque. 1 2.0%

How to cite


BibTeX
@misc{saturn-data-trove-openfoodfacts-database-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: data trove openfoodfacts database},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/data-trove-openfoodfacts-database}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:default},
}
APA
Steuber, L. (2026). Saturn reading: data trove openfoodfacts database. Source: /home/coolhand/html/datavis/data_trove/cache/wild/openfoodfacts_sample.json. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:default). Retrieved from https://dr.eamer.dev/saturn/view/data-trove-openfoodfacts-database