accessibility atlas cdc dhds disability prevalence

source /home/coolhand/datasets/accessibility-atlas/cdc_dhds_disability_prevalence.csv 3,592 rows 30 columns profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This dataset contains 3,592 BRFSS-derived records of age-adjusted disability prevalence among U.S. adults 18+, broken out by location (65 values, covering states, DC, territories, and HHS regions), year (2016-2022), and 8 disability response types. The core measure is Data_Value (percent prevalence), which ranges from 1.8% to 81.3% with a median of 9.1%. Its distribution is bimodal rather than simply right-skewed: 3,138 values fall below 43.5%, and a separate cluster of 449 values sits between 57.5% and 81.3%, which the profiler flags as outliers. Most metadata columns (Category, Indicator, DataSource, Stratification1, etc.) are constant single-value fields and can be ignored as filters. The two things worth a closer look are the distribution of Data_Value across the 8 disability types in Response, and the geographic spread via LocationDesc. Response is perfectly balanced (449 rows per type) and LocationDesc is nearly so (most locations have 56 rows), so almost all variation will come from the prevalence values themselves.

citing: row_count · column_count · Data_Value · Response · LocationDesc · Year · WeightedNumber · Category

Charts the summary said to look at first

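Data_Value · Prevalence percentages have a median near 9.1% and fall into two groups: 3,138 values below 43.5% and a cluster of 449 values between 57.5% and 81.3%, with nothing in between. The upper cluster is the same size as one Response category (449 rows), which suggests it is a single category, most plausibly No Disability, rather than a set of anomalies.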

Histogram bins for Data_Value (median: 9.1).
bin	count
1.8 – 3.788	442
3.788 – 5.775	606
5.775 – 7.763	551
7.763 – 9.75	297
9.75 – 11.74	343
11.74 – 13.73	237
13.73 – 15.71	112
15.71 – 17.7	58
17.7 – 19.69	38
19.69 – 21.68	50
21.68 – 23.66	84
23.66 – 25.65	88
25.65 – 27.64	79
27.64 – 29.62	66
29.62 – 31.61	29
31.61 – 33.6	25
33.6 – 35.59	14
35.59 – 37.57	7
37.57 – 39.56	8
39.56 – 41.55	2
41.55 – 43.54	2
43.54 – 45.52	0
45.52 – 47.51	0
47.51 – 49.5	0
49.5 – 51.49	0
51.49 – 53.48	0
53.48 – 55.46	0
55.46 – 57.45	0
57.45 – 59.44	3
59.44 – 61.42	5
61.42 – 63.41	8
63.41 – 65.4	10
65.4 – 67.39	21
67.39 – 69.38	25
69.38 – 71.36	44
71.36 – 73.35	76
73.35 – 75.34	81
75.34 – 77.33	92
77.33 – 79.31	66
79.31 – 81.3	18
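
One way to see which rows fill the upper bins is to summarise Data_Value separately for each Response value. The following is a minimal sketch using only the standard library. The function name and the sample rows are made up for illustration, and it assumes suppressed values arrive as empty strings when the CSV is read with `csv.DictReader`.

```python
from collections import defaultdict
from statistics import median


def summarise_by_response(rows):
    """Return {response: (count, min, median, max)} over non-empty Data_Value."""
    groups = defaultdict(list)
    for row in rows:
        raw = row.get("Data_Value", "")
        if raw in ("", None):  # assumption: suppressed values are empty
            continue
        groups[row["Response"]].append(float(raw))
    return {
        response: (len(values), min(values), median(values), max(values))
        for response, values in groups.items()
    }


# Made-up rows for illustration only; they are not taken from the dataset.
sample = [
    {"Response": "No Disability", "Data_Value": "74.0"},
    {"Response": "No Disability", "Data_Value": "70.0"},
    {"Response": "Vision Disability", "Data_Value": "4.2"},
    {"Response": "Vision Disability", "Data_Value": ""},
]
summary = summarise_by_response(sample)
```

If one Response value accounts for every value above the empty 43.5 to 57.5 range, the flagged rows are a category effect and should not be treated as anomalies.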

Response · All 8 disability types appear in equal counts (449 each), confirming a balanced design across Cognitive, Mobility, Vision, Hearing, Self-care, Independent Living, Any, and No Disability.

Top values for Response (8 unique shown, of 8 total).
value	count	share
Cognitive Disability	449	12.5%
No Disability	449	12.5%
Mobility Disability	449	12.5%
Independent Living Disability	449	12.5%
Any Disability	449	12.5%
Vision Disability	449	12.5%
Self-care Disability	449	12.5%
Hearing Disability	449	12.5%

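Year · Records span 2016-2022 with 504 to 520 rows per year. Coverage is close to even, but only 2016 and 2022 reach the full 520 rows (65 locations × 8 response types), so confirm which locations are missing in 2017-2021 before doing any trend analysis.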

Row counts per Year (median: 2019.0).
year	count
2016	520
2017	512
2018	512
2019	504
2020	512
2021	512
2022	520
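
The per-year counts above translate directly into a number of reporting locations per year. The following is a small worked check, assuming every location-year that is present carries all 8 Response rows. That is consistent with each Response value having 449 rows, but it remains an assumption until checked against the raw file.

```python
# Row counts per year, copied from the Year table in this report.
ROWS_PER_YEAR = {
    2016: 520, 2017: 512, 2018: 512, 2019: 504,
    2020: 512, 2021: 512, 2022: 520,
}
N_RESPONSES = 8   # distinct Response values
N_LOCATIONS = 65  # distinct LocationDesc values

# Locations reporting in each year, under the assumption stated above.
locations_per_year = {
    year: count // N_RESPONSES for year, count in ROWS_PER_YEAR.items()
}

# Location-years absent from an otherwise complete 65 x 7 panel.
missing_location_years = sum(
    N_LOCATIONS - n for n in locations_per_year.values()
)
total_rows = sum(ROWS_PER_YEAR.values())
```

Under that assumption, 6 location-years are absent, which accounts for the 48-row gap between the 3,592 rows present and the 3,640 rows of a complete panel.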

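LocationDesc · 65 locations (states, DC, territories, and HHS regions). Most contribute 56 rows (7 years × 8 response types), but 3,592 rows is 48 short of the 3,640 a complete panel would hold, so a few location-years are missing. Confirm geographic completeness before mapping prevalence.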

Top values for LocationDesc (20 unique shown, of 65 total).
value	count	share
Pennsylvania	56	1.6%
Louisiana	56	1.6%
Arkansas	56	1.6%
Wyoming	56	1.6%
Alaska	56	1.6%
Maryland	56	1.6%
Guam	56	1.6%
Massachusetts	56	1.6%
West Virginia	56	1.6%
Utah	56	1.6%
North Dakota	56	1.6%
North Carolina	56	1.6%
Ohio	56	1.6%
South Dakota	56	1.6%
Connecticut	56	1.6%
Oregon	56	1.6%
Minnesota	56	1.6%
HHS Region 6	56	1.6%
Michigan	56	1.6%
HHS Region 8	56	1.6%
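
To confirm geographic completeness, one option is to list every (location, year) pair that has no rows at all. The following is a minimal sketch. The function name and the sample rows are hypothetical, and real rows would come from reading the CSV with `csv.DictReader`.

```python
from itertools import product


def missing_location_years(rows):
    """Return sorted (location, year) pairs that have no rows at all."""
    locations = {row["LocationDesc"] for row in rows}
    years = {int(row["Year"]) for row in rows}
    present = {(row["LocationDesc"], int(row["Year"])) for row in rows}
    return sorted(set(product(locations, years)) - present)


# Made-up rows for illustration only; "Alpha" and "Beta" are placeholders.
sample = [
    {"LocationDesc": "Alpha", "Year": "2016"},
    {"LocationDesc": "Alpha", "Year": "2017"},
    {"LocationDesc": "Beta", "Year": "2016"},
]
gaps = missing_location_years(sample)
```

Any pair returned here is a location to exclude from, or annotate in, a year-over-year map.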

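WeightedNumber · Weighted population estimates range from ~1.6K to 181M with extreme skew (kurtosis ~262). The largest values exceed the adult population of any single state, so the tail most likely comes from aggregate rows such as HHS regions or a national total rather than from large states.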

Histogram bins for WeightedNumber (median: 418252.0).
bin	count
1641 – 4.532e+06	3285
4.532e+06 – 9.063e+06	156
9.063e+06 – 1.359e+07	42
1.359e+07 – 1.812e+07	40
1.812e+07 – 2.265e+07	15
2.265e+07 – 2.718e+07	4
2.718e+07 – 3.172e+07	19
3.172e+07 – 3.625e+07	12
3.625e+07 – 4.078e+07	0
4.078e+07 – 4.531e+07	0
4.531e+07 – 4.984e+07	0
4.984e+07 – 5.437e+07	0
5.437e+07 – 5.89e+07	0
5.89e+07 – 6.343e+07	1
6.343e+07 – 6.796e+07	5
6.796e+07 – 7.249e+07	0
7.249e+07 – 7.702e+07	1
7.702e+07 – 8.155e+07	0
8.155e+07 – 8.608e+07	0
8.608e+07 – 9.061e+07	0
9.061e+07 – 9.514e+07	0
9.514e+07 – 9.967e+07	0
9.967e+07 – 1.042e+08	0
1.042e+08 – 1.087e+08	0
1.087e+08 – 1.133e+08	0
1.133e+08 – 1.178e+08	0
1.178e+08 – 1.223e+08	0
1.223e+08 – 1.269e+08	0
1.269e+08 – 1.314e+08	0
1.314e+08 – 1.359e+08	0
1.359e+08 – 1.404e+08	0
1.404e+08 – 1.45e+08	0
1.45e+08 – 1.495e+08	0
1.495e+08 – 1.54e+08	0
1.54e+08 – 1.586e+08	0
1.586e+08 – 1.631e+08	0
1.631e+08 – 1.676e+08	1
1.676e+08 – 1.722e+08	2
1.722e+08 – 1.767e+08	0
1.767e+08 – 1.812e+08	4

Schema

30 columns

Per-column summary.
Column	Type	Nulls	Unique	Alerts
Year	numeric	0.0%	7
LocationAbbr	categorical	0.0%	65
LocationDesc	categorical	0.0%	65
DataSource	categorical	0.0%	1	imbalance
Category	categorical	0.0%	1	imbalance
Indicator	categorical	0.0%	1	imbalance
Response	categorical	0.0%	8
Data_Value_Unit	categorical	0.0%	1	imbalance
Data_Value_Type	categorical	0.0%	1	imbalance
Data_Value	numeric	0.1%	486	outliers
Data_Value_Alt	numeric	0.1%	486	outliers
Data_Value_Footnote_Symbol	categorical	99.9%	1	null_rate imbalance
Data_Value_Footnote	categorical	99.9%	1	null_rate imbalance
Low_Confidence_Limit	numeric	0.1%	489	outliers
High_Confidence_Limit	numeric	0.1%	503	outliers
Number	numeric	0.1%	2,267	high_skew outliers
WeightedNumber	numeric	0.1%	3,580	high_skew outliers
StratificationCategory1	categorical	0.0%	1	imbalance
Stratification1	categorical	0.0%	1	imbalance
StratificationCategory2	unknown	0.0%	—	skipped
Stratification2	unknown	0.0%	—	skipped
CategoryID	categorical	0.0%	1	imbalance
IndicatorID	categorical	0.0%	1	imbalance
LocationID	numeric	0.0%	65
ResponseID	categorical	0.0%	8
DataValueTypeID	categorical	0.0%	1	imbalance
StratificationCategoryID1	categorical	0.0%	1	imbalance
StratificationID1	categorical	0.0%	1	imbalance
StratificationCategoryID2	unknown	0.0%	—	skipped
StratificationID2	unknown	0.0%	—	skipped
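
Many of the columns above have a single distinct value. The following is a minimal sketch of how such columns could be detected before dropping them, assuming the rows are dictionaries from `csv.DictReader` and that empty strings represent nulls. The function name and the sample rows are illustrative.

```python
def constant_columns(rows):
    """Return names of columns with exactly one distinct non-empty value."""
    distinct = {}
    for row in rows:
        for column, value in row.items():
            values = distinct.setdefault(column, set())
            if value not in ("", None):  # assumption: empty string means null
                values.add(value)
    return sorted(col for col, values in distinct.items() if len(values) == 1)


# Made-up rows for illustration only.
sample = [
    {"DataSource": "BRFSS", "Response": "Any Disability",
     "Data_Value_Footnote": ""},
    {"DataSource": "BRFSS", "Response": "No Disability",
     "Data_Value_Footnote": "Data suppressed"},
]
to_review = constant_columns(sample)
```

The result is a review list, not a drop list: a mostly-null column such as Data_Value_Footnote also has one distinct value, yet it still marks which rows were suppressed.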

Year

numeric timestamp

This is a Year column spanning 2016 to 2022 with only 7 unique values across 3,592 rows, no nulls, and a perfectly symmetric distribution centered on 2019 (mean = median = 2019). Despite being typed numeric, it functions as a low-cardinality temporal category. There are no outliers and no zero values, so the field is clean. Treatment: Treat as an ordinal/categorical year for grouping or one-hot encoding rather than a continuous numeric feature. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 7
min: 2016
max: 2022
mean: 2019
median: 2019
std: 2.008
q1: 2017
q3: 2021
iqr: 4
skew: 0
kurtosis: -1.259
n_outliers: 0
outlier_rate: 0
zero_rate: 0

LocationAbbr

categorical foreign_key

This is a US location abbreviation code (e.g., PA, LA, AR, WY, GU), serving as a geographic key. With 65 unique values across 3,592 rows and a near-uniform distribution (entropy ratio 0.999, top_rate just 0.0156), most codes appear exactly 56 times, which matches 7 years × 8 response types and points to a nearly balanced panel. The cardinality of 65 exceeds the 50 states because territories and regional aggregates are included (LocationDesc lists Guam and HHS regions). Treatment: left-join on this code to enrich with state/territory metadata, or one-hot encode for modelling. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 65
top_value: PA
top_rate: 0.01559
cardinality: 65
entropy: 6.017
entropy_ratio: 0.9992

LocationDesc

categorical feature

LocationDesc is a location name field with 65 distinct values including states, DC, territories like Guam, and HHS regions. The distribution is close to uniform (entropy_ratio 0.999, with the top 10 values all tied at 56 occurrences), so this is a nearly balanced panel. Because 65 × 56 = 3,640 exceeds the 3,592 rows present, a few locations must contribute fewer than 56 rows. No nulls and a tidy, closed vocabulary. Treatment: Use as a categorical grouping key; one-hot or target-encode if modelling. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 65
top_value: Pennsylvania
top_rate: 0.01559
cardinality: 65
entropy: 6.017
entropy_ratio: 0.9992

DataSource

categorical metadata imbalance

This column records the dataset's provenance, with every one of the 3592 rows tagged "BRFSS". Cardinality is 1 and entropy is 0, so it carries no discriminative signal. Treatment: Drop; constant column adds no information. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: BRFSS
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

Indicator

categorical metadata imbalance

This column holds a single constant string ('Disability status and types among adults 18 years of age or older') across all 3,592 rows, with cardinality 1 and entropy 0. It carries no information for modelling and likely just labels the survey indicator the dataset was filtered to. Treatment: Drop; constant column with zero entropy. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: Disability status and types among adults 18 years of age or older
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

Response

categorical label

This column enumerates a disability response category, with 8 distinct values such as 'Cognitive Disability', 'No Disability', and 'Hearing Disability'. The distribution is perfectly uniform — each of the 8 values appears exactly 449 times (top_rate 0.125, entropy_ratio 1.0), indicating the dataset is balanced or pivoted by category rather than sampled organically. There are no nulls. Treatment: Use as a categorical label; one-hot or factor encode for modelling. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 8
top_value: Cognitive Disability
top_rate: 0.125
cardinality: 8
entropy: 3
entropy_ratio: 1
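
The entropy figures for Response can be reproduced by hand. The sketch below assumes the profiler reports Shannon entropy in bits and divides by log2 of the cardinality to get entropy_ratio. The profile does not state this, but it is consistent with an entropy of 3 for 8 equally frequent values. The placeholder labels stand in for the real Response values.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy of the value distribution, in bits."""
    counts = Counter(values)
    if len(counts) <= 1:
        return 0.0
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_ratio(values):
    """Entropy divided by its maximum, log2(number of distinct values)."""
    distinct = len(set(values))
    if distinct <= 1:
        return 0.0
    return entropy_bits(values) / math.log2(distinct)


# 8 placeholder labels, 449 rows each, mirroring the Response counts.
responses = [f"type_{i}" for i in range(8) for _ in range(449)]
h = entropy_bits(responses)
ratio = entropy_ratio(responses)
```

With these counts the entropy is 3 bits and the ratio is 1, matching the figures listed above. A constant column gives 0 for both.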

Data_Value_Unit

categorical metadata imbalance

This column records the unit of measurement for the data values, and it is constant: every one of the 3592 rows carries the value "%". With cardinality 1, entropy 0, and top_rate 1.0, it provides no information for modelling or segmentation. Treatment: Drop; constant column carrying no signal. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: %
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

Data_Value_Type

categorical metadata imbalance

This column records the type of data value reported, but every one of the 3592 rows holds the single label "Age-adjusted Prevalence". Cardinality is 1 and entropy is 0, so the field carries no information for modelling or segmentation. It likely exists as a schema placeholder from a wider source where multiple value types are possible. Treatment: Drop; constant column with zero entropy. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: Age-adjusted Prevalence
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

Data_Value

numeric feature outliers

Data_Value is a continuous numeric measurement spanning 1.8 to 81.3 with a median of 9.1 but a mean of 18.25 (skew 1.88, kurtosis 2.09). The standard deviation (22.16) exceeds the mean, and the histogram shows why: the values form two separate groups with nothing between 43.5 and 57.5, which points to a mixture of response categories on different scales rather than a single long-tailed distribution. The 450 flagged outliers (12.5% of rows) are almost exactly the 449-value upper group. Nulls are negligible (0.14%) and there are no zeros, but only 486 unique values across 3,592 rows hints at rounding or a discrete reporting grid. Treatment: Check Data_Value by Response first, since the flagged rows may be one legitimate category; log-transform or winsorize before modelling only if the skew remains within a category. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 5 (0.1%)
unique: 486
min: 1.8
max: 81.3
mean: 18.25
median: 9.1
std: 22.16
q1: 5.3
q3: 19.95
iqr: 14.65
skew: 1.876
kurtosis: 2.086
n_outliers: 450
outlier_rate: 0.1255
zero_rate: 0
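
The outlier count can be related to the quartiles above with a short calculation. The sketch assumes the profiler uses Tukey's rule with a 1.5 × IQR multiplier. The profile does not state the multiplier, although the WeightedNumber description does mention an IQR fence.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# q1 and q3 for Data_Value, copied from the stats above.
lower, upper = tukey_fences(5.3, 19.95)
```

Under that assumption the upper fence is about 41.9, so every value in the histogram's upper cluster (57.45 and above) is flagged, and the lower fence is negative, so no small value can be.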

Data_Value_Alt

numeric feature outliers

A numeric measurement field (likely an alternate encoding of Data_Value) ranging from 1.8 to 81.3 with a median of 9.1 and mean of 18.25. Every summary statistic listed here is identical to the corresponding one for Data_Value, so the two columns are probably redundant. The distribution is right-skewed (skew 1.88, kurtosis 2.09) with std 22.16 exceeding the IQR of 14.65, and 12.5% of rows (450) flagged as outliers. Only 486 distinct values across 3,592 rows suggest a discretised or rounded scale rather than a continuous measure. Treatment: Keep only one of the two columns if they prove identical row by row; log-transform or winsorize before modelling to tame the right skew and outlier mass. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 5 (0.1%)
unique: 486
min: 1.8
max: 81.3
mean: 18.25
median: 9.1
std: 22.16
q1: 5.3
q3: 19.95
iqr: 14.65
skew: 1.876
kurtosis: 2.086
n_outliers: 450
outlier_rate: 0.1255
zero_rate: 0

Data_Value_Footnote_Symbol

categorical metadata null_rate imbalance

This appears to be a footnote symbol marker, almost entirely empty with a 99.86% null rate and only 5 non-null entries — all the single character '*'. With cardinality of 1 and entropy of 0, the column carries no discriminative information. Treatment: Drop; effectively constant with 99.86% nulls. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 3,587 (99.9%)
unique: 1
top_value: *
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

Data_Value_Footnote

categorical metadata null_rate imbalance

This column is a footnote/annotation field accompanying a Data_Value column, used to flag exceptional rows. It is effectively empty: 99.86% null, with only 5 non-null entries, all carrying the single value "Data suppressed" (cardinality 1, entropy 0). It carries no discriminative information on its own and only marks the handful of suppressed measurements. Treatment: Convert to a boolean is_suppressed flag and drop the original column. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 3,587 (99.9%)
unique: 1
top_value: Data suppressed
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0
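
The suggested treatment can be sketched as follows, assuming rows are dictionaries from `csv.DictReader` and that "Data suppressed" is the only non-null footnote, as the profile reports. The function name is hypothetical, and `is_suppressed` is the flag name proposed in the treatment above.

```python
def add_suppressed_flag(rows):
    """Replace Data_Value_Footnote with a boolean is_suppressed column."""
    flagged = []
    for row in rows:
        # Copy every column except the footnote text.
        new_row = {k: v for k, v in row.items() if k != "Data_Value_Footnote"}
        new_row["is_suppressed"] = (
            row.get("Data_Value_Footnote") == "Data suppressed"
        )
        flagged.append(new_row)
    return flagged


# Made-up rows for illustration only.
sample = [
    {"Data_Value": "9.1", "Data_Value_Footnote": ""},
    {"Data_Value": "", "Data_Value_Footnote": "Data suppressed"},
]
flagged = add_suppressed_flag(sample)
```

The input rows are left unmodified, so the original footnote text stays available if a second footnote value ever appears.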

Low_Confidence_Limit

numeric feature outliers

This is the lower bound of a confidence interval for a measured rate or percentage, ranging from 1.1 to 80.5 with a median of 8.2. The distribution is heavily right-skewed (skew 1.90, kurtosis 2.16) and 12.57% of values flag as outliers, the same pattern of a small group of high values above the bulk of small ones that appears in Data_Value. Nulls are negligible (0.14%) and there are no zeros. Treatment: Log-transform before modelling to tame the right skew, and pair with the matching upper limit. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 5 (0.1%)
unique: 489
min: 1.1
max: 80.5
mean: 17.31
median: 8.2
std: 21.89
q1: 4.7
q3: 18.7
iqr: 14
skew: 1.899
kurtosis: 2.159
n_outliers: 451
outlier_rate: 0.1257
zero_rate: 0

High_Confidence_Limit

numeric feature outliers

A numeric upper-confidence-bound feature, ranging from 2.2 to 83.0 with a median of 10.1 but a mean of 19.26, indicating a long right tail. The distribution is heavily right-skewed (skew 1.85, kurtosis 2.01) and 12.5% of values (449 rows) are flagged as outliers. With 503 unique values across 3592 rows and only 0.14% nulls, it behaves as a continuous measurement rather than a categorical bound. Treatment: Log-transform before modelling to compress the right tail and dampen the 12.5% outlier mass. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 5 (0.1%)
unique: 503
min: 2.2
max: 83
mean: 19.26
median: 10.1
std: 22.4
q1: 6
q3: 21.5
iqr: 15.5
skew: 1.851
kurtosis: 2.011
n_outliers: 449
outlier_rate: 0.1252
zero_rate: 0
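
Because the two limits describe one interval, they are often more useful combined than as separate features. The following is a minimal sketch that derives the interval width and checks that the estimate lies inside it. The function names and the sample row are made up, and empty strings are assumed to mark suppressed values.

```python
def interval_width(row):
    """Return High_Confidence_Limit minus Low_Confidence_Limit, or None."""
    low = row.get("Low_Confidence_Limit")
    high = row.get("High_Confidence_Limit")
    if low in ("", None) or high in ("", None):
        return None
    # Limits are reported to one decimal place, so round the difference.
    return round(float(high) - float(low), 1)


def estimate_inside_interval(row):
    """True when Low_Confidence_Limit <= Data_Value <= High_Confidence_Limit."""
    low = float(row["Low_Confidence_Limit"])
    high = float(row["High_Confidence_Limit"])
    return low <= float(row["Data_Value"]) <= high


# Made-up row for illustration only.
sample_row = {
    "Data_Value": "9.1",
    "Low_Confidence_Limit": "8.2",
    "High_Confidence_Limit": "10.1",
}
width = interval_width(sample_row)
inside = estimate_inside_interval(sample_row)
```

A wide interval signals a small sample behind the estimate, which matters more for interpretation than the raw limits do.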

Number

numeric feature high_skew outliers

This is a numeric 'Number' column, almost certainly a count or quantity metric rather than an identifier, given 2,267 unique values across 3,592 rows and a minimum of 31. Nulls are negligible (5 rows, 0.14%). The distribution is severely right-skewed (skew 14.57, kurtosis 256.99): the median is 978 while the mean is 3,780 and the max reaches 327,817, with 385 outliers (10.7%) flagged. The interquartile range (467 to 2,750, a width of 2,283) is tiny relative to the max, so a handful of extreme values dominate the variance (std 15,294). Treatment: Log-transform (or winsorize) before any distance- or variance-based modelling. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 5 (0.1%)
unique: 2,267
min: 31
max: 327,817
mean: 3,780
median: 978
std: 1.529e+04
q1: 467
q3: 2,750
iqr: 2,283
skew: 14.57
kurtosis: 257
n_outliers: 385
outlier_rate: 0.1073
zero_rate: 0

WeightedNumber

numeric feature high_skew outliers

WeightedNumber is a numeric measure with 3,580 distinct values across 3,592 rows, ranging from 1,641 to 181,223,676 with a median of 418,252 but a mean of 2,103,449. The distribution is severely right-skewed (skew 14.65, kurtosis 262.16) and 444 rows (12.4%) fall outside the IQR fence, suggesting a long tail of very large weighted counts that dominates the mean. Treatment: log-transform before modelling to tame the heavy right tail. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 5 (0.1%)
unique: 3,580
min: 1,641
max: 1.812e+08
mean: 2.103e+06
median: 418,252
std: 9.082e+06
q1: 149,677
q3: 1.303e+06
iqr: 1.153e+06
skew: 14.65
kurtosis: 262.2
n_outliers: 444
outlier_rate: 0.1238
zero_rate: 0
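
The log-transform suggested above can be sketched with the standard library. The choice of base 10 is an assumption made for readability, since the result is then the order of magnitude. The function name is hypothetical, and empty strings are assumed to mark the 5 null rows.

```python
import math


def log10_or_none(value):
    """Return log10 of a positive number, or None for a missing value."""
    if value in ("", None):
        return None
    number = float(value)
    if number <= 0:
        raise ValueError("log transform needs a strictly positive value")
    return math.log10(number)


# Minimum and maximum of WeightedNumber, copied from the stats above.
low_end = log10_or_none(1641)
high_end = log10_or_none(181223676)
```

On this scale the column runs from roughly 3.2 to 8.3, about five orders of magnitude, instead of a range in which a few values near 181M dominate.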

StratificationCategory1

categorical metadata imbalance

This column is a stratification dimension label, but every one of the 3592 rows holds the single value "Overall" (top_rate 1.0, cardinality 1, entropy 0.0). It carries no information and likely indicates this slice of the source dataset was filtered to the un-stratified aggregate. Treatment: Drop; constant column with zero entropy. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: Overall
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

Stratification1

categorical metadata imbalance

This column is a stratification label that takes the single value "Overall" across all 3592 rows. With cardinality 1 and entropy 0, it carries no information and cannot differentiate records. It likely indicates that this slice of the source data was not broken out by any subgroup. Treatment: drop, constant column with a single value. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: Overall
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

StratificationCategory2

unknown metadata skipped

Column was skipped by the profiler, so no value-level statistics are available beyond a row count of 3592 and a null rate of 0.0. The name suggests a secondary stratification dimension used alongside a primary category, typical of public health or survey datasets. Without unique counts or value distributions, its content cannot be characterised further. Treatment: Re-profile with the skip removed to inspect cardinality before deciding on encoding. low · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: —

Stratification2

unknown other skipped

Saturn skipped detailed profiling for Stratification2, so only the row count (3592) and a 0.0 null rate are known. With no unique count, type, or value distribution available, the column's content cannot be characterised from this evidence alone. The name suggests a secondary stratification key paired with a primary Stratification1 field, but that is not confirmed by the stats. Treatment: Re-profile or inspect raw values before deciding; do not use until kind and cardinality are established. low · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: —

CategoryID

categorical metadata imbalance

CategoryID is a categorical column that carries no information: every one of the 3592 rows holds the single value "DISEST", giving cardinality 1 and entropy 0. It likely encodes a fixed dataset-level tag or filter rather than a per-row attribute. Treatment: Drop; constant column with zero variance. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: DISEST
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

IndicatorID

categorical metadata imbalance

IndicatorID is a categorical column that holds the single value "STATTYPE" across all 3592 rows, with zero nulls and cardinality of 1. Entropy is 0.0, so the field carries no information and likely functions as a constant tag identifying the indicator type for this slice of the dataset. Treatment: Drop before modelling; constant column with no variance. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: STATTYPE
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

LocationID

numeric foreign_key

LocationID is almost certainly a categorical location key encoded as integers, with 65 distinct values across 3592 rows and no nulls. Values range from 1 to 89 with a median of 36 and mild positive skew (0.50), consistent with an ID lookup rather than a measured quantity. Treating it as numeric would be misleading despite its int dtype. Treatment: Cast to categorical and left-join to a location lookup table rather than using as a numeric feature. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 65
min: 1
max: 89
mean: 39.69
median: 36
std: 25.34
q1: 20
q3: 54
iqr: 34
skew: 0.5048
kurtosis: -0.7622
n_outliers: 0
outlier_rate: 0
zero_rate: 0
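
The cast-and-join treatment can be sketched as follows. The lookup table, its `location_kind` field, and the function name are all hypothetical, since the profile does not describe a lookup table. The sample rows are made up, using only the minimum and maximum IDs reported above.

```python
def left_join_locations(rows, lookup, fields):
    """Left-join rows to a lookup keyed by LocationID cast to string."""
    joined = []
    for row in rows:
        key = str(row["LocationID"])  # treat the ID as a label, not a number
        match = lookup.get(key, {})
        # Unmatched rows are kept, with None for every lookup field.
        extra = {field: match.get(field) for field in fields}
        joined.append({**row, "LocationID": key, **extra})
    return joined


# Hypothetical lookup and made-up rows for illustration only.
lookup = {"1": {"location_kind": "state"}}
sample = [
    {"LocationID": 1, "Data_Value": "9.1"},
    {"LocationID": 89, "Data_Value": "7.0"},
]
result = left_join_locations(sample, lookup, ["location_kind"])
```

Keeping unmatched rows is the point of a left join here: a LocationID missing from the lookup shows up as None and does not silently drop prevalence rows.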

ResponseID

categorical feature

ResponseID holds 8 distinct codes (Q6COG, Q6DIS2, Q6MOB, Q6IND, Q6DIS1, Q6VIS, Q6SEL, Q6HEAR), each appearing exactly 449 times across 3,592 rows with no nulls. The perfectly uniform distribution and entropy ratio of 1.0 indicate this is a question/disability-domain identifier repeated for every location-year rather than a unique response key. Despite the name, it behaves as a categorical factor, not an identifier. Treatment: Treat as a categorical factor (one-hot or group-by key); do not use as a unique row id. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 8
top_value: Q6COG
top_rate: 0.125
cardinality: 8
entropy: 3
entropy_ratio: 1

DataValueTypeID

categorical metadata imbalance

DataValueTypeID is a categorical metadata field indicating the type of statistical measure reported, but every one of the 3592 rows carries the single value 'AGEADJPREV' (age-adjusted prevalence). Cardinality is 1 and entropy is 0, so the column carries no information for modelling or filtering. Treatment: Drop; constant column with no variance. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: AGEADJPREV
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

StratificationCategoryID1

categorical metadata imbalance

This column holds a single constant value "CAT1" across all 3592 rows, with zero nulls and cardinality of 1. Entropy is 0, meaning it carries no information for any downstream task. The name suggests it was meant to identify a stratification category, but only one category is represented in this slice. Treatment: Drop; constant column with no variance. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: CAT1
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

StratificationID1

categorical metadata imbalance

This column holds a single constant value 'BO1' across all 3592 rows, with cardinality 1 and entropy 0. As a 'StratificationID1' it likely encodes a stratification dimension (e.g., overall/total) that was never varied in this slice. It carries no information for modelling or grouping. Treatment: Drop; constant column with zero entropy. high · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: 1
top_value: BO1
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0

StratificationCategoryID2

unknown other skipped

This column is named StratificationCategoryID2, suggesting it holds a secondary stratification category identifier in a public-health style dataset. Saturn skipped profiling, so no uniqueness, value, or distribution stats are available beyond a row count of 3592 and a null rate of 0.0. Without further signals, its actual content and cardinality cannot be characterised here. Treatment: Re-profile with type coercion to confirm whether this is a categorical key before use. low · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: —

StratificationID2

unknown foreign_key skipped

StratificationID2 was skipped by the profiler, so its kind, uniqueness, and value distribution are unknown. The only confirmed signals are that it has 3592 rows and a null rate of 0.0. The name suggests a secondary stratification key (e.g., demographic subgroup) commonly paired with a StratificationCategoryID2 in CDC-style indicator tables. Treatment: Re-profile the column to determine cardinality, then treat as a categorical join key against its stratification lookup. low · anthropic:claude-opus-4-7

n: 3,592
nulls: 0 (0.0%)
unique: —