accessibility atlas cdc dhds disability prevalence
Reading
This dataset contains 3,592 BRFSS-derived records of age-adjusted disability prevalence among U.S. adults 18+, broken out by location (65 values: states and territories, plus aggregates such as HHS regions), year (2016-2022), and 8 response types (six disability types plus "Any Disability" and "No Disability"). The core measure is Data_Value (percent prevalence), which ranges from 1.8% to 81.3% with a median of 9.1%. Its distribution is right-skewed overall and flagged for outliers, but the histogram is bimodal: 449 values sit in a separate cluster above roughly 57%, which matches the 449 rows per response type and most plausibly corresponds to the "No Disability" response. Most metadata columns (Category, Indicator, DataSource, Stratification1, etc.) are constant single-value fields and cannot serve as filters. The two things worth a closer look are the distribution of Data_Value across the 8 response types in Response, and the geographic spread via LocationDesc. Response is perfectly balanced in row counts (449 each) and LocationDesc is nearly so (at most 56 rows per location), so any variation will come from the prevalence values themselves.
citing: row_count · column_count · Data_Value · Response · LocationDesc · Year · WeightedNumber · Category
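Since most metadata columns are constant, one way to narrow the table is to drop every column with a single distinct value. The following is a minimal pandas sketch under stated assumptions: the function name `drop_constant_columns` is hypothetical, and the toy rows use made-up values with only the column names taken from the profile.

```python
import pandas as pd


def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without columns that have at most one distinct non-null value."""
    keep = [col for col in df.columns if df[col].nunique(dropna=True) > 1]
    return df[keep].copy()


# Toy rows with made-up values; only the column names come from the profile.
sample = pd.DataFrame(
    {
        "Year": [2016, 2017, 2018],
        "DataSource": ["BRFSS", "BRFSS", "BRFSS"],
        "Category": ["Disability Estimates"] * 3,
        "Response": ["Any Disability", "No Disability", "Vision Disability"],
        "Data_Value": [25.0, 75.0, 4.0],
    }
)
trimmed = drop_constant_columns(sample)
print(list(trimmed.columns))
```

Note that this rule would also remove Data_Value_Footnote, which has one distinct non-null value, so any suppression flag should be derived from it before the drop.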
Charts the summary said to look at first
Data_Value (histogram)
| bin | count |
|---|---|
| 1.8 – 3.788 | 442 |
| 3.788 – 5.775 | 606 |
| 5.775 – 7.763 | 551 |
| 7.763 – 9.75 | 297 |
| 9.75 – 11.74 | 343 |
| 11.74 – 13.73 | 237 |
| 13.73 – 15.71 | 112 |
| 15.71 – 17.7 | 58 |
| 17.7 – 19.69 | 38 |
| 19.69 – 21.68 | 50 |
| 21.68 – 23.66 | 84 |
| 23.66 – 25.65 | 88 |
| 25.65 – 27.64 | 79 |
| 27.64 – 29.62 | 66 |
| 29.62 – 31.61 | 29 |
| 31.61 – 33.6 | 25 |
| 33.6 – 35.59 | 14 |
| 35.59 – 37.57 | 7 |
| 37.57 – 39.56 | 8 |
| 39.56 – 41.55 | 2 |
| 41.55 – 43.54 | 2 |
| 43.54 – 45.52 | 0 |
| 45.52 – 47.51 | 0 |
| 47.51 – 49.5 | 0 |
| 49.5 – 51.49 | 0 |
| 51.49 – 53.48 | 0 |
| 53.48 – 55.46 | 0 |
| 55.46 – 57.45 | 0 |
| 57.45 – 59.44 | 3 |
| 59.44 – 61.42 | 5 |
| 61.42 – 63.41 | 8 |
| 63.41 – 65.4 | 10 |
| 65.4 – 67.39 | 21 |
| 67.39 – 69.38 | 25 |
| 69.38 – 71.36 | 44 |
| 71.36 – 73.35 | 76 |
| 73.35 – 75.34 | 81 |
| 75.34 – 77.33 | 92 |
| 77.33 – 79.31 | 66 |
| 79.31 – 81.3 | 18 |
Response (value counts)
| value | count | share |
|---|---|---|
| Cognitive Disability | 449 | 12.5% |
| No Disability | 449 | 12.5% |
| Mobility Disability | 449 | 12.5% |
| Independent Living Disability | 449 | 12.5% |
| Any Disability | 449 | 12.5% |
| Vision Disability | 449 | 12.5% |
| Self-care Disability | 449 | 12.5% |
| Hearing Disability | 449 | 12.5% |
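Because every response type has the same row count, the informative comparison is the spread of Data_Value within each type. A minimal pandas sketch follows; the helper name `prevalence_by_response` is hypothetical and the toy values are invented for illustration, not taken from the dataset.

```python
import pandas as pd


def prevalence_by_response(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise Data_Value within each Response category, highest median first."""
    return (
        df.groupby("Response")["Data_Value"]
        .agg(["count", "min", "median", "max"])
        .sort_values("median", ascending=False)
    )


# Invented values; None stands in for a suppressed (null) estimate.
toy = pd.DataFrame(
    {
        "Response": [
            "No Disability", "No Disability",
            "Any Disability", "Any Disability",
            "Vision Disability", "Vision Disability",
        ],
        "Data_Value": [75.0, 71.0, 25.0, 29.0, 4.0, None],
    }
)
summary = prevalence_by_response(toy)
print(summary)
```

The `count` column excludes nulls, so it also shows which response types carry suppressed estimates.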
Year (rows per year)
| year | count |
|---|---|
| 2016 | 520 |
| 2017 | 512 |
| 2018 | 512 |
| 2019 | 504 |
| 2020 | 512 |
| 2021 | 512 |
| 2022 | 520 |
LocationDesc (20 of 65 values shown)
| value | count | share |
|---|---|---|
| Pennsylvania | 56 | 1.6% |
| Louisiana | 56 | 1.6% |
| Arkansas | 56 | 1.6% |
| Wyoming | 56 | 1.6% |
| Alaska | 56 | 1.6% |
| Maryland | 56 | 1.6% |
| Guam | 56 | 1.6% |
| Massachusetts | 56 | 1.6% |
| West Virginia | 56 | 1.6% |
| Utah | 56 | 1.6% |
| North Dakota | 56 | 1.6% |
| North Carolina | 56 | 1.6% |
| Ohio | 56 | 1.6% |
| South Dakota | 56 | 1.6% |
| Connecticut | 56 | 1.6% |
| Oregon | 56 | 1.6% |
| Minnesota | 56 | 1.6% |
| HHS Region 6 | 56 | 1.6% |
| Michigan | 56 | 1.6% |
| HHS Region 8 | 56 | 1.6% |
WeightedNumber (histogram)
| bin | count |
|---|---|
| 1641 – 4.532e+06 | 3285 |
| 4.532e+06 – 9.063e+06 | 156 |
| 9.063e+06 – 1.359e+07 | 42 |
| 1.359e+07 – 1.812e+07 | 40 |
| 1.812e+07 – 2.265e+07 | 15 |
| 2.265e+07 – 2.718e+07 | 4 |
| 2.718e+07 – 3.172e+07 | 19 |
| 3.172e+07 – 3.625e+07 | 12 |
| 3.625e+07 – 4.078e+07 | 0 |
| 4.078e+07 – 4.531e+07 | 0 |
| 4.531e+07 – 4.984e+07 | 0 |
| 4.984e+07 – 5.437e+07 | 0 |
| 5.437e+07 – 5.89e+07 | 0 |
| 5.89e+07 – 6.343e+07 | 1 |
| 6.343e+07 – 6.796e+07 | 5 |
| 6.796e+07 – 7.249e+07 | 0 |
| 7.249e+07 – 7.702e+07 | 1 |
| 7.702e+07 – 8.155e+07 | 0 |
| 8.155e+07 – 8.608e+07 | 0 |
| 8.608e+07 – 9.061e+07 | 0 |
| 9.061e+07 – 9.514e+07 | 0 |
| 9.514e+07 – 9.967e+07 | 0 |
| 9.967e+07 – 1.042e+08 | 0 |
| 1.042e+08 – 1.087e+08 | 0 |
| 1.087e+08 – 1.133e+08 | 0 |
| 1.133e+08 – 1.178e+08 | 0 |
| 1.178e+08 – 1.223e+08 | 0 |
| 1.223e+08 – 1.269e+08 | 0 |
| 1.269e+08 – 1.314e+08 | 0 |
| 1.314e+08 – 1.359e+08 | 0 |
| 1.359e+08 – 1.404e+08 | 0 |
| 1.404e+08 – 1.45e+08 | 0 |
| 1.45e+08 – 1.495e+08 | 0 |
| 1.495e+08 – 1.54e+08 | 0 |
| 1.54e+08 – 1.586e+08 | 0 |
| 1.586e+08 – 1.631e+08 | 0 |
| 1.631e+08 – 1.676e+08 | 1 |
| 1.676e+08 – 1.722e+08 | 2 |
| 1.722e+08 – 1.767e+08 | 0 |
| 1.767e+08 – 1.812e+08 | 4 |
Schema
30 columns

| Column | Type | Nulls | Unique | Alerts |
|---|---|---|---|---|
| Year | numeric | 0.0% | 7 | |
| LocationAbbr | categorical | 0.0% | 65 | |
| LocationDesc | categorical | 0.0% | 65 | |
| DataSource | categorical | 0.0% | 1 | imbalance |
| Category | categorical | 0.0% | 1 | imbalance |
| Indicator | categorical | 0.0% | 1 | imbalance |
| Response | categorical | 0.0% | 8 | |
| Data_Value_Unit | categorical | 0.0% | 1 | imbalance |
| Data_Value_Type | categorical | 0.0% | 1 | imbalance |
| Data_Value | numeric | 0.1% | 486 | outliers |
| Data_Value_Alt | numeric | 0.1% | 486 | outliers |
| Data_Value_Footnote_Symbol | categorical | 99.9% | 1 | null_rate, imbalance |
| Data_Value_Footnote | categorical | 99.9% | 1 | null_rate, imbalance |
| Low_Confidence_Limit | numeric | 0.1% | 489 | outliers |
| High_Confidence_Limit | numeric | 0.1% | 503 | outliers |
| Number | numeric | 0.1% | 2,267 | high_skew, outliers |
| WeightedNumber | numeric | 0.1% | 3,580 | high_skew, outliers |
| StratificationCategory1 | categorical | 0.0% | 1 | imbalance |
| Stratification1 | categorical | 0.0% | 1 | imbalance |
| StratificationCategory2 | unknown | 0.0% | — | skipped |
| Stratification2 | unknown | 0.0% | — | skipped |
| CategoryID | categorical | 0.0% | 1 | imbalance |
| IndicatorID | categorical | 0.0% | 1 | imbalance |
| LocationID | numeric | 0.0% | 65 | |
| ResponseID | categorical | 0.0% | 8 | |
| DataValueTypeID | categorical | 0.0% | 1 | imbalance |
| StratificationCategoryID1 | categorical | 0.0% | 1 | imbalance |
| StratificationID1 | categorical | 0.0% | 1 | imbalance |
| StratificationCategoryID2 | unknown | 0.0% | — | skipped |
| StratificationID2 | unknown | 0.0% | — | skipped |
Year
numeric timestamp
Year spans 2016 to 2022 with only 7 unique values across 3,592 rows, no nulls, and a symmetric distribution centered on 2019 (mean = median = 2019). Despite being typed numeric, it functions as a low-cardinality temporal category. There are no outliers and no zero values, so the field is clean. Treatment: Treat as an ordinal/categorical year for grouping or one-hot encoding rather than as a continuous numeric feature.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 7
- min
- 2016
- max
- 2022
- mean
- 2019
- median
- 2019
- std
- 2.008
- q1
- 2017
- q3
- 2021
- iqr
- 4
- skew
- 0
- kurtosis
- -1.259
- n_outliers
- 0
- outlier_rate
- 0
- zero_rate
- 0
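The treatment above can be sketched in pandas as an ordered categorical plus optional one-hot columns. This is an illustrative sketch only: the sample years are arbitrary, and the names `YEARS`, `year_cat`, `year_codes` and `year_dummies` are introduced here rather than taken from the profile.

```python
import pandas as pd

# 2016..2022 is the range reported in the profile.
YEARS = list(range(2016, 2023))

# Arbitrary sample rows standing in for the Year column.
years = pd.Series([2016, 2019, 2022, 2017], name="Year")

# An ordered categorical keeps the chronological order without implying
# that the numeric distance between years is a meaningful feature.
year_cat = years.astype(pd.CategoricalDtype(categories=YEARS, ordered=True))
year_codes = year_cat.cat.codes  # 0 for 2016 ... 6 for 2022

# One-hot encoding produces one column per category, including unseen years.
year_dummies = pd.get_dummies(year_cat, prefix="Year")
print(year_dummies.columns.tolist())
```

Declaring the categories explicitly keeps the encoding stable even when a subset of the data lacks some years.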
LocationAbbr
categorical foreign_key
This is a US state/territory abbreviation code (e.g., PA, LA, AR, WY, GU), serving as a geographic key. With 65 unique values across 3,592 rows and a near-uniform distribution (entropy ratio 0.999, top_rate just 0.0156), most codes appear exactly 56 times, suggesting a nearly balanced panel of locations repeated across another dimension. The cardinality of 65 exceeds the 50 states, indicating that territories and possibly national/regional aggregates are included. Treatment: Left-join on this code to enrich with state/territory metadata, or one-hot encode for modelling.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 65
- top_value
- PA
- top_rate
- 0.01559
- cardinality
- 65
- entropy
- 6.017
- entropy_ratio
- 0.9992
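The profile does not state how entropy and entropy_ratio are computed. The reported figures are consistent with Shannon entropy in bits divided by log2 of the cardinality (for example, 8 equally frequent values give 3 bits and a ratio of 1), so the sketch below assumes that definition. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy of the empirical distribution, in bits."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_ratio(values):
    """Entropy divided by its maximum, log2(number of distinct values)."""
    distinct = len(set(values))
    if distinct <= 1:
        return 0.0  # a constant column carries no information
    return entropy_bits(values) / math.log2(distinct)


# 8 categories with 449 rows each, mirroring the Response counts in the profile.
uniform = [code for code in range(8) for _ in range(449)]
constant = ["BRFSS"] * 10

print(entropy_bits(uniform), entropy_ratio(uniform), entropy_ratio(constant))
```

Under this definition the maximum for 65 locations is log2(65), roughly 6.02 bits, which is in line with the reported entropy of 6.017 and a ratio just below 1.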
LocationDesc
categorical feature
LocationDesc is a location name field with 65 distinct values including states, DC, territories like Guam, and aggregates such as HHS Region 6 and HHS Region 8. The distribution is nearly uniform: the entropy_ratio is 0.999 and the most frequent values all tie at 56 occurrences (7 years × 8 responses). It is not perfectly balanced, because 65 × 56 = 3,640 exceeds the 3,592 rows present, so a few locations are missing some years. No nulls and a tidy, closed vocabulary. Treatment: Use as a categorical grouping key; one-hot or target-encode if modelling, and decide whether the regional aggregates should be kept alongside the individual locations.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 65
- top_value
- Pennsylvania
- top_rate
- 0.01559
- cardinality
- 65
- entropy
- 6.017
- entropy_ratio
- 0.9992
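A complete panel would have 7 years × 8 responses = 56 rows for each of the 65 locations, or 3,640 rows, which is 48 more than the 3,592 present (equivalent to 6 missing location-years). A short sketch for finding the incomplete locations, using a hypothetical helper name and a tiny invented panel:

```python
import pandas as pd


def incomplete_locations(df: pd.DataFrame, expected_rows: int) -> pd.Series:
    """Return row counts for locations with fewer rows than expected."""
    counts = df["LocationDesc"].value_counts()
    return counts[counts < expected_rows].sort_index()


# Invented mini-panel: 2 years x 2 responses = 4 expected rows per location.
panel = pd.DataFrame(
    {
        "LocationDesc": ["Alpha"] * 4 + ["Beta"] * 3,
        "Year": [2016, 2016, 2017, 2017, 2016, 2016, 2017],
        "Response": ["Any Disability", "No Disability"] * 3 + ["Any Disability"],
    }
)
short = incomplete_locations(panel, expected_rows=4)
print(short)
```

On the real table the call would use `expected_rows=56`.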
DataSource
categorical metadata imbalance
This column records the dataset's provenance, with every one of the 3,592 rows tagged "BRFSS". Cardinality is 1 and entropy is 0, so it carries no discriminative signal. Treatment: Drop; constant column adds no information.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- BRFSS
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Category
categorical metadata imbalance
This column is a single-valued tag labeling every row as "Disability Estimates" across all 3,592 records. With cardinality of 1, top_rate of 1.0, and entropy of 0.0, it carries no information for modelling or filtering. Treatment: Drop; constant column with no variance.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- Disability Estimates
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Indicator
categorical metadata imbalance
This column holds a single constant string ('Disability status and types among adults 18 years of age or older') across all 3,592 rows, with cardinality 1 and entropy 0. It carries no information for modelling and likely just labels the survey indicator the dataset was filtered to. Treatment: Drop; constant column with zero entropy.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- Disability status and types among adults 18 years of age or older
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Response
categorical label
This column enumerates the disability response category, with 8 distinct values such as 'Cognitive Disability', 'No Disability', and 'Hearing Disability'. The distribution is perfectly uniform: each of the 8 values appears exactly 449 times (top_rate 0.125, entropy_ratio 1.0), indicating the dataset is balanced or pivoted by category rather than sampled organically. There are no nulls. Note that 'Any Disability' and 'No Disability' are summary categories rather than specific disability types, so they should not be pooled with the other six. Treatment: Use as a categorical label; one-hot or factor encode for modelling.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 8
- top_value
- Cognitive Disability
- top_rate
- 0.125
- cardinality
- 8
- entropy
- 3
- entropy_ratio
- 1
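Since the table is laid out with one row per location, year and response, a wide layout with one column per response can make comparisons easier. The sketch below is one possible way to do this in pandas; the location names and values are invented.

```python
import pandas as pd

# Invented long-format rows using the profile's column names.
long_df = pd.DataFrame(
    {
        "LocationDesc": ["Location A"] * 2 + ["Location B"] * 2,
        "Year": [2016] * 4,
        "Response": ["Any Disability", "No Disability"] * 2,
        "Data_Value": [25.0, 75.0, 30.0, 70.0],
    }
)

# One row per (location, year), one column per response type.
wide = (
    long_df.set_index(["LocationDesc", "Year", "Response"])["Data_Value"]
    .unstack("Response")
)
print(wide)
```

`unstack` raises an error if a (location, year, response) combination appears more than once, which doubles as a uniqueness check on the panel.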
Data_Value_Unit
categorical metadata imbalance
This column records the unit of measurement for the data values, and it is constant: every one of the 3,592 rows carries the value "%". With cardinality 1, entropy 0, and top_rate 1.0, it provides no information for modelling or segmentation. Treatment: Drop; constant column carrying no signal.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- %
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Data_Value_Type
categorical metadata imbalance
This column records the type of data value reported, but every one of the 3,592 rows holds the single label "Age-adjusted Prevalence". Cardinality is 1 and entropy is 0, so the field carries no information for modelling or segmentation. It likely exists as a schema placeholder from a wider source where multiple value types are possible. Treatment: Drop; constant column with zero entropy.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- Age-adjusted Prevalence
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Data_Value
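numeric feature outliers
Data_Value is a continuous numeric measurement spanning 1.8 to 81.3 with a median of 9.1 but a mean of 18.25, indicating right skew (skew 1.88, kurtosis 2.09). The distribution flags 450 outliers (12.5% of rows) and the standard deviation (22.16) exceeds the mean. The histogram points to a mixture rather than a long tail: 3,138 values lie between 1.8 and about 43.5, 449 values lie between about 57.5 and 81.3, and nothing falls in between. Because 449 is also the row count of each Response category, the upper cluster most likely belongs to a single response (plausibly "No Disability"), although the profile does not confirm this. Nulls are negligible (5 rows, 0.14%), the same count as the 5 rows footnoted "Data suppressed", and there are no zeros. Only 486 unique values across 3,592 rows hints at rounding or a discrete reporting grid. Treatment: Summarise and model within each Response category first, then check how many flagged outliers remain before log-transforming or winsorizing.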
- n
- 3,592
- nulls
- 5 (0.1%)
- unique
- 486
- min
- 1.8
- max
- 81.3
- mean
- 18.25
- median
- 9.1
- std
- 22.16
- q1
- 5.3
- q3
- 19.95
- iqr
- 14.65
- skew
- 1.876
- kurtosis
- 2.086
- n_outliers
- 450
- outlier_rate
- 0.1255
- zero_rate
- 0
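The outlier count can be related to the quartiles listed above. The profile mentions an IQR fence elsewhere but does not give the multiplier, so the sketch below assumes the conventional 1.5 × IQR (Tukey) rule; the function name is hypothetical.

```python
from typing import Tuple


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> Tuple[float, float]:
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for Data_Value in the profile.
lower, upper = tukey_fences(q1=5.3, q3=19.95)
print(round(lower, 3), round(upper, 3))
```

Under that assumption the upper fence is about 41.9, which falls below the empty stretch of the histogram, so the entire upper cluster of 449 values is flagged. The lower fence is negative, so no low value can be flagged.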
Data_Value_Alt
numeric feature outliers
Data_Value_Alt is a numeric measurement field whose summary statistics are identical to those of Data_Value (range 1.8 to 81.3, median 9.1, mean 18.25, std 22.16, 486 distinct values, 5 nulls), so it is likely an alternate encoding of the same measure. The distribution is right-skewed (skew 1.88, kurtosis 2.09), with 12.5% of rows (450) flagged as outliers. Only 486 distinct values across 3,592 rows suggest a discretised or rounded scale rather than a continuous measure. Treatment: Confirm row-by-row equality with Data_Value and, if it holds, keep only one of the two; otherwise log-transform or winsorise before modelling.
- n
- 3,592
- nulls
- 5 (0.1%)
- unique
- 486
- min
- 1.8
- max
- 81.3
- mean
- 18.25
- median
- 9.1
- std
- 22.16
- q1
- 5.3
- q3
- 19.95
- iqr
- 14.65
- skew
- 1.876
- kurtosis
- 2.086
- n_outliers
- 450
- outlier_rate
- 0.1255
- zero_rate
- 0
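Identical summary statistics do not prove that two columns are identical row by row. A small check along these lines could confirm it before one of them is dropped; the helper name `same_numeric_values` and the toy values are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def same_numeric_values(df: pd.DataFrame, left: str, right: str) -> bool:
    """True if both columns hold the same numbers row by row, with nulls aligned."""
    a = pd.to_numeric(df[left], errors="coerce").to_numpy(dtype=float)
    b = pd.to_numeric(df[right], errors="coerce").to_numpy(dtype=float)
    return bool(np.isclose(a, b, equal_nan=True).all())


# Invented rows; None stands in for a suppressed value.
toy = pd.DataFrame(
    {
        "Data_Value": [9.1, None, 74.9],
        "Data_Value_Alt": [9.1, None, 74.9],
        "Low_Confidence_Limit": [8.2, None, 73.5],
    }
)
print(same_numeric_values(toy, "Data_Value", "Data_Value_Alt"))
print(same_numeric_values(toy, "Data_Value", "Low_Confidence_Limit"))
```

Coercing with `pd.to_numeric` guards against one column having been read as text, which is one plausible reason for an "Alt" column to exist.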
Data_Value_Footnote_Symbol
categorical metadata null_rate imbalance
This appears to be a footnote symbol marker, almost entirely empty with a 99.86% null rate and only 5 non-null entries, all of them the single character '*'. With cardinality of 1 and entropy of 0, the column carries no discriminative information. Treatment: Drop; effectively constant with 99.86% nulls.
- n
- 3,592
- nulls
- 3,587 (99.9%)
- unique
- 1
- top_value
- *
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Data_Value_Footnote
categorical metadata null_rate imbalance
This column is a footnote/annotation field accompanying the Data_Value column, used to flag exceptional rows. It is effectively empty: 99.86% null, with only 5 non-null entries, all carrying the single value "Data suppressed" (cardinality 1, entropy 0). It carries no discriminative information on its own and only marks the handful of suppressed measurements. Treatment: Convert to a boolean is_suppressed flag and drop the original column.
- n
- 3,592
- nulls
- 3,587 (99.9%)
- unique
- 1
- top_value
- Data suppressed
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
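The suggested conversion to an is_suppressed flag might look like the following pandas sketch. The function name is hypothetical, the toy values are invented, and dropping the symbol column as well is an extra step assumed here because it carries the same information.

```python
import pandas as pd


def add_suppression_flag(df: pd.DataFrame) -> pd.DataFrame:
    """Add a boolean is_suppressed column and drop the footnote columns."""
    out = df.copy()
    # Nulls compare as False, so only footnoted rows are flagged.
    out["is_suppressed"] = out["Data_Value_Footnote"].eq("Data suppressed")
    return out.drop(
        columns=["Data_Value_Footnote", "Data_Value_Footnote_Symbol"],
        errors="ignore",
    )


# Invented rows: the middle one mimics a suppressed estimate.
toy = pd.DataFrame(
    {
        "Data_Value": [9.1, None, 74.9],
        "Data_Value_Footnote_Symbol": [None, "*", None],
        "Data_Value_Footnote": [None, "Data suppressed", None],
    }
)
flagged = add_suppression_flag(toy)
print(flagged)
```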
Low_Confidence_Limit
numeric feature outliers
Low_Confidence_Limit is the lower bound of a confidence interval, presumably for the prevalence in Data_Value, ranging from 1.1 to 80.5 with a median of 8.2. The distribution is right-skewed (skew 1.90, kurtosis 2.16) and 12.57% of values (451) are flagged as outliers, a pattern that mirrors Data_Value. Nulls are negligible (0.14%) and there are no zeros. Treatment: Pair with the matching upper limit; if Data_Value is log-transformed before modelling, apply the same transform here.
- n
- 3,592
- nulls
- 5 (0.1%)
- unique
- 489
- min
- 1.1
- max
- 80.5
- mean
- 17.31
- median
- 8.2
- std
- 21.89
- q1
- 4.7
- q3
- 18.7
- iqr
- 14
- skew
- 1.899
- kurtosis
- 2.159
- n_outliers
- 451
- outlier_rate
- 0.1257
- zero_rate
- 0
High_Confidence_Limit
numeric feature outliers
High_Confidence_Limit is a numeric upper confidence bound, ranging from 2.2 to 83.0 with a median of 10.1 but a mean of 19.26, indicating a long right tail. The distribution is right-skewed (skew 1.85, kurtosis 2.01) and 12.5% of values (449 rows) are flagged as outliers. With 503 unique values across 3,592 rows and only 0.14% nulls, it behaves as a continuous measurement. Treatment: Pair with Low_Confidence_Limit; log-transform before modelling to compress the right tail and dampen the 12.5% outlier mass.
- n
- 3,592
- nulls
- 5 (0.1%)
- unique
- 503
- min
- 2.2
- max
- 83
- mean
- 19.26
- median
- 10.1
- std
- 22.4
- q1
- 6
- q3
- 21.5
- iqr
- 15.5
- skew
- 1.851
- kurtosis
- 2.011
- n_outliers
- 449
- outlier_rate
- 0.1252
- zero_rate
- 0
Number
numeric feature high_skew outliers
Number is almost certainly a count or quantity metric rather than an identifier, given 2,267 unique values across 3,592 rows. Its null rate is negligible (0.0014, or 5 rows). The distribution is severely right-skewed (skew 14.57, kurtosis 256.99): the median is 978 while the mean is 3,780 and the max reaches 327,817, with 385 outliers (10.7%) flagged. The interquartile range (467 to 2,750, an IQR of 2,283) is tiny relative to the max, so a handful of extreme values dominate the variance (std 15,294). Treatment: Log-transform (or winsorize) before any distance- or variance-based modelling.
- n
- 3,592
- nulls
- 5 (0.1%)
- unique
- 2,267
- min
- 31
- max
- 327,817
- mean
- 3,780
- median
- 978
- std
- 1.529e+04
- q1
- 467
- q3
- 2,750
- iqr
- 2,283
- skew
- 14.57
- kurtosis
- 257
- n_outliers
- 385
- outlier_rate
- 0.1073
- zero_rate
- 0
WeightedNumber
numeric feature high_skew outliers
WeightedNumber is a numeric measure with 3,580 distinct values across 3,592 rows, ranging from 1,641 to 181,223,676 with a median of 418,252 but a mean of 2,103,449. The distribution is severely right-skewed (skew 14.65, kurtosis 262.16) and 444 rows (12.4%) fall outside the IQR fence, suggesting a long tail of very large values, plausibly from the largest locations and regional aggregates, dominating the mean. Treatment: Log-transform before modelling to tame the heavy right tail.
- n
- 3,592
- nulls
- 5 (0.1%)
- unique
- 3,580
- min
- 1,641
- max
- 1.812e+08
- mean
- 2.103e+06
- median
- 418,252
- std
- 9.082e+06
- q1
- 149,677
- q3
- 1.303e+06
- iqr
- 1.153e+06
- skew
- 14.65
- kurtosis
- 262.2
- n_outliers
- 444
- outlier_rate
- 0.1238
- zero_rate
- 0
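A log transform is straightforward here because the column has no zeros (minimum 1,641). The sketch below applies a base-10 log to the minimum, median and maximum reported in the profile, plus one null; the choice of base 10 is an assumption made for readability.

```python
import numpy as np
import pandas as pd

# Minimum, median and maximum from the profile, plus one null for a suppressed row.
weighted = pd.Series(
    [1641, 418252, 181223676, None], name="WeightedNumber", dtype="float64"
)

# Nulls stay null; all other values are strictly positive, so log10 is defined.
log_weighted = np.log10(weighted)
print(log_weighted.round(2).tolist())
```

On the log scale the values span roughly five orders of magnitude, from about 3.2 to about 8.3, instead of being compressed into the first histogram bin.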
StratificationCategory1
categorical metadata imbalance
This column is a stratification dimension label, but every one of the 3,592 rows holds the single value "Overall" (top_rate 1.0, cardinality 1, entropy 0.0). It carries no information and likely indicates this slice of the source dataset was filtered to the un-stratified aggregate. Treatment: Drop; constant column with zero entropy.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- Overall
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
Stratification1
categorical metadata imbalance
This column is a stratification label that takes the single value "Overall" across all 3,592 rows. With cardinality 1 and entropy 0, it carries no information and cannot differentiate records. It likely indicates that this slice of the source data was not broken out by any subgroup. Treatment: Drop; constant column with a single value.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- Overall
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
StratificationCategory2
unknown metadata skipped
This column was skipped by the profiler, so no value-level statistics are available beyond a row count of 3,592 and a null rate of 0.0. The name suggests a secondary stratification dimension used alongside a primary category, typical of public health or survey datasets. Without unique counts or value distributions, its content cannot be characterised further. Treatment: Re-profile with the skip removed to inspect cardinality before deciding on encoding.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- —
Stratification2
unknown other skipped
Saturn skipped detailed profiling for Stratification2, so only the row count (3,592) and a 0.0 null rate are known. With no unique count, type, or value distribution available, the column's content cannot be characterised from this evidence alone. The name suggests a secondary stratification key paired with a primary Stratification1 field, but that is not confirmed by the stats. Treatment: Re-profile or inspect raw values before deciding; do not use until kind and cardinality are established.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- —
CategoryID
categorical metadata imbalance
CategoryID is a categorical column that carries no information: every one of the 3,592 rows holds the single value "DISEST", giving cardinality 1 and entropy 0. It likely encodes a fixed dataset-level tag or filter rather than a per-row attribute. Treatment: Drop; constant column with zero variance.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- DISEST
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
IndicatorID
categorical metadata imbalance
IndicatorID is a categorical column that holds the single value "STATTYPE" across all 3,592 rows, with zero nulls and cardinality of 1. Entropy is 0.0, so the field carries no information and likely functions as a constant tag identifying the indicator type for this slice of the dataset. Treatment: Drop before modelling; constant column with no variance.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- STATTYPE
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
LocationID
numeric foreign_key
LocationID is almost certainly a categorical location key encoded as integers, with 65 distinct values across 3,592 rows and no nulls. Values range from 1 to 89 with a median of 36 and mild positive skew (0.50), consistent with an ID lookup rather than a measured quantity. Treating it as numeric would be misleading despite its int dtype. Treatment: Cast to categorical and left-join to a location lookup table rather than using as a numeric feature.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 65
- min
- 1
- max
- 89
- mean
- 39.69
- median
- 36
- std
- 25.34
- q1
- 20
- q3
- 54
- iqr
- 34
- skew
- 0.5048
- kurtosis
- -0.7622
- n_outliers
- 0
- outlier_rate
- 0
- zero_rate
- 0
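The cast-and-join treatment could be sketched as follows. The lookup table, its `location_type` column, the helper name and all values are hypothetical; the profile does not say what reference table LocationID should be joined to.

```python
import pandas as pd


def attach_location_metadata(facts: pd.DataFrame, lookup: pd.DataFrame) -> pd.DataFrame:
    """Left-join facts to a location lookup, treating LocationID as a label."""
    left = facts.assign(LocationID=facts["LocationID"].astype(str))
    right = lookup.assign(LocationID=lookup["LocationID"].astype(str))
    # validate guards against duplicate IDs in the lookup silently multiplying rows.
    merged = left.merge(right, on="LocationID", how="left", validate="many_to_one")
    merged["LocationID"] = merged["LocationID"].astype("category")
    return merged


# Invented fact rows and a hypothetical, deliberately incomplete lookup.
facts = pd.DataFrame({"LocationID": [1, 2, 1, 89], "Data_Value": [9.1, 8.4, 9.3, 10.2]})
lookup = pd.DataFrame({"LocationID": [1, 2], "location_type": ["type A", "type B"]})

merged = attach_location_metadata(facts, lookup)
print(merged)
```

A left join keeps every fact row, so IDs missing from the lookup show up as nulls in the joined columns rather than disappearing.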
ResponseID
categorical feature
ResponseID holds 8 distinct codes (Q6COG, Q6DIS2, Q6MOB, Q6IND, Q6DIS1, Q6VIS, Q6SEL, Q6HEAR), each appearing exactly 449 times across 3,592 rows with no nulls. The perfectly uniform distribution and entropy ratio of 1.0 indicate this is a question/disability-domain identifier repeated for every location and year rather than a unique response key. Its cardinality and counts match those of Response, so it appears to be the coded counterpart of that column. Despite the name, it behaves as a categorical factor, not an identifier. Treatment: Treat as a categorical factor (one-hot or group-by key); do not use as a unique row id.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 8
- top_value
- Q6COG
- top_rate
- 0.125
- cardinality
- 8
- entropy
- 3
- entropy_ratio
- 1
DataValueTypeID
categorical metadata imbalance
DataValueTypeID is a categorical metadata field indicating the type of statistical measure reported, but every one of the 3,592 rows carries the single value 'AGEADJPREV' (age-adjusted prevalence). Cardinality is 1 and entropy is 0, so the column carries no information for modelling or filtering. Treatment: Drop; constant column with no variance.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- AGEADJPREV
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
StratificationCategoryID1
categorical metadata imbalance
This column holds a single constant value "CAT1" across all 3,592 rows, with zero nulls and cardinality of 1. Entropy is 0, meaning it carries no information for any downstream task. The name suggests it was meant to identify a stratification category, but only one category is represented in this slice. Treatment: Drop; constant column with no variance.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- CAT1
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
StratificationID1
categorical metadata imbalance
This column holds a single constant value 'BO1' across all 3,592 rows, with cardinality 1 and entropy 0. As a 'StratificationID1' it likely encodes a stratification dimension (e.g., overall/total) that was never varied in this slice. It carries no information for modelling or grouping. Treatment: Drop; constant column with zero entropy.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- 1
- top_value
- BO1
- top_rate
- 1
- cardinality
- 1
- entropy
- 0
- entropy_ratio
- 0
StratificationCategoryID2
unknown other skipped
This column is named StratificationCategoryID2, suggesting it holds a secondary stratification category identifier in a public-health style dataset. Saturn skipped profiling, so no uniqueness, value, or distribution stats are available beyond a row count of 3,592 and a null rate of 0.0. Without further signals, its actual content and cardinality cannot be characterised here. Treatment: Re-profile with type coercion to confirm whether this is a categorical key before use.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- —
StratificationID2
unknown foreign_key skipped
StratificationID2 was skipped by the profiler, so its kind, uniqueness, and value distribution are unknown. The only confirmed signals are that it has 3,592 rows and a null rate of 0.0. The name suggests a secondary stratification key (e.g., demographic subgroup) commonly paired with a StratificationCategoryID2 in CDC-style indicator tables. Treatment: Re-profile the column to determine cardinality, then treat as a categorical join key against its stratification lookup.
- n
- 3,592
- nulls
- 0 (0.0%)
- unique
- —