accessibility atlas vizwiz val annotations

source /home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv · 4,319 rows · 5 columns · profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This dataset is the VizWiz validation annotations file, with 4,319 rows and 5 columns: an image filename, a question, a set of crowd answers, an answer_type label, and a binary answerable flag. The questions are dominated by a small number of generic openers: 'What is this?' alone accounts for 523 rows (12.1%), and 35% of question rows repeat an earlier question, so the question text alone says little about how varied the images are. Answer_type is heavily skewed: 'other' covers 62.3% of rows (2,691) and 'unanswerable' another 32.1% (1,385), while 'yes/no' (4.5%) and 'number' (1.1%) are rare. Consistent with that, the answerable flag has a mean of 0.68, meaning roughly 32% of items are flagged unanswerable, a notable share to inspect before modeling. The answers column is a serialized list of dicts (long strings averaging ~560 characters) and will need parsing rather than direct text analysis.

citing: row_count · column_count · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.answers.stats.len_mean
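
The row and column counts cited above can be re-checked directly from the CSV. The following is a minimal sketch using only the Python standard library, assuming the file has a header row; `csv_shape` is a hypothetical helper name, and the in-memory sample holds made-up values under the five column names listed in this report.

```python
import csv
import io

def csv_shape(handle):
    """Return (row_count, column_names) for a CSV that starts with a header row."""
    reader = csv.DictReader(handle)
    row_count = sum(1 for _ in reader)  # counts data rows, not the header
    return row_count, list(reader.fieldnames or [])

# Tiny stand-in for the real file; the values are illustrative only.
sample = io.StringIO(
    "image,question,answers,answer_type,answerable\n"
    "a.jpg,What is this?,[],other,1\n"
    "b.jpg,What is this?,[],unanswerable,0\n"
)
n_rows, columns = csv_shape(sample)
```

For the real file, the same function could be called on `open(path, newline="", encoding="utf-8")`; the encoding is an assumption, since the report does not state it. The expected result would be 4,319 rows and 5 columns.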

Charts the summary said to look at first

answer_type · Shows how 'other' dominates and 'unanswerable' is the clear second category.

Top values for answer_type (4 unique shown, of 4 total).
value	count	share
other	2691	62.3%
unanswerable	1385	32.1%
yes/no	195	4.5%
number	48	1.1%

answerable · Visualizes the ~68/32 split between answerable and unanswerable items.

Histogram bins for answerable (median: 1.0).
bin	count
0 – 0.025	1385
0.975 – 1	2934
All 38 intermediate bins (0.025 – 0.975) have a count of 0.
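
Both tables above report a count of 1385 (answer_type 'unanswerable' and answerable = 0). That suggests, but does not prove, that the two columns mark the same rows. The following is a minimal sketch of a row-level check, assuming rows are dicts of strings as `csv.DictReader` would produce; `mismatched_rows` is a hypothetical helper and the sample rows are made up.

```python
def mismatched_rows(rows):
    """Return rows where answerable == 0 disagrees with answer_type == 'unanswerable'."""
    bad = []
    for row in rows:
        # float() first so that both "0" and "0.0" are accepted.
        flagged_unanswerable = int(float(row["answerable"])) == 0
        typed_unanswerable = row["answer_type"] == "unanswerable"
        if flagged_unanswerable != typed_unanswerable:
            bad.append(row)
    return bad

rows = [
    {"image": "a.jpg", "answer_type": "other", "answerable": "1"},
    {"image": "b.jpg", "answer_type": "unanswerable", "answerable": "0"},
    {"image": "c.jpg", "answer_type": "yes/no", "answerable": "0"},  # deliberately inconsistent
]
conflicts = mismatched_rows(rows)
```

An empty result on the real data would mean the flag is fully determined by answer_type, in which case only one of the two columns carries independent information.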

question · Highlights the most repeated prompts, led by 'What is this?' at 523 occurrences.

question · Reveals that most questions are short (median 26 chars) with a long tail up to 264.

Character-length distribution for question (mean: 35.10).
chars	count
7 – 13	759
13 – 20	550
20 – 26	931
26 – 33	609
33 – 39	368
39 – 46	250
46 – 52	143
52 – 58	143
58 – 65	96
65 – 71	84
71 – 78	68
78 – 84	42
84 – 91	35
91 – 97	35
97 – 103	30
103 – 110	23
110 – 116	19
116 – 123	12
123 – 129	23
129 – 136	14
136 – 142	8
142 – 148	14
148 – 155	8
155 – 161	6
161 – 168	10
168 – 174	8
174 – 180	8
180 – 187	3
187 – 193	5
193 – 200	3
200 – 206	3
206 – 213	2
213 – 219	1
219 – 225	2
225 – 232	1
232 – 238	1
238 – 245	0
245 – 251	1
251 – 258	0
258 – 264	1

answers · Confirms answers are long serialized structures (~560 chars) needing parsing before use.

Character-length distribution for answers (mean: 559.68).
chars	count
450 – 462	14
462 – 474	71
474 – 486	126
486 – 498	175
498 – 510	228
510 – 522	279
522 – 535	464
535 – 547	598
547 – 559	585
559 – 571	369
571 – 583	330
583 – 595	282
595 – 607	212
607 – 619	133
619 – 631	91
631 – 643	72
643 – 655	54
655 – 667	44
667 – 679	38
679 – 692	24
692 – 704	28
704 – 716	18
716 – 728	18
728 – 740	6
740 – 752	10
752 – 764	8
764 – 776	10
776 – 788	3
788 – 800	7
800 – 812	4
812 – 824	5
824 – 836	2
836 – 848	2
848 – 861	1
861 – 873	2
873 – 885	1
885 – 897	1
897 – 909	0
909 – 921	2
921 – 933	2

Schema

5 columns

Per-column summary.
column	type	nulls	unique	Alerts
image	text	0.0%	4,319	near_unique one_word
question	text	0.0%	2,798	multilingual duplicates
answers	text	0.0%	4,295	near_unique
answer_type	categorical	0.0%	4
answerable	numeric	0.0%	2

image

text identifier near_unique one_word

This column holds image filenames following a fixed `vizwiz_val_########.jpg` pattern, with all 4319 values unique and exactly 23 characters long. It functions as a per-row image identifier rather than analysable text — vocab_size equals n, one_word_rate is 1.0, and there are no duplicates or nulls. The negative Flesch score (-47.98) is an artifact of scoring filenames as prose and should be ignored. Treatment: Use as a key to join image features or load pixel data; do not treat as text. high · anthropic:claude-opus-4-7

n: 4,319
nulls: 0 (0.0%)
unique: 4,319
len_min: 23
len_max: 23
len_mean: 23
len_median: 23
len_p95: 23
word_mean: 1
word_median: 1
n_empty: 0
n_duplicates: 0
duplicate_rate: 0
vocab_size: 4,319
readability_flesch_mean: -47.98
emoji_rate: 0
url_rate: 0
one_word_rate: 1
allcaps_rate: 0
boilerplate_rate: 0
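
The fixed filename pattern and the uniqueness claim above can be verified before the column is used as a join key. The following is a minimal sketch, assuming each `#` in the reported pattern stands for one decimal digit; `FILENAME_RE` and `invalid_filenames` are hypothetical names and the sample filenames are made up.

```python
import re

# Assumed reading of `vizwiz_val_########.jpg`: fixed prefix, 8 digits, .jpg suffix.
FILENAME_RE = re.compile(r"vizwiz_val_[0-9]{8}\.jpg")

def invalid_filenames(names):
    """Return names that break the pattern, plus any name seen more than once."""
    seen = set()
    problems = []
    for name in names:
        if not FILENAME_RE.fullmatch(name) or name in seen:
            problems.append(name)
        seen.add(name)
    return problems

names = [
    "vizwiz_val_00000001.jpg",
    "vizwiz_val_00000002.jpg",
    "vizwiz_val_2.jpg",         # too few digits
    "vizwiz_val_00000001.jpg",  # repeat of the first entry
]
problems = invalid_filenames(names)
```

Given the statistics above (4,319 unique values, all 23 characters long), an empty result would be expected on the real column.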

question

text free_text multilingual duplicates

Short natural-language questions, predominantly English (4308/4319; the remaining 11 rows triggered other-language detections), averaging 7.26 words and 35.1 characters. The column is heavily repetitive: 1521 rows (35.2%) duplicate an earlier question, and the single string "What is this?" alone accounts for 523 of 4319 rows. Vocabulary is small (2779 unique tokens) and dominated by short function words such as "what", "is", and "this", consistent with VQA-style prompts directed at images or objects. Treatment: Tokenize and embed as a text feature; expect heavy duplication, so pair each question with its image rather than treating the question as a standalone signal. high · anthropic:claude-opus-4-7

n: 4,319
nulls: 0 (0.0%)
unique: 2,798
len_min: 7
len_max: 264
len_mean: 35.1
len_median: 26
len_p95: 95
word_mean: 7.259
word_median: 5
n_empty: 0
n_duplicates: 1,521
duplicate_rate: 0.3522
vocab_size: 2,779
readability_flesch_mean: 101.7
emoji_rate: 0
url_rate: 0
one_word_rate: 0
allcaps_rate: 0.002547
boilerplate_rate: 0.003473
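
The duplicate figures above are consistent with defining n_duplicates as n minus the number of unique values (4319 - 2798 = 1521). The following is a minimal sketch under that assumption; `duplicate_stats` is a hypothetical helper, the sample questions are illustrative, and the profiler's actual definition is not stated in this report.

```python
from collections import Counter

def duplicate_stats(values):
    """Summarize repetition, counting every occurrence after the first as a duplicate."""
    counts = Counter(values)
    n = len(values)
    n_duplicates = n - len(counts)
    return {
        "n": n,
        "unique": len(counts),
        "n_duplicates": n_duplicates,
        "duplicate_rate": n_duplicates / n if n else 0.0,
        "top": counts.most_common(1)[0] if counts else None,
    }

questions = ["What is this?", "What is this?", "What color is this shirt?", "What is this?"]
stats = duplicate_stats(questions)

# The same arithmetic applied to the figures reported for this column.
reported_rate = (4319 - 2798) / 4319
```

Exact string matching is used here; normalizing case or trailing punctuation first would merge more variants and raise the rate.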

answers

text free_text near_unique

This column holds serialized lists of answer dictionaries, each with 'answer' and 'answer_confidence' keys (observed values include 'yes', 'maybe', 'unanswerable', and 'unsuitable'), not raw natural-language text. Lengths are tightly bounded (min 450, max 933, median 550 chars), and 4295 of 4319 values are unique; the 24 duplicates are few enough that the column is flagged near_unique. The strongly negative Flesch score (-56.5) is consistent with structured, JSON-like content rather than prose. Treatment: Parse the literal dict/list structure and explode answer and answer_confidence into separate fields before modelling. high · anthropic:claude-opus-4-7

n: 4,319
nulls: 0 (0.0%)
unique: 4,295
len_min: 450
len_max: 933
len_mean: 559.7
len_median: 550
len_p95: 660.1
word_mean: 47.66
word_median: 45
n_empty: 0
n_duplicates: 24
duplicate_rate: 0.005557
vocab_size: 11,308
readability_flesch_mean: -56.5
emoji_rate: 0
url_rate: 0
one_word_rate: 0
allcaps_rate: 0
boilerplate_rate: 0
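
The parsing step recommended above can be sketched as follows. This assumes each cell is a Python-literal list of dicts (single-quoted strings), which `ast.literal_eval` can read safely; if the cells turn out to be strict JSON, `json.loads` would be the better fit. `explode_answers` is a hypothetical helper and the sample cell is made up, much shorter than the real ones.

```python
import ast

def explode_answers(serialized):
    """Parse one serialized answers cell into (answer, answer_confidence) pairs."""
    parsed = ast.literal_eval(serialized)  # evaluates literals only, never arbitrary code
    return [(item.get("answer"), item.get("answer_confidence")) for item in parsed]

# Illustrative cell using the two keys named in the column description.
cell = (
    "[{'answer': 'unanswerable', 'answer_confidence': 'yes'}, "
    "{'answer': 'a can of soup', 'answer_confidence': 'maybe'}]"
)
pairs = explode_answers(cell)
```

Using `.get` means a dict that lacks one of the keys yields `None` for that field instead of raising, which makes malformed rows easy to find afterwards.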

answer_type

categorical label

Categorical label tagging the type of answer for each row, with only 4 distinct values and no nulls across 4319 rows. The distribution is heavily skewed: 'other' covers 62.3% (2691) and 'unanswerable' another 32.1% (1385), while 'yes/no' (195, 4.5%) and 'number' (48, 1.1%) are rare. The entropy of 1.225 bits, against a maximum of 2 bits for four classes, gives an entropy ratio of 0.61 and confirms the imbalance, which will matter for any stratification or class-weighted modelling. Treatment: One-hot encode and apply class weighting or stratified sampling to handle the imbalance. high · anthropic:claude-opus-4-7

n: 4,319
nulls: 0 (0.0%)
unique: 4
top_value: other
top_rate: 0.6231
cardinality: 4
entropy: 1.225
entropy_ratio: 0.6127
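
The entropy figures above can be reproduced from the four class counts. The sketch below assumes entropy is measured in bits (base-2 logarithm), which matches the reported ratio because the maximum for four classes is 2 bits. The inverse-frequency class weights at the end are one common heuristic for the class weighting mentioned above, not something the report prescribes.

```python
import math

# Class counts taken from the answer_type top-values table.
counts = {"other": 2691, "unanswerable": 1385, "yes/no": 195, "number": 48}
total = sum(counts.values())

# Shannon entropy in bits, and its ratio to the maximum log2(number of classes).
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
entropy_ratio = entropy / math.log2(len(counts))

# Inverse-frequency weights: total / (n_classes * class_count).
class_weights = {label: total / (len(counts) * c) for label, c in counts.items()}
```

With these weights the rare 'number' class receives a weight of about 22.5 and the dominant 'other' class about 0.4, so each class contributes roughly equally to a weighted loss.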

answerable

numeric label

Binary 0/1 flag, almost certainly indicating whether a question is answerable. About 67.9% of rows are 1 and 32.1% are 0, with no nulls across 4319 rows. The class imbalance is moderate but worth noting for any classifier trained on this label. Treatment: Use directly as a binary target; account for the ~68/32 class imbalance during training. high · anthropic:claude-opus-4-7

n: 4,319
nulls: 0 (0.0%)
unique: 2
min: 0
max: 1
mean: 0.6793
median: 1
std: 0.4668
q1: 0
q3: 1
iqr: 1
skew: -0.7684
kurtosis: -1.41
n_outliers: 0
outlier_rate: 0
zero_rate: 0.3207
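
For a 0/1 column, every statistic listed above follows from the share of ones. The following is a minimal sketch using the closed-form Bernoulli expressions and the two counts from the histogram (2934 ones out of 4319). The profiler may use bias-corrected estimators for skew and kurtosis, so small differences in the last digit are possible.

```python
import math

n, ones = 4319, 2934
p = ones / n  # share of rows with answerable == 1
q = 1 - p     # share of rows with answerable == 0 (the zero_rate)

mean = p
std_sample = math.sqrt(p * q * n / (n - 1))  # sample standard deviation (n - 1 denominator)
skew = (q - p) / math.sqrt(p * q)            # negative because ones are the majority
excess_kurtosis = (1 - 6 * p * q) / (p * q)  # excess kurtosis of a Bernoulli variable
```

The derived values agree with the reported mean (0.6793), std (0.4668), skew (-0.7684), and kurtosis (-1.41) to the precision shown, so those rows add no information beyond the 68/32 split.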