accessibility atlas vizwiz val annotations
Reading
This dataset is the VizWiz validation annotations file, with 4,319 rows and 5 columns: an image filename, a question, a set of crowd answers, an answer_type label, and a binary answerable flag. The questions are dominated by a small number of generic openers: 'What is this?' alone accounts for 523 rows (12.1%), and 35.2% of rows repeat a question that already appears elsewhere, so the same prompt is often attached to different images. Answer_type is heavily skewed: 'other' covers 62.3% of rows (2,691) and 'unanswerable' another 32.1% (1,385), while 'yes/no' (4.5%) and 'number' (1.1%) are rare. Consistent with that, the answerable flag has a mean of 0.68, meaning roughly 32% of items are flagged unanswerable, a notable share to inspect before modeling. The answers column is a serialized list of dicts (long strings averaging ~560 characters) and will need parsing rather than direct text analysis.
citing: row_count · column_count · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.answers.stats.len_mean
Charts the summary said to look at first
answer_type: value counts
| value | count | share |
|---|---|---|
| other | 2691 | 62.3% |
| unanswerable | 1385 | 32.1% |
| yes/no | 195 | 4.5% |
| number | 48 | 1.1% |
answerable: histogram of values
| bin | count |
|---|---|
| 0 – 0.025 | 1385 |
| 0.025 – 0.975 (38 empty bins) | 0 |
| 0.975 – 1 | 2934 |
question: length in characters
| chars | count |
|---|---|
| 7 – 13 | 759 |
| 13 – 20 | 550 |
| 20 – 26 | 931 |
| 26 – 33 | 609 |
| 33 – 39 | 368 |
| 39 – 46 | 250 |
| 46 – 52 | 143 |
| 52 – 58 | 143 |
| 58 – 65 | 96 |
| 65 – 71 | 84 |
| 71 – 78 | 68 |
| 78 – 84 | 42 |
| 84 – 91 | 35 |
| 91 – 97 | 35 |
| 97 – 103 | 30 |
| 103 – 110 | 23 |
| 110 – 116 | 19 |
| 116 – 123 | 12 |
| 123 – 129 | 23 |
| 129 – 136 | 14 |
| 136 – 142 | 8 |
| 142 – 148 | 14 |
| 148 – 155 | 8 |
| 155 – 161 | 6 |
| 161 – 168 | 10 |
| 168 – 174 | 8 |
| 174 – 180 | 8 |
| 180 – 187 | 3 |
| 187 – 193 | 5 |
| 193 – 200 | 3 |
| 200 – 206 | 3 |
| 206 – 213 | 2 |
| 213 – 219 | 1 |
| 219 – 225 | 2 |
| 225 – 232 | 1 |
| 232 – 238 | 1 |
| 238 – 245 | 0 |
| 245 – 251 | 1 |
| 251 – 258 | 0 |
| 258 – 264 | 1 |
answers: length in characters
| chars | count |
|---|---|
| 450 – 462 | 14 |
| 462 – 474 | 71 |
| 474 – 486 | 126 |
| 486 – 498 | 175 |
| 498 – 510 | 228 |
| 510 – 522 | 279 |
| 522 – 535 | 464 |
| 535 – 547 | 598 |
| 547 – 559 | 585 |
| 559 – 571 | 369 |
| 571 – 583 | 330 |
| 583 – 595 | 282 |
| 595 – 607 | 212 |
| 607 – 619 | 133 |
| 619 – 631 | 91 |
| 631 – 643 | 72 |
| 643 – 655 | 54 |
| 655 – 667 | 44 |
| 667 – 679 | 38 |
| 679 – 692 | 24 |
| 692 – 704 | 28 |
| 704 – 716 | 18 |
| 716 – 728 | 18 |
| 728 – 740 | 6 |
| 740 – 752 | 10 |
| 752 – 764 | 8 |
| 764 – 776 | 10 |
| 776 – 788 | 3 |
| 788 – 800 | 7 |
| 800 – 812 | 4 |
| 812 – 824 | 5 |
| 824 – 836 | 2 |
| 836 – 848 | 2 |
| 848 – 861 | 1 |
| 861 – 873 | 2 |
| 873 – 885 | 1 |
| 885 – 897 | 1 |
| 897 – 909 | 0 |
| 909 – 921 | 2 |
| 921 – 933 | 2 |
Schema
5 columns
| Column | Type | Nulls | Unique | Alerts |
|---|---|---|---|---|
| image | text | 0.0% | 4,319 | near_unique, one_word |
| question | text | 0.0% | 2,798 | multilingual, duplicates |
| answers | text | 0.0% | 4,295 | near_unique |
| answer_type | categorical | 0.0% | 4 | |
| answerable | numeric | 0.0% | 2 | |
image
text · identifier · near_unique · one_word
This column holds image filenames following a fixed `vizwiz_val_########.jpg` pattern, with all 4,319 values unique and exactly 23 characters long. It functions as a per-row image identifier rather than analysable text: vocab_size equals n, one_word_rate is 1.0, and there are no duplicates or nulls. The negative Flesch score (-47.98) is an artifact of scoring filenames as prose and should be ignored.
Treatment: Use as a key to join image features or load pixel data; do not treat as text.
- n
- 4,319
- nulls
- 0 (0.0%)
- unique
- 4,319
- len_min
- 23
- len_max
- 23
- len_mean
- 23
- len_median
- 23
- len_p95
- 23
- word_mean
- 1
- word_median
- 1
- n_empty
- 0
- n_duplicates
- 0
- duplicate_rate
- 0
- vocab_size
- 4,319
- readability_flesch_mean
- -47.98
- emoji_rate
- 0
- url_rate
- 0
- one_word_rate
- 1
- allcaps_rate
- 0
- boilerplate_rate
- 0
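Since the image column serves as a join key, it can help to validate the filename pattern before joining. The following is a minimal sketch, assuming the eight `#` placeholders stand for digits and that the prefix may be capitalised differently in the actual file; `image_key` and the sample filename are hypothetical names introduced for illustration.

```python
import re

# Assumption: filenames look like "VizWiz_val_00000042.jpg" (8 digits, 23 chars).
# Matching is case-insensitive because the report shows the prefix in lowercase.
FILENAME_RE = re.compile(r"vizwiz_val_(\d{8})\.jpg", re.IGNORECASE)


def image_key(filename: str) -> int:
    """Return the numeric id embedded in a filename, or raise ValueError."""
    match = FILENAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected image filename: {filename!r}")
    return int(match.group(1))


sample = "VizWiz_val_00000042.jpg"  # hypothetical example value
print(len(sample), image_key(sample))
```

Raising on a mismatch, rather than silently skipping the row, keeps a malformed filename from turning into a missing join later.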
question
text · free_text · multilingual · duplicates
Short natural-language questions, predominantly English (4,308 of 4,319) with a handful of other-language detections, averaging 7.26 words and 35.1 characters. The column is heavily repetitive: the duplicate rate is 35.2% (1,521 rows), and the single string "What is this?" alone accounts for 523 of 4,319 rows. Vocabulary is small (2,779 unique tokens) and dominated by short question and function words such as "what", "is", and "this", consistent with VQA-style prompts directed at images or objects.
Treatment: Tokenize and embed as a text feature; expect heavy duplication, so consider pairing each question with its image rather than treating it as a standalone signal.
- n
- 4,319
- nulls
- 0 (0.0%)
- unique
- 2,798
- len_min
- 7
- len_max
- 264
- len_mean
- 35.1
- len_median
- 26
- len_p95
- 95
- word_mean
- 7.259
- word_median
- 5
- n_empty
- 0
- n_duplicates
- 1,521
- duplicate_rate
- 0.3522
- vocab_size
- 2,779
- readability_flesch_mean
- 101.7
- emoji_rate
- 0
- url_rate
- 0
- one_word_rate
- 0
- allcaps_rate
- 0.002547
- boilerplate_rate
- 0.003473
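The duplicate figures above are consistent with counting every row beyond the first occurrence of a value, that is, n minus unique. The sketch below assumes that definition (the report does not state it explicitly) and uses invented toy questions; `duplicate_stats` is a hypothetical helper.

```python
from collections import Counter


def duplicate_stats(values):
    """Return (n_duplicates, duplicate_rate, most_common_value, its_count)."""
    counts = Counter(values)
    n = len(values)
    # Assumed definition: rows beyond the first occurrence of each distinct value.
    n_duplicates = n - len(counts)
    top_value, top_count = counts.most_common(1)[0]
    return n_duplicates, n_duplicates / n, top_value, top_count


# Toy questions, invented for illustration only.
questions = [
    "What is this?",
    "What is this?",
    "What color is this shirt?",
    "What is this?",
]
print(duplicate_stats(questions))

# Check the assumed definition against the reported figures:
# 4,319 rows and 2,798 unique values.
reported_duplicates = 4319 - 2798
reported_rate = round(reported_duplicates / 4319, 4)
print(reported_duplicates, reported_rate)
```

With the reported n and unique counts, this definition reproduces both n_duplicates (1,521) and duplicate_rate (0.3522).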
answers
text · free_text · near_unique
This column holds serialized lists of answer dictionaries (each containing 'answer' and 'answer_confidence' keys, with values such as 'yes', 'maybe', 'unanswerable', 'unsuitable'), not raw natural-language text. Lengths are tightly bounded (min 450, max 933, median 550 chars) and 4,295 of 4,319 rows are unique, leaving only 24 duplicates, which is why the column is flagged near_unique. The strongly negative Flesch score (-56.5) is consistent with structured, JSON-like content rather than prose.
Treatment: Parse the literal dict/list structure and explode answer and answer_confidence into separate fields before modelling.
- n
- 4,319
- nulls
- 0 (0.0%)
- unique
- 4,295
- len_min
- 450
- len_max
- 933
- len_mean
- 559.7
- len_median
- 550
- len_p95
- 660.1
- word_mean
- 47.66
- word_median
- 45
- n_empty
- 0
- n_duplicates
- 24
- duplicate_rate
- 0.005557
- vocab_size
- 11,308
- readability_flesch_mean
- -56.5
- emoji_rate
- 0
- url_rate
- 0
- one_word_rate
- 0
- allcaps_rate
- 0
- boilerplate_rate
- 0
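The parsing step recommended for the answers column could look roughly like the sketch below. It assumes each cell is a Python-literal list of dicts with single-quoted keys (if the file turns out to hold strict JSON, `json.loads` would be the equivalent call). The sample cell is invented and much shorter than real cells, and `parse_answers` and `majority_answer` are hypothetical helpers.

```python
import ast
from collections import Counter


def parse_answers(raw: str):
    """Parse one serialized answers cell into (answer, confidence) pairs."""
    # literal_eval only evaluates Python literals, so no code is executed.
    records = ast.literal_eval(raw)
    return [(r["answer"], r["answer_confidence"]) for r in records]


def majority_answer(raw: str) -> str:
    """Return the most frequent answer string among the crowd answers."""
    answers = [answer for answer, _ in parse_answers(raw)]
    return Counter(answers).most_common(1)[0][0]


# Hypothetical cell, shortened; real cells are 450 to 933 characters long.
cell = (
    "[{'answer': 'unanswerable', 'answer_confidence': 'yes'}, "
    "{'answer': 'unanswerable', 'answer_confidence': 'maybe'}, "
    "{'answer': 'unsuitable', 'answer_confidence': 'yes'}]"
)
print(parse_answers(cell))
print(majority_answer(cell))
```

A majority vote is only one way to collapse the crowd answers; keeping the full list of pairs preserves the confidence information for later weighting.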
answer_type
categorical · label
Categorical label tagging the type of answer for each row, with only 4 distinct values and no nulls across 4,319 rows. The distribution is heavily skewed: 'other' covers 62.3% (2,691) and 'unanswerable' another 32.1% (1,385), while 'yes/no' (195) and 'number' (48) are rare. The entropy ratio of 0.61 (1.225 bits against a 2-bit maximum for four classes) quantifies the imbalance, which will matter for any stratification or class-weighted modelling.
Treatment: One-hot encode and apply class weighting or stratified sampling to handle the imbalance.
- n
- 4,319
- nulls
- 0 (0.0%)
- unique
- 4
- top_value
- other
- top_rate
- 0.6231
- cardinality
- 4
- entropy
- 1.225
- entropy_ratio
- 0.6127
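The entropy figures can be reproduced from the value counts in the answer_type table. This sketch assumes Shannon entropy in bits, normalised by log2 of the number of classes; that reading is an inference from the reported numbers rather than something the report states.

```python
import math

# Value counts taken from the answer_type table in this report.
counts = {"other": 2691, "unanswerable": 1385, "yes/no": 195, "number": 48}


def entropy_bits(counts):
    """Shannon entropy, in bits, of the empirical class distribution."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


h = entropy_bits(counts)
# Normalise by the maximum entropy for this many classes (2 bits for 4 classes).
ratio = h / math.log2(len(counts))
print(round(h, 3), round(ratio, 4))
```

A ratio of 1.0 would mean four equally frequent classes, so values well below 1 indicate that a few classes carry most of the rows.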
answerable
numeric · label
Binary 0/1 flag, almost certainly indicating whether a question is answerable. About 67.9% of rows are 1 (2,934) and 32.1% are 0 (1,385), with no nulls across 4,319 rows. The 1,385 zeros match the count of 'unanswerable' in answer_type, which suggests the two columns largely encode the same information. The class imbalance is moderate but worth noting for any classifier trained on this label.
Treatment: Use directly as a binary target; account for the ~68/32 class imbalance during training.
- n
- 4,319
- nulls
- 0 (0.0%)
- unique
- 2
- min
- 0
- max
- 1
- mean
- 0.6793
- median
- 1
- std
- 0.4668
- q1
- 0
- q3
- 1
- iqr
- 1
- skew
- -0.7684
- kurtosis
- -1.41
- n_outliers
- 0
- outlier_rate
- 0
- zero_rate
- 0.3207
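One common way to account for the ~68/32 imbalance is inverse-frequency class weighting. The sketch below uses the n / (k × count) heuristic, which is also what scikit-learn's "balanced" class weights compute; it is one option among several, not a method prescribed by this report, and `balanced_class_weights` is a hypothetical helper.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (number_of_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * c) for label, c in counts.items()}


# Counts of answerable == 0 and answerable == 1 from the histogram above.
answerable_counts = {0: 1385, 1: 2934}
weights = balanced_class_weights(answerable_counts)
print({label: round(w, 3) for label, w in weights.items()})

# Under these weights each class contributes the same total weight.
weighted_totals = {
    label: weights[label] * c for label, c in answerable_counts.items()
}
print(weighted_totals)
```

The rarer class 0 receives the larger weight, so errors on unanswerable items count for more during training.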