vizwiz val annotations
Reading
This dataset contains 4,319 rows from the VizWiz validation annotations, with five columns: the image filename, the question asked about the image, the answers, an answer_type label, and an answerable flag. The question column is the most interesting: about 35% of rows (1,521) repeat a question that already appears elsewhere, and 'What is this?' alone appears 523 times, which points to a heavy concentration of generic identification queries. answer_type is dominated by 'other' (62%) and 'unanswerable' (32%). The answerable flag agrees: about 32% of rows (1,385) are flagged as not answerable, the same count as the 'unanswerable' class, which is a key signal for any downstream modeling. The image column is unique per row and needs no deeper analysis, and the answers column was skipped by the profiler.
citing: row_count · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.question.language_counts
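Both label columns put the not-answerable share at about 32%, but the profile reports only per-column counts, so it does not show whether the two labels agree row by row. A minimal sketch of that check, assuming rows are available as dicts keyed by the column names in this report; the sample rows and the `label_agreement` helper are invented for illustration:

```python
from collections import Counter

# Hypothetical sample rows; real rows would come from the annotation file.
rows = [
    {"answer_type": "other", "answerable": 1},
    {"answer_type": "unanswerable", "answerable": 0},
    {"answer_type": "yes/no", "answerable": 1},
    {"answer_type": "unanswerable", "answerable": 0},
    {"answer_type": "number", "answerable": 1},
]


def label_agreement(rows):
    """Cross-tabulate answer_type against answerable and list disagreeing rows."""
    table = Counter((r["answer_type"], r["answerable"]) for r in rows)
    mismatches = [
        i for i, r in enumerate(rows)
        if (r["answer_type"] == "unanswerable") != (r["answerable"] == 0)
    ]
    return table, mismatches


table, mismatches = label_agreement(rows)
for (answer_type, answerable), count in sorted(table.items()):
    print(f"{answer_type:>12}  answerable={answerable}  rows={count}")
print("row indices where the two labels disagree:", mismatches)
```

If the mismatch list is empty on the real data, the two columns carry the same information for this class, and using one as a feature to predict the other would leak the target.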
Charts the summary said to look at first
answer_type: value counts
| value | count | share |
|---|---|---|
| other | 2691 | 62.3% |
| unanswerable | 1385 | 32.1% |
| yes/no | 195 | 4.5% |
| number | 48 | 1.1% |
answerable: histogram (40 bins of width 0.025; the 38 bins between 0.025 and 0.975 are empty)
| bin | count |
|---|---|
| 0 – 0.025 | 1385 |
| 0.975 – 1 | 2934 |
question: length in characters
| chars | count |
|---|---|
| 7 – 13 | 759 |
| 13 – 20 | 550 |
| 20 – 26 | 931 |
| 26 – 33 | 609 |
| 33 – 39 | 368 |
| 39 – 46 | 250 |
| 46 – 52 | 143 |
| 52 – 58 | 143 |
| 58 – 65 | 96 |
| 65 – 71 | 84 |
| 71 – 78 | 68 |
| 78 – 84 | 42 |
| 84 – 91 | 35 |
| 91 – 97 | 35 |
| 97 – 103 | 30 |
| 103 – 110 | 23 |
| 110 – 116 | 19 |
| 116 – 123 | 12 |
| 123 – 129 | 23 |
| 129 – 136 | 14 |
| 136 – 142 | 8 |
| 142 – 148 | 14 |
| 148 – 155 | 8 |
| 155 – 161 | 6 |
| 161 – 168 | 10 |
| 168 – 174 | 8 |
| 174 – 180 | 8 |
| 180 – 187 | 3 |
| 187 – 193 | 5 |
| 193 – 200 | 3 |
| 200 – 206 | 3 |
| 206 – 213 | 2 |
| 213 – 219 | 1 |
| 219 – 225 | 2 |
| 225 – 232 | 1 |
| 232 – 238 | 1 |
| 238 – 245 | 0 |
| 245 – 251 | 1 |
| 251 – 258 | 0 |
| 258 – 264 | 1 |
Schema
5 columns
| Column | Type | Nulls | Unique | Alerts |
|---|---|---|---|---|
| image | text | 0.0% | 4,319 | near_unique, one_word |
| question | text | 0.0% | 2,798 | multilingual, duplicates |
| answers | unknown | 0.0% | — | skipped |
| answer_type | categorical | 0.0% | 4 | |
| answerable | numeric | 0.0% | 2 | |
image
text · identifier · near_unique · one_word
This column holds image filenames following the pattern `vizwiz_val_########.jpg`. All 4,319 values are unique single tokens of exactly 23 characters, with no nulls, duplicates, or vocabulary variation, so every row maps one-to-one to a distinct image in what appears to be the VizWiz validation split. The negative Flesch score is an artifact of scoring filenames as prose and can be ignored.
Treatment: Use as a foreign key to load the corresponding image file; do not feed as text to a model.
- n: 4,319
- nulls: 0 (0.0%)
- unique: 4,319
- len_min: 23
- len_max: 23
- len_mean: 23
- len_median: 23
- len_p95: 23
- word_mean: 1
- word_median: 1
- n_empty: 0
- n_duplicates: 0
- duplicate_rate: 0
- vocab_size: 4,319
- readability_flesch_mean: -47.98
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 1
- allcaps_rate: 0
- boilerplate_rate: 0
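The foreign-key treatment can be sketched as a small path helper that validates each filename before use. This assumes the eight `#` placeholders in the pattern stand for digits (which matches the fixed length of 23 characters); the directory name, the sample filename, and the `image_path` helper are hypothetical:

```python
import re
from pathlib import Path

# Pattern from the profile: "vizwiz_val_" + 8 characters + ".jpg" = 23 characters.
# Treating the 8 placeholder characters as digits is an assumption.
FILENAME_RE = re.compile(r"vizwiz_val_\d{8}\.jpg")


def image_path(image_dir, filename):
    """Return the path for a filename, rejecting values that break the pattern."""
    if FILENAME_RE.fullmatch(filename) is None:
        raise ValueError(f"unexpected image filename: {filename!r}")
    return Path(image_dir) / filename


# "images/val" and the filename below are placeholders, not real dataset paths.
path = image_path("images/val", "vizwiz_val_00000000.jpg")
print(path.name, len(path.name))
```

Validating up front keeps a malformed value from silently turning into a missing-file error later in an image loader.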
question
text · free_text · multilingual · duplicates
Short natural-language questions, mostly English (4,308 of 4,319 rows) and overwhelmingly identification prompts: "What is this?" alone appears 523 times, and the top 10 values are all generic "what is/color/says" queries. Heavy duplication (35.2%, 1,521 rows) and a small vocabulary (2,779 unique words across 4,319 rows) suggest a VQA-style prompt set rather than diverse free text. Mean length is 35.1 characters and 7.3 words, with a very high mean Flesch readability score (101.7). The remaining 11 rows are tagged as non-English (es, la, it, fy, hu, ia, ast), which introduces minor language drift.
Treatment: Tokenize and embed for modelling; deduplicate or weight by frequency given the 35% duplicate rate.
- n: 4,319
- nulls: 0 (0.0%)
- unique: 2,798
- len_min: 7
- len_max: 264
- len_mean: 35.1
- len_median: 26
- len_p95: 95
- word_mean: 7.259
- word_median: 5
- n_empty: 0
- n_duplicates: 1,521
- duplicate_rate: 0.3522
- vocab_size: 2,779
- readability_flesch_mean: 101.7
- emoji_rate: 0
- url_rate: 0
- one_word_rate: 0
- allcaps_rate: 0.002547
- boilerplate_rate: 0.003473
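The reported figures are consistent with `n_duplicates = n - unique` (4,319 - 2,798 = 1,521) and `duplicate_rate = n_duplicates / n`. A minimal sketch of that arithmetic and of the two treatments named above (deduplicating, or weighting by frequency), using invented questions rather than real rows:

```python
from collections import Counter

# Hypothetical questions for illustration only.
questions = [
    "What is this?",
    "What is this?",
    "What color is this shirt?",
    "What is this?",
    "What does this label say?",
]

counts = Counter(questions)
n = len(questions)
n_unique = len(counts)
n_duplicates = n - n_unique      # rows beyond the first copy of each question
duplicate_rate = n_duplicates / n

# Option 1: keep the first occurrence of each question, preserving order.
deduplicated = list(dict.fromkeys(questions))

# Option 2: keep every row but weight it by 1 / frequency, so that each
# distinct question contributes a total weight of 1.
weights = [1.0 / counts[q] for q in questions]

print(n_duplicates, duplicate_rate)
print(deduplicated)
```

Deduplicating on the question text alone would discard rows that point at different images, so the weighting option is usually the safer choice when the image is part of the model input.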
answers
unknown · other · skipped
The column 'answers' was skipped by the profiler, so its kind is unknown and no descriptive statistics were computed. All 4,319 rows are non-null, but uniqueness, type, and value distribution are unavailable. The name suggests it holds response content, likely structured (e.g., nested objects or arrays), which would explain why automatic profiling skipped it.
Treatment: Inspect raw values manually and parse into a typed structure before further profiling.
- n: 4,319
- nulls: 0 (0.0%)
- unique: —
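One way to approach the parsing step is sketched below. The profile does not show any raw values, so the cell layout here is purely an assumption: a string encoding a list of objects with `answer` and `answer_confidence` keys. The sample cell and both helper functions are hypothetical and would need adjusting once real values have been inspected:

```python
import ast
import json
from collections import Counter


def parse_answers(raw):
    """Parse one raw cell into a list of dicts; pass through already-parsed lists."""
    if isinstance(raw, list):
        return raw
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to Python-literal syntax (for example single-quoted strings).
        return ast.literal_eval(raw)


def majority_answer(answers):
    """Return the most common answer string among the parsed entries."""
    return Counter(a["answer"] for a in answers).most_common(1)[0][0]


# Hypothetical raw cell; the keys are an assumption, not confirmed by the profile.
raw = (
    '[{"answer": "soda can", "answer_confidence": "yes"},'
    ' {"answer": "soda can", "answer_confidence": "maybe"},'
    ' {"answer": "can", "answer_confidence": "yes"}]'
)

parsed = parse_answers(raw)
print(len(parsed), majority_answer(parsed))
```

Once the cells are parsed, derived columns such as the number of answers per row or the majority answer can be profiled like any other text or numeric column.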
answer_type
categorical · label
Categorical label with 4 distinct values across 4,319 rows and no nulls, classifying answers as 'other', 'unanswerable', 'yes/no', or 'number'. The distribution is heavily imbalanced: 'other' covers 2,691 rows (62.3%) and 'unanswerable' another 1,385 (32.1%), while 'yes/no' has 195 and 'number' only 48. The entropy ratio of 0.61 (1.225 bits out of a possible 2 for four classes) confirms the skew toward the top two classes.
Treatment: One-hot or integer-encode; consider class-weighting or stratified sampling given the imbalance toward 'other'.
- n: 4,319
- nulls: 0 (0.0%)
- unique: 4
- top_value: other
- top_rate: 0.6231
- cardinality: 4
- entropy: 1.225
- entropy_ratio: 0.6127
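The entropy figures can be reproduced from the four class counts, assuming the profiler reports Shannon entropy in bits and divides by log2 of the class count for the ratio. The same counts also give one common class-weighting scheme, `n / (k * count)`; that choice of scheme is an illustration, not something the profile prescribes:

```python
import math

# Counts from the answer_type value table in this report.
counts = {"other": 2691, "unanswerable": 1385, "yes/no": 195, "number": 48}
n = sum(counts.values())
k = len(counts)

# Shannon entropy in bits, and its ratio to the maximum log2(k) for k classes.
entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
entropy_ratio = entropy / math.log2(k)

# "Balanced" class weights: n / (k * count), so rarer classes get larger weights.
class_weights = {label: n / (k * c) for label, c in counts.items()}

print(round(entropy, 3), round(entropy_ratio, 4))
for label, weight in class_weights.items():
    print(f"{label:>12}: {weight:.3f}")
```

With these weights, every class contributes the same total weight to a loss function, which offsets the dominance of 'other' at the cost of amplifying noise in the 48 'number' rows.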
answerable
numeric · label
Binary 0/1 flag indicating whether a question is answerable, with 4,319 rows and no nulls. Roughly 68% of rows are marked answerable (mean 0.6793) and 32% are zeros (zero_rate 0.3207), a moderate class imbalance toward the positive class. Having only two unique values confirms this is a clean indicator rather than a probability score.
Treatment: Use directly as a binary target; account for the ~68/32 class imbalance during training or evaluation.
- n: 4,319
- nulls: 0 (0.0%)
- unique: 2
- min: 0
- max: 1
- mean: 0.6793
- median: 1
- std: 0.4668
- q1: 0
- q3: 1
- iqr: 1
- skew: -0.7684
- kurtosis: -1.41
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0.3207
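For a 0/1 column, every statistic in this list follows from the two class counts, so most of them add no information beyond the mean. A short sketch using the standard Bernoulli formulas; the choice of a sample standard deviation (n - 1 denominator) with population skew and excess kurtosis is an assumption made because it reproduces the reported values, not a documented property of the profiler:

```python
import math

# Counts from the answerable histogram in this report: 1385 zeros, 2934 ones.
n_zero, n_one = 1385, 2934
n = n_zero + n_one

p = n_one / n                       # mean of a 0/1 column = share of ones
zero_rate = n_zero / n
var_pop = p * (1 - p)               # population variance of a Bernoulli variable
std_sample = math.sqrt(var_pop * n / (n - 1))   # n - 1 denominator
skew = (1 - 2 * p) / math.sqrt(var_pop)
excess_kurtosis = 1 / var_pop - 6

print(round(p, 4), round(zero_rate, 4), round(std_sample, 4),
      round(skew, 4), round(excess_kurtosis, 2))
```

The negative skew and negative excess kurtosis are therefore a restatement of the 68/32 split and do not indicate anything unusual about the distribution.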