saturn

vizwiz val annotations

source: /home/coolhand/html/datavis/data_trove/cache/vizwiz_val_annotations.json · 4,319 rows · 5 columns · profiled 2026-05-01

Reading

dataset summary · high confidence · anthropic:claude-opus-4-7

This dataset contains 4,319 rows from the VizWiz validation annotations, structured around image filenames, the questions asked about each image, the answers, an answer_type label, and an answerable flag. The question column is the most interesting: 35.2% of rows (1,521) are duplicates, with 'What is this?' alone appearing 523 times, which points to a heavy concentration of generic identification queries. Answer_type is dominated by 'other' (62%) and 'unanswerable' (32%), and the answerable flag confirms that roughly 32% of items are flagged as not answerable, a key signal for any downstream modeling. The image column is unique per row and serves only as an identifier, while the answers column was skipped by the profiler.

citing: row_count · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.question.language_counts
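The figures cited above can be recomputed from the raw file. The following is a minimal sketch, assuming the JSON file is a top-level array of objects keyed by the five column names; that layout and the helper names `load_records` and `headline_stats` are assumptions for illustration, not something the profile confirms.

```python
import json
from collections import Counter


def load_records(path):
    # Assumed layout: a JSON array of objects, one object per row.
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def headline_stats(records):
    """Recompute the summary's headline figures from a list of row dicts."""
    n = len(records)
    questions = Counter(r["question"] for r in records)
    answer_types = Counter(r["answer_type"] for r in records)
    n_flagged_zero = sum(1 for r in records if r["answerable"] == 0)
    return {
        "row_count": n,
        # Rows beyond the first occurrence of each distinct question.
        "question_duplicate_rate": (n - len(questions)) / n if n else 0.0,
        "top_question": questions.most_common(1)[0] if questions else None,
        "answer_type_counts": dict(answer_types),
        "answerable_zero_rate": n_flagged_zero / n if n else 0.0,
    }
```

If the layout assumption holds, `headline_stats(load_records(path))` on the profiled file should reproduce the row count, duplicate rate, and zero rate cited above.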

Schema

5 columns

column       kind         nulls  unique  alerts
image        text         0.0%   4,319   near_unique, one_word
question     text         0.0%   2,798   multilingual, duplicates
answers      unknown      0.0%   -       skipped
answer_type  categorical  0.0%   4
answerable   numeric      0.0%   2

image

text identifier near_unique one_word
This column holds image filenames following the pattern `vizwiz_val_########.jpg`, with all 4,319 values being unique single tokens of exactly 23 characters. There are no nulls, duplicates, or vocabulary variation: every row maps one-to-one to a distinct image in what appears to be the VizWiz validation split. The negative Flesch score is an artifact of scoring filenames as prose and can be ignored.
Treatment: Use as a foreign key to load the corresponding image file; do not feed as text to a model.
high · anthropic:claude-opus-4-7
n
4,319
nulls
0 (0.0%)
unique
4,319
len_min
23
len_max
23
len_mean
23
len_median
23
len_p95
23
word_mean
1
word_median
1
n_empty
0
n_duplicates
0
duplicate_rate
0
vocab_size
4,319
readability_flesch_mean
-47.98
emoji_rate
0
url_rate
0
one_word_rate
1
allcaps_rate
0
boilerplate_rate
0
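The foreign-key treatment can be sketched as a small path resolver. This is a minimal sketch: the regular expression encodes the filename pattern described above (11-character prefix, 8 digits, `.jpg`, 23 characters in total), while the `image_dir` argument and the function name `image_path` are hypothetical.

```python
import re
from pathlib import Path

# Pattern from the column description: "vizwiz_val_" + 8 digits + ".jpg".
FILENAME_RE = re.compile(r"vizwiz_val_\d{8}\.jpg")


def image_path(image_dir, filename):
    """Resolve an `image` value to a path, rejecting names that break the pattern."""
    if not FILENAME_RE.fullmatch(filename):
        raise ValueError(f"unexpected image filename: {filename!r}")
    return Path(image_dir) / filename
```

Validating before joining means a malformed or path-like value fails loudly instead of silently pointing outside the image directory.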

question

text free_text multilingual duplicates
Short natural-language questions, mostly English (4,308 of 4,319) and overwhelmingly identification prompts: "What is this?" alone appears 523 times and the top 10 values are all generic "what is/color/says" queries. Heavy duplication (35.2%, 1,521 rows) and a small vocabulary (2,779 unique words across 4,319 rows) suggest a VQA-style prompt set rather than diverse free text. Mean length is 35.1 characters (7.3 words), but the medians are lower at 26 characters and 5 words, so a tail of long questions (up to 264 characters) pulls the mean up. Flesch readability is very high (101.7). The remaining 11 rows are tagged as non-English (es, la, it, fy, hu, ia, ast); on strings this short, some of these tags may be language-detector misfires rather than genuinely non-English questions.
Treatment: Tokenize and embed for modeling; deduplicate or weight by frequency given the 35% duplicate rate.
high · anthropic:claude-opus-4-7
n
4,319
nulls
0 (0.0%)
unique
2,798
len_min
7
len_max
264
len_mean
35.1
len_median
26
len_p95
95
word_mean
7.259
word_median
5
n_empty
0
n_duplicates
1,521
duplicate_rate
0.3522
vocab_size
2,779
readability_flesch_mean
101.7
emoji_rate
0
url_rate
0
one_word_rate
0
allcaps_rate
0.002547
boilerplate_rate
0.003473
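The deduplicate-or-weight treatment can be sketched as follows. This is a minimal sketch, not the profiler's method: the trim-and-lowercase normalization is an assumption, and because the profile does not say whether its duplicate count used exact string matches, these helpers may merge more rows than the 1,521 reported.

```python
from collections import Counter


def normalize(question):
    # Assumed normalization: trim whitespace and lowercase.
    return question.strip().lower()


def inverse_frequency_weights(questions):
    """Weight each row by 1 / (count of its normalized question)."""
    counts = Counter(normalize(q) for q in questions)
    return [1.0 / counts[normalize(q)] for q in questions]


def deduplicate(questions):
    """Keep the first occurrence of each normalized question, preserving order."""
    seen = set()
    kept = []
    for q in questions:
        key = normalize(q)
        if key not in seen:
            seen.add(key)
            kept.append(q)
    return kept
```

With inverse-frequency weights, each distinct question contributes a total weight of 1, so 523 copies of "What is this?" count as much as a question that appears once.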

answers

unknown other skipped
The column 'answers' was skipped by the profiler, so its kind is unknown and no descriptive statistics were computed. All 4,319 rows are non-null, but uniqueness, type, and value distribution are unavailable. The name suggests it holds response content, likely structured (e.g., nested objects or arrays), which would explain why automatic profiling skipped it.
Treatment: Inspect raw values manually and parse into a typed structure before further profiling.
low · anthropic:claude-opus-4-7
n
4,319
nulls
0 (0.0%)
unique
not computed (column skipped)
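One way to start the manual inspection is to summarize the nested shape of each value before committing to a schema. This is a minimal sketch: `type_signature` makes no assumption about the data, whereas `majority_answer` assumes each value is a list of objects with an `answer` string, which is a hypothetical layout that the profile does not confirm.

```python
from collections import Counter


def type_signature(value):
    """Describe the nested shape of a value, e.g. 'list[dict{answer}]'."""
    if isinstance(value, dict):
        return "dict{" + ",".join(sorted(value)) + "}"
    if isinstance(value, list):
        inner = sorted({type_signature(v) for v in value})
        return "list[" + "|".join(inner) + "]"
    return type(value).__name__


def majority_answer(answers):
    # Assumes each element is a dict with an "answer" string (hypothetical layout).
    counts = Counter(a["answer"].strip().lower() for a in answers)
    return counts.most_common(1)[0][0] if counts else None
```

Running `Counter(type_signature(v) for v in column)` over the raw values shows how many distinct shapes exist; a single dominant signature would indicate that a typed parser is safe to write.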

answer_type

categorical label
Categorical label with only 4 distinct values across 4,319 rows and no nulls, classifying answers into 'other', 'unanswerable', 'yes/no', and 'number'. The distribution is heavily imbalanced: 'other' covers 62.3% of rows and 'unanswerable' another 32.1% (1,385 rows), while 'number' appears only 48 times (1.1%). Entropy of 1.225 bits against a 2-bit maximum for four classes (ratio 0.61) confirms the skew toward the top two classes.
Treatment: One-hot or integer-encode; consider class-weighting or stratified sampling given the imbalance toward 'other'.
high · anthropic:claude-opus-4-7
n
4,319
nulls
0 (0.0%)
unique
4
top_value
other
top_rate
0.6231
cardinality
4
entropy
1.225
entropy_ratio
0.6127
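The entropy figures and the class-weighting treatment can be checked with a few lines of arithmetic. This is a minimal sketch: the counts 1,385 and 48 are stated in the profile, while 2,691 and 195 are derived here from top_rate (0.6231) and the row total, so treat them as assumptions. The weighting formula, total / (n_classes × class_count), is one common "balanced" heuristic rather than anything the profile prescribes.

```python
import math

# 1,385 and 48 come from the profile; 2,691 and 195 are derived (assumed).
counts = {"other": 2691, "unanswerable": 1385, "yes/no": 195, "number": 48}


def entropy_bits(counts):
    """Shannon entropy of the class distribution, in bits."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * c) for label, c in counts.items()}


h = entropy_bits(counts)
ratio = h / math.log2(len(counts))  # maximum entropy for 4 classes is 2 bits
weights = balanced_class_weights(counts)
```

Under these counts, `h` and `ratio` agree with the reported entropy and entropy_ratio to the precision shown, and the 'number' class receives by far the largest weight.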

answerable

numeric label
Binary 0/1 flag indicating whether a question is answerable, with 4,319 rows and no nulls. Roughly 68% are marked answerable (mean 0.6793) and 32% are zeros, giving a moderate class imbalance toward the positive class. Only two unique values confirm this is a clean indicator rather than a probability score. The zero rate (0.3207, about 1,385 rows) matches the 'unanswerable' count in answer_type, which suggests the two columns encode the same signal.
Treatment: Use directly as a binary target; account for the ~68/32 class imbalance during training or evaluation.
high · anthropic:claude-opus-4-7
n
4,319
nulls
0 (0.0%)
unique
2
min
0
max
1
mean
0.6793
median
1
std
0.4668
q1
0
q3
1
iqr
1
skew
-0.7684
kurtosis
-1.41
n_outliers
0
outlier_rate
0
zero_rate
0.3207
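For a binary column, the reported std, skew, and kurtosis follow from the mean alone, which gives a quick consistency check on the statistics above. This is a minimal sketch using population formulas for a Bernoulli variable; the profiler may use sample (n-1) estimators, which is an assumption here and would explain differences in the last digit.

```python
import math


def bernoulli_moments(p):
    """Population std, skewness and excess kurtosis of a 0/1 variable with mean p."""
    q = 1.0 - p
    var = p * q
    return {
        "std": math.sqrt(var),
        "skew": (q - p) / math.sqrt(var),
        "kurtosis": (1.0 - 6.0 * var) / var,
    }


m = bernoulli_moments(0.6793)  # mean reported for `answerable`
```

The negative skew reflects that ones outnumber zeros, and the excess kurtosis near -1.4 is what a two-valued distribution with this split produces; neither indicates anything beyond the 68/32 split itself.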