saturn·

accessibility atlas vizwiz val annotations

saturn notebook · generated 2026-05-01

Overview

Source: /home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv

Saturn profiled 4,319 rows across 5 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

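[2]:
# Install the profiler (IPython shell escape, so run this inside a notebook).
!pip install saturn-dissect
import subprocess

# Profile the CSV and request the opt-in LLM prose for the findings.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv",
    "--findings", "accessibility-atlas-vizwiz_val_annotations.json",
    "--llm", "anthropic:claude-opus-4-7",
], check=True)  # raise CalledProcessError if the CLI exits non-zero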

Summary confidence: high

This dataset is the VizWiz validation annotations file with 4,319 rows and 5 columns: an image filename, a question, a set of crowd answers, an answer_type label, and a binary answerable flag. The questions are dominated by a small number of generic openers: 'What is this?' alone accounts for 523 rows (12.1%), and 35.2% of question rows are duplicates (1,521 of 4,319), so many distinct images share the same prompt. Answer_type is heavily skewed: 'other' covers 62.3% of rows (2,691) and 'unanswerable' another 32.1% (1,385), while 'yes/no' (4.5%) and 'number' (1.1%) are rare. Consistent with that, the answerable flag has a mean of 0.68, so roughly 32% of items are flagged unanswerable, a share worth inspecting before modeling. The answers column is a serialized list of dicts (long strings averaging ~560 characters) and will need parsing rather than direct text analysis.

citing: row_count · column_count · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.answers.stats.len_mean
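The `citing:` entries read like dotted paths into the findings file written by `--findings`. A minimal sketch of resolving such paths follows, assuming the file is JSON whose nesting mirrors the dotted names; the actual schema is not shown in this report, so treat that as an assumption. The `resolve` helper and the inline `findings` sample are hypothetical, with values copied from the stats in this report.

```python
from functools import reduce

# Hypothetical excerpt of a findings document. The nesting is an assumption
# inferred from the dotted names in the "citing:" line; values are from this report.
findings = {
    "row_count": 4319,
    "column_count": 5,
    "columns": {
        "question": {"stats": {"duplicate_rate": 0.3522}},
        "answerable": {"stats": {"mean": 0.6793, "zero_rate": 0.3207}},
        "answers": {"stats": {"len_mean": 559.7}},
    },
}


def resolve(doc, dotted_path):
    """Walk a nested dict along a dotted path such as 'columns.answerable.stats.mean'."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), doc)


cited = [
    "row_count",
    "columns.question.stats.duplicate_rate",
    "columns.answerable.stats.zero_rate",
]
resolved = {path: resolve(findings, path) for path in cited}
```

To apply this to the real file, load it with `json.load` first and adjust the paths if the schema turns out to differ.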

Out[4]:

saturn.schema() · 5 columns

column kind n null% unique alerts
image text 4,319 0.0% 4,319 near_unique one_word
question text 4,319 0.0% 2,798 multilingual duplicates
answers text 4,319 0.0% 4,295 near_unique
answer_type categorical 4,319 0.0% 4
answerable numeric 4,319 0.0% 2
Fig 1.
answer_type · Shows how 'other' dominates and 'unanswerable' is the clear second category.
Top values for answer_type (4 unique shown, of 4 total).
value count share
other 2,691 62.3%
unanswerable 1,385 32.1%
yes/no 195 4.5%
number 48 1.1%
Fig 2.
answerable · Visualizes the ~68/32 split between answerable and unanswerable items.
Histogram bins for answerable (median: 1.0). Only the two end bins are non-empty; the 38 bins in between have count 0.
bin count
0 – 0.025 1,385
0.975 – 1 2,934
Fig 3.
question · Highlights the most repeated prompts, led by 'What is this?' at 523 occurrences.
Fig 4.
question · Reveals that most questions are short (median 26 chars) with a long tail up to 264.
Character-length distribution for question (mean: 35.1).
chars count
7 – 13 759
13 – 20 550
20 – 26 931
26 – 33 609
33 – 39 368
39 – 46 250
46 – 52 143
52 – 58 143
58 – 65 96
65 – 71 84
71 – 78 68
78 – 84 42
84 – 91 35
91 – 97 35
97 – 103 30
103 – 110 23
110 – 116 19
116 – 123 12
123 – 129 23
129 – 136 14
136 – 142 8
142 – 148 14
148 – 155 8
155 – 161 6
161 – 168 10
168 – 174 8
174 – 180 8
180 – 187 3
187 – 193 5
193 – 200 3
200 – 206 3
206 – 213 2
213 – 219 1
219 – 225 2
225 – 232 1
232 – 238 1
238 – 245 0
245 – 251 1
251 – 258 0
258 – 264 1
Fig 5.
answers · Confirms answers are long serialized structures (~560 chars) needing parsing before use.
Character-length distribution for answers (mean: 559.7).
chars count
450 – 462 14
462 – 474 71
474 – 486 126
486 – 498 175
498 – 510 228
510 – 522 279
522 – 535 464
535 – 547 598
547 – 559 585
559 – 571 369
571 – 583 330
583 – 595 282
595 – 607 212
607 – 619 133
619 – 631 91
631 – 643 72
643 – 655 54
655 – 667 44
667 – 679 38
679 – 692 24
692 – 704 28
704 – 716 18
716 – 728 18
728 – 740 6
740 – 752 10
752 – 764 8
764 – 776 10
776 – 788 3
788 – 800 7
800 – 812 4
812 – 824 5
824 – 836 2
836 – 848 2
848 – 861 1
861 – 873 2
873 – 885 1
885 – 897 1
897 – 909 0
909 – 921 2
921 – 933 2
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
column kind null %
image text 0.0%
question text 0.0%
answers text 0.0%
answer_type categorical 0.0%
answerable numeric 0.0%
Fig 7.
Language mix across all text columns (per-string detection, sampled).
Per-language counts (total 4,317 detected strings).
lang count share
en 4,308 99.8%
la 2 0.0%
es 2 0.0%
hu 1 0.0%
fy 1 0.0%
ast 1 0.0%
it 1 0.0%
ia 1 0.0%

image text identifier

This column holds image filenames following a fixed `vizwiz_val_########.jpg` pattern, with all 4319 values unique and exactly 23 characters long. It functions as a per-row image identifier rather than analysable text — vocab_size equals n, one_word_rate is 1.0, and there are no duplicates or nulls. The negative Flesch score (-47.98) is an artifact of scoring filenames as prose and should be ignored.

Treatment: Use as a key to join image features or load pixel data; do not treat as text.

anthropic:claude-opus-4-7 · confidence high
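Before using the column as a join key, it may help to confirm that every value fits the described pattern. The sketch below assumes the pattern is literally `vizwiz_val_` plus 8 digits plus `.jpg` (11 + 8 + 4 = 23 characters, which agrees with the fixed length reported above); the exact casing in the real file is an assumption, and `image_key` and `check_keys` are hypothetical helper names.

```python
import re

# Pattern taken from the description above: 'vizwiz_val_' + 8 digits + '.jpg'.
# Exact casing in the real file is an assumption.
FILENAME_RE = re.compile(r"vizwiz_val_(\d{8})\.jpg")


def image_key(filename):
    """Return the integer id embedded in a filename, or None if it does not match."""
    match = FILENAME_RE.fullmatch(filename)
    return int(match.group(1)) if match else None


def check_keys(filenames):
    """Split filenames into a {filename: id} map and a list of malformed or repeated names."""
    keys, rejected = {}, []
    for name in filenames:
        key = image_key(name)
        if key is None or name in keys:
            rejected.append(name)
        else:
            keys[name] = key
    return keys, rejected


# Hypothetical sample values, not rows from the dataset.
sample = ["vizwiz_val_00000000.jpg", "vizwiz_val_00000001.jpg", "vizwiz_val_1.jpg"]
keys, rejected = check_keys(sample)
```

An empty `rejected` list on the full column would confirm it is safe to join image features on the filename.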
Out[13]:

saturn.columns["image"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 4,319
len_min 23
len_max 23
len_mean 23
len_median 23
len_p95 23
word_mean 1
word_median 1
n_empty 0
n_duplicates 0
duplicate_rate 0
vocab_size 4,319
readability_flesch_mean -47.98
emoji_rate 0
url_rate 0
one_word_rate 1
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 100.0% of rows are unique strings
alert: one_word · 100.0% of rows are a single word
Fig 8.
Character-length distribution for image (mean: 23.0). All 4,319 filenames are exactly 23 characters long, so the histogram has a single non-empty bin.

question text free_text

Short natural-language questions, predominantly English (4,308 of the 4,317 language-detected strings, 99.8%) with a handful of other-language detections, averaging 7.26 words and 35.1 characters (medians: 5 words, 26 characters). The column is heavily repetitive: 1,521 of 4,319 rows are duplicates (35.2%), and the single string "What is this?" alone accounts for 523 rows. Vocabulary is small (2,779 unique tokens) and dominated by question stems built from words like "what", "is", "this", consistent with VQA-style prompts directed at images or objects.

Treatment: Tokenize and embed as a text feature; expect heavy duplication so consider pairing with the associated image/context rather than treating as a standalone signal.

anthropic:claude-opus-4-7 · confidence high
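The duplicate figures above are consistent with counting every row that repeats an earlier value, that is `n_duplicates = n - unique` (4,319 - 2,798 = 1,521). That definition is inferred from the reported numbers, not taken from saturn's source. A minimal sketch under that assumption, with a hypothetical `duplicate_stats` helper and toy input:

```python
from collections import Counter


def duplicate_stats(values):
    """Count rows that repeat an earlier value: n_duplicates = n - unique."""
    counts = Counter(values)
    n = len(values)
    n_duplicates = n - len(counts)
    return {
        "n": n,
        "unique": len(counts),
        "n_duplicates": n_duplicates,
        "duplicate_rate": n_duplicates / n if n else 0.0,
        "top_values": counts.most_common(3),
    }


# Hypothetical toy questions, not rows from the dataset.
toy = ["What is this?", "What is this?", "What color is this?", "What is this?"]
toy_stats = duplicate_stats(toy)

# The same arithmetic applied to the totals reported for the question column.
reported_rate = (4319 - 2798) / 4319
```

Because one third of rows share a prompt with another row, deduplicating on the question alone would discard distinct images; deduplicating on the (image, question) pair keeps them.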
Out[16]:

saturn.columns["question"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 2,798
len_min 7
len_max 264
len_mean 35.1
len_median 26
len_p95 95
word_mean 7.259
word_median 5
n_empty 0
n_duplicates 1,521
duplicate_rate 0.3522
vocab_size 2,779
readability_flesch_mean 101.7
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0.002547
boilerplate_rate 0.003473
alert: multilingual · 9 languages detected in sample
alert: duplicates · 35.2% duplicate strings
Fig 9.
Character-length distribution for question (mean: 35.1). The 40 bins are identical to the table under Fig 4.

answers text free_text

This column holds serialized lists of answer dictionaries (each containing 'answer' and 'answer_confidence' keys, with values like 'yes', 'maybe', 'unanswerable', 'unsuitable'), not raw natural-language text. Lengths are tightly bounded (min 450, max 933, median 550 chars), and 4,295 of 4,319 rows are unique; the 24 duplicates (0.6%) leave the column 99.4% unique, which triggers the near_unique alert. The strongly negative Flesch score (-56.5) is consistent with structured, JSON-like content rather than prose.

Treatment: Parse the literal dict/list structure and explode answer and answer_confidence into separate fields before modelling.

anthropic:claude-opus-4-7 · confidence high
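The parse-and-explode step can be sketched as follows. This assumes each cell is either valid JSON or a Python-literal list of dicts carrying the 'answer' and 'answer_confidence' keys named above; the sample cell, the `rank` field, and the helper names are hypothetical, and real cells are longer.

```python
import ast
import json
from collections import Counter


def parse_answers(raw):
    """Parse one serialized answers cell into a list of dicts."""
    try:
        parsed = json.loads(raw)          # valid JSON (double-quoted keys)
    except json.JSONDecodeError:
        parsed = ast.literal_eval(raw)    # Python-literal repr (single-quoted keys)
    if not isinstance(parsed, list):
        raise ValueError("expected a list of answer dicts")
    return parsed


def explode(image, raw):
    """Yield one flat record per crowd answer for a given image."""
    for rank, item in enumerate(parse_answers(raw)):
        yield {
            "image": image,
            "rank": rank,
            "answer": item.get("answer"),
            "answer_confidence": item.get("answer_confidence"),
        }


# Hypothetical cell content; real cells run from 450 to 933 characters.
raw_cell = (
    "[{'answer': 'soda can', 'answer_confidence': 'yes'}, "
    "{'answer': 'unanswerable', 'answer_confidence': 'maybe'}, "
    "{'answer': 'soda can', 'answer_confidence': 'yes'}]"
)
records = list(explode("vizwiz_val_00000000.jpg", raw_cell))
majority = Counter(r["answer"] for r in records).most_common(1)[0][0]
```

`ast.literal_eval` is used rather than `eval` because it only accepts literals, which matters when the cell contents come from an external file.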
Out[19]:

saturn.columns["answers"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 4,295
len_min 450
len_max 933
len_mean 559.7
len_median 550
len_p95 660.1
word_mean 47.66
word_median 45
n_empty 0
n_duplicates 24
duplicate_rate 0.005557
vocab_size 11,308
readability_flesch_mean -56.5
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 99.4% of rows are unique strings
Fig 10.
Character-length distribution for answers (mean: 559.7). The 40 bins are identical to the table under Fig 5.

answer_type categorical label

Categorical label tagging the type of answer for each row, with only 4 distinct values and no nulls across 4,319 rows. The distribution is heavily skewed: 'other' covers 62.3% (2,691) and 'unanswerable' another 32.1% (1,385), while 'yes/no' (195) and 'number' (48) are rare. An entropy ratio of 0.61 (1.225 bits out of a possible 2 bits for four classes) confirms the imbalance, which will matter for any stratification or class-weighted modelling.

Treatment: One-hot encode and apply class weighting or stratified sampling to handle the imbalance.

anthropic:claude-opus-4-7 · confidence high
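The entropy figures can be reproduced from the four counts. The sketch below assumes saturn reports Shannon entropy in bits and normalises it by log2 of the cardinality; that reading is inferred from the numbers agreeing, not from saturn's documentation.

```python
import math

# Counts from the answer_type top-values table.
counts = {"other": 2691, "unanswerable": 1385, "yes/no": 195, "number": 48}


def entropy_bits(counts):
    """Shannon entropy in bits of a categorical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


entropy = entropy_bits(counts)
# Normalise by the maximum entropy for this cardinality: log2(4) = 2 bits.
entropy_ratio = entropy / math.log2(len(counts))
```

A ratio of 1.0 would mean four equally frequent classes; values this far below 1.0 suggest that a plain random split may leave very few 'number' rows in a small held-out set.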
Out[22]:

saturn.columns["answer_type"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 4
top_value other
top_rate 0.6231
cardinality 4
entropy 1.225
entropy_ratio 0.6127
Fig 11.
Top values for answer_type (4 unique shown, of 4 total).
value count share
other 2,691 62.3%
unanswerable 1,385 32.1%
yes/no 195 4.5%
number 48 1.1%

answerable numeric label

Binary 0/1 flag, almost certainly indicating whether a question is answerable. About 67.9% of rows are 1 (2,934) and 32.1% are 0 (1,385), with no nulls across 4,319 rows. The 1,385 zeros match the 'unanswerable' count in answer_type. The class imbalance is moderate but worth noting for any classifier trained on this label.

Treatment: Use directly as a binary target; account for the ~68/32 class imbalance during training.

anthropic:claude-opus-4-7 · confidence high
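One common way to account for the ~68/32 split is to weight each class inversely to its frequency. The sketch below uses the widely used "balanced" heuristic, weight = n / (k * count); this is one option among several and is not something the report itself prescribes. The `balanced_weights` helper is hypothetical.

```python
# Counts from the answerable histogram: 1,385 zeros and 2,934 ones.
counts = {0: 1385, 1: 2934}


def balanced_weights(counts):
    """Weight each class by n / (k * count), the common 'balanced' heuristic."""
    n = sum(counts.values())
    k = len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


weights = balanced_weights(counts)
positive_rate = counts[1] / sum(counts.values())

# With these weights each class contributes the same total weight (n / k).
weighted_totals = {label: weights[label] * counts[label] for label in counts}
```

The resulting weights are roughly 1.56 for class 0 and 0.74 for class 1, and can be passed to any trainer that accepts per-class or per-sample weights.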
Out[25]:

saturn.columns["answerable"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 2
min 0
max 1
mean 0.6793
median 1
std 0.4668
q1 0
q3 1
iqr 1
skew -0.7684
kurtosis -1.41
n_outliers 0
outlier_rate 0
zero_rate 0.3207
Fig 12.
Distribution of answerable. Vertical dash marks the median.
Histogram bins for answerable (median: 1.0). Only the two end bins are non-empty; the 38 bins in between have count 0.
bin count
0 – 0.025 1,385
0.975 – 1 2,934

How to cite

BibTeX
@misc{saturn-accessibility-atlas-vizwiz-val-annotations-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: accessibility atlas vizwiz val annotations},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/accessibility-atlas-vizwiz_val_annotations}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: accessibility atlas vizwiz val annotations. Source: /home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/accessibility-atlas-vizwiz_val_annotations