vizwiz

saturn notebook · generated 2026-04-22

Overview

Source: /home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv

Saturn profiled 4,319 rows across 5 columns. The stats below are deterministic and machine-readable; the prose is a language-model interpretation of those stats (opt-in, added after the fact, never sees raw rows).

[2]:
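# Install the profiler (IPython shell escape; this line only works inside a notebook).
!pip install saturn-dissect
import subprocess

# Run the saturn CLI on the annotation CSV with the findings file and LLM given below.
subprocess.run([
    "saturn", "analyze", "/home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv",
    "--findings", "vizwiz.json",
    "--llm", "anthropic:claude-opus-4-7",
], check=True)  # raise CalledProcessError if the CLI exits non-zero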

Summary confidence: high

This is the VizWiz validation annotation set: 4,319 rows linking an image filename to a question, a bundle of crowd answers, an answer_type label, and a binary 'answerable' flag. The question column is where the dataset's character lives — it has only 2,798 unique values with a 35% duplicate rate, dominated by short generic prompts like 'What is this?' (523 occurrences). Worth a closer look: the answer_type distribution is heavily skewed toward 'other' (62%) with 'unanswerable' a strong second, and the numeric 'answerable' flag confirms that ~32% of items are flagged unanswerable — a meaningful portion to account for in any downstream evaluation.

citing: row_count · column_count · columns.question.n_unique · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.question.stats.word_mean
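
The dotted names in the citing line read like paths into the findings file. A minimal sketch of resolving such paths follows. It assumes the findings are a nested JSON object with that nesting; the `findings` dict is a hand-built excerpt using values from this report, not the real vizwiz.json, and `resolve` is a hypothetical helper.

```python
from functools import reduce
from typing import Any

# Hypothetical excerpt: the nesting is inferred from the dotted "citing" paths,
# and the values are copied from the stats shown in this report.
findings = {
    "row_count": 4319,
    "column_count": 5,
    "columns": {
        "question": {"n_unique": 2798, "stats": {"duplicate_rate": 0.3522, "word_mean": 7.259}},
        "answer_type": {"stats": {"top_rate": 0.6231}},
        "answerable": {"stats": {"mean": 0.6793, "zero_rate": 0.3207}},
    },
}


def resolve(doc: dict, path: str) -> Any:
    """Walk a nested dict along a dot-separated path such as 'columns.question.n_unique'."""
    return reduce(lambda node, key: node[key], path.split("."), doc)


cited = ["row_count", "columns.question.n_unique", "columns.answerable.stats.zero_rate"]
for path in cited:
    print(f"{path} = {resolve(findings, path)}")
```

A missing key raises `KeyError`, which is useful here: a cited stat that does not exist in the findings should fail loudly rather than be silently skipped.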

Out[4]:

saturn.schema() · 5 columns

column kind n null% unique alerts
image text 4,319 0.0% 4,319 near_unique one_word
question text 4,319 0.0% 2,798 duplicates multilingual
answers text 4,319 0.0% 4,295 near_unique
answer_type categorical 4,319 0.0% 4
answerable numeric 4,319 0.0% 2
Fig 1.
answer_type · Shows how 'other' dominates over unanswerable, yes/no, and number — useful for sizing class imbalance.
Top values for answer_type (4 unique shown, of 4 total).
value count share
other 2,691 62.3%
unanswerable 1,385 32.1%
yes/no 195 4.5%
number 48 1.1%
Fig 2.
answerable · Roughly two-thirds answerable vs. one-third unanswerable; check this before any accuracy calculation.
Histogram bins for answerable (median: 1.0). Of the 40 equal-width bins on [0, 1], only the two end bins are non-empty.
bin count
0–0.025 1,385
0.975–1 2,934
Fig 3.
question · Top question strings reveal heavy repetition of generic prompts like 'What is this?' — confirms the 35% duplicate rate.
Fig 4.
question · Question length distribution is short and right-skewed (median 26 chars, max 264) — a few long outliers worth inspecting.
Character-length distribution for question (mean: 35.10141236397314).
chars count
7–13 759
13–20 550
20–26 931
26–33 609
33–39 368
39–46 250
46–52 143
52–58 143
58–65 96
65–71 84
71–78 68
78–84 42
84–91 35
91–97 35
97–103 30
103–110 23
110–116 19
116–123 12
123–129 23
129–136 14
136–142 8
142–148 14
148–155 8
155–161 6
161–168 10
168–174 8
174–180 8
180–187 3
187–193 5
193–200 3
200–206 3
206–213 2
213–219 1
219–225 2
225–232 1
232–238 1
238–245 0
245–251 1
251–258 0
258–264 1
Fig 5.
answers · Answer-bundle string lengths cluster tightly (450–660 chars) because each row stores a fixed-size list of crowd responses.
Character-length distribution for answers (mean: 559.675387821255).
chars count
450–462 14
462–474 71
474–486 126
486–498 175
498–510 228
510–522 279
522–535 464
535–547 598
547–559 585
559–571 369
571–583 330
583–595 282
595–607 212
607–619 133
619–631 91
631–643 72
643–655 54
655–667 44
667–679 38
679–692 24
692–704 28
704–716 18
716–728 18
728–740 6
740–752 10
752–764 8
764–776 10
776–788 3
788–800 7
800–812 4
812–824 5
824–836 2
836–848 2
848–861 1
861–873 2
873–885 1
885–897 1
897–909 0
909–921 2
921–933 2
Fig 6.
Per-column null rate across the corpus. Columns are ordered by input position.
column kind null%
image text 0.0%
question text 0.0%
answers text 0.0%
answer_type categorical 0.0%
answerable numeric 0.0%
Fig 7.
Language mix across all text columns (per-string detection, sampled).
Per-language counts (total 4,317 detected strings).
lang count share
en 4,308 99.8%
la 2 0.0%
es 2 0.0%
hu 1 0.0%
fy 1 0.0%
ast 1 0.0%
it 1 0.0%
ia 1 0.0%

image text identifier

This column holds image filenames following the pattern `vizwiz_val_########.jpg`, with all 4319 values unique and exactly 23 characters long. Every entry is a single token with no duplicates or nulls, confirming it functions as a per-row file pointer rather than analyzable text.

Treatment: Treat as a file-path key; join to image assets rather than modelling the string.

anthropic:claude-opus-4-7 · confidence high
Out[13]:

saturn.columns["image"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 4,319
len_min 23
len_max 23
len_mean 23
len_median 23
len_p95 23
word_mean 1
word_median 1
n_empty 0
n_duplicates 0
duplicate_rate 0
vocab_size 4,319
readability_flesch_mean -47.98
emoji_rate 0
url_rate 0
one_word_rate 1
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 100.0% of rows are unique strings
alert: one_word · 100.0% of rows are a single word
Fig 8.
Character-length distribution for image (mean: 23.0).
All 4,319 values fall in a single bin at 23 characters; the other 39 bins (spanning 22–24) are empty.
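The "treat as a file-path key" suggestion for the image column can be sketched as a validate-then-join step. This assumes the `########` in the pattern stands for eight digits, which is consistent with the fixed 23-character length; the directory `images/val`, the example filename, and the helper `image_path` are hypothetical.

```python
import re
from pathlib import Path

# Assumption: '########' in the reported pattern means exactly eight digits.
# 'vizwiz_val_' (11) + 8 digits + '.jpg' (4) = 23 characters, matching len_min/len_max.
FILENAME_RE = re.compile(r"vizwiz_val_\d{8}\.jpg")


def image_path(image_dir: Path, filename: str) -> Path:
    """Return the asset path for a filename key, rejecting keys that break the pattern."""
    if not FILENAME_RE.fullmatch(filename):
        raise ValueError(f"unexpected image key: {filename!r}")
    return image_dir / filename


example = "vizwiz_val_00000000.jpg"  # illustrative key, not taken from the dataset
print(len(example), image_path(Path("images/val"), example))
```

Validating before joining keeps a malformed key from silently producing a path that points at nothing.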

question text free_text

Short English questions, averaging 7.26 words and 35 characters (median 5 words, 26 characters). The most common string, 'What is this?', accounts for 523 rows (about 12%). 35.2% of the 4,319 rows are duplicates, leaving only 2,798 unique strings, and the vocabulary is small (2,779 tokens) with very high Flesch readability (101.7). A handful of strings are tagged as non-English (es, la, it, fy, hu, ia, ast), but English dominates at 4,308 detections.

Treatment: Tokenize and embed for modelling; consider deduplicating or weighting given the 35% duplicate rate.

anthropic:claude-opus-4-7 · confidence high
Out[16]:

saturn.columns["question"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 2,798
len_min 7
len_max 264
len_mean 35.1
len_median 26
len_p95 95
word_mean 7.259
word_median 5
n_empty 0
n_duplicates 1,521
duplicate_rate 0.3522
vocab_size 2,779
readability_flesch_mean 101.7
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0.002547
boilerplate_rate 0.003473
alert: duplicates · 35.2% duplicate strings
alert: multilingual · 9 languages detected in sample
Fig 9.
Character-length distribution for question (mean: 35.10141236397314).
chars count
7–13 759
13–20 550
20–26 931
26–33 609
33–39 368
39–46 250
46–52 143
52–58 143
58–65 96
65–71 84
71–78 68
78–84 42
84–91 35
91–97 35
97–103 30
103–110 23
110–116 19
116–123 12
123–129 23
129–136 14
136–142 8
142–148 14
148–155 8
155–161 6
161–168 10
168–174 8
174–180 8
180–187 3
187–193 5
193–200 3
200–206 3
206–213 2
213–219 1
219–225 2
225–232 1
232–238 1
238–245 0
245–251 1
251–258 0
258–264 1
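The duplicate figures for the question column, and the "deduplicate or weight" suggestion, can be sketched as follows. The sketch assumes n_duplicates is rows minus unique strings, which matches the reported 1,521 = 4,319 − 2,798; the toy `questions` list and both helper names are hypothetical.

```python
from collections import Counter


def duplicate_stats(values: list[str]) -> tuple[int, int, float]:
    """Return (n_unique, n_duplicates, duplicate_rate), counting every repeat beyond the first."""
    n = len(values)
    n_unique = len(Counter(values))
    n_duplicates = n - n_unique
    return n_unique, n_duplicates, n_duplicates / n


def inverse_frequency_weights(values: list[str]) -> list[float]:
    """Weight each row by 1 / (occurrences of its string), so each distinct string sums to 1."""
    counts = Counter(values)
    return [1.0 / counts[v] for v in values]


# Toy data for illustration only.
questions = ["What is this?", "What is this?", "What color is this shirt?", "What is this?"]
print(duplicate_stats(questions))
print(inverse_frequency_weights(questions))

# The same arithmetic applied to the reported counts reproduces duplicate_rate.
print(round((4319 - 2798) / 4319, 4))
```

Weighting keeps every row in the evaluation while stopping the 523 copies of 'What is this?' from dominating an average; dropping duplicates outright is the simpler alternative when row-level results are not needed.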

answers text feature

This column holds serialized lists of answer dicts (keys like 'answer' and 'answer_confidence' with values such as 'yes', 'maybe', 'unanswerable'), not free-form text. Rows are long and uniform (len_mean 559.7, len_min 450, len_max 933) and nearly all unique (4295/4319), with a tiny 0.56% duplicate rate. The strongly negative Flesch score (-56.5) confirms this is structured payload rather than natural language.

Treatment: Parse the stringified dicts and explode answer/confidence fields into structured columns before modelling.

anthropic:claude-opus-4-7 · confidence high
Out[19]:

saturn.columns["answers"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 4,295
len_min 450
len_max 933
len_mean 559.7
len_median 550
len_p95 660.1
word_mean 47.66
word_median 45
n_empty 0
n_duplicates 24
duplicate_rate 0.005557
vocab_size 11,308
readability_flesch_mean -56.5
emoji_rate 0
url_rate 0
one_word_rate 0
allcaps_rate 0
boilerplate_rate 0
alert: near_unique · 99.4% of rows are unique strings
Fig 10.
Character-length distribution for answers (mean: 559.675387821255).
chars count
450–462 14
462–474 71
474–486 126
486–498 175
498–510 228
510–522 279
522–535 464
535–547 598
547–559 585
559–571 369
571–583 330
583–595 282
595–607 212
607–619 133
619–631 91
631–643 72
643–655 54
655–667 44
667–679 38
679–692 24
692–704 28
704–716 18
716–728 18
728–740 6
740–752 10
752–764 8
764–776 10
776–788 3
788–800 7
800–812 4
812–824 5
824–836 2
836–848 2
848–861 1
861–873 2
873–885 1
885–897 1
897–909 0
909–921 2
921–933 2
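The parse-and-explode treatment for the answers column might look like the sketch below. The exact serialization is not shown in this report, so the code tries JSON first and falls back to a Python-literal parse; the `sample` string is invented for illustration (only the key names 'answer' and 'answer_confidence' come from the column description), and `parse_answers` and `summarize` are hypothetical helpers.

```python
import ast
import json
from collections import Counter


def parse_answers(raw: str) -> list[dict]:
    """Parse one serialized answer bundle; JSON first, then a Python-literal fallback."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals, so it is safe on untrusted strings.
        return ast.literal_eval(raw)


def summarize(raw: str) -> dict:
    """Collapse a bundle into structured fields: majority answer, vote count, confidence."""
    answers = parse_answers(raw)
    votes = Counter(a["answer"] for a in answers)
    top_answer, top_votes = votes.most_common(1)[0]
    n_confident = sum(a["answer_confidence"] == "yes" for a in answers)
    return {
        "top_answer": top_answer,
        "top_votes": top_votes,
        "n_answers": len(answers),
        "n_confident": n_confident,
    }


# Invented sample bundle (assumed shape: a list of dicts, single-quoted).
sample = (
    "[{'answer': 'unanswerable', 'answer_confidence': 'yes'}, "
    "{'answer': 'unanswerable', 'answer_confidence': 'maybe'}, "
    "{'answer': 'yes', 'answer_confidence': 'yes'}]"
)
print(summarize(sample))
```

If the real column turns out to be valid JSON, the fallback branch is simply never taken.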

answer_type categorical label

Categorical label tagging the type of answer expected, with just 4 classes: 'other' dominates at 62.3% (2691/4319), followed by 'unanswerable' at 1385, while 'yes/no' (195) and 'number' (48) are rare. No nulls, but the class imbalance is severe — 'number' represents barely 1% of rows. Entropy ratio of 0.61 confirms the distribution is far from uniform.

Treatment: Use as a stratified target; consider class weighting or merging rare classes ('yes/no', 'number') given the imbalance.

anthropic:claude-opus-4-7 · confidence high
Out[22]:

saturn.columns["answer_type"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 4
top_value other
top_rate 0.6231
cardinality 4
entropy 1.225
entropy_ratio 0.6127
Fig 11.
Top values for answer_type.
Top values for answer_type (4 unique shown, of 4 total).
value count share
other 2,691 62.3%
unanswerable 1,385 32.1%
yes/no 195 4.5%
number 48 1.1%
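The entropy figures for answer_type can be reproduced from the four counts above. This sketch assumes entropy is measured in bits (base 2) and that entropy_ratio divides by log2 of the class count, which matches the reported 1.225 and 0.6127. The class weights use the common "balanced" heuristic n / (k · n_c), shown as one option and not as what saturn recommends.

```python
import math

# Counts from the answer_type top-values table.
counts = {"other": 2691, "unanswerable": 1385, "yes/no": 195, "number": 48}
total = sum(counts.values())  # 4319 rows


def entropy_bits(class_counts: dict[str, int]) -> float:
    """Shannon entropy of the empirical class distribution, in bits."""
    n = sum(class_counts.values())
    return -sum((c / n) * math.log2(c / n) for c in class_counts.values() if c)


entropy = entropy_bits(counts)
entropy_ratio = entropy / math.log2(len(counts))  # 1.0 would mean a uniform distribution

# "Balanced" weights: n / (k * n_c), so rare classes get proportionally larger weights.
class_weights = {label: total / (len(counts) * c) for label, c in counts.items()}

print(f"entropy = {entropy:.3f} bits, ratio = {entropy_ratio:.4f}")
for label, weight in class_weights.items():
    print(f"{label}: weight {weight:.2f}")
```

With only 48 'number' rows, its weight comes out roughly 56 times that of 'other', which is one reason merging the rare classes may be the more stable choice.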

answerable numeric label

Binary 0/1 flag indicating whether an item is answerable, with 4319 rows and no nulls. Class is imbalanced toward 1: mean 0.6793 implies roughly 68% positives versus a 0.3207 zero-rate, and skew -0.768 with kurtosis -1.41 confirm the lopsided two-point distribution.

Treatment: Use as binary target; account for the ~68/32 class imbalance via stratified splits or class weights.

anthropic:claude-opus-4-7 · confidence high
Out[25]:

saturn.columns["answerable"].stats

stat value
n 4,319
nulls 0 (0.0%)
unique 2
min 0
max 1
mean 0.6793
median 1
std 0.4668
q1 0
q3 1
iqr 1
skew -0.7684
kurtosis -1.41
n_outliers 0
outlier_rate 0
zero_rate 0.3207
Fig 12.
Distribution of answerable. Vertical dash marks the median.
Histogram bins for answerable (median: 1.0). Of the 40 equal-width bins on [0, 1], only the two end bins are non-empty.
bin count
0–0.025 1,385
0.975–1 2,934
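Because answerable takes only the values 0 and 1, its reported moments follow from the two bin counts alone. The sketch below uses the closed-form Bernoulli expressions, assuming population (not sample-corrected) moments and excess kurtosis, which agree with the reported values to the precision shown. The majority baseline is included because any accuracy figure on this flag should be read against it.

```python
import math

# Counts from the two non-empty histogram bins.
n_zero, n_one = 1385, 2934
n = n_zero + n_one

p = n_one / n  # share of answerable rows (the mean of a 0/1 column)
q = 1.0 - p    # share of unanswerable rows (the zero_rate)

std = math.sqrt(p * q)
skew = (q - p) / math.sqrt(p * q)           # negative when 1s outnumber 0s
excess_kurtosis = (1 - 6 * p * q) / (p * q)

# Accuracy of always predicting the majority class.
majority_baseline = max(p, q)

print(f"mean={p:.4f} zero_rate={q:.4f} std={std:.4f}")
print(f"skew={skew:.4f} excess_kurtosis={excess_kurtosis:.2f}")
print(f"majority-class baseline accuracy={majority_baseline:.4f}")
```

A classifier that always answers "answerable" already scores about 68% here, so reported accuracy only means something above that floor.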

How to cite

BibTeX
@misc{saturn-vizwiz-2026,
  author       = {Steuber, Luke},
  title        = {Saturn reading: vizwiz},
  year         = {2026},
  howpublished = {\url{https://dr.eamer.dev/saturn/view/vizwiz}},
  note         = {Profiled with saturn-dissect v0.2.0, prompt saturn-insight-v2, model anthropic:claude-opus-4-7},
}
APA
Steuber, L. (2026). Saturn reading: vizwiz. Source: /home/coolhand/datasets/accessibility-atlas/vizwiz_val_annotations.csv. Profiled with saturn-dissect v0.2.0 (saturn-insight-v2, anthropic:claude-opus-4-7). Retrieved from https://dr.eamer.dev/saturn/view/vizwiz