Summary confidence: high
This dataset is the VizWiz validation annotations file, with 4,319 rows and 5 columns: an image filename, a question, a set of crowd answers, an answer_type label, and a binary answerable flag. The questions are dominated by a small number of generic openers: 'What is this?' alone accounts for 523 rows (about 12%), and the question column has a 35% duplicate rate, so repeated prompts mask whatever visual variety the images contain. The answer_type column is heavily skewed: 'other' covers 62% of rows and 'unanswerable' another 1,385 (about 32%), while 'yes/no' and 'number' are rare. Consistent with that, the answerable flag has a mean of 0.68, meaning roughly 32% of items are flagged unanswerable, a notable share to inspect before modeling. The answers column is a serialized list of dicts (long strings averaging ~560 characters) and will need parsing rather than direct text analysis.
citing: row_count · column_count · columns.question.stats.duplicate_rate · columns.question.top_values · columns.answer_type.top_values · columns.answer_type.stats.top_rate · columns.answerable.stats.mean · columns.answerable.stats.zero_rate · columns.answers.stats.len_mean
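
The parsing step for the answers column might look like the sketch below. It rests on assumptions: the summary does not say whether each cell is JSON or a Python-style repr, so the code tries both, and the dict key `answer` along with the helper names `parse_answers` and `majority_answer` are hypothetical and should be checked against the actual file. The sample cell is made up for illustration and is not taken from the dataset.

```python
import ast
import json
from collections import Counter


def parse_answers(raw):
    """Parse one serialized `answers` cell into a list of dicts."""
    try:
        # JSON-style serialization (double-quoted keys and strings).
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Fallback: Python-repr style (single quotes). literal_eval only
        # evaluates literals, so it does not execute arbitrary code.
        parsed = ast.literal_eval(raw)
    if not isinstance(parsed, list):
        raise ValueError("expected a list of answer dicts")
    return parsed


def majority_answer(raw, key="answer"):
    """Return the most common crowd answer in one cell, or None if empty.

    `key` is an assumed field name; adjust it to match the real dicts.
    Answers are lowercased and stripped before counting.
    """
    answers = [str(d[key]).strip().lower() for d in parse_answers(raw) if key in d]
    return Counter(answers).most_common(1)[0][0] if answers else None


# Hypothetical cell value, for illustration only.
sample = (
    "[{'answer': 'coke', 'answer_confidence': 'yes'}, "
    "{'answer': 'Coke', 'answer_confidence': 'maybe'}, "
    "{'answer': 'soda', 'answer_confidence': 'yes'}]"
)
print(majority_answer(sample))  # prints: coke
```

Applied to the whole column (for example with a pandas `Series.map`), this would give one normalized answer per row, which is easier to compare against answer_type and the answerable flag than the raw ~560-character strings.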