Summary confidence: high
This dataset contains 147,890 UFO sighting reports across 13 columns, mixing free-text descriptions (Summary, Text, Location details), structured categoricals (Shape, Explanation), timestamps (Occurred, Reported, Posted), and a numeric witness count. The Shape field is a clean place to start: 39 categories with 'Light' leading at ~27,494 sightings, followed by Circle and Triangle. Two things deserve a closer look. First, 'No of observers' is extremely skewed: values run from -10 to 20,000 with a median of 2 and over 18,000 flagged outliers. A negative witness count is impossible and the 20,000 maximum is implausible next to that median, which points to data-entry errors that need cleaning before any aggregation. Second, the Explanation column is 99.46% null, so claims about 'what UFOs really were' rest on under 800 labelled rows (147,890 × 0.54% ≈ 799), dominated by Starlink and rocket attributions. Location is dense and US-heavy (Phoenix, Seattle, and Las Vegas top the list), and the Characteristics field collapses to ~43 vocabulary tokens dominated by 'Lights on object'.
citing: Shape.top_values · Shape.cardinality · No of observers.skew · No of observers.max · No of observers.min · No of observers.median · No of observers.n_outliers · Explanation.null_rate · Explanation.top_values · Location.top_values · Characteristics.top_values · Characteristics.vocab_size · Duration.top_values · row_count · column_count
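
The cleaning step suggested for 'No of observers' and the null-rate check on Explanation could be sketched as follows. This is a minimal illustration in standard-library Python, not the profiler's own method: the function names, the `lower`/`upper` plausibility bounds, the 1.5 × IQR outlier rule, and the toy `sample` list are all assumptions introduced here, and the rule that produced the reported outlier count is not stated in the summary.

```python
from statistics import median, quantiles


def clean_observer_counts(values, lower=1, upper=None):
    """Drop missing and implausible witness counts.

    `lower` and `upper` are hypothetical plausibility bounds chosen by the
    analyst; they are not taken from the dataset.
    """
    cleaned = []
    for v in values:
        if v is None:
            continue  # missing value
        if v < lower:
            continue  # e.g. zero or negative counts
        if upper is not None and v > upper:
            continue  # implausibly large counts
        cleaned.append(v)
    return cleaned


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    Assumption: the summary does not say which outlier rule was used.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def null_rate(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


# Toy data for illustration only; not drawn from the dataset.
sample = [1, 2, 2, 3, -10, None, 1, 2, 20000, 4]

present = [v for v in sample if v is not None]
flagged = iqr_outliers(present)
cleaned = clean_observer_counts(sample, lower=1, upper=1000)

print("flagged:", flagged)
print("cleaned median:", median(cleaned))
print("null rate:", null_rate(sample))
```

Filtering on explicit bounds rather than on the IQR fence alone keeps legitimate multi-witness sightings in the data, since a heavily skewed column will flag many valid values as statistical outliers.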