This dataset contains 502 API request logs across 11 columns, capturing usage telemetry such as response time, status code, endpoint, and user agent. Traffic is dominated by a single API ('linguistic-api' at 99.6%) and a single method (GET), and every request comes from one IP (127.0.0.1), so the interesting variation lives in endpoint, response_time_ms, status_code, and user_agent. Response times are heavily right-skewed: the median is just 3 ms, but the mean is 163 ms and the max is 1238 ms. Of the non-null values, 78 (~24%) are outliers, and the 34% null rate is worth investigating. Status codes split between 200 and 429, with roughly a third of requests returning 429, which points to rate limiting. The endpoint column has a long tail of 209 distinct paths, led by /api/languages and /api/search.
saturn
/home/coolhand/data/api_auth.db, 502 rows, sample n=502, seed 42, 2026-05-01T17:20:14+00:00
Overview
| Source | /home/coolhand/data/api_auth.db |
| Total rows | 502 |
| Profiled sample | 502 |
| Columns | 11 |
| Generated | 2026-05-01T17:20:14+00:00 |
Insights (opt-in)
Model-generated narrative. These are opinions, not facts — the stats below are what saturn measured. Generated by: anthropic:claude-opus-4-7.
usage_id is a monotonic surrogate key: 502 unique values across 502 rows with no nulls, ranging from 1 to 502, with mean and median both equal to 251.5. Skew is 0.0 and there are no outliers, consistent with a sequential row identifier rather than a measured quantity.
The column 'key_id' was skipped by the profiler, so no type, uniqueness, or distributional statistics are available beyond a row count of 502 and a null rate of 0.0. The name suggests an identifier, but without n_unique or sample values this cannot be confirmed from the evidence.
api_name is a categorical API identifier with only 2 distinct values across 502 rows. It is overwhelmingly dominated by 'linguistic-api' (500 rows, 99.6%), with 'blissAPI' appearing just twice, yielding near-zero entropy (0.037). The column is effectively a constant with two anomalous records.
endpoint records API endpoint paths, with 209 unique routes across 502 requests and no nulls. Traffic is spread fairly evenly (entropy ratio 0.828), though /api/languages leads with 58 hits (11.6%), followed by /api/search (36) and /api/stats (24); the long tail of rarely hit routes is what triggers the alert. Notably, /api/languages/NOPE appears 11 times, and three sibling NOPE paths (/api/languages/NOPE/phonemes, /api/languages/NOPE/features, and /api/families/NOPE) appear 10 times each, suggesting either a probing client or a broken reference worth investigating.
method records the HTTP method, but every one of the 502 rows is "GET": cardinality is 1 and entropy is 0. It carries no information for any downstream model or segmentation.
status_code holds HTTP status codes, taking only 2 distinct values across 502 rows: 200 (success) and 429 (rate-limited), with 200 as the median and 429 as Q3. Because only these two values occur, the mean of 278.9 implies that (278.9 - 200) / (429 - 200) ≈ 34.5% of requests, roughly a third, were throttled, which is a notable failure rate worth investigating. There are no nulls or outliers, and the bimodal shape is reflected in the negative kurtosis (-1.57).
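response_time_ms captures response times in milliseconds. Of the 502 records, 34.46% (about 173 rows) are null, leaving about 329 measured values. Their distribution is severely right-skewed (skew 1.90, kurtosis 1.74): the median is just 3 ms and Q3 is 21 ms, yet the mean is 162.84 ms and the max reaches 1238 ms, with a standard deviation of 345.34 ms. Two analyst-relevant flags: the null rate, and the 78 values outside the IQR fence, which make up 23.71% of the non-null values rather than of all rows. The null share closely matches the share of 429 responses implied by the status_code mean (about 34.5%), so a reasonable first check is whether the nulls are the throttled requests.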
This is a numeric flag named cache_hit, presumably a 0/1 indicator of whether a cache lookup succeeded. Across all 502 rows it is constant at 0 (zero_rate 1.0, n_unique 1, std 0.0), meaning no cache hit was ever recorded. That is either a broken instrumentation path or a workload where caching is disabled.
ip_address records an IP address but holds the loopback value 127.0.0.1 for all 502 rows, with zero nulls and cardinality of 1. Entropy is 0.0, so the field carries no information and looks like a placeholder or a logging artifact rather than a real client IP.
user_agent holds HTTP User-Agent strings from request logs, with only 12 distinct values across 502 rows and no nulls. Traffic is dominated by non-browser clients: Werkzeug/3.1.4 at 215 hits and GoogleOther at 196, together covering ~82% of requests, with curl and assorted bots (Bytespider, Applebot, Googlebot, bingbot, GPTBot) making up most of the rest. The Werkzeug/3.1.4 string matches the default User-Agent set by Flask's test client rather than anything a server would send, which suggests automated test traffic. Genuine human browser traffic appears negligible: a single Firefox hit and 14 hits from Chrome on iOS.
timestamp is stored as strings, with 299 unique values across 502 rows and no nulls. The distribution is unusually clumpy for a timestamp: six values on 2026-04-17 each repeat 14 to 38 times, accounting for 189 rows (about 38% of the data), while every other timestamp appears at most 4 times. That burst pattern suggests batched events or a logging artifact rather than free-flowing event time.
Numeric correlation
usage_id numeric
key_id unknown
api_name categorical
Top values (rank 1–20)
- linguistic-api — 500
- blissAPI — 2
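The entropy of 0.037 quoted for api_name can be reproduced from the two counts above. The sketch below assumes saturn reports Shannon entropy in bits (base 2), which the report does not state; the function name is illustrative.

```python
import math


def shannon_entropy_bits(counts):
    """Shannon entropy, in bits, of a frequency table given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Counts copied from the api_name top-values list.
api_name_counts = {"linguistic-api": 500, "blissAPI": 2}

entropy = shannon_entropy_bits(api_name_counts.values())
print(f"{entropy:.3f}")  # close to the 0.037 in the report
```

With only two categories the maximum possible entropy is 1 bit, so the raw value and the normalized entropy ratio coincide for this column.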
endpoint categorical
Top values (rank 1–20)
- /api/languages — 58
- /api/search — 36
- /api/stats — 24
- /api/languages/macroareas — 18
- /api/languages/by_macroarea/Africa — 16
- /api/languages/eng — 13
- /api/languages/spa/phonemes — 12
- /api/families/indo1319 — 12
- /api/typology/wals/parameters — 12
- /api/languages/NOPE — 11
- /api/languages/eng/features — 11
- /api/languages/cmn — 10
- /api/languages/eng/phonemes — 10
- /api/languages/NOPE/phonemes — 10
- /api/languages/NOPE/features — 10
- /api/families/NOPE — 10
- /api/typology/wals/map/81A — 8
- /api/typology/phonology/summary — 7
- /api/typology/phonology/inventory/eng — 7
- /api/typology/types/by_family — 6
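To size the NOPE traffic, the counts in the list above can be filtered by path segment. This is a minimal sketch over the top-20 values only, so the total is a lower bound; the helper name `has_segment` is illustrative.

```python
# Hit counts copied from the endpoint top-values list (rank 1-20 only).
endpoint_counts = {
    "/api/languages": 58,
    "/api/search": 36,
    "/api/stats": 24,
    "/api/languages/macroareas": 18,
    "/api/languages/by_macroarea/Africa": 16,
    "/api/languages/eng": 13,
    "/api/languages/spa/phonemes": 12,
    "/api/families/indo1319": 12,
    "/api/typology/wals/parameters": 12,
    "/api/languages/NOPE": 11,
    "/api/languages/eng/features": 11,
    "/api/languages/cmn": 10,
    "/api/languages/eng/phonemes": 10,
    "/api/languages/NOPE/phonemes": 10,
    "/api/languages/NOPE/features": 10,
    "/api/families/NOPE": 10,
    "/api/typology/wals/map/81A": 8,
    "/api/typology/phonology/summary": 7,
    "/api/typology/phonology/inventory/eng": 7,
    "/api/typology/types/by_family": 6,
}


def has_segment(path, segment):
    """True if `segment` is one whole component of a slash-separated path."""
    return segment in path.strip("/").split("/")


nope_paths = {p: n for p, n in endpoint_counts.items() if has_segment(p, "NOPE")}
nope_hits = sum(nope_paths.values())

print(len(nope_paths), nope_hits, f"{nope_hits / 502:.1%}")
```

Matching on whole segments rather than substrings avoids flagging a legitimate identifier that merely contains the letters NOPE.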
method categorical
Top values (rank 1–20)
- GET — 502
status_code numeric
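Because status_code takes only the values 200 and 429, the share of throttled requests can be recovered from the reported mean of 278.9. The sketch below is plain arithmetic on figures from this report; the function name is illustrative, and the row count is an estimate because the mean is rounded.

```python
def share_of_high_value(mean, low, high):
    """For a column with only two values, return the fraction equal to `high`."""
    return (mean - low) / (high - low)


share_429 = share_of_high_value(mean=278.9, low=200, high=429)
rows_429 = round(share_429 * 502)  # estimated number of throttled requests

print(f"{share_429:.1%} of requests, about {rows_429} rows")
```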
response_time_ms numeric
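The outlier figure for response_time_ms depends on which denominator is used. The sketch below assumes the common Tukey rule (1.5 × IQR beyond the quartiles), which may differ from saturn's exact method, and runs it on a small hypothetical sample rather than on rows from api_auth.db. It then checks the report's own figures: 78 outliers against the non-null count.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return (non_null_count, outliers) using Tukey's k * IQR fences.

    None entries are treated as nulls and excluded before computing quartiles.
    """
    observed = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return len(observed), [v for v in observed if v < low or v > high]


# Hypothetical sample for illustration only; None stands for a null.
sample_ms = [1, 2, None, 2, 3, 3, None, 4, 5, 21, 300, 1238]
n_observed, outliers = tukey_outliers(sample_ms)
print(n_observed, outliers)

# Denominator check using only figures from the report.
rows, null_rate, reported_outliers = 502, 0.3446, 78
non_null = rows - round(rows * null_rate)
outlier_share = reported_outliers / non_null
print(non_null, f"{outlier_share:.2%}")
```

Dividing 78 by all 502 rows would give about 15.5%, so the reported 23.71% only makes sense as a share of the non-null values.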
cache_hit numeric
ip_address categorical
Top values (rank 1–20)
- 127.0.0.1 — 502
user_agent categorical
Top values (rank 1–20)
- Werkzeug/3.1.4 — 215
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.7680.177 Mobile Safari/537.36 (compatible; GoogleOther) — 196
- Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; https://zhanzhang.toutiao.com/) — 26
- curl/7.81.0 — 24
- Mozilla/5.0 (iPhone; CPU iPhone OS 26_5_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/147.0.7727.99 Mobile/15E148 Safari/604.1 — 14
- Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot) — 14
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.7680.177 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) — 7
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/136.0.0.0 Safari/537.36 — 2
- Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:146.0) Gecko/20100101 Firefox/146.0 — 1
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot) — 1
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.116 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) — 1
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.137 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) — 1
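The split between tools, crawlers, and browsers can be tallied from the list above. The marker strings in this sketch are assumptions chosen to fit these 12 values; it is not a general-purpose bot detector, and the three category names are illustrative.

```python
from collections import Counter

# Hit counts copied from the user_agent top-values list.
UA_COUNTS = {
    "Werkzeug/3.1.4": 215,
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.7680.177 Mobile Safari/537.36 (compatible; GoogleOther)": 196,
    "Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; https://zhanzhang.toutiao.com/)": 26,
    "curl/7.81.0": 24,
    "Mozilla/5.0 (iPhone; CPU iPhone OS 26_5_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/147.0.7727.99 Mobile/15E148 Safari/604.1": 14,
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)": 14,
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.7680.177 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)": 7,
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/136.0.0.0 Safari/537.36": 2,
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:146.0) Gecko/20100101 Firefox/146.0": 1,
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)": 1,
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.116 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)": 1,
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.137 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)": 1,
}

# Assumed markers, matched case-insensitively.
BOT_MARKERS = ("bot", "spider", "googleother")
TOOL_PREFIXES = ("curl/", "werkzeug/")


def classify(user_agent):
    """Label a User-Agent string as 'tool', 'bot', or 'browser'."""
    ua = user_agent.lower()
    if ua.startswith(TOOL_PREFIXES):
        return "tool"
    if any(marker in ua for marker in BOT_MARKERS):
        return "bot"
    return "browser"


totals = Counter()
for ua, hits in UA_COUNTS.items():
    totals[classify(ua)] += hits

for kind, hits in totals.most_common():
    print(f"{kind:8s} {hits:4d}  {hits / 502:.1%}")
```

Under these assumptions, anything that is neither a known tool nor a self-identified crawler falls through to "browser", so that bucket is an upper bound on human traffic.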
timestamp categorical
Top values (rank 1–20)
- 2026-04-17 10:06:57 — 38
- 2026-04-17 10:07:57 — 38
- 2026-04-17 10:32:51 — 37
- 2026-04-17 10:07:09 — 31
- 2026-04-17 10:26:55 — 31
- 2026-04-17 10:06:29 — 14
- 2026-04-23 00:15:04 — 4
- 2026-04-24 01:08:10 — 4
- 2026-04-24 22:11:19 — 4
- 2026-01-06 05:06:58 — 3
- 2026-04-19 04:06:23 — 3
- 2026-01-06 05:07:53 — 2
- 2026-04-17 10:26:53 — 2
- 2026-04-17 10:27:44 — 2
- 2026-04-19 04:33:00 — 2
- 2026-04-19 07:17:09 — 2
- 2026-04-29 21:16:59 — 2
- 2026-04-30 03:37:45 — 2
- 2026-01-06 05:39:57 — 1
- 2026-01-06 09:44:58 — 1
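Since the column is stored as strings, parsing it first makes the burst on 2026-04-17 easy to quantify. The sketch below works only from the top-20 values listed above; the `BURST_THRESHOLD` of 10 repeats is a hypothetical cutoff, not something saturn reports.

```python
from collections import Counter
from datetime import datetime

# Repeat counts copied from the timestamp top-values list (rank 1-20 only).
TOP_TIMESTAMPS = {
    "2026-04-17 10:06:57": 38,
    "2026-04-17 10:07:57": 38,
    "2026-04-17 10:32:51": 37,
    "2026-04-17 10:07:09": 31,
    "2026-04-17 10:26:55": 31,
    "2026-04-17 10:06:29": 14,
    "2026-04-23 00:15:04": 4,
    "2026-04-24 01:08:10": 4,
    "2026-04-24 22:11:19": 4,
    "2026-01-06 05:06:58": 3,
    "2026-04-19 04:06:23": 3,
    "2026-01-06 05:07:53": 2,
    "2026-04-17 10:26:53": 2,
    "2026-04-17 10:27:44": 2,
    "2026-04-19 04:33:00": 2,
    "2026-04-19 07:17:09": 2,
    "2026-04-29 21:16:59": 2,
    "2026-04-30 03:37:45": 2,
    "2026-01-06 05:39:57": 1,
    "2026-01-06 09:44:58": 1,
}

BURST_THRESHOLD = 10  # hypothetical: repeats at or above this count as a burst

# Parse the strings into datetime objects so they can be compared and bucketed.
parsed = {
    datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"): n
    for ts, n in TOP_TIMESTAMPS.items()
}

bursts = {ts: n for ts, n in parsed.items() if n >= BURST_THRESHOLD}
burst_rows = sum(bursts.values())
burst_span = max(bursts) - min(bursts)  # time between first and last burst

rows_by_date = Counter()
for ts, n in parsed.items():
    rows_by_date[ts.date()] += n

print(len(bursts), burst_rows, burst_span)
print(rows_by_date.most_common(3))
```

All of the flagged timestamps fall within a single half-hour window, which is the kind of clustering that batched writes or a scripted test run would produce.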