blissapi

source /home/coolhand/data/blissapi.db · 6,181 rows · 2 columns · profiled 2026-05-01

Reading

dataset summary · high confidence anthropic:claude-opus-4-7

This dataset contains 6,181 rows and 2 columns drawn from blissapi.db, pairing a text 'keyword' field with a categorical 'symbol_count' field. Every keyword is unique (6,181 distinct values across 6,181 rows) and is a single token with no whitespace, with lengths ranging from 2 to 72 characters (median 12, mean 14.5, 95th percentile 31). The 'symbol_count' column is constant at the value '1', so it carries no information for analysis. Keyword length is essentially the only varying signal in the data, so its distribution is the most useful first look.

citing: row_count · column_count · columns[keyword].n_unique · columns[keyword].stats.one_word_rate · columns[keyword].stats.len_min · columns[keyword].stats.len_max · columns[keyword].stats.len_median · columns[keyword].stats.len_mean · columns[keyword].stats.len_p95 · columns[symbol_count].n_unique · columns[symbol_count].stats.top_value · columns[symbol_count].stats.top_rate
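The summary figures above could be recomputed along the following lines. This is a minimal sketch, not the profiler's own code: the table name `keywords` and the helper `profile_keywords` are hypothetical (the report names only the database file and the two columns), and the in-memory database holds just the two example keywords quoted elsewhere in this report as a stand-in for blissapi.db.

```python
import sqlite3
import statistics


def profile_keywords(conn, table="keywords"):
    """Recompute a few of the cited summary statistics.

    `table` is a hypothetical name; the report does not state it.
    """
    rows = conn.execute(f"SELECT keyword, symbol_count FROM {table}").fetchall()
    keywords = [keyword for keyword, _ in rows]
    lengths = [len(keyword) for keyword in keywords]
    return {
        "row_count": len(rows),
        "keyword_n_unique": len(set(keywords)),
        "len_min": min(lengths),
        "len_max": max(lengths),
        "len_median": statistics.median(lengths),
        "len_mean": statistics.mean(lengths),
        # A value counts as "one word" if it has exactly one whitespace-delimited token.
        "one_word_rate": sum(len(k.split()) == 1 for k in keywords) / len(keywords),
        "symbol_count_n_unique": len({count for _, count in rows}),
    }


# Tiny in-memory stand-in for blissapi.db (two sample rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (keyword TEXT, symbol_count TEXT)")
conn.executemany(
    "INSERT INTO keywords VALUES (?, ?)",
    [("tangerine_clementine_mandarin", "1"), ("cns_injury", "1")],
)
summary = profile_keywords(conn)
print(summary)
```

Pointing `sqlite3.connect` at the real file and passing the actual table name should yield the figures cited above, assuming the profiler measures length in characters as `len` does.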

Charts the summary said to look at first

keyword · Distribution of keyword character lengths to see the spread from 2 up to 72 characters.

Character-length distribution for keyword (mean: 14.53).
chars	count
2 – 4	84
4 – 6	531
6 – 7	641
7 – 9	374
9 – 11	774
11 – 12	696
12 – 14	588
14 – 16	268
16 – 18	438
18 – 20	395
20 – 21	267
21 – 23	132
23 – 25	223
25 – 26	172
26 – 28	142
28 – 30	61
30 – 32	96
32 – 34	62
34 – 35	56
35 – 37	26
37 – 39	41
39 – 40	24
40 – 42	26
42 – 44	14
44 – 46	10
46 – 48	9
48 – 49	5
49 – 51	3
51 – 53	6
53 – 54	3
54 – 56	0
56 – 58	3
58 – 60	2
60 – 62	1
62 – 63	6
63 – 65	0
65 – 67	0
67 – 68	0
68 – 70	1
70 – 72	1
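The uneven-looking bin labels in this table (for example "6 – 7" next to "7 – 9") are consistent with 40 equal-width bins of 1.75 characters spanning 2 to 72, with the edges rounded to integers for display. That reading is an inference from the labels, not something the report states; the sketch below rebuilds the labels under that assumption, with `bin_labels` as a hypothetical helper name.

```python
def bin_labels(lo=2, hi=72, n_bins=40):
    """Rebuild the displayed bin labels, assuming equal-width bins.

    The bin count and the rounding rule are assumptions inferred from the table.
    """
    width = (hi - lo) / n_bins  # 1.75 characters per bin
    # Python's round() rounds halves to the nearest even integer (12.5 -> 12),
    # which happens to reproduce the labels shown in the table.
    edges = [round(lo + i * width) for i in range(n_bins + 1)]
    return [f"{a} – {b}" for a, b in zip(edges, edges[1:])]


labels = bin_labels()
print(labels[:6])  # ['2 – 4', '4 – 6', '6 – 7', '7 – 9', '9 – 11', '11 – 12']
```

Because the true edges are fractional, a label such as "6 – 7" actually covers 5.5 to 7.25, so neighbouring counts are not directly comparable by label width alone.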

keyword · Histogram view of keyword lengths to spot the typical range around the median of 12.

symbol_count · Confirms that symbol_count is a constant single category ('1') with no variation.

Top values for symbol_count (1 unique shown, of 1 total).
value	count	share
1	6181	100.0%

Schema

2 columns

Per-column summary.
column	type	nulls	unique	Alerts
keyword	text	0.0%	6,181	near_unique one_word
symbol_count	categorical	0.0%	1	imbalance

keyword

text identifier near_unique one_word

This column is a single-token keyword or concept tag, and every one of the 6,181 rows holds a unique value (n_unique = 6181, duplicate_rate = 0.0, one_word_rate = 1.0). Values are mostly short (len_mean 14.5, len_median 12, len_p95 31), and many are compound forms joined by underscores, such as 'tangerine_clementine_mandarin' or 'cns_injury', which suggests a controlled vocabulary of concept labels rather than free text. Because every value is unique, the column behaves like an identifier for distinct concepts, not a categorical feature. Treatment: treat as a concept key; split underscore-joined tokens and embed them if semantic similarity is needed, otherwise leave the column out of modelling. high · anthropic:claude-opus-4-7
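The underscore split suggested in the treatment can be sketched as follows. The helper name `split_concept` is hypothetical, and the two keywords are the examples quoted above. The loop also shows why one_word_rate is 1.0 despite the compounds: a whitespace-based word count sees each keyword as a single token.

```python
def split_concept(keyword):
    """Split an underscore-joined concept label into its component terms."""
    return [part for part in keyword.split("_") if part]


for keyword in ["tangerine_clementine_mandarin", "cns_injury"]:
    whitespace_tokens = len(keyword.split())  # 1 for every underscore-joined keyword
    print(keyword, whitespace_tokens, split_concept(keyword))
```

The component terms, rather than the raw keys, are what one would typically pass to an embedding model if semantic similarity is the goal.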

n: 6,181
nulls: 0 (0.0%)
unique: 6,181
len_min: 2
len_max: 72
len_mean: 14.53
len_median: 12
len_p95: 31
word_mean: 1
word_median: 1
n_empty: 0
n_duplicates: 0
duplicate_rate: 0
vocab_size: 6,181
readability_flesch_mean: -75.9
emoji_rate: 0
url_rate: 0
one_word_rate: 1
allcaps_rate: 0
boilerplate_rate: 0
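The strongly negative readability_flesch_mean is what the standard Flesch Reading Ease formula gives for isolated long tokens. The sketch below assumes the profiler scores each keyword as one sentence containing one word; that is an assumption, as is its syllable-counting method, which the report does not describe. Under that assumption the score is linear in syllables per word, so the mean score can be inverted to an implied mean syllable count.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def implied_syllables_per_word(score, words_per_sentence=1.0):
    """Invert the formula for syllables per word at a fixed sentence length."""
    return (206.835 - 1.015 * words_per_sentence - score) / 84.6


# Assumption: each keyword is treated as a one-word sentence.
implied = implied_syllables_per_word(-75.9)
print(round(implied, 2))  # 3.33
```

An average of roughly 3.3 counted syllables per keyword is plausible for underscore-joined compounds, which is a reason to treat this statistic as an artifact of the token shape instead of a readability signal.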

symbol_count

categorical metadata imbalance

This column records a symbol count, but every one of the 6,181 rows holds the value "1" (top_rate 1.0, cardinality 1, entropy 0). There are no nulls, only a single constant, so the column carries no information; the imbalance alert reflects this. Treatment: drop; constant column with zero entropy. high · anthropic:claude-opus-4-7

n: 6,181
nulls: 0 (0.0%)
unique: 1
top_value: 1
top_rate: 1
cardinality: 1
entropy: 0
entropy_ratio: 0
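The statistics above follow directly from the value counts. The sketch below uses Shannon entropy in bits and defines entropy_ratio as entropy divided by log2(cardinality), set to 0 when there is only one category; the profiler's exact definition of entropy_ratio is not given in this report, so that part is an assumption, as is the helper name `column_entropy_stats`.

```python
import math
from collections import Counter


def column_entropy_stats(values):
    """Top value, top rate, cardinality, and Shannon entropy (bits) of a column."""
    counts = Counter(values)
    n = len(values)
    entropy = sum(-(c / n) * math.log2(c / n) for c in counts.values())
    cardinality = len(counts)
    # Assumed normalisation: divide by the maximum entropy log2(cardinality).
    max_entropy = math.log2(cardinality) if cardinality > 1 else 0.0
    top_value, top_count = counts.most_common(1)[0]
    return {
        "top_value": top_value,
        "top_rate": top_count / n,
        "cardinality": cardinality,
        "entropy": entropy,
        "entropy_ratio": entropy / max_entropy if max_entropy else 0.0,
    }


# symbol_count as described in this report: 6,181 rows, all holding "1".
stats = column_entropy_stats(["1"] * 6181)
print(stats)
```

With a single category the only probability is 1, whose logarithm is 0, so the entropy is zero whatever the row count.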