data trove steam users
Reading
This is a single-row metadata record describing the Steam Users dataset, a collection of 14,306,064 Steam user profiles sourced from Steam Store data (likely via Kaggle or SteamSpy) and last updated on 2025-01-20. It is a data catalogue entry rather than an analytical dataset: it points analysts to the actual user data file (185 MB), which links to recommendations.csv via user_id. The key point is scale: more than 14 million user profiles covering library size and review activity. To get the full relational value of the dataset, analysts should locate recommendations.csv and join it to the user file on user_id.
citing: record_count.max · notes.top_value · source.top_value · last_updated.top_value · row_count
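The join recommended in the summary can be sketched as follows. This is a minimal illustration using only the Python standard library, not the dataset's documented schema: `user_id` and `recommendations.csv` come from the metadata record, while every other column name (`products`, `reviews`, `app_id`, `is_recommended`) and the in-memory sample data are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for the real files. Only `user_id` is named in the
# metadata record; every other column name here is a hypothetical placeholder.
USERS_CSV = """user_id,products,reviews
1,120,3
2,15,0
"""
RECS_CSV = """app_id,user_id,is_recommended
10,1,true
20,1,false
30,3,true
"""

def join_on_user_id(users_file, recs_file):
    """Left-join recommendation rows onto user rows by user_id."""
    # Index recommendations by user_id so each user lookup is a dict access.
    recs_by_user = defaultdict(list)
    for rec in csv.DictReader(recs_file):
        recs_by_user[rec["user_id"]].append(rec)

    joined = []
    for user in csv.DictReader(users_file):
        matches = recs_by_user.get(user["user_id"], [])
        if not matches:
            # Keep users with no recommendations (left-join semantics).
            joined.append({**user, "app_id": None, "is_recommended": None})
        for rec in matches:
            joined.append({**user, **rec})
    return joined

rows = join_on_user_id(io.StringIO(USERS_CSV), io.StringIO(RECS_CSV))
for row in rows:
    print(row)
```

At the full 14.3 million profiles, holding an index in memory may be impractical; a dataframe library or a database would likely be a better fit, with the same join logic.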
Charts the summary said to look at first
dataset_name

| value | count | share |
|---|---|---|
| Steam Users | 1 | 100.0% |

source

| value | count | share |
|---|---|---|
| Steam Store user data (likely via Kaggle or SteamSpy) | 1 | 100.0% |

last_updated

| value | count | share |
|---|---|---|
| 2025-01-20 | 1 | 100.0% |

notes

| value | count | share |
|---|---|---|
| 14.3 million Steam user profiles with library size and review activity. File is 185 MB. Links to recommendations.csv via user_id. | 1 | 100.0% |
Schema
6 columns

| Column | Kind | Nulls | Unique | Alerts |
|---|---|---|---|---|
| dataset_name | categorical | 0.0% | 1 | long_tail, imbalance |
| last_updated | categorical | 0.0% | 1 | long_tail, imbalance |
| source | categorical | 0.0% | 1 | long_tail, imbalance |
| record_count | numeric | 0.0% | 1 | constant |
| fields | unknown | 0.0% | — | skipped |
| notes | categorical | 0.0% | 1 | long_tail, imbalance |
dataset_name
categorical · metadata · long_tail · imbalance

This column is a dataset-level identifier or metadata tag indicating the source dataset, with every row labelled 'Steam Users'. With only 1 row and 1 unique value, the column carries zero entropy (0.0) and a top_rate of 1.0: it is a constant and provides no discriminative information. The long_tail and imbalance alerts are technically correct but trivially explained by the single-row, single-value nature of the data.

Treatment: Drop before modelling; constant column with no variance and only 1 row.

- n: 1
- nulls: 0 (0.0%)
- unique: 1
- top_value: Steam Users
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
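To make the zero-entropy reading concrete, the sketch below recomputes these statistics for a single-value column. It assumes Shannon entropy in bits and an entropy_ratio normalised by log2(cardinality), defined as 0 when there is only one distinct value; the profiler's exact definitions are not stated in this report, so treat both as assumptions. The function expects a non-empty list.

```python
import math
from collections import Counter

def categorical_stats(values):
    """Return top_rate, cardinality, entropy (bits) and entropy_ratio."""
    counts = Counter(values)
    n = len(values)
    cardinality = len(counts)
    top_rate = max(counts.values()) / n
    # Shannon entropy in bits: -sum(p * log2(p)) over the observed values.
    entropy = sum(-(c / n) * math.log2(c / n) for c in counts.values())
    # Assumed normalisation: divide by the maximum entropy log2(cardinality);
    # defined as 0 for a single distinct value to avoid dividing by zero.
    max_entropy = math.log2(cardinality) if cardinality > 1 else 0.0
    entropy_ratio = entropy / max_entropy if max_entropy else 0.0
    return {
        "top_rate": top_rate,
        "cardinality": cardinality,
        "entropy": entropy,
        "entropy_ratio": entropy_ratio,
    }

stats = categorical_stats(["Steam Users"])
print(stats)
```

With one observation, the only probability is 1 and log2(1) is 0, so any single-row categorical column reports entropy 0 and top_rate 1 regardless of its content.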
last_updated
categorical · timestamp · long_tail · imbalance

This column is a timestamp or date field indicating when a record was last updated, stored as a categorical string. The dataset contains only a single row (n=1), and that row holds the value '2025-01-20', giving a cardinality of 1 and top_rate of 1.0. With only one observation, no distributional insight is possible; the long_tail and imbalance alerts are artefacts of the trivial sample size rather than meaningful signals.

Treatment: Parse to a proper date type; defer any temporal analysis until the full dataset is loaded, as the current sample has only 1 row.

- n: 1
- nulls: 0 (0.0%)
- unique: 1
- top_value: 2025-01-20
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
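Parsing the string to a date type, as the treatment suggests, could look like the sketch below. It assumes the value always uses the ISO `YYYY-MM-DD` form seen in the single row; the helper name `parse_last_updated` is hypothetical.

```python
from datetime import date

def parse_last_updated(raw):
    """Parse an ISO YYYY-MM-DD string; raises ValueError on any other form."""
    return date.fromisoformat(raw.strip())

raw_last_updated = "2025-01-20"  # value taken from the metadata record
last_updated = parse_last_updated(raw_last_updated)
print(last_updated.year, last_updated.month, last_updated.day)
```

Letting `ValueError` propagate is a deliberate choice here: a malformed date in a one-row metadata record is better surfaced than silently coerced.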
source
categorical · metadata · long_tail · imbalance

This column records the data provenance or source attribution for the dataset, with every single row carrying the identical value 'Steam Store user data (likely via Kaggle or SteamSpy)'. With n=1, cardinality=1, entropy=0.0, and top_rate=1.0, there is zero variance: this is a constant column. It adds no analytical signal and likely exists as a metadata annotation injected during data collection or curation.

Treatment: Drop before modelling; a zero-variance constant column carries no predictive information.

- n: 1
- nulls: 0 (0.0%)
- unique: 1
- top_value: Steam Store user data (likely via Kaggle or SteamSpy)
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
record_count
numeric · metadata · constant

This column is a record count field, almost certainly a metadata scalar reporting the total row count of a source dataset, here fixed at 14,306,064 across all rows. With n=1, n_unique=1, and a constant value equal to mean, min, and max, there is zero variance; saturn has flagged it as 'constant'. This is not a feature or target but a summary statistic embedded in the dataset, likely from an ETL or export header row.

Treatment: Drop before modelling; store separately as a provenance scalar if needed.

- n: 1
- nulls: 0 (0.0%)
- unique: 1
- min: 1.431e+07
- max: 1.431e+07
- mean: 1.431e+07
- median: 1.431e+07
- std: 0
- q1: 1.431e+07
- q3: 1.431e+07
- iqr: 0
- skew: 0
- kurtosis: 0
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
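One possible use of record_count as a provenance scalar is to check it against the referenced user file once that file is located. The sketch below streams a CSV and compares the row count with the declared value. The sample data, its column names, and the helper names are hypothetical; the real file would be opened with `open(path, newline="")` and compared against 14,306,064.

```python
import csv
import io

def count_data_rows(csv_file):
    """Count rows after the header by streaming, without loading the file."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if any
    return sum(1 for _ in reader)

def matches_declared_count(csv_file, declared_count):
    """Compare the streamed row count with the declared record_count."""
    return count_data_rows(csv_file) == declared_count

# Hypothetical three-row sample standing in for the 185 MB users file.
sample = io.StringIO("user_id,products,reviews\n1,120,3\n2,15,0\n3,7,1\n")
provenance = {"record_count": 3}  # would be 14_306_064 for the real file
ok = matches_declared_count(sample, provenance["record_count"])
print(ok)
```

A mismatch would suggest a truncated download or a catalogue entry that is out of date relative to the file.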
fields
unknown · other · skipped

This column contains only a single row and was skipped by the profiler, yielding no distributional statistics. With n=1 and no uniqueness or type information available, no meaningful inference about its content or role is possible beyond the fact that it is non-null. The 'unknown' kind designation and empty stats block indicate the profiler could not parse or classify the value.

Treatment: Manually inspect the single row value to determine type and role before any downstream use.

- n: 1
- nulls: 0 (0.0%)
- unique: —
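The manual inspection could start with a small helper like the one below, which tries to decode the raw cell as JSON and otherwise falls back to a plain string. The actual content of `fields` is not shown in this report, so the example value (a JSON list of column names) is purely an assumption, as is the helper name `classify_value`.

```python
import json

def classify_value(raw):
    """Return (kind, parsed) for a raw cell value."""
    if not isinstance(raw, str):
        # Already a structured Python object; report its type as-is.
        return type(raw).__name__, raw
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON: treat it as free text.
        return "str", raw
    return type(parsed).__name__, parsed

# Hypothetical example value; the real `fields` cell may look different.
kind, parsed = classify_value('["user_id", "products", "reviews"]')
print(kind, parsed)
```

If the value turns out to be a list or mapping, that would explain why a column-oriented profiler skipped it rather than assigning a categorical or numeric kind.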
notes
categorical · metadata · long_tail · imbalance

This column is a dataset-level metadata note, not a real data column: it contains a single static string describing the dataset itself (14.3 million Steam user profiles, file size 185 MB, join key user_id). With n=1 and cardinality=1, it appears to be a singleton annotation row or a schema-level descriptor accidentally included in the profiled data. The entropy of 0.0 and top_rate of 1.0 confirm it carries zero analytical signal.

Treatment: Drop before modelling; use the embedded join hint (user_id → recommendations.csv) as a schema reference only.

- n: 1
- nulls: 0 (0.0%)
- unique: 1
- top_value: 14.3 million Steam user profiles with library size and review activity. File is 185 MB. Links to recommendations.csv via user_id.
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0