data trove solar system planets
Reading
This dataset contains orbital and physical characteristics of all 8 planets in the Solar System, sourced from NASA JPL Horizons on 2026-01-19. The most striking feature is the extreme spread in planetary mass: values range from 0.0553 to 317.8 Earth masses, and 2 outliers (25% outlier rate) pull the mean far above the median of 7.75, a clear sign that Jupiter and Saturn dominate. Rotation period is equally dramatic, with a mean of -22.7 days and a minimum of -243.025 days, reflecting both retrograde rotation (Venus) and the very slow spin of the planets at either end of the range (-243.025 and 58.65 days). The dataset splits cleanly into 4 Inner Planets and 4 Outer Planets. Ring data (has_rings, ring radii) is populated for only 1 planet (87.5% null rate), consistent with Saturn being the sole ringed entry recorded.
citing: mass_earth.stats.max · mass_earth.stats.min · mass_earth.stats.median · mass_earth.n_outliers · mass_earth.outlier_rate · rotation_period_days.stats.mean · rotation_period_days.stats.min · rotation_period_days.n_outliers · has_rings.null_rate · classification.top_values · type.top_values · orbital_period_years.stats.max · orbital_period_years.stats.min · diameter_km.stats.max · diameter_km.stats.min
Charts the summary said to look at first
mass_earth (histogram)
| bin | count |
|---|---|
| 0.0553 – 63.6 | 6 |
| 63.6 – 127.2 | 1 |
| 127.2 – 190.7 | 0 |
| 190.7 – 254.3 | 0 |
| 254.3 – 317.8 | 1 |
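The bin edges in the table above are consistent with five equal-width bins spanning the mass_earth range. The sketch below reproduces the counts under two assumptions that the report does not confirm: that the profiler uses equal-width binning with a closed last bin, and that the eight masses are the rounded per-planet values shown (they are not listed in the report, but they agree with its min, max, mean, and median).

```python
# Assumed per-planet masses in Earth masses (not listed in the report itself).
MASS_EARTH = {
    "Mercury": 0.0553, "Venus": 0.815, "Earth": 1.0, "Mars": 0.107,
    "Jupiter": 317.8, "Saturn": 95.2, "Uranus": 14.5, "Neptune": 17.1,
}

def equal_width_histogram(values, n_bins=5):
    """Return (edges, counts) for n_bins equal-width bins; the last bin is closed."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum lands in the last bin, not past it.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return edges, counts

edges, counts = equal_width_histogram(list(MASS_EARTH.values()))
print([round(e, 1) for e in edges])
print(counts)
```

Six of the eight planets share the first bin, which is why a log scale is usually a better choice for plotting this column.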
type (value counts)
| value | count | share |
|---|---|---|
| terrestrial | 4 | 50.0% |
| gas_giant | 2 | 25.0% |
| ice_giant | 2 | 25.0% |
orbital_period_years (histogram)
| bin | count |
|---|---|
| 0.2408 – 33.15 | 6 |
| 33.15 – 66.06 | 0 |
| 66.06 – 98.97 | 1 |
| 98.97 – 131.9 | 0 |
| 131.9 – 164.8 | 1 |
rotation_period_days (histogram)
| bin | count |
|---|---|
| -243 – -182.7 | 1 |
| -182.7 – -122.4 | 0 |
| -122.4 – -62.02 | 0 |
| -62.02 – -1.688 | 0 |
| -1.688 – 58.65 | 7 |
diameter_km (histogram)
| bin | count |
|---|---|
| 4,879 – 32,500 | 4 |
| 32,500 – 60,120 | 2 |
| 60,120 – 87,740 | 0 |
| 87,740 – 115,400 | 0 |
| 115,400 – 143,000 | 2 |
Schema
20 columns

| Column | Type | Null rate | Unique | Alerts |
|---|---|---|---|---|
| name | categorical | 0.0% | 8 | long_tail |
| classification | categorical | 0.0% | 2 | |
| type | categorical | 0.0% | 3 | |
| diameter_km | numeric | 0.0% | 8 | |
| mass_earth | numeric | 0.0% | 8 | outliers |
| semi_major_axis_au | numeric | 0.0% | 8 | outliers |
| eccentricity | numeric | 0.0% | 8 | outliers |
| inclination_deg | numeric | 0.0% | 8 | outliers |
| ascending_node_deg | numeric | 0.0% | 8 | |
| argument_perihelion_deg | numeric | 0.0% | 8 | |
| mean_anomaly_deg | numeric | 0.0% | 8 | |
| orbital_period_years | numeric | 0.0% | 8 | outliers |
| rotation_period_days | numeric | 0.0% | 8 | high_skew, outliers |
| perihelion_distance_au | numeric | 0.0% | 8 | outliers |
| aphelion_distance_au | numeric | 0.0% | 8 | outliers |
| has_rings | categorical | 87.5% | 1 | long_tail, null_rate, imbalance |
| ring_inner_radius_km | numeric | 87.5% | 1 | null_rate, constant |
| ring_outer_radius_km | numeric | 87.5% | 1 | null_rate, constant |
| data_source | categorical | 0.0% | 1 | imbalance |
| fetch_date | categorical | 0.0% | 1 | imbalance |
name
categorical · identifier · long_tail
This column holds the names of the eight planets, each appearing exactly once (cardinality 8, n 8, null_rate 0.0). The entropy_ratio of 1.0 reflects a perfectly uniform distribution in which every value is unique, so the column is a row identifier rather than a grouping label. The long_tail alert is an artefact of that uniformity, not a genuine distributional concern. Treatment: Use as a row label or index key; do not one-hot encode or use as a categorical feature.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- top_value: Mercury
- top_rate: 0.125
- cardinality: 8
- entropy: 3
- entropy_ratio: 1
classification
categorical · label
This column classifies planets as 'Inner Planet' or 'Outer Planet', consistent with standard Solar System taxonomy. The distribution is perfectly balanced, with 4 rows in each class, giving the maximum entropy_ratio of 1.0. Because n=8 covers all eight planets, this is an exhaustive set rather than a sample, and the 50/50 split simply reflects the known 4-inner / 4-outer structure. Treatment: Use as a grouping variable, or encode as 0/1 if a binary target is needed; with only 8 rows, any model fitted to it is illustrative at best.
- n: 8
- nulls: 0 (0.0%)
- unique: 2
- top_value: Inner Planet
- top_rate: 0.5
- cardinality: 2
- entropy: 1
- entropy_ratio: 1
type
categorical · label
This column classifies planets into three physical types: terrestrial, gas_giant, and ice_giant. 'terrestrial' accounts for 50% (4 of 8), while gas_giant and ice_giant each appear exactly twice, which mirrors the actual composition of the Solar System's 8 planets. Treatment: One-hot encode for modelling; the 8 rows are the complete population rather than a sample, but with only 3 categories and 8 rows any fitted model will have very little statistical power.
- n: 8
- nulls: 0 (0.0%)
- unique: 3
- top_value: terrestrial
- top_rate: 0.5
- cardinality: 3
- entropy: 1.5
- entropy_ratio: 0.9464
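The entropy and entropy_ratio figures for the categorical columns can be reproduced by hand. The sketch below assumes the profiler reports Shannon entropy in bits and normalises by log2 of the cardinality; the report does not state this, but the assumption matches the values shown for name (3 and 1), classification (1 and 1), and type (1.5 and 0.9464).

```python
import math

def entropy_bits(counts):
    """Shannon entropy, in bits, of a list of category counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def entropy_ratio(counts):
    """Entropy divided by its maximum, log2(k), for k observed categories."""
    k = sum(1 for c in counts if c > 0)
    if k <= 1:
        return 0.0  # a single category carries no uncertainty
    return entropy_bits(counts) / math.log2(k)

type_counts = [4, 2, 2]  # terrestrial, gas_giant, ice_giant
print(entropy_bits(type_counts), round(entropy_ratio(type_counts), 4))
```

A ratio near 1 means the categories are close to evenly populated; a ratio of 0 means the column is constant.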
diameter_km
numeric · feature
This column gives planetary diameter in kilometres; the 8 unique values across n=8 rows (no nulls, no duplicates) correspond to the eight recognised planets. The range spans 4,879 km (Mercury) to 142,984 km (Jupiter), with a mean of 50,087 km and a median of 31,142 km, reflecting moderate right skew (0.85) as the gas giants pull the mean upward. Kurtosis is slightly negative (−0.86), indicating a flat, spread-out distribution rather than a peaked one, as expected given the size gap between planetary classes. Treatment: Log-transform before regression or distance-based modelling to compress the wide range between terrestrial and gas giant values.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 4,879
- max: 142,984
- mean: 5.009e+04
- median: 31,142
- std: 5.392e+04
- q1: 10,776
- q3: 6.847e+04
- iqr: 5.77e+04
- skew: 0.8487
- kurtosis: -0.8589
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
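To make the log-transform recommendation concrete, the sketch below applies log10 to the reported minimum, median, and maximum of diameter_km. The helper name `log10_transform` is hypothetical, and the choice of base 10 is an assumption; any logarithm base compresses the range in the same way.

```python
import math

def log10_transform(values):
    """Return log10 of each value; raises if any value is not strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("log10 is only defined for strictly positive values")
    return [math.log10(v) for v in values]

# Minimum, median, and maximum of diameter_km as reported in the stats above.
reported = [4879.0, 31142.0, 142984.0]
logged = log10_transform(reported)

raw_ratio = reported[-1] / reported[0]   # spread on the original scale
log_span = logged[-1] - logged[0]        # spread in decades after the transform
print(f"max/min ratio: {raw_ratio:.1f}, log10 span: {log_span:.2f} decades")
```

The guard against non-positive input matters for this dataset: the same helper would reject rotation_period_days, which contains negative values.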
mass_earth
numeric · feature · outliers
This column gives planetary mass in Earth masses (M⊕) for the eight planets. The distribution is strongly right-skewed (skew=1.95, kurtosis=2.19): the median is only 7.75 M⊕ while the mean is 55.82 M⊕, driven by the two flagged outliers, the larger of which is the maximum of 317.8 M⊕ (Jupiter). The IQR of 35.99 against a std of 110.61 confirms the heavy upper tail. Treatment: Log-transform before regression or distance-based modelling to compress the heavy right tail.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0.0553
- max: 317.8
- mean: 55.82
- median: 7.75
- std: 110.6
- q1: 0.638
- q3: 36.62
- iqr: 35.99
- skew: 1.945
- kurtosis: 2.188
- n_outliers: 2
- outlier_rate: 0.25
- zero_rate: 0
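The report does not say how outliers are flagged. A common choice is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles, and the sketch below shows that this rule reproduces n_outliers = 2 for mass_earth. Both the rule and the per-planet masses are assumptions; the masses are not listed in the report but agree with its q1, median, and q3.

```python
import statistics

# Assumed per-planet masses in Earth masses (not listed in the report itself).
MASS_EARTH = [0.0553, 0.815, 1.0, 0.107, 317.8, 95.2, 14.5, 17.1]

def tukey_outliers(values, k=1.5):
    """Return (q1, q3, outliers) using fences at k * IQR beyond the quartiles."""
    # method="inclusive" interpolates linearly between order statistics.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return q1, q3, [v for v in values if v < lower or v > upper]

q1, q3, outliers = tukey_outliers(MASS_EARTH)
print(q1, q3, sorted(outliers))
```

Under these assumptions the two flagged values are 95.2 and 317.8, that is, Saturn and Jupiter.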
semi_major_axis_au
numeric · feature · outliers
This column contains the semi-major axis in astronomical units (AU) for the eight planets, from a minimum of ~0.387 AU (Mercury) to a maximum of ~30.07 AU (Neptune). The distribution is right-skewed (skew = 1.15), with the mean (8.45 AU) pulled well above the median (3.36 AU), and 1 outlier (Neptune at 30.07 AU) is flagged, which is unsurprising given the roughly exponential spacing of the outer planets. With 8 rows, 8 unique values, and no nulls, this is essentially a lookup table. Treatment: Log-transform before regression to compress the wide outer-planet range.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0.3871
- max: 30.07
- mean: 8.454
- median: 3.363
- std: 10.84
- q1: 0.9308
- q3: 11.95
- iqr: 11.02
- skew: 1.147
- kurtosis: -0.105
- n_outliers: 1
- outlier_rate: 0.125
- zero_rate: 0
eccentricity
numeric · feature · outliers
This column contains orbital eccentricity for the eight planets. Values range from 0.00677672 to 0.20563593, consistent with known planetary eccentricities (Mercury is ~0.206, and most other orbits are nearly circular, with values close to 0). The distribution is right-skewed (skew=1.49) with one flagged outlier (outlier_rate=0.125, i.e. 1 of 8 rows), almost certainly the maximum of 0.20563593, which is a physically meaningful extreme rather than a data error. Treatment: Use as-is for orbital mechanics modelling; the outlier is physically valid and should not be winsorized.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0.006777
- max: 0.2056
- mean: 0.06008
- median: 0.04782
- std: 0.06548
- q1: 0.01468
- q3: 0.06374
- iqr: 0.04906
- skew: 1.495
- kurtosis: 1.165
- n_outliers: 1
- outlier_rate: 0.125
- zero_rate: 0
inclination_deg
numeric · feature · outliers
This column records orbital inclination in degrees for the eight planets. The middle half of the values lies between ~1.2° and ~2.7° (the IQR), but the maximum of 7.00° is flagged as an outlier and pulls the mean (2.32°) above the median (1.81°), producing notable right skew (1.32). The minimum of 1.531e-05° is effectively zero, which suggests that inclination is measured relative to the ecliptic, Earth's own orbital plane. With 8 unique values and no nulls, the column is clean, but a single high-inclination planet (≈7°, consistent with Mercury) dominates the distributional shape. Treatment: Use as-is. A plain log-transform is a poor fit because the near-zero minimum would become an extreme negative value, so prefer log1p if the skew must be reduced. The 7.00° outlier is physically plausible and should not be treated as an error.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 1.531e-05
- max: 7.005
- mean: 2.323
- median: 1.81
- std: 2.154
- q1: 1.171
- q3: 2.713
- iqr: 1.542
- skew: 1.32
- kurtosis: 0.9325
- n_outliers: 1
- outlier_rate: 0.125
- zero_rate: 0
ascending_node_deg
numeric · feature
This column gives the longitude of the ascending node in degrees, the orbital element that defines where an orbit crosses the reference plane heading north. All 8 values are unique and there are no nulls. The minimum is exactly 0.0 (zero_rate 0.125, i.e. one record); this is most likely Earth, whose orbit defines the ecliptic reference plane so that its node is undefined and set to 0 by convention, but that should be confirmed against the row itself. The remaining values are spread fairly evenly up to 131.8°, with mild negative skew (-0.36) and a slightly platykurtic shape (-0.63). Treatment: Use as-is in orbital mechanics modelling; consider sine/cosine encoding in a circular-aware ML pipeline, since angles are periodic over 360°.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0
- max: 131.8
- mean: 74.31
- median: 75.35
- std: 42.01
- q1: 49.25
- q3: 103.8
- iqr: 54.52
- skew: -0.3594
- kurtosis: -0.6349
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0.125
argument_perihelion_deg
numeric · feature
This column gives the argument of perihelion in degrees, the angle measured in the orbital plane from the ascending node to the point of closest approach to the Sun. All 8 values are unique and span most of the 0–360° range (min 29.13°, max 339.39°), which is physically expected because the planets' orbits share no common orientation. The distribution is almost perfectly symmetric (skew ≈ −0.016) and flat (kurtosis −1.72), consistent with a quasi-uniform spread and no clustering. The wide IQR of 190.55° relative to the 310.26° total range confirms this. Treatment: Use as-is, or encode as a sine/cosine pair (after converting to radians) to preserve angular periodicity before modelling.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 29.13
- max: 339.4
- mean: 182.1
- median: 187.9
- std: 122.7
- q1: 86.47
- q3: 277
- iqr: 190.6
- skew: -0.01585
- kurtosis: -1.725
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
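The sine/cosine encoding suggested for the angular columns can be sketched as follows. The function names are hypothetical; the point is that two angles on either side of the 0°/360° wrap end up close together in the encoded space, which a raw difference in degrees does not capture.

```python
import math

def encode_angle(deg):
    """Map an angle in degrees to a (sin, cos) pair on the unit circle."""
    rad = math.radians(deg % 360.0)
    return math.sin(rad), math.cos(rad)

def chord_distance(deg_a, deg_b):
    """Euclidean distance between two encoded angles (0 for equal, 2 for opposite)."""
    sin_a, cos_a = encode_angle(deg_a)
    sin_b, cos_b = encode_angle(deg_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)

# 339.39 and 29.13 are the reported max and min of argument_perihelion_deg.
# Their raw difference is 310.26 degrees, but across the wrap they are
# only about 49.7 degrees apart.
print(chord_distance(339.39, 29.13))
```

The same encoding applies unchanged to ascending_node_deg and mean_anomaly_deg.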
mean_anomaly_deg
numeric · feature
This column gives the mean anomaly in degrees, the fraction of an orbital period elapsed since perihelion expressed as an angle from 0° to 360°. All 8 values are unique and there are no nulls. Values span 19.39° to 317.02°, with a large IQR of 152.56° and a std of 109.84°, indicating a wide, near-uniform spread across the angular range, consistent with the platykurtic kurtosis of −1.07 and mild positive skew of 0.48. Treatment: Use as-is, or convert to a sine/cosine pair to preserve circular periodicity before modelling.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 19.39
- max: 317
- mean: 135
- median: 121.4
- std: 109.8
- q1: 42.59
- q3: 195.2
- iqr: 152.6
- skew: 0.478
- kurtosis: -1.068
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
orbital_period_years
numeric · feature · outliers
This column holds the orbital period in years for the eight planets, from 0.2408467 years (roughly 88 days, consistent with Mercury) up to 164.79132 years (consistent with Neptune). The distribution is strongly right-skewed (skew = 1.49), with a mean of 36.73 years pulled well above the median of 6.87 years, and 1 outlier (12.5% of rows) at the high end, almost certainly Neptune at 164.79 years. The IQR of 42.19 and std of 59.08 both reflect the wide spread driven by the outer planets. Treatment: Log-transform before regression or distance-based modelling to compress the heavy right tail.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0.2408
- max: 164.8
- mean: 36.73
- median: 6.872
- std: 59.08
- q1: 0.9038
- q3: 43.09
- iqr: 42.19
- skew: 1.486
- kurtosis: 0.764
- n_outliers: 1
- outlier_rate: 0.125
- zero_rate: 0
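Orbital period and semi-major axis are not independent columns. For bodies orbiting the Sun, Kepler's third law gives T² = a³ with T in years and a in AU, which offers a quick consistency check between the two. The sketch below applies it to the reported extremes, assuming that the minimum of each column belongs to Mercury and the maximum to Neptune, as the column descriptions suggest.

```python
def kepler_period_years(a_au):
    """Orbital period in years from semi-major axis in AU (T^2 = a^3, solar orbit)."""
    return a_au ** 1.5

# (semi_major_axis_au, orbital_period_years) pairs taken from the reported
# min and max of each column; pairing them by planet is an assumption.
checks = {"min": (0.3871, 0.2408467), "max": (30.07, 164.79132)}

for label, (a_au, reported) in checks.items():
    predicted = kepler_period_years(a_au)
    rel_err = abs(predicted - reported) / reported
    print(f"{label}: predicted {predicted:.4f} y, reported {reported:.4f} y, "
          f"relative error {rel_err:.2e}")
```

Because one column is nearly a deterministic function of the other, including both in a regression adds collinearity rather than information.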
rotation_period_days
numeric · feature · high_skew · outliers
This column records the rotation period of each planet in days. The most striking feature is a negative mean (-22.69) and a minimum of -243.025, which by planetary science convention indicates retrograde rotation (Venus = -243 days, Uranus ≈ -0.718 days); these negatives are domain-valid but will surprise analysts expecting strictly positive durations. All 8 values are unique, and 2 of them (25%) are flagged as outliers. Given an IQR of only 0.87 days, the flagged values are most likely the two slow rotators at opposite extremes, the minimum of -243.025 (Venus) and the maximum of 58.65 (consistent with Mercury), rather than the two retrograde planets. Venus largely drives the skew of -2.02 and the std of 91.33 against a median of just 0.558 days. Treatment: Do not treat negatives as errors. Either preserve the sign to encode rotation direction, or split the column into |rotation_period_days| and a binary retrograde flag and log-transform the absolute value before modelling.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: -243
- max: 58.65
- mean: -22.69
- median: 0.5576
- std: 91.33
- q1: 0.1306
- q3: 1.004
- iqr: 0.8739
- skew: -2.022
- kurtosis: 2.638
- n_outliers: 2
- outlier_rate: 0.25
- zero_rate: 0
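The suggested treatment, splitting the signed period into a magnitude and a direction flag, might look like the sketch below. The per-planet values are assumptions: the report lists only summary statistics, and these rounded figures were chosen because they agree with its min, max, and mean. The function name `split_rotation` is hypothetical.

```python
import math

# Assumed per-planet rotation periods in days; negative means retrograde.
ROTATION_DAYS = {
    "Mercury": 58.646, "Venus": -243.025, "Earth": 0.997, "Mars": 1.026,
    "Jupiter": 0.414, "Saturn": 0.444, "Uranus": -0.718, "Neptune": 0.671,
}

def split_rotation(period_days):
    """Return (log10 of |period|, retrograde flag) for a signed rotation period."""
    if period_days == 0:
        raise ValueError("a zero rotation period has no defined logarithm")
    return math.log10(abs(period_days)), period_days < 0

features = {name: split_rotation(p) for name, p in ROTATION_DAYS.items()}
retrograde = sorted(name for name, (_, flag) in features.items() if flag)
print(retrograde)
```

After the split, the magnitude column no longer mixes direction with speed, so Venus and Mercury appear as what they are: the two slowest rotators.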
perihelion_distance_au
numeric · feature · outliers
This column records perihelion distance in astronomical units (AU), the closest orbital approach to the Sun, for the eight planets. The spread is extreme: values range from 0.31 AU (consistent with Mercury) to 29.81 AU (consistent with Neptune), and the standard deviation of 10.67 AU exceeds the mean of 8.18 AU. One outlier is flagged (outlier_rate = 0.125, i.e. 1 of 8 rows), almost certainly the 29.81 AU maximum, which pulls the mean far above the median of 3.17 AU and drives a skew of 1.20. Treatment: Log-transform before modelling to reduce right skew; the 29.81 AU record is a physically valid planetary value, not a candidate for removal.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0.3075
- max: 29.81
- mean: 8.182
- median: 3.166
- std: 10.67
- q1: 0.9171
- q3: 11.34
- iqr: 10.42
- skew: 1.198
- kurtosis: 0.03886
- n_outliers: 1
- outlier_rate: 0.125
- zero_rate: 0
aphelion_distance_au
numeric · feature · outliers
This column records aphelion distance (the farthest orbital point from the Sun) in astronomical units for the eight planets. The mean of 8.73 AU is pulled well above the median of 3.56 AU by the right-skewed distribution (skew 1.10), with one flagged outlier, most likely the maximum of 30.33 AU (consistent with Neptune). All 8 values are unique; the IQR of 11.62 and std of 11.02 underscore the enormous orbital spread between the inner and outer planets. Treatment: With n=8, use as-is for descriptive or rule-based work; consider a log-transform to reduce skew in any distance-based or regression model.
- n: 8
- nulls: 0 (0.0%)
- unique: 8
- min: 0.4667
- max: 30.33
- mean: 8.726
- median: 3.56
- std: 11.02
- q1: 0.9446
- q3: 12.56
- iqr: 11.62
- skew: 1.1
- kurtosis: -0.238
- n_outliers: 1
- outlier_rate: 0.125
- zero_rate: 0
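Perihelion and aphelion distances follow directly from the semi-major axis a and eccentricity e of an ellipse: q = a(1 − e) and Q = a(1 + e). The sketch below checks the reported minima against that relation, assuming that the smallest semi-major axis and the largest eccentricity belong to the same planet (Mercury, per the column descriptions).

```python
def perihelion_au(a_au, e):
    """Closest distance to the focus for an ellipse: q = a * (1 - e)."""
    return a_au * (1.0 - e)

def aphelion_au(a_au, e):
    """Farthest distance from the focus for an ellipse: Q = a * (1 + e)."""
    return a_au * (1.0 + e)

# Reported min of semi_major_axis_au and max of eccentricity; treating both
# as Mercury's values is an assumption.
a_au, e = 0.3871, 0.20563593

print(f"perihelion: {perihelion_au(a_au, e):.4f} AU (reported min 0.3075)")
print(f"aphelion:   {aphelion_au(a_au, e):.4f} AU (reported min 0.4667)")
```

Since both distance columns are derived from a and e, they carry no independent information and can be dropped from a model that already includes those two.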
has_rings
categorical · feature · long_tail · null_rate · imbalance
This column indicates whether a planet has rings, but with 8 rows and a null rate of 87.5%, only 1 non-null value exists, and that value is 'True'. The column has a cardinality of 1 and an entropy of 0.0, so it carries no discriminative information in its current state. The 7 nulls cannot safely be read as 'False': Jupiter, Uranus, and Neptune also have ring systems, so the column appears to be populated only for the one planet whose ring radii were collected. Treatment: Drop, or fill from domain knowledge before modelling; as delivered, the column provides no signal because of its 87.5% null rate and single observed value.
- n: 8
- nulls: 7 (87.5%)
- unique: 1
- top_value: True
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
ring_inner_radius_km
numeric · feature · null_rate · constant
This column records the inner radius (in kilometres) of a planetary ring system. It is nearly useless in its current state: 7 of 8 rows (87.5%) are null, and the single non-null value is 74,500.0 km, so the constant alert (n_unique = 1, std = 0.0) reflects one observation rather than a value repeated across rows. With no variance and overwhelming missingness, the column carries no discriminative signal for modelling. Treatment: Drop from modelling; retain the single value (74,500.0 km) only as documented metadata if it has domain significance.
- n: 8
- nulls: 7 (87.5%)
- unique: 1
- min: 74,500
- max: 74,500
- mean: 74,500
- median: 74,500
- std: 0
- q1: 74,500
- q3: 74,500
- iqr: 0
- skew: 0
- kurtosis: 0
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
ring_outer_radius_km
numeric · feature · null_rate · constant
This column records the outer radius (in km) of a planetary ring system. It is almost entirely empty (87.5% null rate across 8 rows), and the single non-null value (140,220 km) is the only observation, giving zero variance by construction. With n_unique=1 and std=0.0, the column carries no discriminative information in the current dataset. Treatment: Drop from modelling; if the dataset grows to include multiple ring systems, revisit as a numeric feature.
- n: 8
- nulls: 7 (87.5%)
- unique: 1
- min: 140,220
- max: 140,220
- mean: 140,220
- median: 140,220
- std: 0
- q1: 140,220
- q3: 140,220
- iqr: 0
- skew: 0
- kurtosis: 0
- n_outliers: 0
- outlier_rate: 0
- zero_rate: 0
data_source
categorical · metadata · imbalance
This column records the data source for each row; all 8 records carry the identical value 'NASA JPL Horizons' (top_rate = 1.0, cardinality = 1). With zero variance and zero nulls, the column carries no discriminative information. The imbalance alert is technically correct but understates the situation: this is a fully constant column. Treatment: Drop before modelling; if provenance tracking is needed, retain it as a dataset-level annotation rather than a per-row column.
- n: 8
- nulls: 0 (0.0%)
- unique: 1
- top_value: NASA JPL Horizons
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
fetch_date
categorical · metadata · imbalance
This column records the date on which the data was retrieved. All 8 rows carry the identical value '2026-01-19', giving zero entropy and a cardinality of 1. It is a constant column with no discriminative power; the imbalance alert fires because top_rate is 1.0. It most likely reflects a single snapshot pull rather than a longitudinal series. Treatment: Drop before modelling; a constant column adds no signal.
- n: 8
- nulls: 0 (0.0%)
- unique: 1
- top_value: 2026-01-19
- top_rate: 1
- cardinality: 1
- entropy: 0
- entropy_ratio: 0
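Several treatments above amount to the same step: drop columns that are constant or mostly null. A minimal sketch of that filter follows. The function name, the 0.5 null-rate threshold, and the row layout are all hypothetical, and assigning the ring radius to Saturn follows the summary's reading rather than anything the per-column stats confirm.

```python
def uninformative_columns(rows, max_null_rate=0.5):
    """Map column name -> reason, for constant or mostly-null columns."""
    flagged = {}
    columns = {key for row in rows for key in row}
    for col in sorted(columns):
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v is not None]
        null_rate = 1.0 - len(non_null) / len(values)
        if null_rate > max_null_rate:
            flagged[col] = f"null_rate={null_rate:.3f}"
        elif len(set(non_null)) <= 1:
            flagged[col] = "constant"
    return flagged

PLANETS = ["Mercury", "Venus", "Earth", "Mars",
           "Jupiter", "Saturn", "Uranus", "Neptune"]

# Illustrative rows using values quoted in the report.
rows = [
    {"name": planet,
     "data_source": "NASA JPL Horizons",
     "fetch_date": "2026-01-19",
     "ring_inner_radius_km": 74500.0 if planet == "Saturn" else None}
    for planet in PLANETS
]

flagged = uninformative_columns(rows)
print(flagged)
```

The name column is left alone because every value is distinct; it should still be excluded from modelling as an identifier, which this filter does not detect.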