1 Reliability (Data reliability)
- 1.1 Strengths
- 1.2 Weaknesses
2 Originality (Uniqueness / analytical value)
- 2.1 What the data allows us to analyse
- 2.2 Limitations
3 Comprehensiveness (Coverage / functional scope)
- 3.1 Analytical coverage provided by the 11 files
- 3.2 Temporal comprehensiveness
4 Citation (Documentation / traceability / reproducibility)
- 4.1 Strengths
- 4.2 Weaknesses
5 Currency (Data recency)
- 5.1 Consequences
Final ROCCC Summary

1 Reliability (Data reliability)

Overall, the reliability of the Fitabase/Bellabeat data is moderate: the measurements are objective, but the sample is small and unevenly covered.

1.1 Strengths

Data collected from real devices (Fitbit), so measurements are objective rather than self-reported.
Consistent timestamping across the minute-by-minute, hourly and daily files.
No critical missing values in the time columns or user identifiers.
Rich granularity: minute, hour, day → suitable for time-series analysis and pattern detection.
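
The two key checks listed above (parseable timestamps, no missing identifiers) can be sketched as follows. This is a minimal illustration using only the Python standard library; the sample rows, the column names `Id` and `ActivityHour`, and the timestamp format are assumptions made for the example, not a description of how the original audit was run.

```python
import csv
import io
from datetime import datetime

# Hypothetical excerpt of an hourly file; column names and format are assumed.
SAMPLE = """Id,ActivityHour,StepTotal
1503960366,4/12/2016 12:00:00 AM,373
1503960366,4/12/2016 1:00:00 AM,160
1927972279,,12
"""

TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # assumed timestamp layout


def audit_keys(rows, id_col, time_col):
    """Count rows whose identifier is empty or whose timestamp cannot be parsed."""
    report = {"rows": 0, "missing_id": 0, "bad_time": 0}
    for row in rows:
        report["rows"] += 1
        if not (row.get(id_col) or "").strip():
            report["missing_id"] += 1
        try:
            datetime.strptime((row.get(time_col) or "").strip(), TIME_FORMAT)
        except ValueError:
            # Empty or malformed timestamps both end up here.
            report["bad_time"] += 1
    return report


report = audit_keys(csv.DictReader(io.StringIO(SAMPLE)), "Id", "ActivityHour")
print(report)  # {'rows': 3, 'missing_id': 0, 'bad_time': 1}
```

On a real file, the in-memory sample would be replaced by an opened CSV, and a non-zero `missing_id` or `bad_time` count would contradict the strength claimed above.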

1.2 Weaknesses

Very small sample: only 30 users, which strongly limits generalizability.
Uneven distribution of user contributions: in heartrate_seconds_merged, for example, some users provide a lot of heart-rate data and others almost none.
Imbalance in some metrics:
- minuteMETsNarrow_merged: extremely skewed distribution;
- weightLogInfo_merged: very low coverage → potential bias.
Data collected over a short period (31 days) → seasonal effects cannot be observed.
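
One way to quantify the uneven user contributions noted above is to compute each user's share of the rows in a file. The sketch below uses hypothetical user labels and heart-rate values; only the idea (rows per user divided by total rows) relates to the dataset.

```python
from collections import Counter

# Hypothetical (user, heart-rate value) records; labels and values are invented.
records = [
    ("A", 72), ("A", 75), ("A", 80), ("A", 78), ("A", 74), ("A", 90),
    ("B", 65), ("B", 70),
    ("C", 88),
]
# Hypothetical full user list: "D" appears in other files but logged no heart rate.
all_users = ["A", "B", "C", "D"]


def contribution_shares(records, all_users):
    """Return each user's fraction of the total number of records."""
    counts = Counter(user for user, _ in records)
    total = sum(counts.values())
    if total == 0:
        return {user: 0.0 for user in all_users}
    return {user: counts.get(user, 0) / total for user in all_users}


shares = contribution_shares(records, all_users)
for user, share in shares.items():
    print(f"{user}: {share:.0%}")
```

A file where one user holds most of the rows and others hold none would show the imbalance described for heartrate_seconds_merged.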

Reliability conclusion – Sufficient for an educational exploratory project, but not strong enough to support robust market research recommendations.
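
To give a rough sense of why 30 users is too few for market-level claims, the sketch below computes the worst-case 95% margin of error for an estimated proportion. It assumes a simple random sample, which this dataset is not, so the real uncertainty is likely larger; the function name is illustrative.

```python
from math import sqrt


def worst_case_margin(n, z=1.96):
    """Worst-case margin of error for a proportion (p = 0.5) at ~95% confidence."""
    return z * sqrt(0.25 / n)


margin_30 = worst_case_margin(30)      # about 0.179, i.e. roughly +/- 18 points
margin_1000 = worst_case_margin(1000)  # about 0.031, i.e. roughly +/- 3 points
print(f"n=30: +/-{margin_30:.3f}   n=1000: +/-{margin_1000:.3f}")
```

Under these assumptions, a statement such as "half of users are sedentary" would carry an uncertainty of roughly 18 percentage points either way.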

2 Originality (Uniqueness / analytical value)

2.1 What the data allows us to analyse

Circadian patterns thanks to the minute- and hour-level files.
Global behavioural analysis: activity, sleep, calories, heart rate.
Multi-granularity combination → rare and valuable for modelling a typical day.
Possibility to reconstruct a complete user journey:
- sleep → wake-up;
- activity / intensity → calories → METs → daily behaviour.
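
The circadian analysis mentioned above amounts to averaging a metric by hour of day across all days and users. A minimal sketch, assuming hourly step rows with a timestamp in the format shown (the rows and the format are illustrative assumptions):

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # assumed timestamp layout

# Hypothetical (timestamp, steps) rows from an hourly file.
hourly_steps = [
    ("4/12/2016 7:00:00 AM", 400),
    ("4/12/2016 1:00:00 PM", 900),
    ("4/13/2016 7:00:00 AM", 600),
    ("4/13/2016 1:00:00 PM", 1100),
    ("4/13/2016 11:00:00 PM", 50),
]


def hourly_profile(rows):
    """Average the metric for each hour of day (0-23) over all rows."""
    buckets = defaultdict(list)
    for stamp, steps in rows:
        hour = datetime.strptime(stamp, TIME_FORMAT).hour
        buckets[hour].append(steps)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


profile = hourly_profile(hourly_steps)
peak_hour = max(profile, key=profile.get)
print(profile, "peak hour:", peak_hour)
```

The same grouping applied to calories or intensity would give the other curves needed to describe a typical day.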

2.2 Limitations

No socio-demographic variables → no profile-based analyses (by age, gender, etc.), and BMI is available only for the few users who logged their weight.

Originality conclusion – The diversity of detail levels is the main strength of this dataset.

3 Comprehensiveness (Coverage / functional scope)

3.1 Analytical coverage provided by the 11 files

Area	Coverage level	Comment
Daily activity	High	`dailyActivity`, `dailyIntensities`, `dailySteps` → complete global view.
Hourly activity	Very high	`hourlyCalories`, `hourlyIntensities`, `hourlySteps` → robust circadian analyses.
Minute-level activity	Very high	Fine granularity for modelling or detecting activity peaks.
Calories / energy expenditure	High	`minuteCaloriesNarrow` + `hourlyCalories` → consistent measurement over time.
METs (physiological intensity)	High	Rare metric but highly skewed.
Sleep	Moderate	`minuteSleep_merged` → good level of detail but no sleep stages.
Heart rate	Low	`heartrate_seconds_merged` incomplete depending on the user.
Weight / BMI	Very low	`weightLogInfo_merged` almost unusable for global analyses.
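
The "highly skewed" note on the METs row can be made measurable with a skewness coefficient. The sketch below computes the moment-based coefficient (third central moment divided by the 1.5 power of the second); the sample values are invented and do not come from minuteMETsNarrow_merged.

```python
from statistics import fmean


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant sample: no spread, so no skew
    return m3 / m2 ** 1.5


# Hypothetical minute-level values: mostly resting, with a few intense minutes.
sample_mets = [10] * 8 + [30, 90]
print(f"skewness: {skewness(sample_mets):.2f}")
```

A value well above zero indicates a long right tail; for such metrics, medians and percentiles are usually more informative than means.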

3.2 Temporal comprehensiveness

31 days → sufficient for:
- daily patterns;
- behavioural clustering;
- habit quantification.
Insufficient for:
- seasonality;
- long-term behaviour change.
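
The observation window and the number of tracked days per user can be verified directly from a daily file. A minimal sketch with hypothetical user labels and dates (the 12 April to 12 May 2016 range is used here only as an illustrative assumption):

```python
from datetime import date

# Hypothetical (user, activity date) pairs.
daily_rows = [
    ("A", date(2016, 4, 12)), ("A", date(2016, 4, 13)), ("A", date(2016, 5, 12)),
    ("B", date(2016, 4, 12)), ("B", date(2016, 4, 20)),
]


def observation_window(rows):
    """Return first day, last day, inclusive span in days, and tracked days per user."""
    days = [day for _, day in rows]
    first, last = min(days), max(days)
    span = (last - first).days + 1  # inclusive of both endpoints
    per_user = {}
    for user, day in rows:
        per_user.setdefault(user, set()).add(day)
    return first, last, span, {user: len(d) for user, d in per_user.items()}


first, last, span, tracked = observation_window(daily_rows)
print(first, last, span, tracked)
```

Users whose tracked-day count falls far below the span are candidates for exclusion from habit or clustering analyses.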

Comprehensiveness conclusion – High for activity, moderate for sleep, low for heart rate and weight/BMI.

4 Citation (Documentation / traceability / reproducibility)

4.1 Strengths

Clearly named files.
Homogeneous columns across files (Id, dateTime, value).
Fitabase documentation is publicly available.

4.2 Weaknesses

No metadata embedded in the files.
No device identifiers → loss of contextual information.
No complete official README.

Citation conclusion – Intrinsically weak at the raw file level; improved only through project documentation.

5 Currency (Data recency)

The data dates back to 2016.

5.1 Consequences

Data from 2016-era Fitbit devices → significant technological bias: sensors and algorithms differ from those in current wearables.
Health guidelines, activity recommendations and intensity classifications have evolved since then.
User behaviour and the device ecosystem have changed (tighter smartphone integration, more modern sensors).

Currency conclusion – Weak for operational decision-making, but adequate for an academic or training-oriented analytics project.

Final ROCCC Summary

Strengths

Exceptionally rich granularity (minute → hour → day).
Temporal consistency.
Real data that is not self-reported.
High potential for behavioural analysis and segmentation.
Dataset well-suited for practising EDA, cleaning, profiling, clustering and circadian analysis.

Weaknesses

Sample size too small (30 users → low statistical reliability).
Old data (2016).
Strongly skewed distributions for METs, intensity and minute-level calories.
No demographic variables.

Overall conclusion

The 11 Bellabeat files provide an excellent learning ground for data analytics: profiling, EDA, quality checks, ETL integration, visualisation, segmentation, and building an analytical narrative.

However, they are too limited for real-world decision-making, mainly because of:

the small sample size;
the lack of user diversity;
the age of the data.

Bellabeat – Global ROCCC Analysis