Energy Demand Forecasting
A 24-hour ahead electricity demand forecasting pipeline with feature engineering, rolling-origin validation, tree-based models, and uncertainty estimation — focused on reproducible ML evaluation.
Project image coming soon
Overview
A full machine learning pipeline for forecasting electricity demand 24 hours ahead. Built with an emphasis on reproducible experimentation, practical evaluation methodology, and honest uncertainty estimation.
Problem
Short-term electricity demand forecasting is a high-stakes problem: poor predictions lead to waste, grid instability, or shortfalls. This project treats it as a rigorous applied ML problem — not just a modeling exercise.
Pipeline
- Data ingestion and cleaning: Handling missing values, timezone normalization, and anomaly flagging.
- Feature engineering: Calendar features, lagged demand, rolling statistics, and weather-derived signals.
- Baseline comparison: Naive persistence, seasonal naive, and linear regression as evaluation anchors.
- Model training: Gradient-boosted trees (XGBoost, LightGBM) with hyperparameter tuning.
- Rolling-origin validation: Time-respecting cross-validation to avoid lookahead leakage.
- Uncertainty estimation: Quantile regression for prediction intervals.
- Evaluation dashboard: Streamlit interface for visual inspection of forecasts and residuals.
Key Contributions
- Rigorous rolling-origin validation — not a simple train/test split
- Explicit comparison against multiple baselines before claiming model value
- Quantile regression for honest uncertainty communication
- Fully reproducible pipeline with seeded randomness and documented hyperparameters
Key Learnings
The most important design decision was the validation strategy. Rolling-origin evaluation surfaces failure modes that simple splits miss — especially when calendar patterns and regime changes interact. Proper baseline comparison is also underrated: a well-tuned seasonal naive is hard to beat consistently.