
Conference: Paper with proceedings at an international conference

Accurate prediction of energy production from building-integrated photovoltaic (BIPV) systems is essential for optimizing building energy use and supporting decarbonization strategies. Synthetic data provides a valuable foundation for model development, particularly when real measurements are limited, but validation on operational systems remains critical for reliable deployment. In this study, we propose a hybrid data approach that combines a large synthetic dataset generated with BIMSolar with real production data from the Les Compagnons du Devoir et du Tour de France building in Strasbourg, France. Several machine learning models, including Random Forest, Gradient Boosting, Decision Tree, and XGBoost, were trained and evaluated. Results show that XGBoost consistently achieved the best predictive accuracy (MSE of 2316.66, R² score of 0.98, MAE of 16.01, and MAPE of 0.18) while also being among the most energy-efficient models, with minimal training time and carbon footprint (9.06 × 10⁻⁶ gCO₂eq). In contrast, deep learning approaches consumed significantly more energy and were less efficient. These findings highlight the advantages of hybrid data strategies, which leverage the diversity of synthetic datasets while grounding predictions in real-world observations. Beyond accuracy, the study also emphasizes the importance of energy-efficient AI for sustainable BIPV modeling. Limitations of the current work and future research directions are discussed, particularly regarding automated real-time data integration and the exploration of advanced ML techniques.
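For readers less familiar with the four accuracy metrics reported above (MSE, MAE, MAPE, R²), the following is a minimal sketch of their standard definitions in plain Python. It is not the authors' evaluation code. The function name `regression_metrics`, the choice to skip zero-valued targets when computing MAPE, and the sample values are assumptions made for illustration. MAPE is returned as a fraction here; the abstract does not state whether its reported value is a fraction or a percentage.

```python
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MSE, MAE, MAPE (as a fraction) and R² for paired observations."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined when a true value is zero (for example, no production
    # at night), so such samples are skipped. This is an assumption of the sketch.
    pct = [abs(e) / abs(t) for e, t in zip(errors, y_true) if t != 0]
    mape = sum(pct) / len(pct) if pct else float("nan")

    # R² = 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical values for illustration only; not data from the study.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(measured, predicted)
```

MSE and MAE are expressed in the squared and plain units of the target, respectively, so they depend on the scale of the production data, whereas MAPE and R² are scale-free and easier to compare across buildings.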