Master's Thesis

Anomaly and Fault Classification and Prediction using Photovoltaic Digital Twins and Machine Learning based approaches

Ricardo de Jesus Vicente Tavares, 2026

Key information

Authors:

Ricardo de Jesus Vicente Tavares

Supervisors:

Hugo Gabriel Valente Morais; Amâncio Lucas de Sousa Pereira

Published in

February 24, 2026

Abstract

This thesis proposes a hybrid framework that combines photovoltaic digital twins with Machine Learning to classify and predict anomalies and faults. To address the scarcity of labelled real anomaly and fault data, digital twins were built with the pvlib Python library, simulating seven operational states: normal condition, three anomalies (soiling, shading, cracks), and three electrical faults (ground, arc, bypass diode). Synthetic daily and time-series data were generated to train and evaluate multiple classifiers. Among the tested algorithms, XGBoost performed best, reaching 82.9% overall accuracy on the synthetic test set and up to 100% accuracy for distinct electrical faults such as ground and arc faults. A probabilistic inference framework was applied to quantify prediction uncertainty and provide interpretable outputs for maintenance decisions. The trained model was then applied to real operational data for inference, where it produced plausible fault distributions. For prediction, a linear regression model was fitted to the daily class probabilities to forecast anomaly progression. A sensitivity analysis identified Pareto-optimal parameter configurations that balance early detection against prediction reliability. The prediction framework forecasted anomaly progression with success rates close to 100% and an anticipation horizon of approximately 30 days. The results indicate that a digital-twin-driven Machine Learning approach is a viable solution for predictive maintenance in photovoltaic systems, offering a practical pathway to enhance system reliability despite limited real fault data.
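The prediction step described in the abstract, a linear regression over daily class probabilities, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `fit_line` and `days_until_threshold`, the 0.5 decision threshold, and the sample probabilities are all assumptions made for illustration, and the abstract does not state the window length or threshold actually used.

```python
from typing import Optional, Sequence, Tuple


def fit_line(y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y against day index 0..n-1.

    Returns (slope, intercept).
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two daily values")
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def days_until_threshold(
    daily_prob: Sequence[float], threshold: float = 0.5
) -> Optional[float]:
    """Extrapolate a daily anomaly-class probability to a threshold.

    Returns the number of days after the last observation at which the
    fitted line reaches `threshold`, 0.0 if the fitted value on the last
    day is already at or above it, or None if the trend is flat or
    decreasing. The threshold value is a hypothetical choice.
    """
    slope, intercept = fit_line(daily_prob)
    last_day = len(daily_prob) - 1
    if slope * last_day + intercept >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - intercept) / slope - last_day


# Hypothetical daily probabilities of one anomaly class (e.g. soiling),
# as a classifier might output them on five consecutive days.
soiling_prob = [0.10, 0.12, 0.14, 0.16, 0.18]
horizon = days_until_threshold(soiling_prob, threshold=0.5)
```

In this sketch the returned value plays the role of an anticipation horizon: the number of days between the last observation and the projected threshold crossing. A sensitivity analysis like the one mentioned in the abstract would vary parameters such as the window length and the threshold.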

Publication details

Fields of Science and Technology (FOS)

Electrical engineering, electronic engineering, information engineering

Publication language (ISO code)

eng - English

Rights type:

Embargoed access

Date available:

December 3, 2026

Institution name

Instituto Superior Técnico