What is it about?

This study develops a data gaps filling method to address missing observations in time series data, specifically for solar PV output forecasting. Test cases were run with different percentages of data gaps (5%, 10%, 15%, and 20%). The method was applied with two solar irradiance datasets, SWR and SSRD, to compare how well each fills the missing values. Both datasets reduced the number of data gaps in every test case, and SSRD reduced it significantly more than SWR. Gaps that remained unfilled after applying the method were excluded from the XGBoost implementation.

To evaluate the forecasting performance of the SWR and SSRD models, RMSE and MAE values were compared for both datasets with filled and unfilled data gaps. Datasets with filled gaps gave better forecasting accuracy than datasets with unfilled gaps, and both models improved RMSE and MAE in all test cases once the gaps were filled. The SWR models improved by 12.52% to 24.30% in ΔRMSE and by 21.10% to 31.31% in ΔMAE. The SSRD models improved by 14.01% to 28.54% in ΔRMSE and by 22.39% to 35.53% in ΔMAE.

The study concluded that filling the gaps using SSRD outperformed SWR in forecasting solar PV output, and that the accuracy of the XGBoost models improved as the number of filled data gaps increased. The proposed method significantly enhanced forecasting accuracy for datasets with less than 20% data gaps. To validate the method further, the study recommended evaluating it on solar PV output data from different locations, with other forecasting methods, and on datasets with higher percentages of data gaps.
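As an illustration of how figures such as ΔRMSE and ΔMAE can be read, the sketch below computes RMSE, MAE, and the percentage improvement of a model trained on gap-filled data over one trained on unfilled data. It assumes the improvement is the relative reduction in error, which this summary does not state explicitly. The function names and sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pct_improvement(err_unfilled, err_filled):
    """Relative error reduction in percent (assumed definition of the delta)."""
    return 100.0 * (err_unfilled - err_filled) / err_unfilled


# Hypothetical PV output values and forecasts, for illustration only.
actual = [0.0, 1.0, 2.0, 3.0]
pred_unfilled = [0.0, 2.0, 2.0, 5.0]  # model trained with gaps left unfilled
pred_filled = [0.0, 1.0, 2.0, 4.0]    # model trained after gap filling

delta_rmse = pct_improvement(rmse(actual, pred_unfilled), rmse(actual, pred_filled))
delta_mae = pct_improvement(mae(actual, pred_unfilled), mae(actual, pred_filled))
print(f"dRMSE = {delta_rmse:.2f}%, dMAE = {delta_mae:.2f}%")
```

Under this definition, a positive value means the gap-filled model has the lower error, which matches the direction of the improvements reported above.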

Why is it important?

This work addresses a common problem in time series data analysis by proposing a data gaps filling method to handle missing observations. Filling the gaps improves forecasting accuracy, which supports more efficient use of solar energy resources and better decision-making in the renewable energy sector. Comparing datasets such as SWR and SSRD helps identify the more reliable dataset for filling gaps. The proposed method is generalizable, contributing to time series analysis and forecasting techniques more broadly. Validating it with different datasets, variables, and forecasting methods can refine and extend it, advancing the handling of missing data and the accuracy of time series forecasts.


This study offers a valuable solution to a common problem in data analysis: missing observations, or data gaps. The researchers improved their solar PV output forecasting models by developing a method to fill these gaps. This matters because accurate forecasting is vital for managing and planning renewable energy resources such as solar power. The findings also highlight the importance of choosing the right dataset for filling data gaps, underlining the need for reliable data in forecasting. Overall, this work contributes to the broader field of time series analysis and offers insights that can benefit the renewable energy sector and other domains that work with time series data.

Ian Benitez
University of the Philippines Diliman

Read the Original

This page is a summary of: A novel data gaps filling method for solar PV output forecasting, Journal of Renewable and Sustainable Energy, July 2023, American Institute of Physics, DOI: 10.1063/5.0157570.