1. Wind power data cleaning using RANSAC-based polynomial and linear regression with adaptive threshold
- Author
Haineng Yang, Jie Tang, Wu Shao, Jintian Yin, and Baiyang Liu
- Subjects
Wind power, Data cleaning, Random Sample Consensus algorithm, Anomalous data, Polynomial regression, Adaptive threshold robust regression, Medicine, Science
- Abstract
As the global demand for clean energy continues to rise, wind power has become one of the most important renewable energy sources. However, wind power data often contain a high proportion of dense anomalies, which not only significantly degrade the accuracy of wind power forecasting models but may also mislead grid scheduling decisions, thereby jeopardizing grid security. To address this issue, this paper proposes an adaptive threshold robust regression model (RPR model) for wind power data cleaning that combines the Random Sample Consensus (RANSAC) algorithm with polynomial linear regression. The model captures the nonlinear relationship between wind speed and power by extending the polynomial features of wind speed and power, which enables a linear regression model to handle the nonlinearity. Combining RANSAC with polynomial linear regression yields a robust polynomial regression model that tolerates anomalous data and improves the accuracy of data cleaning. During cleaning, the model first fits the raw data from a randomly selected minimal sample set, then dynamically adjusts the decision thresholds based on the median of the residuals and their median absolute deviation (MAD), so that anomalous data are identified and removed effectively. This robustness allows the model to maintain efficient cleaning performance even with a high proportion of anomalous data, addressing the limitations of existing methods when handling densely distributed anomalies. The proposed method was validated on real data from a wind farm operated by Longyuan Power.
In experiments against other commonly used cleaning methods, namely the Bidirectional Change Point Grouping Quartile Statistical Model, the Principal Contour Image Processing Model, the DBSCAN Clustering Model, and the Support Vector Machine (SVM) Model, the proposed method delivered the best performance in improving data quality. Specifically, it reduced the mean absolute error (MAE) of the wind power forecasting model by 72.1%, a larger reduction than those achieved by the other methods (37.3% to 52.7%). Moreover, it effectively reduced the prediction error of the Convolutional Neural Network (CNN) + Gated Recurrent Unit (GRU) forecasting model, ensuring high prediction accuracy. The adaptive threshold robust regression model proposed in this study is innovative and has significant application potential. It provides an effective new approach for wind power data cleaning, applicable not only to conventional scenarios with low proportions of anomalous data but also to complex datasets with a high proportion of dense anomalies.
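The cleaning procedure described in the abstract (polynomial feature expansion, fits on random minimal samples, and a threshold derived from the median and MAD of the residuals) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the cubic degree, the multiplier `k = 3`, the trial count, and the choice to rank candidate fits by the robust scale of their residuals are all assumptions made here for illustration; the abstract does not specify them.

```python
import numpy as np


def mad_threshold(residuals, k=3.0):
    """Return (median, k * scaled MAD) of the residuals.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for Gaussian noise. k is an assumed tuning constant.
    """
    med = np.median(residuals)
    mad = np.median(np.abs(residuals - med))
    return med, k * 1.4826 * mad


def clean_power_curve(wind_speed, power, degree=3, n_trials=200, k=3.0, seed=0):
    """Hypothetical RANSAC-style polynomial fit with a median/MAD threshold.

    Returns (coefficients, keep_mask). keep_mask is True for points
    treated as normal and False for points flagged as anomalous.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(wind_speed, dtype=float)
    y = np.asarray(power, dtype=float)

    # Polynomial feature expansion: columns 1, x, x^2, ..., x^degree.
    # A linear least-squares fit on these columns is a polynomial fit in x.
    features = np.vander(x, degree + 1, increasing=True)
    min_samples = degree + 1  # smallest set that determines the polynomial

    best_coef, best_scale = None, np.inf
    for _ in range(n_trials):
        # Fit a candidate model on a random minimal sample.
        idx = rng.choice(x.size, size=min_samples, replace=False)
        coef, *_ = np.linalg.lstsq(features[idx], y[idx], rcond=None)

        # Assumed scoring rule: prefer the candidate whose residuals
        # have the tightest robust spread over the whole data set.
        _, scale = mad_threshold(y - features @ coef, k)
        if scale < best_scale:
            best_coef, best_scale = coef, scale

    # Consensus set of the best candidate under its adaptive threshold.
    resid = y - features @ best_coef
    med, thr = mad_threshold(resid, k)
    keep = np.abs(resid - med) <= thr

    # Refit on the consensus set, then recompute the adaptive threshold once.
    coef, *_ = np.linalg.lstsq(features[keep], y[keep], rcond=None)
    resid = y - features @ coef
    med, thr = mad_threshold(resid, k)
    keep = np.abs(resid - med) <= thr
    return coef, keep
```

Because the threshold is recomputed from the residuals themselves, it does not have to be tuned by hand for each turbine. The median and MAD stay stable as long as anomalous points remain a minority of the data.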
- Published
- 2025