ForeXGBoost: passenger car sales prediction based on XGBoost

Authors :
Jiaxin Sun
Rui Zhang
Libing Wu
Shan Xue
Zhenchang Xia
Yanjiao Chen
Source :
Distributed and Parallel Databases. 38:713-738
Publication Year :
2020
Publisher :
Springer Science and Business Media LLC, 2020.

Abstract

The rapid development of machine learning has spurred wide applications to various industries, where prediction models are built to forecast sales to help enterprises and governments make better plans. Alibaba Cloud and the Yancheng Municipal Government held a competition in 2018, calling for global efforts to build machine learning models that can accurately forecast vehicle sales based on large-scale datasets. This paper presents the design, implementation and evaluation of ForeXGBoost, our proposed model that won first place in the competition. ForeXGBoost takes full advantage of carefully designed data filling algorithms for missing values to improve data quality. By using a sliding window to extract features from historical sales and production data, ForeXGBoost can improve prediction accuracy. An extensive study is conducted to evaluate the influence of different attributes on vehicle sales via information gain and data correlation, based on which we select the most indicative features from the feature set for prediction. Furthermore, we leverage the XGBoost prediction algorithm to achieve high prediction accuracy with a short running time for vehicle sales prediction. Extensive experiments confirm that ForeXGBoost can achieve high prediction accuracy with low overhead.
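
The sliding-window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_window_features`, the window length, the choice of the window mean as an extra feature, and the sales figures are all assumptions made for illustration. Each row pairs the previous `window` months of sales with the month to be predicted, which is the form of input a regressor such as XGBoost could then be trained on.

```python
from statistics import mean


def sliding_window_features(series, window):
    """Build (features, target) pairs from a monthly sales series.

    For each month t >= window, the features are the previous `window`
    values followed by their mean, and the target is the value at month t.
    Hypothetical helper, not taken from the paper.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    rows = []
    for t in range(window, len(series)):
        history = series[t - window:t]           # the `window` months before t
        features = list(history) + [mean(history)]
        rows.append((features, series[t]))       # target is month t itself
    return rows


# Made-up monthly sales figures, for illustration only.
monthly_sales = [120, 135, 128, 150, 160, 155]
rows = sliding_window_features(monthly_sales, window=3)
```

With six months of data and a window of three, this yields three training rows. The first uses months 1 to 3 as features and month 4 as the target.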

Details

ISSN :
15737578 and 09268782
Volume :
38
Database :
OpenAIRE
Journal :
Distributed and Parallel Databases
Accession number :
edsair.doi...........a181138d7b19c02c8ca388919c485eca
Full Text :
https://doi.org/10.1007/s10619-020-07294-y