Paper Title
Improving Accuracy Without Losing Interpretability: A ML Approach for Time Series Forecasting
Paper Authors
Paper Abstract
In time series forecasting, decomposition-based algorithms break aggregate data into meaningful components and are therefore valued for their interpretability. Recent algorithms often combine machine learning (hereafter ML) methods with decomposition to improve prediction accuracy. However, incorporating ML is generally thought to inevitably sacrifice interpretability. In addition, existing hybrid algorithms usually rely on theoretical models with statistical assumptions and focus only on the accuracy of aggregate predictions, so they suffer from accuracy problems, especially in their component estimates. To address these issues, this research explores whether accuracy can be improved without losing interpretability in time series forecasting. We first quantitatively define interpretability for data-driven forecasts and systematically review existing forecasting algorithms from the perspective of interpretability. Accordingly, we propose the W-R algorithm, a hybrid algorithm that combines decomposition and ML from a novel perspective. Specifically, the W-R algorithm replaces the standard additive combination function with a weighted variant and uses ML to modify the estimates of all components simultaneously. We mathematically analyze the theoretical basis of the algorithm and validate its performance through extensive numerical experiments. In general, the W-R algorithm outperforms all decomposition-based and ML benchmarks. Measured by P50_QL, the algorithm achieves a relative accuracy improvement of 8.76% on real-world sales forecasting at JD.com and 77.99% on a public electricity load dataset. This research offers an innovative perspective on combining statistical and ML algorithms, and JD.com has implemented the W-R algorithm to make accurate sales predictions and guide its marketing activities.
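The contrast between the standard additive combination and a weighted variant can be illustrated with a small sketch. This is not the W-R algorithm itself, whose formulation the abstract does not give. The component names, the synthetic data, the use of ordinary least squares as a stand-in for the ML step, and the normalized form of the P50 quantile loss are all assumptions made for illustration.

```python
import numpy as np

def additive_combine(components):
    """Standard additive combination: forecast = sum of component estimates."""
    return components.sum(axis=1)

def weighted_combine(components, weights):
    """Weighted variant: forecast = sum over k of weights[k] * component k."""
    return components @ weights

def p50_quantile_loss(y_true, y_pred):
    """Normalized quantile loss at the 0.5 quantile.

    Assumed definition (the abstract does not state one):
    2 * sum(pinball loss at rho = 0.5) / sum(|y_true|).
    """
    diff = y_true - y_pred
    pinball = 0.5 * np.maximum(diff, 0.0) + 0.5 * np.maximum(-diff, 0.0)
    return 2.0 * pinball.sum() / np.abs(y_true).sum()

# Hypothetical component estimates from some decomposition step.
t = np.arange(24, dtype=float)
trend = 10.0 + 0.5 * t
seasonal = 2.0 * np.sin(2.0 * np.pi * t / 12.0)
components = np.column_stack([trend, seasonal])

# Synthetic target in which the components contribute unevenly.
y = 1.2 * trend + 0.8 * seasonal

# Ordinary least squares as a simple stand-in for the ML step that
# learns the combination weights (fitted in-sample here for brevity).
weights, *_ = np.linalg.lstsq(components, y, rcond=None)

additive_loss = p50_quantile_loss(y, additive_combine(components))
weighted_loss = p50_quantile_loss(y, weighted_combine(components, weights))
print("weights:", weights)
print("additive P50_QL:", additive_loss)
print("weighted P50_QL:", weighted_loss)
```

Because each weight attaches to a named component, the weighted forecast can still be read component by component, which is the sense in which such a combination may keep the decomposition interpretable.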