Paper Title
Using Intermarket Data to Evaluate the Efficient Market Hypothesis with Machine Learning
Paper Authors
Paper Abstract
In its semi-strong form, the Efficient Market Hypothesis (EMH) implies that technical analysis will not reveal any hidden statistical trends via intermarket data analysis. If technical analysis on intermarket data reveals trends that can be leveraged to significantly outperform the stock market, then the semi-strong EMH does not hold. In this work, we utilize a variety of machine learning techniques to empirically evaluate the EMH using stock market, foreign currency (Forex), international government bond, index futures, and commodity futures assets. We train five machine learning models on each dataset and analyze the average performance of these models for predicting the direction of future S&P 500 movement as approximated by the SPDR S&P 500 Trust ETF (SPY). From our analysis, the datasets containing bonds, index futures, and/or commodity futures data outperform baselines by substantial margins. Further, we find that the use of intermarket data induces statistically significant positive impacts on the accuracy, macro F1 score, weighted F1 score, and area under the receiver operating characteristic curve for a variety of models at the 95% confidence level. This provides strong empirical evidence contradicting the semi-strong EMH.
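To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how direction labels and the accuracy, macro F1, and weighted F1 scores could be computed. It is not the authors' code: the function names (`direction_labels`, `f1_for_class`, `evaluate`), the next-day labeling rule, and the toy prices are all assumptions made for illustration, and the abstract does not specify the prediction horizon or how labels were built.

```python
from collections import Counter


def direction_labels(closes):
    """Label each step 1 if the next close is strictly higher, else 0.

    Assumption: a one-step-ahead up/down label; the abstract does not
    state the exact labeling rule.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def evaluate(y_true, y_pred):
    """Return accuracy, macro F1, and weighted F1 for the predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    f1s = {c: f1_for_class(y_true, y_pred, c) for c in classes}
    n = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        # Macro F1: unweighted mean of the per-class F1 scores.
        "macro_f1": sum(f1s.values()) / len(classes),
        # Weighted F1: per-class F1 weighted by true-class support.
        "weighted_f1": sum(f1s[c] * support[c] for c in classes) / n,
    }


# Toy closing prices (made up for illustration, not real SPY data).
closes = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
y_true = direction_labels(closes)
y_pred = [1, 1, 1, 0, 0]  # hypothetical model output
scores = evaluate(y_true, y_pred)
```

Macro F1 treats the up and down classes equally, while weighted F1 scales each class by how often it occurs, so the two diverge when up days and down days are imbalanced.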