Paper Title
BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory
Paper Authors
Paper Abstract
We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms show a poor memory-accuracy trade-off due to irreversible collisions and the accumulation of stochastic gradient noise in the sketched domain. Here, we develop a second-order ultra-high dimensional feature selection algorithm, called BEAR, which avoids the extra collisions by storing the second-order gradients of the celebrated Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in a Count Sketch, a sublinear memory data structure from the streaming literature. Experiments on real-world data sets demonstrate that BEAR requires up to three orders of magnitude less memory to achieve the same classification accuracy as first-order sketching algorithms. Theoretical analysis proves convergence of BEAR at rate O(1/t) over t iterations of the sketched algorithm. Our algorithm reveals an unexplored advantage of second-order optimization for memory-constrained sketching of models trained on ultra-high dimensional data sets.
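The Count Sketch mentioned in the abstract is a standard streaming data structure: coordinates are hashed into a small table with random signs, and queries take the median across rows to cancel collision noise. The following is a minimal illustrative sketch of that data structure only, assuming a simple hash construction; it is not the BEAR implementation from the paper, and the class and parameter names are hypothetical.

```python
import hashlib
import statistics

class CountSketch:
    """Minimal Count Sketch: sublinear-memory frequency/value estimation.

    Illustrative only; the depth/width parameters and hashing scheme
    are assumptions, not taken from the BEAR paper.
    """

    def __init__(self, depth=5, width=2048):
        self.depth = depth                                 # independent hash rows
        self.width = width                                 # buckets per row (<< dimension)
        self.table = [[0.0] * width for _ in range(depth)]

    def _hash(self, key, row):
        # Derive a bucket index and a +/-1 sign from one hash digest.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        bucket = value % self.width
        sign = 1 if (value >> 32) & 1 else -1
        return bucket, sign

    def update(self, key, delta):
        # Add `delta` to coordinate `key` in every row of the sketch.
        for row in range(self.depth):
            bucket, sign = self._hash(key, row)
            self.table[row][bucket] += sign * delta

    def query(self, key):
        # Median across rows cancels most collision noise.
        estimates = []
        for row in range(self.depth):
            bucket, sign = self._hash(key, row)
            estimates.append(sign * self.table[row][bucket])
        return statistics.median(estimates)
```

In a sketched optimizer of the kind the abstract describes, `update` would accumulate (second-order) gradient entries per feature index, and `query` would recover approximate values for the heaviest coordinates, which is what makes feature selection possible without storing the full ultra-high dimensional parameter vector.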