Paper Title

AxCell: Automatic Extraction of Results from Machine Learning Papers

Paper Authors

Marcin Kardas, Piotr Czapla, Pontus Stenetorp, Sebastian Ruder, Sebastian Riedel, Ross Taylor, Robert Stojnic

Paper Abstract

Tracking progress in machine learning has become increasingly difficult with the recent explosion in the number of papers. In this paper, we present AxCell, an automatic machine learning pipeline for extracting results from papers. AxCell uses several novel components, including a table segmentation subtask, to learn relevant structural knowledge that aids extraction. When compared with existing methods, our approach significantly improves the state of the art for results extraction. We also release a structured, annotated dataset for training models for results extraction, and a dataset for evaluating the performance of models on this task. Lastly, we show the viability of our approach enables it to be used for semi-automated results extraction in production, suggesting our improvements make this task practically viable for the first time. Code is available on GitHub.
