Limits of an AI program for solving college math problems

Paper title

Limits of an AI program for solving college math problems

Author

Davis, Ernest

Abstract

Drori et al. (2022) report that "A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level ... [It] automatically answers 81% of university-level mathematics problems." The system they describe is indeed impressive; however, the above description is very much overstated. The work of solving the problems is done not by a neural network but by the symbolic algebra package SymPy. Problems in various formats are excluded from consideration. The so-called "explanations" are just rewordings of lines of code. Answers that are not in the form specified in the problem are marked as correct. Most seriously, it seems that in many cases the system uses the correct answer given in the test corpus to guide its path to solving the problem.
