Title
iNALU: Improved Neural Arithmetic Logic Unit
Authors
Abstract
Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relations only implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is a novel neural architecture which is able to explicitly represent mathematical relationships within the units of the network, allowing it to learn operations such as summation, subtraction or multiplication. Although NALUs have been shown to perform well on various downstream tasks, an in-depth analysis reveals practical shortcomings of the design, such as the inability to multiply or divide negative input values, and training stability issues for deeper networks. We address these issues and propose an improved model architecture. We evaluate our model empirically in various settings, from learning basic arithmetic operations to more complex functions. Our experiments indicate that our model resolves these stability issues and outperforms the original NALU model in terms of arithmetic precision and convergence.
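As context for the shortcomings mentioned above, the original NALU cell (Trask et al., 2018) is commonly written as the following sketch (standard formulation, not the improved iNALU proposed here); $\hat{W}$, $\hat{M}$, and $G$ are learnable parameter matrices and $\epsilon$ is a small constant for numerical stability:

$a = W x, \qquad W = \tanh(\hat{W}) \odot \sigma(\hat{M})$
$m = \exp\!\big(W \log(|x| + \epsilon)\big)$
$y = g \odot a + (1 - g) \odot m, \qquad g = \sigma(G x)$

Because the multiplicative path $m$ operates on $\log(|x| + \epsilon)$, the sign of negative inputs is discarded, which is the source of the inability to multiply or divide negative values noted in the abstract.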