Paper title
$λ_S$: Computable Semantics for Differentiable Programming with Higher-Order Functions and Datatypes
Authors
Abstract
Deep learning is moving towards increasingly sophisticated optimization objectives that employ higher-order functions, such as integration, continuous optimization, and root-finding. Since differentiable programming frameworks such as PyTorch and TensorFlow do not have first-class representations of these functions, developers must reason about the semantics of such objectives and manually translate them to differentiable code. We present a differentiable programming language, $λ_S$, that is the first to deliver a semantics for higher-order functions, higher-order derivatives, and Lipschitz but nondifferentiable functions. Together, these features enable $λ_S$ to expose differentiable, higher-order functions for integration, optimization, and root-finding as first-class functions with automatically computed derivatives. $λ_S$'s semantics is computable, meaning that values can be computed to arbitrary precision, and we implement $λ_S$ as an embedded language in Haskell. We use $λ_S$ to construct novel differentiable libraries for representing probability distributions, implicit surfaces, and generalized parametric surfaces -- all as instances of higher-order datatypes -- and present case studies that rely on computing the derivatives of these higher-order functions and datatypes. In addition to modeling existing differentiable algorithms, such as a differentiable ray tracer for implicit surfaces, without requiring any user-level differentiation code, we demonstrate new differentiable algorithms, such as the Hausdorff distance of generalized parametric surfaces.
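To make the idea of a differentiable higher-order function concrete, the following is a minimal sketch in Python and not $λ_S$ itself. It differentiates an objective defined through integration by pushing forward-mode dual numbers through a midpoint-rule quadrature. The names `Dual`, `integrate01`, and `objective` are hypothetical, and the result is a floating-point approximation, so it does not capture the arbitrary-precision computable semantics described in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one parameter."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat a plain number as a constant (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def integrate01(f, n=1000):
    """Midpoint-rule approximation of the integral of f over [0, 1].

    This is a higher-order function: it takes the integrand f as an argument.
    Because it only uses + and *, derivatives carried by Dual values flow
    through it without any hand-written differentiation code.
    """
    h = 1.0 / n
    total = Dual(0.0)
    for i in range(n):
        total = total + f((i + 0.5) * h)
    return total * h


def objective(theta):
    """F(theta) = integral over [0, 1] of (theta * x)^2 dx = theta^2 / 3."""
    return integrate01(lambda x: (theta * x) * (theta * x))


# Seed the derivative with 1.0 to differentiate with respect to theta.
result = objective(Dual(2.0, 1.0))
print(result.val, result.der)
```

At `theta = 2`, both the value `theta^2 / 3` and the derivative `2 * theta / 3` equal 4/3 analytically, and the sketch approximates each up to quadrature error. A language with first-class differentiable integration, as the abstract describes, would aim to provide this behavior without the user writing the dual-number plumbing.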