Title
Nonlinear Optical Joint Transform Correlator for Low Latency Convolution Operations
Authors
Abstract
Convolutions are among the most relevant operations in artificial intelligence (AI) systems. Their high computational complexity scaling poses significant challenges, especially in fast-responding network-edge AI applications. Fortunately, the convolution theorem can be executed on-the-fly in the optical domain via a joint transform correlator (JTC), offering a fundamental reduction in computational complexity. Nonetheless, the iterative two-step process of a classical JTC renders it impractical. Here we introduce a novel implementation of an optical convolution processor capable of near-zero latency by utilizing all-optical nonlinearity inside a JTC, thus minimizing electronic signal and conversion delays. Fundamentally, we show how this nonlinear auto-correlator reduces the high $O(n^4)$ scaling complexity of processing two-dimensional data to only $O(n^2)$. Moreover, this optical JTC processes millions of channels in time-parallel, ideal for large-matrix machine learning tasks. Using the nonlinear process of four-wave mixing as an example, we show light-based processing performing a full convolution that is temporally limited only by the geometric features of the lens and the nonlinear material's response time. We further discuss that the all-optical nonlinearity exhibits gain in excess of $10^{3}$ when enhanced by slow-light effects such as epsilon-near-zero. Such a novel implementation of a machine learning accelerator, featuring low latency and non-iterative massive data parallelism enabled by fundamentally reduced complexity scaling, bears significant promise for network-edge and cloud AI systems.
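To make the complexity argument concrete, the following is a minimal numerical sketch (Python/NumPy, not taken from the paper) of the classical two-step JTC principle the abstract refers to: two inputs are placed side by side in a joint input plane, Fourier transformed, passed through a square-law intensity step that stands in for the optical nonlinearity, and Fourier transformed again, leaving the cross-correlation in the off-axis terms of the output plane. The function name, array sizes, and input placement are illustrative assumptions.

```python
import numpy as np

def jtc_cross_correlation(a, b):
    """Emulate a classical joint transform correlator (JTC) numerically.

    Two n x n inputs are placed side by side in one joint input plane,
    Fourier transformed, squared (the role played optically by a
    square-law detector or an all-optical nonlinearity), and Fourier
    transformed again.  The off-axis terms of the output plane contain
    the cross-correlation of `a` and `b`.
    """
    n = a.shape[0]
    # Joint input plane: `a` and `b` displaced horizontally from each other.
    joint = np.zeros((2 * n, 4 * n), dtype=complex)
    joint[n // 2:n // 2 + n, n // 2:n // 2 + n] = a
    joint[n // 2:n // 2 + n, 2 * n + n // 2:2 * n + n // 2 + n] = b
    # First Fourier transform (performed by a single lens in the optical domain).
    spectrum = np.fft.fft2(joint)
    # Joint power spectrum: the nonlinear, square-law step.
    jps = np.abs(spectrum) ** 2
    # Second Fourier transform reveals the auto- and cross-correlation terms.
    output_plane = np.fft.fftshift(np.fft.ifft2(jps))
    return np.abs(output_plane)

# Illustrative use: correlate two random 64 x 64 "images".
rng = np.random.default_rng(0)
img_a = rng.random((64, 64))
img_b = rng.random((64, 64))
correlation_plane = jtc_cross_correlation(img_a, img_b)
```

For $n \times n$ inputs, a direct spatial-domain correlation requires $O(n^4)$ multiply-accumulate operations, whereas the transform-domain route above scales as $O(n^2 \log n)$ on a digital computer; in the optical implementation described in the abstract, the Fourier transforms are carried out by lenses and the squaring step by the all-optical nonlinearity, leaving only the $O(n^2)$ readout of the output plane.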