Decision-making and control with diffractive optical networks

Paper title

Decision-making and control with diffractive optical networks

Authors

Qiu, Jumin; Xiao, Shuyuan; Huang, Lujun; Miroshnichenko, Andrey; Zhang, Dejian; Liu, Tingting; Yu, Tianbao

Abstract

The ultimate goal of artificial intelligence is to mimic the human brain to perform decision-making and control directly from high-dimensional sensory input. Diffractive optical networks provide a promising solution for implementing artificial intelligence with high speed and low power consumption. Most reported diffractive optical networks focus on single or multiple tasks that do not involve environmental interaction, such as object recognition and image classification. In contrast, networks capable of performing decision-making and control have not yet been developed, to our knowledge. Here, we propose using deep reinforcement learning to implement diffractive optical networks that imitate human-level decision-making and control capability. Such networks, which take advantage of a residual architecture, allow optimal control policies to be found through interaction with the environment and can be readily implemented with existing optical devices. The superior performance of these networks is verified on three types of classic games: Tic-Tac-Toe, Super Mario Bros., and Car Racing. Finally, we present an experimental demonstration of playing Tic-Tac-Toe with diffractive optical networks based on a spatial light modulator. Our work represents a solid step forward in advancing diffractive optical networks, promising a fundamental shift from the target-driven control of a pre-designed state for simple recognition or classification tasks to the high-level sensory capability of artificial intelligence. It may find exciting applications in autonomous driving, intelligent robots, and intelligent manufacturing.
