YOLOpeds: Efficient Real-Time Single-Shot Pedestrian Detection for Smart Camera Applications

Paper Title

YOLOpeds: Efficient Real-Time Single-Shot Pedestrian Detection for Smart Camera Applications

Authors

Kyrkou, Christos

Abstract

Deep learning-based object detectors can enhance the capabilities of smart camera systems in a wide spectrum of machine vision applications, including video surveillance, autonomous driving, robots and drones, smart factories, and health monitoring. Pedestrian detection plays a key role in all these applications, and deep learning can be used to construct accurate state-of-the-art detectors. However, such complex paradigms do not scale easily and are not traditionally implemented in resource-constrained smart cameras for on-device processing, which offers significant advantages in situations where real-time monitoring and robustness are vital. Efficient neural networks can not only enable mobile applications and on-device experiences but can also be a key enabler of privacy and security, allowing a user to gain the benefits of neural networks without needing to send their data to a server to be evaluated. This work addresses the challenge of achieving a good trade-off between accuracy and speed for efficient deployment of deep-learning-based pedestrian detection in smart camera applications. A computationally efficient architecture based on separable convolutions is introduced, which integrates dense connections across layers and multi-scale feature fusion to improve representational capacity while decreasing the number of parameters and operations. In particular, the contributions of this work are the following: 1) an efficient backbone combining multi-scale feature operations, 2) a more elaborate loss function for improved localization, and 3) an anchor-less approach for detection. The proposed approach, called YOLOpeds, is evaluated using the PETS2009 surveillance dataset on 320x320 images. Overall, YOLOpeds provides real-time sustained operation of over 30 frames per second with detection rates in the range of 86%, outperforming existing deep learning models.
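The abstract credits separable convolutions with reducing parameters and operations. The sketch below illustrates where that saving comes from by counting weights in a standard convolution versus a depthwise separable one. It is a generic illustration, not the YOLOpeds architecture: the function names and the example layer sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for the example, and biases are omitted.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable weights divided by standard weights.
    Algebraically this equals 1 / c_out + 1 / (k * k)."""
    return separable_conv_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    k, c_in, c_out = 3, 64, 128
    print("standard :", standard_conv_params(k, c_in, c_out))
    print("separable:", separable_conv_params(k, c_in, c_out))
    print("ratio    :", round(reduction_ratio(k, c_in, c_out), 4))
```

For a 3x3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why this factorization is a common choice for resource-constrained devices.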
