Paper Title

A Multi-Stage Duplex Fusion ConvNet for Aerial Scene Classification

Paper Authors

Jingjun Yi, Beichen Zhou

Paper Abstract

Existing deep learning based methods effectively boost the performance of aerial scene classification. However, due to their large number of parameters and high computational cost, it is rather difficult to apply these methods to real-time remote sensing applications such as on-board data perception on drones and satellites. In this paper, we address this task by developing a lightweight ConvNet named multi-stage duplex fusion network (MSDF-Net). The key idea is to use as few parameters as possible while obtaining as strong a scene representation capability as possible. To this end, a residual-dense duplex fusion strategy is developed to enhance feature propagation while re-using parameters as much as possible, and is realized by our duplex fusion block (DFblock). Specifically, our MSDF-Net consists of multi-stage structures built from DFblocks. Moreover, a duplex semantic aggregation (DSA) module is developed to mine remote sensing scene information from the extracted convolutional features, which also contains two parallel branches for semantic description. Extensive experiments on three widely-used aerial scene classification benchmarks show that our MSDF-Net achieves competitive performance against the recent state of the art while reducing the number of parameters by up to 80%. In particular, an accuracy of 92.96% is achieved on AID with only 0.49M parameters.
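
The abstract does not spell out the internals of the DFblock. As a rough illustration of the residual-dense duplex fusion idea (one branch re-using features by dense-style concatenation, the other by residual-style summation, then fusing the two), here is a minimal PyTorch sketch. All concrete choices here (3x3 convolutions, BatchNorm/ReLU, the 1x1 fusion layer, channel counts) are our assumptions for exposition, not the authors' actual design.

```python
import torch
import torch.nn as nn

class DFBlock(nn.Module):
    """Hypothetical duplex fusion block: a dense-style branch (feature
    concatenation) and a residual-style branch (feature summation) run in
    parallel over the same input, and their outputs are fused by a 1x1 conv.
    Layer choices are illustrative, not taken from the paper."""

    def __init__(self, channels: int):
        super().__init__()
        self.dense_conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.residual_conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Fuse the concatenated duplex features (input + dense output +
        # residual output = 3x channels) back down to `channels`.
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dense_path = torch.cat([x, self.dense_conv(x)], dim=1)  # 2C: dense re-use
        residual_path = x + self.residual_conv(x)               # C: residual re-use
        return self.fuse(torch.cat([dense_path, residual_path], dim=1))


# Quick shape check: the block preserves spatial size and channel count,
# so it can be stacked into multi-stage structures.
block = DFBlock(channels=32)
x = torch.randn(1, 32, 56, 56)
print(block(x).shape)  # torch.Size([1, 32, 56, 56])
```

Because both branches share the same input and the fusion is a cheap 1x1 convolution, a block like this adds representational diversity while keeping the parameter count low, which matches the paper's stated goal.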
