[1] HUANG Haibo. Research on Lightweight Method of Target Detection Based on Improved FCOS[J]. Popular Science & Technology, 2023, 25(1): 22-25.

Research on Lightweight Method of Target Detection Based on Improved FCOS

Popular Science & Technology [ISSN: 1008-1151 / CN: 45-1235/N]

Volume:
25
Issue:
2023, No. 1
Pages:
22-25
Column:
Information Technology and Communication
Publication date:
2023-01-20

Article Info

Title:
Research on Lightweight Method of Target Detection Based on Improved FCOS
Author:
HUANG Haibo
(School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China)
Keywords:
target detection; FCOS; ShuffleNetV2; BiFPN; Focal Loss; lightweight
Document code:
A
Abstract:
With the rapid development of edge AI, lightweight target detection on terminal devices has become a research hotspot. This paper therefore improves FCOS, a fully convolutional one-stage target detection algorithm, and proposes a lightweight network, LIm-FCOS, for detection on terminals. ShuffleNetV2 is used as the feature-extraction backbone. In the neck, an improved BiFPN replaces FPN, and depthwise separable convolution replaces ordinary convolution to reduce computation. The detection head is changed to separate detection. The classification branch drops Center-ness and uses Quality Focal Loss to predict classification and box quality, which further removes the gap between training-time and inference-time confidence. The regression branch uses Distribution Focal Loss to improve the distribution of box positions, and GIoU Loss is added to assist convergence and improve coordinate regression accuracy. On the COCO2017 dataset, LIm-FCOS reaches an mAP of 27.5%. Compared with YOLOX-Nano, it has 1.5 M more parameters and 0.43 GFLOPs more computation, and its accuracy is 2.2% higher. Visualized inference results on a PC also indicate that the lightweighting method is effective.
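
The building blocks named in the abstract can be illustrated with a small sketch. The code below is not the author's implementation: it writes out the weight counts of a standard versus a depthwise separable convolution, and scalar forms of GIoU Loss, Quality Focal Loss, and Distribution Focal Loss as they are generally defined in the literature. All function names, the `beta=2.0` default, the `(x1, y1, x2, y2)` box format, and the integer-bin layout for the box distribution are assumptions made for illustration.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def dw_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    return c_in * k * k + c_in * c_out


def giou_loss(a, b):
    """1 - GIoU for two boxes (x1, y1, x2, y2); assumes positive box areas."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    giou = inter / union - (hull - union) / hull
    return 1.0 - giou


def quality_focal_loss(sigma, y, beta=2.0):
    """QFL for one predicted score sigma in (0, 1) and a soft target y in [0, 1]."""
    cross_entropy = -((1.0 - y) * math.log(1.0 - sigma) + y * math.log(sigma))
    # The modulating factor shrinks the loss as the score approaches the target.
    return abs(y - sigma) ** beta * cross_entropy


def distribution_focal_loss(probs, y):
    """DFL for one box side.

    probs: softmax probabilities over integer bins 0..n (hypothetical layout).
    y: continuous regression target in [0, n].
    """
    left = min(int(math.floor(y)), len(probs) - 2)
    right = left + 1
    # Push probability mass onto the two bins that bracket the target.
    return -((right - y) * math.log(probs[left]) + (y - left) * math.log(probs[right]))
```

As a usage example, `conv_params(64, 64, 3)` gives 36864 weights while `dw_separable_params(64, 64, 3)` gives 4672, roughly a 7.9x reduction for that layer, which is the kind of saving the depthwise separable substitution targets. A detection framework would apply the three losses to batched tensors; the scalar forms here are only meant to make the formulas concrete.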


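Memo:
Received: 2022-08-12. About the author: HUANG Haibo (born 1988), male, from Hezhou, Guangxi; master's student at the School of Information and Communication, Guilin University of Electronic Technology; research interest: computer vision.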
Last Update: 2023-03-30