Underwater Holothurian Target-Detection Algorithm Based on Improved CenterNet and Scene Feature Fusion.
Han Y, Chen L, Luo Y, Ai H, Hong Z, Ma Z, Wang J, Zhou R, Zhang Y.
Abstract
To address common problems in underwater images, such as noise pollution, low contrast, and color distortion, together with the characteristics of holothurian recognition, such as morphological ambiguity, high similarity to the background, and co-occurrence with special ecological scenes, this paper proposes an underwater holothurian target-detection algorithm (FA-CenterNet) based on an improved CenterNet and scene feature fusion. First, to reduce the model's footprint on embedded-device resources, we use EfficientNet-B3 as the backbone network, lowering the model's Params and FLOPs. At the same time, EfficientNet-B3 increases the depth and width of the network, which improves accuracy. Then, we design an effective FPT (feature pyramid transformer) combination module to focus on and mine information about holothurian ecological scenes across different scales and spatial positions (e.g., holothurian spines, reefs, and waterweeds often appear in the same scene as holothurians). This co-occurring scene information serves as auxiliary features for detecting holothurians, improving the detection of blurred and small-sized individuals. Finally, we add the AFF module to deeply fuse the shallow-detail and high-level semantic features of holothurians. The results show that the proposed method outperforms other methods on the 2020 CURPC underwater target-detection image dataset, with an AP50 of 83.43%, Params of 15.90 M, and FLOPs of 25.12 G. In the underwater holothurian-detection task, the method improves the accuracy of detecting holothurians with blurred features, small sizes, and dense scenes. It also achieves a good balance between detection accuracy, Params, and FLOPs, and is suitable for underwater holothurian detection in most situations.
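As a rough illustration of the pipeline the abstract describes, the following PyTorch sketch assembles the three stated components (an EfficientNet-B3 backbone, the FPT combination module, and the AFF fusion module) in front of standard CenterNet prediction heads. The module internals, the 96-channel head input (borrowed from the Conv2 width given in Figure 4), and the exact fusion wiring are assumptions, not the authors' implementation.

```python
# Illustrative sketch only: module internals are placeholders inferred from
# the abstract, not the authors' code.
import torch.nn as nn

class FACenterNet(nn.Module):
    def __init__(self, backbone, fpt, aff, num_classes=1, channels=96):
        super().__init__()
        self.backbone = backbone  # EfficientNet-B3 multi-scale feature extractor
        self.fpt = fpt            # feature pyramid transformer combination module
        self.aff = aff            # attentional feature fusion module
        # Standard CenterNet heads: center heatmap, box size, center offset.
        self.heatmap = nn.Conv2d(channels, num_classes, 1)
        self.size = nn.Conv2d(channels, 2, 1)
        self.offset = nn.Conv2d(channels, 2, 1)

    def forward(self, x):
        x0, x1, x2, x3 = self.backbone(x)   # outputs of blocks 7, 5, 3, and 2
        fused = self.fpt([x0, x1, x2, x3])  # cross-scale scene-feature mining
        fused = self.aff(fused, x3)         # fuse shallow detail with semantics (assumed wiring)
        return self.heatmap(fused).sigmoid(), self.size(fused), self.offset(fused)
```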
Figure 1. FA-CenterNet network structure. FA-CenterNet uses EfficientNet-B3 as its backbone network, adding FPT and AFF modules. Compared to the original CenterNet, FA-CenterNet improves the accuracy of underwater holothurian detection while reducing FLOPs and Params. Blocks 7, 5, 3, and 2 are related by down-sampling with a stride of 2. For ease of describing the FPT implementation, the outputs of blocks 7, 5, 3, and 2 are named X0, X1, X2, and X3, respectively. It can be observed that the FPT module incorporates two distinct sets of features.
Figure 2. MBConv module.
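For readers unfamiliar with the block in Figure 2, here is a minimal PyTorch sketch of a standard MBConv block as used in EfficientNet: 1 × 1 expansion, depthwise convolution, squeeze-and-excitation, and 1 × 1 projection, with a residual connection when shapes match. The hyperparameter defaults are the usual EfficientNet choices, not values taken from the paper.

```python
import torch.nn as nn

class MBConv(nn.Module):
    def __init__(self, in_ch, out_ch, expand=6, kernel=3, stride=1, se_ratio=0.25):
        super().__init__()
        mid = in_ch * expand
        squeezed = max(1, int(in_ch * se_ratio))
        self.use_residual = stride == 1 and in_ch == out_ch
        self.expand = nn.Sequential(          # 1x1 expansion
            nn.Conv2d(in_ch, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.SiLU())
        self.depthwise = nn.Sequential(       # per-channel spatial convolution
            nn.Conv2d(mid, mid, kernel, stride, kernel // 2, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.SiLU())
        self.se = nn.Sequential(              # squeeze-and-excitation attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(mid, squeezed, 1), nn.SiLU(),
            nn.Conv2d(squeezed, mid, 1), nn.Sigmoid())
        self.project = nn.Sequential(         # 1x1 projection back down
            nn.Conv2d(mid, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch))

    def forward(self, x):
        h = self.depthwise(self.expand(x))
        h = h * self.se(h)                    # channel-wise reweighting
        h = self.project(h)
        return x + h if self.use_residual else h
```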
Figure 3. Holothurian scenes in the CURPC dataset. (a) Reefs and holothurians appear in the same scene. (b) Waterweeds and holothurians appear in the same scene. (c) Holothurians whose body features are blurred but which can be identified by their spines.
Figure 4. Improved structure of the FPT module. Different texture patterns represent different feature converters, and different colors represent feature maps of different scales. To describe the FPT module more succinctly, the outputs of blocks 7, 5, 3, and 2 are named X0, X1, X2, and X3. “Conv1” and “Conv2” on the right-hand side of the structure are 3 × 3 convolution modules with 192 and 96 output channels, respectively. (a) The FPT input, a feature pyramid consisting of two combinations. (b) The FPT design with its three transformers. (c) The FPT output stage, which controls the number of feature channels.
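The caption fixes the widths of the two output convolutions; a minimal sketch of that output stage follows. Only the 192 and 96 output channels come from the caption; the 384 input channels and the ReLU activations are assumptions.

```python
import torch.nn as nn

# Output stage of Figure 4c: two 3x3 convolutions reduce the combined
# pyramid features to 192 and then 96 channels.
fpt_output = nn.Sequential(
    nn.Conv2d(384, 192, kernel_size=3, padding=1),  # "Conv1" (input width assumed)
    nn.ReLU(inplace=True),
    nn.Conv2d(192, 96, kernel_size=3, padding=1),   # "Conv2"
    nn.ReLU(inplace=True),
)
```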
Figure 5. FPT feature interaction diagram.
Figure 6. Structure of the MS-CAM module.
Figure 7. Structure of the AFF module.
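Figures 6 and 7 follow the MS-CAM and AFF designs from the published Attentional Feature Fusion work: MS-CAM sums a point-wise local-context branch with a globally pooled branch and passes the result through a sigmoid, and AFF blends two same-shape feature maps X and Y as Z = M(X + Y) · X + (1 − M(X + Y)) · Y. A compact PyTorch sketch of both follows; the channel-reduction ratio r = 4 is an assumption.

```python
import torch.nn as nn

class MSCAM(nn.Module):
    """Multi-scale channel attention: local + global context (Figure 6)."""
    def __init__(self, channels, r=4):
        super().__init__()
        mid = channels // r
        def branch():  # bottleneck of two 1x1 convolutions
            return nn.Sequential(
                nn.Conv2d(channels, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(),
                nn.Conv2d(mid, channels, 1), nn.BatchNorm2d(channels))
        self.local_att = branch()                                 # per-pixel context
        self.global_att = nn.Sequential(nn.AdaptiveAvgPool2d(1),  # pooled context
                                        branch())
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.local_att(x) + self.global_att(x))

class AFF(nn.Module):
    """Attentional feature fusion of two same-shape feature maps (Figure 7)."""
    def __init__(self, channels, r=4):
        super().__init__()
        self.attention = MSCAM(channels, r)

    def forward(self, x, y):
        w = self.attention(x + y)   # fusion weights in (0, 1)
        return w * x + (1 - w) * y
```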
Figure 8. Impact of the score threshold. (a) Precision as a function of the score threshold. (b) Recall as a function of the score threshold. (c) F1-score as a function of the score threshold.
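The trade-off behind Figure 8 is the usual one: raising the score threshold tends to increase precision and decrease recall, so the F1-score, 2PR / (P + R), peaks at an intermediate threshold. A toy illustration (the numbers are invented, not taken from the paper):

```python
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

# Invented precision/recall pairs showing F1 peaking at a middle threshold.
for thr, p, r in [(0.1, 0.60, 0.95), (0.3, 0.80, 0.85), (0.5, 0.90, 0.70)]:
    print(f"threshold={thr:.1f}  P={p:.2f}  R={r:.2f}  F1={f1(p, r):.3f}")
```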
Figure 9. Waterweed falsely detected as a holothurian.
Figure 10. Visualized heat maps of the four models. (a) Original input pictures. (b) Heatmap visualization of CenterNet. (c) Heatmap visualization of CenterNet (B3). (d) Heatmap visualization of F-CenterNet. (e) Heatmap visualization of FA-CenterNet.
Figure 11. Performance of different detection methods on the CURPC 2020 dataset. (a) Original input pictures. (b) Results of SSD. (c) Results of YOLOv3. (d) Results of YOLOv4-tiny. (e) Results of YOLOv5-s. (f) Results of YOLOv5-l. (g) Results of FA-CenterNet.