Bass detection model based on improved YOLOv5 in circulating water system

  • Longqin Xu ,

    Contributed equally to this work with: Longqin Xu, Hao Deng

    Roles Software

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Hao Deng ,

    Contributed equally to this work with: Longqin Xu, Hao Deng

    Roles Conceptualization, Data curation, Writing – original draft

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Yingying Cao,

    Roles Investigation

    Affiliation College of Computer and Information Engineering, Tianjin Agricultural University, Tianjin, China

  • Wenjun Liu,

    Roles Validation

    Affiliation College of Computer and Information Engineering, Tianjin Agricultural University, Tianjin, China

  • Guohuang He,

    Roles Validation

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Wenting Fan,

    Roles Validation

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Tangliang Wei,

    Roles Validation

    Affiliation College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Liang Cao,

    Roles Methodology

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Tonglai Liu ,

    Roles Supervision, Writing – original draft

    tonglailiu@zhku.edu.cn (TL); shuangyinliu@zhku.edu.cn (SL)

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China

  • Shuangyin Liu

    Roles Conceptualization, Funding acquisition, Resources

    tonglailiu@zhku.edu.cn (TL); shuangyinliu@zhku.edu.cn (SL)

    Affiliations College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Intelligent Agriculture Engineering Technology Research Center of Guangdong Higher Education Institues, Zhongkai University of Agriculture and Engineering, Guangzhou, China, Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China, College of Computer and Information Engineering, Tianjin Agricultural University, Tianjin, China

Abstract

The feeding amount in bass farming is closely related to the number of bass, so knowing the number of bass is essential for accurate feeding and for improving the economic benefits of a farm. To address the interference that multiple targets and target occlusion cause in bass detection, this paper proposes a bass target detection model based on an improved YOLOv5 for circulating water systems. Firstly, images are acquired with HD cameras, and the Mosaic-8 data augmentation method is used to expand the datasets and improve the generalization ability of the model. The K-means clustering algorithm is applied to generate suitable prior-box coordinates and improve training efficiency. Secondly, the Coordinate Attention mechanism (CA) is introduced into the backbone feature extraction network and the neck feature fusion network to strengthen attention to targets of interest. Finally, the Soft-NMS algorithm replaces the Non-Maximum Suppression algorithm (NMS) to re-screen prediction boxes and keep targets with higher overlap, which effectively addresses the problems of missed detection and false detection. The experiments show that the proposed model reaches a detection accuracy of 98.09% and a detection time of 13.4 ms. The proposed model can help bass farmers using circulating water systems to accurately track the number of bass, which has important application value for accurate feeding and water conservation.

Introduction

Bass is an edible fish with high nutritional value. The amount of bait fed in bass culture is closely related to the number of bass [1]. If the feeding amount is too small, starving bass tend to injure each other, which reduces survival rates. If it is too large, water quality declines and resources are wasted. Therefore, monitoring the bass population supports scientific feeding and control of culture density [2], which is important for efficient bass culture. Traditional bass counting relies mainly on manual estimation, which is time-consuming and error-prone. With the application of computer vision technology in aquaculture, target detection algorithms are adopted to count accurately from biological image data acquired through HD cameras and other devices. Traditional target detection algorithms use image processing methods to extract features [3–6]. Zhang [7] applied image processing methods such as binarization, dilation and erosion to extract fry images and employed a connected area algorithm and a refinement algorithm to count bass in the images. Of the two, the refinement algorithm is more accurate for images with high overlap, but it requires high-quality image acquisition, and in practice most acquired images are not suitable for direct counting. After designing a tank with a stable flow rate, Wang [8] used local threshold segmentation to extract targets and calculated the total number of bass from non-overlapping regions in the images. In addition, four-neighborhood labeling [9], grayscale image analysis [10], and image noise reduction and segmentation [11] have also been applied to biological counting. Convolutional neural networks can greatly improve accuracy and generalization ability [12–15] and are widely used in the field of fish target detection [16–21].
Cui [22] proposed an improved YOLOv3 model to implement a counting system for puffer fish. The improved model effectively reduced missed and false detections in overlapping areas and reached a counting accuracy of 92.5%. Using local image training and transfer learning, Lu [23] proposed a lightweight YOLOv4-based automatic shrimp counting model with 92.12% counting accuracy. For high-resolution images, Chen [24] designed an adaptive cropping preprocessing algorithm to augment datasets and proposed a YOLOv5-based model, which achieved 92.55% accuracy. Li [25] improved VGG-19 (a convolutional neural network) to achieve 99.79% accuracy in evaluating iced pomfret freshness. Fish recognition methods based on convolutional neural networks have high recognition accuracy, but their inference speed is slow and cannot meet the demand of rapid multi-target detection. Zhao [26] proposed a fish detection method combining the visual attention mechanism SKNet with YOLOv5; it had a recognition accuracy of 98.86% and a recall rate of 96.64%, which are 2.14% and 2.29% higher than YOLOv5, respectively. Li [27] proposed a novel abnormal behavior detection method based on image fusion, in which BCS-YOLOv5 with image fusion achieved the best result, with an average accuracy of 96.69%. Zhao [28] proposed a high-precision and lightweight end-to-end target detection model based on deformable convolution and improved YOLOv4; the model has an accuracy of 95.47% while the parameter amount is reduced by 10 times and the FPS is doubled. To address the above problems, this paper proposes an improved YOLOv5 model for bass detection in recirculating water aquaculture systems, aiming to improve rapid localization ability and counting accuracy.
Taking bass as the research object, the proposed model introduces Coordinate Attention mechanism [29] based on YOLOv5 model to accurately locate and identify the targets of interest, and utilizes the Soft-NMS [30] algorithm instead of the original NMS [31] algorithm for selecting more suitable target boxes.

Materials and methods

Experimental steps of proposed work

The experimental steps of the proposed work are shown in Fig 1.

Data acquisition

A bass video data collection platform is built in the circulating water bass breeding workshop of Guangdong Provincial General Fisheries Technology Promotion Station in Nansha District, Guangzhou. It mainly includes a circular fish pond made of polypropylene plates, HD cameras and video recorders. As shown in Fig 2, the diameter of the fish pond is 2.7 meters. The datasets are collected from 7:00 to 18:00. To obtain complete top-view images of the fish pond, the cameras are installed directly above the center of the pond, 1 meter above the water surface. The cameras have 1080P resolution, a 2.8 mm focal length, a 130-degree wide-angle lens, and record MP4 video at 30 FPS. Frames are extracted from the acquired videos as JPG images, and 1066 images are extracted at different time periods and light intensities, as shown in Fig 3.

Data annotation

In this paper, the datasets are manually labeled with the labeling tool LabelImg in VOC dataset format, with the target label named fish, as shown in Fig 4. The annotation information is stored in an XML file that corresponds to the image file. Each line stores the information for one target, in order: target category, X and Y axis coordinates of the center point of the detection box, and target width and height.
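
As a minimal sketch of how such annotations can be turned into the per-target fields listed above, the following converts the corner coordinates stored in a VOC-style XML file into center coordinates, width and height. The function name `voc_to_center_format`, the normalization by image size, and the sample XML are assumptions made for illustration and are not taken from the paper.

```python
import xml.etree.ElementTree as ET


def voc_to_center_format(xml_text, class_names=("fish",)):
    """Return one (class_id, x_center, y_center, width, height) tuple per target.

    Values are normalized by the image size (an assumption for this sketch).
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    rows = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.find("name").text)
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        rows.append((
            class_id,
            (xmin + xmax) / 2 / img_w,  # x coordinate of the box center
            (ymin + ymax) / 2 / img_h,  # y coordinate of the box center
            (xmax - xmin) / img_w,      # box width
            (ymax - ymin) / img_h,      # box height
        ))
    return rows


# Hypothetical annotation for one 1920x1080 frame containing a single fish.
SAMPLE_XML = """<annotation>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>fish</name>
    <bndbox><xmin>480</xmin><ymin>270</ymin><xmax>960</xmax><ymax>540</ymax></bndbox>
  </object>
</annotation>"""

rows = voc_to_center_format(SAMPLE_XML)
```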

Data augmentation

The YOLOv5 model applies the Mosaic data augmentation method, which randomly scales and crops four images and then concatenates them to form a single image. Because four images are processed at a time during the normalization operation, this method can reduce the memory requirement and speed up training. Building on this idea, an improved data augmentation method, Mosaic-8, is adopted in this paper. Mosaic-8 increases the number of images processed from 4 to 8 [32], while random noise and random brightness are introduced to improve robustness and generalization, as shown in Fig 5.

Improved YOLOv5 bass target detection algorithm

Principle of YOLOv5s algorithm.

In this paper, lightweight detection model YOLOv5s is applied, which has characteristics of fast detection, high performance, flexibility and rapid deployment. YOLOv5s consists of a backbone feature extraction network (Backbone), a neck feature fusion network (Neck), and a prediction layer (Prediction).

The backbone feature extraction network employs a CSPDarknet network structure that mainly consists of a residual convolution layer (Bottleneck) and a CSPNet network structure, which can increase the depth of networks and enhance the learning ability of convolutional neural networks. In addition, the backbone feature extraction network introduces a Focus network structure to slice and stack the input images, expand the number of channels, decrease parameters and reduce memory usage during training. It also introduces an SPPF network structure, which employs maximum pooling with different kernel sizes to enlarge the receptive field.

Inspired by feature pyramids, the neck feature fusion network consists of an FPN [33] layer and a PAN [34] layer. The FPN layer performs an up-sampling operation to deliver strong semantic information from the top down, while the PAN layer performs a down-sampling operation to deliver strong localization information from the bottom up. Together they aggregate features from different backbone feature layers to different prediction layers.

The prediction layer consists of three prediction feature layers, which aim to detect targets of different sizes. The detection results output from the prediction layer are filtered by non-maximal suppression (NMS) to obtain the highest scoring detection boxes. The structure of the YOLOv5s network is shown in Fig 6.

Coordinate attention.

Owing to environmental constraints and fixed camera shooting angles, fish overlap and occlusion often occur, resulting in low detection accuracy. Therefore, in this paper, the Coordinate Attention mechanism (CA), which combines location information with channel information, is added to capture not only cross-channel information but also direction-aware and location-sensitive information. The method helps to locate and identify the target of interest and improves accuracy in the case of overlap and occlusion.

As shown in Fig 7, the CA mechanism is divided into two main steps, namely coordinate information embedding and coordinate attention generation, which aim to embed location information into channel attention.
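
The first of these two steps can be illustrated with a small sketch. Following the description of coordinate attention in [29], coordinate information embedding pools each channel along the width and along the height separately, so that each pooled vector keeps position information along one axis. The nested-list representation and the function name `coordinate_embedding` are assumptions for illustration; the attention generation step (shared convolution and sigmoid gating) is deliberately left out.

```python
def coordinate_embedding(feature_map):
    """Pool a feature map shaped [C][H][W] along each spatial axis.

    Returns (z_h, z_w), where z_h[c][h] is the mean over the width at row h
    and z_w[c][w] is the mean over the height at column w.
    """
    z_h, z_w = [], []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        # Average over the width: one value per row keeps vertical position.
        z_h.append([sum(row) / width for row in channel])
        # Average over the height: one value per column keeps horizontal position.
        z_w.append([
            sum(channel[h][w] for h in range(height)) / height
            for w in range(width)
        ])
    return z_h, z_w


# One channel with two rows and three columns.
feature_map = [[[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]]]
z_h, z_w = coordinate_embedding(feature_map)
```

In contrast to global average pooling, which would collapse this channel to a single number, the two pooled vectors still indicate where along each axis the activations are large.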

Improved backbone feature extraction network.

To extract features at different scales in the backbone feature extraction network, the CA mechanism is inserted into the C3 module, and the CBS module on the other branch is removed. As shown in Fig 8, the C3 module is replaced by the resulting C3CA module, which can effectively reduce parameters and make the model more lightweight.

Improved neck feature fusion network.

Similarly, the neck feature fusion network incorporates the CA mechanism to obtain multi-scale feature information, which can help to locate targets of interest.

The framework diagram of the improved YOLOv5 bass detection model is shown in Fig 9.

Loss function.

The loss function of the improved YOLOv5 target detection model is shown in Eq 1. It consists of a bounding box regression loss $L_{bbox}$, a confidence loss $L_{cfd}$ and an object class loss $L_{cls}$.

$$L = L_{bbox} + L_{cfd} + L_{cls} \tag{1}$$

The bounding box regression loss $L_{bbox}$ employs CIoU, which takes into account the overlap area, the central point distance and the aspect ratio between ground-truth boxes and predicted boxes, making the detection accuracy higher, as shown in Eqs 2, 3, 4, 5 and 6.

$$L_{bbox} = 1 - CIoU \tag{2}$$

$$CIoU = IoU - \frac{\rho^{2}(b, b^{gt})}{c^{2}} - \alpha v \tag{3}$$

$$IoU = \frac{|A \cap B|}{|A \cup B|} \tag{4}$$

$$\alpha = \frac{v}{(1 - IoU) + v} \tag{5}$$

$$v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2} \tag{6}$$

where IoU denotes the ratio of the intersection of the prediction box A and the ground-truth box B to their union. $\rho$ denotes the Euclidean distance between the center point $b$ of the prediction box and the center point $b^{gt}$ of the ground-truth box, and $c$ represents the diagonal length of the minimum enclosing box. $\alpha$ is a weight parameter, and $v$ measures the consistency of the aspect ratio between the ground-truth box and the predicted box. $h$ and $w$ denote the height and width of the prediction box, and $h^{gt}$ and $w^{gt}$ denote the height and width of the ground-truth box.
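
To make the roles of the overlap, center-distance and aspect-ratio terms concrete, here is a small sketch of the standard CIoU computation for two boxes given as (center x, center y, width, height). The function name `ciou` and the box format are choices made for this example, not the paper's implementation.

```python
import math


def ciou(box, box_gt):
    """Return the CIoU between a predicted box and a ground-truth box.

    Both boxes are (x_center, y_center, width, height) with positive sizes.
    The corresponding regression loss is 1 - CIoU.
    """
    x, y, w, h = box
    xg, yg, wg, hg = box_gt

    # Corner coordinates of both boxes.
    x1, y1, x2, y2 = x - w / 2, y - h / 2, x + w / 2, y + h / 2
    xg1, yg1, xg2, yg2 = xg - wg / 2, yg - hg / 2, xg + wg / 2, yg + hg / 2

    # Overlap term: intersection over union.
    inter_w = max(0.0, min(x2, xg2) - max(x1, xg1))
    inter_h = max(0.0, min(y2, yg2) - max(y1, yg1))
    inter = inter_w * inter_h
    union = w * h + wg * hg - inter
    iou = inter / union

    # Distance term: squared center distance over squared enclosing diagonal.
    rho2 = (x - xg) ** 2 + (y - yg) ** 2
    c2 = (max(x2, xg2) - min(x1, xg1)) ** 2 + (max(y2, yg2) - min(y1, yg1)) ** 2

    # Aspect-ratio term and its weight.
    v = 4 / math.pi ** 2 * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return iou - rho2 / c2 - alpha * v


same = ciou((5.0, 5.0, 4.0, 2.0), (5.0, 5.0, 4.0, 2.0))
apart = ciou((1.0, 1.0, 2.0, 2.0), (5.0, 1.0, 2.0, 2.0))
```

For identical boxes the value is 1, so the loss is 0. For two disjoint boxes the IoU is 0, yet the distance term still gives a negative value that depends on how far apart the centers are, which is what lets the loss provide a useful signal before the boxes overlap.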

The confidence loss Lcfd is shown as Eq 7. (7) where K × K denotes the grid parameters of the output layer, M denotes the number of prediction boxes corresponding to each output grid, C is the total number of categories, and λnoobj is a penalty weight.

The object class loss Lcls is shown as Eq 8. (8) where P is the classification probability, Pi(c) is the probability that the i-th sample is of class c, and is the probability that the i-th sample is predicted to be of class c.

Soft-NMS.

In the post-processing stage of YOLOv5, Non-Maximum Suppression (NMS) is used to remove most of the duplicate prediction boxes and reduce the incidence of false detection. Prediction boxes are first sorted by score and then filtered according to the preset threshold, and the highest-scoring prediction boxes are output, as shown in Eq 9.

$$s_k = \begin{cases} s_k, & A_{IoU}(M, b_k) < N_j \\ 0, & A_{IoU}(M, b_k) \geq N_j \end{cases} \tag{9}$$

where $s_k$ is the score of the k-th prediction box, $N_j$ is the preset threshold, $M$ is the current highest-scoring prediction box, $b_k$ is the k-th prediction box to be filtered, and $A_{IoU}$ is the overlap (IoU) between $M$ and $b_k$.

When two fish are close to each other in a densely populated image, NMS keeps only the highest-scoring prediction box and deletes the low-scoring prediction boxes, resulting in low recall and missed detection. Therefore, the Soft-NMS method is utilized to replace the traditional NMS in this paper, as shown in Eq 10. (10)

Soft-NMS resets scores based on the overlap of similar detection boxes and retains the prediction boxes for next screening, solving the problem that traditional NMS directly removes low-scoring prediction boxes that lead to missed detection.
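
The difference between the two procedures can be sketched in a few lines. The example below follows the general Soft-NMS idea from [30], in which the scores of overlapping boxes are decayed instead of being set to zero. Which decay the paper uses is not recoverable from the text here, so both the Gaussian and the linear variants are shown; `sigma`, `iou_thresh` and `score_thresh` are illustrative values rather than the paper's settings.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def soft_nms(boxes, scores, method="gaussian", sigma=0.5,
             iou_thresh=0.5, score_thresh=0.001):
    """Return kept (box, score) pairs; method is 'gaussian', 'linear' or 'hard'."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Pick the current highest-scoring box M.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))

        rescored = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            if method == "gaussian":
                score *= math.exp(-(overlap ** 2) / sigma)  # smooth decay
            elif method == "linear":
                if overlap >= iou_thresh:
                    score *= 1 - overlap                    # proportional decay
            elif overlap >= iou_thresh:
                score = 0.0                                 # traditional NMS
            if score > score_thresh:
                rescored.append((box, score))
        candidates = rescored
    return kept


# Two heavily overlapping fish and one isolated fish (hypothetical boxes).
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
hard_kept = soft_nms(boxes, scores, method="hard")
soft_kept = soft_nms(boxes, scores, method="gaussian")
```

With these inputs, traditional NMS discards the second box because its IoU with the first exceeds the threshold, while the Gaussian variant keeps it with a reduced score, so a later confidence threshold decides whether it counts as a second fish.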

Experimental results and analysis

Experimental platform

All experiments are conducted on Ubuntu 16.04 with PyTorch 1.7.1. The GPU is a GeForce RTX 3090 with 32 GB of memory, and the CPU is an Intel(R) Xeon(R) Gold 6146 @ 2.40 GHz.

The original YOLOv5s model introduces prior boxes to speed up training. In this experiment, the K-means algorithm is applied to cluster the bounding boxes by size, and 9 clustering centers are regenerated as prior boxes. The coordinates of the clustering centers are (12,72), (46,27), (16,81), (21,76), (44,39), (28,63), (20,97), (41,58), and (35,79), as shown in Fig 10.
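
The clustering step can be sketched as follows. The paper does not state the distance measure or the initialization, so this example assumes the 1 - IoU distance that is common for anchor clustering in the YOLO family and a deterministic, area-sorted initialization; the function names and the toy box sizes are hypothetical.

```python
def wh_iou(a, b):
    """IoU of two boxes given only as (width, height), aligned at one corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(sizes, k, iterations=50):
    """Cluster (width, height) pairs into k prior boxes, sorted by area."""
    # Deterministic initialization: evenly spaced samples after sorting by area.
    ordered = sorted(sizes, key=lambda s: s[0] * s[1])
    step = len(ordered) / k
    centers = [ordered[int(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assign each box to the center with the highest IoU (smallest 1 - IoU).
        clusters = [[] for _ in range(k)]
        for size in sizes:
            nearest = max(range(k), key=lambda i: wh_iou(size, centers[i]))
            clusters[nearest].append(size)

        # Move each center to the mean width and height of its members.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(center)
        if new_centers == centers:
            break
        centers = new_centers

    return sorted(centers, key=lambda c: c[0] * c[1])


# Toy label sizes forming three obvious groups (not the paper's data).
sizes = [(10, 20), (12, 22), (11, 21),
         (40, 40), (42, 38), (41, 39),
         (20, 90), (22, 88), (21, 89)]
anchors = kmeans_anchors(sizes, k=3)
```

Running the same procedure with k = 9 on the labeled bass boxes would yield nine prior boxes, three per prediction layer.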

Evaluation metrics

The average precision (AP) and F1 are employed as evaluation metrics, where AP is the area under the Precision-Recall curve, as shown in Eqs 11, 12 and 13.

$$P = \frac{TP}{TP + FP} \tag{11}$$

$$R = \frac{TP}{TP + FN} \tag{12}$$

$$AP = \int_{0}^{1} P(R)\, dR \tag{13}$$

where P denotes precision, the proportion of predictions that are correct, and R denotes recall, the proportion of actual targets that are detected. TP is the number of correct predictions, FP is the number of incorrect predictions, and FN is the number of targets that are missed.

F1 is the harmonic mean of precision and recall, as shown in Eq 14.

$$F1 = \frac{2 \times P \times R}{P + R} \tag{14}$$
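
As an illustration of how these metrics are obtained from detection results, the sketch below computes precision, recall and F1 from counts, and AP from a ranked list of detections. The all-point interpolation of the Precision-Recall curve is an assumption, since the text does not say which interpolation is used, and the detections are made up for the example.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def average_precision(detections, num_ground_truth):
    """Area under the Precision-Recall curve.

    detections is a list of (confidence, is_true_positive) pairs.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Make precision non-increasing from right to left (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the interpolated curve.
    area, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        area += (r - previous_recall) * p
        previous_recall = r
    return area


# Four detections ranked by confidence; four fish are actually present.
detections = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
ap = average_precision(detections, num_ground_truth=4)
p, r, f1 = precision_recall_f1(tp=3, fp=1, fn=1)
```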

In addition, experiments also analyze the model performance in terms of detection time, number of parameters, and model sizes.

Model training and hyperparameter setting.

To verify the performance of the proposed model, it is compared with the target detection models YOLOv3-tiny, YOLOv3, YOLOv4, YOLOv7 and YOLOv5s. The total number of training iterations is 200, the batch size is 32, and the optimizer is Adam with its default parameters. The learning rate is set to 0.01, and the cosine annealing learning rate adjustment strategy is used. The training loss and validation loss of each iteration are recorded and plotted, and the model with the lowest loss on the validation set is saved as the training result. The training and validation losses of the different models are shown in Fig 11.
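
For readers unfamiliar with the schedule, a minimal sketch of cosine annealing over the 200 iterations is given below. The initial rate of 0.01 comes from the text; the final rate `lr_min = 0.0` and the function name are assumptions, and the exact formula used in the experiments may differ.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=0.01, lr_min=0.0):
    """Learning rate that follows half a cosine wave from lr_max down to lr_min."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))


# One learning rate per iteration, including the end point.
schedule = [cosine_annealing_lr(step, 200) for step in range(201)]
```

The rate decreases slowly at the start, fastest around the midpoint, and slowly again near the end, which allows large early updates and fine adjustments late in training.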

Comparative analysis of different target detection models

As shown in Table 1, compared with YOLOv5, YOLOv5-CA-Soft-NMS shows a 2.91% increase in AP, an 8.03% decrease in the number of parameters, and a 1.1 ms decrease in detection time. This shows that with the introduction of the C3CA module, the backbone feature extraction network can effectively extract location features and channel features to focus on the region of interest. Compared with YOLOv5-CA, the detection time of YOLOv5-CA-Soft-NMS increases by 1.6 ms, but the model still meets the requirements of fast detection while achieving higher accuracy. This indicates that although Soft-NMS has more computational overhead than NMS, it is more effective in detecting dense, overlapping bass targets.

In terms of accuracy, YOLOv5-CA-Soft-NMS improves AP by 3.21% and 0.43% compared with YOLOv3-tiny and YOLOv4, respectively. However, compared with YOLOv3, AP is reduced by 0.59%. It is possible that YOLOv3 classifies incorrect targets as correct predictions, resulting in higher AP.

In terms of performance, compared with YOLOv5-CA-Soft-NMS, YOLOv3-tiny, YOLOv3, YOLOv4 and YOLOv7 have 1.3 times, 9.69 times, 8.15 times and 5.76 times higher numbers of parameters, 1.31 times, 15.3 times, 9.45 times and 5.63 times higher model sizes, and 1.26 times, 3.4 times, 2.6 times and 1.34 times higher detection times, respectively. As shown in Table 1, YOLOv5-CA-Soft-NMS has a lower number of parameters, smaller model weights and better detection speed.

Comparative analysis of detection effect of different models

To verify the effectiveness of detection, YOLOv5-CA-Soft-NMS and other models are compared on the test set, as shown in Fig 12.

Fig 12. Comparison of detection effects of different models.

https://doi.org/10.1371/journal.pone.0283671.g012

In region A of Fig 12(a), YOLOv5-CA successfully detects all targets that cross-obscure each other, and all other models have missed detection. In region B of Fig 12(a), YOLOv5-CA-Soft-NMS successfully detects all parallel-obscured targets, while all other models have missed detection phenomena.

In region C of Fig 12(b), YOLOv5-CA successfully detects all densely overlapping targets, and all other models suffer from false detection.

In region E of Fig 12(c), there are partially overlapping targets and a large number of independent targets. YOLOv5-CA-Soft-NMS successfully detects all targets, while the other models have missed or false detections. Region F of Fig 12(c) contains targets with a high degree of overlap, and YOLOv5-CA-Soft-NMS and YOLOv5-CA successfully detect all of them, while the other models have missed detections.

In summary, YOLOv5-CA-Soft-NMS detects targets better than other models and effectively improves accuracy in the case of target occlusion.

Conclusions

For the problems of multiple target occlusion and overlap in fish dense farming scenarios, this paper proposes a bass target detection model based on improved YOLOv5.

To effectively detect bass targets, the CA mechanism is inserted into the backbone feature extraction network and the neck feature fusion network of YOLOv5 to capture cross-channel information together with direction-aware and location-sensitive information. The average precision of model detection is improved by 2.46% compared with the YOLOv5s model.

To effectively retain targets with high overlap degree and improve the recognition accuracy, Soft-NMS replaces NMS to re-screen prediction boxes. The detection accuracy is improved by 2.91% compared with YOLOv5s.

In this paper, we propose a bass detection model, YOLOv5-CA-Soft-NMS, which has a detection speed of 13.4ms and a detection accuracy of 98.09%. Compared with YOLOv3-tiny, YOLOv3, YOLOv4, YOLOv5 and YOLOv7, it has higher detection accuracy and good detection speed. The proposed model is more suitable for rapid detection of bass population, so that farmers can feed bait scientifically, control cost and save resources, which has high practical application value.

Acknowledgments

The authors would like to thank Prof. Shuangyin Liu, Dr. Tonglai Liu for revising the manuscript.

References

  1. Zhao J. Key points of healthy breeding technology for perch[J]. Fishery Guide to be Rich, 2020, (23): 49–51.
  2. Li K, Jiang X, Chen E, et al. Auto-counting the Eel Anguilla in recirculating aquaculture system via deep learning[J]. Oceanologia et Limnologia Sinica, 2022, 53(03): 664–674.
  3. Awalludin E A, Muhammad W N A W, Arsad T N T, et al. Fish larvae counting system using image processing techniques[C]//Journal of Physics: Conference Series. IOP Publishing, 2020, 1529(5): 052040.
  4. Kesvarakul R, Chianrabutra C, Chianrabutra S. Baby shrimp counting via automated image processing[C]//Proceedings of the 9th International Conference on Machine Learning and Computing. 2017: 352–356.
  5. Albuquerque P L F, Garcia V, Junior A S O, et al. Automatic live fingerlings counting using computer vision[J]. Computers and Electronics in Agriculture, 2019, 167: 105015.
  6. Klapp I, Arad O, Rosenfeld L, et al. Ornamental fish counting by non-imaging optical system for real-time applications[J]. Computers and Electronics in Agriculture, 2018, 153: 126–133.
  7. Zhang J, Pang H, Cai W, et al. Using image processing technology to create a novel fry counting algorithm[J]. Aquaculture and Fisheries, 2022, 7(4): 441–449.
  8. Wang W, Xu J, Du Q. Study on a computer vision based automatic counting system of fries[J]. Fishery Modernization, 2016, 43(03): 34–38,73.
  9. Liu S, Chen J, Liu X, et al. Study on Chlorella automatic counting based on the algae fluorescence excitation effect[J]. Fishery Modernization, 2012, 39(05): 16–20.
  10. Fan L, Liu Y. Automate fry counting using computer vision and multi-class least squares support vector machine[J]. Aquaculture, 2013, 380: 91–98.
  11. Huang L, Hu B, Cao N. The Novel Fries-Counting Method Based on Image Processing[J]. Hubei Agricultural Sciences, 2012, 51(09): 1880–1882.
  12. Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. Advances in Neural Information Processing Systems, 2015, 28.
  13. Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 779–788.
  14. Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European conference on computer vision. Springer, Cham, 2016: 21–37.
  15. Bai Q, Gao R, Zhao C, et al. Multi-scale behavior recognition method for dairy cows based on improved YOLOV5s network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(12): 163–172.
  16. Chen Y Y, Gong C Y, Liu Y Q. Fish identification method based on FTVGG16 convolutional neural network[J]. Trans. Chin. Soc. Agric. Mach, 2019, 50: 223–231.
  17. Cai K, Miao X, Wang W, et al. A modified YOLOv3 model for fish detection based on MobileNetv1 as backbone[J]. Aquacultural Engineering, 2020, 91: 102117.
  18. Hong Khai T, Abdullah S N H S, Hasan M K, et al. Underwater Fish Detection and Counting Using Mask Regional Convolutional Neural Network[J]. Water, 2022, 14(2): 222.
  19. Kandimalla V, Richard M, Smith F, et al. Automated detection, classification and counting of fish in fish passages with deep learning[J]. Frontiers in Marine Science, 2022: 2049.
  20. Lainez S M D, Gonzales D B. Automated fingerlings counting using convolutional neural network[C]//2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS). IEEE, 2019: 67–72.
  21. Li D, Miao Z, Peng F, et al. Automatic counting methods in aquaculture: A review[J]. Journal of the World Aquaculture Society, 2021, 52(2): 269–283.
  22. Cui Z. Based on computer vision the counting system of puffer fish[D]. Dalian Ocean University, 2020.
  23. Zhang L, Zhou X, Li B, et al. Automatic shrimp counting method using local images and lightweight YOLOv4[J]. Biosystems Engineering, 2022, 220: 39–54.
  24. Chen Z, Li Z, Yang Z, et al. Research on YOLOv5-based object detection method for factory farmed shrimp[J]. Marine Fisheries: 1–13.
  25. Li Z, Li M, Zhao Y, et al. Iced pomfret freshness evaluation method based on improved VGG-19 convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(22): 286–294.
  26. Zhao M, Yu H, Li H, et al. Detection of fish stocks by fused with SKNet and YOLOv deep learning[J]. Journal of Dalian Ocean University, 2022, 37(2): 312–319.
  27. Li X, Hao Y, Zhang P, et al. A novel automatic detection method for abnormal behavior of single fish using image fusion[J]. Computers and Electronics in Agriculture, 2022, 203: 107435.
  28. Zhao S, Zhang S, Lu J, et al. A lightweight dead fish detection method based on deformable convolution and YOLOV4[J]. Computers and Electronics in Agriculture, 2022, 198: 107098.
  29. Hou Q, Zhou D, Feng J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021: 13713–13722.
  30. Bodla N, Singh B, Chellappa R, et al. Soft-NMS: Improving object detection with one line of code[C]//2017 IEEE International Conference on Computer Vision (ICCV). 2017: 5562–5569.
  31. Neubeck A, Van Gool L. Efficient non-maximum suppression[C]//18th International Conference on Pattern Recognition (ICPR’06). IEEE, 2006, 3: 850–855.
  32. Guo L, Wang Q, Xue W, et al. A Small Object Detection Algorithm Based on Improved YOLOv5[J]. Journal of University of Electronic Science and Technology of China, 2022, 51(02): 251–258.
  33. Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 2117–2125.
  34. Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 8759–8768.