Abstract: To detect young apple fruits quickly and accurately in the natural environment, an improved YOLO v7 model (YOLO v7-ECA) was proposed to address the high similarity between young apple fruits and leaves, as well as their small size, dense distribution and difficulty of identification. By inserting the ECA mechanism into the three re-parameterized paths of the model, local cross-channel interaction among adjacent channels was performed without channel dimensionality reduction, which effectively emphasized the important information of young apple fruits, suppressed redundant and useless features, and improved the efficiency of the model. A total of 2557 images of young apple fruits collected in the natural environment were used as training samples, 547 images as validation samples and 550 images as test samples, and they were fed into the model for training and testing. The trained YOLO v7-ECA model achieved a precision of 97.2%, a recall of 93.6%, an mAP of 98.2% and an F1 value of 95.37%. Compared with the Faster R-CNN, SSD, Scaled-YOLO v4, YOLO v5, YOLO v6 and YOLO v7 models, its mAP was increased by 15.5, 4.6, 1.6, 1.8, 3.0 and 1.8 percentage points, its precision by 49.7, 0.9, 18.5, 1.2, 0.9 and 1.0 percentage points, and its F1 value by 33.53, 2.81, 9.16, 1.26, 2.38 and 1.43 percentage points, respectively; its recall was increased by 5.0, 4.5, 1.3, 3.7 and 1.8 percentage points over the Faster R-CNN, SSD, YOLO v5, YOLO v6 and YOLO v7 models, respectively. The detection time per image was 28.9 ms, enabling efficient detection of young apple fruits. To evaluate the robustness of the model against blur, shadow and severe occlusion of young fruit targets, 550 test images were used. With added noise and blur, the mAP of YOLO v7-ECA was 91.1% and its F1 value was 89.8%; compared with the Faster R-CNN, SSD, Scaled-YOLO v4, YOLO v5, YOLO v6 and YOLO v7 models, its mAP was increased by 26.3, 21.0, 5.4, 8.0, 11.5 and 8.9 percentage points, and its F1 value by 27.19, 7.08, 8.50, 4.20, 3.94 and 4.67 percentage points, respectively. Under shadow conditions, the mAP of YOLO v7-ECA was 97.5% and its F1 value was 95.36%; compared with the Faster R-CNN, SSD, Scaled-YOLO v4, YOLO v5, YOLO v6 and YOLO v7 models, its mAP was increased by 14.8, 8.8, 2.1, 2.4, 5.4 and 2.5 percentage points, and its F1 value by 21.51, 2.60, 10.49, 1.53, 3.23 and 2.56 percentage points, respectively. Under severe occlusion, the mAP of YOLO v7-ECA was 98.6% and its F1 value was 94.8%; compared with the Faster R-CNN, SSD, Scaled-YOLO v4, YOLO v5, YOLO v6 and YOLO v7 models, its mAP was increased by 21.7, 13.7, 2.3, 2.4, 4.8 and 2.2 percentage points, and its F1 value by 28.29, 3.50, 6.45, 0.96, 1.36 and 1.36 percentage points, respectively. Experiments showed that the proposed model achieved high accuracy and speed, and remained robust under interference conditions such as blur, shadow and severe occlusion. The results can provide an effective reference for young apple fruit detection systems.
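For reference, the sketch below shows a minimal ECA channel-attention block as it is commonly implemented in PyTorch, following the standard ECA-Net formulation (global average pooling followed by a 1D convolution over adjacent channels, with no dimensionality reduction). The class name, the adaptive-kernel hyperparameters (gamma = 2, b = 1) and the feature-map size in the usage example are illustrative assumptions; the abstract only states that the module is inserted into the three re-parameterized paths of YOLO v7, so the placement shown here is not the paper's exact configuration.

```python
import math
import torch
import torch.nn as nn


class ECA(nn.Module):
    """Efficient Channel Attention: local cross-channel interaction via a 1D
    convolution over adjacent channels, without channel dimensionality reduction."""

    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Adaptive kernel size (ECA-Net): grows with log2(C) and is forced to be odd.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> per-channel descriptor of shape (N, C, 1, 1).
        y = self.avg_pool(x)
        # Treat channels as a 1D sequence so the conv mixes only neighbouring channels.
        y = self.conv(y.squeeze(-1).transpose(-1, -2))       # (N, 1, C)
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))  # (N, C, 1, 1)
        # Re-weight channels: emphasize informative fruit features, suppress redundant ones.
        return x * y


if __name__ == "__main__":
    # Illustrative usage: attach ECA to a 512-channel feature map, as one might do
    # on a re-parameterized path of the detector (placement assumed, not from the paper).
    feat = torch.randn(1, 512, 40, 40)
    print(ECA(512)(feat).shape)  # torch.Size([1, 512, 40, 40])
```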