Abstract: To advance the intelligence of fishery equipment, fish feeding behaviour recognition based on video streams has received extensive attention in recent years. Traditional video-stream recognition models are too complex to deploy on edge computing devices. To address this problem, a lightweight 2D-convolutional motion feature extraction network, Motion-EfficientNetV2, was proposed, which effectively recognizes fish feeding behaviour from video stream input. The proposed model used EfficientNetV2 as the backbone network, constructed a motion feature extraction module, Motion, based on TEA and ECANet, and embedded the Motion module into each Fused-MBConv module of EfficientNetV2 to give the backbone the ability to extract motion features. The MBConv module in EfficientNetV2 was also improved with ECANet to enhance its channel feature extraction capability. Dilated (atrous) convolution was used in Motion-EfficientNetV2 to expand the receptive field and improve wide-range feature extraction. The experimental results showed that, after introducing the designed Motion module and the other improvements, the number of parameters and the FLOPs of Motion-EfficientNetV2 were 9×10⁶ and 1.31×10¹⁰, respectively, both lower than those of EfficientNetV2. Comparison experiments on the same dataset against TSN-ResNet50, TSN-EfficientNetV2, C3D, and R3D showed that the proposed algorithm achieved an accuracy of 93.97%, with fewer parameters and FLOPs than the other models. The proposed model can therefore effectively identify fish feeding behaviour and help aquaculturists develop fish feeding strategies.
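Two of the design choices summarized above reduce to simple arithmetic, sketched below. The first function gives the span covered by a convolution kernel whose taps are spaced apart (commonly called dilated or atrous convolution), which is how the receptive field is expanded without adding parameters. The second gives the adaptive 1D kernel size used for channel attention in the original ECA-Net formulation. The function names are hypothetical, and the dilation rates, channel counts, and the defaults gamma = 2 and b = 1 are illustrative assumptions; the abstract does not state the values used in Motion-EfficientNetV2.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input positions) covered along one axis by a kernel
    whose taps are spaced `dilation` positions apart."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be >= 1")
    # (kernel_size - 1) gaps between taps, each widened by (dilation - 1).
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Adaptive 1D convolution size for ECA channel attention:
    the nearest odd number derived from log2(channels)/gamma + b/gamma.
    gamma and b are the ECA-Net defaults, assumed here for illustration."""
    if channels < 1:
        raise ValueError("channels must be >= 1")
    t = int(abs((math.log2(channels) + b) / gamma))
    # Kernel sizes must be odd so the 1D convolution stays centred.
    return t if t % 2 else t + 1


if __name__ == "__main__":
    for d in (1, 2, 4):
        print("3-tap kernel, dilation", d, "covers", effective_kernel_size(3, d))
    for c in (64, 256, 512):
        print("channels", c, "ECA kernel", eca_kernel_size(c))
```

A 3×3 kernel keeps its nine weights at any dilation rate, so the wider coverage comes at no parameter cost, which is consistent with the lightweight goal stated in the abstract.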