Abstract: Machine learning models have been applied to monitor crop growth conditions and estimate crop yield, but the internal mechanisms of complex models are difficult to understand. To estimate crop yield accurately while providing understandable explanations, LightGBM was used to develop yield estimation models for winter wheat in the Guanzhong Plain, PR China, based on the vegetation temperature condition index (VTCI). Interpretability methods, namely local interpretable model-agnostic explanation (LIME), submodular pick-LIME, partial dependence plot (PDP), and individual conditional expectation (ICE), were then applied at global and local scales to interpret the yield estimation models. In the comparison with other models, LightGBM optimized by grid search yielded an R² of 0.32 between the estimated yield and the official yield records of winter wheat, an RMSE of 809.10 kg/hm², and an MRE of 16.55%; the relationship reached the extremely significant level (P<0.01), indicating that the model had high prediction precision and strong generalization ability. The interpretability experiments showed that the model could extract knowledge from the data. In the global interpretation, VTCI at the jointing stage was the most important for yield formation, followed by VTCI at the heading to filling stage and VTCI at the dough stage, while VTCI at the turning green stage had the least effect, which was consistent with prior knowledge. In the local interpretation, given that winter wheat yield was high in the west and low in the east, the local interpretability methods further explained the differences in yield formation among counties (districts), providing a reference for field management in the Guanzhong Plain, PR China. These methods have application value for increasing and stabilizing the yield of winter wheat.
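
As an illustration of the global and local interpretation tools named in the abstract, the following minimal sketch computes ICE curves and a PDP by hand. It is not the authors' implementation: `toy_model` is a hypothetical stand-in for a fitted LightGBM model, its weights and the sample VTCI values are invented for illustration, and the feature order (turning green, jointing, heading to filling, dough) is an assumption.

```python
from statistics import mean


def toy_model(row):
    """Hypothetical yield model (kg/hm^2); weights are illustrative only."""
    turning_green, jointing, heading_filling, dough = row
    return (3000.0 + 400.0 * turning_green + 2000.0 * jointing
            + 1500.0 * heading_filling + 800.0 * dough)


def ice_curves(predict, rows, feature, grid):
    """ICE: one curve per sample, varying `feature` over `grid`
    while all other features of that sample stay fixed."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)       # copy so the input is not mutated
            modified[feature] = value
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, rows, feature, grid):
    """PDP: the pointwise average of the ICE curves over all samples."""
    curves = ice_curves(predict, rows, feature, grid)
    return [mean(column) for column in zip(*curves)]


# Invented VTCI values for three samples (one row per county-year).
samples = [
    [0.5, 0.6, 0.7, 0.4],
    [0.3, 0.4, 0.5, 0.6],
    [0.8, 0.7, 0.6, 0.5],
]
grid = [0.0, 0.5, 1.0]
JOINTING = 1  # assumed column index of VTCI at the jointing stage

ice = ice_curves(toy_model, samples, JOINTING, grid)
pdp = partial_dependence(toy_model, samples, JOINTING, grid)
```

Each ICE curve shows how the prediction for one sample responds to the chosen feature (a local view), while the PDP averages those curves into a single global trend. With a real model, `toy_model` would be replaced by the fitted model's prediction function.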