Abstract: Traditional methods of measuring the potassium content of citrus leaves are time-consuming, operationally complex, and can harm the trees. They also cannot meet the demand for rapid, nondestructive monitoring of potassium content in large-scale citrus orchards. Drawing on state-of-the-art deep learning technology, this study proposed a model based on a stacked sparse autoencoder (SSAE) and deep learning networks (DLNs) that uses hyperspectral information to predict potassium content in four growth stages. The experiments were conducted in the Crab Village of Luogang District, Guangzhou City, Guangdong Province, on 109 citrus trees planted there. During the four growth stages, i.e., the germination, stability, bloom and picking stages, the hyperspectral reflectance of citrus leaves was measured with a spectrometer (ASD FieldSpec 3), and at the same time the potassium content of the leaves was determined by the traditional chemical method. The collected samples constituted a large-scale dataset of 436 tuples in total, 80% of which were used as the calibration set and the remaining 20% as the validation set. The constructed model was evaluated on the calibration set and the validation set separately. Firstly, the successive projection algorithm (SPA) was applied to the high-dimensional spectral vectors for dimension reduction and feature extraction, and a multiple linear regression (MLR) prediction model for the potassium content of citrus leaves was established on the extracted features. The results showed that the potassium spectrum contained a large number of complex nonlinear characteristics. 
Secondly, wavelet denoising was applied to reduce the high-frequency noise in the original spectrum. The optimal parameter combination, found through an orthogonal test, was “coif2” as the wavelet basis function, 3 decomposition levels, “sqtwolog” as the threshold selection rule, and “one” as the noise estimation scheme. Thirdly, the SSAE features of a specific stage were transferred and merged into the baseline layer by layer to find the best number of transferred layers. The best numbers of transferred layers were 3, 1, 4 and 3, and the corresponding coefficients of determination for the calibration set were 0.8999, 0.8598, 0.8869 and 0.8547 at the germination, stability, bloom and picking stages, respectively, improvements of 19.82%, 9.45%, 21.49% and 7.21% over the baseline. Then, the SSAE features from the best layer were transferred and merged into the baseline stage by stage to find the best number of transferred stages. The experiment revealed that transferring the features of all four stages to the corresponding stage domain achieved the best performance. In this case, the coefficients of determination for the calibration set were 0.8772, 0.8981, 0.9049 and 0.8894 at the germination, stability, bloom and picking stages, respectively, improvements of 16.80%, 14.32%, 23.96% and 11.56% over the baseline. Fourthly, after wavelet denoising and four kinds of spectral transformation of the original spectrum, i.e., the first derivative, second derivative, reciprocal and logarithm, the layer features and stage features previously obtained from the SSAE were transferred and merged into the spectra of the four growth stages. 
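The four spectral transformations named above can be illustrated with a minimal sketch. It is not the authors' implementation: the finite-difference scheme, the uniform wavelength spacing `step`, the base-10 logarithm and all function names are assumptions made only for illustration, and the wavelet-denoising step is omitted.

```python
import math


def first_derivative(spectrum, step=1.0):
    """Finite-difference first derivative (assumed scheme).

    Uses central differences for interior bands and one-sided
    differences at the two ends; `step` is the assumed uniform
    wavelength spacing between neighbouring bands.
    """
    n = len(spectrum)
    if n < 2:
        raise ValueError("need at least two bands")
    deriv = [0.0] * n
    deriv[0] = (spectrum[1] - spectrum[0]) / step
    deriv[-1] = (spectrum[-1] - spectrum[-2]) / step
    for i in range(1, n - 1):
        deriv[i] = (spectrum[i + 1] - spectrum[i - 1]) / (2.0 * step)
    return deriv


def second_derivative(spectrum, step=1.0):
    """Second derivative, taken as the first derivative applied twice."""
    return first_derivative(first_derivative(spectrum, step), step)


def reciprocal(spectrum):
    """Reciprocal transform 1/R; reflectance values must be non-zero."""
    return [1.0 / r for r in spectrum]


def logarithm(spectrum):
    """Logarithmic transform; base 10 is an assumption, R must be positive."""
    return [math.log10(r) for r in spectrum]
```

Each function takes one reflectance spectrum as a list of floats and returns a transformed list of the same length, so any of the four can serve as the input vector of a downstream model.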
With the wavelet-denoised first-derivative spectrum as the input vector, the SSAE-DLNs model achieved its best result. For the calibration set, the coefficients of determination were 0.8992, 0.8899, 0.8838, 0.8727 and 0.8988, the corresponding RMSE values were 0.5425, 0.5496, 0.5509, 0.5539 and 0.5443, and the corresponding sparse proportions were 0.1411, 0.1633, 0.1189, 0.1856 and 0.2078 at the germination, stability, bloom and picking stages and over the whole growth period, respectively. For the validation set, the coefficients of determination were 0.8651, 0.8704, 0.8551, 0.8580 and 0.8771, and the corresponding RMSE values were 0.5693, 0.5674, 0.5786, 0.5722 and 0.5528, respectively. Compared with traditional models such as support vector regression (SVR), partial least squares regression (PLSR), general regression neural networks (GRNN) and stepwise multiple linear regression (SMLR), the SSAE-DLNs model achieved the best performance, with R² of 0.8988 and 0.8771 for the calibration and validation sets, respectively, followed by SVR. Finally, the results demonstrated the feasibility of monitoring the potassium content of citrus leaves, which may provide a theoretical basis for growth monitoring and nutritional diagnosis of citrus trees.
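For readers who want to reproduce the two evaluation metrics quoted above, a minimal sketch follows. The abstract does not state which definition of the coefficient of determination was used; the form R² = 1 − SS_res/SS_tot is assumed here, and the function names are illustrative.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Coefficient of determination, assumed as 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

Both functions would be called once on the calibration set and once on the validation set, with the chemically measured potassium contents as `measured` and the model outputs as `predicted`.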