Abstract: To address the high computational load and slow response that arise when high-throughput, high-concurrency farmland data are processed under multi-condition queries, the data processing technology of load-balanced large-scale clusters was studied, the HBase farmland database was optimized for multi-condition retrieval, a two-level non-primary-key index method based on Solr was proposed, and a large-scale farmland data platform was built on Hadoop. A 100 TB dataset was generated from eight farming operations, such as subsoiling, plant protection and conservation tillage, and these data were retrieved and tested on the Hadoop-based platform. The experimental results showed that the response time of the optimized technical model was less than 1 s when the concurrent volume of farm data was 50 million, and that the performance of the optimized model improved about fourfold compared with the original HBase. When the number of concurrent simulated users was 500 000, the queries per second (QPS) and transactions per second (TPS) of the system roughly doubled, the response time (RT) of the system improved by 2.5 times, and the average response time was 183 ms. To a certain extent, this system solved the problem of inefficient farmland data retrieval caused by high throughput and high concurrency, and improved the capacity for real-time processing of massive farmland data.
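The idea behind a non-primary-key index can be illustrated with a minimal sketch. It assumes one plausible reading of the two-level lookup: an index first maps non-key field values to row keys, and the rows are then fetched from the primary store by row key. The in-memory dictionaries below are only stand-ins for Solr and HBase, and the names `SecondaryIndex`, `FarmlandStore`, and the field names are hypothetical; this is not the implementation described in the paper.

```python
from collections import defaultdict


class SecondaryIndex:
    """In-memory stand-in for a Solr-style index: (field, value) -> row keys."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, row_key, record):
        for name, value in record.items():
            self._postings[(name, value)].add(row_key)

    def remove(self, row_key, record):
        for name, value in record.items():
            self._postings[(name, value)].discard(row_key)

    def search(self, **conditions):
        """Return the row keys that satisfy ALL conditions (AND query)."""
        if not conditions:
            return set()
        sets = [self._postings.get((n, v), set()) for n, v in conditions.items()]
        return set.intersection(*sets)


class FarmlandStore:
    """Primary store keyed by row key (HBase stand-in) plus a secondary index."""

    def __init__(self):
        self._rows = {}
        self._index = SecondaryIndex()

    def put(self, row_key, record):
        # Drop stale index entries before overwriting an existing row.
        old = self._rows.get(row_key)
        if old is not None:
            self._index.remove(row_key, old)
        self._rows[row_key] = record
        self._index.add(row_key, record)

    def query(self, **conditions):
        # Level 1: resolve non-key conditions to row keys via the index.
        keys = self._index.search(**conditions)
        # Level 2: fetch each row directly by its row key.
        return [self._rows[k] for k in sorted(keys)]


if __name__ == "__main__":
    store = FarmlandStore()
    store.put("r1", {"operation": "subsoiling", "county": "A"})
    store.put("r2", {"operation": "subsoiling", "county": "B"})
    store.put("r3", {"operation": "plant protection", "county": "A"})
    print(store.query(operation="subsoiling", county="A"))
```

In this sketch a multi-condition query never scans the primary store: the conditions are intersected in the index, and only the matching row keys are read. That is the general motivation for pairing a key-value store with a separate search index.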