This article is part of our exclusive IEEE Journal Watch series created in collaboration with IEEE Xplore.

Researchers from Zhejiang University and Tongdun Technology, a risk management company based in Hangzhou, China, have improved crop yield predictions using deep learning. Their method accounts for how the location of farmland affects its yield, and it could help farmers and policymakers make more accurate forecasts.

Yield forecasting is an important part of agriculture and has historically relied on tracking factors such as weather and soil conditions. Accurate forecasts help farmers make financial decisions for their businesses and help governments avoid disasters such as famines. Climate change and increased food production have left less room for error, making accurate forecasts more important than ever. Climate change also raises the risk of low yields in many regions, which could trigger a global crisis.

Many of the variables used to predict yields, such as climate, soil quality, and crop management practices, have remained the same, but modeling techniques have become more sophisticated in recent years. Deep learning methods can capture not only how variables like rainfall and temperature affect yields, but also how they affect each other. For example, the benefits of frequent rainfall may be offset by extremely hot temperatures. Accounting for these interactions can produce different results than considering each variable independently.
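The rainfall and temperature example can be made concrete with a toy calculation. The sketch below is purely illustrative: `toy_yield`, `rain_effect`, and every coefficient are hypothetical and are not taken from the study. It only shows that, once an interaction term is present, the benefit of extra rain depends on how hot it is.

```python
def toy_yield(rain, heat, interaction=True):
    """Hypothetical yield response; all coefficients are made up."""
    value = 2.0 + 0.5 * rain - 0.2 * heat
    if interaction:
        # Interaction term: heat reduces the benefit of rain.
        value -= 0.3 * rain * heat
    return value


def rain_effect(heat, interaction=True):
    """Change in toy yield from one extra unit of rain at a given heat level."""
    return toy_yield(1.0, heat, interaction) - toy_yield(0.0, heat, interaction)


for heat in (0.0, 2.0):
    print(heat, rain_effect(heat), rain_effect(heat, interaction=False))
```

Without the interaction term, the effect of rain is the same at every heat level. With it, the effect shrinks as heat rises and, in this made-up setting, turns negative.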

In their study, the researchers used a recurrent neural network, a deep learning technique that tracks relationships among variables over time, to help uncover “complex time dependencies” that affect crop yields. Yield-related variables that change over time include temperature, sunlight and rainfall, said Chao Wu, a researcher at Zhejiang University and one of the paper’s authors. Wu said these factors “change over time, interact with each other in complex ways, and their impact on yields is usually cumulative.”

The method can also capture the impact of variables that are difficult to quantify, such as continuous improvements in breeding and farming practices, Wu says. As a result, the model benefits from capturing larger trends that extend beyond a single year.

The researchers also wanted to include spatial information, such as how close two regions of farmland are to each other, to help determine whether their yields would be similar. To do this, they combined their recurrent neural network with a graph neural network that represents geographic distance, so that predictions for a specific location reflect the area around it. In other words, the model can draw on information about the regions surrounding each area of farmland and learn from relationships across both time and space.
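The idea of combining a temporal pass with a spatial pass can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the researchers' architecture: the function names, the fixed `decay` blend, and the exponential distance weighting are all hypothetical, and a real model would learn its weights from data instead of hard-coding them.

```python
import math


def distance_weights(coords, scale=1.0):
    """Row-normalized weights that decay with distance between regions."""
    weights = []
    for a in coords:
        row = [math.exp(-math.dist(a, b) / scale) for b in coords]
        total = sum(row)
        weights.append([w / total for w in row])
    return weights


def recurrent_step(state, x, decay=0.5):
    """Toy recurrent update: blend the previous state with the new input."""
    return math.tanh(decay * state + (1 - decay) * x)


def spatiotemporal_features(series, coords):
    """series[i] is the time-ordered list of scalar inputs for region i."""
    # Temporal pass: summarize each region's history in one state value.
    states = []
    for region_series in series:
        h = 0.0
        for x in region_series:
            h = recurrent_step(h, x)
        states.append(h)
    # Spatial pass: mix each region's state with its neighbors' states,
    # giving nearby regions more influence than distant ones.
    weights = distance_weights(coords)
    n = len(states)
    return [sum(weights[i][j] * states[j] for j in range(n)) for i in range(n)]


coords = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
series = [[0.2, 0.4, 0.6], [0.1, 0.3, 0.5], [0.9, 0.8, 0.7]]
print(spatiotemporal_features(series, coords))
```

Each region's output depends on its own history and on the histories of nearby regions, which is the sense in which such a model learns across both time and space.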

The researchers tested their new method on U.S. soybean yield data released by the National Agricultural Statistics Service. Inputs included climate data such as precipitation, sunlight and vapor pressure; soil data such as electrical conductivity, acidity and soil composition; and management data such as the percentage of fields planted. The model was trained on soybean yield data from 1980 to 2013 and tested using data from 2015 to 2017. The proposed method performed significantly better than models trained using non-deep-learning methods, and better than other deep learning models that did not take spatial relationships into account.

In future work, the researchers want to make the training data more dynamic and add security features to the model training process. Currently, the model is trained on data that has been aggregated, which does not allow private data to be kept separate. Wu said that could become a problem if data such as yields and farm management practices are seen by competitors and used to gain an unfair advantage in the market. Agricultural data such as farm locations and yields can also leave farmers vulnerable to fraud and theft. The possibility of data disclosure can also deter participation, which reduces the amount of data available for training and hurts the accuracy of trained models.

The researchers hope to use a federated learning approach to train future yield models, allowing the global model to be updated while keeping different data sources isolated from each other.
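A common form of federated learning is federated averaging, in which each data holder trains locally and shares only model parameters with a central server. The sketch below is an assumption-laden illustration, not the researchers' planned system: it fits a one-parameter model `y ≈ w * x`, and the client data, learning rate, and function names are all hypothetical.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Train on one client's private (x, y) pairs; the raw data never leaves."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(global_w, clients):
    """Average the clients' locally trained weights, weighted by sample count."""
    total = sum(len(data) for data in clients)
    return sum(local_update(global_w, data) * len(data) for data in clients) / total


# Hypothetical private datasets, each held by a different participant.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(w)
```

Only the weight `w` moves between the clients and the server, so the global model improves while each participant's records stay on its own machine.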

The researchers presented their findings at the 26th International Computer Design Collaborative Conference, which took place May 24-26 in Rio de Janeiro, Brazil.
