A Data Science Course Project About Crop Yield and Price Prediction I’m Still Not Ashamed Of

The article describes the author’s nostalgic reflection on a student project about crop yield and price prediction during their Master’s degree. They formed a team and chose a topic related to geographic information analysis and economics. The project involved data analysis, statistical modeling, and visualization, leading to successful outcomes and valuable lessons.

Over the Christmas holidays, I felt nostalgic for my student years. That’s why I decided to write a post about a student project I did almost four years ago for the course “Methods and models for Multivariate data analysis” during my Master’s degree at ITMO University.

Choosing the Topic

I suggested choosing a topic broad enough for each of us to work independently on a different part of it, in a domain close to our interests (geographic information analysis for me and economics for my colleagues).

Research & Data Sources

We started with a literature review to understand exactly how crop yield and crop price are predicted. We also wanted to understand what kind of forecast error could be considered satisfactory.

Climate Data Preprocessing

We started from the assumption that wheat, rice, maize, and barley yields depend on weather conditions in the first half of the year. This gave us matrices of calculated features covering the whole territory of Europe, ready for the future model(s).

Aggregation of Information by Country

Not all of a country’s territory is suitable for agriculture, so information had to be aggregated only from certain pixels. To account for the location of agricultural land, we prepared a matrix marking where that land is.
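
As a rough illustration of this step, the sketch below averages a feature over agricultural pixels separately for each country. The grid layout, the country codes, the function name, and the toy numbers are all assumptions made for the example; the project's actual data format is not described in this post.

```python
from collections import defaultdict


def aggregate_by_country(feature_grid, country_grid, cropland_mask):
    """Average a feature over agricultural pixels, separately per country.

    All three arguments are equally shaped 2-D lists (rows of pixels):
    feature values, country codes, and a 0/1 agricultural-land mask.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for feature_row, country_row, mask_row in zip(feature_grid, country_grid, cropland_mask):
        for value, country, is_cropland in zip(feature_row, country_row, mask_row):
            if is_cropland:  # skip pixels that are not agricultural land
                totals[country] += value
                counts[country] += 1
    # Countries without any agricultural pixel are absent from the result.
    return {country: totals[country] / counts[country] for country in counts}


# Toy 2x3 grid with made-up values (not project data).
feature_grid = [[10.0, 20.0, 30.0],
                [40.0, 50.0, 60.0]]
country_grid = [["FR", "FR", "DE"],
                ["FR", "DE", "DE"]]
cropland_mask = [[1, 0, 1],
                 [1, 1, 0]]

country_means = aggregate_by_country(feature_grid, country_grid, cropland_mask)
print(country_means)
```

With real rasters the same idea would normally be expressed with array masking rather than nested loops; plain lists are used here only to keep the sketch dependency-free.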

Time Series Forecasting

Time series forecasting proved to be the easiest method to put into practice. In Python, several libraries let you configure and apply an ARIMA model, for example pmdarima.
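
To show what such a model does under the hood, here is a minimal pure-Python sketch of ARIMA(1,1,0): the series is differenced once, and an AR(1) model with intercept is fitted to the differences by least squares. This is a hand-rolled simplification for illustration only. It is not the pmdarima API, and the model order, function names, and toy series are assumptions rather than the configuration used in the project.

```python
def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t-1]; returns (c, phi)."""
    previous, current = values[:-1], values[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    phi = sxy / sxx if sxx else 0.0  # constant differences: fall back to pure drift
    return mean_curr - phi * mean_prev, phi


def forecast_arima_110(series, steps):
    """Forecast `steps` values ahead with a hand-rolled ARIMA(1,1,0)."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    # d = 1: model the first differences instead of the levels.
    diffs = [b - a for a, b in zip(series, series[1:])]
    c, phi = fit_ar1(diffs)
    level, diff = series[-1], diffs[-1]
    forecast = []
    for _ in range(steps):
        diff = c + phi * diff   # AR(1) step on the differenced series
        level = level + diff    # integrate back to the original scale
        forecast.append(level)
    return forecast


# Toy yearly series with a steady trend (made-up numbers).
history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(forecast_arima_110(history, 3))
```

In a real project the order of the model would be chosen from the data rather than fixed in advance.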

Ensembling

Once all the models were built, we explored exactly how each of them was “mistaken”. A Kalman filter was used to improve the quality of the forecast.
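
The post does not say how the filter was wired into the ensemble, so the following is only a generic sketch: a scalar Kalman filter with a random-walk state model, which could for instance smooth a sequence of noisy forecasts. The variances, the initial state, the function name, and the input numbers are all illustrative assumptions.

```python
def kalman_filter_1d(observations, process_var, obs_var, initial_estimate, initial_var):
    """Scalar Kalman filter for a random-walk state observed with noise."""
    estimate, variance = initial_estimate, initial_var
    filtered = []
    for z in observations:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        variance += process_var
        # Update: blend prediction and observation according to the gain.
        gain = variance / (variance + obs_var)
        estimate = estimate + gain * (z - estimate)
        variance = (1.0 - gain) * variance
        filtered.append(estimate)
    return filtered


# Made-up noisy forecasts around a level of roughly 100.
noisy = [102.0, 98.0, 105.0, 99.0]
smoothed = kalman_filter_1d(noisy, process_var=1.0, obs_var=4.0,
                            initial_estimate=100.0, initial_var=1.0)
print(smoothed)
```

A larger `obs_var` makes the filter trust each new value less and smooth more; a larger `process_var` makes it follow the observations more closely.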

Futures Price Prediction

The final part was a lasso regression model that used the predicted yield values and futures features to estimate possible price values.
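
For readers unfamiliar with lasso, here is a small dependency-free sketch that fits it by coordinate descent with soft-thresholding, minimising (1/(2n))·‖y − Xw − b‖² + alpha·‖w‖₁. The two feature columns (a predicted yield and one futures feature), the toy numbers, and the value of `alpha` are assumptions for illustration; the project's real features and settings are not given in this post.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(rows, targets, alpha, n_iter=200):
    """Coordinate-descent lasso; returns (weights, intercept)."""
    n, p = len(rows), len(rows[0])
    col_means = [sum(row[j] for row in rows) / n for j in range(p)]
    y_mean = sum(targets) / n
    # Centre the data so the intercept can be recovered at the end.
    columns = [[row[j] - col_means[j] for row in rows] for j in range(p)]
    residual = [t - y_mean for t in targets]  # all weights start at zero
    weights = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # constant feature carries no information
            # Correlation of feature j with the residual that excludes its own effect.
            rho = sum(v * (r + weights[j] * v) for v, r in zip(col, residual)) / n
            new_weight = soft_threshold(rho, alpha) / scale
            delta = new_weight - weights[j]
            residual = [r - delta * v for r, v in zip(residual, col)]
            weights[j] = new_weight
    intercept = y_mean - sum(w * m for w, m in zip(weights, col_means))
    return weights, intercept


# Toy data: columns are [predicted_yield, futures_feature]; prices are made up.
rows = [[1.0, 1.0], [2.0, -1.0], [3.0, -1.0], [4.0, 1.0]]
prices = [3.0, 5.0, 7.0, 9.0]

weights, intercept = fit_lasso(rows, prices, alpha=0.5)
print(weights, intercept)
```

In this toy set the price depends only on the first column, so the penalty drives the second weight to exactly zero while shrinking the first one. That built-in feature selection is the usual reason to pick lasso over plain least squares.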

Why I Still Think This Project Is a Good One

So that’s the end of the story. I have shared a few tips along the way, and in this last part I want to sum up and say why I am satisfied with that project. Here are the three main reasons:

Organisation of work and choice of topic, a meaningful theme, and hard skills. Well, we also got great marks on the exam XD



