This text discusses the use of multiple model forms for capturing and forecasting components of complex time series. It explores the application of mixed models for time series analysis and forecasting, utilizing various model tools to capture trend, seasonality, and noise components. The methods are demonstrated using real-world road traffic incident data from the UK.
Using Multiple Model Forms to Capture and Forecast the Components of Complex Time Series
Introduction
I recently had to fix the fence in my back yard. It’s old, wooden, and has been threatening to topple over for a while now. Between curses it really struck me how many tools I needed to use to get the job done, and how sometimes you really need more than one tool for the job.
What does this have to do with time series regression? In general, very little. In particular, quite a bit: today we’ll be diving into using mixed models for time series analysis and forecasting. Or in more DIY terms — using more model tools to get the forecasting job done.
So, without too much more rambling, we will get cracking with:
- Revisiting the big picture, and talking a bit about mixed models.
- Looking at some real-world data.
- Using a simple model to capture the trend in our time series.
- Seasonality three ways: decision trees, linear regression, and classic time series.
- Putting Humpty Dumpty back together to get a single time series prediction.
Aside: you’ll be pleased to know that my thumbs have made a full recovery following some hammer-related accidents.
The Big(ish) Picture
As always, we’re trying to build the most “accurate” model possible. Since we’re focusing on forecasting, that means prioritizing models that produce the most accurate estimates of future time series values.
I’ve previously written at length about using a regression approach based on Meta’s Prophet methodology. In those articles, I explained why I chose to use the LASSO model: non-stationarity made using classic methods difficult, the need for sensible extrapolation ruled out tree-based approaches, and the desire for simplicity and explainability excluded any sort of neural network.
But I also hinted at using mixed model forms — that is, modeling each of the time series components using a different model form, and combining the output of the various models to produce a single forecast.
As we’ll soon see, breaking the modeling task down into smaller, more targeted problems lets us pick a tool suited to each one, rather than relying on a single generalized tool to tackle the whole task.
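To make the idea of mixed model forms a little more concrete, here is a minimal sketch on synthetic data (not the UK series). It assumes an additive series, fits a straight-line trend with one tool, fits the seasonality on the detrended values with another, and adds the two outputs back together to get a forecast. The function names (`fit_trend`, `fit_seasonality`, `forecast`) and all the numbers are made up for illustration; they are not the models used later in this article.

```python
import numpy as np


def fit_trend(t, y):
    """Fit a straight-line trend by least squares; returns (slope, intercept)."""
    slope, intercept = np.polyfit(t, y, deg=1)
    return slope, intercept


def fit_seasonality(detrended, period):
    """Average the detrended values at each position in the seasonal cycle."""
    positions = np.arange(len(detrended)) % period
    return np.array([detrended[positions == p].mean() for p in range(period)])


def forecast(slope, intercept, seasonal, t_future):
    """Recombine the two component models into a single additive forecast."""
    period = len(seasonal)
    return slope * t_future + intercept + seasonal[t_future % period]


# Synthetic monthly-style series (an assumption for illustration only):
# downward trend + yearly cycle + noise.
rng = np.random.default_rng(0)
period = 12
t = np.arange(120)
y = 100.0 - 0.3 * t + 10.0 * np.sin(2 * np.pi * t / period) + rng.normal(0, 1.0, t.size)

# Model 1: trend. Model 2: seasonality, fitted on what the trend leaves behind.
slope, intercept = fit_trend(t, y)
seasonal = fit_seasonality(y - (slope * t + intercept), period)

# Forecast the next 12 steps by summing the component predictions.
t_future = np.arange(120, 132)
y_hat = forecast(slope, intercept, seasonal, t_future)
```

Because each component has its own model, either piece can be swapped out (say, a tree for the seasonality) without touching the other. One caveat of fitting the components one after another like this: a little of the seasonal signal can leak into the trend estimate, so treat the sketch as a starting point rather than a finished recipe.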
Data
We’ll be using some real-world data from the UK¹. In this case, it’s a summarized version of road traffic incidents, which looks something like this:
This series is a little messier than what you’d usually see in a demonstration — note the downward (changeable) trend, strong seasonality, and general noisiness inherent in the target.