Model Series: Univariate vs. Multivariate Modeling
With approximately 25 different algorithms in SensibleAI Forecast, it can be a challenge to keep track of what all of them do. Although each model optimizes on different features, events, seasonality, and growth trends, they all fall into two major categories: Univariate and Multivariate.
Univariate forecasting uses a target's past values to predict its future values. These models incorporate only one variable - past actuals - across a time horizon. Legacy demand forecast methodologies often fall into the univariate category (in their use of lags), but univariate forecasting also includes statistical models with more robust methodologies than 'what was the value last year'.
Multivariate forecasting, as the name implies, incorporates more than one type of variable into the forecast. Take, for example, a random forest model that might build decision trees based on a GDP feature for a specific country, or a day-of-week seasonality feature. These are used in addition to past actuals to build a forecast that draws on a variety of sources.
All this being said, it's not always a clear-cut victory for Multivariate models. First, the trade-off of complexity is important: multivariate approaches need more data to be used successfully than a univariate model does (as you can see from SensibleAI Forecast model thresholds). Second, it's important to provide a wealth of data for multivariate models, because over short periods of time they are prone to overfitting on trends or correlations that have no predictive value in the future.
Univariate Model Approaches
Mean Model
- Forecast all future values as the average of past demand.
- Useful when the forecasted value fluctuates around a constant level with no clear trend or seasonality.
- Can lag behind shifts (bias), and does not perform well with seasonality.
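As a minimal sketch (the data below is illustrative, not from SensibleAI Forecast), a mean model simply repeats the historical average across the forecast horizon:

```python
import numpy as np

# Illustrative history: demand fluctuating around a stable level of ~100
history = np.array([98, 103, 97, 101, 100, 99, 104, 96], dtype=float)

horizon = 4
# Mean model: every future period is forecast as the average of past demand
mean_forecast = np.full(horizon, history.mean())
print(mean_forecast)  # [99.75 99.75 99.75 99.75]
```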
Last Value Naïve Model
- Uses the last observed value as the forecast for all future periods.
- Difficult to beat for short-term or extremely volatile series.
- Growth rates (illustrated in the sketch below):
  - In a series with no growth, the error of a last value model is equal to the noise of the series.
  - In a series with upward growth, there will be a negative (underestimating) bias as well as error equal to the noise of the series.
  - In a series with downward growth, there will be a positive (overestimating) bias as well as error equal to the noise of the series.
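A short sketch of the last value naïve model, using made-up numbers to show the underestimating bias on a growing series:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative series with upward growth plus noise
t = np.arange(24)
actuals = 100 + 2.0 * t + rng.normal(0, 3, size=t.size)

train, test = actuals[:12], actuals[12:]

# Last value naive: repeat the final observed value across the horizon
forecast = np.full(test.size, train[-1])

bias = np.mean(forecast - test)   # negative: the naive model underestimates a growing series
mae = np.mean(np.abs(forecast - test))
print(round(bias, 1), round(mae, 1))
```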
Seasonal Naïve Model
- Shifts forward the value from the same period in the previous season (such as using last January's actual as this January's forecast).
- Performs well as a benchmark when seasonality is the dominant pattern.
- Ignores long-term trends and one-off changes in predictions.
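A minimal sketch of the seasonal naïve idea, assuming monthly data and illustrative values:

```python
import numpy as np

# Illustrative two years of monthly demand with a strong seasonal pattern
year_1 = np.array([80, 75, 90, 110, 130, 150, 160, 155, 120, 100, 95, 140], dtype=float)
year_2 = year_1 * 1.05  # same seasonal shape, slight growth

history = np.concatenate([year_1, year_2])
season_length = 12

# Seasonal naive: next January's forecast is last January's actual, and so on
forecast = history[-season_length:]
print(forecast)
```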
Simple Moving Average
- Takes the average over a selected window (a fixed set of recent periods) and uses it as the prediction, rolling forward through future periods.
- Useful when demand has no clear trend, allowing predicted values to oscillate randomly around a stable mean.
- Lagged in interpreting trends: a moving average will always underestimate the beginning of an upward shift until the mean catches up.
- Long windows give higher stability but increase this lag, while short windows are more responsive but noisier (illustrated in the sketch below).
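The window trade-off can be seen in a small sketch with illustrative values:

```python
import numpy as np

def moving_average_forecast(history, window, horizon):
    """Forecast every future period as the average of the last `window` observations."""
    level = np.mean(history[-window:])
    return np.full(horizon, level)

history = np.array([100, 102, 99, 101, 120, 122, 121, 123], dtype=float)  # level shift mid-series

# A long window still averages in pre-shift values and lags behind the new level;
# a short window tracks the shift but is noisier on stable data.
print(moving_average_forecast(history, window=8, horizon=3))  # ~111, lags the shift
print(moving_average_forecast(history, window=3, horizon=3))  # ~122, responsive
```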
Simple Exponential Smoothing (SES)
- Addresses the lag issue by weighting recent observations more heavily with a smoothing parameter.
- Ideal for data with a relatively constant level and random fluctuations.
- Cannot handle trends or seasonality, since it converges to an average.
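A minimal sketch of the SES recursion, with an illustrative smoothing parameter:

```python
import numpy as np

def simple_exponential_smoothing(history, alpha):
    """Return the smoothed level; recent observations get weight alpha, older ones decay geometrically."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

history = np.array([100, 104, 98, 102, 101, 97, 103], dtype=float)
level = simple_exponential_smoothing(history, alpha=0.3)

# SES is "flat": every future period gets the same smoothed level
horizon = 4
print(np.full(horizon, round(level, 1)))
```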
Holt’s Linear Model
- Extends SES by adding a trend component, keeping the smoothing of the level but allowing the forecast to project overall growth.
- Effective for products with steady growth or decline.
- Assumes an approximately constant trend, and can overshoot over long horizons as a result.
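A compact sketch of Holt's level-and-trend recursion (illustrative data and smoothing parameters, not SensibleAI Forecast's implementation):

```python
import numpy as np

def holt_linear(history, alpha, beta, horizon):
    """Holt's method: smooth a level and a trend, then project the trend forward."""
    level, trend = history[0], history[1] - history[0]
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    steps = np.arange(1, horizon + 1)
    return level + steps * trend  # the constant trend keeps extending, which is why long horizons can overshoot

history = np.array([100, 105, 111, 114, 120, 126, 129], dtype=float)  # steady growth
print(np.round(holt_linear(history, alpha=0.5, beta=0.3, horizon=4), 1))
```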
Theta Model
- Combines Simple Exponential Smoothing with trend extrapolation.
- Tends to handle a wide range of patterns by balancing a conservative (no-trend) forecast with an extrapolated one.
- Does not automatically handle seasonality. Consider pre-processing out the seasonality if used for heavily seasonal data.
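As a rough illustration only: the classical theta method decomposes the series into 'theta lines', and the sketch below uses a common simplification that averages a flat SES level with a fitted linear trend.

```python
import numpy as np

def theta_forecast(history, horizon, alpha=0.3):
    """Simplified theta-style forecast: average a flat SES level with a fitted linear trend."""
    # Conservative component: simple exponential smoothing (no trend)
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level

    # Extrapolated component: least-squares linear trend continued into the future
    t = np.arange(history.size)
    slope, intercept = np.polyfit(t, history, deg=1)
    future_t = np.arange(history.size, history.size + horizon)
    trend_line = intercept + slope * future_t

    return 0.5 * np.full(horizon, level) + 0.5 * trend_line

history = np.array([100, 103, 108, 110, 115, 119, 124], dtype=float)
print(np.round(theta_forecast(history, horizon=4), 1))
```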
Croston Model
- Croston's method is specialized for intermittent demand forecasting: it predicts two values, one for the average demand when it occurs, and another for the average interval between demand spikes.
- The Croston model therefore does not get dragged down by constant zero/low values the way a moving average or naïve method would.
- Croston tends to over-forecast slightly and typically is not successful on continuous patterns.
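A minimal sketch of Croston's two-part recursion on an illustrative intermittent series:

```python
import numpy as np

def croston_forecast(history, alpha=0.1):
    """Croston's method: smooth nonzero demand sizes and inter-demand intervals separately."""
    demand_size = None   # smoothed size of nonzero demands
    interval = None      # smoothed gap between nonzero demands
    periods_since_demand = 1
    for y in history:
        if y > 0:
            if demand_size is None:
                demand_size, interval = y, periods_since_demand
            else:
                demand_size = alpha * y + (1 - alpha) * demand_size
                interval = alpha * periods_since_demand + (1 - alpha) * interval
            periods_since_demand = 1
        else:
            periods_since_demand += 1
    # Forecast per period = average demand size / average interval between demands
    return demand_size / interval

history = np.array([0, 0, 5, 0, 0, 0, 7, 0, 6, 0, 0, 4], dtype=float)
print(round(croston_forecast(history), 2))
```

Because the per-period rate is a ratio of smoothed demand size to smoothed interval, the zero periods stretch the interval rather than dragging the forecast itself toward zero.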
Fourier Model
- Uses sine and cosine terms to capture seasonal patterns (daily, weekly, annual) across peaks and troughs.
- Can fit any periodic pattern given enough terms, and can incorporate multiple seasonal cycles in the same dataset.
- With too many terms, Fourier models are at risk of overfitting. Does not capture events such as holidays.
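A small sketch of Fourier terms feeding a linear regression; the data has an illustrative weekly cycle, and the number of terms is a tuning choice:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative daily series with a weekly cycle
rng = np.random.default_rng(1)
t = np.arange(90)
actuals = 100 + 15 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 2, size=t.size)

def fourier_features(t, period, n_terms):
    """Sine/cosine pairs for one seasonal period; more terms fit sharper seasonal shapes."""
    cols = []
    for k in range(1, n_terms + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

X = fourier_features(t, period=7, n_terms=2)
model = LinearRegression().fit(X, actuals)

future_t = np.arange(90, 104)
forecast = model.predict(fourier_features(future_t, period=7, n_terms=2))
print(np.round(forecast[:7], 1))
```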
Tweedie GLM
- Uses distribution-based modeling (the Tweedie family, which generalizes Poisson) to predict values. Tweedie GLM blurs the lines between univariate and multivariate, and between statistical and Machine Learning, because it can incorporate regressors (promotions, months of the year, etc.) into its predictions.
- Tweedie GLM can handle zero values very well because of its foundation in Poisson-like distributions.
- GLMs typically require a link function and a so-called 'power parameter' - settings we do not see exposed in SensibleAI Forecast.
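SensibleAI Forecast's exact configuration is not shown here; as a rough sketch, scikit-learn's TweedieRegressor illustrates the idea with hypothetical day-of-week and promotion regressors:

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.default_rng(2)

# Hypothetical regressors: day-of-week indicators and a promotion flag
n = 200
day_of_week = rng.integers(0, 7, size=n)
promo = rng.integers(0, 2, size=n)
X = np.column_stack([(day_of_week == d).astype(float) for d in range(7)] + [promo])

# Intermittent, zero-heavy demand: higher on weekends and during promotions
rate = 0.5 + 1.5 * (day_of_week >= 5) + 1.0 * promo
y = rng.poisson(rate).astype(float)

# Tweedie GLM with a log link; a power between 1 and 2 suits zero-inflated, skewed demand
model = TweedieRegressor(power=1.5, link="log", max_iter=1000)
model.fit(X, y)
print(np.round(model.predict(X[:5]), 2))
```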
Multivariate Model Approaches
Multivariate models within SensibleAI Forecast can be characterized, for the most part, as the various Machine Learning algorithms deployed throughout the solution. In contrast to pure time-series models, ML models incorporate vast amounts of data - including lags of target datasets, related series, calendar variables, and more - to interpret and predict complex patterns.
Machine Learning models excel at daily, granular forecasts where there are many data points to train on and where they can incorporate factors such as day of the week, holidays, and other features to explain variations.
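As a rough sketch of the multivariate idea (the features and data below are hypothetical, and SensibleAI Forecast's feature engineering is far richer), a random forest trained on a lagged target, day of week, and a holiday flag might look like:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

# Hypothetical two years of daily demand with weekend lift and occasional holiday dips
dates = pd.date_range("2022-01-01", periods=730, freq="D")
dow = dates.dayofweek.values
holiday = (rng.random(dates.size) < 0.03).astype(int)
demand = 100 + 20 * (dow >= 5) - 30 * holiday + rng.normal(0, 5, size=dates.size)

df = pd.DataFrame({"demand": demand, "dow": dow, "holiday": holiday}, index=dates)
df["lag_7"] = df["demand"].shift(7)   # lagged target as a feature
df = df.dropna()

X, y = df[["dow", "holiday", "lag_7"]], df["demand"]
train_X, test_X = X.iloc[:-28], X.iloc[-28:]
train_y, test_y = y.iloc[:-28], y.iloc[-28:]

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(train_X, train_y)
mae = np.mean(np.abs(model.predict(test_X) - test_y))
print(round(mae, 1))
```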
Interpretability vs. Accuracy Trade-offs
When we typically think of interpretability within SensibleAI Forecast, we often reflect on prediction explanations and feature impact - and for good reason. These are incredibly powerful tools for understanding what is driving a model's feature selection and output. However, statistical and baseline univariate models have a substantial leg up in the understanding of their base forecast, for example "the forecast for this period in a prior year was 120, with a 5% increase YOY to get to a new forecast of 126". That kind of interpretability is often conflated with, or thought of as inferior to, "your prediction explanation for Day-Week is +30 units from a baseline value in SHAP". In practical terms, the first explanation is easier to interpret. In addition, it's easier to understand what is captured in a univariate model (typically nothing related to promotions, events, etc.), whereas machine learning models may bake those effects into a myriad of engineered features (recentering, date variables, lagged features, events). This makes top-line adjustments to univariate models easier for finance leadership to understand and confirm.
The point of this isn't to say that machine learning models are inherently uninterpretable. The dashboards and data provided by SensibleAI Forecast are extremely powerful for business users. However, it's still a balance to bring this kind of data into a planning process, and it takes a substantial amount of change management effort to fully integrate this new way of forecasting and doing business into established corporations.
As with all things, balance between interpretability and accuracy is key. A use case with hundreds of deployed Machine Learning models does not necessarily beat statistical models on interpretability, even with prediction explanations and even when the Machine Learning models are more accurate.
Success Criteria for each Model
Statistical models generally perform better for aggregated forecasts (such as at the monthly level), while machine learning models can capture daily or weekly granularity better. A study of 1,045 monthly time series use cases found that traditional methods consistently outperformed ML methods in accuracy.
That being said, Machine Learning approaches are excellent on daily data: a 3-year historical sample of daily information easily surpasses 1,000 data points, unlocking a plethora of trends and drivers for your forecast. Machine Learning forecasts can also be interpreted through SensibleAI Forecast's explainability visuals, further enriching the information that can be passed on to financial analysts and end users.
Typically, the accuracy of a SensibleAI Forecast is viewed through the lens of both Mean Absolute Error (MAE) and Bias. When deciding which models or project configurations to treat as optimal, it's important to identify the factors that matter most to the business.
As an example, while working with a customer we found that for a set of cold-start targets, the Machine Learning models were highly ineffective at forecasting a monthly aggregate number from a daily forecast. However, those models were viewed extremely favorably in the Model Arena, because they did a good job of predicting up-and-down movements across days and weeks. This became a challenge: how do we properly 'punish' SensibleAI Forecast for not predicting high-level growth trends well, even though it scores a complex ML model highly for capturing Friday and Saturday spikes?
In the example above, the answer would be bias. Bias 'sticks' with a forecast through temporal and dimensional aggregation: random errors largely cancel out when you sum across targets and days, but a systematic over- or under-forecast does not. If your forecast across 1,000 targets at the daily level has a bias of 10% and an MAE of 12%, you will end up with both an MAE and a Bias of roughly 10% when aggregating up to the top level of the entire forecast. For this reason, we analyzed all models returned by SensibleAI Forecast and recognized that the Holt Linear Model, which was quite far down in terms of Model Rank in the Model Arena, was actually the best-suited model for the monthly level because it most efficiently and accurately captured growth trends.
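A small numerical sketch of this effect, using made-up numbers rather than customer data: random errors largely cancel when aggregated, while a systematic 10% bias carries straight through to the total.

```python
import numpy as np

rng = np.random.default_rng(4)

n_targets, n_days = 1000, 30
actuals = rng.normal(100, 10, size=(n_targets, n_days))

# Hypothetical daily forecast: 10% systematic over-forecast (bias) plus random error
forecast = actuals * 1.10 + rng.normal(0, 8, size=actuals.shape)

daily_mae = np.mean(np.abs(forecast - actuals)) / np.mean(actuals)
daily_bias = np.mean(forecast - actuals) / np.mean(actuals)

# Aggregate across all targets and days: the noise largely cancels, the bias does not
total_error = (forecast.sum() - actuals.sum()) / actuals.sum()

print(round(daily_mae * 100, 1), round(daily_bias * 100, 1), round(total_error * 100, 1))
```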
Conclusion
When deciding whether to use Univariate or Multivariate models, a solid understanding of the data and of the customer's needs is the first and most important step toward making the correct decision. Consider the following:
- Stable Patterns over a Long Horizon: Expect statistical models to perform well here. In many cases, they provide reliability and transparency that is valuable for long-range planning and financial statement predictions.
- Volatile Patterns over a Shorter Horizon: These use cases sit within Machine Learning's wheelhouse. Incorporating promotional events, price adjustments, and other business-related features can provide insights for short-term tactical forecasts that human and statistical methods cannot replicate.
- Intermittent or New Product Demand: These can be a very mixed bag. Univariate models have little history to work with and no other data to draw on. For multivariate models, that same lack of history makes it difficult to find relationships between feature and target datasets. Consider 'advanced' statistical methods like Croston or Tweedie GLM to predict these high-variance points.
- Supply Chain Forecasting: The most important focus for these use cases is integration. For example, feeding a demand plan into supply chain forecasts (shipped units) can be a very valuable link to ensure the entire business is moving in the same direction. This is a luxury only multivariate models can take advantage of. However, some ML models can struggle with the inherent fluctuations of irregular supplier data.