Frequency Aggregation for Forecast Accuracy Comparison
Sometimes during SensibleAI Forecast engagements, the data frequency (daily, weekly, monthly) is self-evident given the nature of the stakeholder’s request and source data. If the stakeholder (use case sponsor, executive, business team, etc.) wants a monthly forecast and only has monthly data - common for financial statement forecasts - then SensibleAI Forecast power users have no choice but to run projects at the monthly level. Similarly, if the stakeholder wants a weekly forecast for operational planning, and only has a week-close aggregate of shipped or ordered units, then power users must build projects at that weekly level.
However, when a stakeholder wants a higher-level forecast than the frequency native to the dataset, power users are presented with the option of forecasting at a different level than what will ultimately be reported. In general, there are no challenges when building daily-level projects to report out at the month or week level, since days fit neatly and consistently into weeks and months. However, there are more considerations when forecasting at the weekly level and rolling up to a month for reporting.
This article seeks to outline through example how a power user should weigh producing a weekly project for a stakeholder who desires monthly forecasts. Furthermore, any challenges in processing will be listed, as well as tips for ensuring standardization will be recommended.
When to Consider Weekly over Monthly
There are several reasons why a power user might want to build a project using weekly-level data. First, the lower the granularity, the stronger impact features, and especially events can have. As an example in retail, typical Christmas behavior is to have a spike in sales the week before Christmas when everyone buys gifts followed by a sharp drop as stores will close, or people stay home for the holidays. Overall, the Christmas event may be expected to slightly raise overall sales for the entire month of December, which SensibleAI Forecast can correctly identify on a monthly dataset. However, it completely misses the nuance of the spike followed by the drop-off. Furthermore, monthly datasets are unable to distinguish the effect of two events during the same month. On the other hand, running with a weekly dataset can identify how these events affect intra-monthly demand. For a business planner, weekly projects give better insights and explanations on how features and events affect demand.
Next, data history concerns may be a reason to use a weekly dataset over a monthly dataset. In an example where a stakeholder has 5 years of training data, a monthly dataset will be unable to generate any machine learning models, as it only has 60 data points to learn from. That same weekly dataset will have 260 data points, allowing for rich ML models to be created. This is especially exaggerated when the data history is 2-3 years, the difference between feature and event incorporation versus solely baseline models.
Finally, sometimes stakeholders have a different definition of a month, typically following a fiscal year. In this circumstance, data pre-processing would be required for the monthly dataset, as calendar months do not align with fiscal months. In this case, weekly data aligns more with the fiscal month (following ISO 8601, for example), whereas heavy post-processing with allocations would be required to convert a calendar month prediction to a fiscal month prediction.
Pros and Cons of Weekly
Let us compare the benefits and potential concerns of using a weekly dataset over monthly when the end goal is to feed a monthly forecast to the stakeholder’s OneStream consumption.
Weekly Pros
- Allows for more data points within the same period.
- Stronger feature and event correlation to intra-monthly effects.
- Potential for greater overall accuracy.
- If a fiscal-month output is required, SensibleAI Forecast can still use calendar data. If forecasting directly at the monthly level, source data inputs must be converted to a fiscal month before SensibleAI Forecast forecasting.
- This has the consequence that events would need to be translated to their fiscal month. For example, Halloween might then fall in November!
- No event translation is needed at the weekly level.
Weekly Cons
- Weeks do not divide evenly into calendar months or years.
- Forecast start and end dates will not align between a week start and a month start.
- Tighter controls will be required to chop off data that starts or ends in the middle of the month. This means more inputs or safeguards for the business user when starting predictions.
- Greater compute and storage size is required for running and storing long-range weekly predictions. Predicting 12 values for a year forecast changes to 52 values per target.
- Feature insights will be at the weekly level, rather than at the granularity in which business decisions are made.
- Data sparsity can become an issue. SensibleAI Forecast struggles to predict 0 values for oscillating (sparse) datasets. Targets that have demand values once every two weeks are no issue when aggregating to monthly but will have 50% “missing” data points at the weekly level.
- SensibleAI Forecast will typically generate a mean-tracking forecast that sums to the same total units overall but looks quite concerning to a planner (apparently missing the up-to-down trends entirely!)
Considerations for Weekly Projects
After weighing the benefits and drawbacks, as well as performing experimentation on whether weekly granularity produces more accurate forecasts than at the monthly level, a power user may decide it is best to proceed with running a weekly project with monthly reporting. If this is the case, here are a couple of other factors to keep in mind.
Unless the stakeholder states that their business operates under a particular financial calendar, defining the week to start on Monday is strongly recommended. This is governed by ISO 8601, which is an international standard for time-related communication. Furthermore, ISO 8601 has a rigorous definition of how to convert the date to its week number, implementable in SQL with:
SELECT DATEPART(iso_week, DATE_COLUMN);
This is useful for consistently aggregating daily to weekly to monthly data.
Furthermore, many businesses operate on a fiscal year basis, which lends itself neatly to weekly forecasts. Sometimes, fiscal months are defined as 4 weeks long, resulting in 13 months per year. Other times, fiscal months operate on a 4-4-5 timing, which creates even quarters. Since fiscal months will not align with calendar months, but rather are evenly composed of weeks, summing up weekly forecasts to a fiscal month level can be quite easy and painless.
The final point of clarification is that the power user must understand how the stakeholder’s fiscal year is created and to ensure any data that flows into SensibleAI Forecast is on that period as well. It will be extremely tricky to convert calendar-based forecasts to fiscal-based forecasts when navigating the week-month mismatches.
Furthermore, when projects are in utilization, extra care must be placed to ensure the latest actual date in the source data is complete and aligns with the corresponding beginning of the month (calendar or fiscal).
Necessary Data Processing
During experimentation, the first challenge encountered will be in comparing SensibleAI Forecast’s forecast to the customer benchmark. Assuming a calendar-month desired output, the consumption group can be configured to automatically export the weekly forecast at the monthly level. However, the ForecastStartDate column will still contain the week start date of the source data. Since the ForecastStartDate of the customer benchmark will typically be a month start value, the model forecast table must be modified so that all forecast start dates align. Many times, this can be achieved with the following SQL code:
SELECT DATEFROMPARTS(YEAR(ForecastStartDate), MONTH(ForecastStartDate), 1)
as ForecastStartDate;
Importantly, this will not work for financial months, as there are examples of different financial months that start in the same calendar month. A separate conversion table or additional logic will be required to align a weekly forecast start date to its corresponding financial month.
Next, an power user might seek to better handle weeks that straddle months. Rather than mapping an entire week to the month that the week starts in, forecast values might be proportionally split on several days that fall into each month.
I find the most consistent way to handle weekly to calendar month aggregations to be as follows:
Assumptions:
- Calendar Month Start desired forecast.
- Data is at a weekly level following ISO 8601.
Procedure:
- Ensure each forecast date is the first Monday of each month.
- Convert the ForecastStartDate of the model forecast table to its month start value.
- Allocate the weekly forecast to a daily forecast, with each day having 1/7th of its weekly value.
- Aggregate the daily forecast to the monthly level using its calendar month.
This will evenly distribute forecast values that straddle months and ensure there are no forecast start date duplicates after converting to its monthly date.
On the other hand, follow this procedure to handle fiscal month aggregations:
- Create a reference table with WEEK_NUMBER and MONTH_NUMBER columns. For each value of 1 to 54, include the corresponding fiscal month value per the client’s practices.
- Identify the week and date that each fiscal month starts. Generally, fiscal months will begin on the week-start. Ensure the train and prediction datasets are properly terminated to align with the new months' starts.
- Finally, each ForecastStartDate and Date value can be converted to the corresponding fiscal month by finding the ISO week value (if the client follows this standard) and using the lookup table generated in Step 1. No allocations or data splitting is needed.
Importantly, when forecasting at the weekly level with a fiscal month-end goal, SensibleAI Forecast can run using calendar input data since there is a simple and rigorous method to align calendar dates to fiscal months. Therefore, no conversion will be necessary for using calendar events.
Dangers
One of the biggest dangers is when forecasts are run using partially complete actuals or when the forecast starts and ends in the middle of a month. Both instances are examples where predicted values will be lower than what they should be. At the monthly level, running a forecast in the middle of a month can cause the first and last month's sums to be halved!
This issue can be addressed with safeguards built into a data ingress sequence. First, if a data ingress sequence is run in the middle of a week, a filter should cut off any data past the last complete week. This might mean if a stakeholder wants to run a prediction on a Wednesday, the pipeline will cut off any data past Sunday so that the latest data is a complete week (assuming weeks start on Mondays). Furthermore, if the stakeholder has a specific frequency of forecast, like generating a new forecast every fiscal month, data management sequences can apply an additional filter to cut off any data past the start of the current month. Therefore, a business user can run a prediction in the middle of M3 and SensibleAI Forecast will only be fed data up until the end of M2. Since weeks are more easily split into fiscal months than calendar months, it may be suggested to use this strategy only for cases where a fiscal month output is desired.
Alternatively, the stakeholder can be instructed to establish a regular pattern for running forecasts, such as the first Monday of each month to solve the mid-week and mid-month forecast generation, although this leaves the potential for the beginning month to miss a week’s worth of actuals.
Summary
Power users are faced with many decisions when target data is available at the daily or weekly level and the stakeholder desires a monthly forecast. Forecasting at the daily or weekly level can be appealing due to the potential for greater accuracy and stronger feature and event insight and impact. On the other hand, reconciling the mismatch between week and month alignment poses several challenges during final forecast conversion. Importantly, a power user must also consider the type of month the stakeholder desires (calendar or fiscal), as this will also shape how data is pre- and post-processed. Finally, a power user must recognize, safeguard against, and warn business users of starting forecasts when the actuals history doesn’t completely fill a week or month.