We’re all going through a time that’s unprecedented in our lifetimes. Our first concerns are for our families, friends and communities—and doing all we can to keep them safe. And yet, we need to keep things going and preserve as much of a normal life as possible. Which brings us to the topic of forecasting.
Business operations have been disrupted in unpredictable ways. So it’s not unreasonable to ask whether there’s any point in forecasting at this time. At Legion, the goal of data-driven forecasting is to learn as much as possible from the available data—both customer-provided and external—and predict intelligently. Of course, a human may then want to exercise judgment and adjust the outcome. We allow for this but aim to provide the best data-driven forecasts possible.
Many of our customers are essential businesses and continue to operate during the crisis. So the question is: what’s the best data-driven forecast during these challenging times?
In normal times, we forecast demand based on many different factors. We look at previous years’ seasonality, recent trends, historical weather, past and upcoming local events, promotions, and so on, and feed this data through a forecasting engine that evaluates over 50 machine learning models to find the one that best learns the patterns specific to each dataset. Each location may use a different set of models to suit its unique needs.
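To make the idea of per-dataset model selection concrete, here is a minimal sketch in Python. It is not Legion's engine: the three toy forecasters, the function names, and the choice of mean absolute error on a holdout window are all assumptions made for illustration. The sketch scores each candidate on the most recent periods, which it was not allowed to see, and keeps the one with the lowest error.

```python
from statistics import mean

# Hypothetical candidate models; a real engine would hold many more.
def naive_last(history, horizon):
    """Repeat the most recent observation."""
    return [history[-1]] * horizon

def seasonal_naive(history, horizon, season=7):
    """Repeat the value observed one season earlier."""
    start = len(history) - season
    return [history[start + (h % season)] for h in range(horizon)]

def moving_average(history, horizon, window=4):
    """Repeat the average of the last `window` observations."""
    return [mean(history[-window:])] * horizon

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def select_model(series, candidates, holdout):
    """Score every candidate on the last `holdout` points and return the best."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mae(test, fn(train, holdout)) for name, fn in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

candidates = {
    "naive_last": naive_last,
    "seasonal_naive": seasonal_naive,
    "moving_average": moving_average,
}

# Four weeks of made-up daily demand with a strict weekly pattern.
series = [10, 20, 30, 40, 50, 60, 70] * 4
best, scores = select_model(series, candidates, holdout=7)
print(best, scores)
```

On this made-up series with a strict weekly pattern, the seasonal candidate wins. A location with a different pattern could just as easily end up with a different winner, which is the point of selecting per dataset.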
Chart 1: Legion ML forecasts are continuously monitored; when model drift occurs (such as during this coronavirus crisis), new ML models can be selected, deployed and trained.
Now, many of these factors are not as relevant. Last year’s seasonality or even trends from a few weeks ago are no longer as important in a time when demand has suddenly and dramatically changed. So a data-driven forecast needs to focus on the most recent data.
One may ask, “Why won’t normal forecasting models see this and automatically adapt to the change?” The reason is that normal ML models are specifically designed not to respond quickly to sudden changes in data. In normal times, changes like these are almost always the result of a temporary phenomenon (e.g., a street closure), and the data quickly goes back to its original pattern. If the models reacted quickly to such changes, forecast accuracy in normal times would drop.
Instead, models are designed to observe patterns over a course of weeks and see if a change is, in fact, permanent before factoring it in. But waiting for weeks is not helpful in a situation like ours.
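The trade-off can be illustrated with simple exponential smoothing, used here only as a stand-in (the article does not say which models Legion uses, and the numbers below are invented). A small smoothing factor reacts slowly, which protects the forecast from a one-week blip but leaves it far too high after a lasting drop. A large factor does the opposite.

```python
def exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series.

    alpha close to 0 reacts slowly to new data; alpha close to 1 reacts quickly.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

steady = [100] * 10

# Scenario 1: a one-week blip (say, a street closure). Demand returns to 100.
after_blip = steady + [40]
slow_blip = exponential_smoothing(after_blip, alpha=0.1)
fast_blip = exponential_smoothing(after_blip, alpha=0.8)

# Scenario 2: a lasting drop. Demand stays at 40.
after_drop = steady + [40, 40, 40]
slow_drop = exponential_smoothing(after_drop, alpha=0.1)
fast_drop = exponential_smoothing(after_drop, alpha=0.8)

print("blip:", slow_blip, fast_blip)
print("drop:", slow_drop, fast_drop)
```

After the blip, the slow model still forecasts roughly 94 for a week that will come in at 100, while the fast model overreacts and forecasts roughly 52. After three weeks of a lasting drop, the slow model is still above 80 while the fast one is already close to 40.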
What if the forecasting engine could instead detect that we’re in an exceptional situation (e.g., by noticing a simultaneous deviation in several locations) and that sudden changes are not only expected but should be treated as normal? And, what if our collection of ML models includes some that are better able to adapt to fast-changing conditions? The engine can then use these models to learn the new patterns in the data and adjust the forecasts accordingly.
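One way to picture the "simultaneous deviation" check is sketched below. This is an assumption-laden illustration, not a description of Legion's detector: the z-score test, the threshold of 3, the 50% cutoff, and all function names and data are hypothetical. Each location's latest value is compared with its own history, and the situation is flagged as exceptional only when a large share of locations deviate at once.

```python
from statistics import mean, stdev

def deviates(history, latest, z_threshold=3.0):
    """True if `latest` is far from the location's own historical mean."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold

def is_exceptional(locations, min_fraction=0.5):
    """True if at least `min_fraction` of locations deviate at the same time.

    `locations` maps a location name to (history, latest_value).
    """
    flags = [deviates(history, latest) for history, latest in locations.values()]
    return sum(flags) / len(flags) >= min_fraction

# Made-up weekly demand: one location dips (e.g., a street closure).
local_event = {
    "downtown": ([100, 102, 98, 101, 99], 40),
    "airport": ([50, 52, 49, 51, 50], 51),
    "suburb": ([200, 198, 202, 201, 199], 199),
}

# Made-up weekly demand: most locations drop together.
widespread = {
    "downtown": ([100, 102, 98, 101, 99], 40),
    "airport": ([50, 52, 49, 51, 50], 20),
    "suburb": ([200, 198, 202, 201, 199], 197),
}

print(is_exceptional(local_event))
print(is_exceptional(widespread))
```

A dip at a single location is treated as local noise, while a drop across most locations trips the flag and could be the signal to restart model selection.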
Legion’s model collection includes models that are better suited to our current situation: ones that focus primarily on the most immediate trends. Forecasts based on these models give us the best indication of what the data is saying about the current situation. In many cases, they will show a sharp downward trend, but in some cases (e.g., online or delivery sales) they may go the other way.
Let’s illustrate this with a couple of examples. Legion forecasts are done in increments of less than an hour because this is what is needed for optimal labor scheduling. For these examples, though, we’ll look at forecasts at the weekly level because the trends are more recognizable visually.
Chart 2: Weekly dollar forecast adjustment with sharp drop for one location as of March 15
The chart shows last year’s and this year’s demand for one location of a business. The yellow-green line shows a sharp drop in recent weeks while the pink line shows the forecast that was generated by a “normal” model. Note that even though the recent demand showed a sudden drop, the forecast does not yet react to this drop since the model still keeps open the possibility that this might be a temporary situation. Interestingly, and purely by coincidence, there was a dip last year around the same time. The forecast model correctly treats this as an anomalous situation and ignores it.
The forecast based on the normal model does not react quickly to the drop that we know is due to the virus. As data comes in over the next few weeks, the model will adjust downward, but we would instead like to be more proactive. This is achieved when the forecasting engine learns that we’re in an anomalous situation that is likely to persist. It then restarts the model selection process based on this new information and regenerates forecasts based on the newly selected model.
The green line shows the effect. Now the forecasting model focuses on the most immediate trend and projects that forward. The resulting forecast is much lower, reflecting the sudden recent drop in demand. Notice that the drop tapers off over time, which is the reasonable thing to do in the absence of any other information. As new data (and actuals) come in, the model will adjust the forecasts automatically based on the trends it learns from the data.
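A forecast that follows the latest trend but tapers off over time can be sketched with a damped trend projection. The article does not name the model behind the green line, so treat this as one plausible illustration: the function name, the damping factor phi, the naive way the level and trend are read off the last two points, and the data are all assumptions.

```python
def damped_trend_forecast(history, horizon, phi=0.6):
    """Project the latest week-over-week change forward, shrinking it each step.

    The forecast h steps ahead is
        level + (phi + phi**2 + ... + phi**h) * trend
    where 0 < phi < 1, so each additional step adds a smaller change.
    """
    level = history[-1]
    trend = history[-1] - history[-2]  # most immediate trend
    forecasts = []
    damp = 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h
        forecasts.append(level + damp * trend)
    return forecasts

# Made-up weekly demand with a sudden drop in the last two weeks.
weekly_demand = [100, 100, 80, 60]
projection = damped_trend_forecast(weekly_demand, horizon=4, phi=0.5)
print(projection)
```

With phi set to 0.5, the projection is 50.0, 45.0, 42.5, 41.25: the decline continues, but each week's drop is half the previous one, so the forecast levels off instead of heading toward zero.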
Below is another example that shows similar results where we look at units forecasted.
Chart 3: Weekly unit forecast adjustment for one location as of March 15
The difference here is that the drop was not so sharp (in fact, there was a slight increase in the week before last). But the model was still able to pick up the downward trend and project that forward.
In both these examples, the original forecast was reasonable for normal times, but the new models produced forecasts that are more appropriate for times like these.
As always, a data-driven forecast will need to be blended with human judgment, especially in a situation like this. Only a person on the ground will be aware of the restrictions that might be placed on people and businesses. But with a good data-driven forecast as a starting point, one should be able to make better decisions.
The beauty of ML is that you have a system that can accommodate extraordinary periods like these. Legion monitoring analyzes the effectiveness of new ML models against actuals so forecasts can continue to improve over time by factoring in those actuals. The same process can be repeated once this crisis is over, which we hope is soon.
We do wish this model selection and retraining were not necessary and that circumstances were different. We know Legion customers and their families in the restaurant, retail, health and fitness areas are experiencing tremendous stress and uncertainty. Legion will continue to monitor data and forecasts so you can better serve consumers during this crisis. In the meantime, we do hope you and your families stay safe and healthy.
Thomas Joseph has been the head of data science at Legion since 2017. At Legion, he works with customers to develop almost-perfect labor demand forecasts while staying ahead of the curve on machine learning and deep learning practices to enable huge increases in labor efficiency while increasing employee engagement. Before Legion, he was a Senior Vice President in the Chief Technology Office at SAP and served as Chief Technology Officer at TIBCO. He holds a Ph.D. in Computer Science from Cornell.