Thomas Joseph on the State of Machine Learning, How Legion Uses It, and How We Should Evaluate Its Efficacy
Thomas Joseph joined Legion in October 2017 as the head of data science. He’s the driving force behind the machine learning (ML) algorithms that we use to forecast labor demand for our clients, and he helps to educate operations directors, data analysts, and the inquisitive public about ML and its unparalleled ability to unearth patterns and inform business decisions.
On a recent sunny afternoon, we caught up with Tommy to chat about the whys and hows of ML, and discuss how his career has evolved in the context of accelerated advances in cloud computing.
Let’s Get Precise: AI vs. ML
Right off the bat, Tommy gave the proverbial eye roll 🙄 when we mentioned “artificial intelligence.” He’s entitled to, being a scientist and all: Tommy prefers language that is specific and to the point. At Legion, we use AI and ML more or less interchangeably in our client-facing materials, but it doesn’t hurt to have the data guy set us straight.
“There’s this joke that I heard a few days ago:
‘What’s the difference between ML and AI?’
‘If it’s written in Python, it’s ML. If it’s written in PowerPoint, it’s AI.’
People use ‘AI’ to mean all kinds of things. It’s not well defined. Personally I don’t use that term. If you say artificial intelligence, it’s whatever you want it to be — anything from simple machine learning all the way to some futuristic robots doing whatever. With machine learning, it is very specific. You can form a picture of what it is. The name describes itself, but you can go further. You can identify frameworks that can be used: programs, a set of algorithms, a set of technologies. Machine learning is a very concrete concept, whereas AI is just this nebulous thing. So honestly, if I had a choice, I would simply talk about machine learning. We tend to say AI / ML, but it’s really machine learning. Everything we do is about learning from something, and applying the lessons learned to something else. To me, that’s machine learning. I don’t know if there’s a more widely accepted definition for AI. If you ask me, I don’t have a definition for it.”
For the record, Tommy (Ph.D., Computer Science) of course knows the textbook definition of AI quite well. But for all intents and purposes, we breathe ML here at Legion: it powers our demand forecast and autonomous matching modules, which generate demand-ready employee schedules, comply with rules and regulations, and account for workers’ preferences and skill levels.
Machine Learning In Demand Forecasting
Our conversation shifted to demand forecasting. What’s with all the buzz right now? Why do we even need machine learning?
“First of all, people have been doing demand forecasting forever. So that part is not new. There are simple formulas one can use. I’ve got my history, I build a linear regression or something similar, and I project forward. What is different about what we do is that we’re looking at lots and lots of different kinds of data. If I’m looking at just historical sales and mathematically projecting forward, then, yes, some of the traditional techniques might be sufficient. But if I want to combine that with a lot of external data — store promotions, the weather, nearby events like street fairs, and so on — now we’re talking about a different problem that makes machine learning techniques more useful. You need machine learning to learn from all of these things at the same time.”
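As a rough illustration of the traditional approach Tommy describes (fit a line through the sales history and project it forward), here is a minimal Python sketch. The function names and the weekly sales figures are made up for illustration; this is not Legion’s code or data. A straight-line trend like this cannot account for promotions, weather, or nearby events, which is the gap the multi-signal ML models are meant to fill.

```python
def fit_line(history):
    """Ordinary least-squares fit of y = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return intercept, slope


def project_forward(history, steps_ahead):
    """Extend the fitted trend line past the end of the history."""
    intercept, slope = fit_line(history)
    n = len(history)
    return [intercept + slope * (n + k) for k in range(steps_ahead)]


# Hypothetical weekly sales for one store (illustrative numbers only).
weekly_sales = [100, 110, 120, 130]
print(project_forward(weekly_sales, 2))  # [140.0, 150.0]
```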
However, we’re not using machine learning for its own sake. Given the vast volumes of data we want to blaze through, it is simply not humanly possible to do what we do by hand.
“If I’m the manager of one store, I might be able to, as a human, do as well as or even better than some of these algorithms, because I know deeply how these things work since I’ve studied it over many, many years. But if you want to scale out to multiple locations, especially if you want to do it in a repeatable, consistent way across your whole organization — if you’ve got a hundred stores or even a thousand stores or ten thousand stores — you can’t trust every single store manager to do a good job. We’re trying to say, ‘How do you take your best managers and replicate them?’ Which is kind of what machine learning is about. There’s an intelligent person who can do certain things with the right knowledge and the right experience and so on, and we’re trying to learn that and to replicate that everywhere automatically.
That’s the problem we’re trying to solve. Sure, you can find that one store manager who does a very good job with forecasting, but are you going to find a hundred of them, or a thousand of them, and populate them everywhere and make sure they’re doing the right thing every day? No, that’s not humanly possible. So the two points we want to emphasize are SCALE and AUTOMATION.”
Applying Machine Learning Models
Broadly speaking, we need machine learning to do high-quality, accurate demand forecasting on a large scale, in a completely automated process. But how? At Legion, we use upwards of 50 machine learning models with our clients’ data, from the familiar linear regression to more intricate ones like random forests and neural networks, for the simple reason that not all data are created equal.
“First of all, we want to scale across multiple locations. If every location were identical, the problem would be somewhat easy. But they’re not. Each location has its own pattern of traffic, things happening nearby, whatever — everything is different. I need to take into account all the nuances of each location. So you’ve got to be able to do this not just in a scalable, replicable way, but also in a very localized way. Every model learns from local data. If I’ve got a store in Portland, I’m going to give it data from Portland to learn from, and not data from San Francisco. But we go a step further, and that’s where all these different models come into play. We’re not going to come up with one model that fits all stores. Instead, we’re going to come up with models that are tailored to each store. You couldn’t do this if you didn’t have a very high degree of automation. We’ve got to build to that.”
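To make the idea of “models tailored to each store” more concrete, here is a small Python sketch of automated per-location model selection: each store’s own history is split into a training part and a holdout part, several candidate forecasters are scored on the holdout, and the best one is kept for that store. The three candidates below are deliberately simple stand-ins, and the store data is invented; none of this is taken from Legion’s actual model library.

```python
def naive_last(history, horizon):
    """Repeat the most recent observation."""
    return [history[-1]] * horizon


def historical_mean(history, horizon):
    """Repeat the average of everything seen so far."""
    return [sum(history) / len(history)] * horizon


def seasonal_naive(history, horizon, season=7):
    """Repeat the value from the same position in the last full season."""
    return [history[-season + (k % season)] for k in range(horizon)]


CANDIDATES = {
    "naive_last": naive_last,
    "historical_mean": historical_mean,
    "seasonal_naive": seasonal_naive,
}


def mean_abs_error(forecast, actual):
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def pick_model_per_store(store_histories, holdout=7):
    """For each store, keep the candidate with the lowest holdout error."""
    chosen = {}
    for store, series in store_histories.items():
        train, test = series[:-holdout], series[-holdout:]
        errors = {
            name: mean_abs_error(model(train, holdout), test)
            for name, model in CANDIDATES.items()
        }
        chosen[store] = min(errors, key=errors.get)
    return chosen


# Invented daily demand: Portland has a strong weekly cycle, San Francisco
# hovers around a steady level.
store_histories = {
    "Portland": [10, 12, 11, 13, 20, 30, 25] * 3,
    "San Francisco": [48, 52] * 10 + [48],
}
print(pick_model_per_store(store_histories))
```

Each store learns only from its own data, and the selection loop is the same code for two stores or ten thousand, which is where the automation comes in.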
Essentially, Legion is revolutionizing how labor is deployed by replacing corporate-to-store directives with more granular, usable data that point to more efficient outcomes. The old way of doing things is no longer adequate if the goal is to staff each location at the level that customer demand actually requires.
Then & Now
“In most organizations, there is a centralized group that comes up with forecasts for every store. But they’re fairly high-level forecasts. I’ll forecast your daily sales, for example, and based on that, I’m going to set up a budget for your store. And then it’s your job to figure out what to do with that budget. The forecasts are not broken down by the hour, or the minute, or the labor type. Obviously, the central department will take into account some local conditions, but it’s limited, maybe some regional holidays. It’s almost like a cap as opposed to a real forecast: ‘This is what I think your sales is going to be, and I’m going to give you this much budget, and you use that to allocate your labor.’ That was sort of the state of the art. Legion’s forecasts are much more detailed. For example, we forecast in thirty-minute increments. We can forecast several different drivers of labor demand separately, and take into account lots of different kinds of local data. Apart from the scale and apart from being hyper-specific, machine learning is absolutely needed if you want to automatically do all this and come up with a good schedule. Just saying that your daily sales is X doesn’t tell you anything about when you might need what kind of labor. Doing those sorts of very fine-grained forecasts as we do, we need ML algorithms.”
How To Think About Predictive Accuracy
Legion’s labor requirement computation is derived from forecasted customer demand. What does that entail, and what do the numbers mean?
“We’re not forecasting the labor directly. We’re forecasting something that determines the labor — it could be sales, or it could be cups of coffee and number of sandwiches. If I forecast demand for coffee and sandwiches, I can then translate that demand into a forecasted demand for labor. To find out how accurate I am, I wait and see. If I predicted 500 cups of coffee to be sold, and 520 cups were sold, I can see how accurate I was on the demand side. I can do the same for labor. I predicted 500 cups of coffee, which translated to so many hours of labor, and actually, 520 were sold — how many hours should that have been? That’s how I determine the labor forecast accuracy. Now the labor forecast accuracy is always higher, mainly because labor tends to be a bit lumpier. If there’s a slight bump in demand between 5 and 6 o’clock, I’m not going to open up another shift for that. So labor tends to be far more stable than customer demand, and the labor computation, therefore, tends to be more accurate than the demand forecast.
We are about 98% accurate in our labor computation. This number is constrained by the level of predictability of the underlying customer demand, which in turn depends on the environment you’re in. Let’s say you have a coffee shop in an office building. Well it’s probably pretty predictable. People come at 9 o’clock, and they have their same cup of coffee at the same time every day — you can get a high level of accuracy. You have another coffee shop in the middle of some park, and it’s just not as predictable, so it’s not reasonable to expect the same level of accuracy across different types of locations. By the way, we’ve actually done quite a bit of work trying to figure out an almost impossible-to-solve problem: can you say a priori what is a good level of predictability for a certain dataset? In some ways it becomes a circular question. I can’t tell you how predictable something is until I predict it — but that doesn’t really answer the question. That said, we have come up with ways to estimate this, and we use this to measure how well we do.”
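The coffee example above can be worked through in a few lines of Python. Two things here are assumptions rather than anything Tommy specified: accuracy is defined as one minus the absolute percentage error, and the productivity rate (cups served per staff-hour) is a made-up number. The sketch only shows why rounding labor up to whole staff-hours makes the labor figure “lumpier” and therefore more forgiving than the demand forecast; it does not reproduce the 98% figure.

```python
import math

CUPS_PER_STAFF_HOUR = 40  # hypothetical productivity rate, not a Legion figure


def accuracy(forecast, actual):
    """Assumed definition: 1 minus the absolute percentage error."""
    return 1 - abs(forecast - actual) / actual


def staff_hours(cups):
    """Translate demand into labor, rounded up to whole staff-hours."""
    return math.ceil(cups / CUPS_PER_STAFF_HOUR)


forecast_cups, actual_cups = 500, 520

demand_accuracy = accuracy(forecast_cups, actual_cups)
labor_accuracy = accuracy(staff_hours(forecast_cups), staff_hours(actual_cups))

print(f"demand accuracy: {demand_accuracy:.1%}")  # 96.2%
print(f"labor accuracy:  {labor_accuracy:.1%}")   # 100.0%
```

Both 500 and 520 cups round up to 13 staff-hours under this assumed rate, so a 20-cup miss on demand produces no miss at all on labor.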
How To Measure Value
If that circular question made you feel somewhat loopy, Tommy suggests a more concrete way to measure the value of using forecasted demand drivers to compute labor requirements (as opposed to improvising schedules based on a budget sent down from corporate HQ).
“It’s easy to measure the cost of over-forecasting. If I ask you to bring in one extra person when that person is not really needed, you can take that person’s hourly wage and multiply that by the hours he was there and look at what it cost you. That’s the measure that people use. But that begs the question: could you ever have gotten better? This is comparing it to a book perfect prediction, but a book perfect prediction is not possible. So the real comparison is, how did we do relative to what an expert might have done? Let’s say we come in with 90% accuracy, and that’s the best an expert could have done, given this location has several random things happening. Well then, we did as well as we could.” (If this point is too hard to grasp and you want to educate yourself further, Tommy recommends this paper on the science of predictability.)
“It’s much harder to quantify the cost of under-forecasting, which may come with the loss of sales or customer satisfaction. There is for sure a cost. And one way people tend to do this is by just using the same measure: if it’s going to cost me this many dollars for every hour of over-forecasting, I’ll use the same number for under-forecasting. It’s not exact, but it’s a way to come up with some number.
I think the real way to look at value is this — it’s all qualitative. Everybody can agree that the more accurate you are, the better you are. The more money you save, one way or another, or the higher your customer satisfaction — you’re better off. Whether 98 percent is the right number or 96 percent, it really depends on what your business looks like. What you can say is that, yes, using the techniques we employ will lead to better results because you’re using more sophisticated techniques that will get you to better scale, in a way that you couldn’t do manually. It will take into account things that you normally would not have taken into account.”
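For readers who still want a dollar figure, the measure described earlier in this section is simple to sketch in Python: over-forecast hours are priced at the hourly wage, and, as a rough proxy, under-forecast hours are priced at the same rate. The wage and the schedule below are invented for illustration, and the symmetric pricing is the shortcut Tommy mentions, not an exact measure of lost sales or customer satisfaction.

```python
def forecast_error_cost(scheduled_hours, needed_hours, hourly_wage):
    """Return (over_cost, under_cost) summed across time slots.

    Over-forecasting: staff scheduled but not needed, priced at the wage.
    Under-forecasting: staff needed but not scheduled, priced at the same
    wage as a stand-in for the harder-to-measure real cost.
    """
    over_cost = 0.0
    under_cost = 0.0
    for scheduled, needed in zip(scheduled_hours, needed_hours):
        diff = scheduled - needed
        if diff > 0:
            over_cost += diff * hourly_wage
        else:
            under_cost += -diff * hourly_wage
    return over_cost, under_cost


# Hypothetical staff-hours per time slot and a hypothetical wage.
scheduled = [3, 3, 4, 2]
needed = [2, 3, 5, 2]
over_cost, under_cost = forecast_error_cost(scheduled, needed, hourly_wage=18.0)
print(over_cost, under_cost)  # 18.0 18.0
```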
How Tommy And Machine Learning Found Each Other
Machine learning really came of age in the last half-decade or so. Every imaginable type of start-up in Silicon Valley seems to have some AI / ML component. Back when Tommy was coming of age as a scholar and plowing through his Ph.D. thesis (“Managing Data Consistency in Large Scale Distributed Systems”) at Cornell, machine learning was barely a thing.
“Actually in some ways, it was a thing, and then it died. My background was in large scale distributed systems. When I was doing my Ph.D., that was quite the up-and-coming — the fact that you could do computation across a network of computers instead of a single computer. But it got very complicated quickly because you had data that was all over the place and needed to be kept consistent. Eventually, all of that grew into the cloud — the ultimate distributed system. All the concepts and technologies behind cloud computing were effectively called ‘distributed systems’ in those days.
Around the time when I was doing my thesis was actually when artificial intelligence came into being, and there was a lot of hype around it. People got excited, and then it all fizzled out, for a while. And the reason it fizzled out was, first of all, there just wasn’t enough data: you couldn’t collect the amount of data you needed, and you couldn’t process it. The techniques were there, but we didn’t have the data, the storage capacity and the computing power to make it actually work. Then the cloud came, and suddenly all that changed. You could actually collect huge amounts of data, then process it with large numbers of computers, so there’s a resurgence in all those ML techniques. Once people got into it, they came up with more and more sophisticated algorithms to do the kind of things that people were talking about before. For me personally, working with large systems and large amounts of data, it became clear that all the primary applications of those systems have become the machine learning that we know now. So that’s how I got into machine learning.”
The Real Challenge Of Machine Learning
“Within the last four to five years, many techniques that have taken a long time to develop have become very nicely packaged. With a few lines of code, you can do things that would have taken months or years to set up. It’s become far, far more accessible. That’s both good and bad. Good in the sense that anyone can write a machine learning program easily. But the hard part of machine learning is coming up with the features and the data that you want to put in there, and whether it makes sense. If you put in a whole bunch of garbage data, you’ll get garbage results. The program will run and give you an answer, but it doesn’t necessarily mean anything. The challenge now is to really make sure what these programs are doing is relevant to your problem. Those are always hard questions. People have this sense that now we can just gather a lot of data, throw it into the machine, and get the answer — that’s not true and that will never be true.”
How To Be Awesome Like Tommy
What does it take to be a good data scientist? Do you really need a Ph.D. in computer science?
“You really need to understand your domain well. You need to understand what the data means, what data could have an impact and what wouldn’t. Then you can pick a technique and set up your model so that you’re learning properly.
I think the best background — and this is probably not at the top of everybody’s list — is you need to be a statistician to understand data properly. You need to know programming for sure, but a lot of the programs now are packaged, and you don’t have to be a super computer scientist anymore. Some good data scientists are people who come from fields like physics and economics and so on — they can actually be better data scientists than computer scientists. Computer science is no longer the top-most thing: the cloud has its infrastructure, and someone else can take care of the computer problems, making sure things scale and all that kind of stuff, so you don’t have to think about it. Amazon or Microsoft or Google will take care of that for you. A data scientist just needs to understand algorithms enough to know whether what they are doing to the data makes sense.”
Despite being a geek, Tommy is super personable and explains things in laypeople’s terms, so he often gets roped into meetings with prospective clients because someone from management wants to nerd out with him.
“That happens quite a bit. Not always the CEO, but somebody there — the data person / the data geek — wants to know what we are doing, and why. And that’s good! In fact, I’d rather have that than the opposite, where someone says, ‘Why aren’t you getting to 99% accuracy?’ People are curious also. They have heard a lot about machine learning and this and that, and they want to know what it really does vs. what is just hype.”
We hope this conversation has piqued your curiosity about machine learning and how it is used at Legion for demand forecasting. Tommy actually did a much deeper dive on the topic than this blog post has room for, so watch this space for the next installment. Meanwhile, if you have any burning questions for Tommy, feel free to reach out at email@example.com.
Tommy lives in Atherton with his wife Holly, his son Anton, and his two cats, Brutus and Hemmingway (pictured above).