She took a sip of coffee and continued to pace up and down the room, stopping every now and then to look at her computer screen and sigh. This was the moment she’d been dreading all along.
She had spent a lot of time developing this Machine Learning model to help predict claims values and adjustments, experimenting with offline historical data, designing the model, testing, and improving it. And it was working like a charm: there was more than enough data, of decent quality, and the R-squared was well over 90%.
Then she got involved in deploying the model into production. Getting it live! And it all worked so well, the first time. Her business unit colleagues were ecstatic, praising her for the improvements they were getting in their day-to-day work. She was so proud. After all, getting praised by the claims underwriter was not a daily treat, and she even forgot about the time when she had to play the Data Engineer, DevOps Engineer, and ML Engineer parts, to get things done.
A few weeks later, it was time for another model. Then, a few months later, another one. Success was nice, but the more she worked, the harder it got, and the more time it took. And now, go figure, the first model was beginning to show signs of fatigue. Its predictions were off target. Maybe the data had shifted, or maybe there was something else afoot. Her colleagues were not happy anymore: the whole business process had been reshaped around the model. It had become an important tool, and they were not willing to go back to the days before.
Whatever it was, it was time to go back to the drawing board, re-test, re-train, improve and re-deploy. All this almost by herself, with little help, and with her own hands as automation seemed far away. And of course, the pressure. Not the Data Scientist life she had envisioned and worked hard for.
She went to the kitchen, walked to the coffee machine, and pushed the button to select a double espresso, black. It was going to be a long day. A long evening, maybe even a long night. Something needed to change though, she thought. Or else…
This is not a sad story. It is a realistic one, in many aspects a common one. Many Data Scientists around the globe are faced with a wave of manual operational tasks when developing, testing, deploying, re-training, and re-deploying Machine Learning models, and often they are facing all these tasks alone or with a small team of colleagues.
Enthusiastic and dedicated, they must perform cross-functional tasks to move things along, fill in the gaps, and ensure that the business objectives are met. These are tasks that are more easily classified as Software Engineering, DevOps, or Support than as Data Science. They understand that a Machine Learning model becomes useful to the organization only when it is used in real life and delivers results as part of genuine business processes. And once it gets there, after the moment of triumph has passed, they are looking at yet another production-grade component with a distinct lifecycle of its own. Are they still the ones who need to maintain it?
Try as they might, these heroes of Data Science will never be able to shine at their brightest without real support. So, what can be done?
Re-Introducing MLOps
MLOps is by no means a new concept or a new practice. Giants such as Google, Microsoft, IBM, Amazon, and others, have been relying on it to empower Data Science projects, reduce time-to-value for Machine Learning modeling, and thus accelerate their data-driven approach to insight-powered business success.
What has changed in the meantime is that companies of all sizes, operating in all sectors, have become increasingly reliant on data insights to fuel their decision-making processes to survive in a global business environment that is growing more competitive.
Therefore, Data Science, Machine Learning, and Artificial Intelligence are becoming mainstream functional capabilities in many companies, rather than luxury services that are set aside and planned for a distant future.
OK, but what is it all about?
MLOps is, firstly, the concept of placing Data Science at the center of a well-organized process, ensuring that Data Scientists are provided with exactly the right data, technical and operational support, collaboration tools, platforms, pipelines, versioning capabilities, and automation services that can prevent them from being left alone to toil like inventors in hidden laboratories, who rarely ever get to see one of their masterpieces in the light of day.
Secondly, MLOps is the practice of guaranteeing that Machine Learning projects are collaborative, built on the sturdy base of the ML/Data platform and supported by teams of Data, DevOps, and Software Engineers, who take on the burden of the operational steps. This leaves Data Scientists free to put all their effort into designing ever better models, improving them, and overseeing their deployment.
This collaborative and structured approach has multiple advantages. The first is that Data Scientists can focus on the craft they spent years preparing and training for. This can spell a major improvement in the quality of their work.
Secondly, since MLOps services cover the Machine Learning process end-to-end, risk is reduced through increased security, better compliance, and fewer errors in production. This translates into markedly improved overall efficiency, resilience, and productivity.
Most importantly, MLOps is the very definition of added value through process speed-up and a dramatic reduction of the time-to-production. Since an ML model is only useful after it is launched into production, compressing the time needed to deploy, while at the same time increasing accuracy and minimizing risks, means that profitability is achievable and constantly improvable.
Finally, MLOps solves the problem of efficient continuity. Using pipelines and automated processes, together with versioning and with feature and model repositories, Data Scientists can easily analyze, audit, re-train, and re-deploy any given model. They no longer need to resort to manual processing, instead relying on a database of their own works and concepts, created and curated for them by the delivered MLOps solution and services.
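To make the idea of continuity a little more concrete, here is a minimal sketch in Python of two of the pieces mentioned above: a versioned model repository and an automated check that flags a model for re-training. Everything in it is an assumption made for illustration. The `ModelRegistry` class, the `needs_retraining` helper, the model name `claims_value`, and the 0.9 R-squared threshold (borrowed loosely from the story at the start) are hypothetical, and a real MLOps platform would typically persist versions and run such checks inside a scheduled pipeline rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

# For this sketch, a "model" is just a function from one feature to a prediction.
Model = Callable[[float], float]


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry keeping an append-only version list per model."""
    versions: Dict[str, List[Model]] = field(default_factory=dict)

    def register(self, name: str, model: Model) -> int:
        """Store a new version and return its 1-based version number."""
        self.versions.setdefault(name, []).append(model)
        return len(self.versions[name])

    def latest(self, name: str) -> Tuple[int, Model]:
        """Return (version number, model) for the most recent version."""
        stored = self.versions[name]
        return len(stored), stored[-1]


def needs_retraining(model: Model, features: Sequence[float],
                     actual: Sequence[float], threshold: float = 0.9) -> bool:
    """Flag the model when R-squared on recent data drops below the threshold."""
    predicted = [model(x) for x in features]
    return r_squared(actual, predicted) < threshold


# Usage: version 1 was fit when claims were roughly 2x the feature value.
registry = ModelRegistry()
registry.register("claims_value", lambda x: 2.0 * x)

# Made-up recent data in which the relationship has shifted to roughly 3x.
features = [1.0, 2.0, 3.0, 4.0]
actual = [3.0, 6.0, 9.0, 12.0]

version, model = registry.latest("claims_value")
if needs_retraining(model, features, actual):
    # Stand-in for a real re-training step; the old version stays available for audit.
    registry.register("claims_value", lambda x: 3.0 * x)
```

The design choice worth noting is that re-training adds a new version instead of overwriting the old one, which is what keeps earlier models available for analysis and audit.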
2021 has been called by some IT industry actors “the year of Machine Learning”, heralding the time when ML predictions become the norm in almost all industries and sectors around the globe.
As the pressure is stepped up on all organizations to become more competitive by augmenting their decision-making capabilities with the help of Artificial Intelligence, more Data Scientists will be called upon to help drive growth with their knowledge.
As they increasingly face more pressure to deliver, they must and will rely on MLOps services to succeed and shine.