While machine learning (ML) knowledge is vital for an AI engineer, building an effective career in AI also requires production engineering capabilities.
That's where machine learning operations (MLOps) comes in.
What is MLOps?
MLOps is a collection of practices, tools and techniques that enable ML engineers to reliably and efficiently deploy and maintain ML models in production. The term "MLOps" combines "machine learning" with "DevOps," a practice borrowed from the software engineering discipline.
ML models are typically trained and tested in an isolated experimental process; when a model is ready to be deployed, MLOps is employed to transform it into a production system.
Like the DevOps approach, MLOps aims to improve the quality of production models by bringing more automation into the process. (Also read: MLOps: The Key to Success in Enterprise AI.)
MLOps Best Practices
MLOps best practices include:
Data Preparation and Feature Engineering
Data is the backbone of an ML model, and quality data can produce a quality model. It is therefore vital to ensure data is valid and complete (i.e., that it contains relevant attributes and no missing values) and clean (e.g., removing duplicate and irrelevant observations and filtering unwanted noise). (Also read: How AI Can Ensure Good Data Quality.)
After data preparation, feature extraction is a vital task which requires iterative data transformation, aggregation and deduplication. It is important to ensure data verification and feature extraction scripts are reusable at the production stage.
Label quality is crucial in supervised learning tasks, as wrong labels introduce noise which may lead to sub-optimal results.
Labelling processes should therefore be well-defined and controlled, and labels should be peer-reviewed.
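The cleaning steps above can be sketched with pandas; the dataset and column names here are purely illustrative:

```python
import pandas as pd

# Hypothetical raw dataset; column names are illustrative only.
raw = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 4],
    "age":     [34, 34, None, 29, 51],
    "label":   ["churn", "churn", "stay", "stay", None],
})

# Drop exact duplicate observations, then drop rows missing the
# label (unusable for supervised training).
clean = raw.drop_duplicates().dropna(subset=["label"])

# Impute a missing numeric attribute with the column median.
clean = clean.assign(age=clean["age"].fillna(clean["age"].median()))

print(len(clean))  # rows remaining after cleaning
```

Wrapping steps like these in reusable functions is what makes them easy to re-run at the production stage.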
Training and Tuning
It is useful to start with a simple and interpretable model so you can get the infrastructure right and debug the model.
To select an ML model for production, there should be a fair comparison between algorithms based on effective hyperparameter search and model selection. ML toolkits such as Google Cloud AutoML, MLflow, Scikit-Learn and Microsoft Azure ML Studio can be used for this task. (Also read: Data-Centric vs. Model-Centric AI: The Key to Improved Algorithms.)
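As a minimal sketch of hyperparameter search with Scikit-Learn (one of the toolkits mentioned above), assuming a simple logistic regression on a built-in dataset; the grid values are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Grid of candidate hyperparameters; values here are illustrative.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,                 # 5-fold cross-validation for a fair comparison
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Cross-validated search like this gives each candidate the same evaluation protocol, which is what makes the comparison between configurations fair.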
Review and Governance
It is useful to keep track of model lineage, model versioning and the model's transitions through its lifecycle.
You can use MLOps platforms, such as the open-source MLflow or Amazon SageMaker, to discover, share and collaborate on ML models.
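As a toy illustration of the kind of lineage and versioning information such a registry tracks (real platforms like MLflow's model registry provide this out of the box; the file layout and names below are invented for the sketch):

```python
import hashlib
import json
import time
from pathlib import Path

def register_model(registry_dir, name, model_bytes, stage="Staging", metadata=None):
    """Record a new version of a model with lineage metadata.

    A toy stand-in for what a real model registry tracks: version
    number, content hash, lifecycle stage and registration time.
    """
    registry = Path(registry_dir)
    registry.mkdir(parents=True, exist_ok=True)
    index_path = registry / f"{name}.json"
    entries = json.loads(index_path.read_text()) if index_path.exists() else []
    version = len(entries) + 1
    entries.append({
        "version": version,
        "sha256": hashlib.sha256(model_bytes).hexdigest(),
        "stage": stage,                     # e.g. Staging -> Production -> Archived
        "registered_at": time.time(),
        "metadata": metadata or {},
    })
    index_path.write_text(json.dumps(entries, indent=2))
    (registry / f"{name}-v{version}.bin").write_bytes(model_bytes)
    return version

# Registering two versions of the same model name:
import tempfile
reg_dir = tempfile.mkdtemp()
v1 = register_model(reg_dir, "churn-model", b"weights-v1")
v2 = register_model(reg_dir, "churn-model", b"weights-v2", stage="Production")
print(v1, v2)  # -> 1 2
```

Recording the lifecycle stage alongside each version is what lets a team audit a model's transitions from staging to production and back.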
Deployment and Monitoring
Registered models should be packaged, have their access managed and be deployed to the cloud or to edge devices as their requirements dictate.
Model packaging can be performed either by wrapping the model with an API server and exposing REST or gRPC endpoints, or by using a Docker container to deploy the model on cloud infrastructure.
You can deploy the model on a serverless cloud platform or on a mobile app for edge-based models. (Also read: Experts Share the Top Cloud Computing Trends of 2022.)
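A minimal sketch of the API-server packaging approach, using only the Python standard library; the predict function is a stand-in for a real trained model, and in practice you would typically use a web framework or gRPC tooling instead:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stand-in for a trained model's inference call."""
    return {"score": sum(features) / max(len(features), 1)}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        # Read the JSON request body and run inference on its features.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload.get("features", []))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), PredictHandler).serve_forever()
```

Packaging this script with its model artifact into a Docker image is the usual next step for deploying it on cloud infrastructure.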
After deploying the model, it is important to implement monitoring infrastructure to maintain it. Monitoring includes keeping an eye on the following:
- The infrastructure on which the model is deployed. This infrastructure should meet benchmarks in terms of load, usage, storage and health.
- The ML model itself. In order to keep up with model drift due to changes between training and inference data, you should implement an automated alert system as well as a model re-training process.
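A deliberately simple sketch of an automated drift alert built on summary statistics; the threshold and data are illustrative, and production systems typically use richer statistical tests:

```python
import statistics

def drift_alert(train_values, live_values, z_threshold=3.0):
    """Flag drift when the live mean strays too far from the training mean.

    Compares the live batch mean against the training distribution in
    units of the training standard deviation. A deliberately simple
    check; real monitoring often uses tests such as PSI or KS.
    """
    mu = statistics.fmean(train_values)
    sigma = statistics.stdev(train_values)
    live_mu = statistics.fmean(live_values)
    return abs(live_mu - mu) / sigma > z_threshold

train = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
print(drift_alert(train, [10.1, 9.9, 10.3]))   # similar data -> no alert
print(drift_alert(train, [25.0, 26.5, 24.8]))  # shifted data -> alert
```

Wiring a check like this into the monitoring pipeline is what lets an alert trigger the model re-training process automatically.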
MLOps Challenges
While training an ML model on a given dataset is relatively easy, producing a model that is fast, accurate, reliable and able to serve a large number of users is quite challenging. Some key challenges are:
- Data management. ML models are typically trained on large amounts of data, and keeping track of all of it can be tough, especially for a single person. Moreover, ML models rely on training data to make predictions, and as data changes, so should the model. This means ML engineers must keep track of data changes and make sure the model learns accordingly.
- Parameter management. ML models are getting bigger and bigger in terms of the number of parameters they comprise, making it challenging to keep track of them all. Small changes in parameters can make a massive difference in the results.
- Debugging. Unlike typical software, debugging ML models is a very challenging art.
MLOps vs DevOps
Though MLOps is built on DevOps principles, and they have fundamental similarities, they’re quite distinct in execution.
Some key differences between MLOps and DevOps include:
- MLOps is more experimental than DevOps. In MLOps, data scientists and ML engineers are required to experiment with different models, features, parameters and hyperparameters. They must also manage their data and code base so they can reproduce their results.
- MLOps projects are typically developed by people without expertise in software engineering. This could include data scientists or researchers who specialize in exploratory data analysis, model creation and/or experimentation.
- Testing ML models involves model validation, model training and testing. This is quite different from conventional software testing such as integration testing and unit testing. (Also read: Why ML Testing Could Be The Future of Data Science Careers.)
- ML models are typically trained offline. However, deploying ML models as a prediction service requires continuous retraining and deployment.
- ML models can deteriorate in more ways than conventional software systems. Because data profiles evolve constantly, ML models' performance can decline during the production phase. This phenomenon, known as "model drift," occurs for a number of reasons, such as:
- Differences between training data and inference data.
- The wrong hypothesis (i.e., objective) was selected to serve an underlying task. This often leads you to collect biased data for model training, resulting in wrong predictions at the production stage. In the retraining phase, when you correct mistakes and feed the model the same data with different labels, the model becomes further biased, and this snowball keeps growing.
- ML models need to be continually monitored, even during the production phase. On top of that, the summary statistics of the data the model uses need to be continually monitored too. Summary statistics can change over time and it's important for ML engineers to know when that happens, especially when the values deviate from the expectations, so they can retrain the model if/when required.
Besides these differences, MLOps and DevOps share many similarities, especially when it comes to continuous integration of source control, integration testing, unit testing and the delivery of software modules or packages.
MLOps is primarily applied as a set of best practices. However, the discipline is now evolving into an independent approach to ML lifecycle management. MLOps deals with the entire life cycle of a machine learning model — including conceptualization, data gathering, data analysis and preparation, model development, model deployment and maintenance.
Compared to standard ML modeling, MLOps production systems must handle continuously evolving data while delivering high performance and running without interruption. This presents some unique challenges but, when executed properly, MLOps provides a reliable and efficient method of deploying and maintaining ML models. (Also read: Debunking the Top 4 Myths About Machine Learning.)