What Does MLOps Mean?
Machine learning operations (MLOps) is an approach to managing the entire lifecycle of a machine learning model — including its training, tuning, everyday use in a production environment and retirement.
MLOps, which is sometimes referred to as DevOps for ML, seeks to improve communication and collaboration between the data scientists who develop machine learning models and the operations teams who oversee an ML model's use in production. It achieves this by automating as many repetitive tasks as possible and improving feedback loops.
An important goal of MLOps is to help stakeholders use artificial intelligence (AI) tools to solve business problems while also ensuring an ML model's output meets best practices for responsible and trustworthy AI.
Techopedia Explains MLOps
MLOps was developed with the knowledge that not all data scientists and ML engineers have experience with programming languages and IT operations. The continuous feedback loops that MLOps provides allow employees outside data science to focus on what they know best instead of having to stop and learn new skills.
An MLOps rollout requires five important components to be successful:
1. Pipelines
ML pipelines automate the workflow required to produce a machine learning model. A well-designed pipeline supports two-way flows for data collection, data cleaning, data transformation, feature extraction and model validation.
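As a simplified illustration, the sketch below chains cleaning, transformation, training and validation into a single scikit-learn pipeline. The synthetic records, column names and churn label are placeholders for a real collected dataset:

```python
# A minimal pipeline sketch using scikit-learn; synthetic data stands in
# for a real dataset, and column names are illustrative.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Data collection: in production this would come from a warehouse or feature store.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 60, size=300).astype(float),
    "monthly_spend": rng.normal(50, 15, size=300),
})
df.loc[df.sample(frac=0.05, random_state=42).index, "monthly_spend"] = np.nan  # simulate missing values
y = (df["tenure_months"] < 12).astype(int)  # illustrative churn label

X_train, X_test, y_train, y_test = train_test_split(df, y, test_size=0.2, random_state=42)

# Cleaning, transformation and training chained into one reproducible workflow.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # data cleaning
    ("scale", StandardScaler()),                    # data transformation
    ("model", LogisticRegression(max_iter=1000)),   # model training
])
pipeline.fit(X_train, y_train)

# Model validation: evaluate on held-out data before promoting to production.
print("Held-out accuracy:", pipeline.score(X_test, y_test))
```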
2. Monitoring
Machine learning uses iterative mathematical functions instead of programmed instructions, so it's not unusual for an ML model’s performance to decline over time as new data is introduced. This phenomenon, which is known as model drift, requires continuous monitoring to ensure model outputs remain within acceptable limits.
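One lightweight way to watch for drift is to compare recent model outputs against a reference sample collected at validation time. The sketch below assumes both are available as arrays of scores and uses a two-sample Kolmogorov-Smirnov test; the significance threshold and synthetic data are illustrative:

```python
# A minimal drift check, assuming reference_scores (from validation time) and
# live_scores (recent production outputs) are 1-D arrays of model scores.
import numpy as np
from scipy.stats import ks_2samp

def check_drift(reference_scores, live_scores, p_threshold=0.01):
    """Flag drift when the live score distribution differs significantly
    from the reference distribution."""
    statistic, p_value = ks_2samp(reference_scores, live_scores)
    return {"statistic": statistic, "p_value": p_value, "drift": p_value < p_threshold}

# Illustrative usage with synthetic data standing in for real model outputs.
rng = np.random.default_rng(0)
reference = rng.normal(0.4, 0.1, size=5_000)
live = rng.normal(0.55, 0.1, size=1_000)   # shifted distribution simulates drift
print(check_drift(reference, live))
```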
3. Collaboration
Successful ML deployments require a variety of technical skills as well as a work environment that values inter-departmental collaboration. Feedback loops can help bridge the cultural and technical gaps between the data scientists who create machine learning models and the operations teams who manage them in production.
4. Versioning
In addition to code releases, other elements that need to be versioned include training data and the meta-information that describes specific ML models.
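As a rough illustration, the sketch below uses MLflow's tracking API to record a model's hyperparameters, a fingerprint of its training data and the trained artifact in one run; the experiment name, parameter values and tiny in-memory dataset are placeholders:

```python
# A minimal versioning sketch with MLflow tracking; names and values are illustrative.
import hashlib
import numpy as np
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# Small in-memory stand-in for the real training set.
X_train = np.array([[0.1], [0.4], [0.8], [0.9]])
y_train = np.array([0, 0, 1, 1])

mlflow.set_experiment("churn-model")

with mlflow.start_run():
    # Track meta-information that describes this specific model version.
    mlflow.log_param("model_type", "LogisticRegression")
    mlflow.log_param("C", 1.0)
    # Fingerprint the training data so the exact dataset version is traceable.
    mlflow.log_param("training_data_sha256",
                     hashlib.sha256(X_train.tobytes() + y_train.tobytes()).hexdigest())

    model = LogisticRegression(C=1.0, max_iter=1000).fit(X_train, y_train)
    mlflow.log_metric("train_accuracy", model.score(X_train, y_train))
    mlflow.sklearn.log_model(model, "model")  # stores the serialized model artifact
```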
5. Validation
MLOps uses shift left testing to reduce bugs in development and shift right testing to reduce bugs in operations. Shift right is a synonym for "testing in production."
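A simplified sketch of the two styles of testing might look like the following, where the shift left check runs as a pre-deployment gate and the shift right check runs against live predictions; the thresholds and synthetic data are illustrative:

```python
# A minimal sketch of shift-left and shift-right checks; thresholds are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def shift_left_check(model, X_val, y_val, min_accuracy=0.85):
    """Pre-deployment gate: fail the release if held-out accuracy is too low."""
    accuracy = (model.predict(X_val) == y_val).mean()
    assert accuracy >= min_accuracy, f"accuracy {accuracy:.2f} below gate {min_accuracy}"

def shift_right_check(live_predictions, expected_positive_rate=0.5, tolerance=0.15):
    """In-production gate: alert when the share of positive predictions drifts
    away from what was observed during validation."""
    positive_rate = np.mean(live_predictions)
    if abs(positive_rate - expected_positive_rate) > tolerance:
        print(f"ALERT: positive prediction rate {positive_rate:.2f} is out of range")

# Illustrative usage with synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X[:150], y[:150])

shift_left_check(model, X[150:], y[150:])                              # runs before deployment
shift_right_check(model.predict(rng.normal(1.0, 1.0, size=(50, 3))))   # runs in production
```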
MLOps Implementation
A well-designed MLOps implementation can be used as a monitoring and automation system for ML models from the early stages of development to end-of-life. At its best, MLOps will support the needs of data scientists, software developers, compliance teams, data engineers, ML researchers and business leaders.
Unfortunately, MLOps has a high failure rate when it is not implemented properly. One of the most common challenges is cultural, created by competing priorities and siloed communication between business divisions. In response, new tools and services that facilitate feedback loops, as well as the technical aspects of a model's lifecycle, are being adopted at an increasing rate.
MLOps vs DevOps
MLOps and DevOps share many similarities in their development phases. They both support the continuous integration of source control, automated testing and a continuous delivery approach to code releases. An important difference, however, is that while DevOps embraces a shift left approach to conducting integration tests and unit tests during the development phase, MLOps uses both shift left and shift right testing to prevent model drift in production.
MLOps Best Practices
MLOps teams are cross-functional, which means they have a mix of stakeholders from different departments within the organization. To ensure data scientists, engineers, analysts, operations and other stakeholders can develop and deploy ML models that continue to produce optimal results, it's important for the team to maintain good communication throughout the model's lifecycle and follow best practices for each of the pipeline's components. This includes:
Data Preparation
Data is the backbone of an ML model, and data quality is a key consideration. The data used to train ML models should follow best practices for data preprocessing, including data cleaning, data transformation and exploratory data analysis.
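The sketch below shows what a minimal, reusable preparation step might look like with pandas; the column names and example records are hypothetical:

```python
# A minimal data-preparation sketch with pandas; column names are hypothetical.
import pandas as pd

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                               # data cleaning
    df = df.dropna(subset=["customer_id", "signup_date"])   # drop rows missing key fields
    df["signup_date"] = pd.to_datetime(df["signup_date"])   # data transformation
    df["monthly_spend"] = df["monthly_spend"].clip(lower=0) # guard against bad values
    return df

raw = pd.DataFrame({
    "customer_id": [1, 1, 2, None],
    "signup_date": ["2023-01-05", "2023-01-05", "2023-02-10", "2023-03-01"],
    "monthly_spend": [49.0, 49.0, -5.0, 20.0],
})

clean = prepare(raw)
print(clean.describe(include="all"))   # quick exploratory summary of the cleaned data
```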
Feature Engineering
An important goal of feature engineering is to optimize the accuracy of a supervised learning model's outputs. Best practices include a process known as data verification. It's also important to make sure feature extraction scripts can be reused in production for retraining.
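One way to keep feature extraction reusable is to put all feature logic behind a single function (or transformer) that both the training pipeline and the production scoring code import. The sketch below is illustrative, with hypothetical column names and a simple verification check:

```python
# A minimal sketch of reusable feature extraction; column names are hypothetical.
import pandas as pd

def extract_features(df: pd.DataFrame) -> pd.DataFrame:
    """One shared entry point, so training, retraining and serving all build
    features exactly the same way."""
    features = pd.DataFrame(index=df.index)
    features["tenure_months"] = df["tenure_days"] / 30.0
    features["spend_per_visit"] = df["total_spend"] / df["visits"].clip(lower=1)
    features["is_high_value"] = (df["total_spend"] > 500).astype(int)
    # Data verification: fail fast if the feature table contains missing values.
    assert not features.isnull().any().any(), "feature verification failed: NaNs present"
    return features

# The same function is imported by the training pipeline and by the
# production scoring service, which keeps features consistent across both.
sample = pd.DataFrame({"tenure_days": [90, 400], "total_spend": [120.0, 900.0], "visits": [0, 12]})
print(extract_features(sample))
```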
Data Labeling
Label quality is very important for supervised learning tasks. A best practice is to ensure the labeling process is well-defined and peer-reviewed.
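One lightweight way to make peer review measurable is to have two annotators label the same sample and compute their agreement, for example with Cohen's kappa; the labels and threshold below are illustrative:

```python
# A minimal label-quality sketch: measure agreement between two annotators
# on the same review sample; the labels here are illustrative.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["spam", "spam", "ham", "ham", "spam", "ham", "ham", "spam"]
annotator_b = ["spam", "ham",  "ham", "ham", "spam", "ham", "spam", "spam"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Inter-annotator agreement (Cohen's kappa): {kappa:.2f}")

# A low score suggests the labeling guidelines need to be clarified before
# the labels are used for training.
if kappa < 0.7:
    print("Agreement below threshold: review the labeling guidelines")
```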
Training and Tuning
It's useful to train and tune simple, interpretable ML models to start with because they are easier to debug. ML toolkits such as Google Cloud AutoML, MLflow, Scikit-Learn and Microsoft Azure ML Studio can make the debugging process easier for more complex models.
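As a simplified example, the sketch below trains and tunes a shallow decision tree with scikit-learn and prints its rules so its behavior can be inspected directly; the dataset and parameter grid are illustrative:

```python
# A minimal sketch: start with a small, interpretable model and a simple
# hyperparameter search; the dataset and grid are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tune only a couple of parameters on a shallow tree that is easy to inspect.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 4], "min_samples_leaf": [5, 10]},
    cv=5,
)
search.fit(X_train, y_train)

print("Best params:", search.best_params_)
print("Test accuracy:", search.best_estimator_.score(X_test, y_test))
print(export_text(search.best_estimator_, feature_names=list(X.columns)))  # human-readable rules
```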
Review and Governance
Like DevOps, MLOps best practices include keeping track of versioning. This includes tracing the model's lineage for changes throughout its lifecycle. Platforms such as MLflow or Amazon SageMaker can be used to support this best practice.
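A rough sketch of this with MLflow's model registry is shown below; registering a logged model ties each version back to the run that produced it. The SQLite backend, experiment name, model name and tiny in-memory dataset are illustrative choices:

```python
# A minimal lineage sketch with MLflow; names and the backend choice are illustrative.
import numpy as np
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# A database-backed store is used here because the model registry needs one.
mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("churn-model")

with mlflow.start_run() as run:
    model = LogisticRegression().fit(np.array([[0.0], [0.2], [0.8], [1.0]]),
                                     np.array([0, 0, 1, 1]))
    mlflow.sklearn.log_model(model, "model")

# Registering the model ties this version back to the run that produced it,
# preserving its lineage: parameters, metrics, artifacts and source run.
version = mlflow.register_model(f"runs:/{run.info.run_id}/model", "churn-model")
print("Registered churn-model version:", version.version)
```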
Monitoring
After deploying the model, an important best practice is to monitor model outputs and summary statistics on a continual basis. This includes keeping an eye on:
- The infrastructure on which the model is deployed to ensure it meets benchmarks in terms of load, usage, storage and health.
- Statistical summaries that indicate the existence of bias introduced by input data that is either over-represented or under-represented.
- The ML model itself. An automated alert system can be used to trigger a model’s re-training process when outputs for the model drift beyond acceptable statistical boundaries.
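As a simplified illustration of the last two checks, the sketch below compares the category mix of incoming data to the mix recorded at training time and raises an alert that could feed an automated re-training trigger; the categories, shares and threshold are hypothetical:

```python
# A minimal sketch of monitoring input summary statistics; category names,
# thresholds and the re-training hook are illustrative.
import pandas as pd

TRAINING_SHARES = {"18-30": 0.30, "31-50": 0.45, "51+": 0.25}  # recorded at training time

def check_input_representation(live_batch: pd.Series, max_gap=0.10):
    """Compare live category shares to training shares and flag groups that are
    over- or under-represented beyond the allowed gap."""
    live_shares = live_batch.value_counts(normalize=True)
    flagged = {}
    for group, expected in TRAINING_SHARES.items():
        gap = live_shares.get(group, 0.0) - expected
        if abs(gap) > max_gap:
            flagged[group] = round(gap, 2)
    return flagged

# Illustrative usage: a live batch that heavily over-represents one age band.
live = pd.Series(["18-30"] * 70 + ["31-50"] * 20 + ["51+"] * 10, name="age_band")
skewed = check_input_representation(live)
if skewed:
    print(f"ALERT: input skew detected {skewed}; consider triggering re-training")
```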