These days, many companies integrate machine learning solutions into their analytics tool set to enhance brand management, improve customer experience and increase operational efficiency. Machine learning models are the core component of machine learning solutions. Models are trained using mathematical algorithms and large data sets to make reliable predictions. Two common examples of predictions are (1) determining if a set of financial transactions indicates fraud or (2) assessing consumer sentiment around a product, based on input collected from social media.
Amazon SageMaker is a fully managed service that lets developers and data scientists build, train and deploy machine learning models. In SageMaker, you can use out-of-the-box algorithms or go the bring-your-own path for a more customized solution. Both choices are valid and serve equally well as the basis for a successful machine learning solution.
(Editor's note: You can see other alternatives to SageMaker here.)
SageMaker’s out-of-the-box algorithms include popular, highly optimized examples for image classification, natural language processing, etc. The complete list can be found here.
- Out-of-the-Box Advantages: These algorithms have been pre-optimized (and are undergoing continuous improvement). You can be up, running and deployed fast. Plus, AWS automatic hyperparameter tuning is available.
- Out-of-the-Box Considerations: Because AWS keeps improving these algorithms, results may be less predictable than they would be if you had complete control over the implementation yourself.
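To make the idea of hyperparameter tuning concrete, here is a minimal local sketch. It is not the SageMaker tuning API: the managed service launches real training jobs and can use a Bayesian search strategy, while this sketch swaps in plain random search over a toy objective. The function `validation_loss`, the parameter names `learning_rate` and `num_layers`, and their ranges are all hypothetical.

```python
import random


def validation_loss(learning_rate, num_layers):
    """Hypothetical stand-in for one training job; lower is better."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (num_layers - 3) ** 2


def random_search(objective, ranges, max_jobs, seed=0):
    """Try max_jobs random configurations and keep the best one."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_params, best_loss = None, float("inf")
    for _ in range(max_jobs):
        params = {
            "learning_rate": rng.uniform(*ranges["learning_rate"]),
            "num_layers": rng.randint(*ranges["num_layers"]),
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Assumed search ranges, chosen only for illustration.
ranges = {"learning_rate": (0.001, 0.5), "num_layers": (1, 8)}
best_params, best_loss = random_search(validation_loss, ranges, max_jobs=50)
print(best_params, best_loss)
```

A tuning service automates this loop at scale: you supply the ranges and a budget of jobs, and it reports the best configuration it found.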
If these algorithms aren’t suitable for your project, you have three other choices: (1) Amazon’s Apache Spark library, (2) custom Python code (that uses TensorFlow or Apache MXNet) or (3) “bring your own,” where you are essentially unconstrained but will need to create a Docker image to train and serve your model (you may do so using the instructions here).
The bring-your-own approach offers you complete freedom. This may prove attractive to data scientists who have already built up a library of custom and/or proprietary algorithmic code that may not be represented in the current out-of-the-box set.
- Bring-Your-Own Advantages: Enables complete control over the entire data science pipeline along with the use of proprietary IP.
- Bring-Your-Own Considerations: Dockerization is required to train and serve the resulting model. Incorporating algorithmic improvements is your responsibility.
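To give a feel for what the Dockerization requirement involves, here is a minimal sketch of a training entry point for a bring-your-own container. It relies on the documented container conventions: SageMaker starts the training container with the argument `train`, places hyperparameters in `/opt/ml/input/config/hyperparameters.json` (all values as strings), mounts input data under `/opt/ml/input/data/<channel>`, and collects whatever is written to `/opt/ml/model`. The channel name `training`, the `epochs` hyperparameter, and the placeholder JSON "model" are assumptions made for illustration; real code would fit an actual model at that step.

```python
import json
import sys
from pathlib import Path


def train(prefix="/opt/ml"):
    """Read config and data from the SageMaker layout, write a model artifact."""
    prefix = Path(prefix)

    # Hyperparameters arrive as a JSON object whose values are all strings.
    hp_path = prefix / "input" / "config" / "hyperparameters.json"
    hyperparameters = json.loads(hp_path.read_text()) if hp_path.exists() else {}
    epochs = int(hyperparameters.get("epochs", "1"))  # "epochs" is hypothetical

    # Each input channel gets its own directory; "training" is an assumed name.
    train_dir = prefix / "input" / "data" / "training"
    files = sorted(p.name for p in train_dir.glob("*")) if train_dir.exists() else []

    # Placeholder for real training: record what would have been used.
    model = {"epochs": epochs, "trained_on": files}

    # Anything saved under <prefix>/model is packaged as the model artifact.
    model_dir = prefix / "model"
    model_dir.mkdir(parents=True, exist_ok=True)
    out_path = model_dir / "model.json"
    out_path.write_text(json.dumps(model))
    return out_path


if __name__ == "__main__":
    # The container is invoked with "train" for training jobs.
    if len(sys.argv) > 1 and sys.argv[1] == "train":
        train()
```

Taking the prefix as a parameter lets you run the same function against a temporary directory on your laptop before building the image. Serving works along similar lines: the container is started with the argument `serve` and is expected to answer HTTP requests.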
Regardless of your algorithm choice, SageMaker on AWS is an approach worth considering, given how much focus has been placed on ease of use from a data science perspective. If you’ve ever attempted to migrate a machine learning project from your local environment to a hosted one, you’ll be pleasantly surprised at how seamless SageMaker makes it. And if you’re starting from scratch, you’re already several steps closer to your goal, given how much is already at your fingertips.