What's better, a platform or a bring-your-own machine learning algorithm on AWS?

Q:

What's better, a platform or a bring-your-own machine learning algorithm on AWS?

A:

These days, many companies integrate machine learning solutions into their analytics tool set to enhance brand management, improve customer experience and increase operational efficiency. Machine learning models are the core component of machine learning solutions. Models are trained using mathematical algorithms and large data sets to make reliable predictions. Two common examples of predictions are (1) determining if a set of financial transactions indicates fraud or (2) assessing consumer sentiment around a product, based on input collected from social media.

Amazon SageMaker is a fully managed service that lets developers and data scientists build, train and deploy machine learning models. In SageMaker, you can use out-of-the-box algorithms or go the bring-your-own path for a more customized solution. Both choices are valid and serve equally well as the basis for a successful machine learning solution.

(Editor's note: You can see other alternatives to SageMaker here.)

SageMaker’s out-of-the-box algorithms include popular, highly optimized examples for image classification, natural language processing, etc. The complete list can be found here.

  • Out-of-the-Box Advantages: These algorithms are pre-optimized and continuously improved by AWS, so you can be up, running and deployed fast. Plus, SageMaker’s automatic hyperparameter tuning is available.
  • Out-of-the-Box Considerations: Because AWS controls those continuous improvements, results may be less predictable than if you had complete control over the implementation of your algorithms.
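
To make the tuning point concrete: automatic hyperparameter tuning runs many training jobs with different settings and keeps the best-scoring one. The sketch below illustrates that idea with plain random search over a toy objective. It is not SageMaker's API or its search implementation; `toy_validation_loss`, the parameter names and the ranges are all hypothetical stand-ins for a real training job.

```python
import random


def toy_validation_loss(learning_rate: float, num_layers: int) -> float:
    """Hypothetical stand-in for 'train a model and return its validation loss'."""
    # Made-up optimum at learning_rate=0.1 and num_layers=3.
    return (learning_rate - 0.1) ** 2 + 0.01 * (num_layers - 3) ** 2


def random_search(num_trials: int, seed: int = 0):
    """Sample settings at random and keep the trial with the lowest loss."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),  # assumed search range
            "num_layers": rng.randint(1, 6),           # assumed search range
        }
        trials.append((toy_validation_loss(**params), params))
    best_loss, best_params = min(trials, key=lambda trial: trial[0])
    return best_loss, best_params, trials


if __name__ == "__main__":
    best_loss, best_params, _ = random_search(num_trials=20)
    print(f"best loss {best_loss:.4f} with {best_params}")
```

A managed tuning service does the same bookkeeping for you, with each "trial" being a full training job rather than a one-line function.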

If these algorithms aren’t suitable for your project, you have three other choices: (1) Amazon’s Apache Spark library, (2) custom Python code that uses TensorFlow or Apache MXNet, or (3) “bring your own,” where you are essentially unconstrained but will need to create a Docker image in order to train and serve your model (you may do so using the instructions here).
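
For the training half of a bring-your-own image, here is a minimal sketch of the script such a container might run. It assumes the usual SageMaker container layout: the container is started with the argument `train`, hyperparameters arrive as strings in `/opt/ml/input/config/hyperparameters.json`, input data sits under `/opt/ml/input/data/<channel>`, and model artifacts go to `/opt/ml/model`. The `training` channel name, the `scale` hyperparameter and the toy "model" are hypothetical, chosen only for illustration.

```python
import json
import sys
from pathlib import Path


def train(base_dir: Path = Path("/opt/ml")) -> Path:
    """Read hyperparameters and training data, then write a model artifact."""
    hp_path = base_dir / "input" / "config" / "hyperparameters.json"
    # Hyperparameter values arrive as strings, so convert them explicitly.
    hyperparameters = json.loads(hp_path.read_text()) if hp_path.exists() else {}
    scale = float(hyperparameters.get("scale", "1.0"))  # hypothetical hyperparameter

    # "training" is a channel name assumed for this sketch.
    data_dir = base_dir / "input" / "data" / "training"
    values = []
    for data_file in sorted(data_dir.glob("*.csv")):
        for line in data_file.read_text().splitlines():
            if line.strip():
                values.append(float(line))
    if not values:
        raise ValueError("no training data found")

    # Toy "model": the scaled mean of the inputs.
    model = {"mean": scale * sum(values) / len(values)}

    model_dir = base_dir / "model"
    model_dir.mkdir(parents=True, exist_ok=True)
    model_path = model_dir / "model.json"
    model_path.write_text(json.dumps(model))
    return model_path


if __name__ == "__main__":
    # The training container is started with the argument "train".
    if len(sys.argv) > 1 and sys.argv[1] == "train":
        train()
```

Taking `base_dir` as a parameter lets you exercise the same function against a temporary directory on your laptop before building the image.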

The bring-your-own approach offers you complete freedom. This may prove attractive to data scientists who have already built up a library of custom or proprietary algorithmic code that is not represented in the current out-of-the-box set.

  • Bring-Your-Own Advantages: Enables complete control over the entire data science pipeline along with the use of proprietary IP.
  • Bring-Your-Own Considerations: Dockerization is required to train and serve the resulting model. Incorporating algorithmic improvements is your responsibility.
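
On the serving side, a bring-your-own container is expected to listen on port 8080 and answer two routes: `GET /ping` for health checks and `POST /invocations` for predictions. The sketch below shows that shape using only the Python standard library. The JSON payload format (`{"instances": [...]}`), the in-memory `MODEL` and the `predict` logic are assumptions made for illustration; a production container would more likely use a proper web server and load its model from disk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder model; a real container would load its artifact from /opt/ml/model.
MODEL = {"mean": 2.0}


def predict(payload: dict) -> dict:
    """Toy prediction: the offset of each input from the stored mean."""
    return {"predictions": [x - MODEL["mean"] for x in payload["instances"]]}


class InferenceHandler(BaseHTTPRequestHandler):
    def _reply(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: answer 200 on /ping, 404 on anything else.
        self._reply(200 if self.path == "/ping" else 404)

    def do_POST(self):
        if self.path != "/invocations":
            self._reply(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        self._reply(200, json.dumps(predict(payload)).encode())

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), InferenceHandler).serve_forever()
```

Running the script locally and sending a request to `/invocations` with any HTTP client is a quick way to check the handler before packaging it into an image.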

Regardless of your algorithm choice, SageMaker on AWS is an approach worth considering, given how much focus has been placed on ease-of-use from a data science perspective. If you’ve ever attempted to migrate a machine learning project from your local environment to a hosted one, you’ll be pleasantly surprised at how seamless SageMaker makes it. And if you’re starting from scratch, you’re already several steps closer to your goal, given how much is already at your fingertips.

Written by Michael Golub

As Anexinet’s Senior Vice President of Analytics and Machine Learning, Michael oversees innovation and delivery of Anexinet’s Analytics offerings to empower and modernize our customers with systems of insight for competitive advantage.

Michael has been building and leading enterprise modernization efforts for over 25 years with a focus on applying emergent technologies to strategic business initiatives. Prior to joining Anexinet in 2011, Michael served as Program Manager for Accenture Federal Services, where his team delivered a life-saving solution to the Department of Defense that won the NDIA Top 5 DoD Systems Engineering Program Award and changed the way the Army manages technology programs. Before this, Michael led enterprise logistics and robotics-based transformations at QVC.
