Machine learning (ML) and cloud computing are well on the way to forming a symbiotic relationship of sorts, promising a new wave of productivity gains far beyond what either technology can achieve on its own.
A close look at the strengths and weaknesses of these technologies reveals that they complement each other in novel ways. The cloud, after all, is adept at providing scale for massive amounts of data that easily exceed the storage capacity of most enterprises.
Meanwhile, ML brings order and structure to those volumes, while also identifying new opportunities to enhance their value.
ML and the Cloud: Two Great Technologies for the Enterprise
Given these complementary strengths, many cloud providers are touting new ML services to clients looking to manage terabytes of data, even as users work to deploy their own solutions that transcend the multiple platforms spanning hybrid cloud deployments.
As Datamation’s Sean Michael Kerner noted recently, ML services are working their way into a wide variety of workloads, spanning both proprietary and open source platforms. The cloud, in fact, offers a far easier way to deploy ML and other artificial intelligence (AI) tools compared to traditional datacenter infrastructure. In many cases, ML requires specialized hardware featuring optimized GPUs and inference processors, which can be expensive to deploy on-premises.
At the same time, ML frameworks and processes can be difficult to configure and deploy in-house, particularly within organizations that lack the specialized training for such projects.
Cloud providers, on the other hand, can more easily distribute these costs among multiple clients and can deliver automated workflows, data analysis, development models and a host of other benefits to put ML to work quickly and effectively. Users should take care, however, to establish a consistent ML framework across all clouds, while ensuring that data access remains as broad as possible and workflow modeling does not become overly complex.
Even with the cloud, ML must be deployed in a scalable and sustainable way, which invariably requires adoption throughout the entire data pipeline, including monitoring, auditing and version tracking. This is what Google is aiming for with the Cloud AI Platform Pipelines service.
The idea is to merge ML with DevOps (MLOps) to ensure a more repeatable, automated approach to key functions like data prep, analysis, training and evaluation. In this way, the company hopes to flatten the learning curve for newcomers to ML while allowing more advanced users to quickly develop customized pipelines, either built from scratch or assembled from a growing list of pre-built tools and templates.
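The repeatable, auditable flow described above can be sketched in a few lines of plain Python. This is an illustrative toy only, not the Cloud AI Platform API: the `Pipeline` class and stage names are hypothetical, standing in for the prep–analysis–training–evaluation steps the article names.

```python
class Pipeline:
    """A minimal, hypothetical MLOps pipeline: each stage is a function that
    takes the previous stage's output, and every run is logged so the whole
    flow stays repeatable and auditable (monitoring/version-tracking hook)."""

    def __init__(self):
        self.stages = []
        self.run_log = []

    def stage(self, name, fn):
        self.stages.append((name, fn))
        return self  # allow chaining

    def run(self, data):
        for name, fn in self.stages:
            data = fn(data)
            self.run_log.append(name)  # audit trail of completed stages
        return data

# Wire up the key functions the article lists: prep, analysis, training, evaluation.
pipeline = (
    Pipeline()
    .stage("prep", lambda d: [x for x in d if x is not None])       # drop bad rows
    .stage("analysis", lambda d: {"n": len(d), "data": d})          # basic stats
    .stage("train", lambda s: {"model_mean": sum(s["data"]) / s["n"], **s})
    .stage("evaluate", lambda s: {"ok": s["n"] > 0, **s})           # sanity check
)

result = pipeline.run([1, 2, None, 3])
print(result["model_mean"], pipeline.run_log)
# → 2.0 ['prep', 'analysis', 'train', 'evaluate']
```

The point of the chained-stage design is that swapping one stage (say, a different training step) leaves the rest of the pipeline, and its audit log, untouched.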
Getting Database Management Right
One of the most crucial applications for machine learning is database management. A team of researchers at Purdue University recently unveiled a cloud-based solution called OptimusCloud that uses ML to tune database and cloud configurations, improving application and service performance.
The system utilizes NoSQL platforms like Apache Cassandra and Redis running on AWS, Google Cloud and Azure, with plans for additional clouds in the future. Its strength is the ability to function with long-running, dynamic workloads for everything from advanced scientific research to manufacturing optimization to autonomous vehicle data processing.
By adding an ML component, the team says, the system helps users strike a better balance between workload demands and resource allocation. That benefits users through cost containment, while giving cloud providers the opportunity to accommodate more clients simultaneously and thus derive greater revenue from existing infrastructure.
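The cost-versus-performance balancing act at the heart of this approach reduces to a simple selection problem, sketched below. The configurations and numbers are invented for illustration; in a system like OptimusCloud, an ML model would supply the throughput predictions that are hard-coded here.

```python
# Illustrative only: pick the cheapest cloud configuration whose predicted
# throughput still covers a workload's demand. Config names, prices, and
# ops/sec figures are made up for this sketch.

CONFIGS = [
    # (name, hourly cost in $, predicted ops/sec for this workload)
    ("small",  0.10,  5_000),
    ("medium", 0.40, 18_000),
    ("large",  1.60, 60_000),
]

def cheapest_meeting_demand(required_ops):
    """Return the lowest-cost config that meets the demand, or None."""
    viable = [c for c in CONFIGS if c[2] >= required_ops]
    return min(viable, key=lambda c: c[1]) if viable else None

print(cheapest_meeting_demand(15_000))
# → ('medium', 0.4, 18000)
```

With long-running, dynamic workloads, this selection would be re-run as predicted demand shifts, which is where continuous ML-driven tuning earns its keep over a one-time sizing decision.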
One area that is often overlooked when deploying ML solutions, however, is storage. Machine learning tends to generate a lot of data itself, so organizations can quickly find themselves overwhelmed if they are not properly prepared.
Cloudian’s Gary Ogasawara recommends object storage as the go-to solution for ML due to its virtually limitless scalability, cost efficiency and support for fully customizable metadata. The easiest way to acquire large volumes of object storage, of course, is in the cloud, which also provides key features such as robust durability, locality management and integration.
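To see why customizable metadata matters for ML workloads, consider the sketch below: an in-memory stand-in for an object store, where each object carries arbitrary key-value metadata that can be queried to assemble training sets. The `put_object` and `find_by_metadata` helpers are hypothetical, though S3-compatible object stores do support attaching user-defined metadata to each object.

```python
# In-memory stand-in for an object store, illustrating how per-object
# metadata lets you slice a dataset (e.g., by label and train/test split)
# without touching the object contents.

store = {}

def put_object(key, data, metadata=None):
    """Store an object along with arbitrary key-value metadata."""
    store[key] = {"data": data, "metadata": metadata or {}}

def find_by_metadata(**criteria):
    """Return keys whose metadata matches every given key-value pair."""
    return [
        key for key, obj in store.items()
        if all(obj["metadata"].get(f) == v for f, v in criteria.items())
    ]

put_object("img/001.jpg", b"...", {"label": "cat", "split": "train"})
put_object("img/002.jpg", b"...", {"label": "dog", "split": "train"})
put_object("img/003.jpg", b"...", {"label": "cat", "split": "test"})

print(find_by_metadata(label="cat", split="train"))
# → ['img/001.jpg']
```

In a real object store, the same idea lets an ML pipeline select its training split by metadata query rather than by maintaining separate copies of the data.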
It should be clear by now that few organizations will be able to support in-house infrastructure at the scale needed to compete in today’s data-driven economy, and this is before we even contemplate the workloads about to be unleashed by 5G and the Internet of Things (IoT). But capacity alone is only part of the equation. Equally important is the need to process, analyze and manage all that data with the speed and flexibility demanded of today’s digital consumer.
As the pace of business continues to accelerate, data services will have to become more flexible and agile in order to build and maintain user satisfaction and loyalty. But this cannot happen under a traditional manual management regime. The only solution is to build greater autonomy into the data ecosystem, so that problems can be resolved more quickly and opportunities can be capitalized on before they are lost.
Machines can handle this load, but they must acquire the means to learn how to do so. By leveraging both the cloud and machine learning, organizations will find that they can have their scale and manage it too.