If there’s one thing we’ve learned in recent years about the growth of cloud deployments, it’s that things can get really complicated really fast. There’s public, private and hybrid cloud and the blurry definitions between each. There’s an ever-growing roster of cloud platforms and cost structures. Compliance only gets more complicated... If that sounds like more than a person could ever keep track of, you’re probably right. After all, we’re only human.
When we spoke to Turbonomic’s CEO, Ben Nye, last year, we took a deep dive into autonomic computing, and how it’s being used to solve the problem of increasingly complex, data-driven environments that are beyond a person’s ability to efficiently manage. It is a new paradigm for system admins who have long adhered to the break/fix model of application management. Turning all that control over to software is a new approach. But from a practical perspective, allocating and provisioning cloud resources based on the demand on workloads in real time is becoming a powerful force in a crowded cloud market catering to increasingly complex data centers.
Techopedia's Cory Janssen sat down with Ben again to talk about how the cloud landscape has changed over the past year, where it might be going and how companies are shifting the way they manage cloud resources.
Cory: It has been a little over a year since we last talked. What have been some of the biggest changes in the cloud landscape over the last year?
Ben: The dynamism of this marketplace continues unabated. The pace of change that we talked about in the last interview – with traditional gateway hardware vendors giving way to software in the data center and cloud – has accelerated. And, the rivalry between cloud vendors (mainly AWS and Azure) is picking up pace, while also creating new alliances (Google and Cisco, VMware and AWS).
So, with this backdrop, what do CIOs care about? Many are implementing a cloud-first strategy which is requiring them to figure out which workloads should go to the public cloud, and which ones should stay private.
A hybrid and multi-cloud future is speeding toward all of us at a much faster clip than expected. This pace of change is forcing a new approach to managing and optimizing IT.
Cory: In the enterprise space, the whole concept of the cloud seems to be shifting toward hybrid. Is the old idea of cloud dead? Is hybrid the new cloud?
Ben: Without a doubt, it will be a hybrid cloud future. There has been an incredible pace of change in the adoption of hybrid cloud as the de facto model; the public cloud is growing incredibly well, but it doesn’t mean that the private cloud is shrinking. If you combine various sources forecasting this trend (like the Cisco Cloud Index and Morgan Stanley CIO Survey), you’ll see roughly a 3 to 5 percent growth rate on the private cloud and a 60 percent growth rate on the public cloud.
There is also more adoption of complex enterprise applications in the public cloud: not only native or new apps, but also more production-oriented apps moving into their public cloud equivalent environments.
This reality is a forcing function to closely consider how to manage these changes in the most cost-efficient, performant, and compliant way.
Cory: There’s so much buzz about machine learning right now. You guys were working on autonomic features in your software a few years ago. Do you think you were ahead of the curve there in terms of talking about getting cloud management out of human control?
Ben: Fortunately, yes. A lot of people thought big data was the way to manage performance and, in the absence of that, they used older provisioning and manual intervention techniques – basically, people responding to machine-generated alerts. We believe what was missing was the ability to understand the demand so that the application workloads could autonomically, based on advanced real-time analytics, make intelligent decisions themselves about where to run, when to start or stop, when to size up or down. The answer was a self-managing system, which is far more efficient than overprovisioning and people chasing machine-generated monitoring alerts. It’s also more efficient and more timely than a traditional big data exercise whereby people aggregate enormous amounts of data without understanding exactly what they’re trying to collect. Then they have to move that data to a common repository or data warehouse. They then have to structure that data, correlate that data, all with the goal of finding an inference.
We're not big believers in big data. Our intelligence is a different type of AI for performance management. With big data it’s expensive to collect all that data, and very easy to clog the very systems you’re trying to manage by virtue of moving that data. By the time you move it, structure it, correlate it, and find an inference, you are no longer real time. Finally, that inference, when you derive it, you have to give it back to people again. That’s what makes machine learning so valuable for finding insight in large data sets; it’s not quite as valuable for delivering performance management in IT systems.
Cory: According to the Morgan Stanley CIO study, half of all workloads will run in the public cloud by 2020. What risks do organizations face when making that shift?
Ben: Virtually all workloads in the on-premises world are over-provisioned and under-utilized, which is the result of well-intentioned guesstimates from IT. This has been true for more than two decades, and it is the foundation that organizations are working with as they consider migrating to the cloud. The on-premises world is predominantly a fixed-cost environment where there is ownership of capacity – so there’s little penalty to pay for over-provisioning.
As organizations adopt hybrid cloud, they are moving their over-provisioned workloads into the cloud – a variable-cost world. If you’re over-provisioned, you’re paying for that by the second or minute, depending upon your public cloud provider. Staying compliant also becomes a great risk in this new model.
Cory: On paper, in theory, just moving to variable-cost makes sense, but when you put it that way, it’s so simple. I mean, you’re asking the architects and the IT side to also be finance guys.
Ben: Exactly. It’s estimated that public cloud bills are more than two times what is expected. Why is that? Because when you’re migrating a workload to the public cloud, you’re taking it based on an allocation template. You’re not sizing it up and sizing it down. The likelihood of over-provisioning is high, and therefore your expense levels will be high. It’s critical to understand the true consumption of a workload and then size it appropriately (up or down): this is one of the benefits of Turbonomic.
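To make the sizing arithmetic concrete, here is a minimal sketch of how a template allocation compares with an allocation sized to observed consumption. The function names, the 20 percent headroom, the 5-cent vCPU-hour price and the 730-hour month are all hypothetical values chosen for illustration; this is not Turbonomic's method or any provider's actual rate.

```python
import math


def right_size(peak_used_vcpus: int, headroom_pct: int = 20) -> int:
    """Smallest whole vCPU count that covers the observed peak plus headroom.

    The headroom percentage is an assumed safety margin, not a recommendation.
    """
    return max(1, math.ceil(peak_used_vcpus * (100 + headroom_pct) / 100))


def monthly_cost_cents(vcpus: int, cents_per_vcpu_hour: int, hours: int = 730) -> int:
    """Cost in whole cents for running `vcpus` for `hours` at a flat rate."""
    return vcpus * cents_per_vcpu_hour * hours


# Hypothetical workload: migrated on an 8-vCPU template, but peaking at 3 vCPUs.
allocated = 8
needed = right_size(peak_used_vcpus=3)

template_cost = monthly_cost_cents(allocated, cents_per_vcpu_hour=5)
right_sized_cost = monthly_cost_cents(needed, cents_per_vcpu_hour=5)

print(f"template allocation: {allocated} vCPUs, ${template_cost / 100:.2f} per month")
print(f"right-sized:         {needed} vCPUs, ${right_sized_cost / 100:.2f} per month")
```

With these made-up numbers the template allocation costs twice as much as the right-sized one, which is the kind of gap that stays hidden in a fixed-cost environment and shows up directly on a variable-cost bill.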
Ben: So, one of your earlier questions was about the changes happening in the cloud landscape. As an example, Amazon now has per-second pricing for compute and storage. Think about how dynamic the market is that they can come down to, literally, a per-second offering. Pretty wild, considering that it was a little less than a year ago that Google came out with per-minute pricing, because Amazon had been per hour.
We now can do compute, memory, network and storage in Amazon using their pricing flexibilities literally down to the second.
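The effect of billing granularity can be sketched with a few lines of arithmetic. The function below is a hypothetical illustration that simply rounds a runtime up to the next billing increment; real providers may add minimum charges and other terms that are not modeled here.

```python
def billed_seconds(runtime_seconds: int, increment_seconds: int) -> int:
    """Round a runtime up to the next whole billing increment.

    `increment_seconds` is 3600 for per-hour, 60 for per-minute and 1 for
    per-second billing. Minimum charges are deliberately ignored.
    """
    if runtime_seconds <= 0:
        return 0
    increments = -(-runtime_seconds // increment_seconds)  # ceiling division
    return increments * increment_seconds


# Hypothetical job that runs for 10 minutes and 5 seconds.
runtime = 605

for label, increment in (("per hour", 3600), ("per minute", 60), ("per second", 1)):
    print(f"{label:>10}: billed for {billed_seconds(runtime, increment)} seconds")
```

For this example job, hourly billing charges for 3,600 seconds, per-minute billing for 660 and per-second billing for exactly 605, so finer increments reward workloads that are started and stopped on demand.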
Cory: I’m sure that when you talk about those big databases, all those big relational databases, that’s one of the most expensive instances with AWS, right? So, you’re going right to the meat of it.
Ben: There are several important issues you hit on there. If you look at Amazon, for example, they’ve actually taken your question about the database to another level. Database as a service is one of the fastest growing platform-as-a-service offerings they have. And, both AWS and Microsoft have built quite a large number of platform-as-a-service offerings. Some are around big data and machine learning. Whether you’re using their database or your database, storage costs are quite large, and total costs can be quite large, and the variability – or the opportunity to improve on those – is significant. That’s what we’re doing: Customers can nearly double their ROI when running our new Turbonomic storage capabilities for public cloud, as well as the compute and memory and network capabilities we offered before.
If you look at Microsoft, they made a number of major announcements at their recent Ignite event. They now have availability zones and reserved instance offers, like AWS. That’s important because it shows what customers are asking for. But it also shows that, as with these things, there’s complexity, and complexity can quickly overwhelm people.
Cory: Can you talk a little bit about how Turbonomic has been able to marry together the different cloud platforms? We’ve been kind of dancing around that quite a bit in terms of their various features on AWS and Azure. It almost sounds like it’s a situation where, over the last couple of years, there was a choice where you’re one or the other, but more and more companies are able to marry those together now.
Ben: Historically, when a new platform was introduced, new tools were introduced to aggregate data and give it to a person to manage or fix. The limiting factor is the human skillset. This complexity is forcing a new way of managing IT. You’re hearing a lot more these days about AI, self-driving databases, data centers, etc. We believe that the answer for managing complexity in a hybrid environment is to create a self-managing environment via a control system capable of bridging both existing gaps. We give people a bionic ability of sorts to harness the complexity of their environment with software that eliminates the guesswork and limitations that previously existed in order to ensure workloads run performantly, compliantly and cost-effectively, regardless of whether they are in a private or public cloud.
Cory: You might as well throw in Google as they ramp up their offerings over the next couple years. It’s all about cherry picking the best services on each platform.
Ben: Yes. We’re excited to support Google environments in a future software release. To your point, there’s a bunch of decisions around where to place a workload, and how and when to size a workload, and when to start and stop a workload. Remember: A workload might be a VM or container, it might be a VDI – so the flexibility inherent in making those choices across a larger set of alternatives or options is enormously valuable to customers seeking to run at the lowest cost, with the best performance and assured compliance. At this scale, software can do this far more efficiently, versus relying on people responding to machine-generated alerts when the applications have broken or violated a threshold.
And, consider the new breed of regulations continually being introduced. There’s the General Data Protection Regulation (GDPR), which affects what data you hold and where that data resides, requiring data sovereignty. Then there’s affinity and anti-affinity around which data can sit with other data sets. And then there’s business continuity and high availability requirements on top of this! In the public cloud, if you want five nines, you need to be in at least four availability zones. You’ve got to think about disaster recovery, multiple business rules. The reality is this: If you don’t inspect those business rules every time you size, start, move, place or clone a workload, then you don’t know that you’re in continuous compliance. You’re either compliant – or you’re not. It’s a binary issue.
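The idea of inspecting business rules on every action can be sketched as a set of checks run against each proposed placement. Everything in this example is hypothetical: the `Placement` type, the rule names, the region labels and the workload names are invented for illustration and do not describe how Turbonomic or any cloud provider implements compliance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Placement:
    """A proposed location for a workload (hypothetical model)."""
    workload: str
    region: str
    co_located: FrozenSet[str]  # workloads already sharing the target host


Rule = Callable[[Placement], bool]  # returns True when the placement complies


def data_sovereignty(allowed_regions: FrozenSet[str]) -> Rule:
    """The workload may only run in the listed regions."""
    return lambda p: p.region in allowed_regions


def anti_affinity(workload: str, must_avoid: FrozenSet[str]) -> Rule:
    """The named workload must not share a host with any workload in `must_avoid`."""
    return lambda p: p.workload != workload or p.co_located.isdisjoint(must_avoid)


def violated_rules(placement: Placement, rules: Dict[str, Rule]) -> List[str]:
    """Names of every rule the proposed placement breaks; empty means compliant."""
    return [name for name, rule in rules.items() if not rule(placement)]


RULES: Dict[str, Rule] = {
    "data_sovereignty": data_sovereignty(frozenset({"eu-west", "eu-central"})),
    "anti_affinity": anti_affinity("payments-db", frozenset({"public-web"})),
}

ok = Placement("payments-db", "eu-west", frozenset({"audit-log"}))
bad = Placement("payments-db", "us-east", frozenset({"public-web"}))

print(violated_rules(ok, RULES))   # []
print(violated_rules(bad, RULES))  # ['data_sovereignty', 'anti_affinity']
```

Running the same check before every size, start, move, place or clone action is what makes the result binary: an empty list means the action keeps the environment compliant, and anything else means it does not.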
Cory: It’s almost become so complicated that the business rule almost makes it impossible for a human to handle.
Ben: Exactly, and that’s the problem, especially when we’re running at a scale that is 80 to 90 percent virtualized in the enterprise. We’re running at a scale that has to mature beyond manual intervention, where people respond to machine alerts after the applications have been allowed to break. Oh, and by the way, I’ve got to be able to learn these new skills in order to do the same thing on better terms in the public cloud. It’s just way too much.
Cory: You know what? As you’re talking to me about this, it’s amazing to me how the underlying issue is not whether you’re talking about migration or whether you’re talking about compliance issues. There’s so much overlap there, and even as you’re going through compliance, a lot of those issues really overlap. The core issue is that there’s only going to be more complexity in the next few years. If you’re not on the right path right now, you’re dead in the water, because if you can’t handle things now, then how are you going to handle them in the year 2020?
Ben: Totally agree. And then, by the way, just to make your point, it gets even more complex, because now we have to think about not just where a workload runs, but what is a workload? So, you may actually be in a world of optimizing a VM today, but it might be containers and microservices with a cloud OS tomorrow. Well, OK, that’s fine, but then how are you going to find a Kubernetes person, let’s say, in Kansas, or a Docker person in Delaware? So, there’s constant evolution in the way people are addressing these things.
It gets a little scary, but if I can use software to help solve that problem then, wow, it becomes invigorating instead, right? Because, we take people up the value chain, and we have software do the lower-value, mundane things.
Cory: Right. Then you can have your high-level resources actually take a step back and think, which is what they should be doing, instead of managing alerts.
Ben: Exactly! People went into technology because they were interested in evolving the technology landscape and, frankly, creating cool things. Those were the great reasons to go into technology, right? It was not to be beholden to an alerting regime. So, this is a new set of skills that’s able to come from it. I mean, how is anybody ever going to resource every container in real time? No one’s even answered that problem yet. And the answer is that it will be done through software.