The Demand-Driven Data Center - What System Administrators Can Learn From Wall Street
The economics of supply and demand has a strong effect on the technology industry - and on data centers in particular.
Our economy is a complex and constantly evolving system. It affects nearly every aspect of our lives, from the careers we choose, to the products we buy, to the homes we live in. Just like the economy, data centers are complex and constantly evolving systems. And while the average person rarely thinks about what the cloud actually is, they too rely on back-end data centers nearly every day.
I can’t help but think that we need to take a step back and look at how we think about data centers – and maybe, just maybe, make a sysadmin’s life a little easier.
So how can we do that? Well, what if we use the economy as an analogy for a virtualized environment? After all, these two systems aren't that far apart; the people on Wall Street took to technology years ago – they were the early adopters from the business community. And if you've done any investing lately, you know that technology has become an integral part of how our financial system operates. What if we apply a little of the same logic to how data centers are managed?
A Bit of Background
I’m a weird mix of tech and finance. My past company was a financial education site by the name of Investopedia. Back when we started out, there was no other tech staff. I was the guy building the Web server, optimizing the configuration, and responding to the alert when we lost a hard drive at 2 a.m. (Why is it that hard drives only seem to crash in the middle of the night?)
Those days already seem so far away. We don’t build physical servers anymore. This site, Techopedia, runs on Amazon AWS. So while we still build virtual servers, we no longer need to replace failed hard drives. But while technology has changed drastically over the past decade, our processes haven’t. Virtualization has provided amazing tools and capabilities compared to what was available 10, or even five years ago. The ability to generate a virtual machine at the click of a button is pretty crazy for us old folks who recall the old way of doing things.
The Principles of Supply & Demand
One of the ways a market economy regulates prices is through the laws of supply and demand. Simply put, supply refers to how much of a product is available, while demand describes consumer interest in a product. If you go to business school and take an economics class, you’ll see lots of charts like this:
All along the supply line, companies are willing to provide a product. All along the demand line, consumers are willing to buy that product. The "P" and the "Q" are price and quantity. So, at a lower price, people buy more. That isn't so hard to understand, right? On Black Friday, prices are lower for that flat-screen TV, so more people buy one.
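If you like to see ideas in code rather than charts, the crossing of those two lines can be sketched in a few lines of Python. The curves and numbers here are invented purely for illustration - the only point is that the market clears at the price where supply meets demand.

```python
# Toy linear supply/demand model (numbers are made up for illustration).
# Demand falls as price rises; supply grows as price rises.

def demand(price):
    """Quantity consumers are willing to buy at a given price."""
    return max(0, 100 - 2 * price)

def supply(price):
    """Quantity producers are willing to offer at a given price."""
    return max(0, 3 * price - 20)

# The market clears where the two curves cross:
#   100 - 2p = 3p - 20  ->  p = 24, q = 52
equilibrium_price = next(p for p in range(0, 101) if supply(p) >= demand(p))
print(equilibrium_price, demand(equilibrium_price))  # -> 24 52
```

Drop the price below 24 in this toy model and buyers want more than sellers will provide; raise it above 24 and sellers are stuck with surplus. That is the whole chart in one equation.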
There is a lot more to it, of course. You can get a PhD in economics. But rather than blabber on, I think the best explanation came from Father Guido Sarducci. He was on SNL in the early days, and this part of his Five Minute University explains economics as succinctly as anybody.
But that basic concept is enough to get us started. Now, let's take a look at how we can relate this to the tech world.
Supply, Demand and Tech
The effect of supply and demand reaches further than the price of a given product. It can also change the direction that technology takes. Consider Google Glass. When the device was first introduced, there was quite a bit of media buzz surrounding it. However, when the practicality of the glasses was called into question, demand plummeted. After Glass was released, the product failed to gain traction, and within a year Google canceled production. No demand for the product led the company to terminate supply. Simple.
If you're already thinking of examples where technology seems immune to the laws of supply and demand, we're way ahead of you. After all, how do you explain those long lines outside of Apple stores before a release? Don't they suggest that price doesn't matter? Well, it turns out that price does matter, at least on a worldwide scale. The iPhone you buy in the U.S. is priced very differently around the world. For example, the launch price of the iPhone 6s and 6s Plus in October of 2015 was roughly 50% higher in India. It seems that Apple's strategy in that market is to appeal to the high-end luxury consumer. Compare this to China, where some experts believe that Apple will lose out to lower-cost competitors. For the Chinese market, the price of an iPhone may just be too high.
So, enough economics. The point here is that supply and demand works. Free markets have proved to be the best at setting prices. Now, let's move on to how this might apply to a data center.
Data Center Complexity and Tradeoffs
Anyone who works with a data center can testify to the intricacy of their inner workings. This complexity, however, can often require sysadmins to make certain tradeoffs to solve frustrating problems. It would be nice if we just had to deal with a simple supply/demand chart with a couple of variables. However, in a data center, you have to balance:
- CPU vs. memory
- Storage I/O, latency and growth
- Network throughput (both North-South and East-West) and network latency
- Cost and budgeting issues
- Application performance vs. infrastructure utilization
- Changing business goals
- Cooling vs. power (and all the environmental factors)
And this list is far from complete. I'm sure you could add another half-dozen variables, or mix and match the tradeoffs for your situation. The point is that in a data center, we must walk a fine line between increasing performance and getting the most out of the data center's capacity. The model isn't two- or three-dimensional; it is n-dimensional, with almost unlimited permutations.
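To make the multi-dimensional balancing act above concrete, here is a deliberately simplified sketch of scoring candidate hosts for a workload across several competing dimensions at once. The dimension names, weights and utilization figures are all hypothetical - real placement engines weigh far more factors, and dynamically - but it shows why no single metric can decide the question.

```python
# Hypothetical sketch: pick a host for a workload by combining several
# competing utilization dimensions into one score. All names, weights
# and numbers below are invented for illustration.

def placement_score(host, weights):
    """Lower is better: weighted sum of normalized utilization (0.0-1.0)."""
    return sum(weights[dim] * host[dim] for dim in weights)

# How much we care about each dimension (must reflect *your* tradeoffs).
weights = {"cpu": 0.4, "memory": 0.3, "storage_io": 0.2, "net_latency": 0.1}

# Current utilization of each candidate host.
hosts = {
    "host-a": {"cpu": 0.90, "memory": 0.40, "storage_io": 0.30, "net_latency": 0.20},
    "host-b": {"cpu": 0.50, "memory": 0.70, "storage_io": 0.60, "net_latency": 0.10},
}

best = min(hosts, key=lambda name: placement_score(hosts[name], weights))
print(best)
```

Notice that host-a "wins" on three of the four dimensions but loses overall because CPU is weighted heavily - and the moment the weights or the utilization numbers shift, so does the answer. Now multiply this by every workload, every hour of the day.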
As each area of this n-dimensional environment fluctuates, all of the relationships between entities shift, and all of the tradeoffs fluctuate along with it. Fixing a CPU issue may move an application away from its data. Fixing a memory issue may reveal that it was really a storage issue. The tradeoffs scale, but the people who have to deal with them can't scale along with them.
The Never-Ending Tradeoffs
When I think of tradeoffs, I think of house shopping. You start out with the perfect wish list of where you want to be - that nice bungalow with a big backyard that has a three-car garage and enough room for a man-cave. Your spouse wants a nice yard too, but won’t live in a certain part of town, and simply won’t settle for a small bathroom or give up the current walk-in closet.
In the end, you both settle for a two-story with a tiny garage, basically no backyard, and a bathroom that needs remodeling - because you couldn't pass up the price in a neighborhood with the top school system in town.
Now that is a tradeoff.
This is the sort of tradeoff that has to be made in data centers every day. Sure, you can do anything given an unlimited budget, but nobody has an unlimited budget. We also all have the same number of hours in a week. And, of course, the oldest tradeoff in all of IT is over-provisioning. You can't be down - you simply have to be up to however many nines of availability your company demands. Yet that underutilized infrastructure has a cost. The Uptime Institute, a research firm, claims decommissioning a single rack of servers can save $2,500 per year.
That number seems low to me. You may have even figured out the cost of underutilized infrastructure in your own environment. But that is just the cost variable. Throw in the other dozen variables. Now, throw in the fact that when each variable changes, it impacts every other variable. The number of potential end states gets into the millions or billions pretty fast.
It's just like picking a house - the "end state" of what you pick is impossible to model out, given that you have an n-dimensional problem.
My point with all this is that our break-and-fix method of managing data centers is broken. Even if you haven't heard the term "break and fix," you'll know inherently what it is if you've spent any time in IT: put something up, watch it break, change a config, go back to the log files, tweak again, watch it break again, and so on.
Rinse and repeat, rinse and repeat, day after day. This works on a small scale, but it falls apart given the complexity of today's data centers.
Back to Wall Street
We began this article by talking about what data centers can learn from Wall Street. So first, let's take a look at how Wall Street works, and how stock trading has evolved from being a manual system, to being one that's run by technology.
The New York Stock Exchange opened way back in 1792, but the trading of listed companies as we now know it began in 1934, when the NYSE registered as a national securities exchange with the Securities and Exchange Commission. In the beginning, stock exchanges operated on what's called an "open outcry" system. Traders and stock brokers would crowd onto the trading floor, and would yell and use hand signals to transfer information about buy and sell orders. This system used bids and competition to arrive at a price, much like many other types of auctions do. The system got the job done, but it was chaos.
By the 1980s and '90s, stock exchanges were moving toward using technology to replace some of what was happening on the exchange floor. The trading floor evolved as it became possible to submit orders electronically, from connected devices, based on what was happening on the floor.
But Wall Street didn't stop at just embracing technology. It has continued to evolve to the point where the big investment banks and hedge funds now hire quantitative traders (quants) to build algorithms that trade based on arbitrage and pricing inaccuracies. This high-frequency trading automates, in real time, what the person holding the paper used to do: a trade executes automatically whenever it is in the trader's best interest. Electronic trading has proved to be faster, cheaper, more transparent and more profitable, and Wall Street has, for the most part, decided to trust this technology to make billions in trades each day.
As a result, stock exchanges are no longer full of traders screaming in the trading "pit." The press roam the floor that traders used to inhabit, and people now joke that there are almost as many journalists on the floor as there are traders.
When technology was first introduced in stock exchanges, it was a big deal; now it just makes sense. After all, who do you think will get the better price in a trade: a floor trader on the New York Stock Exchange (NYSE), or one of the NYSE's data centers in New Jersey? The exchange used to run based on what happened between a bunch of old paper traders on the floor - hundreds of people standing in the trading pit, waving their hands and yelling over each other to get the best price for their clients. And it actually worked. Well, sort of.
In a way, we are still managing data centers like that old model of the NYSE. Yes, we have monitoring tools, but if a tool is based on gathering data about when something is broken, it is based on that old break-and-fix model. That might have worked when data centers were less complex, but how can a human keep up when you are talking about dozens of constraints across hundreds (or thousands) of hypervisors or containers?
A human can’t keep up. It's beyond human scale.
Actually, we shouldn't have to. Algorithmic trading looks at dozens of variables in real time and makes decisions based on - you guessed it - supply and demand. The traders using these algorithms combine computer science with an understanding of supply and demand.
The question is, why can’t we use this same concept in a data center?
A demand-based system lets your engineers get back to engineering solutions for your business, instead of fighting fires and chasing alerts. I think the solution is more akin to autopilot on an airplane. Imagine if autopilot experimented by breaking things until it came up with the right solution - not a great flight if you're on that plane. Instead, autopilot takes all of the variables into account and maintains an ideal state in real time. Sounds pretty good, right? I highly recommend this paper on the concept.
The parallel in a data center is that this is a move from simply monitoring to actually controlling your infrastructure. An airline pilot can’t monitor everything – that’s what the system does. By giving up some control to technology, the system can actually become that much more powerful.
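The move from monitoring to controlling can be sketched in code. The loop below acts directly instead of paging a human: when a host's CPU "price" gets too high, it moves demand (a workload) toward the cheapest spare supply. Every name, threshold and cost figure here is hypothetical - this is the shape of the idea, not a real placement engine.

```python
# Illustrative control-loop sketch: instead of alerting a human when a
# threshold is breached (break and fix), act directly by moving demand
# toward under-used supply. All names and numbers are hypothetical.

def rebalance(hosts, workloads, high=0.80):
    """Move one workload off any host whose CPU utilization is too high."""
    actions = []
    for name, util in sorted(hosts.items(), key=lambda kv: -kv[1]):
        if util > high and workloads[name]:
            # Find the cheapest "supplier" of spare capacity.
            target = min(hosts, key=hosts.get)
            wl = workloads[name].pop()
            workloads[target].append(wl)
            hosts[name] -= 0.15    # assumed cost of the moved workload
            hosts[target] += 0.15
            actions.append((wl, name, target))
    return actions

hosts = {"h1": 0.92, "h2": 0.35, "h3": 0.60}
workloads = {"h1": ["web-1", "web-2"], "h2": ["db-1"], "h3": ["cache-1"]}
actions = rebalance(hosts, workloads)
print(actions)  # -> [('web-2', 'h1', 'h2')]
```

Run continuously, a loop like this is the data center analog of an algorithmic trader: it watches supply and demand and makes the move itself, in real time, rather than waiting for a 2 a.m. page.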
So what did we cover?
- Supply and demand is the most basic concept in economics. It works.
- We can learn from economists by applying supply/demand to our data centers.
- The issue is that data centers are too complicated and there are too many tradeoffs for a human to manage effectively.
- This leads to the break/fix model we still see in many data centers today.
- We get the best of both worlds when we combine an understanding of demand-driven management with a tool that is proactive in understanding the tradeoffs.
If you're a sysadmin, we hope this high-level view gives you some food for thought. In some ways, this is a simple concept, but it cuts to the core of what sysadmins do day to day. Most of all, it suggests that so much more is possible.