HAL: “Sorry to interrupt the festivities, Dave, but I think we've got a problem.”

BOWMAN: “What is it, HAL?”

HAL: “My F.P.C. shows an impending failure of the antenna orientation unit.”

What if you could accurately predict when a given component of your IT infrastructure will fail? That would help, wouldn’t it? But even the artificial intelligence computer in Stanley Kubrick’s film “2001: A Space Odyssey” could only provide an estimated time of failure:

HAL: “The unit is still operational, Dave, but it will fail within seventy-two hours.”

Information technology management has come of age – though still far short of the predictive ability of a HAL 9000. For decades now, network and system managers, engineers and technicians have demonstrated continuous improvement in their handling of technical issues. Through proactive and reactive efforts, IT teams have brought a wide variety of technologies under control. But a new wave of IT infrastructure management is now upon us, and a promising future lies ahead.

Operations & Maintenance (O&M)

Before we consider current trends in IT infrastructure management and the potential advancements in the field, let’s see what has brought us to this point. Recognized best practices in IT management have been codified in standard frameworks such as ITIL (Information Technology Infrastructure Library). Keeping networks and systems running and optimized is the primary focus of any IT operations department. Front-line data center and network operations center technicians work proactive and reactive tickets to quickly address failures according to service-level agreements. A panoply of methods is employed to manage configurations and improve performance.

To label this section “The Past” would be inaccurate. The IT practices discussed here remain in place to the present day and will continue beyond. We can, however, look at some of the key facets and approaches that have been part of IT infrastructure management since its inception. These may follow a sort of historical timeline, but the concepts will survive for years to come.

Break-fix is a manner of work that any tradesman could understand. If it’s broken, fix it. According to the Computer History Museum, the maintenance team on the ENIAC developed the skill to be able to replace one of its 18,000 vacuum tubes in just 15 minutes. For decades technical support personnel have honed their abilities to quickly identify and resolve routine issues in any component of the IT infrastructure, whether hardware or software. (For more on the ENIAC, see The Women of ENIAC: Programming Pioneers.)

Network operations centers (NOCs) and data centers have been critical to the management of the burgeoning digital environment over the years. Telecommunications carriers developed a “follow the Sun” system to monitor global networks so that personnel in Europe, for instance, could cover a shift while those in North America slept. Before the end of the dot-com bubble, web hosting and data center companies such as Exodus Communications and Global Center outfitted vast spaces within secure buildings with raised floors, dual diesel generator backup, mantraps, biometric access, cabinets and cages, and sophisticated fire suppression systems. Many of these companies went bankrupt, but colocation, hosting and other service providers use such data centers today.

IT management can be both proactive and reactive. Monitoring, or surveillance, has been among the hallmarks of IT infrastructure management. Network and systems management tools such as HP OpenView (now called HP Business Technology Optimization) were widely employed to provide an effective visual representation of IT architecture. Managed objects were linked by SNMP to screen icons that turned green or red depending on their current state. Technicians also responded to customer reports by telephone or email. As processes became more sophisticated, automated systems generated tickets into departmental queues whenever alarm or event thresholds were exceeded.
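The threshold-based ticketing described above can be sketched in a few lines of Python. This is only an illustration: the metric names, limits and queue names are hypothetical, and a real monitoring product would receive its events from SNMP traps or polling rather than from a hard-coded list.

```python
from dataclasses import dataclass


@dataclass
class Event:
    device: str
    metric: str
    value: float


# Hypothetical rules: metric name -> (limit, departmental queue).
THRESHOLDS = {
    "cpu_percent": (90.0, "systems"),
    "link_errors_per_min": (50.0, "network"),
}


def open_tickets(events):
    """Return {queue: [ticket summaries]} for events that exceed their limit."""
    queues = {}
    for ev in events:
        rule = THRESHOLDS.get(ev.metric)
        if rule is None:
            continue  # no rule defined for this metric, so no ticket
        limit, queue = rule
        if ev.value > limit:
            summary = f"{ev.device}: {ev.metric}={ev.value} exceeds {limit}"
            queues.setdefault(queue, []).append(summary)
    return queues


# Example events (invented for illustration).
events = [
    Event("web01", "cpu_percent", 97.0),
    Event("web02", "cpu_percent", 42.0),
    Event("core-sw1", "link_errors_per_min", 120.0),
]
tickets = open_tickets(events)
```

Here only web01 and core-sw1 cross their limits, so one ticket lands in the systems queue and one in the network queue, while web02 stays "green."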

Managed services are outsourced to companies that offer skills that may not otherwise be available in-house. These services may be offered in a 24/7 support model. As in NOCs and data centers, these providers may allocate their support personnel to tiers 1, 2 or 3, depending on each technician or engineer’s skill level. Vendor support is sometimes considered level 4.

Virtualization and Cloud Computing

Today’s technical environment looks a lot different from the early days. The footprint for computer equipment continues to shrink, and soon it will be almost invisible. The UNIVAC I was 25 feet wide and 50 feet long. Nowadays we hold computers in the palms of our hands. With virtualization, machines that once were buzzing and humming just feet away from us now may exist in an artificial environment either onsite or halfway around the world. First it was server operating systems, such as Linux or Windows, that were simulated on virtual machines. Now even devices like switches or routers – or practically any network appliance – can be virtualized in a state-of-the-art information environment. (See Network Virtualization: A New Framework.)

Centralization is back. In the old days, a mainframe computer handled all the processing and memory in a single physical machine. Remote terminals established connections to the mainframe and shared its resources. With the advent of PCs, computing became decentralized. Users had their own processors and hard drives, and would even eventually be able to sit by a lake with their laptops, free and independent. Mobile computing is an extension of this trend. Improvements in network connectivity have made centralization attractive again. Everything is moving to the cloud.

Categories of cloud computing include software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). The outsourcing of an enterprise’s infrastructure makes it possible to automate administrative tasks and virtualize platforms. R&G Technologies offers 3 Reasons Why You Should Consider Infrastructure as a Service (IaaS):

  1. Greater productivity
  2. Scalability
  3. Increased data security

Whatever the reasons, more and more companies are moving to cloud computing, and they are finding that they can significantly reduce or eliminate their equipment footprint through virtualization. But even with the trends toward these innovative technologies, many believe that significant inefficiencies remain in the field of IT infrastructure management.

In a blog for Wipro Limited, the general manager of Proactive and Automation, Ramkumar Balasubramanian, writes about “The Future of IT Infrastructure.” He notes that manpower and static tools account for 40 percent of infrastructure management. “CIOs are grappling with the inability of existing infrastructure to keep up with increasing demands from business and outdated IT processes that are incapable of exploiting new technologies,” he says. A new model is required.

Automation and Analytics

Wipro sees a different future. Balasubramanian adds, “Automation and analytics are the weapons in a CIO's arsenal that can enable the shift to the future state of IT infrastructure services transforming the data center into a strategic business asset.” In its white paper, Infrastructure Automation and Analytics, the company describes the evolution of IT infrastructure as a history of five phases:

  • Phase 1: Chaotic
  • Phase 2: Reactive
  • Phase 3: Proactive
  • Phase 4: Managed
  • Phase 5: Utility

They see the increasing demands of businesses, outdated IT processes, lack of standardization, and evolving strategies as drivers for the adoption of an improved model of IT management. The keys are “operational efficiency and a flexible infrastructure.”

Automation will be implemented both in the data center and with the end user. This will include self-healing and event correlation, auto ticketing, asset discovery and machine learning on the data center side as well as self-service, assisted services and application performance management at the user’s end. The idea is to eliminate the need for human activity and let the machines do the work.
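To make "event correlation" concrete, here is a minimal sketch in Python. It assumes a hypothetical one-level topology map (each device has at most one upstream device) and treats an alarm as a symptom when its upstream device also alarmed within a short time window. Commercial correlation engines are far more elaborate; this only shows the basic idea of collapsing many alarms into one root cause.

```python
# Hypothetical topology: device -> the upstream device it depends on.
UPSTREAM = {
    "web01": "core-sw1",
    "web02": "core-sw1",
    "db01": "core-sw2",
}


def correlate(alarms, window_seconds=60):
    """Group alarms under a probable root cause.

    alarms is a list of (timestamp_seconds, device) tuples.
    Returns {root_device: [suppressed symptom devices]}.
    """
    first_seen = {}
    for ts, dev in alarms:
        first_seen.setdefault(dev, ts)  # keep the earliest alarm per device

    roots = {}
    for dev, ts in first_seen.items():
        parent = UPSTREAM.get(dev)
        if parent in first_seen and abs(first_seen[parent] - ts) <= window_seconds:
            # The upstream device failed at about the same time: symptom only.
            roots.setdefault(parent, []).append(dev)
        else:
            roots.setdefault(dev, [])
    return roots


# Example alarms (invented for illustration).
alarms = [(0, "core-sw1"), (5, "web01"), (8, "web02"), (300, "db01")]
root_causes = correlate(alarms)
```

With these four alarms, the two web servers are folded under core-sw1, and db01 stands alone because its upstream switch never alarmed. An auto-ticketing step would then open two tickets instead of four.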

Predictive analytics have been successfully applied in many other industries. Why not IT? HP offers an IT operations analytics tool that it describes as “an operational intelligence solution that leverages machine data to help IT identify insights hidden in system silos to resolve root cause of failures faster and improve operational performance with predictive analytics.” Now it’s starting to sound more like a HAL 9000! Will it interrupt our festivities and inform us of an impending failure of our IT equipment any time soon? We’ll see.
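As a toy illustration of what a failure prediction might look like, the Python sketch below fits a straight line to a few readings of a degrading metric and estimates how many hours remain before it crosses a failure threshold. The metric, the readings and the threshold are all invented for this example, and real operations analytics tools use far richer models than a linear trend. Nothing here reflects how HP's product works.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours from the last sample until the trend reaches threshold.

    samples is a list of (hour, value) readings. Uses an ordinary
    least-squares line. Returns None if there is no upward trend.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all readings taken at the same time
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # metric is flat or improving
    intercept = mean_v - slope * mean_t
    crossing_hour = (threshold - intercept) / slope
    last_hour = samples[-1][0]
    return max(0.0, crossing_hour - last_hour)


# Hypothetical readings: an error count sampled once a day.
readings = [(0, 10.0), (24, 12.0), (48, 14.0)]
remaining = hours_until_threshold(readings, threshold=20.0)
```

For these made-up readings the estimate comes out at roughly 72 hours, which is about as much as HAL was willing to promise.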

Conclusion

Computers continue to improve, and networks continue to evolve. With this progress come increasing demands on our IT infrastructure components, often alongside diminishing financial resources to manage them. We have developed and cataloged proven methods for fixing IT issues as they arise. And we are gradually reducing device footprints and the burden of so many separate computer boxes. If the futurists are right, eventually we will create machines that can heal themselves, operate independently, and benefit from predictive analytics. Stanley Kubrick may have been onto something.