DevOps and site reliability engineering (SRE) are two of the most discussed topics in IT, and the two disciplines can be difficult to tell apart. The purpose of a DevOps initiative is to combine the development and operations processes and remove the friction between them. The purpose of SRE is to achieve reliability by applying best practices from engineering and operations. In short, SRE provides concrete solutions for succeeding in different DevOps scenarios. The two are therefore not competing with each other; each contributes its own solutions toward the common goals of software development. (To learn more about what is involved in DevOps, see DevOps Managers Explain What They Do.)

Confusion Between the Two

DevOps and site reliability engineering are two widely used approaches to developing and running software. People often confuse the two terms, in part because the disciplines overlap to quite an extent. To tell them apart, and to see where they are similar, we need to look at the finer details.

Why SRE?

In the early 2000s, Google changed the way it managed production. The R&D team was responsible for creating new features and pushing them to production, while the operations team was focused on keeping production stable. The problem was that the two teams were pulling in opposite directions.

In trying to bridge this gap, a solution was identified. Rather than have an operations team solely working as administrators, software engineers (with an R&D background) could help both teams work together. This is when the position of site reliability engineer was created.

The job of site reliability engineers is to keep the production environment stable while also supporting the development of new features. SRE teams are generally composed of both systems engineers and software developers. These engineers solve operational problems using software, and they must integrate easily with the development team. The idea is to improve the quality of code and of automated testing. Several prominent organizations quickly recognized the value of SRE and began to embrace the discipline, including Netflix, Dropbox and GitHub.

Why DevOps?

The DevOps movement came about because developers were often writing code without a complete understanding of how it was meant to run in production. It is a fairly recent movement that helps organizations work in an agile manner. Combining the knowledge and effort of the dev team and the ops team is meant to produce a more agile, reliable and robust product.

The idea behind a DevOps team is to bring automation to agile software development, so that the dev team can focus on delivering new software while still meeting the functional and compliance requirements of operations. (DevOps doesn’t always work out so well. Learn more in When DevOps Goes Bad.)

What Are the Differences?

On the whole, both SRE and DevOps are used to manage an organization’s production operations, but they differ in important ways. Primarily, a DevOps team finds problems and then dispatches them to the dev team for solutions, whereas site reliability engineers find problems and solve some of them themselves. A DevOps team tends to solve problems conservatively, so that the production environment remains untouched. SREs usually push for rapid changes and software updates while still maintaining a stable production environment.

DevOps is typically not a role. Rather, it is a practice carried out by a team as a whole. SRE, in contrast, is a role assigned to a professional who is responsible for creating and maintaining a highly available service.

The goal of DevOps is to empower developers to build and manage services using measurable metrics that help in prioritizing tasks. SRE focuses on monitoring applications and services after they have been deployed and on implementing automation that improves the health and availability of a system.
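
As a rough illustration of what a measurable availability metric can look like, the following Python sketch computes availability as the fraction of successful health checks and compares it with a target. The HealthCheck record, the function names and the target values are assumptions made for this example; they are not taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    """One probe of a deployed service (hypothetical record shape)."""
    timestamp: int  # seconds since an arbitrary start
    ok: bool        # True if the service responded successfully


def availability(checks: list[HealthCheck]) -> float:
    """Return the fraction of checks that succeeded, between 0.0 and 1.0."""
    if not checks:
        raise ValueError("need at least one health check")
    return sum(1 for c in checks if c.ok) / len(checks)


def meets_target(checks: list[HealthCheck], target: float) -> bool:
    """Compare measured availability against a team-chosen target (assumed value)."""
    return availability(checks) >= target


# Made-up data: 1000 probes, of which 3 failed.
checks = [HealthCheck(timestamp=i, ok=(i not in {10, 500, 900})) for i in range(1000)]
print(f"availability: {availability(checks):.3f}")
print("meets 99.5% target:", meets_target(checks, 0.995))
```

A number like this gives both groups something concrete to prioritize against: when availability falls below the agreed target, stability work may take precedence over new features.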

What Are the Similarities?

While SRE and DevOps sound different in their core philosophies, they actually share the same definitions of success. Some of these are:

  • Reducing the number of organizational silos
  • Fostering an environment where failure is accepted and expected
  • Making changes incrementally
  • Making use of automation
  • Monitoring of success

It should also be noted that SRE scales well for the continuous development of complex frameworks, while DevOps is suited to frequent releases of code, ideally for digitally distributed products. Both approaches, though, fit within the broader DevOps paradigm.

SRE also prioritizes continuous improvement over continuous development. With SRE, improving existing programs is the responsibility of the development team, and it is given as much importance as new releases.

Competing Methods?

DevOps and SRE should not be considered competing approaches, but closely related methods with overlapping areas. They are designed to function alongside each other as a means of overcoming organizational barriers and delivering better software faster.

DevOps and SRE are different operational approaches, and any organization can choose either. It would be incorrect to say that one is right or that it trumps the other. Rather, the choice depends on an organization’s philosophy and needs, and these must be kept in mind when deciding on an approach.