Which Write is Right? A Look at I/O Caching Methods
An application's speed is largely dependent on cache I/O speed. Here we compare different cache I/O methods.
Application performance is rooted in speed – speed in completing the read and write requests that your applications demand from your infrastructure. Storage is responsible for the speed of returning I/O (input/output) requests, and the method chosen to commit the writes and deliver the reads has a profound impact on application performance. A common method in today’s industry is to use SSDs for caching on traditional spinning disk storage, hybrid arrays or all-flash arrays. Most caching solutions have accelerated reads for applications, but the real question remains, “Which write is right?”
Let’s look at why write optimization affects application performance so drastically. A write I/O carries new data that does not yet exist on your underlying storage. In traditional SAN storage, for example, each write is committed directly to the underlying storage before it is acknowledged to the application. With applications that constantly write new data, primarily big database applications (SQL, etc.), traditional spinning disks can’t keep up. Caching on SSDs emerged as a solution: writes land locally, and data is cached according to how often the application asks for it. However, there are several ways the write cache can relate to the underlying storage, and the choice makes a huge difference in performance.
These are the three forms of I/O writing:
- Write-around
- Write-through
- Write-back
Each of the three forms has different benefits, depending primarily on the type of data being written: sequential or random. Sequential I/O (files or video streams, for example) is handled best by the underlying disk, while random I/O benefits most from the cache. Most caching appliances don’t have the dynamic intelligence to switch write modes based on the type of data. Let’s look at how the three forms differ.
Write-around, also known as read-only caching mode, exists purely to keep cache space free for reads. Incoming writes never hit the cache: they are written directly to permanent storage, and nothing is cached along the way.
What could possibly be the benefit of a cache that writes never use? Write-around keeps the cache from being flooded with write I/O that will not be re-read. The disadvantage is that a read request for recently written data results in a “cache miss”: the data has to be read from slower bulk storage, at higher latency. If your application is transactional, as most mission-critical applications are, it will slow down and I/O queues will grow. This mode is valuable only in rare use cases, because for most workloads it is slow.
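The miss-after-write behavior described above can be sketched in a few lines of Python. This is a toy model, not any vendor's implementation: the class name `WriteAroundCache` and the dictionaries standing in for the SSD cache and the bulk storage are assumptions made for illustration.

```python
class WriteAroundCache:
    """Toy write-around cache: writes bypass the cache, reads populate it."""

    def __init__(self, backing_store):
        self.backing = backing_store  # dict standing in for slow bulk storage
        self.cache = {}               # dict standing in for the SSD cache
        self.hits = 0
        self.misses = 0

    def write(self, key, value):
        # The write goes straight to permanent storage.
        self.backing[key] = value
        # Drop any stale cached copy so later reads do not return old data.
        self.cache.pop(key, None)

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Cache miss: fetch from slow storage, then keep a copy for next time.
        self.misses += 1
        value = self.backing[key]
        self.cache[key] = value
        return value


store = {}
wa = WriteAroundCache(store)
wa.write("row1", "v1")
first = wa.read("row1")   # miss: the write never populated the cache
second = wa.read("row1")  # hit: the first read populated the cache
```

The first read after the write is the expensive one, which is why a transactional workload that re-reads what it just wrote gets little help from this mode.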
Write-through is commonly used in caching and hybrid storage solutions today. It is known as a read caching mode: all data is written to both the cache and the underlying storage, and the write is ONLY considered complete once it has been written to your storage. Sounds pretty safe, actually, but there is a speed drawback.
Here’s the issue: every write operation is done twice, once in the cache and once in permanent storage. Before the application can proceed, the permanent storage must return the I/O commit to the cache, which then returns it to the application. This method is commonly implemented for failure resiliency, and to avoid having to implement a failover or HA strategy for the cache, because the data lives in both locations. However, Write-Through incurs latency because the I/O commit is determined by the speed of the permanent storage, which is no match for the speeds of CPU and networking. You’re only as fast as your slowest component, and Write-Through can critically hamstring application speed.
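A minimal sketch of that double write, assuming made-up latency figures, shows why the acknowledgement is dominated by the slow tier. The constants `CACHE_WRITE_US` and `STORAGE_WRITE_US` and the class `WriteThroughCache` are hypothetical and are not measurements of any real device.

```python
CACHE_WRITE_US = 100      # assumed SSD cache write latency, illustrative only
STORAGE_WRITE_US = 5000   # assumed spinning-disk write latency, illustrative only


class WriteThroughCache:
    """Toy write-through cache: a write is complete only once storage has it."""

    def __init__(self):
        self.cache = {}    # stands in for the SSD cache
        self.backing = {}  # stands in for permanent storage

    def write(self, key, value):
        """Store the value in both tiers; return simulated microseconds until ack."""
        self.cache[key] = value
        self.backing[key] = value
        # The application waits for both copies, so the slow tier dominates.
        return CACHE_WRITE_US + STORAGE_WRITE_US

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.backing[key]
        self.cache[key] = value
        return value


wt = WriteThroughCache()
ack_us = wt.write("row1", "v1")
cached_copy = wt.read("row1")  # served from the cache, no storage access needed
```

With these assumed numbers, almost all of the wait before the acknowledgement comes from the permanent storage, even though the data is immediately readable from the cache.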
Write-Back improves system outcomes in terms of speed – because the system doesn’t have to wait for writes to go to underlying storage.
When data comes in to be written, Write-Back puts the data into the cache, sends an “all done” message to the application, and holds the data to be written to the storage disk later.
This solves a lot of the latency problems, because the system doesn’t have to wait for those deep writes.
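As a rough illustration of that sequence, the sketch below acknowledges a write as soon as the cache holds it and destages to storage only when `flush()` is called. The class `WriteBackCache`, its `dirty` set, and the explicit flush are assumptions for the example; real systems destage in the background on their own schedule.

```python
class WriteBackCache:
    """Toy write-back cache: acknowledge from the cache, write to storage later."""

    def __init__(self):
        self.cache = {}     # stands in for the SSD cache
        self.backing = {}   # stands in for permanent storage
        self.dirty = set()  # keys cached but not yet written to storage

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)
        return "ack"        # the "all done" message, sent before storage is touched

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        return self.backing[key]

    def flush(self):
        """Write every dirty entry to storage; return how many were written."""
        flushed = 0
        for key in sorted(self.dirty):
            self.backing[key] = self.cache[key]
            flushed += 1
        self.dirty.clear()
        return flushed


wb = WriteBackCache()
status = wb.write("row1", "v1")
wb.write("row1", "v2")                   # the overwrite is absorbed by the cache
storage_before_flush = dict(wb.backing)  # still empty at this point
flushed = wb.flush()                     # only the latest value reaches storage
```

Two application writes turn into a single storage write here, and neither of them waited on the disk. The window before the flush is also when the data exists only in the cache.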
With the right support, Write-Back can be the best method for multi-stage caching. It helps when the cache has a large amount of memory (that is, memory measured in terabytes, not gigabytes) so it can handle large volumes of activity. Sophisticated systems will also need more than one solid state drive, which can add cost. It is also essential to plan for power failure and other situations in which data that has been acknowledged but not yet written to storage can be lost. But with the right “cache protection,” Write-Back can really speed up an architecture with few downsides. For example, Write-Back systems can make use of RAID or redundant designs to keep data safe.
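One way to picture a redundant design is a cache that keeps every unflushed write on two devices before acknowledging it. The sketch below is only an illustration of that idea under simplifying assumptions: `MirroredWriteBackCache` and `fail_primary()` are hypothetical names, and real RAID or mirroring schemes are considerably more involved.

```python
class MirroredWriteBackCache:
    """Toy write-back cache that keeps unflushed data on two cache devices."""

    def __init__(self):
        self.primary = {}  # first cache device (for example, one SSD)
        self.mirror = {}   # second cache device holding a copy
        self.backing = {}  # slow permanent storage

    def write(self, key, value):
        # Acknowledge only after both cache copies exist; storage is not touched yet.
        self.primary[key] = value
        self.mirror[key] = value
        return "ack"

    def fail_primary(self):
        # Simulate losing the first cache device before a flush.
        self.primary = {}

    def flush(self):
        # Destage from whichever copy survived, then empty both caches.
        source = self.primary or self.mirror
        self.backing.update(source)
        self.primary.clear()
        self.mirror.clear()


cache = MirroredWriteBackCache()
cache.write("row1", "v1")
cache.fail_primary()  # one cache device is lost with unflushed data on it
cache.flush()         # the mirror copy still reaches permanent storage
```

The extra copy is the cost mentioned above: a second drive, and a second cache write before every acknowledgement.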
Even more elaborate systems help the cache and the SAN or underlying storage disk work with each other on an “as-needed basis,” delegating writes to either the deep storage or the cache depending on the disk’s workload.
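A delegation policy of this kind might look something like the following sketch. It combines the disk's current workload with the sequential-versus-random distinction made earlier. The function names, the use of pending disk I/Os as the workload signal, and the `busy_threshold` default are all assumptions for illustration, not a description of any product's logic.

```python
def is_sequential(block_addresses):
    """True when every block address directly follows the previous one."""
    return all(b == a + 1 for a, b in zip(block_addresses, block_addresses[1:]))


def choose_write_target(block_addresses, pending_disk_ios, busy_threshold=32):
    """Pick 'storage' or 'cache' for a batch of writes.

    busy_threshold is a hypothetical tuning knob, not a recommended value.
    """
    disk_is_busy = pending_disk_ios >= busy_threshold
    if disk_is_busy:
        return "cache"    # let the cache absorb the burst
    if is_sequential(block_addresses):
        return "storage"  # sequential streams are handled well by the disk
    return "cache"        # random writes benefit most from the cache


idle_sequential = choose_write_target([10, 11, 12], pending_disk_ios=4)
busy_sequential = choose_write_target([10, 11, 12], pending_disk_ios=64)
idle_random = choose_write_target([7, 93, 12], pending_disk_ios=4)
```

Under these assumptions, only a sequential stream arriving while the disk is lightly loaded goes straight to deep storage. Everything else is absorbed by the cache.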
The design philosophy of Write-Back reflects the kind of problem-solving that today’s advanced data handling systems bring to big tasks. By using the cache in a more complex way, Write-Back takes the slow storage commit out of the application’s write path. Although it may require more overhead, it allows for better system growth and fewer growing pains.