What does tiered storage really do to performance?
- 28 April, 2011 00:52
This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.
Enterprises are challenged to keep pace with mounting unstructured file data. While network attached storage (NAS) represents the optimal choice for storing such data, scaling NAS economically while preserving application performance is like trying to plug thousands of leaks in a dam with your fingers.
This is worsened by the fact that much of today's enterprise storage is consumed by inactive data. In response, vendors have delivered storage tiers and media types optimized to each tier. But tiered storage can increase latency, and nothing can eat at performance faster or frustrate users more than adding latency.
IN DEPTH: Struggling with supersized storage
How much latency is acceptable varies greatly. In the world of visual effects and computer graphics, for example, latencies of 0.2ms are considered a competitive advantage, while tolerances of 1ms or less are ideal in the database domain. On the other hand, users accessing a general purpose application may tolerate much higher latencies.
Tiered storage falls short of achieving targeted latencies. In a recent run of a popular file-serving benchmark, a very high-end NAS device with integrated tiered storage (including SSD technology) achieved a latency of 0.4ms. That was the absolute best the solution under test could deliver, and as the benchmark added incremental client load, latency climbed to 0.8ms.
The issue lies not so much with the different media in a tiered storage device as with the underlying architecture. Inside any NAS is a server with processors, memory and I/O. What makes NAS "NAS" rather than a general purpose file server is the customized file system, plus the many applications layered on top of it to protect data and optimize its placement.
Tiered storage architectures often execute clustered file systems: joining some number of nodes, load balancing across them, moving data between NVRAM and disk, hosting RAID controllers, performing housekeeping on file-system metadata, and executing data protection applications. The point is that NAS controllers perform many tasks that consume system bandwidth, and their most sacred function is the orderly storage, retrieval and protection of user data, not performance.
The question becomes: how do enterprises get the best of both worlds, scalable capacity and scalable performance that preserves low latency?
What is needed is a performance layer that sits alongside the NAS the enterprise already has and isolates high-performing media such as flash. It is devoid of latency-consuming responsibilities such as executing a file system; its job is simply to cache hot and warm data and accelerate NFS processing so that clients get sought-after data quickly. It should pass newly written data on to the back-end NAS while adding minimal latency, say as low as 10 microseconds.
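The behavior described above (cache hot data, serve reads fast, pass new writes through to the back-end NAS) amounts to a read-through, write-through cache. The following sketch is purely illustrative; the class, its interface and the dict-like backend are invented for the example and do not describe any vendor's actual appliance:

```python
from collections import OrderedDict

class ReadThroughCache:
    """Illustrative write-through LRU cache fronting a slower backing store."""

    def __init__(self, backend, capacity=1024):
        self.backend = backend      # the slower NAS tier (hypothetical dict-like interface)
        self.capacity = capacity
        self.cache = OrderedDict()  # LRU order: least recently used first

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as recently used
            return self.cache[key]       # fast path: served from cache
        value = self.backend[key]        # slow path: fetch from the NAS
        self._insert(key, value)
        return value

    def write(self, key, value):
        self.backend[key] = value        # pass new data through to the NAS
        self._insert(key, value)         # keep the cache consistent

    def _insert(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
```

In this toy model the cache absorbs repeat reads of hot data, while every write still reaches the back-end store, mirroring the pass-through behavior described above.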
Such a system should be capable of achieving 0.2ms in overall latency, inclusive of the NAS it supports, and be able to hold that latency as client loads scale. This is achievable because the performance appliance is tuned to accelerate NFS traffic and nothing else.
With such a performance layer in place, traffic destined to the NAS can be greatly reduced, meaning the NAS can scale more efficiently. It takes fewer disk drives and less controller bandwidth to achieve performance. The appliance serves as a high-speed cache, storing the most active data, supporting multiple NAS devices, and providing the benefits of tiered storage without the overhead.
The low latency afforded by a performance layer can also help general storage administration even when raw performance isn't the immediate goal. For example, a performance read cache makes it possible to use less expensive storage media while maintaining targeted performance: you can substitute cheaper SATA drives for SAS or Fibre Channel and even substantially increase the capacity utilization on those drives.
This holds when the working set largely resides in the cache and when write traffic, as characterized by leading file-serving benchmarks, constitutes about 10% of overall operations. Because non-write operations are served at extremely low latency, slower, lower-IOPS disk drives can be added to the mix. For example, SATA drives delivering 45% fewer IOPS than SAS drives might add perhaps 1.8ms, for an overall latency in the 2ms range, which still might meet the performance objectives of the organization.
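The blended-latency arithmetic above can be made explicit with a simple weighted average: operations served by the cache see the cache's latency, while writes and misses see the disk tier's latency. The figures below are illustrative assumptions echoing the article's numbers, not measurements:

```python
def blended_latency(hit_rate, cache_ms, disk_ms):
    """Weighted average latency: a hit_rate fraction of operations is
    served by the cache; the rest fall through to the slower disk tier."""
    return hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms

# Assumed figures: ~0.2ms when served from the performance layer,
# ~2.0ms when a request falls through to slower SATA drives.
cache_ms = 0.2
sata_ms = 2.0

# With 90% of operations served from the cache and 10% (writes and
# misses) hitting disk, the blend stays well under 1ms overall.
print(round(blended_latency(0.90, cache_ms, sata_ms), 2))  # 0.38
```

The same formula shows why the trade works: even a markedly slower disk tier only degrades the blend in proportion to the small fraction of operations that actually reach it.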
Introducing a performance layer appliance alongside NAS lets enterprises reach better latencies and gain the advantages of tiered storage, without the complexity and overhead of actually installing tiered storage.
Administrators can grant these performance improvements directly to users or capture them to offset the use of lower-performing media for inactive data. A performance layer appliance should not only deliver significant performance gains but also save the company substantial capital, power, space and cooling costs. By reducing the NAS's dual burden of delivering both performance and capacity, along with the associated data protection applications, much greater cost efficiencies can be realized. Disk-drive savings alone can be significant.
Alacritech was founded in 1997 by network storage pioneer Larry Boucher. The company develops NFS Acceleration appliances that mitigate network attached storage (NAS) sprawl and improve the performance of NFS infrastructures. Alacritech can be found at www.alacritech.com.