Evaluating Disk Storage, Part 3: The RAID Controller
In parts 1 and 2 of this series on disk storage, we looked at drive types and the benefits of using multiple drives in a RAID array. This week, we turn our attention to the final piece of the disk storage puzzle: the RAID controller.
Beyond the drives and the array types and layouts, the RAID controller is the key to performance optimization (and cost optimization). Some steps improve general performance, while others improve performance under specific types of disk IO load.
Improving General Performance
One of the first general performance gains in RAID controller technology was the creation of a write-back feature as an alternative to write-through. Write-through was the cheaper process: the RAID controller got the write, put the data on the disk, and only then returned a successful write response to the operating system. Because the operating system had to wait for that response, performance suffered, so write-back was created.
With write-back the RAID controller was given built-in memory for caching, so it got the write, stored it in the memory cache (very speedy), and returned a success response to the operating system right away. It then wrote the data to disk as soon as it was able.
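The difference between the two modes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular controller is implemented: the Disk, WriteThroughController, and WriteBackController classes are hypothetical names invented for illustration, and a plain dictionary stands in for the controller's cache memory.

```python
class Disk:
    """Hypothetical stand-in for a slow physical drive."""

    def __init__(self):
        self.blocks = {}   # block address -> data actually on the platter
        self.writes = 0    # number of physical writes performed

    def write(self, lba, data):
        self.blocks[lba] = data
        self.writes += 1


class WriteThroughController:
    """Acknowledges a write only after the data is on the disk."""

    def __init__(self, disk):
        self.disk = disk

    def write(self, lba, data):
        self.disk.write(lba, data)  # the OS waits for this to finish
        return "ok"


class WriteBackController:
    """Acknowledges a write as soon as it is in cache, flushes later."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}  # block address -> data not yet on disk

    def write(self, lba, data):
        self.cache[lba] = data  # fast: memory only
        return "ok"

    def flush(self):
        # Write cached blocks out in address order, then empty the cache.
        for lba in sorted(self.cache):
            self.disk.write(lba, self.cache[lba])
        self.cache.clear()


disk = Disk()
controller = WriteBackController(disk)
controller.write(5, b"payload")        # acknowledged immediately
on_disk_before_flush = dict(disk.blocks)
controller.flush()                      # data reaches the disk here
```

In this model the write-back controller reports success while the disk is still empty, which is exactly the window of risk described next.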
A problem came up when the OS thought the data had been written to disk, but something prevented the write (power loss, bad disk sector, etc.). To combat this, battery backup units were added to keep the cache memory powered, so that if power was lost, the RAID controller would still remember what had not yet been written, and could write it to disk once the drives were back in operation.
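A rough sketch of that recovery step follows. It assumes a hypothetical BatteryBackedCache class whose contents survive the outage, and it models the disk as a plain dictionary; real controllers do this in firmware, so treat the names and structure as illustrative only.

```python
class BatteryBackedCache:
    """Hypothetical model of cache memory that survives a power loss."""

    def __init__(self):
        self.dirty = {}  # block address -> data acknowledged but not on disk

    def stage(self, lba, data):
        # Called when a write is acknowledged to the OS in write-back mode.
        self.dirty[lba] = data

    def replay(self, disk_blocks):
        """Write every staged block to disk; return how many were replayed."""
        replayed = 0
        for lba, data in sorted(self.dirty.items()):
            disk_blocks[lba] = data
            replayed += 1
        self.dirty.clear()
        return replayed


disk_blocks = {0: b"old"}
cache = BatteryBackedCache()
cache.stage(0, b"new")
cache.stage(7, b"extra")
# Power is lost here: the disk still holds only the old data.
# When the drives come back, the controller replays what it remembered.
replayed = cache.replay(disk_blocks)
```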
Tracking was also added so that the system would know which drive was the most ‘up to date,’ in case a drive failed.
With write-back and tracking in use, operating system features like read/write re-ordering were added into the RAID controllers as well, since the controller knew what was really going on ‘under the hood.’
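To see why re-ordering helps on spinning drives, consider a simplified sketch that measures how far a drive head travels when pending writes are serviced in arrival order versus sorted by block address. The head_travel function and the block addresses are made up for illustration, and sorting by address is only one simple re-ordering strategy, not necessarily what a given controller does.

```python
def head_travel(start, lbas):
    """Total distance the head moves when visiting lbas in the given order."""
    total = 0
    position = start
    for lba in lbas:
        total += abs(lba - position)
        position = lba
    return total


pending = [900, 10, 850, 20, 800]   # writes in the order the OS issued them

arrival_order_cost = head_travel(0, pending)
reordered_cost = head_travel(0, sorted(pending))

print(arrival_order_cost)  # 4240
print(reordered_cost)      # 900
```

Because the cache has already acknowledged these writes, the controller is free to pick the cheaper order without the operating system noticing.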
Other steps for boosting general reliability and performance include adding redundant hot-spare RAID controllers, shared cache monitoring, redundant power supplies, and multiple paths to the drives.
Essentially, the improvements in RAID controllers are a cost vs. performance tradeoff. Low-end controllers may support only write-through, while higher-end controllers support write-back. More expensive controllers support SAS+SATA vs SATA only, 6Gb/s vs 3Gb/s bus speeds, and larger cache sizes.
Improving Read Performance
Some workloads are either read-heavy or write-heavy, so steps can be taken to improve performance for those specific situations. For example, a database server with large amounts of read IO and a relatively small write dataset can use solid-state drives (SSDs) to cache frequently read data, such as database index information, or actual tables.
This has to be structured such that an SSD failure does not lose any data, but simply lowers performance until the SSD is replaced. This works by using the memory cache of the RAID controller to “stand in” for the down SSD. The RAID controller has to make certain guarantees about write ordering and caching, but can make effective use of these technologies in a “hybrid” mode. The key in this case is that the SSD speeds up reads but has no (or minimal) impact on writes.
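The essential property, that the SSD holds only copies and never the sole version of any data, can be sketched as follows. HybridReader is a hypothetical class, and the model is deliberately simplified: when the SSD fails, reads here fall straight back to the array rather than to the controller's memory cache.

```python
class HybridReader:
    """Hypothetical read path: SSD is a cache, the array is authoritative."""

    def __init__(self, array_blocks):
        self.array = array_blocks  # the RAID array: always has every block
        self.ssd = {}              # read cache: copies only
        self.ssd_online = True
        self.hits = 0
        self.misses = 0

    def read(self, lba):
        if self.ssd_online and lba in self.ssd:
            self.hits += 1
            return self.ssd[lba]
        self.misses += 1
        data = self.array[lba]
        if self.ssd_online:
            self.ssd[lba] = data   # remember it for the next read
        return data

    def write(self, lba, data):
        self.array[lba] = data     # writes always land on the array
        self.ssd.pop(lba, None)    # drop any stale cached copy

    def fail_ssd(self):
        # Losing the SSD loses only cached copies, never data.
        self.ssd_online = False
        self.ssd.clear()


reader = HybridReader({1: b"index page"})
reader.read(1)      # miss: served from the array, copied to the SSD
reader.read(1)      # hit: served from the SSD
reader.fail_ssd()
after_failure = reader.read(1)   # slower, but still correct
```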
A RAID controller can also be configured to prefer certain drives for writes and others for reads. This increases performance in write-heavy situations while guaranteeing data integrity, and still provides acceptable read performance.
Improving Data Integrity Guarantees
Modern drives (both SAS and SATA) do self-checks and provide alerts if an issue is found on a self-check. A RAID controller can have hot-spare drives configured so that if a drive shows any signs of ‘old age,’ a hot-spare drive is immediately brought into operation. This minimizes the risk window of reduced data redundancy.
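As a minimal sketch of that policy, the function below swaps in a spare for any array member whose self-check has raised a warning. The promote_hot_spares function, the drive names, and the health dictionary are all hypothetical, and the rebuild of data onto the spare is left out.

```python
def promote_hot_spares(array, spares, healthy):
    """Replace drives flagged unhealthy with spares, in place.

    array   -- list of drive ids currently in the array
    spares  -- list of idle hot-spare drive ids (consumed as used)
    healthy -- dict of drive id -> False if a self-check raised a warning
    Returns a list of (replaced_drive, spare_drive) pairs.
    """
    swaps = []
    for slot, drive in enumerate(array):
        if not healthy.get(drive, True) and spares:
            spare = spares.pop(0)
            array[slot] = spare   # a real controller would rebuild here
            swaps.append((drive, spare))
    return swaps


array = ["disk0", "disk1", "disk2", "disk3"]
spares = ["spare0"]
healthy = {"disk2": False}   # disk2 is showing signs of old age

swaps = promote_hot_spares(array, spares, healthy)
```

After the call, the array in this example contains spare0 in place of disk2 and the spare pool is empty, so a second warning would have to wait for a replacement drive.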
Some RAID controllers are also capable of using spare IO bandwidth to actively police the drives, trying to force an error while the drive is otherwise idle, again to catch problems as they begin to arise.
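The idea of such a background scan can be illustrated with a short sketch. It assumes a hypothetical patrol_read function and a table of stored CRC32 checksums standing in for whatever error detection the drive or controller really provides; actual controllers work at the sector level in firmware.

```python
import zlib


def patrol_read(blocks, checksums):
    """Read every block and return the addresses that fail verification."""
    bad = []
    for lba in sorted(blocks):
        if zlib.crc32(blocks[lba]) != checksums[lba]:
            bad.append(lba)
    return bad


blocks = {0: b"alpha", 1: b"bravo", 2: b"charlie"}
checksums = {lba: zlib.crc32(data) for lba, data in blocks.items()}

blocks[1] = b"brav0"   # simulate silent corruption on an idle block

suspect = patrol_read(blocks, checksums)
print(suspect)  # [1]
```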
At ServInt, we use a combination of RAID10, RAID50, and RAID60 arrays, and offer full backups of all drive arrays, either by block replication, straight file system copies, or other mechanisms. Our users have different needs, and we meet them with the best solutions possible. For some people, it is directly attached storage RAID arrays with full backups, coupled with hot-spare hardware for extremely high-speed disk access. For other customers, it’s a simple RAID1 configuration with off-site backups. Other customers have even more specific needs. Our goal is effective data performance and protection, with the best risk-mitigation possible.
Photo by Daniel J. Mitchell