RAID 6 or RAID 1+0: Which should you choose?

RAID selection can be a tough choice, especially when budgets are tight. Learn why RAID 10 might be a better choice than RAID 6 even though it's more expensive.

Last week, my friend and fellow TechRepublic contributor Rick Vanover provided some compelling reasons why you should opt for RAID 6 instead of RAID 5 for data protection, particularly as individual disk sizes increase and more disks are added to an array. As Rick indicated, RAID 6 provides much greater protection against data loss than RAID 5. In fact, a lot has been written about the growing need to avoid RAID 5 because of its inherent and increasingly significant limitations.

When it comes to a choice between RAID 5 and RAID 6, I agree with Rick that, from a data protection standpoint, RAID 6 is the better choice. There are, however, some significant tradeoffs, which Rick alluded to in his article. Most importantly, RAID 6 imposes a serious write performance penalty, even when compared to RAID 5. Every write operation initiated against a RAID 6-based array requires six back-end I/O operations, while a RAID 5 write requires just four. In my opinion, that penalty is a significant roadblock even in a relatively balanced read/write environment.
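
To make the penalty concrete: the 4x and 6x figures come from the read-modify-write cycle (RAID 5 reads the old data block and the old parity, then writes new versions of both; RAID 6 does the same with two parity blocks, while a mirrored write simply lands on two disks). Below is a minimal sketch of that back-end math in Python; the 5,000 IOPS, 50% read workload is purely illustrative.

    # Back-end I/Os generated by one front-end write (standard RAID write penalties)
    WRITE_PENALTY = {"RAID 5": 4, "RAID 6": 6, "RAID 10": 2}

    def backend_iops(frontend_iops, read_fraction, raid_level):
        """Total disk I/Os required to service a given front-end workload."""
        reads = frontend_iops * read_fraction
        writes = frontend_iops * (1 - read_fraction)
        return reads + writes * WRITE_PENALTY[raid_level]

    # Illustrative example: 5,000 front-end IOPS at a 50/50 read/write mix
    for level in WRITE_PENALTY:
        print(level, backend_iops(5000, 0.5, level))
    # RAID 5 -> 12,500 back-end IOPS; RAID 6 -> 17,500; RAID 10 -> 7,500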

Although cost is always an important consideration when buying new storage, growing disk sizes have gone a long way toward making it possible to focus much more on the performance side of the storage equation than on the capacity side. That's why, in most situations, if given the option, I'd choose RAID 10 (data striping over mirrored data sets) over either RAID 5 or RAID 6. In fact, I recently put my money where my mouth is when I purchased an expansion disk shelf for our SAN.

Storage capacity didn't enter into the equation (we have plenty of space); however, we were hitting an IOPS wall. As such, the primary focus was performance - balanced read and write performance. The new disk shelf is configured as a RAID 1+0 array. Whereas RAID 6 imposes the aforementioned 6x write penalty and RAID 5 imposes a 4x penalty, RAID 1+0 imposes just a 2x penalty (a rough sketch of what that difference means in spindle counts follows the list below) and has other significant benefits:

  • Better write performance. RAID 1+0 imposes only a 2x write performance hit.
  • Faster rebuild speed. Rebuilding a failed disk that is part of a mirror is a much faster process than rebuilding a failed disk in a RAID 6 array. If you implement a hot spare, the rebuild can begin immediately and finish quickly, making it less likely that you'll suffer the simultaneous loss of a second disk.
  • Can withstand the loss of multiple disks (in some cases). This is a bit of a shaky proposition, but it is important to note. RAID 6 can always withstand the loss of any two disks in an array; this is one of the main value propositions for those who use RAID 6. RAID 1+0 can also withstand the loss of multiple disks, as long as both disks in any one mirrored pair aren't lost. If the stars aligned correctly, you could theoretically lose every disk on one side of the mirror and still be operational on the other copy of the data. Again, don't count on failures landing conveniently on only one side of the mirror, but it's still worth understanding.
  • Performance degradation during the rebuild process is minimal. When a RAID 6 disk fails, the rebuild process can have a seriously negative impact on overall storage performance because parity must be recalculated from the surviving disks. With RAID 10, re-establishing a broken mirror is a straight copy from the surviving member of the pair and happens largely behind the scenes.
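
To put the write penalty in terms of the IOPS wall mentioned above, here is a rough sketch of how the penalty translates into spindle counts. The figure of roughly 175 IOPS per disk is a common rule of thumb for a 15K spindle, and the workload numbers are made up for illustration; this is a back-of-the-envelope estimate, not a sizing tool.

    import math

    def disks_needed(frontend_iops, read_fraction, write_penalty, iops_per_disk=175):
        """Rough spindle count required to absorb a workload at a given write penalty."""
        reads = frontend_iops * read_fraction
        writes = frontend_iops * (1 - read_fraction)
        return math.ceil((reads + writes * write_penalty) / iops_per_disk)

    # Illustrative example: 4,000 front-end IOPS at a 50/50 read/write mix
    print(disks_needed(4000, 0.5, 2))   # RAID 10 -> 35 disks
    print(disks_needed(4000, 0.5, 6))   # RAID 6  -> 80 disks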

Going back to the space cost inherent in the choice between RAID 6 and RAID 1+0, understand that with RAID 6 you "lose" two disks' worth of capacity (2/N of an N-disk array) to parity. With RAID 1+0, that "lost" space is a flat 50% of total array capacity, regardless of the number of disks. So, yes, RAID 10 does have a higher space cost, but I believe that the benefits it brings to the table (particularly with regard to write performance) are powerful reasons to avoid RAID 6 in favor of RAID 10.
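
For a concrete sense of that space cost, here is a minimal sketch of the usable-capacity math; the 12-disk, 2 TB shelf is just an example.

    def usable_tb(disks, disk_tb, raid_level):
        """Usable capacity: RAID 6 gives up two disks' worth to parity, RAID 10 gives up half."""
        if raid_level == "RAID 6":
            return (disks - 2) * disk_tb
        if raid_level == "RAID 10":
            return disks * disk_tb / 2
        raise ValueError(raid_level)

    # Illustrative example: a shelf of 12 x 2 TB drives
    print(usable_tb(12, 2, "RAID 6"))    # 20 TB usable (2/12, about 17%, lost to parity)
    print(usable_tb(12, 2, "RAID 10"))   # 12 TB usable (50% lost to mirroring)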

Summary

If your storage device is using RAID 6 and you're not having performance problems -- particularly related to writes -- there's no need to blow it away and replace it with a RAID 1+0 configuration. My advice here is intended as food for thought when it comes to deploying new storage. Don't simply write off RAID 10-based storage because of its 50% capacity overhead; it might be worth the tradeoff, depending on your situation. As always, your selection should be based on your own testing, application needs, and risk tolerance.

23 comments
whitesites

I am in the process of building a new server. Space isn't a huge issue. Currently using about 180 GB of my RAID 10 array. The array consists of 4 x 146 GB 15K Fujitsu SAS drives. However, I feel I have been hitting an IO wall as well. The server runs IIS 7.5, Mail Server, Stats Server, and MySQL 5.1. Right now the plan is to go for 6 x 256GB Samsung SSD 830 models. But not sure if I want to do RAID 10 or RAID 6 (which would give me a little more room). The database is mostly reads. My question is, even though IO won't be a problem with SSD drives, is there still a significant read penalty? Being I am using SSD drives, it's kind of a gamble with reliability. But I already have been running a 3 SSD RAID 0 array on my home desktop with no issues for the past 2 years. Naturally I will be going for a beefy RAID controller (Adaptec 6805 or similar). Any thoughts?

aandruli

If you are only concerned about performance, you wouldn't use RAID at all. It's for data protection, letting the system deliver the goods. You take a performance hit in the first place by using RAID -- I'd take a little bit more to make sure it does its job with RAID 6.

zackers

One of the biggest reasons for moving away from RAID5 is the data error problem, where the number of data errors per TB has not been going down faster than the storage capacity has been going up. Some errors grow long after the data is written, and some errors occur at write time. Recently it has been shown that mechanical vibrations during the write can easily cause mispositioning of the head and a subsequent write error (even yelling close to the disk can cause these errors). So my question is has anybody ever tried to sort out what causes all the data errors? For example, has anybody ever done a read-after-write test to see just how many data errors occur at write time versus grow at a later date? And has anybody tried to sort out how many write time errors are due to vibration and how many are due to media defects? My point is that if a large percentage of these data errors are due to vibrations at write time, then maybe the situation is not as dire as everyone says it is.

Tony R.

Off-site backup via the Internet, i.e. "in the cloud". If your facility suffers flood, fire or theft, the RAID type is irrelevant. Off-site backup made all the difference for the companies that used it that were located in the World Trade Center in New York City on September 11, 2001.

radio1

Are your claims based on real tests, or are they based on guesses? I am using both RAID 1+0 and RAID 6. My RAID 1+0 is using 4 disks and my RAID 6 is using 5. Both arrays are on the same controller, an Adaptec 51645. I did some testing a while back and in every test I did, the RAID 6 outperformed the RAID 1+0 as far as read/write performance. I don't have the actual results in front of me, but I chose to go with RAID 6 because of these results. As far as disk failures go, in the single-disk degraded state, neither array had a significant change in speeds. In a 2-disk degraded state, the RAID 6 was significantly slower. In a 2-disk degraded state for RAID 1+0 (stars were aligned) there was no significant change in performance. The rebuild times for both were fairly similar. For a 1.5TB array in RAID 6 with a single disk failure, the rebuild was less than 2 hours. The RAID 1+0 rebuild was slightly faster, but nothing to jump up and down about. RAID 6 with a 2-drive failure was longer, but I don't remember how much. I change a drive in my arrays every 2 months and the rebuild is always less than 2 hours with no noticeable change in speeds. I have 3 machines like this:
  • Super Micro x7DCA-L MB
  • 2x Intel Xeon E5410 quad core 2.33
  • 4 x 4 Gig Crucial Memory Registered ECC
  • 10x WD RE3 1TB hard drives
  • 2x Super Micro CSE-M35T-1B 5 SATA hot swap bays
  • Adaptec 51645 controller
  • VMware vSphere 4.0 u1

hjmangalam

So RAID6 takes an additional 2 operations over RAID5 & an additional 4 ops over R10. So what? Is the IO Processor bottle-necked by the XOR operations? In most such RAID cards the IOP is NOT the bottleneck and in the data I've seen and measured, R6 is not appreciably slower than R5 or even R10. If you're making such decisions based on such pseudo/theoretical/single dimension constraints without data, you're making a huge mistake. Similarly, have you actually measured the IO degradation on a regenerating RAID? While the performance is not as good as on an intact device, it's usually tolerable with good controllers, especially when compared with a completely lost volume. Show us the data. Or at least link to it.

GDF

Boy, Scott, I don't know. I'm not sure you've made a compelling case for 1+0 over RAID 6. You mention performance issues, but you haven't really quantified that. You do acknowledge the overhead (1/2 vs. 2/N), and it seems to me that would be pretty significant in a high-capacity installation. In Rick Vanover's article that you mentioned, he cites a good reference for RAID differences. Here's the page for 1+0: http://www.acnc.com/04_01_10.html and the most telling comment on that page, under the "Disadvantages" column, is, "Very limited scalability at a very high inherent cost." The other disadvantages mentioned there are important to note, too, including the issue of syncing drives.

gdekhayser

Scott - I know you went out of your way to avoid naming storage vendors here. Do you think that RAID-DP (a specific vendor's implementation of RAID-6) comes with the same 6x performance issues of 'standard' RAID-6 from other vendors, or would you make an exception with this specific technology?

sjdorst

Slight quibble here - RAID 0 - pure striping across multiple disks with no redundancy - is purely a performance enhancing RAID level.

Mike Karp

Embarrassingly, a large share of the errors occur during rebuilds. The most typical causes? (1) Power hits during the process, and (2) involvement of the humanware during the rebuild.

Mike Karp

...this question is about RAID (and thus presumably about online availability) and not about backup/recovery or your issue, which is clearly DR. Obviously, DR sites need to be as far away from the primary site as is practicable, and in a seismically uninteresting site to boot.

tbmay

...uptime in the face of the most common point of failure...hard-drives. That is absolutely no substitute for backups. Saying that, I can't say I trust most "cloud" based backup solutions. If your company controls the data-center on the other end and you're getting the data there securely, great. I'm less enthused about third-party backup solutions.

jhoward

I have never heard someone claim that RAID is a suitable replacement for off-site backups be it in the cloud or physically across geographical boundaries such as power grids, flood zones etc. Posts such as the original article usually have a link to a standard disclaimer stating the above. RAID itself can benefit the overall performance of your "local" storage and save time in the event of a single or multiple drive failure. Restoring from an off-site backup is usually time intensive and usually should be avoided where possible.

hjmangalam

...when I tried this with both Areca and 3ware controllers. To my surprise, R6 beat both R5 and R10. The I/O processors hide the incremental numerical complexity, so it doesn't make any difference to the user. hjm

Scott Lowe

I agree 100% that RAID 10 will always cost more than RAID 6, with the one exception being a 4-disk array. After that 50% overhead point, RAID 6's capacity overhead begins to go down while RAID 10 stays at 50%. I mentioned in the article that this would be the case. As far as cost is concerned, if the end result is more performance-driven than capacity-driven, RAID 10 should be given serious consideration, even if it means some extra dollars up front. That said, in many cases - especially for smaller shops - capacity is the key driver, and as long as RAID 6 can keep up performance-wise, it will win the capacity argument every time. I am careful not to take too hard a stance in these postings. While, in my opinion and experience, RAID 10 has been a good solution, there are certainly instances in which RAID 6 is an excellent solution. Based on the application at hand and the budget, these will be individual decisions.

Scott Lowe

GDF, unless you're in a heavily read-oriented environment - and I admit many places are read-heavy - the RAID 6 write penalty is simply not worth the capacity tradeoff, in my opinion. In order to regain those write IOPS, you need a lot more disks. I haven't quantified the disk quantity difference here, but take a look at another post I wrote for more information: http://blogs.techrepublic.com.com/datacenter/?p=2182 In some cases, RAID 6 is good for read-heavy places, but once you start hitting a critical mass of writes, I don't think it's a good choice. Scott

LifeNoBorders

@gdekhayser

Disclosure: NetApp employee.

RAID-DP is a very efficient method of implementing RAID-6 from the point of view of computing the redundant parity data, but like all N+2 methods of data protection it has a theoretical worst-case write performance hit of 6x disk transfers for one incoming write. The reason why RAID-DP is correctly identified as having similar and sometimes superior performance to RAID-10 is because it is inseparably combined with WAFL, which ensures this worst-case situation never happens.

I wrote up exactly why this is the case in a blog post a few years ago, almost all of which is still applicable:

storagewithoutborders.wordpress.com/2010/07/19/data-storage-for-vdi-part-4-the-impact-of-raid-on-performance/

storagewithoutborders.wordpress.com/2010/07/19/data-storage-for-vdi-part-5-raid-dp-wafl-the-ultimate-write-accelerator/

I hope this helps to clarify things a little.

Regards
John Martin

b4real

You can't assume everyone can have NetApp storage. Granted, it is a common platform, but in the case of RAID 6 or 1+0, almost every array controller - including local systems' direct-attached storage - can deliver this option.

bboyd

You get full redundancy and most of the benefit of RAID 0. Some people refer to it as RAID 10. Just a note: RAID 0+1 does not provide the same level of data safety.

GDF

Am wondering what the application is where write performance is such an issue. I would guess either backups (which probably shouldn't be in a RAID 6 configuration) or a database with an unusually high write ratio.

Scott Lowe

I generally call it RAID 10, but specifically referred to it as RAID 1+0 in this article in order to differentiate it from RAID 0+1, which is not, in my opinion, production-worthy. I've seen the terms RAID 1+0 and RAID 0+1 used interchangeably, so I was careful here. Scott

WhiteKnight_

The nature of MRP and DRP applications involves a lot of non-sequential I/O. MRP batch jobs themselves can be extremely write intensive while also being time sensitive. These aren't your insurance and banking types of apps.

jhoward

We recently (last year) started using RAID 10 for our network monitoring software's database because the original RAID 5 was thrashing the disks regularly. Initially the number of nodes wasn't very large, but each node gets between 6 and 15 updates per minute. As the number of nodes grew with our business, it became clear we needed a better solution than RAID 5 or 6. Our problem stemmed from an initial ignorance of what was actually needed. RAID 5 was cheaper in the beginning, but the system as a whole eventually could not tolerate the degraded performance of a bad disk, which would cause database corruption and all sorts of other time-intensive issues. Different RAID levels exist for a reason, so there are pros and cons to each; however, it is becoming increasingly difficult for us to justify anything other than RAID 10 for our data center. In the end we needed to be honest with ourselves about what things actually cost. Performance is an issue, and it costs us time and money when it is degraded, as it directly affects our end users.
