How to Get a Consumer-Grade SSD to Work in a Dell PowerEdge R710!
Around this time last December, I was searching for a reliable, cost-effective SSD for our vSphere ESX environment, as the ones from Dell are prohibitively expensive to implement.
Besides, Dell SSDs may have performance problems, as someone described on VMTN. Those first-generation SSDs are either Samsung OEM (100GB and 200GB) or Pliant-based (149GB), the latter being much faster than the Samsung ones and of course much more expensive (overpriced, that is) as well. Both are eMLC NAND.
Anyway, I finally purchased a Crucial M4 128GB 2.5″ SATA 3 6Gb/s SSD (around USD 200 with a 3-year warranty), for a number of reasons.
My goal was then to put this SSD into a Dell PowerEdge R710 2.5″ tray and see how amazing it would be. I know it's not a supported solution, but there's no harm in trying.
The PERC H700/H800 Technical Guide Book specifically states that a SATA SSD is only supported at 3Gb/s, but I found out this is not true; read on.
1. First things first: I upgraded the Crucial firmware from 0009 to 0309, because in early January 2012 users found that the Crucial M4 SSD has a 5,200-hour BSOD problem. It's still better than the Intel SSD's huge 8MB bug.
Correct a condition where an incorrect response to a SMART counter will cause the m4 drive to become unresponsive after 5184 hours of Power-on time. The drive will recover after a power cycle, however, this failure will repeat once per hour after reaching this point. The condition will allow the end user to successfully update firmware, and poses no risk to user or system data stored on the drive.
Something else to note: SandForce-based SSDs have a weakness in that performance gradually decreases as more and more data is stored on the drive. The Crucial M4 is based on the Marvell 88SS9174 controller and doesn't have this kind of problem; it is more stable, and its speed stays consistent even when the drive is 100% full.
In addition, the Crucial M4's garbage collection runs automatically at the drive level when the drive is idle, reclaiming space in the background much like TRIM does, but independently of the running OS. Since TRIM is an OS-level command, it will not be used if the OS does not support it (e.g., VMware ESX).
2. The most difficult part is actually finding the 2.5″ tray for the PowerEdge R710, as Dell does not sell those separately. Luckily I was able to get two of them off a local auction site quite cheaply; I later found out they might be counterfeit parts, but they worked 100% fine, only the color is a bit lighter than the original ones.
3. The next obvious thing was to insert the M4 SSD into the R710 and hope the PERC H700 would recognize the drive immediately. Unfortunately, the first run failed miserably with both drive indicator lights OFF, as if there were no drive in the 2.5″ tray.
Checking the OpenManage log, I found the drive is not certified by Dell (huh?) and was blocked by the PERC H700 right away.
Status: Non-Critical 2359 Mon Dec 5 18:37:24 2011 Storage Service A non-Dell supplied disk drive has been detected: Physical Disk 1:7 Controller 0, Connector 1
Status: Non-Critical 2049 Mon Dec 5 18:38:00 2011 Storage Service Physical disk removed: Physical Disk 1:7 Controller 0, Connector 1
Status: Non-Critical 2131 Mon Dec 5 19:44:18 2011 Storage Service The current firmware version 12.3.0-0032 is older than the required firmware version 12.10.1-0001 for a controller of model 0x1F17: Controller 0 (PERC H700 Integrated)
4. I then found out the reason: the older H700 firmware blocks access to non-Dell drives, so I had to update the PERC H700 firmware to the latest version (v12.10.x) using USC again. Before the upgrade, I booted into the H700's ROM and confirmed the SSD was indeed not presented in the drive pool. Anyway, the whole process took about 15 minutes to complete, not bad.
5. After the server returned to normal, the Crucial M4 128GB SSD had its tray indicator lit and was working, though only partly correctly: the indicator on top kept blinking amber (i.e., orange), OpenManage logged "Not Certified by Dell", and this also caused the R710 front-panel LCD to blink amber.
Besides, under Host Hardware Health in vCenter, there is one error message showing “Storage Drive 7: Drive Slot sensor for Storage, drive fault was asserted”
From the PERC H700 log file:
Status: OK 2334 Mon Dec 5 19:44:38 2011 Storage Service Controller event log: Inserted: PD 07(e0xff/s7): Controller 0 (PERC H700 Integrated)
Status: Non-Critical 2335 Mon Dec 5 19:44:38 2011 Storage Service Controller event log: PD 07(e0xff/s7) is not a certified drive: Controller 0 (PERC H700 Integrated)
Clearing the log in OpenManage turns the front-panel LCD back to blue, but the SSD's top indicator light still blinks amber. Don't worry; it's just the indicator showing it's a non-Dell drive.
Later, this was confirmed by a message in VMTN as well.
The issue is that these drives do not have the Dell firmware on them to properly communicate with the Perc Controllers. The controllers are not getting the messages they are expecting from these drives and thus throws the error.
You really won’t get around this issue until Dell releases support for these drives and at this time there does not appear to be any move towards doing this.
I was able to clear all the logs under Server Administrator. The individual lights on the drives still blink amber but the main bevel panel blue. The bevel panel will go back to amber again after a reboot but clearing the logs will put it back to blue again. Minor annoyance for great performance.
Update: If you have OpenManage version 8.5.0 or above, you can now disable the non-certified-drive warning completely! Strange that Dell finally listened to their customers after years of complaints.
In C:\Program Files\Dell\SysMgt\sm\stsvc.ini, set the parameter NonDellCertifiedFlag=no
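If you want to script that change, here is a minimal sketch only; the path and flag name are the ones documented above, and I'm assuming the flag already exists in the file and that the OMSA services are restarted afterwards:

```python
# Minimal sketch (untested): set NonDellCertifiedFlag=no in stsvc.ini on Windows.
# Assumes the file and the flag already exist; the OMSA services typically need a
# restart afterwards for the change to take effect.
import re
from pathlib import Path

INI_PATH = Path(r"C:\Program Files\Dell\SysMgt\sm\stsvc.ini")

text = INI_PATH.read_text()
text = re.sub(r"NonDellCertifiedFlag=\w+", "NonDellCertifiedFlag=no", text)
INI_PATH.write_text(text)
print("NonDellCertifiedFlag set to 'no' - restart the OMSA services to apply.")
```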
6. The next important thing was to do a VMFS rescan. ESX 4.1 found this SSD immediately (yeah!) and I added it to the Storage section for testing.
Then I tested this SSD with IOMeter. Wow! This SINGLE little drive blows our PS6000XV (14 x 15K RPM, RAID10) away: 7,140 IOPS on the real-life 100% random, 65% read test, almost TWICE the PS6000XV!!! ABSOLUTELY SHOCKING!!!
What this means is that a single M4 ≈ 28 x 15K RPM drives in RAID10; absolutely crazy numbers! (See the quick sanity check after the results table below.)
Test Name                        Avg. Resp. Time (ms)    Avg. IOPS     Avg. MB/s
Max Throughput - 100% Read       1.4239                  39,832.88     1,244.78
Max Throughput - 100% Write      1.4772                  37,766.44     1,180.20
RealLife - 60% Rand, 65% Read    8.1674                  7,140.76      55.79
CPU utilization per test: 93.96%, 94.08%, 30.26%
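For anyone wondering where the "one M4 ≈ 28 spindles" comparison comes from, here is that quick sanity check, assuming the PS6000XV's 14 x 15K RPM RAID10 delivers roughly 3,500 IOPS sustained (the figure mentioned later in this post):

```python
# Rough sanity check of the "one M4 = 28 x 15K disks in RAID10" comparison.
SSD_IOPS = 7140            # single Crucial M4, RealLife-60%Rand-65%Read result above
ARRAY_IOPS = 3500          # PS6000XV (14 x 15K RPM, RAID10) under sustained load
DISKS_IN_ARRAY = 14

iops_per_disk = ARRAY_IOPS / DISKS_IN_ARRAY      # ~250 IOPS per 15K spindle in this RAID10
equivalent_disks = SSD_IOPS / iops_per_disk      # ~28.6 spindles
print(f"One M4 is roughly equal to {equivalent_disks:.1f} x 15K RPM disks in this RAID10")
```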
So why would I spend 1,000 times more when I can get this result from a single SSD drive for under USD 200? (I was later proven wrong: if you sustain the I/O, the EqualLogic stays at 3,500 IOPS while the SSD drops to 1/10 of its starting value.)
Oh, one final good thing: the Crucial M4 SATA SSD is recognized as a 6Gbps device by the H700. As mentioned at the very beginning, the PERC H700 technical guide says the H700's SATA SSD interface only supports up to 3Gbps; I don't know whether it's the latest PERC H700 firmware or the M4 SSD itself that somehow breaks that limit.
Let's talk a bit more about the PERC H700 itself. Most people know Dell's RAID controller cards have been LSI MegaRAID OEM since the PowerEdge 2550 (the fifth generation), and the Dell PERC H700 shares many advanced features with its LSI MegaRAID counterparts.
These include CacheCade, FastPath, and SSD Guard, but they are ONLY available on the PERC H700 1GB NVRAM cache version.
Optimum Controller Settings for CacheCade – SSD Caching
Write Policy: Write Back
IO Policy: Cached IO
Read Policy: No Read Ahead
Stripe Size: 64 KB
Cut-Through IO = FastPath. Cut-through IO (CTIO) is an I/O accelerator for SSD arrays that boosts the throughput of devices connected to the PERC controller. It is enabled by disabling the write-back cache (i.e., enabling write-through cache) and disabling Read Ahead.
So this means you can use LSI MegaRAID Storage Manager to control your PERC H700 or H800. In my case, I found my H700 does not support any of the above, as it's only the 512MB cache version. However, "SSD Caching = Enable" shows in the controller properties under LSI MegaRAID Storage Manager and cannot be turned off, as there is no such option. I am not sure what this is (it's definitely not CacheCade); if you know, please let me know.
Now let's go a bit deeper into the PERC H700's bandwidth, as I found the card itself can reach almost 2GB/s; again, this seems too good to be true!
The PERC H700 Integrated card, with two x4 internal mini-SAS ports, uses a PCIe 2.0 x8 host interface on the riser.
The PERC H700 is PCIe 2.0 x8 (500MB/s per x1 lane) with TWO x4 SAS 2.0 (6Gbps) ports; at roughly 500MB/s of usable bandwidth per 6Gbps lane, the total bandwidth of EACH x4 port is about 500MB/s x 4 = 2,000MB/s (i.e., 2GB/s).
EACH SATA III or SAS 2.0 link is 6Gbps, meaning EACH drive could in theory deliver up to about 750MB/s raw (if such a SAS drive existed), so in reality it would take about SIXTEEN (16) 6Gbps 15K RPM disks (each around 120MB/s) to saturate ONE PERC H700's 2GB/s of theoretical bandwidth.
A single Crucial M4 SSD being able to go over 1GB/s in both read and write really shocked me!
This means two consumer-grade Crucial M4 SSDs in RAID0 should be enough to saturate the PERC H700's total 2GB/s bandwidth easily (see the quick check below).
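A quick back-of-the-envelope check of those numbers, using the post's rounded figures (~500MB/s of usable bandwidth per 6Gb/s lane, ~120MB/s per 15K disk, ~1GB/s per M4), not exact encoding-corrected values:

```python
# Back-of-the-envelope saturation check for one x4 port on the PERC H700.
PORT_LANES = 4               # each mini-SAS connector on the H700 is x4
MB_PER_LANE = 500            # ~500 MB/s usable per 6Gb/s lane (rounded figure used above)
port_bandwidth = PORT_LANES * MB_PER_LANE          # = 2,000 MB/s (the 2GB/s quoted)

HDD_15K_MBPS = 120           # realistic sequential throughput of one 15K RPM disk
SSD_M4_MBPS = 1000           # the ~1GB/s observed from a single Crucial M4

print(f"Port bandwidth: {port_bandwidth} MB/s")
print(f"15K disks needed to saturate it: {port_bandwidth / HDD_15K_MBPS:.1f}")   # ~16.7
print(f"Crucial M4s needed to saturate it: {port_bandwidth / SSD_M4_MBPS:.1f}")  # 2.0
```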
The ESX storage performance chart also shows IOPS consistent with the IOMeter results (i.e., over 35,000 in sequential read/write).
Veeam Monitor shows 1.28GB/s read and 1.23GB/s write.
In fact, it's not just me; many people have been able to reach this maximum of about 1,600MB/s, i.e., 1.6GB/s (yes, the theoretical limit is 2GB/s), with two or more SSDs on the PERC H700.
Of course, the newer PCIe 3.0 standard is about 1GB/s per x1 lane, so x4 gives you 4GB/s, double the bandwidth; hopefully someone will benchmark the PowerEdge R720 with its PERC H710 shortly.
Some will say using a single SSD is not safe. OK, then make it a RAID1, if not a RAID10 or RAID50 with 8 SSD drives. With the newer PowerEdge R720, you can fit a maximum of 14 SSDs to create a RAID10/RAID50/RAID60 with 2 hot spares in a 2U chassis; more than enough, right?
Most importantly, the COST IS MUCH, MUCH LOWER when using consumer-grade SSDs, and it's not hard to imagine 14 SSDs in RAID10 producing some incredible IOPS; I'd guess somewhere in the 50,000 to 100,000 range should be achievable without much problem (rough math below). So why pay sky-high $$$ for those Fusion-io PCIe cards, folks?
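Here is roughly where that 50,000 to 100,000 range comes from, assuming near-linear scaling (which a real controller and workload will not deliver, so treat it as an optimistic bound):

```python
# Naive scaling bounds for 14 consumer SSDs in RAID10 (illustrative only).
SINGLE_SSD_IOPS = 7140                        # measured RealLife IOPS of one Crucial M4
DRIVES = 14

write_bound = SINGLE_SSD_IOPS * DRIVES // 2   # writes land on 7 mirrored pairs: ~50,000
read_bound = SINGLE_SSD_IOPS * DRIVES         # reads can be served by every drive: ~100,000
print(f"Naive RAID10 estimate: ~{write_bound:,} (write-heavy) to ~{read_bound:,} (read-heavy) IOPS")
```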
Finally, I also did a few desktop benchmarks from within the VM.
Atto:
HD Tune:
In conclusion, for a low-cost consumer SSD, 7,000+ IOPS on the 100% random RealLife-60%Rand-65%Read test with a 32K transfer request size is simply amazing!!!
So I've been thinking…
What about an article describing in more depth what you're going to use the SSD for in your VMware environment?
I've been thinking of giving SSDs a go on our M805 blades, but I can't really find any advantage to it, since all my storage is on the SAN, so I'm not sure how I can even leverage the speed of the SSDs.
One of the really smart things about Fusion-io (we can agree their pricing scheme is crazy!) is their ioTurbine technology http://www.fusionio.com/systems/ioturbine/ which basically lets me cache traffic to and from the SAN on the IO card, giving huge improvements in day-to-day latency. Unfortunately, as long as I'm using blades, PCIe solutions aren't really an option, and I'm not sure if they make a mezzanine version.
I use SSD disks mostly for IOPS-intensive MSSQL and MySQL applications, and I do see response times improve by 10x in many cases.
The above IOmeter test was performed within a VM under ESX 4.1
RealLife-60%Rand-65%Read……8.1674………7140.76………55.79
7,140 IOPS is a really amazing number.
Btw, there is a good article about distributed SSD cache in enterprise SANs and servers; various vendors use different SSD caching methods to accelerate traffic to and from the SAN on the server side.
Hi,
I'm using 16x Corsair P3 SSDs in my R910, which, I believe, use the same controller as your drives.
Apart from the amber lights, I can't say a bad word about them. By the way: I think (not sure though, I tried it a long time ago) some other SSDs (SandForce-based OCZ Vertex or Corsair Force 3) keep the lights green.
The job the server does is not mission critical, but it is extremely I/O intensive – creating and deleting millions of files in the shortest possible time. So I don't even use RAID10; I use 2x 8-drive RAID0.
I also use PCIe-based SSDs in another server that runs SQL and they work great as well. The average queue depth is less than 0.03 (and below 0.01 more than 80% of the time), with average reads of more than 400MB/s.
Thanks for the feedback.
Btw, could you run IOMeter with RealLife-60%Rand-65%Read? I bet the result must be sky high with your 16-drive SSD RAID0! Probably in the 100K IOPS range.
Hi
I read your interesting article. I have an R710 with 3 x Crucial 512GB SSDs configured as a RAID 5 VD. As soon as we add other disks in another VD (7,200 RPM 250GB Dell disks), one of the SSDs is sometimes reported as missing.
Did you experience that problem? Any idea on how to solve it?
thanks !
philippe
We have a similar implementation, but using vSphere and OEM SSDs in an R710 (of course, the most difficult part is getting the cradle…). The SSDs serve as swap space for memory over-commitment.
I used Crucial M4 SSDs with a Dell PERC controller for a while and they were very fast.
Unfortunately, after a few months it just marked the disks as “Foreign” and refused to use them.
I'm not sure if it is a problem with SAS RAID and SATA SSDs, but it marked them both bad at the same time, even though they load just fine on a PC SATA controller.
At some point these controllers will take MLC SSD issues into account, hopefully…
You said, “later proved I was wrong as if you sustain the I/O process, Equallogic will stay at 3,500 IOPS and SSD will drop to 1/10 of it’s starting value.”
Sustain it for how long? Why does it do this? Is this a problem for servers with lots of disk IO, such as terminal servers and database servers?
Eric, the period of sustained I/O is over 48 hours; after that you will see this strange phenomenon. Yes, if your Exchange/SQL has 24/7 high I/O, then an SSD may experience this issue, but I am not sure if it also exists with enterprise SSDs, as I was using an MLC consumer-grade SSD in the test.
After all, the EqualLogic is worth every penny you paid.
Thanks for your response! My DB servers have 75-100 separate instances of MySQL running on them, handling 75-100 separate MyISAM databases. (This works out better for us than having a single instance handle all databases.) During normal business hours, each instance does 90% selects and 10% inserts, probably no more than 1 insert per second on average, for a total of 100 inserts per second if you count all instances. I would not call this heavy I/O. During the evenings, this drops down to nearly nothing. In the very early morning, we back up the server using rsync, which reads the disks for 1-2 hours. Would you expect SSD drives to experience performance issues in our environment?
You mentioned that SandForce-based drives have a problem where they slow down gradually as data is added to the drive. I spoke with several vendors today and it seems that newer SandForce firmware eliminates this problem. Also, you mentioned that Marvell-based drives perform at full speed even when they are 100% full. I assume you mean if Trim is turned on, correct?
I would suggest SSD caching or auto-tiering; many vendors provide such solutions now. If you are with Dell, the latest R720 does exactly what you need (SAS + 2 SSD disks). It will help in your case since your environment has an almost 90% read / 10% write I/O pattern; there's no need to go full SSD, as the cost is prohibitive.
Btw, regarding your second question, I still believe it was a firmware issue with SandForce (glad they solved it); that's with TRIM on, of course.
Boy, what a ride. I've been reading up on SSDs for the past 3 days straight. There are so many concerns. It is nearly impossible to find consumer-grade drives that are both fast and reliable. Many people say the most reliable drives are the Intel 520, Samsung 830, Plextor M3P, and Crucial M4. But it turns out that "reliable" means two different things. If you mean reliable in terms of product validation and testing, Intel really shines. BUT it is SandForce-based, and neither SandForce nor Intel have completely fixed their BSOD problems. On the other hand, if you mean reliable in terms of product lifetime, frequency of unrecoverable read errors, and so on, then SandForce's DuraWrite and RAISE technologies seem awesome. If you mean reliability in terms of predictable performance, it would seem that Marvell-based controllers are the best choice because they work the same with compressible and incompressible data, but they are not as reliable when it comes to the physical lifetime of the media. Sigh.
Hey, since TRIM does not work through a RAID controller, is there a way to manually trigger it, maybe through a cron job? Or am I stuck with garbage collection? And if so, now I have to go back through my list of drives and see which ones have the best GC algorithms for systems with medium to heavy load.
I thought the Marvell-based Crucial M4 does GC automatically, independent of the OS. TRIM != GC, right?
Btw, I would be pretty happy if the Crucial M4 lasted two years in a heavy I/O environment. Why? Because it's so cheap now, about USD 110 for 128GB; just make a RAID1 or RAID5/6 and that completely addresses the reliability concern.
From what I've read, all SSDs do GC. Some also do TRIM, but that also depends on whether the OS and/or controller send the TRIM command to the drive. The problem with RAID is that even if the OS sends the command, it will not pass through the RAID driver. So if you're going to use RAID, you have to rely on the drive's built-in GC. However, drives do GC differently. Some do it in the background only when the drive is idle, which can be undesirable in high-usage environments where the drive rarely if ever goes idle. The OCZ Vertex 4 must be idle for a whole hour before GC kicks in. Some drives do GC in the foreground, which means they do it at the moment data is written and space needs to be freed up. You take a performance hit, but it is probably preferable anyway.
What RAID level have you been running on your MySQL servers where you mentioned seeing up to a 10x performance boost?
RAID 5 mostly, with three to five M4 SSDs.
In two years, have you ever had to replace a Crucial drive? Is Dell OpenManage still able to predict in advance that a drive is going bad?
I've only started to deploy these consumer-grade SSDs on R710s on a very small scale. Touch wood! Nothing bad has happened so far; the failed-drive prediction has never triggered either, though I do hope it will work.
Please share your interesting SSD-on-PE-R710/R720 results with us later, thanks.
Finally, if you have the budget, you may consider the Fusion-io ioDrive2 or ioDrive2 Duo; you will be surprised how extreme the IOPS it can deliver is, but money-wise it's also prohibitively expensive.
I just bought a few M4s for our R710s; they are running the latest available firmware and experiencing the same exact issues. Did you have any problems with the disks failing over time? I read the comment from the gentleman above and it sounded like his disks were eventually treated as foreign. Thanks in advance for your response!
Andy
The issue with consumer SSDs is that most have limited write endurance. This is not a problem in your personal computer but in a server running a hypervisor the number of IOPS can be several orders of magnitude greater causing you to blow through the write endurance very quickly. RAID will not help you, because RAID controllers like to balance the writes across all drives so they wear uniformly. With SSD, this can be a disaster because it means your SSD drives will fail within a very short period of each other (possibly hours or maybe days). To increase reliability using consumer grade SSDs, make sure that you have a hot spare configured in your RAID set, have the rebuild rate set to 60% (or higher) and that you configure your RAID set to utilize only half of the available space on your SSDs. This doubles the write endurance (at the expense of available disk space). Have a look at the Intel 710 enterprise SSD. It’s 300GB but costs US $1,400. It should be clear that we cannot expect 5 years or even 3 years of trouble free operation when using consumer SSDs in a server (especially hypervisor) environment.
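To put Stephan's point in numbers, here is an illustrative endurance calculation; the TBW rating and daily write volume below are assumptions for the sake of the example, not figures from any datasheet:

```python
# Illustrative write-endurance math for a consumer SSD behind a hypervisor.
DRIVE_TBW = 72                   # assumed total-bytes-written rating, in TB (hypothetical)
HOST_WRITES_GB_PER_DAY = 200     # assumed write load per drive under a busy hypervisor

days = DRIVE_TBW * 1024 / HOST_WRITES_GB_PER_DAY
print(f"Rated life at this load: ~{days:.0f} days (~{days / 365:.1f} years)")

# Leaving half the drive unpartitioned roughly doubles the spare area available for
# wear levelling, so as a rule of thumb the usable write endurance roughly doubles too.
print(f"With ~50% over-provisioning: ~{days * 2 / 365:.1f} years (rough rule of thumb)")
```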
“Make sure that you have a hot spare configured in your SSD RAID set”
Good Point Stephan!
Great article! Thanks! Question: I have a similar configuration (Dell 610 with H700 and Crucial SSDs in RAID 10). Is there a way to find the Crucial drive firmware version from Dell OpenManage (or any other utility) that can be checked while Windows Server is running (IOW, without requiring a reboot)?
Thanks in advance.
I believe OpenManage or iDRAC should be able to tell you the firmware version of your SSD/HDD.
Thanks for the quick reply! Unfortunately, after checking both options, neither the OpenManage nor the iDRAC interface provides this info.
Furthermore, doing some more research, we found an article saying that this particular controller may not provide this type of detail about the underlying disks:
Copied from the post:
“…According our current knowledge, other Dell controllers (like the H700) does not really provide any access to hard disks, that’s why further details cannot be detected. We’re constantly researching about different ways and of course ideally HD Sentinel would support all possible controllers (if that would only depend on me), but for some controllers, it depends on the manufacturer: if the functions required to access hard disk status are physically missing from the controller, its firmware or drivers, that may prevent detection of any kind of hard disk information…”
Full Post: http://www.hdsentinel.com/forum/viewtopic.php?f=6&t=1567
If anybody has any suggestions on how to obtain this info for this controller, please advise, because after reading about the 5,200-hour issue with these disks, I'm concerned.
Thanks!
Just checked; it is indeed in OpenManage, but not in iDRAC. You need to click the + sign to expand the disk properties. I've uploaded a photo to show this, see Revision: DB08; hope this helps.
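If you prefer the command line over the GUI, the OMSA CLI may show the same field; a rough sketch only, assuming omreport is installed and on the PATH and that its physical-disk output includes a Revision line (I haven't verified this on every OMSA version):

```python
# Sketch: dump physical-disk firmware revisions via the OMSA CLI (omreport).
# Assumes Dell OpenManage Server Administrator is installed and 'omreport' is on the PATH.
import subprocess

result = subprocess.run(
    ["omreport", "storage", "pdisk", "controller=0"],
    capture_output=True, text=True, check=True,
)

for line in result.stdout.splitlines():
    # Print the per-disk ID and firmware revision lines from the report.
    if line.strip().startswith(("ID", "Revision")):
        print(line.strip())
```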
Thank you VERY MUCH for this information! Using the DRAC I was able to see that our drives have the 000F firmware (which I believe is the one after 0309), so I should not have the 5,200-hour issue (although I've seen other forums mention issues related to freezing on reboot). Very much appreciated!
I'll take advantage of the great knowledge and experience of the people behind this post to ask another question: do you have any recent information on the combination of the Dell H700 controller and the Intel S3700 series SSDs? It seems the price of these drives is getting more affordable, so I would like to consider replacing the Crucial SSDs I have now with the Intel S3700 series. Thanks in advance.
Personally, I would go for an Intel SSD any day if the price is affordable.
Btw, when you said you were able to find out the firmware of your SSD from iDRAC, I still couldn't figure out where that option is.
Could you post a screen cap or the steps for how I can obtain that information, please?
Thanks.
Hi there, it's in the same place you showed in your snapshot, under Failure Predicted, showing Revision 000F. I can email you a snapshot if you send me an email or suggest somewhere to upload it. But basically the steps are exactly what you showed in the "photo" in your previous comment. Hopefully I'm understanding your question correctly. Regards.
Sorry… I mixed things up… I'm referring to OpenManage, not to iDRAC. I was not able to try iDRAC because I cannot reboot the server. My apologies. Whenever we get a chance to reboot it, I'll look and see if there is any option showing this data, but if you did not find it, it is likely not there. Sorry for the confusion.
Two years on, how is this working? Have any of the Crucial drives died yet, or are there any other updates you can provide? I'm looking at implementing this exact setup, so thanks for your efforts.
FYI, it's still working great after almost 4 years! LOL
It's 2017; did any of your SSDs fail? How's reliability so far?
Still running with no failures so far, even though the warranty expired 4 years ago.
RAID 1 or RAID 6 is your consumer-grade SSD solution if you are concerned about reliability.