Do not update firmware/BIOS from within ESX console
Today I learnt this the hard way. With the help of Dell ProSupport, I was able to fix a problem that almost left my PowerEdge R710 unbootable.
- Today I saw that a new BIOS update (FW 2.1.15, released on September 13, 2010) for the PowerEdge R710/R610 is available on Dell’s web site. According to its description it fixes some serious problems that can hang the server, especially with Xeon Westmere 5600 series CPUs, so it is strongly recommended. I upgraded the BIOS on the R610 without any problem, as it runs Windows Server 2008 R2.
- Now the big problem: how do you update the BIOS/firmware on servers running ESX? The answer should be simple. Put the host into Maintenance Mode (VMotion all the VMs off that host first, of course), reboot, press F10 to enter the Unified Server Configurator (USC), set the IP and DNS server addresses, choose Update System, and use FTP to pull the updates directly from Dell’s FTP server. Sounds easy, right? It should be, but what if Dell has not updated the catalog.xml that contains the path to the latest BIOS? That is our situation: today is Oct 2, 2010, and Dell still hasn’t updated that important file, so every R710 shows 2.1.09 as both its installed and its latest available BIOS. What the heck! So you are stuck, with no easy way to update the BIOS. There is no Virtual Floppy in iDRAC6 anymore; if there were, I could simply boot into DOS and then attach another ISO containing the BIOS. Or perhaps I should say that I do not know where to download a bootable DOS image (ISO).
- With the USC method having failed, I booted ESX again, started Googling, and called ProSupport. They suggested running the Linux version of the BIOS update program (the .BIN package) directly from the ESX console. Some sources on Google said this was doable, so I used FastSCP to upload the BIN to /tmp, connected to the server with PuTTY, and ran chmod u+x BIOS.BIN followed by ./BIOS.BIN. After I pressed “q”, it asked whether I wanted to continue with the BIOS update (Y/N). I pressed Y, and after 5 seconds it stopped, saying Update failed!
- Then the “BEAUTY” CAME! When I issued a reboot from vCenter, the server just hung. Watching from the iDRAC6 console, I saw “Waiting for Inventory Collector to finish” followed by a growing row of dots, and only after 20 minutes did the server finally reboot itself. I rebooted it again and it hung again, so this time I used Reset from iDRAC6. Then I found that F10 was no longer available: it said System Service is NOT AVAILABLE! What!!! Dell ProSupport told me to enter the iDRAC configuration with Ctrl+E and set Cancel System Service to YES, which clears the failed state and brings back F10 after you exit iDRAC. THIS IS DEFINITELY NOT GOOD! SOMETHING in the ./BIOS.BIN script MUST HAVE changed my server settings!!!
- I searched Google and luckily found Dell’s official KB: “After OpenManage Server Administrator 6.3 is installed on ESX 4.1, when the system is rebooted, the system may not reboot until the Inventory Collector has completed. A message may be displayed that states ‘Waiting for Inventory Collector to Finish’. The system will not reboot for approximately 15 to 20 minutes. Note: This issue can also affect the Server Update Utility (SUU) and Dell Update Packages (DUPs).” The key to fixing it is the command “chkconfig usbarbitrator off”, which stops usbarbitrator from starting at boot.
- A Dell ProSupport Level 2 engineer told me to run the following commands:
- “chkconfig --list” to show which services are enabled at each runlevel
- “cat /etc/redhat-distro” to show that the service console is actually RHEL 5.0. I then Googled around and found that others had also failed when updating server firmware directly, perhaps because the service console is not fully compatible with standard Red Hat Linux.
- “service usbarbitrator stop” to stop the usbarbitrator service
- “ps aux |grep usb” to confirm that usbarbitrator is no longer running
- finally, “chkconfig usbarbitrator off” to permanently disable the usbarbitrator service.
- Finally I compared the system configuration shown by “chkconfig --list” with that of my other, untouched R710s. The only line that had changed was usbarbitrator 3:on; it should be 3:off!!! So ./BIOS.BIN must have changed that setting before it failed to update the BIOS, and it didn’t roll the change back, so my system configuration was left modified! Dell’s KB 374107 doesn’t mention that the original ESX 4.1 configuration does indeed have usbarbitrator at 3:off!
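The comparison against an untouched host can also be done mechanically rather than by eye. The following is a minimal Python sketch, not something from Dell or VMware. It assumes you have saved the output of “chkconfig --list” from each host as text; the function names parse_chkconfig and diff_services and the sample lines are made up for illustration.

```python
def parse_chkconfig(text):
    """Map service name -> {runlevel: state} from `chkconfig --list` output."""
    services = {}
    for line in text.splitlines():
        fields = line.split()
        # Service lines look like: name 0:off 1:off ... 6:off
        # Anything else (blank lines, xinetd section) is skipped.
        if len(fields) < 2 or ":" not in fields[1]:
            continue
        services[fields[0]] = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
    return services


def diff_services(reference, suspect):
    """Return {service: [(runlevel, reference_state, suspect_state), ...]}."""
    ref, sus = parse_chkconfig(reference), parse_chkconfig(suspect)
    changes = {}
    for name in sorted(set(ref) | set(sus)):
        a, b = ref.get(name, {}), sus.get(name, {})
        delta = [(lvl, a.get(lvl), b.get(lvl))
                 for lvl in sorted(set(a) | set(b))
                 if a.get(lvl) != b.get(lvl)]
        if delta:
            changes[name] = delta
    return changes


# Illustrative sample data only (not captured from a real host).
UNTOUCHED = """\
sshd            0:off 1:off 2:on  3:on  4:on  5:on  6:off
usbarbitrator   0:off 1:off 2:off 3:off 4:off 5:off 6:off
"""
AFFECTED = """\
sshd            0:off 1:off 2:on  3:on  4:on  5:on  6:off
usbarbitrator   0:off 1:off 2:off 3:on  4:off 5:off 6:off
"""

print(diff_services(UNTOUCHED, AFFECTED))
# {'usbarbitrator': [('3', 'off', 'on')]}
```

Each tuple reads as (runlevel, state on the untouched host, state on the affected host), so a single entry for usbarbitrator at runlevel 3 matches what I found by hand.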
Why hasn’t Dell updated the catalog.xml on their FTP servers (both ftp.dell.com and ftp.us.dell.com) when the BIOS has been out for over two weeks? Anyway, I will wait until the end of October and try to update it with USC again.
The following is quoted from the official Dell Update Packages README for Linux:
* Due to the USB arbitration services of VMWare ESX 4.1, the USB devices appear invisible to the Hypervisor. So, when DUPs or the Inventory Collector runs on the Managed Node, the partitions exposed as USB devices are not shown, and it reaches the timeout after 15 to 20 minutes.
This timeout occurs in the following cases:
* If you run DUPs or Inventory Collector on VMware ESX 4.1, the partitions exposed as USB devices are not visible due to the USB arbitration service of VMware ESX 4.1 and timeout occurs.
The timeout occurs in the following instances:
• When you start “DSM SA Shared Service” on the VMware ESX 4.1 managed node, it runs Inventory Collector. To work around this issue, uninstall Server Administrator or wait until the Inventory Collector completes execution before attempting to stop the “DSM SA Shared Service”.
• When you manually try to run DUPs or the Inventory Collector on the VMware ESX 4.1 managed node while USB arbitration service is running. To fix the issue, stop the USB arbitration service and run the DUPs or the Inventory Collector.
To stop the USB arbitration service:
1. Use the “ps aux | grep usb” command to check if the USB arbitration service is running.
2. Use the “chkconfig usbarbitrator off” command to prevent the USB arbitration service from starting during boot.
3. After you stop the usbarbitrator, reboot the server to allow the DUPs and/or the Inventory collector to run.
Note: If you require the usbarbitrator, enable it manually. To enable the usbarbitrator, run the command – chkconfig usbarbitrator on.
Update: April 6, 2012
* The USB arbitration service of VMWare ESX 4.1 makes the USB devices invisible to the Hypervisor. So, when DUPs or the Inventory Collector runs on the MN, the partitions exposed as USB devices are not shown, and it reaches the timeout after 15 to 20 minutes. This timeout occurs in the following cases:
When you start “DSM SA Shared Service” on the VMware ESX 4.1 managed node, it runs the Inventory Collector. While the USB arbitration service is running, you must wait for 15 to 20 minutes for the Inventory collector to complete the execution before attempting to stop this service, or uninstall Server Administrator.
When you manually run the Inventory Collector (invcol) on the VMware ESX 4.1 managed node while the USB arbitration service is running, you must wait for 15 to 20 minutes before the operations end. The invcol output file has the following:
<InventoryError lang="en">
<SPStatus result="false" module="MaserIE -i">
<Message> Inventory Failure: Partition Failure – Attach
partition has failed</Message>
</SPStatus>
<SPStatus result="false" module="MaserIE -i">
<Message>Invalid inventory results.</Message>
</SPStatus>
<SPStatus result="false">
To fix the issue, stop the USB arbitration service and run the DUPs, or Inventory Collector.
Do the following to stop the USB arbitration service:
1. Use ps aux | grep usb to find out if the USB arbitration service is running.
2. To stop the USB arbitration service from starting up at bootup, use chkconfig usbarbitrator off.
3. Reboot the server after stopping the usbarbitrator to allow the DUPs and/or the Inventory collector to run.
If you require the usbarbitrator, enable it manually. To enable the usbarbitrator, run the command – chkconfig usbarbitrator on. (373924)
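If you want to check an invcol output file for this failure without reading the XML by eye, something like the following Python sketch may help. It is only an illustration under assumptions: the README excerpt above is cut off, so the sample below is a hand-completed, well-formed version of it (the closing tags are my guess), and the function name failed_messages is made up.

```python
import xml.etree.ElementTree as ET

# Hand-completed sample based on the excerpt quoted above. The closing
# </InventoryError> tag is an assumption; the README excerpt is truncated.
SAMPLE = """<InventoryError lang="en">
  <SPStatus result="false" module="MaserIE -i">
    <Message> Inventory Failure: Partition Failure – Attach
    partition has failed</Message>
  </SPStatus>
  <SPStatus result="false" module="MaserIE -i">
    <Message>Invalid inventory results.</Message>
  </SPStatus>
</InventoryError>"""


def failed_messages(xml_text):
    """Collect the <Message> text of every <SPStatus result="false"> element."""
    root = ET.fromstring(xml_text)
    messages = []
    for status in root.iter("SPStatus"):
        if status.get("result") != "false":
            continue
        text = status.findtext("Message")
        if text:
            # Collapse line breaks and repeated spaces inside the message.
            messages.append(" ".join(text.split()))
    return messages


for message in failed_messages(SAMPLE):
    print(message)
```

A message containing “Attach partition has failed” is the one the README associates with the USB arbitration service still running.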
I had the same problem; my iDRAC card doesn’t have Internet access.
I got an .flp image containing a DOS image from my Dell contact, and used WinImage to replace the bios.exe on it.
Works for me.
Thanks for leaving a reply. Could you kindly email me the ISO of the DOS image (I’ve sent you a mail from my Gmail account, please check) so that I can boot the R710 from this .iso and then run the BIOS update .exe? (I found that iDRAC6 does not support remote boot from USB or floppy, so why did Dell make a DOS BIOS update available? Huh?)
Btw, even if I am able to boot into DOS, I still haven’t figured out where you run the BIOS update image from. Do you make another ISO, or did you combine the updated BIOS 2.1.15 into the bootable DOS image .iso?
Any details would be greatly appreciated.
Thanks in advance.
You should use the Dell Repo Manager tool (read more about Repo Manager at http://www.delltechcenter.com/page/Repository+Manager).
Repo Manager will allow you to create a “Linux Deployment Media” disk. This is a bootable ISO that you can tailor to contain only the updates you need (usually under 100 MB). You can mount it as virtual media and run the updates over the iDRAC… but keep in mind the risks of running updates over virtual media… if your connection drops in the middle of an update…
You could also boot OMSA live and run the RHEL update package formats of the updates.
The FTP catalog is only updated about once a quarter… I don’t see why it couldn’t be scripted.
Hello Admin,
I don’t know whether you saw the instructions when you ran the .bin update file.
As far as I know, it shows the message below.
============
If you run into the above memory limitation on VMWare ESX, the problem is because the console OS available memory is only 272M by default. Please increase the console OS memory to 800M temporarily and perform the FW update. During ESX server boot up please perform the below steps to increase the memory available.
1. While booting, press ‘e’ on the VMware ESX line (Grub option display screen).
2. Press ‘e’ again on the ‘uppermem=’ line and edit uppermem=819200. Press Enter.
3. Press ‘e’ on the kernel line (Just below the uppermem line). Edit the kernel line with mem=800M and press Enter.
4. Press ‘b’ to boot with these options.
5. Perform the FW update.
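The two values in these steps describe the same amount of memory in different units: mem= is given in megabytes, while GRUB’s uppermem is given in kilobytes (800 x 1024 = 819200). Here is a small Python sketch of that conversion, in case a different size is ever needed. The helper name grub_memory_settings is made up for illustration and is not part of any Dell or VMware tool.

```python
def grub_memory_settings(mem_mb):
    """Return the uppermem value (KB) and kernel mem= option for a size in MB."""
    uppermem_kb = mem_mb * 1024  # GRUB's uppermem is expressed in kilobytes
    return uppermem_kb, f"mem={mem_mb}M"


print(grub_memory_settings(800))
# (819200, 'mem=800M')
```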
AH,
Thanks for pointing out it’s an ESX memory limitation.
Honestly, I never really read the instructions when performing the BIOS update. I thought it would be the same as updating firmware from within a Windows environment while the OS is running. Obviously I was wrong.
Btw, are you Antti?
Just received an update from Virtualization Buster describing the proper way to update firmware from USB via the F10 Unified Server Configurator (USC).
Updating Dell R-Series aka 11-th Generation Servers via USB and Repository Manager
http://www.virtualizationbuster.com/?p=1301