Pivotal Knowledge Base

HowTo - Recover from a DCA v2 Firmware Upgrade Failure

Problem

If a node fails during the DCA v2 firmware upgrade procedure, follow these instructions to recover from the fault and manually apply the system firmware and/or RAID controller firmware update on the nodes that did not complete successfully.

During a firmware upgrade, you might encounter the following error:

"The component BIOS Firmware update <BIOS_version>: Failed, Firmware interface failure (an error occurred when reading or writing to the BMC, setting the update notification, or updating the BMC, FRU, HSC, Intel Local Control Panel, or SDR)"

This type of error indicates that the sp-interface is down and not communicating properly, so the upgrade script cannot push the updated binary files.
The fix for this type of problem requires three steps:

  1. Cold reboot
  2. Reinstall and run the firmware script on the failed node
  3. Reboot the node

Solution

The instructions below cover recovery from System Firmware Upgrade failures and RAID Controller Upgrade failures.

If the DCA v2 firmware update script failed on one or more nodes, the affected nodes must be updated manually. Follow the steps below to rectify the issue.

System Firmware Upgrade Failures

  1. Perform a cold reboot of the BMC (power cycle the system by disconnecting the power cord).
  2. Perform a manual local upgrade of the firmware on the failed node: ssh to the failed node, copy the firmware upgrade binaries from the master node to local storage, and run the upgrade from the console of the failed node.
     a. Prepare a directory for firmware binaries on local storage:
          mkdir -p /tmp/update
          cd /tmp/update
     b. Copy the firmware binary from the master node to local storage:
          scp root@mdw:/tmp/update/dca_firmware_2A01.tgz /tmp/update
              (/tmp/update/dca_firmware_2A01.tgz is the default path to the v2 firmware
               upgrade binary on the master node; adjust this path if it was changed
               during initial preparation for the upgrade)
     c. Extract the firmware binary from the archive:
          tar -zxvf dca_firmware_2A01.tgz
     d. Initiate the firmware upgrade from the extracted folder whose name starts with BIOS:
          cd BIOS*
          flashupdt -u flashupdt.cfg
    
  3. Reboot the affected node.
  4. Once the server is back online, log in to verify that the firmware update completed successfully.
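
Steps 2a through 2d can be collected into a small script to run on the failed node. The following is a minimal sketch, not part of the official procedure: the variable names, the run() helper, and the DRY_RUN switch are assumptions made for illustration, and only the commands themselves come from the steps above. By default it prints the commands without executing them.

```shell
#!/bin/sh
# Sketch of steps 2a-2d. All names below (MASTER, ARCHIVE, WORKDIR, DRY_RUN,
# run, manual_firmware_upgrade) are illustrative assumptions.
MASTER="${MASTER:-mdw}"                                  # master node hostname
ARCHIVE="${ARCHIVE:-/tmp/update/dca_firmware_2A01.tgz}"  # default archive path on the master
WORKDIR="${WORKDIR:-/tmp/update}"                        # local directory on the failed node
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print only, 0 = execute

# Print a command, and execute it only when DRY_RUN is 0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Run the steps in order, stopping at the first command that fails.
manual_firmware_upgrade() {
    run mkdir -p "$WORKDIR" &&
    run cd "$WORKDIR" &&
    run scp "root@${MASTER}:${ARCHIVE}" "$WORKDIR" &&
    run tar -zxvf "$(basename "$ARCHIVE")" &&
    # The glob is expanded after the archive has been extracted.
    run cd BIOS* &&
    run flashupdt -u flashupdt.cfg
}
```

Calling manual_firmware_upgrade with the defaults lists the six commands for review. Setting DRY_RUN=0 executes them, which should only be done from the console of the failed node after the cold reboot in step 1.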

Firmware Update with Monitor Keyboard & USB Stick (if available)
Should the above steps fail to complete the firmware update, try the following steps:

Note: Only perform these steps if the required equipment is available and the manual steps above were not successful. If the required equipment is not available and the issue persists, system replacement is the final step.
Required equipment: External keyboard and monitor (either physical components or virtual through a USB KVM), and a USB stick (128MB or larger) formatted as FAT16 or FAT32.

  1. Download the EFI BIOS update package using the link below.
    https://download.emc.com/downloads/DL56335
    
  2. Extract the contents of the ZIP file to the root directory of the USB stick.
  3. Boot the system into EFI mode (press F6 on startup and select EFI).
  4. Allow the system to boot into EFI mode. The firmware upgrade process will start automatically.
  5. Reboot the system after successful completion.
  6. Once the server is back online, log in to verify that the firmware update completed successfully.
If the above steps do not result in a successful system firmware upgrade, replace the server.
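
If the USB stick is prepared on a Linux workstation, the FAT16/FAT32 requirement and the extraction to the root of the stick can be checked with a short script. This is a rough sketch under stated assumptions: the function names and arguments are hypothetical, and it relies on lsblk reporting both FAT16 and FAT32 as "vfat" and on unzip being installed.

```shell
#!/bin/sh
# Succeeds when the given filesystem type string denotes a FAT filesystem.
# Linux tools such as lsblk and blkid report FAT16 and FAT32 as "vfat".
is_fat_fstype() {
    case "$1" in
        vfat|msdos) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: unpack ZIP_FILE onto the root of the stick mounted at
# MOUNT_POINT, after checking that the partition DEVICE is FAT-formatted.
# Usage: prepare_usb_stick DEVICE ZIP_FILE MOUNT_POINT
prepare_usb_stick() {
    device="$1"
    zip_file="$2"
    mount_point="$3"
    fstype=$(lsblk -no FSTYPE "$device") || return 1
    if ! is_fat_fstype "$fstype"; then
        echo "$device is '$fstype', expected FAT16/FAT32 (vfat)" >&2
        return 1
    fi
    # -o overwrites existing files; -d sets the extraction directory.
    unzip -o "$zip_file" -d "$mount_point"
}
```

The device, ZIP file name, and mount point are supplied by the caller, for example a partition such as /dev/sdb1 mounted at /mnt/usb; those values are placeholders and depend on the workstation.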

RAID Controller Upgrade Failures

There is a small chance that the RAID Firmware upgrade process may fail, making the RAID controller no longer recognizable to the system.  In this case, the system would need to be replaced.  The data and RAID configuration will remain intact on the disks, allowing the configuration to be imported on the replacement system.
 
If the RAID controller does not show the expected RAID firmware version after the system reboots, run the upgrade manually on the affected server.
  1. Perform a manual local upgrade of the firmware on the failed node: ssh to the failed node, copy the firmware upgrade binaries from the master node to local storage, and run the upgrade from the console of the failed node, as detailed in the following steps.
  2. Prepare a directory for the firmware binaries on the local storage:
    [root@sdw1 ~]# mkdir -p /tmp/update
    [root@sdw1 ~]# cd /tmp/update
  3. Copy the firmware binary from the master node to the local storage (Note: /tmp/update/dca_firmware_2A01.tgz is the default path to v2 firmware upgrade binary on the master-node. Adjust this path accordingly if it was changed during initial preparation for the upgrade):
    [root@sdw1 update]# scp root@mdw:/tmp/update/dca_firmware_2A01.tgz /tmp/update
  4. Extract the firmware binary from the archive
    [root@sdw1 update]# tar -zxvf dca_firmware_2A01.tgz
  5. Initiate the firmware upgrade from the specific location (folder starting with BIOS)
    [root@sdw1 update]# cd BIOS02030003_ME20107328_BMC1216038_FRUSDR112/
    [root@sdw1 BIOS...DR112]# CmdTool2 -adpfwflash -f MR59p3.rom -aall
  6. Reboot the node.
  7. Once the server is back online, log in to verify that the RAID Controller update completed successfully.
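
Because the extracted folder name encodes specific version numbers, it may differ from the one shown in step 5. The helper below is a hypothetical sketch (the function name find_rom_dir is an assumption, not part of the firmware package) that locates the BIOS* folder containing the RAID ROM file before the flash command is run.

```shell
#!/bin/sh
# Print the first BIOS* directory under WORKDIR that contains ROM_FILE.
# Usage: find_rom_dir WORKDIR ROM_FILE
find_rom_dir() {
    workdir="$1"
    rom="$2"
    for dir in "$workdir"/BIOS*/; do
        # If nothing matches, the pattern stays literal and this test fails.
        if [ -f "${dir}${rom}" ]; then
            printf '%s\n' "${dir%/}"
            return 0
        fi
    done
    echo "no BIOS* directory under $workdir contains $rom" >&2
    return 1
}
```

For example, running cd "$(find_rom_dir /tmp/update MR59p3.rom)" changes into the matching folder, after which the CmdTool2 command from step 5 can be run. If the function reports that no folder contains the ROM file, the archive was probably not extracted or was extracted to a different location.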
