Data Domain: DD3300 All Fans are Failed and Temperature Status is Unavailable

           

   Article Number: 531080    Article Version: 3    Article Type: Break Fix
   

 


Product:

 

Data Domain, DD3300 Appliance

 

Issue:

 

 

The Data Domain Operating System (DDOS) queries iDRAC (integrated Dell Remote Access Controller) to determine fan and temperature status. When this communication is interrupted, fan and temperature status cannot be reliably determined and defaults to FAILED and UNAVAILABLE, respectively. This condition may or may not be posted as an alert, but the status remains in this state until the condition is resolved.
   
    Applies to:   
    DD3300 appliance only   
    DDOS version 6.1.x   
   
     
                                                           

 

 

Cause:

 

 

    The issue is caused by a bug in certain versions of the iDRAC module firmware. It fails to fully provide hardware information to DDOS. As a result, DDOS assumes this hardware is either missing or in a failed state.   
   
   
    Symptoms:   
    All fans show up as FAILED and all components under Temperature show as UNAVAILABLE.   

sysadmin@dd330x# enclosure show all    
Enclosure Show All     
      ------------------     
      This command may take up to a minute to complete. Please wait...     
      Enclosure 1     
              Fans     
                      Description   Level   Status     
                      -----------   -----   ------     
                      Fan 3         low     FAILED     
                      Fan 4         low     FAILED     
                      Fan 5         low     FAILED     
                      Fan 6         low     FAILED
                      -----------   -----   ------
            Temperature     
                      Description                 C/F           Status     
                      -------------------------   -----------   ------     
                      CPU1 Temp                   Unavailable   -     
                      System Board Inlet Temp     Unavailable   -     
                      System Board GPU1 Temp      Unavailable   -     
                      System Board GPU2 Temp      Unavailable   -     
                      System Board GPU3 Temp      Unavailable   -     
                      System Board GPU4 Temp      Unavailable   -     
                      System Board GPU5 Temp      Unavailable   -     
                      System Board GPU6 Temp      Unavailable   -     
                      System Board GPU7 Temp      Unavailable   -     
                      System Board GPU8 Temp      Unavailable   -     
                      System Board Exhaust Temp   Unavailable   -     
                      -------------------------   -----------   ------
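
The symptom above (every fan FAILED and every temperature reading Unavailable) can be checked mechanically against saved `enclosure show all` output. The following is a minimal Python sketch, not a supported tool. It assumes the output has been captured to text and that it follows the column layout shown in this article. The function names `parse_enclosure_status` and `matches_symptom` are hypothetical.

```python
import re

# Section headers as they appear in the sample output in this article.
SECTIONS = {"Fans": "fans", "Temperature": "temps"}

def parse_enclosure_status(output):
    """Collect fan statuses and temperature readings from captured output."""
    fans, temps = {}, {}
    section = None
    for raw in output.splitlines():
        line = raw.strip()
        # Skip blanks, separator rules, and column header rows.
        if not line or line.startswith("-") or line.startswith("Description"):
            continue
        # Columns are separated by runs of two or more spaces.
        cols = re.split(r"\s{2,}", line)
        if len(cols) == 1:
            # A single-column line is a section header (or unrelated text).
            section = SECTIONS.get(line)
        elif section == "fans" and len(cols) == 3:
            fans[cols[0]] = cols[2]    # Description -> Status
        elif section == "temps" and len(cols) == 3:
            temps[cols[0]] = cols[1]   # Description -> C/F reading
    return {"fans": fans, "temps": temps}

def matches_symptom(output):
    """True when all fans are FAILED and all temperatures are Unavailable."""
    status = parse_enclosure_status(output)
    fans, temps = status["fans"], status["temps"]
    return (
        bool(fans) and bool(temps)
        and all(s == "FAILED" for s in fans.values())
        and all(t == "Unavailable" for t in temps.values())
    )

# Excerpt of the output shown in this article.
SAMPLE = """
Enclosure 1
        Fans
                Description   Level   Status
                -----------   -----   ------
                Fan 3         low     FAILED
                Fan 4         low     FAILED
                -----------   -----   ------
        Temperature
                Description                 C/F           Status
                -------------------------   -----------   ------
                CPU1 Temp                   Unavailable   -
                System Board Inlet Temp     Unavailable   -
                -------------------------   -----------   ------
"""

if __name__ == "__main__":
    print(matches_symptom(SAMPLE))
```

A match only indicates that the output fits the pattern described here. A single failed fan or a single unavailable sensor does not match and more likely points to a genuine hardware fault.
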
   
        
    No alerts are posted.
      sysadmin@dd330x# alerts show current
      No active alerts.

    This issue can be confirmed by checking the following log file:

      /ddr/var/log/debug/messages.engineering

    If parsing-related error messages from vulcanmon or SMS of the form “Error parsing ________ info” are found, it is reasonable to assume that this issue is occurring.

    Examples:

      sms: INFO: Error parsing FAN list info
      sms: INFO: Error parsing FAN list info
      vulcanmon: INFO: Error parsing drives info
      vulcanmon: INFO: Error parsing PSU info
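
The log check described above can be scripted. The following is a minimal Python sketch that assumes a copy of `messages.engineering` is available locally (for example, from a support bundle). The regular expression is derived only from the example messages in this article, and the names `find_parse_errors` and `scan_log` are hypothetical.

```python
import re

# Matches messages such as "sms: INFO: Error parsing FAN list info".
# The pattern is inferred from the examples in this article only.
PARSE_ERROR = re.compile(r"(sms|vulcanmon): INFO: Error parsing (.+?) info")

def find_parse_errors(lines):
    """Return (source, item) pairs for each parsing error found in the lines."""
    hits = []
    for line in lines:
        for match in PARSE_ERROR.finditer(line):
            hits.append((match.group(1), match.group(2)))
    return hits

def scan_log(path):
    """Scan a local copy of the log file for parsing errors."""
    with open(path, "r", errors="replace") as handle:
        return find_parse_errors(handle)

if __name__ == "__main__":
    examples = [
        "sms: INFO: Error parsing FAN list info",
        "vulcanmon: INFO: Error parsing drives info",
        "vulcanmon: INFO: Error parsing PSU info",
    ]
    for source, item in find_parse_errors(examples):
        print(source, "could not parse:", item)
```

Any hit from `sms` or `vulcanmon` supports the diagnosis. An empty result suggests that the FAILED and Unavailable statuses have a different cause.
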
                                                             

 

 

Resolution:

 

 

Upgrade operating system:
    This issue is fixed in DDOS 6.2.x. Consider upgrading the operating system to obtain the fix.

    If you are currently experiencing this issue, manually restart iDRAC by following
    the steps below. iDRAC takes about 5 minutes to restart, after which proper hardware
    monitoring is restored. The alerts will clear automatically.
   
   
    Workaround:   
    The issue can typically be mitigated by resetting the system’s iDRAC module:   

         
  1. Log in to the system’s iDRAC GUI.
  2. In the navigation bar, under the Maintenance drop-down menu, click Diagnostics.
  3. Click Reset iDRAC.
  4. Wait 5 minutes for the iDRAC module to reset.
  5. Confirm that the alerts are cleared.