Dell EMC Unity: How to safely replace a disk in a Dynamic Pool (User Correctable)

           

   Article Number: 532192    Article Version: 3    Article Type: How To
   

 


Product:

 

Dell EMC Unity Family, Dell EMC Unity All Flash, Dell EMC Unity 300F, Dell EMC Unity 350F, Dell EMC Unity 380F, Dell EMC Unity 400F, Dell EMC Unity 450F, Dell EMC Unity 480F, Dell EMC Unity 500F, Dell EMC Unity 550F, Dell EMC Unity 600F, Dell EMC Unity 650F

 

Instructions:

 

 

In the GUI, when a disk faults, the Dynamic Pool reports a status of "Degraded" (as shown in the figure below) with the messages "The pool performance is degraded. Check the storage system for hardware faults. Contact your service provider." and "A pool is rebuilding because it lost a drive. System performance may be affected during the rebuilding."
   
    If unbound drives are available, the array claims an unbound drive and rebuilds the faulted drive's data onto it. If no unbound drives are available, the array rebuilds to the spare extents within the Dynamic Pool itself.
   
    NOTE: The GUI can take up to 10 minutes to reflect the Pool in the degraded state due to GUI polling and browser caching.     
     
      User-added image
   
   
   
    With Free Drives in the array to rebuild to:     
   
    During the rebuild process, the initial drive that faulted displays a yellow "warning" error under My Dynamic Pool Properties.   
   
    Once the array has identified an unbound drive of the proper type, it starts a permanent copy operation to the new drive.

    During this process, the pool disk properties are updated to show the new drive that is replacing the faulted drive.

    NOTE: This does not indicate that the copy has completed. It only indicates that the operation has started.
   
    During the rebuild operation, the pool indicates a degraded status. The pool updates its status back to "OK" once the rebuild has completed.   
   
    User-added image   
   
    To verify that the rebuild has completed, establish an SSH session to the array and run the following command, filtering the system log for the pool name:

    Command >> 15:57:32 service@spb spb:~/user> cat /EMC/backend/log_shared/EMCSystemLogFile.log | grep -i "<pool name>"

    Review the output of that command to see the logged entries for the pool's rebuild and its subsequent statuses.
   
    Example of the output:

    service@spb spb:~/user> cat /EMC/backend/log_shared/EMCSystemLogFile.log | grep -i "My Dynamic Pool"

    "2019-04-10T15:16:12.574Z" "spb@APMx" "Kittyhawk_safe" "25230" "unix/spb/root" "WARN" "1:12d4602" :: "RAID protection for Storage Pool My Dynamic Pool is degraded. Please resolve any hardware problems. Internal information only: Pool OID 0x300000005." :: Category=System Component=mlu TimeZone=UTC

    "2019-04-10T15:17:13.265Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "WARN" "14:6032d" :: "Storage pool My Dynamic Pool is degraded." :: Category=User Component=Health TimeZone=UTC

    "2019-04-10T15:21:21.551Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "WARN" "14:60343" :: "Storage pool My Dynamic Pool is rebuilding due to the loss of a drive." :: Category=User Component=Health TimeZone=UTC

    "2019-04-10T15:41:48.007Z" "spb@APMx" "Kittyhawk_safe" "25230" "unix/spb/root" "INFO" "1:12d0508" :: "RAID protection has been upgraded for Storage Pool My Dynamic Pool. Internal information only. Pool OID 0x300000005." :: Category=System Component=mlu TimeZone=UTC

    "2019-04-10T15:42:16.938Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "INFO" "14:60344" :: "Storage pool My Dynamic Pool has finished rebuilding." :: Category=User Component=Health TimeZone=UTC

    "2019-04-10T15:42:16.963Z" "spb@APM" "Neo_CEM" "23864" "N/A" "INFO" "14:60326" :: "Storage pool My Dynamic Pool is operating normally" :: Category=User Component=Health TimeZone=UTC

   
    The log shows the pool's RAID protection becoming degraded when the drive first faults, followed by the pool's health status being updated to degraded as well.

    Once the log shows the pool as "finished rebuilding", followed by "operating normally", the faulted drive can be replaced.
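
    The check described above can be scripted against the log file (or a saved copy of it). The following is a minimal sketch, not a Dell EMC tool: the function name pool_rebuild_done and its arguments are hypothetical, and it assumes that each log entry occupies a single line and uses the message wording shown in the example output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Unity service tools).
# Usage: pool_rebuild_done <log file> <pool name>
# Exit status 0 only when the log contains "has finished rebuilding" for the
# pool, then a later "is operating normally" entry, with no newer
# "is degraded" entry after them. Assumes one log entry per line.
pool_rebuild_done() {
    awk -v pool="$2" '
        BEGIN { prefix = "storage pool " tolower(pool) " " }
        {
            line = tolower($0)
            if (index(line, prefix "is degraded")) {
                finished = 0; normal = 0      # a new fault resets the state
            } else if (index(line, prefix "has finished rebuilding")) {
                finished = 1
            } else if (finished && index(line, prefix "is operating normally")) {
                normal = 1
            }
        }
        END { exit !(finished && normal) }
    ' "$1"
}
```

    For example, pool_rebuild_done /EMC/backend/log_shared/EMCSystemLogFile.log "My Dynamic Pool" && echo "rebuild complete" prints the message only when both entries are present in that order. Treat the result as a supplement to the pool status shown in the GUI, not a replacement for it.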
   
   
    Without Free Drives in the array to rebuild to:     
   
    Once the Dynamic Sparing process is completed, the GUI shows a pool status of "OK The component is operating normally. No action is required" (shown in the figure below). The pool currently has a reduced amount of spare space. If the pool is currently rebuilding due to a lost drive, the rebuild completes, but there may not be enough space for subsequent failures. Replace the faulted drive or add a drive of the same type and size or larger to the system.
   
    User-added image   
   
    The Dynamic Pool rebuild can be verified with the same command as before, but the output is different. This time, an entry stating "Storage pool <Pool Name> does not have enough spare space." appears. This indicates that no free drives are available and that the array is rebuilding the faulted drive to free extents in the Dynamic Pool, as designed.
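
    The presence of that spare-space entry can also be tested directly. This is a minimal sketch under stated assumptions: the function name rebuild_used_spare_extents is hypothetical, and the match relies on the message wording quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Unity service tools).
# Usage: rebuild_used_spare_extents <log file> <pool name>
# Exit status 0 when the log contains the "does not have enough spare space"
# entry for the pool; non-zero otherwise.
rebuild_used_spare_extents() {
    # -i: ignore case, -F: fixed string (no regex), -q: quiet, status only
    grep -i -F -q "Storage pool $2 does not have enough spare space" "$1"
}
```

    A zero exit status suggests the rebuild went to spare extents inside the pool, so the pool's spare space stays reduced until the faulted drive is replaced or a suitable drive is added.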
   
    Example of the output:

    15:57:32 service@spb spb:~/user> cat /EMC/backend/log_shared/EMCSystemLogFile.log | grep -i "My Dynamic Pool"

    "2019-04-13T18:39:06.846Z" "spb@APMx" "Kittyhawk_safe" "25230" "unix/spb/root" "WARN" "1:12d4602" :: "RAID protection for Storage Pool My Dynamic Pool is degraded. Please resolve any hardware problems. Internal information only: Pool OID 0x300000005." :: Category=System Component=mlu TimeZone=UTC

    "2019-04-13T18:40:02.213Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "WARN" "14:6032d" :: "Storage pool My Dynamic Pool is degraded." :: Category=User Component=Health TimeZone=UTC

    "2019-04-13T18:44:10.475Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "WARN" "14:60345" :: "Storage pool My Dynamic Pool does not have enough spare space." :: Category=User Component=Health TimeZone=UTC

    "2019-04-13T18:45:12.766Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "WARN" "14:60343" :: "Storage pool My Dynamic Pool is rebuilding due to the loss of a drive." :: Category=User Component=Health TimeZone=UTC

    "2019-04-13T19:04:22.047Z" "spb@APMx" "Kittyhawk_safe" "25230" "unix/spb/root" "INFO" "1:12d0508" :: "RAID protection has been upgraded for Storage Pool My Dynamic Pool. Internal information only. Pool OID 0x300000005." :: Category=System Component=mlu TimeZone=UTC

    "2019-04-13T19:05:15.701Z" "spb@APMx" "Neo_CEM" "23864" "N/A" "INFO" "14:60344" :: "Storage pool My Dynamic Pool has finished rebuilding." :: Category=User Component=Health TimeZone=UTC
   
    At this point, the faulted Storage Pool Drive still shows a "warning" error. Once the pool rebuild has been confirmed and the pool shows "OK The component is operating normally. No action is required.", the faulted drive can safely be replaced. After the disk has been replaced, allow the pool to rebuild back to the replacement drive and wait until its status shows only "OK".
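
    While waiting for the pool to return to "OK" after the replacement, it can help to see only the most recent health message logged for the pool. The sketch below is illustrative: the function name last_pool_health_message is hypothetical, and it assumes that each entry is on one line with the message text between ':: "' and '" ::', as in the example output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Unity service tools).
# Usage: last_pool_health_message <log file> <pool name>
# Prints the message text of the most recent Component=Health entry that
# mentions the pool. Prints nothing if there is no such entry.
last_pool_health_message() {
    grep -i -F "$2" "$1" |
        grep -F 'Component=Health' |
        tail -n 1 |
        sed -n 's/.*:: "\(.*\)" ::.*/\1/p'
}
```

    Rerunning the function after the drive swap shows whether the newest entry is still a degraded or rebuilding message. The pool status in the GUI remains the reference for deciding that the pool is back to "OK".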