VPLEX: Extent and distributed-device component went to critical-failure state after creating a distributed device

           

Article Number: 534897    Article Version: 3    Article Type: Break Fix

Product:

 

VPLEX VS2, VPLEX VS6, VPLEX Series, VPLEX GeoSynchrony, VPLEX Metro, VPLEX for All Flash, VPLEX GeoSynchrony 5.4 Service Pack 1, VPLEX GeoSynchrony 5.4 Service Pack 1 Patch 1, VPLEX GeoSynchrony 5.4 Service Pack 1 Patch 3

Issue:
After creating a distributed device, the extent and the distributed-device component of the attached device went to a 'critical-failure' state.

Sample Output:

VPlexcli:/> show-use-hierarchy clusters/cluster-*/virtual-volumes/S_OSL_V1V2-PROD_12
storage-view: OSL-V1V2-TEST_Y3 (cluster-2-Y3)
storage-view: OSL-V1V2-TEST_D1 (cluster-1-D1)
  consistency-group: cg_D1 (synchronous)
    virtual-volume: S_OSL_V1V2-PROD_12 (2T, minor-failure, distributed @ cluster-1-D1, running)
      distributed-device: dd_VNX_0914_206_1_vol_1 (2T, raid-1, minor-failure)
        distributed-device-component: device_VNX_0914_206_1_vol_12019Jun06_095143 (2T, raid-0, cluster-1-D1)
          extent: extent_VNX_0914_206_1_vol_1 (2T)
            storage-volume: VNX_0914_206_1_vol (2T)
              logical-unit: VPD83T3:600601601e203900ec02e3ae3f88e911
                storage-array: EMC-CLARiiON-CKM00142500914
        distributed-device-component: device_VNX_1278_111_1_vol_1 (2T, raid-0, critical-failure, cluster-2-Y3) <<<<<
          extent: extent_VNX_1278_111_1_vol_1 (2T, critical-failure) <<<<<
            storage-volume: VNX_1278_111_1_vol (2T)
              logical-unit: VPD83T3:6006016013c03900a73fc8313f88e911
                storage-array: EMC-CLARiiON-CKM00143801278

Cause:

After attaching a mirror leg to an existing device, the extent of the attached device went to 'critical-failure'. This is typically because a rebuild has been initialized and the device is still rebuilding.

This can be confirmed in the firmware logs:
   
128.221.253.37/cpu0/log:5988:W/"0060165465f564526-2":48947:<6>2019/06/06 09:51:46.55: amf/7 Added mirror to amf "device_VNX_0914_206_1_vol_1": added amf "device_VNX_1278_111_1_vol_1" into slot 1
128.221.253.36/cpu0/log:5988:W/"0060165468b17011-2":46412:<6>2019/06/06 09:51:46.56: amf/7 Added mirror to amf "device_VNX_0914_206_1_vol_1": added amf "device_VNX_1278_111_1_vol_1" into slot 1

128.221.252.37/cpu0/log:5988:W/"0060165465f564526-2":48961:<5>2019/06/06 09:59:17.53: amf/21 raid 1 rebuild: device_VNX_0914_206_1_vol_1: child node 1 (device_VNX_1278_111_1_vol_1) rebuild started (full rebuild, rebuild line 2475072 blocks)
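
If you are reviewing the logs from the management server shell, a search such as the following can locate these rebuild messages. The log path shown is the usual location of firmware logs on a VPLEX management server and is given for illustration only; adjust it to your environment:

service@ManagementServer:~> grep "rebuild started" /var/log/VPlex/cli/firmware.log*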
   
   
Check the status of the extent. The extent is marked 'out of date' while the rebuild is in progress.

VPlexcli:/> ll /clusters/cluster-2-Y3/storage-elements/extents/extent_VNX_1278_111_1_vol_1
/clusters/cluster-2-Y3/storage-elements/extents/extent_VNX_1278_111_1_vol_1:

Name                           Value
-----------------------------  ------------------------------------------------
application-consistent         false
block-count                    536870912
block-offset                   0
block-size                     4K
capacity                       2T
description                    -
health-indications             [out of date]   <<<<<
health-state                   critical-failure   <<<<<
io-status                      alive
itls                           0x50001442906ca510/0x500601610860538d/9,
                               0x50001442906ca511/0x500601600860538d/9,
                               0x50001442906ca510/0x500601680860538d/9,
                               0x50001442906ca511/0x500601690860538d/9,
                               0x50001442806c8d11/0x500601600860538d/9,
                               0x50001442806c8d10/0x500601610860538d/9,
                               0x50001442806c8d10/0x500601680860538d/9,
                               0x50001442806c8d11/0x500601690860538d/9,
                               0x50001442906c8d11/0x500601690860538d/9,
                               0x50001442906c8d10/0x500601610860538d/9, ... (16 total)
locality                       -
operational-status             error
storage-volume                 VNX_1278_111_1_vol
storage-volumetype             normal
system-id                      SLICE:206c8db5c53ed089
thin-capable                   false
underlying-storage-block-size  512
use                            used
used-by                        [device_VNX_0914_206_1_vol_1]
vendor-specific-name           DGC

Resolution:

1. Check the status of the rebuild to verify whether it is still running on the device:
        

VPlexcli:/> rebuild status
[1] storage_volumes marked for rebuild

Global rebuilds:
 device                       rebuild type  rebuilder director  rebuilt/total  percent finished  throughput  ETA
 ---------------------------  ------------  ------------------  -------------  ----------------  ----------  -------
 device_VNX_0914_206_1_vol_1  full          s1_6985_spa         1.44T/2T       72.13%            171M/s      57.1min
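
As a rough sanity check, the ETA follows from the other columns: about 0.56T remains (2T - 1.44T), and 0.56T at 171M/s is roughly 587,000M / 171M/s ≈ 3,430 seconds, or about 57 minutes, matching the reported ETA of 57.1min.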
   
   
   
2. Wait for the rebuild to complete. Run the command from step 1 again after the time shown in the ETA column has elapsed to confirm that the rebuild has finished:
        
VPlexcli:/> rebuild status

Global rebuilds:
No active global rebuilds.

Local rebuilds:
No active local rebuilds.
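
Optionally, re-run the extent listing from the Cause section to confirm that the extent itself has recovered. Using the extent name from this example, a healthy result would typically show health-state 'ok' and an empty health-indications list (the expected values here are illustrative):

VPlexcli:/> ll /clusters/cluster-2-Y3/storage-elements/extents/extent_VNX_1278_111_1_vol_1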
   
   
3. Once the rebuild has completed, run the show-use-hierarchy command again to confirm that the critical-failure state has cleared:
        
VPlexcli:/> show-use-hierarchy clusters/cluster-*/virtual-volumes/S_OSL_V1V2-PROD_12
storage-view: OSL-V1V2-TEST_Y3 (cluster-2-Y3)
storage-view: OSL-V1V2-TEST_D1 (cluster-1-D1)
  consistency-group: cg_D1 (synchronous)
    virtual-volume: S_OSL_V1V2-PROD_12 (2T, distributed @ cluster-2-Y3, running)
      distributed-device: dd_VNX_0914_206_1_vol_1 (2T, raid-1)
        distributed-device-component: device_VNX_0914_206_1_vol_12019Jun06_095143 (2T, raid-0, cluster-1-D1)
          extent: extent_VNX_0914_206_1_vol_1 (2T)
            storage-volume: VNX_0914_206_1_vol (2T)
              logical-unit: VPD83T3:600601601e203900ec02e3ae3f88e911
                storage-array: EMC-CLARiiON-CKM00142500914
        distributed-device-component: device_VNX_1278_111_1_vol_1 (2T, raid-0, cluster-2-Y3) <<<<
          extent: extent_VNX_1278_111_1_vol_1 (2T) <<<<
            storage-volume: VNX_1278_111_1_vol (2T)
              logical-unit: VPD83T3:6006016013c03900a73fc8313f88e911
                storage-array: EMC-CLARiiON-CKM00143801278

     
4. If 'rebuild status' shows the rebuilds have completed but 'show-use-hierarchy' still shows the distributed-device component and extent in a 'critical-failure' state, verify the health of the underlying storage-volume, as shown below. If the storage-volume is in 'critical-failure', refer to the Knowledge Base articles 499075, "Storage-Volume in critical-failure", and 495253, "Storage-Volume in critical-failure state due to scsi check condition."
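
For example, the storage-volume from this article's output could be checked with a listing like the following; substitute your own cluster and volume names, and review the health-state, health-indications, and io-status fields in the result:

VPlexcli:/> ll /clusters/cluster-2-Y3/storage-elements/storage-volumes/VNX_1278_111_1_vol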
     
     
     
     
     
If further assistance is required, open a Live Chat with Dell EMC Customer Service and mention this article.