Cloud unit does not free space after multiple cleaning cycles

           

   Article Number: 530855     Article Version: 2     Article Type: Break Fix
   

 


Product:

 

Data Domain, Data Domain Cloud DR

 

Issue:

 

 

Cloud cleaning does not reduce cloud unit utilization when many pending GC cycles are stuck waiting for background deletes to be processed.

 

 

Cause:

 

 

If cloud unit timeouts occur too frequently, the DD cannot process background deletes. Alerts such as the following are posted:

      EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Timeout was reached
      EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN

 

 

Change:

 

 

This can happen on any DD with cloud tier enabled.

 

 

Resolution:

 

 

The following issues can occur when a cloud unit is disconnected:

    A. DDFS stops responding or panics.
    B. Data movement stops or is suspended.
    C. Cloud cleaning is aborted or does not clean expired data from the cloud tier.
    D. Cloud recall fails, takes a long time, or reports that data is not found.

      The cloud unit does not show a utilization drop after a cleaning cycle if multiple pending cycles are hung. The evidence can be seen in the autosupport:

   

      GC remote delete stats for CP edccloudtier2*:                      Recent       Cumulative       
        Number of delete list containers <cid,offset> processed:              174            1799       
        Number of delete list containers <cid> skipped:                       173            1219       
        Number of regions to delete:                                            0       627732657       
        Number of regions deleted:                                        8236883       215044853       
        Bytes deleted:                                                  623322937273    16683421699021       
        Run time (s):                                                      332316         9324202       
        Deletion rate (region/s):                                              24              23       
        >Pending cycles:                                                       17              17
   

   
    Follow these steps, depending on the scenario:

  •  Check connectivity with the cloud provider:
     •  Check for connection issues on port 443.
     •  Check name resolution of the cloud provider from the DD.
     •  If connectivity between the DD and the cloud tier is fine, check on the DD whether the file system is not responding or has panicked.
     •  Reboot the DD to release any hung processes, restore cloud connectivity, and allow the pending GC cycles to be processed.
  •  Once cloud connectivity is restored, run the commands below on the DD to stop and restart background deletes.
   
      Stop the asynchronous (background) deletes:

      # cloud clean background-delete stop

      Restart the background deletes:

      # cloud clean background-delete start
        
         
  •  Bug #234809: Upgrade the DD to 6.1.2.x if you have more than one cloud unit and are experiencing cloud cleaning issues.
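
The connectivity checks in the steps above (name resolution and port 443) can be sketched in Python. This is a minimal illustration that assumes it runs on an administrative host sharing the DD's network path and DNS servers; it is not a DD OS command. The function names are hypothetical, and the endpoint you pass in should be the host name configured in your cloud profile.

```python
import socket


def resolve_host(host):
    """Return the sorted IP addresses a host name resolves to ([] on failure)."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_reachable(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_endpoint(host, port=443, timeout=5.0):
    """Run both checks and report the results in one dictionary."""
    addresses = resolve_host(host)
    return {
        "host": host,
        "resolved": addresses,
        # Skip the TCP check when the name does not resolve at all.
        "port_open": bool(addresses) and port_reachable(host, port, timeout),
    }

# Example call (hypothetical endpoint name):
#   check_endpoint("objectstore.example.com")
```

An empty "resolved" list points to a name resolution problem, while a resolved name with "port_open" set to False points to a firewall or routing problem on port 443.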
                                                             

 

 

Notes:

 

 

The cloud unit does not show a utilization drop after a cleaning cycle if multiple pending cycles are hung. The evidence can be seen in the autosupport.

    Before resolution:

      GC remote delete stats for CP edccloudtier2*:                      Recent       Cumulative       
        Number of delete list containers <cid,offset> processed:              174            1799       
        Number of delete list containers <cid> skipped:                       173            1219       
        Number of regions to delete:                                            0       627732657       
        Number of regions deleted:                                        8236883       215044853       
        Bytes deleted:                                                  623322937273    16683421699021       
        Run time (s):                                                      332316         9324202       
        Deletion rate (region/s):                                              24              23
        >Pending cycles:                                                       17              17

    After resolution:

      GC remote delete stats for CP edccloudtier2*:                      Recent       Cumulative
        Number of delete list containers <cid,offset> processed:             1285               0
        Number of delete list containers <cid> skipped:                       776               0
        Number of regions to delete:                                            0               0
        Number of regions deleted:                                      165625861               0
        Bytes deleted:                                                  11764930559267          0
        Run time (s):                                                       19897               0
        Deletion rate (region/s):                                            8313               0
        Pending cycles:                                                       1               1
   
    The evidence can be seen by comparing the GC pending cycles across multiple autosupports. In the example below, the pending cycle count grows from 5 on 2018-11-08 to 17 on 2019-02-07, then drops to 1 from 2019-02-08 onward.

      autosupport_2018-11-08.out:Pending cycles:                                                      5               5
      autosupport_2018-11-13.out:Pending cycles:                                                      6               6     
      autosupport_2018-11-24.out:Pending cycles:                                                      7               7     
      autosupport_2018-12-15.out:Pending cycles:                                                      9               9     
      autosupport_2018-12-16.out:Pending cycles:                                                     10              10     
      autosupport_2018-12-28.out:Pending cycles:                                                     11              11     
      autosupport_2018-12-29.out:Pending cycles:                                                     12              12     
      autosupport_2019-01-11.out:Pending cycles:                                                     13              13     
      autosupport_2019-01-18.out:Pending cycles:                                                     14              14     
      autosupport_2019-01-19.out:Pending cycles:                                                     15              15     
      autosupport_2019-02-05.out:Pending cycles:                                                     16              16     
      autosupport_2019-02-07.out:Pending cycles:                                                     17              17     
      autosupport_2019-02-08.out:Pending cycles:                                                      1               1     
      autosupport_2019-02-09.out:Pending cycles:                                                      1               1     
      autosupport_2019-02-10.out:Pending cycles:                                                      1               1
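
The comparison above can be automated. The following is a minimal Python sketch that assumes the input is the output of searching the saved autosupport files for "Pending cycles", in the `file:line` form shown in the listing. The function names and the summary fields are hypothetical, and the sample data is an excerpt of the listing above.

```python
import re

# Matches lines such as:
#   autosupport_2018-11-08.out:Pending cycles:      5      5
LINE_RE = re.compile(
    r"autosupport_(\d{4}-\d{2}-\d{2})\.out:\s*>?\s*Pending cycles:\s+(\d+)\s+(\d+)"
)

# Excerpt of the listing above (date, recent, cumulative).
SAMPLE = """\
autosupport_2018-11-08.out:Pending cycles:     5     5
autosupport_2018-12-15.out:Pending cycles:     9     9
autosupport_2019-01-19.out:Pending cycles:    15    15
autosupport_2019-02-07.out:Pending cycles:    17    17
autosupport_2019-02-08.out:Pending cycles:     1     1
"""


def parse_pending_cycles(text):
    """Return (date, recent, cumulative) tuples sorted by date."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            rows.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return sorted(rows)  # ISO dates sort correctly as strings


def summarize_backlog(rows):
    """Summarize how the 'Recent' pending cycle count evolved over time."""
    if not rows:
        return None
    counts = [recent for _, recent, _ in rows]
    peak = counts.index(max(counts))
    # Steady growth: the count never decreases on the way to the peak.
    growing = all(a <= b for a, b in zip(counts[:peak], counts[1:peak + 1]))
    # First autosupport after the peak with a lower count, if any.
    dropped = next((d for d, c, _ in rows[peak + 1:] if c < counts[peak]), None)
    return {
        "first": rows[0][:2],
        "peak": rows[peak][:2],
        "steady_growth": growing,
        "dropped_on": dropped,
    }
```

A count that grows steadily and never drops ("dropped_on" is None) matches the symptom described in this article: cleaning cycles are queuing up without being processed.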
   
        

      The cloud unit timeout issues can be seen in the logs or in the output of # alerts show history:

   

      INFO: Event posted: m0-1585 (21000631:553649713): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1586 (21000632:553649714): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1587 (21000633:553649715): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1588 (21000634:553649716): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1589 (21000635:553649717): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1591 (21000637:553649719): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1578 (2100062a:553649706): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1579 (2100062b:553649707): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1580 (2100062c:553649708): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1581 (2100062d:553649709): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1582 (2100062e:553649710): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1583 (2100062f:553649711): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN       
        INFO: Event posted: m0-1584 (21000630:553649712): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
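
To gauge how frequent these events are, the log lines can be tallied per cloud unit and cause. The following is a minimal Python sketch that assumes the lines have been saved to text in the format shown above. The function name is hypothetical, and the sample lines are taken from this article.

```python
import re
from collections import Counter

# Captures the cloud unit name and the cause text of an EVT-CLOUD-00001 event.
EVENT_RE = re.compile(
    r"EVT-CLOUD-00001: .*?EVT-OBJ::CloudUnit=(\S+)\s+EVT-INFO::Cause=(.+?)\s*$"
)

SAMPLE_LINES = [
    "EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2."
    "EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Timeout was reached",
    "EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2."
    "EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN",
    "INFO: Event posted: m0-1585 (21000631:553649713): EVT-CLOUD-00001: "
    "Unable to access provider for cloud unit edccloudtier2."
    "EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN",
]


def count_cloud_events(lines):
    """Count EVT-CLOUD-00001 events per (cloud unit, cause) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (unit, cause), total in count_cloud_events(SAMPLE_LINES).most_common():
        print(f"{unit}: {cause}: {total}")
```

A high count for a single cloud unit suggests the repeated disconnections described in the Cause section, and is a reason to check connectivity before restarting background deletes.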