RecoverPoint/XtremIO: Possible integrity issues with RecoverPoint replication of XtremIO storage

           

   Article Number: 531684     Article Version: 3     Article Type: Break Fix
   

 


Product:

 

RecoverPoint,RecoverPoint CL,RecoverPoint EX,RecoverPoint SE,XtremIO X1

 

Issue:

 

 

When a RecoverPoint replication source resides on an XtremIO array, the customer may find inconsistencies in the target copy images.
    This may show up as file system errors during mount operations, missing or outdated files, or database/application errors when accessing the data.
   
    No direct symptoms are expected in RecoverPoint or XtremIO logs.   
   
    Running an integrity check in RecoverPoint may produce events such as:
   
      Time:                 Tue Mar 19 14:31:05 2019   
      Topic:                GROUP   
      Scope:                NORMAL   
      Level:                ERROR   
      Event ID:             4108   
      Cluster:              Site1   
      Global links:         None   
      RPA:                  RPA 3   
      Groups:               [CG1_CG, CG1_Source]   
      Links:                [CG1_CG, CG1_Source->CG1_Target]   
      Volumes:              [CG1_CG, CG1_Target, RSet0]   
      Summary:              Replication integrity issue detected   
      Details:              The data in the following blocks differ :(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565744)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565745)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565746)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565747)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565748)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565749)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565750)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565753)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565754)(gUDeviceID=0xXXXXXXXXXXXXXXXXX,offset=950565755)   
      More information:     A possible replication integrity issue has been detected.
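
The Details field of event 4108 packs every differing block into one long line, which is hard to read. The following is a minimal Python sketch, not a RecoverPoint tool, that groups the offsets by device and collapses consecutive offsets into ranges. The function names and the regular expression are assumptions based only on the sample event above, and the shortened device ID in the sample string is a placeholder. The unit of the offset values is not stated in this article, so the sketch treats them as opaque integers.

```python
import re
from collections import defaultdict

# One "(gUDeviceID=<id>,offset=<n>)" entry, as seen in the sample event.
# The ID pattern also accepts the masked "0xXXXX..." form used in this article.
ENTRY = re.compile(r"\(gUDeviceID=(0x[0-9A-Za-z]+),offset=(\d+)\)")


def parse_details(details):
    """Return {device_id: sorted offsets} from an event 4108 Details string."""
    by_device = defaultdict(list)
    for device_id, offset in ENTRY.findall(details):
        by_device[device_id].append(int(offset))
    return {dev: sorted(offs) for dev, offs in by_device.items()}


def contiguous_ranges(offsets):
    """Collapse sorted offsets into inclusive (first, last) runs."""
    ranges = []
    for off in offsets:
        if ranges and off == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], off)  # extend the current run
        else:
            ranges.append((off, off))          # start a new run
    return ranges


# Shortened sample with a placeholder device ID.
sample = (
    "The data in the following blocks differ :"
    "(gUDeviceID=0xXXXX,offset=950565744)"
    "(gUDeviceID=0xXXXX,offset=950565745)"
    "(gUDeviceID=0xXXXX,offset=950565746)"
    "(gUDeviceID=0xXXXX,offset=950565753)"
)

for device, offsets in parse_details(sample).items():
    print(device, contiguous_ranges(offsets))
# prints: 0xXXXX [(950565744, 950565746), (950565753, 950565753)]
```
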
                                                           

 

 

Cause:

 

 

   

      The RecoverPoint replication cycle uses the DIFF SCSI command to compare two Snapsets and identify the data modifications that should be replicated to the DR site.
      RecoverPoint (RP) sends the DIFF SCSI command to the XtremIO cluster, including the two short IDs (also known as SNIDs) of the Snapsets to be compared.
      The XtremIO cluster software uses a metadata table to map the SNIDs to the internal IDs of the snapshot volumes.
      A software issue was found where a SNID may be mapped to an internal snapshot ID that is not a member of the Snapset.
      As a result, when a DIFF command is sent, the incorrect mapping produces an improper reply to RP, and incorrect data is replicated.
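
The effect of a wrong SNID mapping can be illustrated with a toy model. The Python sketch below is purely hypothetical: the snapshot names, the dictionary-based "metadata table", and the `diff` function are invented for illustration and do not reflect XtremIO internals or the real DIFF SCSI command. It only shows how comparing against the wrong snapshot can cause a changed block to be left out of the reply while an unchanged block is reported instead.

```python
# Toy model only: every name and structure here is hypothetical.

# Each snapshot is modelled as {block_offset: content}.
snapshots = {
    "snap_prev":  {0: "a", 1: "b", 2: "c"},
    "snap_curr":  {0: "a", 1: "B", 2: "c"},  # block 1 changed since snap_prev
    "snap_other": {0: "A", 1: "B", 2: "c"},  # snapshot outside the Snapset
}


def diff(snid_table, snid_old, snid_new):
    """Return offsets whose content differs between the two mapped snapshots."""
    old = snapshots[snid_table[snid_old]]
    new = snapshots[snid_table[snid_new]]
    return sorted(off for off in new if old.get(off) != new[off])


correct_table = {1: "snap_prev", 2: "snap_curr"}
wrong_table = {1: "snap_other", 2: "snap_curr"}  # SNID 1 maps to the wrong snapshot

print(diff(correct_table, 1, 2))  # prints [1]: the block that really changed
print(diff(wrong_table, 1, 2))    # prints [0]: block 1 is missed, block 0 is reported instead
```

In this model the replica would never receive the new content of block 1, which matches the kind of silent divergence that only an integrity check reveals.
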

                                                             

 

 

Change:

 

 

None                                                           

 

 

Resolution:

 

 

   

      Verification:     
     
      To verify that the issue exists, run an integrity check on the Consistency Group.
      Use the command "start_integrity_check group={CG Group Name} copy={Target Copy Name}" to start an integrity check.
      The integrity check can then be monitored with the command "show_integrity_check_status group={CG Group Name} copy={Target Copy Name}".
      If the integrity check fails, an event with ID 4108 is logged, as shown above.
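
When several Consistency Groups need to be checked, it can help to generate the two command lines rather than typing them. The following minimal Python sketch only builds the command strings quoted above. The helper name is hypothetical, the group and copy names are taken from the sample event as examples, and the sketch does not connect to RecoverPoint or handle names that need quoting.

```python
def integrity_check_commands(group, copy):
    """Build the two CLI command strings quoted in this article."""
    args = f"group={group} copy={copy}"
    return {
        "start": f"start_integrity_check {args}",
        "status": f"show_integrity_check_status {args}",
    }


# Example names from the sample event; replace with the real group and target copy.
commands = integrity_check_commands("CG1_CG", "CG1_Target")
print(commands["start"])   # start_integrity_check group=CG1_CG copy=CG1_Target
print(commands["status"])  # show_integrity_check_status group=CG1_CG copy=CG1_Target
```
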
     
      Workaround:     
     
      Once the issue is identified, the only way to clear it is to run a full sweep.
      Disable and then re-enable the Consistency Group to force a full sweep.
     
      Resolution:     
     
      Dell EMC engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.