RecoverPoint: Consistency Group falls into error state due to [SYM failed to find snapset]


   Article Number:     530043                                   Article Version: 3     Article Type:    Break Fix 




RecoverPoint, RecoverPoint EX, RecoverPoint CL





A Consistency Group (CG) link enters an error state with [SYM failed to find snapset], which forces the entire group into the "enabled with no transfer" state.
    Error: One or more links of group cg_name are set to replicate snaps and an error occurred in the snap-based replication process. The following errors were received from the storage: Link = cg_name->cg_name_copy, error = [SYM failed to find snapset]   
    Symptoms found in logs:   
    ActiveXioArrayHelper_AO_IMPL::xioRefreshConsistentSnapshotFromDevice_i: xioRefreshConsistentSnapshotFromDevice Failed with res.faultString() = SYM failed to find snapset res.arrayRvCode() = e_API_FAILURE     
      printCommand: methodName = sym.SystemRemoveSnapSet format = ((ssi)(ssi)(ssi)i) numArgs = 10 buffer = (( 0065ff5a961b41979c64b1998bf9xxxx xms 1 )( xxxxxb824ee14c94b5f708ced17f3b85 XIO-HO-C01 1 )( 6fbb339729954axxxxxxxxxx 1 ) 19558 )     
      XioConnection::executeCommand: Command execution fail. methodName = sym.SystemRemoveSnapSet m_client = 0x7f502dxxxxx server = 0x7f5xxxxxxxx URL: this = 0x7f5030165xxxxx

CleanEnvAndReturnRV: Operation failed. rv.faultString() = RPC failed at server. snapset_not_found env.fault_code = -500
XioArrayHelper: RPC failed at server. snapset_not_found, called from function: xioDeleteConsistentSnapshot:3212
    2018/10/17 10:12:50.135 - #1 - 5040/4313 - WorkManager: GroupCopy(206327186 SiteUID(0x228e3ecc2dxxxxxx) 0): Action refreshArrayConsistentSnapshot failed! value.arrayRvCode() = e_API_FAILURE value.errorStrings() = [SYM failed to find snapset]     
      2018/10/17 10:12:50.550 - #2 - 5040/4313 - StateChange: lastComputedPipeTargetsMap. copy=GroupCopy(2063271xxx SiteUID(0x228e3ecc2xxxxxxx) 0) to copy= GlobalCopy(SiteUID(0x31ba0c434a00xxxxx) 0) pipe target=PT_CLOSED, reason=No exposed snap to replicate -> PT_CLOSED, reason=Array error
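As a rough aid for checking whether a collected log shows the same symptoms, the two signature strings from the excerpts above can be searched for programmatically. The following is a minimal sketch only: the function name `find_snapset_errors` is hypothetical, and it operates on lines of text that have already been extracted from the RPA logs by whatever means is available.

```python
# Signature strings copied from the log excerpts in this article.
SIGNATURES = (
    "SYM failed to find snapset",
    "snapset_not_found",
)


def find_snapset_errors(lines):
    """Return (line_number, signature, text) for each line containing a signature.

    `lines` is any iterable of log lines (hypothetical input; how the log
    is obtained is outside the scope of this sketch).
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for signature in SIGNATURES:
            if signature in line:
                hits.append((number, signature, line.strip()))
                break  # report each line once, under the first matching signature
    return hits
```

A match on either string only indicates that the log resembles the excerpts above; it does not by itself confirm the root cause.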






Due to a misalignment between XtremIO and the XMS, RecoverPoint receives previous entries from the XMS when it retrieves the current list of snapshots that need to be discarded.   
    Occasionally an unexpected value is received, and the current snapshot is used for deletion instead of the previous snapshot that needs to be discarded. As a result, the snapshot that is actually in use is deleted.    
    Array calls then begin to fail for the group associated with the deleted snapshot, and the link of the copy goes into the error state [SYM failed to find snapset]. Because the link of the group is down, the entire CG falls into the "enabled with no transfer" error state.






    1. Disable the CG, then re-enable it.   
    2. Change the tweak t_xioPeriodicalSnapCleanupGatherInterval from 600000 to 600000000 (x1000) on all Production RPAs. With this change the cleanup tool runs about once a week instead of every 10 minutes, which reduces the likelihood of encountering the issue.   
    Dell EMC engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.
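
The interval arithmetic behind the tweak change in step 2 can be checked with a short calculation. This sketch assumes the tweak value is expressed in milliseconds, which is inferred from the article's statement that 600000 corresponds to every 10 minutes; the constant and function names are illustrative only.

```python
from datetime import timedelta

# Values quoted in step 2; the millisecond unit is an assumption inferred
# from "600000" being described as a 10-minute interval.
DEFAULT_INTERVAL_MS = 600_000
WORKAROUND_INTERVAL_MS = DEFAULT_INTERVAL_MS * 1000  # 600000000


def interval(ms):
    """Convert a tweak value in milliseconds to a timedelta."""
    return timedelta(milliseconds=ms)


if __name__ == "__main__":
    print("default:   ", interval(DEFAULT_INTERVAL_MS))
    print("workaround:", interval(WORKAROUND_INTERVAL_MS))
```

Under that assumption the new value works out to 6 days, 22 hours and 40 minutes, which is slightly under one week.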






Impacted Configurations:   
    RecoverPoint Classic with the XtremIO Array (XMS 6.1.0-99 and above)