VMAX3/VMAX All Flash: XCOPY commands can slow/hang when Reserved Capacity is reached


VMAX3 Series





When the available capacity of a Storage Resource Pool (SRP) in a VMAX3 or VMAX All Flash array falls to the Reserved Capacity value, active or new extent copies (XCOPY) can take a long time to complete and may cause VMware tasks to time out.








The root cause is the SRP running low on available capacity, to the point where the Reserved Capacity is reached. The Reserved Capacity is set at 10% by default, so when the SRP exceeds 90% used capacity, extent copy sessions (along with replication in general) are impacted.
    Note that the Reserved Capacity setting is a safety measure designed to prevent copy operations like XCOPY, SnapVX, Clone, and RDF/A DSE from using up all of the available capacity of an SRP. When the limit is reached, only host writes are able to allocate the remaining free capacity. Without the Reserved Capacity, the SRP could fill completely, which would effectively halt the array.
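
    As a rough illustration of the arithmetic above, the following Python sketch checks whether an SRP has crossed the Reserved Capacity threshold. The function name, its parameters, and the sample capacities are hypothetical and are not part of any Dell EMC tool. It also assumes the limit counts as reached when used capacity is at or above 100% minus the Reserved Capacity percentage; actual array behavior at the exact boundary may differ.

```python
def srp_reserved_status(usable_tb: float, used_tb: float,
                        reserved_pct: float = 10.0) -> dict:
    """Report where an SRP stands relative to its Reserved Capacity.

    Hypothetical helper for illustration only. reserved_pct defaults
    to 10, the array default stated in this article.
    """
    if usable_tb <= 0:
        raise ValueError("usable_tb must be positive")
    used_pct = 100.0 * used_tb / usable_tb
    # Copy operations are impacted once used capacity passes this point.
    threshold_pct = 100.0 - reserved_pct
    # Capacity (TB) still allocatable before the threshold; negative if past it.
    headroom_tb = usable_tb * threshold_pct / 100.0 - used_tb
    return {
        "used_pct": used_pct,
        "threshold_pct": threshold_pct,
        "headroom_tb": headroom_tb,
        # Assumption: "reached" means used >= threshold.
        "reserved_reached": used_pct >= threshold_pct,
    }


if __name__ == "__main__":
    # Made-up figures: a 100 TB SRP with 92 TB used.
    print(srp_reserved_status(100.0, 92.0))        # default 10% reserve
    print(srp_reserved_status(100.0, 92.0, 5.0))   # reserve lowered to 5%
```

    With these made-up figures, the SRP is past the default 90% threshold but still under a 95% threshold, which is why either freeing space or lowering the Reserved Capacity changes whether copy sessions are affected.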






A code resolution is currently being developed to change the behavior of XCOPY sessions in this situation. In the meantime, there are manual steps that can be taken to resolve the issue.
    The primary resolution to the issue is to increase the available space in the SRP so that the Reserved Capacity is not breached. In a VMware environment this might include running manual UNMAP against existing datastores to free space on the array.    
    If space cannot be found, the recommendation is to disable XCOPY on the VMware ESXi hosts so that host copy is used instead. XCOPY can be re-enabled after space is made available. Details on how to disable XCOPY dynamically can be found in this white paper:  http://www.emc.com/collateral/hardware/white-papers/h8115-vmware-vstorage-vmax-wp.pdf   

  • If arrays from multiple vendors or product lines are presented to the same host (e.g. XtremIO, VNX, VMAX) and disabling XCOPY is not possible, the only recourse is to free/increase space in the SRP or lower the Reserved Capacity as noted below.
    Lowering the Reserved Capacity value may alleviate the issue, but it needs to be done carefully because, as noted, it is there to protect the SRP from completely filling up. If the Reserved Capacity is changed, Dell EMC does not recommend going below 5%. Depending on array activity, it is possible to hit the new limit when the copies start again. If the new limit is hit, do not adjust it lower without consulting Dell EMC Customer Support.
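
    For the host-side steps described above (manual UNMAP and disabling XCOPY), the following Python sketch only builds the esxcli command lines and prints them; it does not execute anything. It assumes an ESXi release where `esxcli storage vmfs unmap` is available and where XCOPY is controlled by the `/DataMover/HardwareAcceleratedMove` advanced setting. The helper names and the datastore label are hypothetical. The white paper referenced above remains the authoritative procedure.

```python
import shlex

# Advanced setting assumed to control XCOPY (hardware-accelerated move) on ESXi.
XCOPY_OPTION = "/DataMover/HardwareAcceleratedMove"


def xcopy_command(enable: bool) -> list[str]:
    """Build the esxcli argv that enables (1) or disables (0) XCOPY."""
    value = "1" if enable else "0"
    return ["esxcli", "system", "settings", "advanced", "set",
            "--int-value", value, "--option", XCOPY_OPTION]


def unmap_command(datastore_label: str) -> list[str]:
    """Build the esxcli argv for a manual UNMAP against one datastore."""
    if not datastore_label:
        raise ValueError("datastore_label must not be empty")
    return ["esxcli", "storage", "vmfs", "unmap", "-l", datastore_label]


if __name__ == "__main__":
    # "DS_VMAX_01" is a placeholder datastore label.
    print(shlex.join(unmap_command("DS_VMAX_01")))
    print(shlex.join(xcopy_command(enable=False)))  # fall back to host copy
    print(shlex.join(xcopy_command(enable=True)))   # re-enable once space is freed
```

    The setting applies to the whole host rather than to a single array, which is why it may not be an option when arrays from other product lines share the host.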