Understanding VMAX & PowerMax Compression

This is a piece I put together based on a request from the community to explain how compression works on the VMAX: what data gets compressed, what doesn't, and how often this occurs.


Compression allows users to compress data on storage groups and storage resources. The feature is enabled by default and can be turned on and off at the storage group and storage resource pool level. If a storage group is cascaded, enabling compression at the parent level enables compression for each of the child storage groups. The user has the option to disable compression on one or more of the child storage groups if desired.


The VMAX All Flash's Adaptive Compression Engine (ACE) offers capacity savings while maintaining the performance you expect. Space savings is usually the first thought when compression is discussed; however, compression always carries some cost, usually in performance, due to the overhead of actually compressing the data. The Dell EMC Adaptive Compression Engine's design pairs intelligent algorithms with hardware acceleration to minimize that cost. This combination allows the system to maintain a balanced, optimal configuration, delivering efficient capacity savings alongside optimal performance.


This feature was made available in the 5977.945.890 release, which is the minimum code version you need to be running on the All Flash array. All data services offered on the VMAX All Flash array are supported with compression enabled, including local replication (SnapVX), remote replication (SRDF), D@RE and VVols.


The VMAX Adaptive Compression Engine (ACE) comprises four basic tenets:


1. Hardware Acceleration - Each array has multiple hardware compression modules that handle the actual compressing and decompressing of data. The system requirements state that each system has one compression module per director, which equates to two modules per engine. The compression modules are tested, proven components that have been in use for years in the VMAX to support SRDF compression.


2. Optimized Data Placement - This function within the VMAX AFA is always running and is responsible for dynamically changing the compression pools as needed. It generates minimal overhead, similar to how FAST operated on a VMAX3. Data is stored in back-end compression pools based on its compressibility (pools ranging from 8K to 128K).


These compression pools represent actual disk space on multiple solid state drives. Once compressed, data is allocated to these pools. Multiple compression pools may be created in order to build an optimal back end, resulting in a layout of pools that suits the data sent to the system. All data can be compressed, but it does not all compress to the same degree: some data may compress to one size and other data to another. In order to maximize compression efficiency, multiple compression ratios need to be available.


The figure above gives us a good visual representation of how multiple pools handle the various compression ratios as the writes come in.
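As a rough sketch of that placement idea, the snippet below picks the smallest back-end pool whose track size fits a track's compressed size. The pool sizes (8K steps up to 128K) and the function name are illustrative assumptions, not the array's actual pool layout.

```python
# Hypothetical pool sizes in KB; the real array builds its pool layout
# dynamically based on the data it receives.
POOL_SIZES_KB = list(range(8, 129, 8))  # 8K, 16K, ... 128K

def select_pool(compressed_kb: float) -> int:
    """Return the smallest pool track size that fits the compressed data."""
    for size in POOL_SIZES_KB:
        if compressed_kb <= size:
            return size
    # Data that does not compress below 128K lands in the 128K pool.
    return POOL_SIZES_KB[-1]
```

For example, a track that compresses to 10K would be allocated from the 16K pool, while incompressible data stays in the 128K pool.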


3. Activity Based Compression (ABC) - ABC aims to prevent constant compression and decompression of data that is active or frequently accessed. The ABC function marks the busiest data in the SRP to skip the compression flow, regardless of the related storage group's compression setting. This function differentiates busy data from idle or less busy data, and the marked data accounts for at most 20% of the allocations in the SRP. Marking up to 20% of the busiest allocations to skip compression benefits the whole system as well as the end users.


This ensures optimal response time and reduces the overhead that can result from constantly decompressing frequently accessed data. The mechanism used to determine the busiest data does not add CPU load on the system; it is similar to the FAST code used for promoting data in previous code releases. ABC leverages FAST statistics to determine which data sets are the best candidates for compression, maintaining balance across resources and providing an environment with both the best possible compression savings and the best performance. Effectively, this avoids compress and decompress latency for the busiest data, and it reduces the system overhead of compression by focusing it on the best candidate data.


The figure above gives us a visual of how busy data remains uncompressed while idle data goes forward for compression to the pool.
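A minimal sketch of the ABC selection idea, assuming per-extent activity counts like the FAST statistics mentioned above (the extent map and function name are hypothetical, not the array's internal structures):

```python
def mark_abc_skip(activity_by_extent: dict, skip_fraction: float = 0.20) -> set:
    """Return the extents marked to skip compression: the busiest
    allocations, capped at skip_fraction (20% per the ABC description)
    of the total allocation count."""
    budget = int(len(activity_by_extent) * skip_fraction)
    # Sort extents from busiest to least busy by access count.
    busiest = sorted(activity_by_extent, key=activity_by_extent.get, reverse=True)
    return set(busiest[:budget])
```

Everything outside the returned set follows the normal compression flow; the marked extents stay uncompressed while they remain busy.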


4. Fine Grain Data Packing - In the VMAX AFA, using the Adaptive Compression Engine, each 128K I/O is split into four 32K buffers. Each buffer is compressed individually in parallel, maximizing the efficiency of the compression I/O module. The total of the four buffers results in the final compressed size and determines where the data is allocated. Fine Grain Data Packing benefits both the compression functions and the overall performance of the system. The process includes a zero reclaim function that prevents the allocation of buffers containing all zeroes or no actual data. Pairing the zero non-allocation function with Fine Grain Data Packing allows compression to operate very efficiently with minimal cost to performance.


Compressing the 128K I/O as four individual buffers in parallel allows each section to be handled independently, even though they are still part of the initial 128K I/O. In the event that only one or two of the sections need to be updated or read, only that data is decompressed.
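The partial-read behavior can be sketched as follows, using zlib as a software stand-in for the hardware compression module (the function and structure are illustrative, not the array's implementation): only the 32K sections overlapping the requested range are decompressed.

```python
import zlib

def read_sections(sections, offset_kb, length_kb, decompress=zlib.decompress):
    """Read part of a 128K track stored as four independently compressed
    32K sections. Only the sections overlapping the requested range
    [offset_kb, offset_kb + length_kb) are decompressed."""
    first = offset_kb // 32
    last = (offset_kb + length_kb - 1) // 32
    # Decompress only the sections the read actually touches.
    data = b"".join(decompress(sections[i]) for i in range(first, last + 1))
    start = offset_kb * 1024 - first * 32 * 1024
    return data[start:start + length_kb * 1024]
```

A 16K read at offset 40K touches only the second 32K section, so the other three stay compressed.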


This figure represents a 128K write I/O divided into four cache buffers. Each buffer starts at 32K and is compressed individually. The sum of the four sections creates a 64K compressed track. The savings achieved in this example is 2:1, as the 128K I/O is compressed and allocated as a 64K track.
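The packing flow above can be sketched in a few lines, again with zlib standing in for the hardware compression module and a simplified zero-reclaim check (both are illustrative assumptions):

```python
import zlib

BUF = 32 * 1024  # each 128K I/O is split into four 32K buffers

def pack_track(io_128k: bytes, compress=zlib.compress) -> int:
    """Compress each 32K buffer independently and return the total
    compressed size in bytes; that total determines which pool the
    track is allocated from."""
    assert len(io_128k) == 4 * BUF
    total = 0
    for i in range(0, 4 * BUF, BUF):
        buf = io_128k[i:i + BUF]
        if buf == bytes(BUF):
            continue  # zero reclaim: all-zero buffers are never allocated
        total += len(compress(buf))
    return total
```

An all-zero I/O allocates nothing at all, while compressible data yields a total well under the original 128K.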


Managing Compression


Enabling Compression - Compression is enabled by default when provisioning storage using Unisphere for VMAX, Solutions Enabler or the REST API. The Unisphere provisioning wizard includes the compression option as a check box when the storage group is being created. The compression option is available for managed storage groups, that is, storage groups with a storage resource pool (SRP) assigned. If there is no SRP assigned, the compression option is not available.

When compression is enabled, data sent to the system goes through the compression path and is compressed, when possible, to the best-case track size available in the array. The compressed data is allocated to the appropriate compression pool. When using the modify storage group option to enable compression, the data is not immediately compressed: all incoming data is sent through the compression flow, while existing data is compressed when accessed and over time. In parallel, a code function scans data sets looking for data to be compressed and does so when such data is encountered.

Disabling Compression - Disabling compression does not immediately start a decompression process. Just like enabling compression on an existing storage group, the data is decompressed when accessed and over time. In parallel, the code function finds data that should no longer be compressed and decompresses it.


When modifying storage groups, changing the assigned SRP to none automatically disables compression.


Compression Displays with Unisphere for VMAX


There are two compression reporting levels for the savings achieved by ACE: overall system level and storage group level. Overall system compression can be viewed within the Unisphere capacity report; system achieved compression accounts for all data allocated to the system. Storage group achieved compression can be found in a few different views and accounts only for that group's allocations. In addition, there is a compressibility report that provides an estimate of the compression ratio that could be achieved for storage groups where compression is not enabled.


Unisphere Capacity Report


The capacity report presents a system’s efficiency using a few sections. This view shows the system compression ratio as well as the capacity usage.


The figure below represents the array capacity usage in two factors: subscribed capacity and usable capacity. Subscribed capacity represents the total amount of requested front-end host and eNAS capacity plus system-configured capacity such as Guest OS and RecoverPoint (RP) devices. The blue portion of the display includes logical allocated capacity based on a track size of 128K. The usable capacity represents the total physical capacity available, using the pool track size of all enabled data devices (TDATs). The blue portion represents the allocated physical capacity for all front-end hosts and eNAS, as well as internal devices such as Guest OS and RP devices.


The figure below shows the system's current overall compression ratio. The COMPRESSION ENABLED STORAGE percentage represents the total amount of data populating the system where compression is enabled. System compression can also be displayed using Solutions Enabler, as seen in the SYMMETRIX EFFICIENCY display (symcfg -sid xxx list -efficiency, where xxx is the last three digits of the system serial number).



Compressibility Report


The compressibility report provides a list of storage groups that do not have compression enabled. The list displays information specific to each storage group, such as the number of volumes, allocated capacity, used capacity and target compression ratio. The # of volumes is how many devices are in that storage group. The allocated capacity and used capacity reflect how much capacity has actually been written to the system. The target ratio presents the user with a potential compression ratio that could be achieved if compression were enabled.
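The ratios in these reports boil down to logical allocated capacity divided by physical used capacity. A small sketch of that arithmetic, with the N:1 formatting as an illustrative assumption about how the displays present it:

```python
def compression_ratio(logical_allocated: float, physical_used: float) -> str:
    """Express savings as an N:1 string. For example, 128K of host data
    allocated as a 64K compressed track reports 2.0:1."""
    if physical_used == 0:
        return "N/A"  # nothing written yet, so no ratio to report
    return f"{logical_allocated / physical_used:.1f}:1"
```

This matches the worked example earlier in the post, where a 128K I/O compressed into a 64K track yields a 2:1 saving.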


These are the reports taken from Unisphere and SE:


Finally, I would like to include some compression I/O flow models which show how the I/O behaves in different scenarios:










The primary area of focus for any storage administrator is physical storage capacity, and with large amounts of data being produced each year, the need for greater efficiency is critical. The VMAX AFA and ACE enable you to lower your data consumption and in turn deliver savings through a smaller data center footprint, fewer physical drives, and lower power and cooling costs. Finally, it's simple to use: enabling and disabling can be achieved with a single click or command, and the system handles all the work.


I hope you found this post helpful and informative, please let me know if you have any questions.