ASUPs and command output for "disk show performance" report "nothing there", instead of the requested disk stats

           

   Article Number:     534212                                   Article Version: 3     Article Type:    Break Fix 
   

 


Product:

 

Data Domain

 

Issue:

 

 

All DDOS versions include a mechanism that collects I/O performance data for all installed disks at regular intervals. This information is persisted on disk and maintained in an internal database, so CLI commands such as "disk show performance" (which is run, for example, while generating an ASUP) produce stats like the ones below:
        

# disk show performance    
   
Disk                      Read                             Write                              Read+Write
(enc/disk)   KiB/sec IOPs  Resp(ms) Ops >1s   KiB/sec IOPs  Resp(ms) Ops >1s   MiB/sec  IOPs   Resp(ms)  Random    Busy
----------   ------------------------------   ------------------------------   ----------------------------------------
1.1          0       0     0.65     0         460     44    6.81     0            0.449 44     6.81      14.23%   3.76%
1.2          0       0     1.22     0         460     44    6.76     0            0.449 44     6.76      14.23%   3.74%
1.3          0       0     0.98     0         459     44    6.90     0            0.448 44     6.90      14.24%   3.80%
...
   
This information is useful for investigating DD performance problems. Customers can also use it to understand the disk I/O utilization of their storage, for example to anticipate when the workload on the DD may be too high for the installed hardware. Depending on the circumstances, they can then add more disk spindles, move to a larger DD, or work with DD Support on any possible speedups.
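As an illustration of how the output above could be used to watch disk utilization, the following is a minimal Python sketch that parses saved "disk show performance" text. It assumes each data row has the 14 whitespace-separated columns shown in the sample; the function names, dictionary keys, and the threshold value are hypothetical and not part of DDOS.

```python
import re

def parse_disk_performance(text):
    """Parse saved 'disk show performance' output into {disk_id: stats}.

    Assumption: data rows start with an enclosure.disk id such as "1.1"
    and carry 14 whitespace-separated columns, as in the sample output.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 14 or not re.fullmatch(r"\d+\.\d+", fields[0]):
            continue  # skip headers, separators, and the echoed command
        stats[fields[0]] = {
            "read_kib_s": float(fields[1]),
            "write_kib_s": float(fields[5]),
            "rw_resp_ms": float(fields[11]),
            "busy_pct": float(fields[13].rstrip("%")),
        }
    return stats

def busiest_disks(stats, threshold_pct):
    """Return the ids of disks whose Busy value is at or above the threshold."""
    hits = [d for d, s in stats.items() if s["busy_pct"] >= threshold_pct]
    # Sort numerically by (enclosure, disk) so "1.10" comes after "1.2".
    return sorted(hits, key=lambda d: tuple(int(p) for p in d.split(".")))

SAMPLE = """\
1.1          0       0     0.65     0         460     44    6.81     0            0.449 44     6.81      14.23%   3.76%
1.2          0       0     1.22     0         460     44    6.76     0            0.449 44     6.76      14.23%   3.74%
1.3          0       0     0.98     0         459     44    6.90     0            0.448 44     6.90      14.24%   3.80%
"""

if __name__ == "__main__":
    # The 3.75 threshold is arbitrary and only chosen to split the sample rows.
    print(busiest_disks(parse_disk_performance(SAMPLE), 3.75))  # ['1.1', '1.3']
```

What counts as a high Busy value depends on the system and workload, so any threshold is a local choice rather than something this article defines.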
   
A daemon named "ssm" is in charge of collecting these stats and returning them on request. A defect in some releases of DDOS 6.x allows certain storage events to confuse the "ssm" service, which then stops providing stats altogether until the process is manually restarted. However, this process is not user accessible, so only Support can apply the workaround.
   
As a result of the defect, the "disk show performance" output will be missing, including in the "Disk Show Performance" section of ASUPs, and will look like this:

# disk show performance
nothing there
   
Also, when trying to get the stats through SNMP, the SNMP client querying the DD SNMP agent for the details (diskPerformance SNMP OID) would report:
        
[SNMP client] # snmpwalk -v2c -c <string> DD .1.3.6.1.4.1.19746.1.6.2
SNMPv2-SMI::enterprises.19746.1.6.2 = No Such Object available on this agent at this OID
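Both symptoms described above can be checked automatically against captured output, for example in a monitoring script. The following is a minimal Python sketch under that assumption; it only inspects text that has already been collected, and the function names are hypothetical.

```python
# Strings taken from the symptom examples in this article.
NOTHING_THERE = "nothing there"
NO_SUCH_OBJECT = "No Such Object available on this agent at this OID"

def cli_stats_missing(cli_output):
    """True if 'disk show performance' output contains only the
    'nothing there' message instead of the stats table."""
    for line in cli_output.splitlines():
        if line.strip().lower() == NOTHING_THERE:
            return True
    return False

def snmp_stats_missing(snmpwalk_output):
    """True if an snmpwalk of the diskPerformance OID returned
    the 'No Such Object' response shown in this article."""
    return NO_SUCH_OBJECT in snmpwalk_output

if __name__ == "__main__":
    cli = "# disk show performance\nnothing there\n"
    snmp = ("SNMPv2-SMI::enterprises.19746.1.6.2 = "
            "No Such Object available on this agent at this OID\n")
    print(cli_stats_missing(cli), snmp_stats_missing(snmp))
```

A positive result only indicates that the output matches the symptom. Other conditions, such as querying the wrong OID over SNMP, can produce the same SNMP response, so the DDOS release should still be checked against the fixed versions.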
                                                             

 

 

Cause:

 

 

DD Engineering has found that the "ssm" process does not receive the expected disk events from the Data Domain OS kernel, so it initializes its internal disk list to zero and never updates it until the process is restarted.

The unexpected events that trigger this issue are duplicates re-sent by another internal process called "udevd".

The fix therefore hardens the "ssm" process so that it also handles these duplicate events coming from "udevd".
   
     
                                                           

 

 

Resolution:

 

 

This defect affects code branches DDOS 6.0 and 6.1 (DDOS 6.2 is not affected), and has been fixed in the following releases:   

         
  •         DDOS 6.0.2.50 and later
  •         DDOS 6.1.2.40 and later
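The release list above can be turned into a simple check. The following is a minimal Python sketch that compares a DDOS version string against the fixed releases named in this article. The function names and status strings are hypothetical, and ignoring a trailing "-build" suffix is an assumption about how the version string may be reported.

```python
# First fixed release per affected branch, as listed in this article.
FIXED_IN = {
    (6, 0): (6, 0, 2, 50),
    (6, 1): (6, 1, 2, 40),
}

def parse_version(version):
    """Turn '6.1.2.40' (optionally with a '-build' suffix) into a tuple of ints."""
    numeric = version.strip().split("-")[0]  # assumption: drop any build suffix
    return tuple(int(part) for part in numeric.split("."))

def ssm_fix_status(version):
    """Classify a DDOS version against the fixed releases.

    Returns "fixed", "below-fix", or "branch-not-listed". A "below-fix"
    result means the release predates the fix on an affected branch; the
    article states that only some such releases show the defect.
    """
    parsed = parse_version(version)
    fixed = FIXED_IN.get(parsed[:2])
    if fixed is None:
        return "branch-not-listed"
    return "fixed" if parsed >= fixed else "below-fix"

if __name__ == "__main__":
    for v in ("6.0.2.30", "6.0.2.50", "6.1.2.40", "6.2.0.10"):
        print(v, ssm_fix_status(v))
```

Tuple comparison is used so that, for example, 6.0.2.5 is correctly treated as older than 6.0.2.50, which a plain string comparison would not guarantee.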
   
If a customer cannot immediately upgrade to a fixed release, a temporary workaround can restore the stats. It requires a brief remote session with Data Domain Support, with no downtime, because it needs restricted BASH access. Note that the workaround is temporary: it consists of restarting the "ssm" process, so if the process later receives duplicate disk events again, it will return to the same state and disk stats will again fail to be produced.