VPLEX: SNMP reports incorrect values for CPU idle percentage 'vplexDirectorCpuIdle'

           

   Article Number: 537837     Article Version: 2     Article Type: Break Fix
   

 


Product:

 

VPLEX VS6, VPLEX Local, VPLEX Metro, VPLEX for All Flash, VPLEX GeoSynchrony 6.0, VPLEX GeoSynchrony 6.0 Patch 1, VPLEX GeoSynchrony 6.0 Patch 2, VPLEX GeoSynchrony 6.0 Service Pack 1, VPLEX GeoSynchrony 6.0 Service Pack 1 Patch 1

 

Issue:

 

 

   

      Impacted Dell EMC VPLEX Hardware:     
      Dell EMC VPLEX Hardware: VS6     
      Dell EMC VPLEX Hardware: VPLEX All Flash     
     
      Impacted Dell EMC VPLEX Software: (see article metadata)     
      Dell EMC Software : GeoSynchrony 6.0.x     
      Dell EMC Software : GeoSynchrony 6.1.x     
     
      Issue:     
      The SNMP value reported for 'vplexDirectorCpuIdle' (OID 1.3.6.1.4.1.1139.21.2.2.3.1.1, "CPU idle percentage") differs from the correct value shown in the CLI and in the VPLEX GUI CPU Utilization graph under the Monitoring tab.
     
      SNMP reports a higher CPU idle percentage than is actually the case, so the director appears less busy than the CPU utilization shown in the CLI and in the GUI CPU Utilization graph under the Monitoring tab.
     
      Note: By design the SNMP stat refers to CPU usage as "% idle", whereas the CLI/GUI refer to CPU usage as "% busy".     
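      When comparing an SNMP reading with the CLI/GUI figure, the two need to be put on the same scale first. The following is a minimal sketch that assumes "% busy" is simply the complement of "% idle" (100 minus idle); the function name is hypothetical and not part of any VPLEX tool.

```python
def idle_to_busy(idle_percent: float) -> float:
    """Convert a CPU idle percentage to a busy percentage.

    Assumption: busy is the complement of idle (busy = 100 - idle).
    """
    if not 0.0 <= idle_percent <= 100.0:
        raise ValueError("idle percentage must be between 0 and 100")
    return 100.0 - idle_percent
```

      With this conversion, an SNMP idle value that is too high translates directly into a busy value that is too low, which is the symptom described in this article.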
       
       
              
     
          
                                                             

 

 

Cause:

 

 

The SNMP statistic for object 'vplexDirectorCpuIdle' (OID 1.3.6.1.4.1.1139.21.2.2.3.1.1), which reports the CPU idle percentage, is hard-coded to report the usage value of CPU#1.
   
    The issue is seen on VS6 because CPU#1 on the VS6 is mostly idle and does not reflect the true usage (unlike CPUs 4 and 10).
   
    VS2 is not impacted, since CPU#1 on VS2 is usually the busiest CPU.
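    The effect of reading a single fixed CPU can be illustrated with a short sketch. The per-CPU idle percentages below are made-up values for illustration only, not measurements, and both function names are hypothetical; the sketch does not reproduce the actual SNMP agent code.

```python
# Made-up per-CPU idle percentages for a VS6-like director (illustration only).
# Key: CPU number, value: idle percentage.
vs6_like_idle = {1: 97.0, 4: 35.0, 10: 40.0}


def idle_from_fixed_cpu(per_cpu_idle, cpu=1):
    """Return the idle percentage of one fixed CPU (here CPU#1)."""
    return per_cpu_idle[cpu]


def idle_of_busiest_cpu(per_cpu_idle):
    """Return the idle percentage of the busiest CPU (lowest idle value)."""
    return min(per_cpu_idle.values())


fixed = idle_from_fixed_cpu(vs6_like_idle)
busiest = idle_of_busiest_cpu(vs6_like_idle)
print(f"CPU#1 idle: {fixed}%  busiest CPU idle: {busiest}%")
```

    In this example the fixed CPU#1 reading shows far more idle time than the CPUs that carry the load, which matches the VS6 behavior described above.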
     
                                                           

 

 

Change:

 

 

CPU#1 on VS2 has a different role than CPU#1 on the VS6.    
     
                                                           

 

 

Resolution:

 

 

   

      Permanent Fix:     
      Engineering is working on a permanent fix, to be included in upcoming releases for the VS6 platform.
     
      Workaround:     
      Until this issue is fixed, do not use the SNMP CPU idle statistic to monitor CPU usage: its values under-report how busy the director actually is.
     
      Note: The complexity of a hard-coded CPU number could be avoided by having the SNMP stats code use the "director.busy" statistic instead.
     
      Alternatives are to monitor the VPLEX GUI performance dashboard visually, to run the "monitor get-stats" command for the director monitor through the VPLEX RESTful API, or to use ViPR SRM Suite / EMC M&R.
     
      Customers can collect a report of all statistics either by reading directly from the directors' perpetual monitors or by creating their own customized monitors, and can poll these monitors at regular intervals to read the JSON data, using the action plan below:
     
      1. Command to get stats from the perpetual monitor:
     
          VPlexcli:/monitoring/directors/director-1-1-A/monitors> monitor get-stats --monitors director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v25/     
     
      2. Steps to create and read from a custom monitor (example for director 1-1-A):
     
         2.1 Create a monitor, specifying the polling period, the monitor name, and the stats to read.
     
               VPlexcli:/monitoring> monitor create --name DirStats --period 10s --director director-1-1-A --stats director.busy,director.fe-ops,director.fe-read,director.fe-write,fe-director.write-avg-lat,fe-director.read-avg-lat,director.be-ops,director.be-read,director.be-write,storage-volume.read-avg-lat,storage-volume.write-avg-lat      
     
         2.2 Create a sink file for the monitor to collect data:     
     
               VPlexcli:/monitoring/directors/director-1-1-A/monitors> monitor add-file-sink --monitor director-1-1-A_DirStats/ -f /var/log/VPlex/cli/director-1-1-A_DirStats.log -o csv
     
         2.3 Read the data     
     
              VPlexcli:/monitoring/directors/director-1-1-A/monitors> monitor get-stats --monitors director-1-1-A_DirStats/     
     
               Make REST calls to the above CLI commands to collect the data and read the stats.
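               As a rough sketch of such a REST call, the following builds (but does not send) a request that runs "monitor get-stats" for a given monitor. The URL layout (/vplex/<command>, with spaces in the command name written as '+'), the Username/Password headers, and the {"args": ...} body are assumptions that should be checked against the VPLEX Element Manager API guide for your release; the host name, credentials, and function name are hypothetical.

```python
import json
import urllib.request


def build_get_stats_request(host, monitor, username, password):
    """Build a POST request that runs 'monitor get-stats' for one monitor.

    Assumptions (verify against the API guide for your release):
      - CLI commands are exposed under https://<host>/vplex/<command>
      - spaces in the command name are written as '+'
      - credentials are passed in 'Username' and 'Password' headers
      - command arguments are passed as {"args": "<argument string>"}
    """
    url = f"https://{host}/vplex/monitor+get-stats"
    body = json.dumps({"args": f"--monitors {monitor}"}).encode("utf-8")
    headers = {
        "Username": username,
        "Password": password,
        "Content-Type": "application/json",
    }
    # The request is only constructed here; nothing is sent over the network.
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical host and credentials, for illustration only.
request = build_get_stats_request(
    "vplex-mgmt.example.com", "director-1-1-A_DirStats/", "service", "secret"
)
print(request.get_method(), request.full_url)
```

               To actually send the request, it could be passed to urllib.request.urlopen() with a suitable timeout and TLS context, and the call repeated at the desired polling interval.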