PowerPath for AIX 6.4: powerdd: MpxPeriodicCallbackDaemon race condition crash

           

   Article Number: 537201    Article Version: 3    Article Type: Break Fix
   

 


Product:

 

PowerPath for AIX 6.4

 

Issue:

 

 

Dell EMC SW: PowerPath for AIX 6.4   
   
    CRASH INFORMATION:       
        CPU 16 CSA F00000002FF47600 at time of crash, error code for LEDs: 30000000       
        pvthread+010300 STACK:       
        [F1000000C07A8608]powerdd:PowerEnqueue+000028 (F1000000C04DFEF0, 0000000000000000)       
        [F1000000C07AAC0C]powerdd:PowerGetSemaNew+00020C (F1000000C04DFE90, 0000000100000001)       
        [F1000000C07AAD78]powerdd:PowerGetSema+000018 (F1000000C04DFE90)       
        [F1000000C04D249C]mpxext:MpxGetHostInfo+00009C (F1000000C04DFF60, F1000000C04DC940)       
        [F1000000C049815C]mpxext:MpxDeviceMountStatsRpt+00005C (F1000A03F0900800, 15C06FB653F3BC50)       
        [F1000000C049AA6C]mpxext:MpxPeriodicCallout+001EEC (F1000A03F01C2100)       
        [F1000000C07AD4CC]powerdd:PowerServiceDaemonQ+00010C (F1000A03F06D3E00)       
        [F1000000C07CDCD4]powerdd:PowerServiceDaemonQWrap+000074 (0000000000000000, 0FFFFFFFF3FFFFF0,       
          0000000800000008)       
       
        [00014D70].hkey_legacy_gate+00004C ()       
        [0029B310]procentry+000010 (??, ??, ??, ??)       
        [kdb_real_mem] no real storage @ FFFFFFFFFFF5D50       
        (16)> th       
                       SLOT NAME    STATE   TID PRI  RQ CPUID CL WCHAN       
        pvthread+010300 259*MpxPerio RUN  103006B 03C  16        0       
        NAME................ MpxPeriodicCallbackDaemon       
        FLAGS............... KTHREAD       
        ..       
        (16)> vmlog       
        Most recent VMM errorlog entry       
        Error id              = DSI_PROC       
        Exception DSISR/ISISR = 000000000A000000       
        Exception srval       = 00007FFFFFFFD080       
        Exception virt addr   = 0000000000000008       
        Exception value       = 00000086 EXCEPT_PROT       
        (16)>
   
   
     
                                                           

 

 

Cause:

 

 

PowerPath for AIX 6.4 bug   

      A race condition across multiple function calls in PowerPath 6.4 code leads to a NULL pointer dereference in the MpxPeriodicCallbackDaemon kernel thread, which crashes the system.

                                                             

 

 

Resolution:

 

 

Workaround   
    Disable the "device in use to array report" feature.   
   
    Fix   
    Fixed in PowerPath for AIX 7.0, which is currently available for download from the support site.
                                                           

 

 

Notes:

 

 

The following command can be used to disable the device in use to array report feature:   
    # powermt set dev_inuse_to_array_report=off class=symm
   
    The following command can be used to confirm that the device in use to array report feature is disabled:   
    # powermt display options       
       
                Show CLARiiON LUN names:      true       
       
                Path Latency Monitor: Off       
       
                Performance Monitor: disabled       
       
                Autostandby:  IOs per Failure (iopf): enabled       
                              iopf aging period     : 1 d       
                              iopf limit            : 6000       
       
                Storage       
                System Class  Attributes       
                ------------  ----------       
       
                Symmetrix     periodic autorestore = on       
                              reactive autorestore = on       
                              status = managed       
                              proximity based autostandby = off       
                              auto host registration = enabled       
                              autopath mask counter = enabled       
                              autopath array controlled counter = enabled       
                              device to array performance report = enabled       
                              device in use to array report = disabled
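
The confirmation step above can also be scripted. The following is a minimal sketch, assuming the `powermt display options` output has the "attribute = value" layout shown above. The function name dev_inuse_report_state is a hypothetical helper for illustration and is not part of PowerPath.

```shell
# Hypothetical helper (not part of PowerPath): reads `powermt display options`
# output on stdin and prints the value of the
# "device in use to array report" attribute.
# Assumes the "attribute = value" layout shown in this article.
dev_inuse_report_state() {
    awk -F'=' '
        /device in use to array report/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
            print value
            found = 1
        }
        END { if (!found) exit 1 }               # non-zero status if attribute is absent
    '
}
```

On a host with PowerPath installed, it would be used as: powermt display options | dev_inuse_report_state. After the workaround is applied, the expected value is "disabled". If more than one storage class reports the attribute, one value is printed per matching line.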
   
   
    This bug is specific to PowerPath for AIX 6.4.   
   
    This is an isolated corner case; the chance of hitting the issue is remote.
   
    Note: On microcode versions below 5978.221.221, disabling this feature has no functional impact.
    On microcode 5978.221.221 or above, turning this parameter off means that a storage administrator
    cannot tell from the Symmetrix (SYMM) array whether the device is in use.