6 Replies Latest reply: Oct 30, 2016 5:43 AM by amrkato

Only one VPLEX director is handling all data transactions.

Nitss

Hi,

 

I've found that only one director is handling data transactions on one of the VPLEX clusters. I figured this is not normal, as the other clusters in the customer's environment have both directors handling data transactions.

 

I have checked the ports (FA and BE); all of them are UP, and I can also ping director 1-1-A, which is the one we are facing this problem with.

 

Please suggest any workaround to resolve this issue. Attached are the screenshots for both directors: VPLEX director1-1-A.png, VPLEX director1-1-B.png

 

-Nitss

  • 1. Re: Only one VPLEX director is handling all data transactions.
    garyo

    Hi Nitss,

     

    Many of your director-1-1-A performance charts look suspect to me. It's as though they're simply not showing any data, as opposed to the director not actually doing I/O.  When many of them show "06:00PM" without any data points, it suggests the chart isn't actually receiving any data.

     

    So it just might be the case that the director is receiving I/O but the chart is failing to update.  There are a variety of reasons why a chart might not update; it's possible that the backing CLI monitor is having issues.  Please check the status of the monitors from the CLI in "/monitoring/directors/*/monitors/".  Example (this is from a VPLEX 5.4.1 system):

     

    VPlexcli:/> ll /monitoring/directors/*/monitors/

     

    /monitoring/directors/director-1-1-A/monitors:

    Name                                              Ownership  Collecting  Period  Average  Idle  Bucket  Bucket  Bucket  Bucket
    ------------------------------------------------  ---------  Data        ------  Period   For   Min     Max     Width   Count
    ------------------------------------------------  ---------  ----------  ------  -------  ----  ------  ------  ------  ------
    director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v19   true       true        5s      5s       2s    -       -       -       64
    director-1-1-A_VIRTUAL_VOLUMES_PERPETUAL_MONITOR  true       true        1min    1min     12s   -       -       -       64

     

    /monitoring/directors/director-1-1-B/monitors:

    Name                                              Ownership  Collecting  Period  Average  Idle  Bucket  Bucket  Bucket  Bucket
    ------------------------------------------------  ---------  Data        ------  Period   For   Min     Max     Width   Count
    ------------------------------------------------  ---------  ----------  ------  -------  ----  ------  ------  ------  ------
    director-1-1-B_PERPETUAL_vplex_sys_perf_mon_v19   true       true        5s      5s       3s    -       -       -       64
    director-1-1-B_VIRTUAL_VOLUMES_PERPETUAL_MONITOR  true       true        1min    1min     12s   -       -       -       64
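If you would rather not eyeball the listing above, a small script can flag monitors that are not owned or not collecting. This is only a rough sketch that parses text pasted from the CLI; it does not talk to VPLEX at all. The function names and the sample text (including the "false" value on director-1-1-A) are made up for illustration.

```python
# Hypothetical pasted output of `ll /monitoring/directors/*/monitors/`.
# The "false" under Collecting Data for director-1-1-A is invented for this example.
SAMPLE = """
/monitoring/directors/director-1-1-A/monitors:
Name  Ownership  Collecting  Period  Average  Idle  Bucket  Bucket  Bucket  Bucket
----  ---------  Data        ------  Period   For   Min     Max     Width   Count
----  ---------  ----------  ------  -------  ----  ------  ------  ------  ------
director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v19  true  false  5s  5s  2s  -  -  -  64

/monitoring/directors/director-1-1-B/monitors:
director-1-1-B_PERPETUAL_vplex_sys_perf_mon_v19  true  true  5s  5s  3s  -  -  -  64
"""


def parse_monitor_listing(text):
    """Return {director: [monitor dict, ...]} from pasted listing text."""
    monitors = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Context lines look like /monitoring/directors/<director>/monitors:
        if line.startswith("/monitoring/directors/") and line.endswith("/monitors:"):
            current = line.split("/")[3]
            monitors[current] = []
            continue
        fields = line.split()
        # Data rows have 10 columns and a true/false ownership flag;
        # this also skips the three header rows.
        if current and len(fields) == 10 and fields[1] in ("true", "false"):
            monitors[current].append({
                "name": fields[0],
                "ownership": fields[1] == "true",
                "collecting": fields[2] == "true",
                "idle_for": fields[5],
            })
    return monitors


def suspect_monitors(monitors):
    """List (director, monitor name) pairs that are not owned or not collecting."""
    return [
        (director, m["name"])
        for director, rows in monitors.items()
        for m in rows
        if not (m["ownership"] and m["collecting"])
    ]


if __name__ == "__main__":
    for director, name in suspect_monitors(parse_monitor_listing(SAMPLE)):
        print(f"{director}: check monitor {name}")
```

An empty result only means the flags look healthy; it says nothing about whether the GUI is actually reading the monitor.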

     

    The GUI relies upon the "director-1-1-X_PERPETUAL_vplex_sys_perf_mon..." monitors working.  I'd check the monitor itself on 1-1-A, comparing it to 1-1-B:

     

    Example:

    VPlexcli:/> cd /monitoring/directors/director-1-1-A/monitors/director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v19/

     

    VPlexcli:/monitoring/directors/director-1-1-A/monitors/director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v19> ls -f

     

    Attributes:

    Name             Value

    ---------------  --------------------------------------------------------------

    average-period   5s

    bucket-count     64

    bucket-max       -

    bucket-min       -

    bucket-width     -

    collecting-data  true

    firmware-id      1

    idle-for         0s

    ownership        true

    period           5s

    statistics       [be-prt.read, be-prt.write, cache.dirty, cache.miss,

                     cache.rhit, cache.subpg, com-cluster-io.avg-lat,

                     com-cluster-io.max-lat, com-cluster-io.min-lat,

                     director.async-write, director.be-aborts, director.be-busies,

                     director.be-ops, director.be-ops-read, director.be-ops-write,

                     director.be-ops-ws, director.be-qfulls, director.be-read,

                     director.be-resets, director.be-timeouts,

                     director.be-unitattns, director.be-write, director.be-ws,

                     director.busy, director.dr1-rbld-recv, director.dr1-rbld-sent,

                     director.fe-ops, director.fe-ops-act, director.fe-ops-q,

                     director.fe-ops-read, director.fe-ops-write, director.fe-read,

                     director.fe-write, director.heap-used, director.per-cpu-busy,

                     director.udt-conn-drop, director.udt-pckt-retrans,

                     director.udt-recv-bytes, director.udt-recv-drops,

                     director.udt-recv-pckts, director.udt-send-bytes,

                     director.udt-send-drops, director.udt-send-pckts,

                     directory.ch-remote, directory.chk-total, directory.dir-total,

                     directory.dr-remote, directory.ops-local, directory.ops-rem,

                     fe-director.aborts, fe-director.caw-avg-lat,

                     fe-director.caw-ops, fe-director.read-avg-lat,

                     fe-director.write-avg-lat, fe-director.ws16-avg-lat,

                     fe-director.ws16-ops, fe-director.xcopy-avg-lat,

                     fe-director.xcopy-ops, fe-prt.ops, fe-prt.read,

                     fe-prt.read-avg-lat, fe-prt.write, fe-prt.write-avg-lat,

                     ramf.cur-op, ramf.exp-op, ramf.exp-rd, ramf.exp-wr,

                     ramf.imp-op, ramf.imp-rd, ramf.imp-wr, rp-spl-node.write,

                     rp-spl-node.write-active, rp-spl-node.write-avg-lat,

                     rp-spl-node.write-ops, storage-volume.read-avg-lat,

                     storage-volume.write-avg-lat,

                     storage-volume.write-same-avg-lat]

    targets          [A0-FC00, A0-FC01, A0-FC02, A0-FC03, A1-FC00, A1-FC01,

                     A1-FC02, A1-FC03, CPU0, CPU1, CPU2, CPU3, CPU4, CPU5, CPU6,

                     CPU7, cluster-1]

     

    Contexts:

    sinks

     

    Ensure 1-1-A matches 1-1-B.  You just might have to restart the CLI to get the monitor to initialize.  Or you might have to destroy the monitor and then restart the CLI to have it brought up properly. 
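To make the 1-1-A versus 1-1-B comparison less error-prone, the two "ls -f" outputs can be diffed by script. The following is a sketch under assumptions: it works on pasted text only, the set of attributes to compare (COMPARED) is my own choice, and it deliberately ignores "idle-for" and "targets" since those can legitimately differ between directors. The sample values are shortened and hypothetical.

```python
# Attributes assumed to be worth comparing between the two directors.
COMPARED = ("average-period", "bucket-count", "collecting-data",
            "ownership", "period", "statistics")

# Hypothetical, shortened `ls -f` outputs for the two perpetual monitors.
MONITOR_A = """
Attributes:
Name             Value
---------------  ------------------------------
average-period   5s
collecting-data  false
period           5s
statistics       [be-prt.read, be-prt.write,
                 cache.dirty]
targets          [A0-FC00, CPU0]

Contexts:
sinks
"""

MONITOR_B = """
Attributes:
Name             Value
---------------  ------------------------------
average-period   5s
collecting-data  true
period           5s
statistics       [be-prt.read, be-prt.write, cache.dirty]
targets          [B0-FC00, CPU0]
"""


def parse_attributes(text):
    """Parse the Attributes table, joining list values wrapped over several lines."""
    attrs, key = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Contexts:":
            break
        if (not line or line == "Attributes:"
                or line.split() == ["Name", "Value"]
                or set(line) <= {"-", " "}):
            continue
        # A value that opened with "[" continues until its closing "]".
        if key and attrs[key].startswith("[") and not attrs[key].endswith("]"):
            attrs[key] += " " + line
            continue
        key, _, value = line.partition(" ")
        attrs[key] = value.strip()
    return attrs


def normalise(value):
    """Turn "[a, b]" into a set so that line wrapping and order do not matter."""
    if value.startswith("[") and value.endswith("]"):
        return frozenset(i.strip() for i in value[1:-1].split(",") if i.strip())
    return value


def differences(a, b, keys=COMPARED):
    """Return {attribute: (value in a, value in b)} for attributes that differ."""
    return {
        k: (a.get(k), b.get(k))
        for k in keys
        if normalise(a.get(k, "")) != normalise(b.get(k, ""))
    }


if __name__ == "__main__":
    print(differences(parse_attributes(MONITOR_A), parse_attributes(MONITOR_B)))
```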

     

    You can verify if 1-1-A is actually processing I/O by checking the director fc port stats as well.  Example (which I think is there in your code version - 5.2):

     

    VPlexcli:/> director fc-port-stats --director director-1-1-A

    Results for director 'director-1-1-A' at Thu May 28 20:41:36 UTC 2015:

     

     

    Port:               A0-FC00  A1-FC00  A1-FC01  A1-FC02  A1-FC03  A3-FC00  A3-FC01

    Frames:

    - Discarded:        0        0        0        0        0        0        0

    - Expired:          0        0        0        0        0        2        2

    - Bad CRCs:         0        0        0        0        0        0        0

    - Encoding Errors:  0        0        0        0        0        0        0

    - Out Of Order:     0        0        0        0        0        0        0

    - Lost:             0        0        0        0        0        0        0

    Requests:

    - Accepted:         0        0        0        1417990  1413962  2618098  2314758

    - Rejected:         0        0        0        0        0        0        0

    - Started:          0        0        0        1417990  1413962  2618098  2314758

    - Completed:        0        0        0        1417990  1413962  2618098  2314758

    - Timed-out:        0        0        0        0        0        0        0

    Tasks:

    - Received:         248191   0        0        0        0        2663517  2237415

    - Accepted:         248191   0        0        0        0        2663517  2237415

    - Rejected:         0        0        0        0        0        0        0

    - Started:          248191   0        0        0        0        2663517  2237415

    - Completed:        248191   0        0        0        0        2663517  2237415

    - Dropped:          0        0        0        0        0        0        0

     

    Issue the above command a few times; if the Requests and Tasks counters keep increasing between runs, the director is processing I/O.  (I only happen to have one front-end port enabled on my system above.)
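Comparing two runs of the fc-port-stats command by hand is tedious, so here is a rough sketch that diffs two pasted outputs and lists the ports whose "Completed" counters went up. It only parses text; the function names are mine, the BEFORE values are a cut-down copy of the example above, and the AFTER values are invented.

```python
# BEFORE is a shortened copy of the example output; AFTER is hypothetical.
BEFORE = """
Port:               A0-FC00  A1-FC02  A3-FC00
Frames:
- Bad CRCs:         0        0        0
Requests:
- Completed:        0        1417990  2618098
Tasks:
- Completed:        248191   0        2663517
"""

AFTER = """
Port:               A0-FC00  A1-FC02  A3-FC00
Frames:
- Bad CRCs:         0        0        0
Requests:
- Completed:        0        1418200  2618098
Tasks:
- Completed:        248500   0        2663517
"""


def parse_fc_port_stats(text):
    """Return {(port, section, counter): value} from pasted fc-port-stats output."""
    ports, section, stats = [], None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Port:"):
            ports = line.split()[1:]
        elif line.startswith("- ") and ports:
            # Counter labels may contain spaces ("Bad CRCs"), so split on the colon.
            label, _, rest = line[2:].partition(":")
            for port, value in zip(ports, rest.split()):
                stats[(port, section, label.strip())] = int(value)
        elif line.endswith(":") and " " not in line:
            section = line[:-1]  # "Frames", "Requests" or "Tasks"
    return stats


def busy_ports(before, after):
    """Ports on which any Completed counter increased between the two samples."""
    ports = set()
    for (port, section, label), value in after.items():
        if label == "Completed" and value > before.get((port, section, label), value):
            ports.add(port)
    return sorted(ports)


if __name__ == "__main__":
    print(busy_ports(parse_fc_port_stats(BEFORE), parse_fc_port_stats(AFTER)))
```

A port that does not show up here was simply idle between the two samples, so take the samples while the hosts are known to be doing I/O.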

     

    I see you're running 5.2 code.  Newer 5.3 target code is also available.

     

    Finally, I would suggest opening up a Service Request with Support.

     

    Gary

  • 2. Re: Only one VPLEX director is handling all data transactions.
    Nitss

    I appreciate your detailed input, Gary! I checked all those counters, and it seems that 1-1-A is handling data transactions but the GUI monitor chart is not updating.

  • 3. Re: Re: Only one VPLEX director is handling all data transactions.
    garyo

    Hi,

     

    Okay, then here's what I suggest:

     

    First, if you haven't done this in a while, restart the VPlexCLI process.  From the mgmt station prompt (not the VPlexCLI prompt):

     

         sudo /etc/init.d/VPlexManagementConsole restart

     

    Note:  This will terminate the sessions for any users connected to the CLI or GUI.  After issuing that restart, wait about a minute or so, then log back in to the Unisphere GUI.  Also check from the CLI whether the monitors have changed.

     

    If that doesn't do anything, try issuing this command in the VPlexCLI to restart (kick) the monitors:

     

        emc-internal configuration restart-perpetual-monitors


    For the GUI to pick up the change, I believe you have to restart the process again (the sudo restart command above).


    If even after that you have issues and 1-1-A's not updating, then more drastic steps may be required:


    1. From the VPlexCLI, enable engineering-mode.

    debug engineering-mode on

    2. From the VPlexCLI, destroy the perpetual monitors for the director (1-1-A in your case):

    monitor destroy <monitor name>
    In your case, for example:
    monitor destroy director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v19

    3. Restart the CLI:

    sudo /etc/init.d/VPlexManagementConsole restart


    The monitor will get automatically recreated when the CLI starts up.
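The three escalation levels above can be written down as a simple checklist so that the order and the prompt each command belongs to stay clear. This sketch only prints the commands already given in this reply; it does not run anything, and the runbook() helper and its layout are hypothetical.

```python
RESTART = "sudo /etc/init.d/VPlexManagementConsole restart"


def runbook(monitor_name):
    """Return (level, where to type it, command) tuples, mildest action first.

    Stop at the first level after which the chart for the director updates again.
    """
    return [
        (1, "management station shell", RESTART),
        (2, "VPlexcli", "emc-internal configuration restart-perpetual-monitors"),
        (2, "management station shell", RESTART),  # believed necessary for the GUI
        (3, "VPlexcli", "debug engineering-mode on"),
        (3, "VPlexcli", f"monitor destroy {monitor_name}"),
        (3, "management station shell", RESTART),  # monitor is recreated on startup
    ]


if __name__ == "__main__":
    for level, where, command in runbook("director-1-1-A_PERPETUAL_vplex_sys_perf_mon_v19"):
        print(f"level {level} [{where}] {command}")
```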


    I'd also suggest upgrading to a 5.3 version of VPLEX code.

     

    Hope that works!

    Gary


  • 4. Re: Only one VPLEX director is handling all data transactions.
    Nitss

    Cheers! That worked. Customer has plans to upgrade EMC Data Protection Suite soon, so that would cover VPLEX upgrade as well.

     

    Thanks again for your valuable input, Gary.

     

    -Nitss

  • 5. Re: Only one VPLEX director is handling all data transactions.
    garyo

    Glad to hear! 

     

    You're most welcome,

    Gary

  • 6. Re: Only one VPLEX director is handling all data transactions.
    amrkato

    Hi Guys,

     

    How do I enable perpetual logs? I have a cluster, but the existing perpetual logs are very old, so how can I start collecting new logs?