VPLEX: 0x8a263069 / 0x8a4a31f4 Cluster Witness Server failure or lack of connectivity with the Cluster Witness Server

Environment:

EMC VPLEX Series

EMC VPLEX VS1

EMC VPLEX VS2

EMC VPLEX-Local

EMC VPLEX-Metro

EMC VPLEX-Geo

EMC GeoSynchrony

Description:

How to troubleshoot event 0x8a263069 cluster-witness connection issues.
How to troubleshoot event 0x8a4a31f4 management network connection issues.

The Cluster Witness Server connection is lost. The cluster reporting this event has been unable to establish communication with the Cluster Witness Server.
Symptom Code: 0x8a263069

Description sample:

The cluster 1 has been unable to establish communication with the Cluster Witness Server for 60 seconds. [Versions:MS{D10.0.0.193.0, D4_MSB_7, D10.0.0.218}, Director{2.1.194.0.0}] RCA: The cluster reporting this event has been unable to establish communication with Cluster Witness Server. This may be due to the failure of the server or loss of network connectivity.

Related Symptom Code: 0x8a4a31f4

Description sample:

The Management Network for cluster xxx.xx.xxx.xx can not reach its partner cluster. [Versions:MS{D10.0.0.193.0, D4_MSB_7, D10.0.0.218}, Director{unknown}] RCA: Management of remote and distributed resources is no longer possible.

The cluster-witness communication failure, reported with event code 0x8a263069, also appears in the diagnostics under the /cluster-witness context in the CLI.

The following example shows a sample output of what might be seen under the cluster-witness CLI context following this issue if it lasts longer than 60 seconds:

VPlexcli:/cluster-witness> ll components

/cluster-witness/components:
Name        ID  Admin State  Operational State  Mgmt Connectivity
----------  --  -----------  -----------------  -----------------
cluster-1   1   enabled      in-contact         ok
cluster-2   2   enabled      in-contact         ok
server      -   unknown      -                  failed

VPlexcli:/cluster-witness> ll components/*

/cluster-witness/components/cluster-1:
Name                     Value
-----------------------  ------------------------------------------------------
admin-state              enabled
diagnostic               INFO: Current state of cluster-1 is in-contact (last
                         state change: 2 days, 8290 secs ago; message from server: 0 days, 76 secs ago.)
id                       1
management-connectivity  ok
operational-state        in-contact

/cluster-witness/components/cluster-2:
Name                     Value
-----------------------  ------------------------------------------------------
admin-state              enabled
diagnostic               INFO: Current state of cluster-2 is in-contact (last
                         state change: 2 days, 8290 secs ago; message from server: 0 days, 76 secs ago.)
id                       2
management-connectivity  ok
operational-state        in-contact 

/cluster-witness/components/server:
Name                     Value
-----------------------  ------------------------------------------------------
admin-state              unknown
diagnostic               WARNING: Cannot establish connectivity with Cluster
                         Witness Server to query diagnostic information.
id                       -
management-connectivity  failed
operational-state        -
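When the `ll components` listing has been captured to text, the failed component can be picked out programmatically. The following is a minimal Python sketch, not an EMC-provided tool: the function names `parse_components` and `failed_components` are hypothetical, and it assumes columns are separated by two or more spaces, as in the sample output above.

```python
import re

# Sample text copied from the `ll components` output shown above.
SAMPLE = """\
Name        ID  Admin State  Operational State  Mgmt Connectivity
----------  --  -----------  -----------------  -----------------
cluster-1   1   enabled      in-contact         ok
cluster-2   2   enabled      in-contact         ok
server      -   unknown      -                  failed
"""


def parse_components(text):
    """Turn the table into a list of dicts keyed by column header.

    Assumption: cells never contain two consecutive spaces, so a run of
    two or more spaces always marks a column boundary.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the dashed ruler under the header.
        if not line or set(line) <= {"-", " "}:
            continue
        cells = re.split(r"\s{2,}", line)
        if header is None:
            header = cells
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def failed_components(rows):
    """Return the names of components whose Mgmt Connectivity is not 'ok'."""
    return [r["Name"] for r in rows if r.get("Mgmt Connectivity") != "ok"]


if __name__ == "__main__":
    for name in failed_components(parse_components(SAMPLE)):
        print(f"management connectivity problem: {name}")
```

With the sample above, only the `server` component is reported, which matches the symptom described for event 0x8a263069.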

 

On an IP network, occasional packet drops will affect communication between the VPLEX clusters and between each cluster and the Cluster Witness Server. An interruption of a minute or less is acceptable. Anything longer can disrupt communication between the VPLEX clusters and the Cluster Witness Server.

 

Resolution:

This issue may be caused by failure of the Cluster Witness Server or loss of management network connectivity between the management server and Cluster Witness Server.

The following is a list of possible causes:

  • Loss of physical connectivity (cabling, port, router/bridge failure, network packet drops).
    Check all physical connections and cabling.
  • Route misconfiguration.
  • VPN misconfiguration or failure.
  • A failure of the Cluster Witness Server VM or its guest OS.
  • Suspension or power-off of the Cluster Witness Server VM.
  • A hardware failure or shutdown of the VMware ESX server hosting the Cluster Witness Server VM.

Procedure:

  1. Type the vpn status command as shown in the following example to verify whether you can reach the Cluster Witness Server:

    VPlexcli:/> vpn status
    Verifying the VPN status between the Management Servers...
    IPSEC is UP
    Remote Management Server at IP Address 10.31.24.162 is reachable
    Remote Internal Gateway addresses are reachable

    Verify the VPN status between the management server and the Cluster
    Witness Server...
    IPSEC is not UP
    Cluster Witness Server at IP Address 128.221.254.3 is not reachable
  2. Ping the Cluster Witness Server's private IP address from the public interface on the local management server as follows:

    ping -I eth3 128.221.254.3
  3. Ping the Cluster Witness Server's host IP address from the public interface on the local management server as follows:

    ping -I eth3 <CWS_host_server_public_IP_address>
  4. Resolve all physical network connectivity issues.
  5. Consult the network administrator and the VMware administrator to resolve any issues in reaching the private or public IP address of the Cluster Witness Server.
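
The `vpn status` check in step 1 can be summarized with a short script once its output has been saved as text. This is a minimal Python sketch under stated assumptions: the helper name `vpn_problems` is hypothetical, and the matching relies only on the phrases "is not UP" and "not reachable" that appear in the sample output above. Other GeoSynchrony releases may word these lines differently.

```python
# Sample text copied from the `vpn status` output shown in step 1.
SAMPLE = """\
Verifying the VPN status between the Management Servers...
IPSEC is UP
Remote Management Server at IP Address 10.31.24.162 is reachable
Remote Internal Gateway addresses are reachable

Verify the VPN status between the management server and the Cluster
Witness Server...
IPSEC is not UP
Cluster Witness Server at IP Address 128.221.254.3 is not reachable
"""


def vpn_problems(text):
    """Return the output lines that report a down tunnel or unreachable peer."""
    problems = []
    for line in text.splitlines():
        line = line.strip()
        # Assumed failure phrases, taken from the sample output only.
        if line.endswith("is not UP") or "not reachable" in line:
            problems.append(line)
    return problems


if __name__ == "__main__":
    for problem in vpn_problems(SAMPLE):
        print(problem)
```

An empty result suggests that both VPN tunnels are up. Any returned line points at the tunnel or peer to investigate with the ping checks in steps 2 and 3.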

IMPORTANT: If you cannot restore connectivity to the Cluster Witness Server and you expect this failure to be prolonged, type the following command to disable Cluster Witness:

VPlexcli:/> cluster-witness disable --force-without-server

WARNING!

This command disables Cluster Witness on both clusters when the Cluster Witness Server is inaccessible (provided both clusters are still connected to each other). Disabling is extremely important because, while Cluster Witness remains enabled without a reachable server, an additional failure or inter-cluster network partition causes all distributed volumes in all consistency groups to suspend I/O for lack of guidance from the Cluster Witness Server. With Cluster Witness disabled, I/O to VPLEX distributed virtual volumes in consistency groups follows the preconfigured detach rules for those consistency groups. After connectivity with the Cluster Witness Server is re-established, type the following command to enable Cluster Witness again:

VPlexcli:/> cluster-witness enable

 

NOTE: If the Cluster Witness is down for a prolonged period (normally five minutes or longer), there is a risk of data unavailability (DU) if the WAN-COM ports fail.
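
The two time thresholds mentioned in this article (60 seconds before the event is raised, and roughly five minutes before an outage counts as prolonged) can be expressed as a small triage helper. This is an illustrative Python sketch only: the function name `classify_outage` and the category labels are hypothetical, and the five-minute figure is the "normally" value from the note above, not a fixed product limit.

```python
EVENT_THRESHOLD_S = 60       # drops of a minute or less are acceptable
PROLONGED_THRESHOLD_S = 300  # "normally five minutes or longer" (assumed cutoff)


def classify_outage(seconds):
    """Map an outage duration in seconds to a hypothetical triage label."""
    if seconds < 0:
        raise ValueError("outage duration cannot be negative")
    if seconds <= EVENT_THRESHOLD_S:
        return "transient"    # within the acceptable window
    if seconds < PROLONGED_THRESHOLD_S:
        return "investigate"  # event raised; follow the procedure above
    return "prolonged"        # consider disabling Cluster Witness


if __name__ == "__main__":
    for duration in (45, 120, 600):
        print(duration, classify_outage(duration))
```

The labels are only a reading aid. The decision to run cluster-witness disable --force-without-server still depends on whether connectivity can be restored, as described in the IMPORTANT paragraph above.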

 

Reference:

EMC Support Solution Number: 84389