EMC VPLEX FOR SAP AND ORACLE RAC: UNMATCHED HIGH AVAILABILITY OVER DISTANCE

On May 17th, 2012, EMC announced that Oracle had certified VPLEX Metro in a stretched cluster configuration. The certification gives Oracle Real Application Clusters (RAC) customers unprecedented high availability and resiliency across 2 data centers located up to 100km apart, with RPO = 0 and RTO = 0. RPO (Recovery Point Objective) is the amount of data that could be lost when an Oracle system is recovered; RTO (Recovery Time Objective) is the amount of time needed to fully recover a failed system.
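
To make the RPO = 0 claim concrete, the following is a minimal sketch of why synchronous mirroring loses no acknowledged data on a site failure while asynchronous replication can. It is an illustrative toy model, not VPLEX code: the function name lost_acknowledged_writes and the async_backlog parameter are assumptions made up for this example.

```python
def lost_acknowledged_writes(num_writes, synchronous, async_backlog=0):
    """Count acknowledged writes missing at site B after site A fails.

    Toy model (hypothetical, not VPLEX internals):
    - synchronous: a write is acknowledged only after site B has it.
    - asynchronous: a write is acknowledged locally, and up to
      `async_backlog` writes may still be in flight to site B.
    """
    site_b = []        # writes that have landed at the surviving site
    acknowledged = []  # writes the application believes are safe
    pending = []       # acknowledged but not yet shipped (async only)

    for write_id in range(num_writes):
        if synchronous:
            site_b.append(write_id)       # remote copy lands first
            acknowledged.append(write_id)
        else:
            acknowledged.append(write_id)
            pending.append(write_id)
            # Ship the oldest writes until only the backlog remains.
            while len(pending) > async_backlog:
                site_b.append(pending.pop(0))

    # Site A fails here: anything still pending never reaches site B.
    return len(set(acknowledged) - set(site_b))


if __name__ == "__main__":
    print("sync lost: ", lost_acknowledged_writes(10, synchronous=True))
    print("async lost:", lost_acknowledged_writes(10, synchronous=False,
                                                  async_backlog=3))
```

In this model the synchronous case always returns zero lost writes, which is what RPO = 0 means; the asynchronous case loses whatever was still in flight.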

 

The EMC Solutions Group SAP Engineering team built on this groundbreaking solution by separating the enqueue and message servers (the SCS components) from the CI (Central Instance), as described in SAP OSS note 821904 (see attachment below). The result is maximum high availability for an SAP environment, with non-stop availability over a distance of up to 100km, even if the primary data center were to become completely non-operational.

 

The white paper (also available in presentation form) describes this industry-first solution, which completely eliminates any single point of failure. The solution is enabled by EMC VPLEX Metro, EMC Symmetrix VMAX, EMC VNX, VMware vSphere HA, Oracle RAC, Brocade networking, and SUSE Linux Enterprise Server for SAP Applications.

 

The solution diagram below shows the following key points:

 

  1. VPLEX Metro provides data synchronization between 2 data centers located up to 100km apart, thus allowing for RPO = 0 and RTO = 0
  2. The VPLEX Witness (not shown in this diagram) provides arbitration between the 2 data centers in the event of component failures in one data center, to make sure that I/O is automatically resumed at the appropriate (still running) site
  3. Oracle RAC provides continuous & distributed database operations between 2 data centers, in this case using a 4-node RAC cluster.  It is worth noting that when Oracle RAC is implemented with VPLEX, it can be managed as a local implementation even with the 2 data centers located up to 100km apart (see below for more information)
  4. SAP dialog, background, update, spool, and other work processes are actively working in both data centers
  5. The SAP Enqueue Server & Message Server (the SCS components, separated from the CI as described above) run in Site A, but can easily "float" to Site B using SUSE High Availability should the need arise, or when manually prompted (as in the case when maintenance is needed in Site A)

 

The resulting benefits of the solution are:

 

  1. Fully automatic failure handling by eliminating single points of failure at all layers in the environment to build a distributed and highly available SAP system
  2. Increased utilization of hardware and software assets: active/active use of both data centers, automatic load balancing between data centers, zero downtime maintenance
  3. Simplified SAP high-availability management & simplified deployment of Oracle RAC on Extended Distance Clusters
  4. Reduced cost by increasing automation and infrastructure utilization

 

SAP Architecture.png

 

When Oracle RAC is implemented with VPLEX it can be managed as a local implementation:

 

  1. No ASM mirroring, no host CPU cycles
  2. No failure group configuration, no path preferences, no quorum disks
  3. Add nodes and storage simply and non-disruptively
  4. All reads done locally from cache
  5. Continuous DR testing
  6. Simplified recovery process
  7. Storage administration done by the storage administrator

 

 

VPLEX Simplifies Deployment and Management of Oracle RAC Over Distance

 

 

With VPLEX, Oracle RAC can be quickly and easily stretched across two sites without using host-based mirroring and experiencing all the complexities mentioned previously with Oracle ASM.  In fact, with VPLEX it’s managed as though it’s a single-site configuration.

 

Not only are the complexities of RAC with ASM eliminated, but VPLEX delivers the data to both sites simultaneously in an active/active configuration where all nodes can access it – at both sites.  If one of the sites were to fail, RAC continues to operate with NO DOWNTIME, NO RESTART and NO INTERRUPTION. 

 

This is a substantial improvement, and offers a number of benefits: no use of host CPU cycles to replicate data (smaller servers possible); DBAs don’t have to manage storage (as they would with the Oracle solution); complex cross-connected SANs don’t have to be built; DR testing is reduced (since both sites are running all the time instead of one site on standby); and most notably, customers continue to process data regardless of a site failure (or during data center relocations, etc.).

 

Quite simply, VPLEX enables customers to quickly and easily deploy a highly available, stretched Oracle RAC environment that offers active/active data access, is easy to manage, and provides the high performance necessary for mission-critical applications.

 

You can watch a demo of how Oracle stretched RAC combined with VPLEX Metro delivers non-disruptive, continuous business operation, even in the event of an entire site loss, by clicking here.

 

 

VPLEX Witness

 

The VPLEX Witness is a separate server, or a virtual machine, which resides in a failure domain that is separate from either of the two VPLEX clusters.  The VPLEX Witness gives VPLEX the capability to let applications ride through any component or storage failures, including disaster-level scenarios affecting entire racks of storage equipment simultaneously.  The VPLEX Witness provides automatic restart in the event of any server failures, and all failures are handled automatically without any human intervention or complicated failover processes.  The witness can distinguish between a site failure and a link failure, and enables hosts, servers and applications to fail over in step with VPLEX, while still keeping applications online.
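
The distinction between a site failure and a link failure can be sketched as a small decision function. This is a simplified illustration of third-site arbitration in general, not the actual VPLEX Witness algorithm: the names Observation, Action and arbitrate, and the preferred_site tie-breaker, are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE_BOTH = "both sites continue I/O"
    CONTINUE_A = "site A continues I/O, site B suspends"
    CONTINUE_B = "site B continues I/O, site A suspends"
    NO_GUIDANCE = "witness sees neither site and cannot arbitrate"


@dataclass(frozen=True)
class Observation:
    witness_sees_a: bool       # witness can reach site A
    witness_sees_b: bool       # witness can reach site B
    inter_site_link_up: bool   # sites can reach each other
    preferred_site: str = "A"  # hypothetical tie-breaker for link failures


def arbitrate(obs: Observation) -> Action:
    """Decide which site keeps serving I/O (illustrative logic only)."""
    if obs.witness_sees_a and obs.witness_sees_b:
        if obs.inter_site_link_up:
            return Action.CONTINUE_BOTH
        # Link failure: both sites are alive but cannot synchronize,
        # so exactly one must continue to avoid split-brain.
        if obs.preferred_site == "A":
            return Action.CONTINUE_A
        return Action.CONTINUE_B
    if obs.witness_sees_a:
        return Action.CONTINUE_A   # site B is down or isolated
    if obs.witness_sees_b:
        return Action.CONTINUE_B   # site A is down or isolated
    return Action.NO_GUIDANCE


if __name__ == "__main__":
    site_a_lost = Observation(False, True, False)
    link_cut = Observation(True, True, False)
    print(arbitrate(site_a_lost).value)
    print(arbitrate(link_cut).value)
```

The point of the third failure domain is visible in the first branch: without an independent observer, a surviving site cannot tell whether its peer has failed or only the link between them has.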

 

 

 

This VPLEX for SAP & Oracle RAC solution should also work with other operating systems and high-availability products:

 

  1. IBM AIX and PowerHA (formerly HACMP)
  2. Red Hat Enterprise Linux and Red Hat High Availability add-on 
  3. Microsoft Windows Enterprise Server and Microsoft Cluster Server (MSCS), when used with Oracle RAC 
  4. Solaris and Veritas Cluster Server

 

 

 

For more detailed information regarding this solution, download the attached white paper below.