I recently got involved in another customer discussion about how to replicate data between two datacenters. The suggestion was to use Oracle ASM mirroring (with normal redundancy) instead of SRDF (or other SAN/storage-based tooling).


Reasons I have heard why customers would choose ASM over EMC tooling:


a) The claim that integration with Oracle would be better

b) Performance would be higher (i.e. lower latency because of parallel writes to both mirrors where SRDF would do the remote I/O in sequence)

c) Cost (no SRDF licences, ASM is free)


Although these statements might be partly true, I still recommend that my customers stay away from ASM mirroring (unfortunately they do not always follow my advice). OK, I am biased because I work for EMC, but I would still like to put things in the right perspective. So here is a list of reasons why ASM might not be the best way to replicate data between datacenters:


  • The Oracle host has to process every write twice, as every write to an Oracle file has to be mirrored. This adds CPU and I/O overhead and somewhat reduces the headroom for additional workload. Expensive Oracle-licensed CPUs end up spending cycles on something other than application processing.
  • ASM can perform an incremental resync after a link failure, but this only works if the disconnected data has not changed in any way. If it has changed, the best case is a full 100% re-sync of all data (which can take a very long time, during which you suffer a severe performance impact and have no D/R protection); the worst case is silent data corruption.
  • A two-datacenter setup cannot resolve split-brain scenarios. Unless you deploy a third (arbitration) site with 100% physically separate communication links to both the primary and the D/R location, you risk either a split-brain scenario (which can be a disaster for the business) or downtime in case of a failure (eliminating high availability completely, which was the reason to mirror the data in the first place). Check http://www.oracle.com/technology/products/database/clustering/pdf/thirdvoteonnfs.pdf for more information on this requirement. (Note that with storage replication you don't have this issue because of the sequential write, although for automated failover you need arbitration as well.)
  • The setup is complex because you need to configure ASM failure groups correctly (see the configuration sketch after this list). Getting the failure groups wrong can mean both mirrors of a volume end up at the local site, which can cause severe data loss in case of a disaster. Failing to configure the preferred read paths correctly causes a subtle performance impact that can be hard to diagnose. Check http://download.oracle.com/docs/cd/B28359_01/server.111/b28282/configbp005.htm and www.oracle.com/technetwork/database/clustering/overview/extendedracversion11-435972.pdf for more insight.
  • Any bug in the Oracle ASM or database code can cause issues; see the footnote for an example. (The same applies to storage replication, but that tends to be much more robust and easier to monitor.)
  • A failure of the storage connectivity can lead to both reads and writes being serviced over the ISL (the inter-switch links between the two datacenters), again causing severe performance impact (which gets even worse after the storage connectivity has been restored, due to re-silvering).
  • No consistency is possible between application data and database data, because ASM only replicates databases. The exception is if you put all flat files on Oracle ACFS, which is a fairly recent Oracle feature and hasn't been proven in the field yet. This very problem is the reason Oracle themselves implemented EMC RecoverPoint for their internal business applications and endorsed RecoverPoint in a joint EMC/Oracle whitepaper as a viable solution.
  • No consistency is possible between multiple databases. If you have a direct transaction dependency between databases, any failover might result in slight checkpoint timing differences, causing transactions to be applied to one but not both databases.
  • Only synchronous replication is possible; there is no fallback to asynchronous replication to mitigate the performance impact during peak workloads, upgrades, stress testing, and so on.
  • During a storage failure, in-flight transactions might hang until the ASM layer decides that one site has failed and continues with only one failure group. Depending on the settings, if the failure is intermittent (for example a bad but not completely broken cable), transactions will perform well, then hang for a while, then be slow during the ASM resilver, then perform well again for a while, and the cycle repeats. This can be very, very hard to diagnose.
  • Rolling disasters can make failover completely impossible. For example, a fire in datacenter A breaks the remote links, but database processing continues at site A. A moment later the link comes back for a while and resilvering of remote ASM data to site B starts. Before the resilver completes, the fire breaks the remote links for good. After 30 minutes, the fire causes the servers to fail and corrupts or even destroys the data at site A, so a manual failover to site B is required. However, because of the aborted resilvering, the data at site B is completely corrupt, so a full tape restore is required, taking many hours of downtime and causing severe loss of transactions.
  • There is no well-established method to test D/R capabilities. Manually forcing link failures will immediately cause performance issues and other risks. In the real world this makes customers reluctant to perform D/R testing after going live, so they run in production for years without ever being able to verify that their D/R scenario works.
  • Taking storage-based snapshots will be challenging at best, because no cloning tool supports taking consistent snapshots across two separate storage boxes at the same time (which is needed because of ASM failure groups). Although technically possible with EMC, this needs to be scripted and requires a special multi-session consistency implementation.
  • Every Oracle cluster needs to be carefully configured specifically for ASM mirroring.
  • Every Oracle cluster needs to be monitored to verify that ASM mirroring is in sync, including the link utilization (see the monitoring sketch after this list).
  • Adding a third, fourth, etc. cluster node at one of the two locations is equally complex.
  • Every storage reconfiguration (e.g. adding or moving storage volumes) needs to be performed with these complexities in mind. Adding a volume without properly setting up the failure groups renders the whole environment unable to fail over.
  • Another replication method is required for pre-Oracle 10 environments, non-Oracle databases, file servers, VMware environments, email and content, and so on. This can be SAN-based, but then Oracle would be the single exception for replication. If the preference is for application-level replication, then every application type would require its own method, resulting in a very complex D/R runbook with multiple dependencies, logical replication instances, versions, and so on. It is debatable whether such a complex environment could sustain a datacenter failure without suffering major downtime and/or data loss. It would be near impossible to perform D/R testing for more than a single application or subcomponent.
  • Nobody (not even at Oracle, I verified) seems to fully understand how Oracle deals with concurrent writes where one makes it to site A and another makes it to site B, but neither completes fully when a failure happens (such as a power outage). The Oracle cluster should be able to recover, but this may require special understanding from the Oracle administrators, and the devil is in the details. Not being able to deal with this causes data corruption, possibly going undetected for a long period.
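
To illustrate the configuration burden, here is a minimal sketch of what a stretched, normal-redundancy disk group with one failure group per site, a third-site NFS quorum voting location and local preferred reads could look like. Every device path, failure group name, NFS path and ASM instance name below is an assumption for illustration only, not taken from any particular environment:

# Minimal sketch, run as the Grid Infrastructure owner on one node.
# All paths and names are assumptions; the NFS quorum file must already
# exist and be visible through the ASM disk string.
sqlplus -s / as sysasm <<'EOF'
-- One failure group per datacenter; ASM mirrors every extent across them.
CREATE DISKGROUP data NORMAL REDUNDANCY
  FAILGROUP site_a DISK '/dev/mapper/siteA_lun1', '/dev/mapper/siteA_lun2'
  FAILGROUP site_b DISK '/dev/mapper/siteB_lun1', '/dev/mapper/siteB_lun2'
  -- Third-site (arbitration) voting location, as discussed above:
  QUORUM FAILGROUP site_c DISK '/voting_nfs/vote_disk01'
  ATTRIBUTE 'compatible.asm' = '11.2',
            -- Allow an incremental (fast mirror) resync after a short link outage:
            'disk_repair_time' = '3.6h';

-- Read from the local failure group instead of over the ISL; this has to be
-- set per ASM instance, each with the failure group of its own site.
ALTER SYSTEM SET asm_preferred_read_failure_groups = 'DATA.SITE_A'
  SID = '+ASM1';
EOF

A mirrored preferred-read setting is needed on the site B instances, and all of this has to be repeated for every disk group and kept correct through every storage change.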
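
Monitoring whether the mirror is actually in sync is likewise left to the administrator. Below is a minimal sketch of the kind of check that would have to run periodically on every cluster; the views (v$asm_operation, v$asm_disk, v$asm_diskgroup) are standard ASM views, but the scheduling, thresholds and alerting are left out and would have to be built as well:

# Minimal monitoring sketch: report any resync/rebalance still in flight and
# the disk status per failure group. An empty first result and only ONLINE
# disks in the second indicate the mirror is currently in sync.
sqlplus -s / as sysasm <<'EOF'
SET PAGESIZE 100 LINESIZE 200
-- Outstanding rebalance or resync operations:
SELECT group_number, operation, state, power, sofar, est_work, est_minutes
  FROM v$asm_operation;

-- Disk status per failure group; OFFLINE or MISSING disks mean the mirror is degraded:
SELECT g.name AS diskgroup, d.failgroup, d.mount_status, d.mode_status,
       COUNT(*) AS disks
  FROM v$asm_disk d
  JOIN v$asm_diskgroup g ON d.group_number = g.group_number
 GROUP BY g.name, d.failgroup, d.mount_status, d.mode_status;
EOF

The link (ISL) utilization mentioned above is not visible in these views at all; that part of the monitoring has to come from the SAN or network side.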


*) Footnote (from Oracle documentation):


Known Issues

If the NFS device location is not accessible,

1. Shutting down of Oracle Clusterware from any node using “crsctl stop crs”, will stop the stack on that node, but CSS reconfiguration will take longer. The extra time will be equal to the value of css misscount.

2. Starting Oracle Clusterware again with “crsctl start crs” will hang, because some of the old clusterware processes will hang on I/O to the NFS voting file. These processes will not release their allocated resources such as PORT.


These issues are addressed and will be fixed in future versions.

Conclusion: Before stopping or starting the Oracle Clusterware, it should be made sure that the NFS location is accessible using the “df” command for example. If the command does not hang, one may assume that the NFS location is accessible and ready for use.
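
As a practical aside to that conclusion: a plain "df" will itself hang on a dead NFS mount, so in practice the check needs a timeout around it. A minimal sketch, assuming the voting file lives under /voting_nfs (both the path and the 10-second limit are arbitrary):

# Check that the NFS voting-file location responds before running
# "crsctl stop crs" or "crsctl start crs".
# (On a hard NFS mount without the 'intr' option even this can block.)
if timeout 10 df -k /voting_nfs >/dev/null 2>&1; then
    echo "NFS voting location reachable"
else
    echo "NFS voting location not responding - postpone the clusterware restart" >&2
    exit 1
fi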