Thanks to everyone who attended our VMware session. We had a great question from our audience on the initial Monster VM findings: How was the network configured?
In the presentation we showcased how three large Oracle RAC virtual machines were vMotioned using three scenarios:
- vMotion of the monster VM on server A to server B. The final state was server A with no virtual machines and server B with two monster virtual machines. Total memory size to vMotion: 156 GB.
- vMotion of the monster VM from server A to server B AND the monster VM from server B to server A. In the final state, server A had the VM originally from server B and server B had the monster VM from server A. Total memory size to vMotion: 312 GB.
- vMotion of all three monster VMs: The VM on server C moves to server A, the VM on server A moves to server B and the VM on server B moves to server C. We have nicknamed this the "merry-go-round" scenario. Total memory size to vMotion: 468 GB.
All the vMotion scenarios were part of an Oracle RAC configuration with each virtualized node heavily loaded using Benchmark Factory. Our goal was to place extreme stress on Oracle RAC nodes and VMware infrastructure to validate that there was absolutely no data loss during the vMotion tests.
Here is how the network topology was configured for the monster VM study. Please keep in mind that this might change with the final release of the study as we are working hard to show the best possible performance.
We are using four IP networks in the test:
1) VMware-private vMotion network
2) Public Oracle-Application network – e.g., SQL queries
3) OS and VMware management network
4) Private Oracle-RAC interconnect
The VMware-private vMotion network:
To optimize the vMotion of the monster Oracle VMs, this network was segregated from the others and used two dedicated NICs on each Cisco UCS blade. Additionally, the vMotion network has its own virtual LAN (VLAN), VMkernel ports (commonly called "VMK" ports) and distributed vSwitch.
Using two VMkernel ports provides the opportunity to optimize vMotion for improved throughput. The process involves creating the VMkernel ports on a virtual switch (vSwitch) with different IP addresses. Next, mark them for use by vMotion and edit their NIC teaming settings so that each uses a different vSwitch uplink as its active uplink.
For example, suppose vmnic0 and vmnic1 are the uplinks to the virtual switch where these port groups are located. Two port groups are created in the vSwitch, called “vmotion1” and “vmotion2”. In the vmotion1 group you would configure vmnic0 as active and vmnic1 as standby. In the vmotion2 group you would configure vmnic1 as active and vmnic0 as standby.
Table name: NIC Teaming
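The active/standby layout described above can be written down as data and sanity-checked before it is applied. The following is only a minimal sketch in Python: the `PortGroup` class and `check_multi_nic_vmotion` function are hypothetical names invented for illustration, not part of any VMware tool or API, and the check only models the teaming rule from the example (each port group active on a different uplink, with the other uplink as standby).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortGroup:
    """Teaming policy for one vMotion port group (illustrative model only)."""
    name: str
    active: str   # uplink used under normal conditions
    standby: str  # uplink used only if the active one fails


def check_multi_nic_vmotion(groups, uplinks):
    """Return a list of problems; an empty list means the layout looks sound."""
    problems = []
    for g in groups:
        if g.active == g.standby:
            problems.append(f"{g.name}: active and standby are the same uplink")
        for nic in (g.active, g.standby):
            if nic not in uplinks:
                problems.append(f"{g.name}: {nic} is not an uplink of the vSwitch")
    # Multi-NIC vMotion only helps if every port group is active on its own uplink.
    active = [g.active for g in groups]
    if len(set(active)) != len(active):
        problems.append("two port groups share the same active uplink")
    unused = set(uplinks) - set(active)
    if unused:
        problems.append(f"uplinks never active: {sorted(unused)}")
    return problems


# The layout from the example: vmotion1 and vmotion2 mirror each other.
uplinks = ["vmnic0", "vmnic1"]
layout = [
    PortGroup("vmotion1", active="vmnic0", standby="vmnic1"),
    PortGroup("vmotion2", active="vmnic1", standby="vmnic0"),
]
print(check_multi_nic_vmotion(layout, uplinks))  # prints []
```

If both port groups were set to use vmnic0 as active, the check would report the shared active uplink and the idle vmnic1, which is the misconfiguration that leaves vMotion running over a single NIC.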
Here are some references to read about VMkernel ports:
Once completed, the private vMotion network will use multiple NICs for vMotion migration. With very busy Oracle databases, the rate of change in memory can make vMotion challenging to execute. The use of multiple VMK ports, dedicated NICs and a dedicated VLAN optimizes vMotion so that monster VMs supporting Oracle RAC nodes can be moved very quickly and with no data loss.
Watch for these vMotion times to improve, but our initial testing has shown:
- One VM (156 GB of memory) used 10 Gbps of bandwidth and completed in 81 seconds
- Two VMs (312 GB of memory in total) used 14 Gbps of bandwidth and completed in 121 seconds
- Three VMs (468 GB of memory in total) used 11 Gbps of bandwidth and completed in 173 seconds
The three non-vMotion networks share two NICs on each Cisco UCS blade and share one distributed vSwitch. Traffic was isolated using a VLAN for each of the three networks:
- Public Oracle application network
- OS and VMware management network
- Private Oracle RAC interconnect network
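To make the separation concrete, the four networks can be modeled as a small table and checked against the two isolation rules described in this post: every network has its own VLAN, and vMotion does not share its distributed vSwitch with anything else. This is a rough Python sketch only. The VLAN IDs and switch names below are placeholders chosen for illustration and are not the values used in the study.

```python
from collections import defaultdict

# VLAN IDs and dvSwitch names are placeholders, not values from the study.
NETWORKS = [
    {"name": "vMotion", "vlan": 10, "dvswitch": "dvs-vmotion"},
    {"name": "Oracle application (public)", "vlan": 20, "dvswitch": "dvs-shared"},
    {"name": "OS and VMware management", "vlan": 30, "dvswitch": "dvs-shared"},
    {"name": "Oracle RAC interconnect (private)", "vlan": 40, "dvswitch": "dvs-shared"},
]


def validate(networks):
    """Check the isolation rules; return a list of violations (empty if none)."""
    errors = []
    # Rule 1: traffic is isolated with one VLAN per network.
    vlans = [n["vlan"] for n in networks]
    if len(set(vlans)) != len(vlans):
        errors.append("each network needs its own VLAN")
    # Rule 2: the vMotion network has a distributed vSwitch to itself.
    by_switch = defaultdict(list)
    for n in networks:
        by_switch[n["dvswitch"]].append(n["name"])
    for switch, names in by_switch.items():
        if "vMotion" in names and len(names) > 1:
            errors.append(f"vMotion shares {switch} with other networks")
    return errors


print(validate(NETWORKS))  # prints []
```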
Presentation of NICs:
The Cisco UCS NICs are presented to the VMware hosts (ESXi hypervisor) as 20 GbE devices. The vmxnet3 vNICs from the hypervisor to the virtual machines are presented as 10 GbE devices.
The use of multi-NIC vMotion is a design guideline meant to improve the performance of vMotion for production databases. In a perfect world, DBAs and VMware administrators would rarely have to consider moving RAC nodes from one server to another. Architecting for the unplanned means reviewing the multi-NIC approach and incorporating it into the design in case you ever need to vMotion one or more RAC nodes. Thanks again for joining our session at VMworld, and look for more blogs from Oracle OpenWorld.