There are long-standing debates about the use of memory reservations in VMware vSphere clusters. Proponents argue that memory reservations are necessary to ensure that the memory of critical VMs is not impacted if the cluster comes under memory pressure. Opponents argue that reservations are unnecessary because they will manage memory overcommitment themselves and ensure the cluster never comes under memory pressure in the first place. (I find this argument hilarious; as we all know, no one actually monitors this.) They also argue that reservations make VMs harder to manage, since each one must be set individually, and that they make DRS (the Distributed Resource Scheduler) work harder and inflate the HA slot sizes, thus reducing the number of VMs that can be placed in a cluster.
What both sides of the “debate”, better described as a war, are missing is the benefit of memory reservations to an Oracle database. A well-tuned database will retrieve most of its I/O from the database buffer cache. DBAs call this metric the buffer cache hit ratio, and the best practice is to keep it above 90%. While memory access is much faster than disk access, memory access speeds can vary significantly depending on how the memory is accessed. For instance, Nehalem chips dramatically increased memory access speeds over pre-Nehalem chips because the Nehalem architecture localized memory management to individual sockets. When accessing memory attached to the same socket your process is running on, you can see access times as low as 10 nanoseconds, but accessing memory attached to another socket can take as long as 44 nanoseconds. A nanosecond is clearly a short amount of time, but when a query performs 100 million buffer gets, that difference becomes very noticeable. We have actually measured the impact on a typical OLTP system, and transaction times were about 15% faster with local memory access.
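To see why nanoseconds add up, here is a back-of-the-envelope sketch using the figures quoted above (roughly 10 ns local-socket access, 44 ns remote-socket access, and a query doing 100 million buffer gets); the numbers are the article's illustrative values, not a benchmark:

```python
# Aggregate memory-access time for a query doing 100 million buffer gets,
# using the local vs. remote NUMA latencies cited in the text.
LOCAL_NS = 10            # ~10 ns same-socket access
REMOTE_NS = 44           # ~44 ns cross-socket access
BUFFER_GETS = 100_000_000

local_total_s = LOCAL_NS * BUFFER_GETS / 1e9
remote_total_s = REMOTE_NS * BUFFER_GETS / 1e9

print(f"local-socket memory time:  {local_total_s:.1f} s")   # 1.0 s
print(f"remote-socket memory time: {remote_total_s:.1f} s")  # 4.4 s
print(f"extra wall-clock time:     {remote_total_s - local_total_s:.1f} s")
```

Three-plus seconds of pure memory latency on a single query is exactly the kind of difference that shows up in user-visible transaction times.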
Now let’s take a look at what memory reservations actually do to the memory on the underlying ESX server. When Oracle does a buffer cache lookup, it performs a linear scan of the database block buffers that live in the memory pages of the OS. When you implement hugepages, the buffer cache lookup has to scan fewer pages, which results in much faster response times. This same scenario carries through to the virtual layer, and, starting with vSphere 5.0, the ESX server also backs guest memory with large pages. However, because the ESX server's memory management MAY have to break those large pages back into 4 KB pages, it maintains hashed 4 KB addresses within the large pages. So while the memory actually sits in contiguous 2 MB pages, access to those pages goes through the 4 KB addresses. For many applications this does not matter, since their access is direct. For applications like Oracle or JVMs, where memory access is achieved by a linear scan of the addresses, this adds measurable overhead. In our testing it added 16% overhead to the actual memory lookup and as much as a 15% increase in user response times for a well-tuned OLTP database. A full memory reservation guarantees the VM's memory is never reclaimed, so ESX never needs to break the large pages apart, and that overhead goes away.
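The scan-length difference between 4 KB pages and 2 MB hugepages is easy to quantify. A quick sketch, using a hypothetical 32 GB SGA (the size is my illustrative assumption, not a figure from the text):

```python
# Page counts for a hypothetical 32 GB SGA: fewer, larger pages mean far
# fewer entries for a linear address scan (and far fewer TLB entries).
SGA_BYTES = 32 * 2**30    # hypothetical 32 GB SGA
SMALL_PAGE = 4 * 2**10    # standard 4 KB page
HUGE_PAGE = 2 * 2**20     # 2 MB hugepage

small_pages = SGA_BYTES // SMALL_PAGE
huge_pages = SGA_BYTES // HUGE_PAGE

print(f"4 KB pages needed: {small_pages:,}")               # 8,388,608
print(f"2 MB pages needed: {huge_pages:,}")                # 16,384
print(f"reduction factor:  {small_pages // huge_pages}x")  # 512x
```

A 512x reduction in the number of page entries to walk is why hugepages matter so much for Oracle, and why falling back to 4 KB addressing inside the hypervisor is measurable.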
So are memory reservations a benefit or a bane? I suppose it depends on your point of view, but for an Oracle database, the benefits far outweigh the small amount of additional setup required. As a concession to the VMware administrators, we can skip reservations for our development and test databases, since they don’t really require production-level response times. Also, an ESX cluster with reservations is much simpler to manage, and it is more effective to base HA admission control on a percentage of cluster resources rather than on slots.