
Everything Oracle at Dell EMC

50 Posts authored by: Oracle Heretic

The question of whether a given Oracle application is mission critical is not interesting.

As an attorney, I am frequently asked this question: "Is blah, blah, blah illegal?" To which I have a standard response: "The question of whether a given activity is illegal is uninteresting. A more interesting question is: What bad thing happens to you when you do blah, blah, blah?"

The question of whether a given application is mission critical is similarly uninteresting. A more interesting question is: "What bad thing happens to the business if the application fails?" And another interesting question: "How can we protect the business from the bad consequence of the application failing?"

In my experience, this varies dramatically based upon the nature of the application. For example, the failure of a typical Oracle application will result in severe consequences to the business. This is because of the nature of the beast: Oracle is typically used to manage the primary business data of the enterprise. Thus, loss of even a single Oracle transaction (say the trading instructions of a customer of a stock broker) would result in hard, severe legal consequences.

In this context (i.e. a traditional 2nd platform application like Oracle), concepts like backup, clustering, and remote replication all make perfect sense, and EMC has exceptional products to supply those needs.

A 3rd platform application is typically very different. Take MongoDB, an application with which I am fairly familiar. Mongo folks will consistently tell you: "You are going to lose some data. Get over it!" Thus, Mongo is not used for any purpose where transactional consistency is required. Usually, the customer will implement Mongo for an intermediate stage, scratchpad type of function.

Also, Mongo datastores are often astronomically large. (Petabytes are common.) It is simply not possible to back up something that big.

Further, Mongo implements sharding and replica sets, which together provide a geographically dispersed form of redundant replication. For this reason, the loss of a single Mongo server is simply uninteresting. No consequences occur at all, other than possibly a minor, temporary performance blip.
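To make the "loss of a single server is uninteresting" point concrete, here is a toy Python sketch of hash-based shard placement with replicas. It is illustrative only: the server names, shard counts, and MD5-based placement are my own assumptions, not MongoDB's actual mechanism.

```python
import hashlib

def shard_for(doc_id: str, num_shards: int) -> int:
    """Hash-based shard assignment, similar in spirit to a hashed shard key."""
    digest = hashlib.md5(doc_id.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Place 10,000 documents across 4 shards, each shard replicated on 3 servers.
NUM_SHARDS, REPLICAS = 4, 3
placement = {s: [f"server-{s}-{r}" for r in range(REPLICAS)]
             for s in range(NUM_SHARDS)}

docs = [f"doc-{i}" for i in range(10_000)]
by_shard = {s: [] for s in range(NUM_SHARDS)}
for d in docs:
    by_shard[shard_for(d, NUM_SHARDS)].append(d)

# Losing one server leaves every shard with surviving replicas, so no data is
# lost; only if ALL replicas of a shard fail does that slice of the data vanish.
lost_server = "server-2-0"
surviving = {s: [srv for srv in replicas if srv != lost_server]
             for s, replicas in placement.items()}
assert all(len(replicas) >= 1 for replicas in surviving.values())
```

Even in the worst case (every replica of one shard gone), only about 1/N of the data is affected, which is why Mongo shops can shrug off individual server failures.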

For these reasons, clustering, backup and remote replication are not very interesting for a 3rd platform application like Mongo (although there is some variability in that).

And therein lies the challenge for a company like EMC, which has traditionally dominated in the mission-critical 2nd platform types of applications, similar to Oracle. But then again, EMC has a long and storied tradition of reinvention. I have no doubt that EMC will eventually become one of the dominant players in the 3rd platform.

Virtual Storage Zone (@cincystorage): #Oracle performance on #EMC #XtremIO.

Mark May (Virtual Storage Zone) recently posted a very nice piece on XtremIO performance with Oracle. Here is an excerpt:


XtremIO is EMC’s all-flash scale-out storage array designed to deliver the full performance of flash. The array is designed for 4K random I/O, low latency, inline data reduction, and even distribution of data blocks. This even distribution of data blocks leads to maximum performance and minimal flash wear. You can find all sorts of information on the architecture of the array, but I haven’t seen much talking about achieving maximum performance from an Oracle database on XtremIO.


The nature of XtremIO ensures that any Oracle workload (OLTP, DSS, or hybrid) will have high performance and low latency; however, we can maximize performance with some configuration options. Most of what I’ll be talking about concerns RAC and ASM on Red Hat Linux 6.x in a Fibre Channel storage area network.

#EMC IT: Best practices for virtualizing your #Oracle database – #VMware datastores.




Darryl B. Smith with EMC IT recently published a great new blog post called Best Practices for Virtualizing Your Oracle Database – Datastores, which is part 4 of a series. Here is an excerpt:


First off, my apologies for delaying the last part of this four part blog for so long.  I have been building a fully automated application platform as a service product for EMC IT to allow us to deploy entire infrastructure stacks in minutes – all fully wired, protected and monitored, but that topic is for another blog.


In my last post, Best Practices For Virtualizing Your Oracle Database With VMware, the best practices were all about the virtual machine itself. This post will focus on VMware’s virtual storage layer, called a datastore. A datastore is storage mapped to the physical ESX servers that a VM’s LUNs, or disks, are provisioned onto. This is a critical component of any virtual database deployment, as it is where the database files reside. It is also a silent killer of performance, because there are no metrics that will tell you that you have a problem, just unexplained high I/O latencies.

Anyway, I love Darryl's stuff and highly recommend this blog post. Enjoy.


#VMworld 2014 public voting is now open. Please vote for your favorite sessions. #EMC #EMCElect

VMworld 2014 looms, and the public voting for technical sessions is now open.


The following list includes EMC-sponsored sessions which are of interest to an Oracle audience:



There are a number of other EMC-sponsored sessions relating to Microsoft SQL Server and SAP HANA which are interesting, at least to me:


  • Session 1701 A Flash-Optimized Reference Architecture for Consolidating and Building a High Performance Virtualized Microsoft SQL Server Infrastructure on VMware
  • Session 2309 Reduce Your Business Risks and Lower Deployment Costs by Virtualizing SAP HANA
  • Session 1328 Choosing the Appropriate Storage Solutions for Microsoft Tier-1 Applications


Those interested in the Hadoop / OpenStack stuff might also enjoy these:


  • Session 1397 Benefits of Virtualizing Hadoop
  • Session 1314 Scaling Your Storage Architecture for Big Data - How the Isilon Server Fits Well with Virtual Hadoop Workloads


Public voting closes this Monday, i.e. May 12, at 5 P.M. PDT. Please vote early and vote often. Thanks!


Soooo, what's a converged infrastructure, exactly? #EMC #EMCElect

I have been looking at a lot of material relating to the notion of a converged infrastructure. The essential idea is this: the legacy platform, i.e. a single-image, physically booted server, typically attached to a SAN, is hopelessly outdated. This design is essentially 30 years old, and does not take into account many interesting recent technology changes, notably:


  • Virtualization
  • Deduplication
  • Flash storage, including both PCIe-based flash and SSDs


Thus, the converged infrastructure platform looks dramatically different from the legacy platform. Essentially, a converged infrastructure consists of a large cluster which combines storage and compute into a single layer. A clustered I/O subsystem is extended across all of the nodes in this cluster, and all I/O to this shared storage pool is reflected on all nodes. This is implemented either as block or as file (NFS) storage, depending on the vendor involved. All of the storage hardware is direct-attached; no SAN, in other words. Thus, a converged infrastructure is designed for data center consolidation, much like the virtualization hypervisor market. Some of the converged infrastructure vendors run natively on a hypervisor, and some do not. More on this later.
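As a rough mental model of that shared, reflected storage pool, here is a toy Python sketch. It assumes full mirroring of every block to every node purely for simplicity; real converged products distribute and replicate data rather than copying everything everywhere.

```python
class ConvergedNode:
    """Toy model of a converged node: local compute plus a slice of the shared pool."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}  # this node's view of the clustered I/O subsystem

class ConvergedCluster:
    """Every write to the shared pool is reflected on all nodes (fully mirrored
    here for illustration; real systems replicate a subset of the data)."""
    def __init__(self, node_names):
        self.nodes = [ConvergedNode(n) for n in node_names]

    def write(self, block_id, data):
        # The clustered I/O subsystem reflects the write to every node.
        for node in self.nodes:
            node.blocks[block_id] = data

    def read(self, block_id, from_node=0):
        # Any node can serve the read from its copy of the shared pool.
        return self.nodes[from_node].blocks[block_id]

cluster = ConvergedCluster(["node-a", "node-b", "node-c"])
cluster.write("lun0:blk42", b"oracle datafile block")
assert cluster.read("lun0:blk42", from_node=2) == b"oracle datafile block"
```

The point of the model: compute and storage live in the same layer, and a VM can read its data locally on whichever node it happens to run.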


The converged infrastructure market is crowded, that's for sure. It would be difficult in a blog like this to cover all of them, so I will focus on three: Nutanix, SimpliVity, and ScaleIO.



And, of course, my focus is on the Oracle space. As usual, I think in terms of running Oracle as efficiently and inexpensively as possible. Also, in terms of hypervisors, I will only address VMware. (I have no technical exposure to Hyper-V or KVM.)


Starting with Nutanix, the architecture is best described in the Nutanix Bible by Steven Poitras. I have slightly reworked one of his graphics to reflect my bias, again Oracle on VMware:


[Figure: Nutanix architecture]


Essentially, all of Nutanix's IP runs in a VM, referred to as the CVM. The hypervisor and the CVM boot off of a small partition. Then the CVM connects to the other nodes in the Nutanix cluster and assembles a clustered file system, which is published as an NFS export. The hypervisor then mounts this export as an NFS datastore. From there, all user-space VMs (including Oracle VMs) are booted off of .vmdk files on the NFS datastore. All block-level I/O (other than boot) is handled by the CVM, which bypasses the virtualized controller and uses the native LSI driver with direct path I/O. (Thanks to Michael Webster for his correction on this point.)


Again, this is from the VMware perspective. The architectures for other hypervisors are different. But I digress.


Moving to SimpliVity, my source for their technical material is the OmniCube Technical Deep Dive. (At 10 pages, this deep dive is not quite as deep as I would prefer, for sure.) The SimpliVity architecture is very similar to Nutanix's, in all respects except one: SimpliVity adds a piece of hardware which they call the OmniCube Accelerator Card (OAC). Otherwise, the architecture diagram for SimpliVity looks just like Nutanix's:


[Figure: SimpliVity architecture]


Again, all of SimpliVity's IP runs in the OVC VM, other than the OAC itself, of course. The OAC is a custom-built PCIe card, which among other things acts as the I/O controller. Like Nutanix, the OVC exports an NFS mount, which ESXi mounts as an NFS datastore. From there, all user space I/O, including Oracle, runs through the .vmdk layer within the hypervisor.


Now, looking at ScaleIO, the architecture is dramatically different from either Nutanix or SimpliVity. First of all, Nutanix and SimpliVity are both custom-built hardware platforms. ScaleIO is a piece of software. It is designed to be layered on top of a normal, legacy platform, and to provide a slick, easy path to a converged infrastructure. Specifically, ScaleIO does not require the use of a hypervisor, and thus can run in a physically booted context. This is one of ScaleIO's main advantages over hypervisor-based converged platforms like Nutanix and SimpliVity.


ScaleIO consists of two major components: the ScaleIO Data Client (SDC) and the ScaleIO Data Server (SDS). In a Linux context (again, the only OS I care about deeply), both the SDS and the SDC are implemented as kernel-loadable modules, similar to device drivers. The SDS manages the local storage hardware, connects with the other nodes in the ScaleIO cluster, performs cache coherency, and so on. The SDC then connects to the SDS, which appears to it as a SCSI target. Thus, the SDS publishes storage objects to the SDC, which the local OS sees as normal SCSI LUNs. From there, the local OS simply performs normal I/O.
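Here is a toy Python sketch of that SDC/SDS split, with the client presenting one LUN whose blocks are striped across the server nodes. The class names mirror the components above, but the round-robin striping scheme and 512-byte block size are my own simplifications, not ScaleIO's actual layout.

```python
class SDS:
    """Toy Data Server: owns the local disks and serves block requests."""
    def __init__(self, name, num_blocks):
        self.name = name
        self.disk = bytearray(num_blocks * 512)  # 512-byte blocks

    def read(self, lba):
        return bytes(self.disk[lba * 512:(lba + 1) * 512])

    def write(self, lba, data):
        self.disk[lba * 512:(lba + 1) * 512] = data.ljust(512, b"\0")

class SDC:
    """Toy Data Client: presents the cluster as one SCSI-like LUN,
    striding logical blocks round-robin across the SDS nodes."""
    def __init__(self, servers):
        self.servers = servers

    def _locate(self, lba):
        # Map a LUN-level LBA to (owning server, server-local LBA).
        return self.servers[lba % len(self.servers)], lba // len(self.servers)

    def read(self, lba):
        sds, local = self._locate(lba)
        return sds.read(local)

    def write(self, lba, data):
        sds, local = self._locate(lba)
        sds.write(local, data)

cluster = [SDS(f"sds-{i}", 1024) for i in range(3)]
lun = SDC(cluster)
lun.write(7, b"asm diskgroup block")
assert lun.read(7).rstrip(b"\0") == b"asm diskgroup block"
```

The local OS sees only the SDC's LUN; the fan-out across SDS nodes happens below the SCSI abstraction, which is exactly why the client-side code path stays short.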


The stack diagram for a physically booted ScaleIO cluster node could not be simpler:


[Figure: ScaleIO physical architecture]


ScaleIO can also be run in a virtualized context. In this case, predictably, ScaleIO looks very similar to Nutanix or SimpliVity, in that it has a controller VM as well, called the ScaleIO VM (SVM). This SVM runs both the SDS and the SDC. All I/O is channeled through the SVM. However, everything in ScaleIO is implemented in a block, rather than file, manner. Thus, the ESXi hypervisor sees an iSCSI target, which it converts into a VMFS file system, rather than using NFS. (The SVM provides the iSCSI target for this purpose.)


Here is how ScaleIO looks in a virtualized configuration:


[Figure: ScaleIO virtualized architecture]


The other interesting thing about ScaleIO is that it allows you to run in either a converged or a diverged manner. Since the client (SDC) and server (SDS) are separate components, you can run them on separate hardware, effectively turning the ScaleIO server cluster into a storage array. See the following graphic for an example (thanks to Kevin Closson for this):




Of course, you can also run ScaleIO in a converged manner, in which case the platform looks very much like Nutanix or SimpliVity (with the exceptions noted).


Now, looking at each of these architectures in the context of running Oracle, it appears that ScaleIO has the obvious edge. This is because:


  • Both Nutanix and SimpliVity require you to virtualize in order to run on their platform. ScaleiO does not. Even the most ardent proponent of virtualizing Oracle (and I would certainly qualify on that score) would want to maintain the option to run on bare metal if necessary. Adopting a platform which requires the customer to virtualize all Oracle workloads is not going to work for many Oracle customers.
  • The use of a VMware NFS datastore as the primary container for Oracle storage is problematic. While I was with the EMC Global Solutions Organization, we tested VMware ESXi-mounted NFS datastores for Oracle datafiles. This configuration had a huge performance impact relative to either normal NFS (i.e. directly mounted on the guest OS, using the Oracle Direct NFS client) or block-based VMFS using conventional SAN storage. There is no reason this would be any different in a converged context. Think about it: the code path for an I/O against an Oracle ASM diskgroup stored on a .vmdk file which is, in turn, on an NFS datastore is enormously long. Compare that to ScaleIO, especially in a physically booted context, where the I/O code path is no longer than that of a normal device driver!
  • The diverged approach for ScaleIO is arguably custom-built for Oracle. I have made a pretty good living for the last (approximately) 17 years of my life by understanding one thing: Oracle is expensive, and therefore the Oracle-licensed CPU is the single most expensive piece of hardware in the entire configuration. Offloading any workload from the Oracle-licensed CPU onto a non-Oracle-licensed CPU is typically a very profitable decision. (Arguably, this is why Oracle database servers adopted storage array technology so widely: by offloading utility operations like snaps, test/dev cloning, data warehouse staging, backup, and the like onto a storage array, the customer preserves the precious Oracle-licensed CPU to do what it does best: run queries.) Both Nutanix and SimpliVity require the customer to run in a converged manner, thus using the Oracle-licensed CPU (on the ESXi host in this case) to run storage I/O operations. That is wasteful of that precious Oracle-licensed CPU. Thus, it is entirely possible that a fully converged infrastructure is a poor fit for Oracle, because of simple economics. By enabling a diverged configuration (i.e. looking more like a traditional SAN storage array), ScaleIO neatly optimizes the Oracle CPU while preserving many of the advantages of a converged platform.
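The code-path argument above can be made semi-quantitative with a toy latency model. Every per-layer number below is hypothetical, chosen only to illustrate that a path with more layers has a higher latency floor than a short, driver-like path; none of these figures are measured values.

```python
# Hypothetical per-layer overheads in microseconds, purely for illustration.
NFS_DATASTORE_PATH = {
    "guest ASM / block layer": 10,
    ".vmdk emulation": 15,
    "ESXi NFS client": 25,
    "network hop to controller VM": 50,
    "controller VM filesystem": 20,
    "physical disk/flash": 100,
}
DRIVER_LIKE_PATH = {
    "ASM / block layer": 10,
    "client kernel module (e.g. an SDC)": 5,
    "network hop to storage server": 50,
    "physical disk/flash": 100,
}

nfs_latency = sum(NFS_DATASTORE_PATH.values())
direct_latency = sum(DRIVER_LIKE_PATH.values())

# Fewer layers means a lower floor on per-I/O latency, whatever the exact numbers.
assert direct_latency < nfs_latency
print(f"NFS-datastore path: {nfs_latency} us across {len(NFS_DATASTORE_PATH)} layers")
print(f"Driver-like path:   {direct_latency} us across {len(DRIVER_LIKE_PATH)} layers")
```

The exact magnitudes will vary wildly by hardware and tuning; the structural point is simply that each extra layer adds latency the short path never pays.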


Of course, it remains to be seen how this all pans out. It's early. For now, though, ScaleIO is looking good to me.

How to backup, virtualize, protect, etc., your Oracle Database

Recorded #Oracle-related technical sessions from #EMCWorld 2014. #EMCElect

The following recorded Oracle-related technical sessions are live on the virtual EMC World 2014 website:


OOW 2014 San Francisco

Will you be there?

#Oracle OpenWorld 2014 call for papers is now open. #EMC #EMCElect

The Oracle OpenWorld 2014 call for papers is now open. The deadline to submit is April 15, 2014.


As usual, I will submit this year. Here is what I am thinking about submitting, which you are free to use as an example:


Session Title:

Near-Realtime, Cross-Platform Oracle Migration: RISC / UNIX to x86 / Linux



The vast majority of RISC / UNIX Oracle database servers have been migrated to x86 / Linux. Cold migration is simple and easy, but requires a lengthy downtime. All of the remaining RISC / UNIX database servers are enterprise-critical, with hard uptime requirements in their SLAs. Migrating these database servers can be one of the most challenging tasks a DBA faces.



This session will explore in detail the technologies and strategies that enable near-realtime migration of RISC / UNIX database servers onto x86 / Linux. This includes:




  • Oracle GoldenGate
  • Dell / Quest SharePlex
  • Informatica Data Replicator
  • RMAN incremental backup with cross platform conversion
  • Data Guard


Hope to see you there.

The Complexity Challenge

The Data Volume Challenge

Turning the black art of #Oracle performance tuning into a science. #EMC #EMCElect

This content was contributed by Brian Lomasky with DBA Solutions, Inc. Enjoy.


Customers frequently deploy a database application on an infrastructure using hardware and software from multiple vendors.


After the application is deployed, it may not perform well enough to meet the business's requirements.  Reports might be taking too long to run.  Service level agreements might be missed.


When this happens, one of two things usually occurs:


  1. The customer blames a random vendor (typically with no factual data to back up the complaint). They just need someone to blame, and they usually pick on the storage vendor.
  2. The customer asks one or more of their vendors for help in resolving the problem.


The above happens because performance tuning is usually considered a "black art", and the staff typically has neither the tools, the time, the knowledge, nor the experience to locate and resolve the bottlenecks that inhibit optimal performance.


As slow systems cost a company a great deal of money, it is critical that the causes of poor performance are quickly identified and resolved.


Performance Tuning Challenges:

There are two critical challenges that must be solved for performance tuning to succeed:


The Complexity Challenge

Due to the multi-vendor hardware/software infrastructure components, it is almost impossible to obtain technical expertise for a customer's entire database infrastructure.


Using resources that only understand a specific component of the customer's infrastructure typically results in "finger-pointing", in which a different vendor (or company department) is blamed for the sub-optimum performance.


As performance bottlenecks or scalability limitations may exist anywhere throughout a customer's infrastructure, an in-depth knowledge of the internal architecture for all infrastructure components is necessary to be able to provide root-cause analysis for any bottlenecks.


Due to this, the "standard" is to invite each one of your vendors to a meeting, with the "marching orders" of "Can everyone please work together to help solve our performance problem?" Since vendors can't refuse their customer's request to attend such a meeting (otherwise, they won't have a customer for too long...), they usually send whoever is "on the bench" in the local sales office.  This means that there is usually a group of "warm bodies" attending the meeting.  Normally, none of them have the tools, time, knowledge, or experience to be able to diagnose the root cause of the customer's database infrastructure problem.


The Data Volume Challenge

Each infrastructure component (database server, operating system, logical volume manager, filesystem, storage subsystem, database, application, and network) has a very large number of different statistics that can be measured.


As there are no tools and no time to measure everything, common industry practice is to guess-timate where a bottleneck might exist, and then take a specific measurement in order to validate the guess.


If the measurement shows nothing, then the guessing process starts all over again. If the measurement shows a bottleneck, then time and money are committed to resolving that specific bottleneck. But frequently the "wrong problem" gets fixed, because it is not the root cause of the performance bottleneck, and the customer is still left with a poorly performing system. Unfortunately, this random guessing wastes a tremendous amount of time and money, and rarely locates the root cause of the performance problem. If the vendors can't locate a problem in their specific area, the outcome of the meeting is usually "finger-pointing", in which the blame for the poor performance is deflected onto some other person or vendor. They realize that they can't help the customer, so the only thing they can do is point fingers and blame someone else.


Unfortunately, this is the "standard" in the industry.


Based upon the dismal failure of such a multi-vendor meeting, what many customers do as an alternative is to buy some "random hardware" from a "random vendor", praying that the "new and improved" hardware will fix the problem. However, there are several problems with this approach:


  • This is the most expensive solution for a customer to implement (physical and operating costs as well as software license(s) and maintenance fees!)
  • This method treats the symptoms, but not the root cause of the problem.
  • At best, buying more hardware is a "temporary band aid", as it does not treat the root cause(s).  Therefore, the performance bottleneck may reoccur in a short period of time.
  • "Prayer" should not be the methodology to resolve a performance bottleneck.
  • Even if a company has the budget to buy more hardware, wouldn't it be best for the company to know what SPECIFIC hardware is needed to provide additional bandwidth for a performance problem, rather than just buying some "random" piece of hardware?
  • Frequently, the fix for the root cause(s) of a performance problem involves nothing more than parameter changes and/or application tuning - Both of which cost a company nothing to implement (other than a short amount of time).


So up to this point, I've painted a pretty dismal picture about the state of performance tuning.


Solving the Performance Tuning Challenges:

The first half of the solution is to use a resource that has expertise across the entire database infrastructure: server, operating system, logical volume manager, filesystem, storage subsystem, database, application, and network, as only a true infrastructure-wide analysis eliminates all "finger-pointing"!


The second half of the solution is to measure every configuration and performance statistic across all database infrastructure components!


This unique methodology provides a holistic root cause analysis across the entire database infrastructure.  Nothing is missed, and there is no guessing or finger-pointing. Guaranteed results are provided in no more than 4 days, with management seeing an immediate ROI.

Is Joe Tucci Into Cloud Computing?

Dinosaurs on the Savanna

Interesting @WSJ article compares #EMC's margins to #Oracle's. #EMCElect

An interesting Wall Street Journal article published this month shows the following comparison between EMC's margins and those of other IT companies, including Oracle:



I was shocked at how high the Oracle margins are, actually. But nobody beats VMware, which is, of course, largely owned by EMC.

Which Has Higher ROI?

Which is More Flexible?

#Oracle 12c multitenancy vs. OS-level virtualization: Which has the higher ROI? #EMC #EMCElect


I loved this post from Allan Robertson on the Oracle Community DL today, and had to post it here:


One of the supposed benefits of Oracle multitenancy and PDBs is isolation. In my opinion, Oracle multitenant isolation does not come out well against a virtualized solution because:


  • Memory and processes are shared and managed at the container (CDB) level.
  • Shared online and archived redo – ARCHIVELOG mode is set at the CDB level, not the PDB level, because the redo log files are shared. Losing an online redo log that is active or current means data loss and incomplete recovery.
  • There is one active undo tablespace for a single-instance CDB – possible performance issues; corruption or loss could mean a crash, possible data loss, and recovery.
  • Shared controlfile, spfile, and password file – loss of the controlfile means an outage and various recovery scenarios.
  • An outage on the SYSTEM tablespace of a PDB requires a full database outage for RMAN recovery.
  • Oracle Database version – all PDBs must run a common version.
  • Resource management – relying on Oracle Resource Manager to control compute covers, effectively, only CPU and parallel server processes. There is no partitioning of SGA/PGA among PDBs (RAC affinitization can help, at more cost), and no control of file I/O except on Exadata.


Other Issues to consider:


  • PROCESSOR_GROUP_NAME, to bind the database instance to a named subset of the CPUs on the server in Linux and Solaris.
  • License – Needs EE and Multi-tenancy and possibly RAC too for HA and memory management via node affinity.
  • Cloning a PDB with a “Snapshot Copy” is supported under ACFS or dNFS, not direct ASM.


I also loved the comment from Kevin Closson:


  • Tenants in an apartment complex share a stairwell, not a single toilet. PDBs share redo. Mental picture? (See the theme graphic, above.)


A couple of other posts on the subject:



How Do We Get to the Third Platform?

Moving Our Eggs to a New Basket

#EMC's journey to the 3rd platform - an #Oracle centric view. #EMCElect

Josh Kahn recently updated his PPT on the EMC story, laying out EMC's overall strategy at this moment in history. This deck contains the following graphic:




I think this defines the state we are in right now very well: We are collectively transitioning off of the 2nd platform (the traditional computing environment we are all very familiar with, on which I type this right now), and onto the 3rd platform, i.e. mobile devices. Here is the challenge we face right now: Technology will eventually become so cheap and so portable that it will be included in every single device we use. That in turn will result in an enormous proliferation of data, which is worthless in isolation, but quite valuable in toto.


So where does that leave EMC? At the moment, we have many of our eggs very thoroughly in the transactional / 2nd platform basket. Here's another slide from Josh's PPT:




On the left, we have the 2nd platform, of which Oracle is a huge part. (Arguably, Oracle is still the dominant transactional workload, although that is being seriously challenged by the likes of SAP with the HANA and Sybase plays.) A huge percentage of EMC's enterprise-class array sales (for which you can read VMAX) are tied to a transactional workload (especially Oracle) in some way. That's why you see the huge numbers on the left under 2nd platform.


The growth is in the 3rd platform though.


The challenge for EMC is to preserve our dominant position in persistent data storage on the 2nd platform, while penetrating and dominating the 3rd platform. We need to move our eggs to the newer basket, in other words. It's going to be interesting for a while!


Performance for Oracle

New study by #ESG showcases #Vblock for high performance database servers, including #Oracle. #EMCElect


A recent study by ESG contains very interesting performance results using the new Vblock Specialized System for High Performance Database. This study was performed using SLOB. Here are the highlights:


  • After XtremSF was enabled, the database server was able to sustain 4,000,000+ read IOPS
  • Without XtremSF, read IOPS were 840,000.
  • The system sustained 31.8 GB/sec read throughput during full table scans
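Simple arithmetic on those two IOPS figures gives a sense of the scale of the improvement (taking the "4,000,000+" result at its floor value):

```python
# Back-of-envelope math on the ESG figures quoted above (illustrative only).
iops_with_xtremsf = 4_000_000   # "4,000,000+ read IOPS" with XtremSF enabled
iops_without = 840_000          # read IOPS without XtremSF
speedup = iops_with_xtremsf / iops_without
print(f"Server-side flash improved read IOPS by roughly {speedup:.1f}x")
```

In other words, enabling the server-side flash cache bought nearly a 5x improvement in read IOPS on this configuration.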


All in all, a very respectable result in my view. Here is a summary of the configuration:


  • Database Server (x8): UCS C240; 2 x 8-core Intel Xeon E5-2690 (2.9 GHz); 768 GB RAM; 2 x dual-port 16 Gb FC HBA; 2 x 10 GbE; EMC XtremSF: 2 x 700 GB SLC; XtremCache 2.0.1
  • Application Server (x8): UCS C220 M3; 2 x 8-core E5-2690 (2.9 GHz); 256 GB RAM; 2 x 10 GbE
  • Management Server (x2): UCS C220 M3
  • SAN Switch (x2): Cisco MDS 9710
  • Ethernet Switch (x2): Cisco Nexus 5548UP; 2 x 6248 Fabric Interconnect; 2 x 2232PP Fabric Extender
  • Storage Array: 8 engines, each with 32 x EFD (100 GB, RAID 5 3+1), 48 x FC (300 GB, RAID 1, 10K), and 56 x SATA (1 TB, RAID 6 6+2)



I have been intrigued by the transition from TPC-C (using tools like SwingBench or Benchmark Factory for Databases) to SLOB in EMC's Oracle database performance testing. I must admit that I resisted this at first (I am an old TPC-C guy from back in the day, and have done my share of EMC Proven Solutions using primarily BMF), but I am beginning to warm up to the idea.


I had a brief discussion with Kevin Closson, the author of SLOB, on this issue. One of the reasons I think SLOB is more interesting than TPC-C is the I/O-oriented nature of SLOB. As Matt Kaberlein pointed out in one of his recent PPTs, with more modern CPUs (such as the Intel Xeon E5-2690 8-core CPU used in this testing), the bottleneck is now primarily on IOPS, rather than on the front-side bus. As a result, a benchmark like SLOB has a better chance of stressing the system in a manner similar to what the customer experiences in actual use.


NoSQL vs. Oracle

Will the True Database Please Stand Up?


#NoSQL vs. #Oracle: Will the true database please stand up? #EMCElect

As many of my readers know, I am an old-school Oracle DBA, from back in the day. vi, grep, awk, perl, and all that.


So it is perhaps a surprise to many that I have embraced the NoSQL stuff as much as I have. Admittedly, I am woefully behind on my MongoDB DBA training, but I continue to burn through the material. After working with the product for a while, the general idea begins to emerge.


In other words, I am beginning to get it. The light is coming on. Dawn approaches. Choose your cliche.


Here is what I see: NoSQL database-like software packages such as MongoDB are optimized to manage huge masses of data about which the customer cares passionately, but only as to the general shape and form of the data. Individual pieces of data can fall on the floor occasionally, no problem. The majority of the data is good, and the amount of data that can be managed is enormous.


Traditional databases are great for managing relatively modest amounts of data (in comparison to NoSQL, certainly), about which the customer cares passionately. And I mean about each and every individual piece of data.


That's it. The basic difference between the products is one of philosophy: Shall I optimize for absolute integrity of every individual piece of data (at a huge cost in terms of performance and scalability), or shall I optimize for performance and scalability (at, admittedly, a big cost in terms of integrity of individual data)?


It is similar to what I have read about in the VDI world, referred to as the Great Persistence Debate. One side argues for desktops that persist and can be individualized. The other side argues for generic desktops that reinitialize. Now ask yourself: Is either side right? Is either side wrong? Clearly, there are usage cases where persistence of VDs makes sense. There are also usage cases where a relational database makes sense (although those are becoming scarce these days).


Organizations like MongoDB have staked out an area of persistent data management software which is orthogonal to, and in many ways synergistic with, relational.


Like I keep saying, a usage case always helps. I know a company here in Raleigh that makes smart meters. Each of these meters spawns 25 measurements a day, only one of which absolutely must be saved. The rest are optional, but very helpful to the utilities in planning their load during the day.


To manage these meters, the company performs an operation called a "read". From the perspective of the meters, the customer's system pulls the measurement data. From the perspective of the customer's data center, the "read" operation is an insert hit of a few hundred gigabytes.


Now, clearly there are two kinds of data going on here: One is canonical, business data. That's the one measurement that must be saved from each meter. The remaining measurements are a completely different type of data: Much higher in volume, but only interesting in aggregate. Any individual measurement is not terribly material to the overall problem.


That cries out for two different types of approaches to managing the data. Wouldn't you agree?
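The two-tier approach can be sketched in a few lines of Python. Everything here is hypothetical and illustrative only: `MeterRead` and the two store lists simply stand in for an Oracle-style system of record and a Mongo-style bulk store.

```python
# Hypothetical sketch of routing smart-meter reads to two stores.
# MeterRead, relational_store, and document_store are made-up names,
# not from any real product.
from dataclasses import dataclass

@dataclass
class MeterRead:
    meter_id: str
    timestamp: str
    kwh: float
    canonical: bool  # True for the one reading per day that must never be lost

relational_store = []  # stand-in for the transactional (Oracle-style) system
document_store = []    # stand-in for the NoSQL (Mongo-style) bulk store

def ingest(read: MeterRead) -> None:
    if read.canonical:
        relational_store.append(read)  # canonical business data: keep forever
    document_store.append(read)        # bulk data: interesting only in aggregate

# One meter's 25 daily measurements; only the first is canonical.
reads = [MeterRead("m-1", f"2013-10-01T{h:02d}:00", 0.9 + 0.01 * h, h == 0)
         for h in range(25)]
for r in reads:
    ingest(r)

print(len(relational_store), len(document_store))  # 1 25
```

The point of the sketch is the split itself: the one canonical reading takes the expensive, fully consistent path, while the other twenty-four take the cheap, high-volume path.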

Everything Oracle @ EMC Website Redesign


I was MBO'ed this quarter to fundamentally redesign the EO community. A major challenge, considering this comes on the heels of a prior basic navigational design which we kicked off in Q2. No problem.


For the moment, I have kept the basic design of the navigational tool, which looks like this:


overview nav.png


I would readily admit that this structure is not perfect. But we will revisit this area in Q114. The thing I saw when I waded into Omniture was that the top pages visited after hitting our site are the so-called "landing pages", e.g. Why EMC? These pages used to look like this:


landing page - old.png

Very boring. This is not the most inviting page I have ever created, that's for sure! My goal in creating a page like this was to avoid work on our part, because the page is completely data-driven (each of the links is a so-called "saved search", leveraging categories within Jive). While this is a laudable goal, it must take a back seat to the overall attractiveness of the site. Therefore, I decided to dress up these pages. The new Why EMC? page looks like this:

landing page - new.png

Much more attractive, obviously. But more labor intensive. All of our saved searches are still there (see them in the left column), but we have added attractive teasers to relevant documents. This is us effectively being a "smart search" utility. We are betting here that the viewer of this page will click through on one of our teasers. To increase our chances, we need to make the teasers the most attractive, relevant documents possible.


The additional work comes from needing to update these teasers. As we publish additional content, we will need to constantly tweak these pages to make sure they have the most up-to-date content. Think of this as Featured Content, only within a particular category, and all by hand.


I am also trying to establish a consistent "look" for documents on our site. I am faking a two-column layout using tables. I have built both a landing-page template and a document template to make this fairly easy. This blog post is a good example: the Twitter content and other related items go in the left column, and the main content goes in the right column. The header contains a theme graphic and a title. I use floating graphics heavily to make the look work.


That's it for this quarter. Future work to be done next quarter:


  • Train-the-webmaster training. I want to get the rest of the team productive in maintaining the landing pages, publishing content using the new templates, etc.
  • Re-do the navigational widget on the main Overview page, to conform to the plays showcased in the recently-published playbooks.
  • Continue to refine the look-and-feel of the site, with the input of the team. Remember, all of this is just iterative prototyping at this point. We can tweak this going forward. I am open to any design ideas from any of the other members of the team. Let's work together on this!

My good friend and teammate, Peter Herdman-Grant, and I are taking an online class together in MongoDB Administration. Both Peter and I have come to the same conclusion: It is time for us old-style Oracle DBAs to start paying serious attention to the NoSQL crowd. No kidding. Oracle is facing serious challenges from their NoSQL rivals, as this recent article in DB Engines indicates.


So, both Peter and I figured out that MongoDB is the obvious current favorite in the NoSQL race (although it is certainly too early to call the winner!). When Peter proposed this class, I was certainly game! I burned through a bit of the class tonight, getting up to the installation modules, which brings me to the point of this blog post: installation.


I think installation is kind of where the product really lives. If you can install it, you can figure out the rest, most of the time. Also, installing a product really tells you a lot about the culture of the company that created that piece of software.


Earlier this year, I attempted to install Oracle RAC 11g Release 2 on a 64-bit Linux system. I was not successful at that task, I will readily admit. There were tremendous time pressures. Two years ago, when that was what I was doing all the time, I could have installed Oracle RAC 10g Release 2 in my sleep.


But that was before the nature of my job shifted, and has now become much more social-networking focused. But I digress.


I have recently installed Oracle Database 12c on a Linux system, with fairly good success. I did have to burn through a great deal of documentation to do it, though. It's not like you can simply download the DVD image, mount it, and burn through the installation program. Nope, Oracle requires that you go through a great deal of trouble before you get to that point. Otherwise, your success is not only not guaranteed: it is remote.


I used to think the opposite of Oracle's installation culture was Microsoft SQL Server. I have installed SQL Server many times, and it is, sure enough, a case of pop the DVD, and click next, next, next: Done!


But MongoDB is seriously cool. They took the installation program idea, and simply threw it away. All you do is download a zip file (or tgz on a Linux box), unpack it, and then put the resulting files and directories anywhere you like. At that point you can run MongoDB from the command line.
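For the Linux case, the whole thing looks roughly like this. The version number and URL below are illustrative only; check mongodb.org for the current release for your platform.

```shell
# Illustrative sketch of the tarball route; version and URL are examples,
# not necessarily current.
curl -O http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.4.6.tgz
tar xzf mongodb-linux-x86_64-2.4.6.tgz

# mongod defaults to /data/db; any writable directory works.
mkdir -p ~/mongo-data

# Run the server straight from the unpacked directory. No installer.
cd mongodb-linux-x86_64-2.4.6
./bin/mongod --dbpath ~/mongo-data
```

No prerequisites document, no kernel parameters, no response files: unpack and run.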


I did this tonight. Took about 2 minutes. SQL Server: You've met your match.
