As All Flash Arrays begin to accelerate their expanding footprint and mindshare in corporate data centers, a large percentage of customers are asking about the value of placing Exchange databases onto EMC's Scale-Out All Flash Array -- XtremIO. This is an interesting, paradoxical scenario. At first blush, you might believe that Exchange needs "lots of capacity" and, at the same time, that the latest versions (2010, 2013, and 2016) have relatively "low performance" requirements compared to earlier versions. In fact, when running in Cached Mode (connections made and managed via Client Access Servers), the IO requirements seem to be downright unimportant. Why would customers be interested in an all-flash storage option for Exchange? The better question is, how can we help them understand the value of using EMC's Scale-Out All Flash Array to support their Exchange implementation?
In this blog post, I will share:
- The state of the On-Premises Exchange environment -- challenges, issues, considerations
- IDC's analysis of All Flash Arrays in the Messaging & Collaboration marketplace
- How EMC's XtremIO Array solves IT and Business challenges for Exchange implementations
- Next steps and two trusted analysis techniques for predicting potential success with XtremIO for specific environments
- A sample Total Cost of Ownership model based on 10,000 Exchange users
Here's what we know so far:
1. Retention is the IT term for "Keep everything forever". Storage is cheap, unless you need 144TB of it... or 1.4PB... Every e-mail that is retained in the Active database is also stored in each of the database copies. Studies show that mail storage is growing at a rate of more than 30% YOY. In one customer example of storage growth, the migrated capacity from Exchange 2003 was 72TB back in 2010. That same organization now requires 158TB of capacity after only three years.
2. IT administrators have soured on the notion of a) managing racks upon racks of Direct Attached Storage shelves, and b) accepting that Exchange Continuous Replication (Database Availability Groups/DAG) consumes between 2x and 6x the capacity required to store the production copy of email/calendars/contacts. In fact, IT Administrators used to enjoy Single Instance Storage (SIS), which was provided by versions of Exchange prior to 2010. Most IT Admins report that capacity requirements multiplied by a factor of 6x after migrating from Exchange 2003 to Exchange 2010. The mail that used to consume 12TB of space in an Exchange 2003 Single Copy Cluster now consumes 72TB in a three-copy DAG. This explosion of capacity has left most IT Admins disillusioned with keeping mail "on prem".
3. Distribution Groups (DLs) have become a de facto method for transmitting messages. Coupled with the fact that nearly 70% of all email messages contain attachments and the fact that every attachment is stored repeatedly in every DL member's inbox, mailboxes and mailbox databases are larger than ever before.
4. Not all customers and not all end-users connect to their mailboxes via a Cached Mode connection. In fact, more than 10% of a typical Exchange user base is forced to run in Online Mode for several reasons, including VDI, Security/Governance (cannot use OSTs), Workflow applications (cannot tolerate cached versions of mailbox items), and HIPAA regulations. And in a portion of our customer accounts, 100% of email users are forced to use Online Mode for connecting to their mailboxes. Online Mode increases the IO requirements compared to Cached Mode by 270%.
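To put the growth example in item 1 into perspective, the sketch below computes the compound annual growth rate implied by moving from 72TB to 158TB over three years, which works out to roughly 30% a year. The function names are illustrative only, and the projection assumes the rate stays constant, which your environment may not do.

```python
def compound_annual_growth(start_tb: float, end_tb: float, years: float) -> float:
    """Return the compound annual growth rate as a fraction (0.30 means 30%)."""
    if start_tb <= 0 or end_tb <= 0 or years <= 0:
        raise ValueError("capacities and years must be positive")
    return (end_tb / start_tb) ** (1.0 / years) - 1.0


def project_capacity(start_tb: float, annual_rate: float, years: float) -> float:
    """Project capacity forward, assuming the growth rate stays constant."""
    return start_tb * (1.0 + annual_rate) ** years


if __name__ == "__main__":
    # Figures from the customer example: 72TB in 2010, 158TB three years later.
    rate = compound_annual_growth(72, 158, 3)
    print(f"Implied annual growth: {rate:.1%}")
    # Hypothetical look-ahead: three more years at the same rate.
    print(f"Projected capacity: {project_capacity(158, rate, 3):.0f} TB")
```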
A study by IDC, published in March of 2015, outlined the issue of retaining primary and "backup" copies of data within corporate environments. They call it the "Copy Data Problem". IDC's advice is simple: "use the most advanced storage array technologies that support real-time thin provisioning, de-duplication, compression, and snapshot capabilities to reduce your overall storage footprint". IDC also completed an analysis of the All Flash Array marketplace in May 2015. Their findings were compelling, to say the least. The bulk of organizations deploying All-Flash Arrays for use in Messaging & Collaboration workloads are EMC's core customers.
EMC has responded to all of these storage concerns with one comprehensive offering. XtremIO has been tested and evaluated by dozens of customers, including EMC IT. Initial findings show efficiency rates ranging from 2.8:1 to more than 16:1, depending on the number of mailboxes and the number of DAG copies. In the vast majority of installations, customers deploying two copies of their Exchange databases on XtremIO are observing 7:1 efficiency rates. For the Active copy alone, we are observing an average reduction in actual storage usage of 60%. In real numbers, this means that 12TB of Exchange databases typically consumes only 4.8TB of space on an XtremIO array. Of course, every environment will be different. In the last section of this post, we'll discuss two valid methods of analysis that will allow you to understand how your specific environment will benefit from the power of XtremIO.
Here's how it works:
XtremIO offers a radically innovative storage platform for Exchange Administrators. Let’s take a deep dive into how XtremIO works with Microsoft Exchange databases. XtremIO works like no other storage array ever created. According to Wikibon, XtremIO is a Generation 4 flash array -- it is specifically built to make optimum use of flash drives. It uses a massive ultra-high-speed metadata table to store and track all of the block locations and fingerprints of all of the data that is written to the array.
This metadata table is like a massive File Allocation Table with a list of each and every 8K block on the array along with a numeric representation of the data it contains -- every block has a hash, like a serial number or a fingerprint. Every time a new 8K block is written to the array, a numeric hash is generated. When additional data is ingested, hashes are generated for those blocks too. If the hash of a new block matches the hash of a previously written block, only the metadata is updated, not the underlying flash drives.
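The fingerprint lookup described above can be pictured with a small content-addressed store. This is only a conceptual sketch, not XtremIO's implementation: the class name, the use of SHA-256 as the fingerprint, and the in-memory dictionaries standing in for the metadata table and the flash drives are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # the post describes 8K blocks


class DedupStore:
    """Toy model: a metadata table maps logical addresses to fingerprints."""

    def __init__(self) -> None:
        self.physical = {}    # fingerprint -> block bytes (stand-in for flash)
        self.volume_map = {}  # logical block address -> fingerprint (metadata)

    def write(self, lba: int, block: bytes) -> bool:
        """Record a block; return True only if it had to be physically stored."""
        if len(block) != BLOCK_SIZE:
            raise ValueError("blocks must be exactly 8 KiB in this sketch")
        fingerprint = hashlib.sha256(block).hexdigest()  # assumed hash choice
        is_new = fingerprint not in self.physical
        if is_new:
            self.physical[fingerprint] = block
        # Metadata is updated on every write, duplicate or not.
        # Simplification: overwritten blocks are never garbage-collected here.
        self.volume_map[lba] = fingerprint
        return is_new

    def read(self, lba: int) -> bytes:
        """Resolve a logical address through the metadata table."""
        return self.physical[self.volume_map[lba]]


if __name__ == "__main__":
    store = DedupStore()
    attachment = b"A" * BLOCK_SIZE
    print(store.write(0, attachment))  # first time this content is seen
    print(store.write(1, attachment))  # duplicate: metadata only
    print(len(store.physical), "unique block(s) stored")
```

Two logical addresses end up pointing at one stored block, which is the behavior the paragraph describes: a matching hash updates the metadata and leaves the flash untouched.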
This radical new approach to storage reduces writes to the flash, reduces storage usage and dramatically increases write performance compared to any other storage array.
Let’s see how this technology – already producing sub-millisecond response times for more than 1000 enterprise database applications – benefits a Microsoft Exchange environment. We'll start by drawing a logical representation of the host volume, the LUN. Inside this, we'll place your Exchange database.
Inside that, we'll place your data -- the messages, the attachments, the contacts, the headers. Keep in mind, XtremIO has no awareness of files, file structures, or databases -- it only knows about 8KB blocks and whether it's seeing them for the first time or a subsequent time. If an inbound block is identical to a previously written block, the XtremIO metadata table references the previously written block and records the second occurrence in metadata. As far as the host is concerned, all of the data that it has written to the volume has been logged into that volume's FAT or GPT and all of the data blocks have been committed to disk -- in this case, the XtremIO array and its metadata table. At the innermost layers of the diagram, the orange cylinder and the smallest blue cylinder inside the data, we'll talk about some of the compression and deduplication technologies the XtremIO array brings, and how those fingerprints play a role in reducing our capacity usage. Based on actual customer data, including an extensive study of EMC’s own Exchange environment, we are finding that an average of 60% LESS storage capacity is used compared to DAS and traditional storage arrays. And the more mailboxes and copies of mailboxes you place on XtremIO, the higher the savings become. Here's why. Let’s follow the bits as we store Exchange data on our XtremIO array.
First, the LUN is created on the XtremIO and is presented to your Mailbox server. From this point forward, the Exchange server is making use of a block-storage device. Windows formats the volume, a drive letter or mount point is assigned, and Exchange is able to utilize this new storage volume. Then, you either create an empty mailbox database file or copy an existing database file to the XtremIO array. As the database is ingested into the XtremIO, each database page is mapped into the XtremIO’s metadata. For every database page, four hashes are created. Remember, only blocks with unique hashes are candidates to be written to the flash drives. If a block containing only zeros is discovered, only the metadata is updated.
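To make the "four hashes per page" point concrete, here is a minimal sketch that splits one 32K Exchange page into four 8K blocks and sets aside any block that contains only zeros. It is an illustration under stated assumptions: the function names are hypothetical, and the real array performs this work in its own data path rather than in host code.

```python
from typing import List

PAGE_SIZE = 32 * 1024   # Exchange database page size cited in this post
BLOCK_SIZE = 8 * 1024   # XtremIO block size cited in this post
ZERO_BLOCK = bytes(BLOCK_SIZE)


def split_page(page: bytes) -> List[bytes]:
    """Split one 32K database page into its four 8K blocks."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one 32 KiB page")
    return [page[i:i + BLOCK_SIZE] for i in range(0, PAGE_SIZE, BLOCK_SIZE)]


def blocks_needing_flash(page: bytes) -> List[bytes]:
    """Return only the blocks that hold something other than zeros."""
    return [block for block in split_page(page) if block != ZERO_BLOCK]


if __name__ == "__main__":
    # Hypothetical page: 10 KiB of message data followed by zero padding.
    partial_page = b"\x01" * (10 * 1024) + bytes(PAGE_SIZE - 10 * 1024)
    print(len(blocks_needing_flash(partial_page)), "of 4 blocks carry data")
    print(len(blocks_needing_flash(bytes(PAGE_SIZE))), "blocks for a blank page")
```

A partially filled page therefore occupies fewer than four blocks, and a blank page occupies none, which is why empty space inside the database costs nothing on the array in this model.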
What actually gets copied onto XtremIO’s flash drives is the actual data, not the strings of zeroes within the database. But this is only the first of three steps as the data is written into the array; there’s more data reduction before the data actually lands on the flash drives. Within Exchange databases, a large percentage of the database pages are blank, filled with zeros, either waiting for new message data or erased after deleted items have been removed. Unlike traditional storage technologies, when databases are placed on XtremIO, none of these blank database pages consume any space on the array. XtremIO saves space by remembering the content of each and every page that it has already ingested. If it sees a duplicate of that data, it simply references that “already written” block in its metadata table. XtremIO does this operation in the data pathway: it never allows duplicate pages to get onto the flash drives, and it never performs any post-ingest deduplication process.
The last thing that happens before any data is passed on to the flash drives is that we remove any repeating characters within the 8K blocks, just like you would typically see in a compression program. It would remove any repeating sevens or eights or repeating sequences, or zeroes. No! Not zeroes! Zeroes never get written to the flash drives. What you're left with is just the unique data, and that's how XtremIO creates an incredibly efficient storage platform for all databases, but especially for Exchange. You might think that all this in-line data processing would slow down the IOs, but it has just the opposite effect. Even under extreme load, XtremIO predictably delivers sub-millisecond response times. XtremIO is a match for every Exchange environment, but makes Online (non-cached) Mode users and VDI users especially happy and productive.
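The effect of removing repeating characters can be seen with any general-purpose compressor. The sketch below uses Python's zlib purely as a stand-in; this post does not say which compression algorithm XtremIO uses, so treat the numbers as an illustration of the principle and nothing more.

```python
import os
import zlib

BLOCK_SIZE = 8 * 1024


def compressed_size(block: bytes) -> int:
    """Size in bytes of the block after zlib compression (stand-in algorithm)."""
    return len(zlib.compress(block))


if __name__ == "__main__":
    # A block of repeating sevens and eights, as in the description above.
    repetitive = b"7" * 4096 + b"8" * 4096
    # Random bytes approximate already-compressed or encrypted content.
    random_like = os.urandom(BLOCK_SIZE)

    print("repetitive block:", compressed_size(repetitive), "bytes")
    print("random block:    ", compressed_size(random_like), "bytes")
```

Blocks with long repeated runs shrink dramatically, while content that is already compressed (many attachment formats, for example) gains little, which is one reason real-world compression savings vary from one environment to the next.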
Within your database, you have mailboxes. Mailboxes are loaded with zeroes. Through Background Database Maintenance, Exchange strives to keep about 20% of the database filled with blank pages at any one point in time. Remember, Exchange pages are 32K in size. For the XtremIO array, that means it takes four 8K blocks in order to store one page of Exchange data. This means that any partially filled Exchange page will also benefit from this deprovisioning. And, when anything is deleted from Exchange, those pages are blanked or zeroed, so those pages become de-provisioned as well. Contrary to common belief, attachments stored in an Exchange system are not compressed before they're placed into individual users' mailboxes. Keep in mind that an e-mail that's been routed to dozens of people is stored dozens of times inside the Exchange database. Because those attachments and those messages are placed into each user's mailbox, XtremIO records their placement in each user’s mailbox, yet stores the blocks associated with that attachment onto its flash drives only once. When a given user opens the message, he or she finds it in his or her mailbox as if it were stored dozens of times on the array. Through real-world observations, we have learned how XtremIO actually benefits real Exchange environments; we don’t pretend that JetStress databases represent real-world scenarios.
In the many customers we have studied, we've learned that through thin provisioning techniques, we can save as much as 40% of the necessary capacity to store our Exchange databases. By not writing zeroes, we've learned that we can save an additional 20% to 40%. We've learned that because we don't duplicate data blocks, we can save another 20-30%. We also learned that because we can remove repeating characters and strings of characters, we can remove another 20% from our storage requirements. All in all, this tallies up to about 2.8-to-1 storage efficiency.
Off the top, our volume savings is 40%; this is our LUN tail space. Because we don’t write pages filled with zeros, we save another 30%. Because we de-duplicate repeating blocks within the Exchange mailbox databases, we save another 30%, and then we compress the remaining blocks that haven't already been de-duplicated by another 20%. This gives us a typical single-copy efficiency of 2.8-to-1. That’s a reduction of 64%!
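An efficiency ratio and a percentage reduction are two views of the same number, and it helps to be able to convert between them. The helper names below are illustrative. The sketch only converts the headline figures quoted in this post (2.8-to-1, and the 60% Active-copy reduction); it does not attempt to combine the individual per-step percentages, since the post does not state how they compose.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Convert an N-to-1 efficiency ratio into a fractional reduction."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


def ratio_from_reduction(reduction: float) -> float:
    """Convert a fractional reduction (0.60 means 60%) into an N-to-1 ratio."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def stored_tb(logical_tb: float, reduction: float) -> float:
    """Capacity actually consumed after applying a fractional reduction."""
    return logical_tb * (1.0 - reduction)


if __name__ == "__main__":
    print(f"2.8:1 is a {reduction_from_ratio(2.8):.0%} reduction")
    print(f"A 60% reduction is {ratio_from_reduction(0.60):.1f}:1")
    print(f"12TB at a 60% reduction consumes {stored_tb(12, 0.60):.1f}TB")
```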
And remember, if we duplicate the database for use in a Database Availability Group, not only are the databases protected by more than five-nines of availability, but those copies also come for free -- they are block-for-block duplicates. That means that when we add a second copy of the databases, our efficiency number doubles. And because we use XtremIO snapshots to seed those copies, we can generate DAG copies in seconds instead of hours.
Exchange administrators need predictable, reliable performance. They rely on XtremIO to deliver sub-millisecond performance, ease of administration, and reduced costs 24 hours a day. Unlike other all-flash arrays, XtremIO delivers consistent performance. In fact, EMC’s tests reveal that a single XtremIO X-Brick handles 45 million messages a day with over 300,000 mailboxes at 150 messages per user, per day.
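The effect of DAG copies on the efficiency number can be sketched with simple arithmetic. This is an idealized model that assumes every copy is an exact block-for-block duplicate that adds no physical capacity; real copies diverge slightly (log files, replay lag), and the observed figures quoted earlier in this post will not match it exactly. The function name is hypothetical.

```python
def effective_ratio(single_copy_ratio: float, copies: int) -> float:
    """Idealized efficiency when each extra copy deduplicates completely.

    Host-visible capacity grows with the number of copies, while the
    physical capacity stays at the single-copy footprint.
    """
    if single_copy_ratio <= 0 or copies < 1:
        raise ValueError("ratio must be positive and copies at least 1")
    return single_copy_ratio * copies


if __name__ == "__main__":
    # 2.8:1 is the single-copy figure used in this post.
    for copies in (1, 2, 3):
        print(f"{copies} copy/copies: {effective_ratio(2.8, copies):.1f}:1")
```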
Analysis -- How will XtremIO fit into my environment?
Through our own analysis process, we have developed two different and complementary methods for discovering your expected efficiency rates.
Examining your Exchange databases is simple, but can be time-consuming. Create a snapshot of your database(s) and mount that snapshot to an alternate host or VM. You will need to install the Exchange utilities onto your alternate server. In a command window, run the Exchange utility called ESEUTIL /k [database name]. This will scan the database at a rate of 200GB/hr and look for failed checksums in each of the database’s pages. At the end of the operation, ESEUTIL will present a short report. One of the fields in the report shows the number of uninitialized pages in the database. This number, when multiplied by 32K, will tell you the amount of space that the XtremIO array will NOT use in storing that database on the array. Take the total database size as it appears on the Windows volume and subtract the uninitialized space from it. This is the space needed to store your database on XtremIO (prior to compression and de-duplication!).
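The arithmetic in the ESEUTIL method is easy to script once you have read the uninitialized page count from the report. The sketch below makes several assumptions: the page count is typed in by hand (it does not parse ESEUTIL output), "32K" is treated as 32,768 bytes, and the example database size and page count are invented values, not measurements.

```python
PAGE_SIZE_BYTES = 32 * 1024       # Exchange page size, taken as 32 KiB
SCAN_RATE_GB_PER_HOUR = 200       # scan rate quoted in this post
GIB = 2 ** 30


def uninitialized_bytes(uninitialized_pages: int) -> int:
    """Space the array would not consume for blank (uninitialized) pages."""
    return uninitialized_pages * PAGE_SIZE_BYTES


def space_needed_bytes(db_size_bytes: int, uninitialized_pages: int) -> int:
    """Database size minus uninitialized space, before dedup and compression."""
    needed = db_size_bytes - uninitialized_bytes(uninitialized_pages)
    if needed < 0:
        raise ValueError("uninitialized space exceeds the database size")
    return needed


def scan_hours(db_size_gb: float) -> float:
    """Rough ESEUTIL /k run time at the quoted scan rate."""
    return db_size_gb / SCAN_RATE_GB_PER_HOUR


if __name__ == "__main__":
    # Hypothetical inputs: a 500 GiB database with 3,276,800 blank pages.
    db_size = 500 * GIB
    blank_pages = 3_276_800
    print(f"Blank space: {uninitialized_bytes(blank_pages) / GIB:.0f} GiB")
    print(f"Needed:      {space_needed_bytes(db_size, blank_pages) / GIB:.0f} GiB")
    print(f"Scan time:   about {scan_hours(500):.1f} hours")
```

Running the numbers per database before the scan also gives you a rough idea of how long each ESEUTIL pass will take, which helps when deciding how many databases to sample.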
EMC MiTrend Efficiency Analysis
https://app.mitrend.com/emc/#instructions/XtremIO_Reduction contains detailed instructions for how the XtremIO Efficiency Analyzer works and how to get dependable results. If you clicked the MiTrend link and were challenged for credentials you could not provide, please contact your EMC Partner, Rep, or Systems Engineer for assistance. Many Exchange admins are reluctant to have scanners running on their production databases, despite the fact that we have successfully scanned dozens of customers' production volumes with the tool. For Exchange admins who "worry", simply restore a representative sampling of databases to an alternate server (non-production VM) and run the MiTrend tool against those volumes. Once you have the MiTrend data collection, you can have your Partner or EMC SE submit your collection to the MiTrend site. A report will be produced in short order and you can begin to make decisions.
Let's stop to ask what this is going to cost us. We are faced with questions that we have previously NOT needed to answer. XtremIO takes all traditional storage methods, mechanisms, and technical structures and tosses them aside; we are left asking, "how can XtremIO save me money?" or "how can XtremIO cost less than traditional storage?" To assist you in understanding how XtremIO will compare financially to the storage you are using today, we have assembled a "straw man TCO" based on 10,000 seats, 2GB average mailbox size, with 150 messages sent/received per user, per day. The TCO includes all of the various aspects of installing, managing, cooling, supporting, and paying for facilities costs that you would typically find in any other TCO model. The model in this blog post does not attempt to represent ANY cost savings derived from simplified management of the storage device -- all personnel costs are held even across all systems analyzed.
The "Comparative Costs of XtremIO" graph shows three Exchange 2010 implementations based on three different storage devices. All other aspects of the Exchange implementation are held as constant as possible. For example, the same number of servers, Ethernet ports, admins, mailboxes, databases and database copies -- everything is the same except the storage and the costs associated with the storage, such as maintenance, installation, facilities costs, power and cooling. Long story short, all three storage configurations resulted in average mailbox costs that were within 25 cents of each other. And, based on the "deal" you might get from your hardware vendor, they could actually be the same price. The prices used in this TCO are the typical "street" prices seen in the open marketplace. There are no special discounts, no list prices, no one-time offers. These are real prices of real configurations obtained in April of 2015.
EMC XtremIO is a thoroughly tested, production proven, and enterprise-class array. We know you will want to analyze your Exchange environments today. We are delighted to say, "Welcome to the rapidly expanding community of Exchange admins who would not trade their XtremIO arrays for all the DAS in the world."
When you think of Exchange, think of XtremIO. It's an amazing TCO.