
During my 10-year tenure at Microsoft, I met with many corporate and commercial software development teams for architecture review sessions.  The most common recommendation to come out of those sessions was "instrument your code."  Ongoing development, testing, and support are much easier when you have a detailed historical record of what has been happening with your product.  Most communication and computer equipment, as well as commercial software products purchased today, produce these detailed records of important activity and events.  This machine-generated data is the fastest growing segment of what we call the "big data" market.


If you have never been involved in hardware/software development or IT support, this may all sound a little abstract.  If you have access to a Microsoft Windows computer, go to the search bar and type "event".  The first suggestion should be a program called Event Viewer; double-click that icon to start the program.


[Screenshot: Windows Event Viewer]

Welcome to the world of machine data!  Two things I want to highlight are:

  1. There is an incredibly large number of activity and event types that are collected, and
  2. It is impossible, even for an expert, to tell if this machine is "healthy" or not from this display.


This is "raw" event data presented in lists.  While the major operating system vendors like Microsoft make it easy for hardware and software developers to write events into a central logging framework using a simple application programming interface (API), the result of all this effort is a giant bucket of bits.  Someone then has to write software to analyze the raw event data and derive insights from it.
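To make this concrete, here is a minimal sketch of that kind of analysis software: turning one raw log line into named fields that a program can reason about. The log line and its format are illustrative only, not taken from any real product.

```python
import re

# A hypothetical raw syslog-style line (format and fields are illustrative only).
RAW = "Jan 12 06:25:43 web01 sshd[4721]: Failed password for admin from 10.0.0.5"

# Regex for the classic "<timestamp> <host> <process>[<pid>]: <message>" shape.
PATTERN = re.compile(
    r"(?P<timestamp>\w{3} +\d+ [\d:]+) "
    r"(?P<host>\S+) "
    r"(?P<process>\w+)\[(?P<pid>\d+)\]: "
    r"(?P<message>.*)"
)

def parse_event(line):
    """Turn one raw log line into a dict of named fields (None if no match)."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None

event = parse_event(RAW)
print(event["host"], event["process"], event["message"])
```

Multiply this by every event format on every device in a data center and you can see why hand-rolled parsers don't scale.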


And all this raw data you're seeing comes from just one Windows computer.  Every piece of networking, server, storage, and specialty hardware in and around a corporate data center has an activity and event logging capability as complex as, or more complex than, the Windows event system shown here.  And there are no standards, or even conventions, for how activity and event data is constructed or stored across multiple products.  Every vendor and every product will have a unique format for machine data.


Now you can start to get a feel for the formidable complexity that confronts the operations staff of a corporate data center.  If someone asked me how I would architect a software analysis tool that could handle this level of complexity, first, I would suggest they design a source-independent representation of an activity or event that could cover the entire universe of data sources they were going to encounter.  Second, I would start writing source-specific pre-processors to translate the raw data from each source into that internal, universal representation.
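That two-part design can be sketched in a few lines. This is a simplified illustration of the idea, not anyone's production schema: the `NormalizedEvent` fields, the sample inputs, and the severity mapping are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A hypothetical source-independent event record; field names are illustrative.
@dataclass
class NormalizedEvent:
    timestamp: datetime
    source: str       # which product or device produced the event
    severity: str     # normalized to one common scale
    message: str

# Source-specific pre-processors translate each vendor's raw format
# into the one universal representation.
def from_windows_event(raw: dict) -> NormalizedEvent:
    # Assumes a dict already pulled from the Windows event log (Level 1-4
    # maps to Critical/Error/Warning/Information).
    levels = {1: "critical", 2: "error", 3: "warning", 4: "info"}
    return NormalizedEvent(
        timestamp=datetime.fromtimestamp(raw["TimeCreated"], tz=timezone.utc),
        source="windows:" + raw["ProviderName"],
        severity=levels.get(raw["Level"], "unknown"),
        message=raw["Message"],
    )

def from_syslog(raw: str) -> NormalizedEvent:
    # Minimal translator for an illustrative "<SEVERITY> <message>" format.
    sev, _, msg = raw.partition(" ")
    return NormalizedEvent(
        timestamp=datetime.now(timezone.utc),
        source="syslog",
        severity=sev.lower(),
        message=msg,
    )

win = from_windows_event({"TimeCreated": 0, "ProviderName": "Service Control Manager",
                          "Level": 2, "Message": "The Spooler service terminated."})
print(win.severity, "|", win.message)
```

Once everything is a `NormalizedEvent`, the downstream search and analysis code never has to know which vendor's format the data arrived in.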

However, if you haven't tackled this problem yet, or aren't happy with the solution you have, don't break out a compiler and start writing code.  You should really check out our partner, Splunk Software, ranked #1 in worldwide IT operations analytics software market share.  They have already implemented this approach, and much more, for handling the complexity of machine data with their Splunk Enterprise product.

Splunk Enterprise can index any kind of streaming, machine, and historical data, such as Windows event logs, web server logs, live application logs, network feeds, system metrics, change monitoring, message queues, archive files, and more.  Splunk Enterprise transforms incoming data into events, which it stores in indexes.  The index is the repository for Splunk Enterprise data that enables flexible searching and fast data retrieval.  Splunk Enterprise handles everything with flat files in an application-native format that doesn't require any third-party database products.  This architecture gives Splunk a great foundation for controlling scale and performance.
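One common way applications stream events into Splunk Enterprise is the HTTP Event Collector (HEC), which accepts JSON over HTTP.  The sketch below builds a HEC-style payload; the host URL, token, index name, and event fields are all placeholders, not real values.

```python
import json

# Hypothetical HEC endpoint and token (placeholders only).
HEC_URL = "https://splunk.example.com:8088/services/collector"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

# A JSON event envelope in the shape HEC expects: the "event" field carries
# the data, with optional routing metadata alongside it.
payload = {
    "event": {"action": "login", "user": "phummel", "status": "success"},
    "sourcetype": "myapp:audit",   # tells Splunk how to parse the event
    "index": "main",
}
body = json.dumps(payload)

# To actually send it (requires a reachable Splunk instance with HEC enabled):
# import urllib.request
# req = urllib.request.Request(HEC_URL, data=body.encode(),
#                              headers={"Authorization": "Splunk " + HEC_TOKEN})
# urllib.request.urlopen(req)

print(body)
```

The point is that the sender ships structured JSON and lets Splunk's indexing pipeline handle timestamping, storage, and retrieval.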

Another aspect of the Splunk Enterprise architecture that fits with best practices for handling data complexity is the applications (apps) and add-ons framework.  Apps and add-ons are packaged sets of configuration that you install on your Splunk Enterprise instance to make it easier to integrate with, or ingest data from, other technologies or vendors.  Although you don't need apps or add-ons to index data with Splunk Enterprise, they can enhance and extend the Splunk platform with ready-to-use functions ranging from optimized data collection to monitoring for security, IT management, and more.
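Under the covers, that packaged configuration is mostly plain-text .conf files.  Here is a minimal, illustrative sketch of the kind of stanzas an add-on might ship (the file path and sourcetype name are made up for this example):

```ini
# inputs.conf -- tell this Splunk instance to monitor a log file
# (path and sourcetype name are illustrative)
[monitor:///var/log/myapp/app.log]
sourcetype = myapp:log
index = main

# props.conf -- tell Splunk how to timestamp and break events of that sourcetype
[myapp:log]
TIME_FORMAT = %Y-%m-%d %H:%M:%S
SHOULD_LINEMERGE = false
```

Packaging settings like these per technology is exactly the "source-specific pre-processor" idea described earlier, delivered as configuration instead of code.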

Dell EMC and Splunk work closely to provide a total solution with Splunk Enterprise and Dell EMC hyper-converged platforms tailored to address the complexity of machine data analytics.  Our Ready Systems for Splunk provide non-disruptive scalability and performance, optimized for Splunk workloads.  Dell EMC Ready Systems for Splunk are purpose-built for the needs of Splunk, helping consolidate, simplify, and protect machine data.  These Ready Solutions include the hardware, software, resources, and services needed to quickly deploy and manage Splunk in your business.  Check out these resources for more details:


Ready Systems for Splunk Solution Overview

Using Splunk Enterprise with VxRail Appliances and Isilon for Analysis of Machine Data


Splunk Enterprise on VxRack FLEX for Machine Data Analytics

There are many more features of the Splunk Enterprise platform that I want to write about, including the use of multiple index locations for data aging and scale, and how the main services are implemented as individually installable and configurable components.  That will have to be another article, coming soon.

Thanks for reading,

Phil Hummel



Solution Summary

VCE, the Converged Platform Division (CPD) of EMC, just released a paper titled VCE Solutions for Enterprise Mixed Workload on Vblock System 540.  In this solution guide, we show how the Vblock converged infrastructure (CI) platform, using all-flash XtremIO storage, provides a revolutionary new platform for modernizing the deployment and management of mixed-workload and mixed-application environments.  CPD, together with the Global Solutions Organization, brought together a team with expertise in deploying Vblock systems and deep Oracle, Microsoft, and SAP workload knowledge.  The goal of the team was to build, test, and document a near-real-life mixed-application solution using a Vblock 540 system powered by XtremIO all-flash storage.


The business application landscape for the testing environment consisted of:

                • A high frequency online transaction processing (OLTP) Oracle application
                • A simulated stock trading OLTP application for SQL Server
                • SAP ERP with an Oracle data store simulating a sell-from-stock application
                • An Oracle decision support system (DSS) workload
                • An online analytical processing (OLAP) workload accessing two SQL Server analysis and reporting databases
                • Ten development/test copies of each of the Oracle and SQL Server OLTP databases, and five development/test copies of the SAP/Oracle system


When the Oracle, Microsoft, and SAP mixed workloads were run simultaneously, the combined tests produced demand on the XtremIO array of ~230K predominantly 8 KB IOPS, together with an average throughput of 3.8 GB/s (primary I/O sizes of 64 KB and 128 KB), at an 88 percent read and 12 percent write ratio.  The average response time was 866 μs overall: 829 μs for reads and 1,152 μs for writes.
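As a quick sanity check on those numbers, the overall average response time should follow from the read/write mix, since the blended latency is just the ratio-weighted average of the read and write latencies:

```python
# Blend the reported read/write latencies by the reported I/O mix
# (88% reads at 829 us, 12% writes at 1152 us).
read_ratio, write_ratio = 0.88, 0.12
read_us, write_us = 829, 1152

blended = read_ratio * read_us + write_ratio * write_us
print(round(blended))  # lands close to the 866 us reported overall
```

The small difference from the reported 866 μs is expected, since the published ratios and latencies are themselves rounded.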

[Chart: mixed workload test results]

IT decision makers who are evaluating new options for data center management to help provide better service with lower TCO should research VCE CI platforms that use all-flash technology. We invite you to read the full report to understand our methodology and results or contact your local EMC representatives to discuss if converged platforms are the right choice for your next data center modernization project.


Thanks for reading,

Phil Hummel  @GotDisk