Everyone is getting bombarded with messaging related to Big Data. It's, well, big, and machine data is one of the fastest-growing and most complex areas driving the interest in doing more with big data. It's also potentially one of the most valuable, since it feeds analytics related to customer behavior, sensor readings, machine behavior, security threats, fraudulent activity and more.
The biggest challenge I hear from people evaluating where to get started with big data is that they are stuck in "analysis paralysis." Figuring out which software and hardware platforms to use can seem like an overwhelming obstacle given the choices available. Based on my conversations with attendees at the most recent EMC World, two of the features they are looking for in a big data solution are:
- Minimizing in-house coding and development
- Finding a platform that can scale as they go
VCE, the Converged Platform Division (CPD) of EMC, just released a paper titled Providing Enterprise Performance, Capacity, and Data Services for Splunk Enterprise. This paper describes a VCE infrastructure solution that highlights flexible scaling options and tight integration with Splunk software for analysis targeted at machine data. The solution addresses both of the customer requirements noted above. I will provide some additional background on both Splunk and the VCE solution, or you can jump straight to the link in this paragraph to download the full paper now.
Splunk Enterprise indexes machine data from virtually any source, format, or location in real time. This includes data streaming from applications, app servers, web servers, databases, wire data from networks, virtual machines, telecoms equipment, operating systems, sensors and much more. Splunk indexes contain information about the time of the event, keywords (terms), and any discovered relationships between events. Users can then search, analyze, and visualize machine data using Splunk indexes. To help users handle the constant stream of event data from multiple sources more efficiently, Splunk stores data in buckets that are classified into hot, warm, cold, and frozen tiers.
Data is searched in order from hot to cold. Frozen data is not typically queried and is marked for deletion. As data ages, it is physically moved from tier to tier, so each tier can sit on a different class of storage, which improves the cost efficiency of the system. Given the breadth of potential data sources that Splunk can process, these environments need a flexible supporting compute and storage architecture.
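To make the tiering described above a bit more concrete, here is a minimal Python sketch of the idea. It is not Splunk code and does not use any Splunk API. The `Bucket` class, the `search_order` function, and the age thresholds are all hypothetical names and values I made up for illustration. Real Splunk deployments roll buckets between tiers based on configurable size, count, and age settings rather than on fixed day counts like these.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Tiers in search order: lower values are searched first."""
    HOT = 0
    WARM = 1
    COLD = 2
    FROZEN = 3


# Hypothetical age limits in days, chosen only for this illustration.
TIER_MAX_AGE_DAYS = {Tier.HOT: 1, Tier.WARM: 30, Tier.COLD: 365}


@dataclass
class Bucket:
    """A toy stand-in for a bucket of indexed events."""
    name: str
    age_days: int

    @property
    def tier(self) -> Tier:
        # Walk the tiers from newest to oldest and pick the first that fits.
        for tier in (Tier.HOT, Tier.WARM, Tier.COLD):
            if self.age_days <= TIER_MAX_AGE_DAYS[tier]:
                return tier
        return Tier.FROZEN


def search_order(buckets):
    """Return searchable buckets ordered hot to cold, skipping frozen ones."""
    searchable = [b for b in buckets if b.tier != Tier.FROZEN]
    return sorted(searchable, key=lambda b: (b.tier, b.age_days))


if __name__ == "__main__":
    buckets = [Bucket("a", 400), Bucket("b", 10), Bucket("c", 0), Bucket("d", 100)]
    for b in search_order(buckets):
        print(b.name, b.tier.name)
```

In this toy model, a bucket's tier is derived purely from its age, which is what lets each tier be mapped to a different class of storage: fast storage for hot and warm, high-capacity storage for cold.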
VCE has a portfolio of products that gives customers options for implementing a tiered storage infrastructure able to handle high-performance hot and warm data, as well as high-capacity cold and frozen data, all from a single vendor. In this solution we show how, using a combination of Vblock® and VxRack™ systems, organizations can simplify and optimize provisioning, deployment, and management of Splunk search and analytics workloads.
The Vblock System 540's scalable, linear architecture easily accommodates expansion by scaling out to >1M IOPS at <1ms latencies for all of your hot and warm data queries and workloads within your Splunk Enterprise environment.
VCE Technology Extensions for EMC Isilon® complements a Splunk Enterprise scale-out architecture by providing a powerful, cost-effective scale-out storage cluster for the retention of cold data in Splunk. A VCE Technology Extensions for Isilon cluster creates a unified pool of highly efficient storage, with a proven 80 percent storage utilization rate. VCE Technology Extensions for Isilon’s single-volume, single-file system and simplified management typically require less than one full-time employee per petabyte (PB), reducing your overall storage administration costs.
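As a rough illustration of what the 80 percent utilization figure means for sizing, here is a small Python sketch. The function name `raw_capacity_needed_tb` and the 400 TB example are my own assumptions for illustration. This is back-of-the-envelope arithmetic, not a sizing tool, and it ignores factors like growth and compression that the white paper may treat differently.

```python
def raw_capacity_needed_tb(usable_tb: float, utilization: float = 0.80) -> float:
    """Estimate raw capacity needed to hold `usable_tb` of data.

    `utilization` is the fraction of raw capacity available for data.
    The 0.80 default reflects the utilization rate quoted in the text;
    everything else here is a simplifying assumption.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return usable_tb / utilization


if __name__ == "__main__":
    # Hypothetical example: 400 TB of cold Splunk data to retain.
    print(f"{raw_capacity_needed_tb(400):.0f} TB raw")
```

Under these assumptions, retaining 400 TB of cold data would call for roughly 500 TB of raw capacity.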
For enterprises interested in making the move to hyper-converged infrastructure (HCI), VCE and EMC recommend the VCE VxRack System 1000 Flex for Splunk Enterprise deployments. These self-contained units of servers and networking offer scalability, flexibility, and resilience that make them an ideal platform for Splunk. The storage foundation of the VxRack System 1000 Flex is EMC ScaleIO. ScaleIO converges storage and compute resources to form a single-layer, enterprise-grade HCI implementation.
The VxRack System 1000 Flex, with ScaleIO software, uses the direct-attached storage (DAS) in VCE's integrated compute nodes and aggregates all disks into a global, shared block storage pool. ScaleIO enables a single-layer compute and storage architecture without requiring additional hardware. Its scale-out server SAN architecture can expand to accommodate thousands of servers.
Get the White Paper for More Details
Machine data is one of the fastest-growing and most complex areas of big data collection and analytics. Making use of machine data can be challenging unless you pay attention to the platform and tools you implement. With Splunk Enterprise combined with Vblock and VxRack Systems, and flexible options like VCE Technology Extensions, organizations can easily, efficiently, and cost-effectively incorporate enterprise-level data analytics and search for real-time operational intelligence. Get the full white paper using the PDF image below.
Thanks for reading,
Phil Hummel @GotDisk