Hadoop

Why Hadoop For Big Data?


Apache Hadoop is a fundamentally new way of storing and processing data: it distributes the parallel processing of huge amounts of data across inexpensive, industry-standard servers that both store (HDFS) and process (MapReduce) the data. Major enterprise distributions of Hadoop include Pivotal HD, Cloudera, and Hortonworks.

 

Although direct-attached storage (DAS) is the conventional approach to deploying and managing Hadoop, there are benefits to decoupling compute and storage so that each can scale independently, especially if your Hadoop workload does not scale linearly. EMC offers HDFS-enabled storage through EMC Isilon scale-out NAS and HDFS/Object-enabled storage through EMC ViPR Software Defined Storage (SDS). These EMC storage solutions allow you to scale the compute and storage nodes separately.



Pivotal HD: Single Hadoop Platform To Meet All Of Your Big Data Application Requirements

As shown in the architecture diagram below, Pivotal HD provides real-time, interactive, batch processing in a single Hadoop platform.

[Figure: Pivotal HD architecture diagram]


Are you a data rookie, not yet familiar with Hadoop? Pivotal provides 'Just the Basics for Hadoop'.

This session assumes absolutely no knowledge of Apache Hadoop and will provide a complete introduction to all the major aspects of the Hadoop ecosystem of projects and tools. If you are looking to get up to speed on Hadoop, trying to work out what all the Big Data fuss is about, or just interested in brushing up your understanding of MapReduce, then this is the session for you. We will cover all the basics with detailed discussion about HDFS, MapReduce, YARN (MRv2), and a broad overview of the Hadoop ecosystem including Hive, Pig, HBase, ZooKeeper and more.
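To make the MapReduce model described above concrete, here is a minimal word-count sketch in plain Python. It simulates the map, shuffle/sort, and reduce phases locally in one process; on a real cluster, Hadoop distributes these phases across nodes (for example via Hadoop Streaming, which pipes input lines through mapper and reducer scripts much like these functions). The function names and sample input are illustrative, not part of any Hadoop API.

```python
# Minimal local simulation of the MapReduce word-count pattern.
# In real Hadoop, the map and reduce steps run on different nodes and
# the framework performs the shuffle/sort between them.
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle/sort: group all values by key, as Hadoop does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["the quick brown fox", "the lazy dog"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["the"])  # -> 2
```

The same three-phase structure underlies every MapReduce job; only the map and reduce functions change per problem.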

 


EMC and VCE:  Optimized Infrastructure For Hadoop

Organizations that have built Hadoop and adjacent solutions on commodity hardware are looking to transition to next-generation infrastructure, driven by requirements for mission-critical availability and performance, interoperability with the rest of the enterprise architecture, and privacy and security. By deploying Big Data applications on Vblock systems, businesses can standardize on optimized infrastructure across data centers while choosing the right mix of compute, network, and storage for their big data and analytics use cases.

 
