Why Hadoop For Big Data?
Apache Hadoop is a fundamentally new way of storing and processing data through the distributed parallel processing of huge amounts of data across inexpensive, industry-standard servers that both store (HDFS) and process (MapReduce) the data. Major enterprise distributions of Hadoop include Pivotal HD, Cloudera, and Hortonworks.
Although directed attached storage (DAS) is the conventional approach to deploying and managing Hadoop, there are benefits to decoupling compute and storage and scale independently, especially if your Hadoop workload does not linearly scale. EMC offers HDFS enabled storage through EMC Isilon scale-out NAS and HDFS/Object enabled storage through EMC ViPR Software Defined Storage (SDS). These EMC storage solutions allow you to scale the compute and storage nodes separately.