Why a Data Lake for Big Data?
The growth of Big Data is driving organizations to move beyond rigid, structured data warehouse environments to more accessible and cost-effective "Data Lakes" - centrally managed repositories that use low-cost technologies such as Hadoop to land any and all data that might be valuable for analysis.
What sets a Data Lake apart from a chaotic file dumping ground is its robust and intelligent management capabilities. Through an IT-enforced security framework and a metadata catalog, data sets are easily discoverable and accessible by authorized consumers, who are then responsible for data transformation and subsequent analysis. Data is then automatically cleaned up after analysis is complete, which reduces wasted resources and the risk of data leakage.
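The three management ideas above (a metadata catalog, role-based access, and cleanup once analysis is complete) can be illustrated with a small in-memory sketch. This is not any product's API: the `DataSet` and `Catalog` classes, the role names, and the `/lake/raw/...` paths are all hypothetical, and a real deployment would rely on the platform's own catalog and security services.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """Hypothetical catalog entry describing one data set in the lake."""
    name: str
    path: str                      # illustrative storage location
    allowed_roles: set             # roles permitted to see this data set
    tags: set = field(default_factory=set)
    analysis_complete: bool = False


class Catalog:
    """Minimal in-memory metadata catalog (an assumption, not a real product API)."""

    def __init__(self):
        self._entries = {}

    def register(self, dataset):
        # Landing data in the lake always goes through the catalog.
        self._entries[dataset.name] = dataset

    def discover(self, role, tag=None):
        # Only data sets the role is authorized for are visible.
        return sorted(
            ds.name
            for ds in self._entries.values()
            if role in ds.allowed_roles and (tag is None or tag in ds.tags)
        )

    def mark_complete(self, name):
        self._entries[name].analysis_complete = True

    def purge_completed(self):
        # Drop entries whose analysis is finished; a real system would
        # also delete the underlying files.
        done = [n for n, ds in self._entries.items() if ds.analysis_complete]
        for n in done:
            del self._entries[n]
        return sorted(done)


catalog = Catalog()
catalog.register(DataSet("clickstream", "/lake/raw/clickstream", {"analyst"}, {"web"}))
catalog.register(DataSet("payroll", "/lake/raw/payroll", {"hr"}, {"finance"}))

visible = catalog.discover("analyst")   # the analyst role cannot see payroll
catalog.mark_complete("clickstream")
removed = catalog.purge_completed()
```

In this sketch, discovery is filtered by role before anything is returned, so an unauthorized consumer never learns that a data set exists, and cleanup is driven by an explicit completion flag rather than by age.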
The EMC Federation offers a low-risk, best-of-breed approach to evolving from legacy siloed data architectures into an agile Data Lake. Centered on Pivotal Big Data Suite and HDFS-enabled storage, the EVP Data Lake offers technologies from Pivotal, VMware, and EMC II to deploy the most flexible and efficient Data Lake solution for your data processing and storage requirements.