Pivotal and EMC Come Together To Shore Up The Data Lake

July 7, 2014 Stacey Schneider


It’s no secret that data growth is accelerating rapidly. IDC now predicts that data is growing even faster than we thought, putting us on pace to have 40 zettabytes of data by 2020. That’s 14% more than previously thought. Or, to put it in beach terms, if a grain of sand were a byte, that’s 57 times more than all the grains of sand on all the beaches in the world.

With demand building this quickly, and up to 90% of the data still unused and wasting potential, it is not surprising that the software companies focused on this area are moving mountains to deliver solutions to harness that data quickly. In particular, concerns around reliability, availability and scalability have become hot issues for organizations looking to embrace big data fully.

To that end, Pivotal and EMC have joined forces to shore up the way forward for the Business Data Lake and right-size the economics and complexity of big data. Today, EMC and Pivotal are releasing a new turnkey bundle that packages EMC’s leading Isilon technology with Pivotal’s enterprise-hardened Apache Hadoop® distribution, Pivotal HD. Together, these technologies solve some of the enterprise’s hardest problems around big data, including failover, backup, replication and scalability. With these challenges addressed, enterprises of all sizes now have the help they need to bridge their big data efforts to a Data Lake architecture quickly and affordably.

Bridge to the Data Lake

Together, EMC and VMware have a big stake in Pivotal technologies, owning 84% of the company and donating many of the technologies in our portfolio during our inception, including Pivotal HD and Greenplum technologies. While our teams may be separately housed in different companies, we have been at work for the past 24 months to test, benchmark and size the kinds of Data Lake solutions that work for the enterprise.

Today, that work is paying off. Available now, the EMC Data Lake Hadoop® Bundle is a combination of “Federation” Hardware, Software, and Services developed to address the multiple business needs of efficiently and securely storing all your data, analyzing anything, and building the right thing.

The EMC Data Lake Hadoop® Bundle is a big data storage and analytics solution from the EMC Federation that combines EMC Isilon scale-out storage with Pivotal HD (Hadoop® Distribution) software and Pivotal HAWQ, a parallel, SQL-standard-compliant query engine.

The benefits of this bundled offering are fairly clear:

  • No-hassle setup. Pre-tested and pre-configured, the EMC Data Lake Hadoop® Bundle provides companies with a simple-to-deploy, turnkey solution that gives them a big head start on analyzing their data, with a best-practice deployment available to them at any moment.
  • Serious scalability. By combining the power of EMC’s XtremIO FLASH within a Vblock and Isilon storage environment, you get the ultimate in performance, flexibility, scalability, and efficiency on the storage side. On the Pivotal HD side, customers will enjoy a variety of features they normally would not expect in an Apache Hadoop® distribution, including active-active WAN deployments and scalable performance with petabyte data storage and management. When the two are deployed together, customers can have confidence that their big data solution is not only a highly scalable and efficient infrastructure, but also one that lowers costs and will easily keep pace with growing data storage requirements.
  • Best-in-class performance. Already the best-in-class storage option out there, EMC’s new Isilon release features a 70% increase in performance, setting the bar even higher. On the analytics side, bundled with Pivotal HD, Pivotal HAWQ is the first and only true SQL query engine for Hadoop®. It delivers high-performance query processing with the fastest query optimizer in the market (ORCA), up to 21x faster than any other option on the market, with multi-petabyte scalability, interactive and true ANSI SQL support, and programmable analytics.
  • Enterprise-grade security. Enterprise-class data protection maximizes availability, and robust security options help meet business governance requirements.
  • Right-sized economics. This enterprise-ready, scale-out storage platform natively integrates with HDFS and leads the market with over 80% utilization rates, ensuring you get the most out of each node. The new version is also priced aggressively, lowering $/MBPS by a whopping 33%. With the new EMC Data Lake Hadoop® Bundle, you also get the first 10 nodes of Pivotal HD free and, regardless of size, you will only be charged for data processing, not storage.
  • A Joint Solution By the EMC Federation. Pivotal and EMC have bundled this solution together with a single support model, simplifying how you buy and support your big data solution from two tightly aligned partners in the big data space.

Today is a bright day in the dawn of the golden age of data. Backed by the EMC Federation, the new EMC Data Lake Hadoop® Bundle serves as a major milestone for the consumerization of big data, affording companies the opportunity to capitalize on their data in a scalable, secure, and timely manner.

For more details on the Data Lake Starter Kit, check out the following resources:

  • See the Press Release from EMC
  • Live Streaming: To view the live stream of the Redefine Possible event, visit: http://www.emc.com/redefinepossible
  • Social: Join the conversation with #RedefinePossible. Stay engaged with Pivotal by following us on Twitter @GoPivotal.
  • Live Chat: Ask Jeremy Burton anything during a live chat today. Click here to register and use #RedefinePossible to join in.

Editor’s Note: Apache, Apache Hadoop, Hadoop, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
