Q&A with Pivotal VP Todd Paoletti on the EMC Federation, EVP, and How They Are Changing Hadoop for the Enterprise
This week, Pivotal is headed to EMC World as a premier member of the Federation. There, Pivotal and EMC will show attendees the power of running Hadoop workloads on Pivotal HD powered by EMC Isilon. Before heading to Las Vegas, we did a short Q&A with Pivotal VP Todd Paoletti on Big Data, Hadoop, and the power of the EMC, Isilon, and Pivotal big data partnership.
First, how is the EMC, VMware and Pivotal (EVP) Federation helping businesses today?
Together, we are uniquely positioned to pave the way for Hadoop and HDFS to become mainstream in the enterprise. The opportunity they provide for business advantage has rippled across the data management ecosystem, and enterprises are rapidly trying to mature the offering so it fits real-world, production, enterprise-grade use cases. While the open source momentum is fast and furious, enterprises need to put a stake in the ground and establish a stable, secure, resilient and scalable enterprise instantiation of this powerful new technology.
The beauty of the EMC and Pivotal partnership is that we share a vision and collectively have a team of players that bring this hardened infrastructure to our customers’ most demanding environments. EMC has delivered time and time again the world’s most advanced storage infrastructures for the technological challenges of the time. Today is no different. They’ve delivered Isilon as the gold standard for HDFS-compliant scale-out NAS storage.
Simultaneously, Pivotal has invested and delivered on the world’s most advanced enterprise Hadoop distribution, pouring hundreds of man-years of engineering from our Greenplum, Gemstone and SpringSource engineers into rounding out Pivotal HD to become the gold standard for how companies will deploy, mine and integrate real-time insights into actual business applications.
With these two gold standards converging, the risk, cost and complexity of these big data deployments reach acceptable levels, opening the floodgates of opportunity for enterprises of all sizes and purposes.
Where does the Business Data Lake fit into the customer journey to this innovation?
This is the Big Data dream of every business. It breeds CIO envy everywhere it touches. It is a powerful idea to have just one data master that is open enough, and cost effective enough, where you can save all data, blend data from external sources easily to discover new insights and promote them into actionable advantages in business applications.
This solves a slew of challenges that have plagued CIOs throughout the history of computing. It allows them to affordably save all data for use in opportunities they may not even realize exist today. It makes data accessible so the entire organization can mine it and discover new insights. And it smooths the path for developers to operationalize that data into business applications that reap business value. This is the holy grail for data, and it’s completely possible now with our Business Data Lake solution.
What does the Federation provide when its solutions are used together? Specifically why do Pivotal HD + Isilon together make sense in the enterprise?
EMC Isilon is the most powerful, scalable, and hardware-efficient way to achieve scale-out NAS storage for all kinds of data, from transactional to random access data. It allows companies to get a cluster online in under 10 minutes, or scale it out in 60 seconds. It uses a single file system, which dramatically simplifies the overhead of management and makes data more easily accessible. It achieves 80% storage utilization, which means you get the best bang for your hardware buck. In fact, it is such an automated, consolidated, streamlined process that it only requires one FTE per petabyte. Combine this with the fact that it lets data scale linearly from 18TB to over 20PB with no downtime and you understand why this is the gold standard for big data storage.
Through the Federation, Pivotal and EMC have tailored these advantages to Hadoop. Isilon is the first and only scale-out NAS platform to incorporate native support for the HDFS layer through a powerful Hadoop plug-in.
Pivotal HD complements that and takes it further, extending the capabilities of open source Apache Hadoop to become more real-time. Pivotal HD marries traditional Hadoop with two very important technologies. First, it includes support for HAWQ, our massively parallel SQL engine technology derived from Greenplum, which not only allows data analysts to return to their SQL roots, but actually speeds up queries by 100x across a wide range of query types and workloads. In fact, some customers have reported query speed increases of 600x. Couple that with our GemFire XD technology, which promotes data into memory for even faster access for real-time workloads while still allowing direct write access to HDFS for backend analytics, and you can see how Pivotal HD moves Hadoop into position for the enterprise to more quickly put Big Data to use for business advantage.
But what is the value of this over a fast and cheap Apache Hadoop framework?
While open source Apache Hadoop is cheap, and HDFS is fast, without Isilon and Pivotal HD, growing a typical Hadoop cluster and using the data within it are still hard, and cost enterprises a lot of manpower as well as lost opportunities. To use it in business-critical systems, companies have to compensate for risks around high availability, failover, and security. Isilon does that for you. Then you need to consider the development effort, from discovery to incorporating insights into business applications. As is typical with most open source projects, this integration into business systems is outside the project’s purview. Pivotal’s mission and product portfolio, ranging from the Spring development framework to application fabrics and PaaS, have rapidly honed Pivotal HD to close the gap where open source leaves off. The Federation’s work to marry the hardware optimization with the development process means a big win for the bottom line while dramatically speeding up the process of getting big data into real-time use.
This nets down to the classic justification that you have to spend money to make money. Even there, we’ve listened to customers and worked to improve those economics even more. With Pivotal’s Big Data Suite, we include unlimited data storage for Pivotal HD, effectively removing the incremental software cost for storing big data. With our solution you pay for what you analyze or use, and combined with Isilon, that means enterprise-grade Big Data, for less. As Hadoop matures and advances, we will see greater demand to scale processing independently of storage, or vice versa, to fit the varied workloads we see today.
At Pivotal we aim to help customers store everything, analyze anything, and build the right thing. With Isilon and Pivotal’s integrated approach to data management and development, that big data dream we talked about is solidly within reach.