Industry Analyst Insight on How Big Data is Bigger Than Data

June 19, 2014 Stacey Schneider

Gartner’s Svetlana Sicular posted her thoughts on 5+ Big Data Companies to Watch on June 17, 2014. In it, Pivotal is listed as one of those big data companies to watch, with an important caveat.

We feel this falls in line with our vision that there is a much bigger picture than just big data, and big data’s future depends on ingraining it into the entire development process to make it more easily incorporated, modified and deployed so companies can capitalize on big data opportunities happening right now.

Historically, big data technologies have lived offline in silos, and it took weeks or months to produce insight and comprehensive reports. This is a waste.

Recognizing this, internet giants such as Yahoo! and Facebook developed a new approach for harnessing big data with Apache Hadoop®. MPP databases like Greenplum Database emerged that shared the idea that parallel processing could speed up data munching and crunching, while also making it more affordable by running on commodity hardware. In-memory data grids emerged, caching frequently used data for rapid-fire access by applications. For many, a complete solution involves all three of these types of technologies, as explained in our post Exploring Big Data Solutions: When To Use Apache Hadoop® vs In-memory vs MPP.
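
The parallel processing idea shared by these technologies can be illustrated with a minimal sketch. This is not how Greenplum Database or Apache Hadoop® is implemented; the function names (`partition`, `segment_sum`, `parallel_sum`) and the notion of a "segment" as a plain Python list are assumptions made purely for illustration. The sketch splits the data across independent workers, has each compute a partial result, and then combines the partials.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_segments):
    """Split rows round-robin into n_segments lists (hypothetical 'segments')."""
    return [rows[i::n_segments] for i in range(n_segments)]


def segment_sum(segment):
    """Work done independently on one segment: a partial aggregate."""
    return sum(segment)


def parallel_sum(rows, n_segments=4):
    """Scatter rows across segments, aggregate each one, then combine."""
    segments = partition(rows, n_segments)
    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        partials = list(pool.map(segment_sum, segments))
    # The final step only has to combine n_segments small results.
    return sum(partials)


# Illustrative data only: 1,000 made-up "sales" amounts.
sales = list(range(1, 1001))
total = parallel_sum(sales)
```

Threads are used here only to keep the example short and show the scatter-and-combine shape; in a real MPP system the segments would live on separate commodity machines, which is where the speedup comes from.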

But if you step back farther, delivering big data solutions relies on far more than the data stores. It’s the application code. It’s the messaging system. It’s the array of devices you are collecting data from, whether it be from the web, mobile devices or telemetry from the Internet of Things. It’s the machines it’s running on, whether on premise or in the cloud. It’s how you deploy to those machines. It’s how you manage them, and the apps deployed on them.

Basically, we are of the opinion that it’s everything that touches the application lifecycle. And if you bring all those pieces together, it’s the third platform.

Pivotal’s Mission

Recognizing that the future, and likely the growth of virtualization and storage sales, depended on the apps deployed on both of these platforms, EMC and VMware got together and decided to spin out Pivotal, with a collection of ingredients from both companies that paints a complete picture of how these apps gel together. Knowing their bread and butter depended on the apps that ran on their platforms, they had been steadily investing in a broad range of application technologies that covered application, cloud and data fabrics.

The only real gaps in EMC’s and VMware’s combined portfolios were the Internet of Things and mobile, both of which are undoubtedly the largest sources of big data, and where big data offers the most opportunities for competitive advantage and business optimization. This is why General Electric was essential to our launch, which leverages its investments in the Industrial Internet to lead the entire market to the Internet of Things. This is also why Pivotal’s first acquisition was Xtreme Labs, with their technical prowess for developing world-class mobile applications.

Now under one roof, we are all aimed at the third platform where all of these pieces converge and just work for developers and business users alike. In short, we are working hard to eliminate complexity, maximize automation, reduce overhead and costs and make it really simple for companies to focus on building really great software.

To that end, we’ve made some major strides in the past year:

Cloud Foundry. Underpinning the entire third platform, Cloud Foundry is critical to eliminating the overhead and risks of building and deploying apps. After last week’s Cloud Foundry Summit, there is no denying this vision is happening. Last year we delivered the 1.0 enterprise-ready version of Cloud Foundry, Pivotal CF, and started to recruit an ecosystem of ISVs to do the same. Earlier this year, we formalized the ecosystem with the Cloud Foundry Foundation, which will ensure open governance for the project. The momentum this project has gained since the spin-out is real and massive. It was extremely evident at the Cloud Foundry Summit last week, where a newly minted ecosystem worth over $3 trillion in market cap was present, as I described in my field report.
Spring. Java developers still rule the biggest apps on the planet. And Spring is at the forefront of this movement helping to reduce complexity and lines of code needed to deliver apps. This past year, we’ve also worked hard to make it easier for Spring developers to work with big data by releasing Spring XD. We’ve made it simpler to deploy apps with Spring Boot. And right now we are working on taking the dozens of projects in the Spring ecosystem, and applying the principles of the consumerization of IT to them. Basically, we are releasing a new version of Spring soon that will eliminate the version dependencies between all the Spring projects, allowing developers to confidently use any Spring component in the same release, and upgrade them smoothly.
Big Data. Just before the spinout, EMC released Pivotal HD, our distribution based on Apache Hadoop® that combines HDFS with HAWQ, a SQL query engine built on the Greenplum technology we have been developing for over 10 years, which can speed up Apache Hadoop® queries by 100x. One retail company shared that they enjoyed even bigger results, reporting that it sped up queries by 318x. Later that year, we released Pivotal HD 2.0 to better support the concept of the Business Data Lake, something we believe will be a common architecture soon, along with partners like Cap Gemini who are also making a big investment in our vision. We’ve also added GemFire XD to better synchronize data from Apache Hadoop® to our in-memory data grid, GemFire. Then, we made it easier for customers to buy enterprise-grade solutions for Apache Hadoop®, MPP and IMDG with the Big Data Suite. This new suite lets customers buy a subscription under which they can use licenses interchangeably. It also right-sizes the economics of big data by charging only for how much you use the data, not how much you collect, essentially making storage on Apache Hadoop® free and encouraging companies to use more data.
Application Fabric. Aside from feature improvements and continued work to harden each of these application components to work in high-demand environments, Pivotal has been working to make the group of products that make up the Application Fabric more friendly with big data, Cloud Foundry and customer purchasing. We have added support for Redis, added new Cloud Foundry services such as the RabbitMQ service that can be stood up easily in the cloud, and will be launching a new consumption model similar to the Big Data Suite that allows customers to easily swap licenses as projects and scale demand.

Building the Vision

We feel that Gartner’s Sicular is not the only one picking up on our progress and recognizing the value of the bigger picture. It’s also one of the main reasons people want to work for Pivotal. People like Roman Shaposhnik, a former member of the Yahoo! team that originally worked on Apache Hadoop® and later went to Cloudera, who spent years building Apache Hadoop® into the leading Apache project on big data. Now, he wants to work on things bigger than just Apache Hadoop®. He wants to help reinvent the entire development lifecycle with big data solutions baked in. Or, in his words:

In my mind, Pivotal is an ideal sponsor for an effort to bring us closer to the fully integrated, easy to use Apache Hadoop® platform. First of all, Pivotal is an extremely open source oriented company. Projects like Spring framework, Groovy, and Grails, just to name a few, are the building blocks of Pivotal One. Pivotal has the open source gene baked into its DNA from day one. Even more importantly, the vision of Pivotal One is that of an application development platform offering developers a variety of application and data services available as part of a single comprehensive offering. At the end of the day, Pivotal has a much more ambitious goal compared to all the Apache Hadoop® vendors out there. The company puts a huge emphasis on platform integration and views Apache Hadoop® APIs in a much wider context of application and data services. The Pivotal One vision is way bigger than any of its parts (even if that part is as big as an entire Apache Hadoop® ecosystem).

Pivotal has a big vision and a bright future. Enterprises of all sizes will continue to see us hack away at making the development process smoother, and all of our products working together like clockwork. So, stay tuned. There’s a lot more where that came from coming soon!

Note on Gartner:

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Editor’s Note: Apache, Apache Hadoop, Hadoop, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
