All About The New Open Source Greenplum Database

November 11, 2015 Simon Elisha

It is an exciting time for customers using data-related software, with more data being processed and analysed than ever. To do this effectively you need some heavy duty technology. And one of the best for a long time has been the Pivotal Greenplum Database. Now, this database is available as Open Source, giving customers even more flexibility, and enabling a flourishing community of developers and contributors. In this episode, we explore more about what this means and how you can be a part of it.

PLAY EPISODE

SHOW NOTES

Subscribe to the feed
Feedback: podcast@pivotal.io
Links referred to in the show:
- Greenplum Website
- Pivotal Query Optimizer

TRANSCRIPT

Announcer:
Welcome to the Pivotal Perspectives Podcast, the podcast at the intersection of Agile, Cloud, and Big Data. Stay tuned for regular updates, technical deep dives, architecture discussions, and interviews. Now let’s join Pivotal’s Australia and New Zealand CTO, Simon Elisha, for the Pivotal Perspectives Podcast.
Simon Elisha:
Hello, everyone, and welcome back to the podcast. Thanks for making the time to have a listen.

Coming to you from Auckland, New Zealand. Not my normal base of operations, but out here visiting some customers and some partners, and having a good old time seeing what’s what here in the North Island of New Zealand, which is a very beautiful place if you ever get a chance to come on down.

Today, I wanted to talk to you a little bit about Greenplum, the Greenplum Database, and what is now the world’s first open source MPP data warehouse. What is taking place? Well, we have taken the Greenplum Database technology, which is a massively parallel processing data warehouse, and committed it to open source.

Let’s step back a little bit and just define what it is we’re talking about here. The Greenplum Database is a very well-known database technology. In fact, it was built over the last ten years. It is deployed with many, many customers with some pretty impressive workloads and does a huge amount of work. What it is, is a massively parallel processing, or MPP, database.

This is different from a traditional database in that it spreads the workload across many nodes, or segments, which basically means it slices and dices the work, but provides the end user or application the same single interface that they are already familiar with.

What does this mean? This means that we can process far greater amounts of data, i.e. terabytes to petabytes, with very quick response times, because what we do with this very efficient and, to be honest, complicated technology is split up the work between different segment nodes that handle components independently. They have their own RAM and their own data, and are connected over a very high-bandwidth network, and they basically divide and conquer the problem space, delivering results that are many times faster than pretty much any other technology that you could use.
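To make the divide-and-conquer idea concrete, here is a minimal Python sketch of how a grouped sum might be split across segments and merged by a coordinator. It is an illustration only, not Greenplum's implementation: the names `distribute`, `segment_sum`, `parallel_sum`, and `NUM_SEGMENTS` are hypothetical, the sample rows are made up, and threads stand in for segment hosts that would really be separate machines with their own RAM and data.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_SEGMENTS = 4  # hypothetical number of segments


def distribute(rows, key, num_segments=NUM_SEGMENTS):
    """Assign each row to a segment by hashing its distribution key."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        # crc32 gives a stable hash, so placement is the same on every run.
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % num_segments
        segments[bucket].append(row)
    return segments


def segment_sum(rows, group_key, value_key):
    """Work one segment does on its own slice: a partial grouped sum."""
    partial = defaultdict(int)
    for row in rows:
        partial[row[group_key]] += row[value_key]
    return dict(partial)


def parallel_sum(rows, group_key, value_key, num_segments=NUM_SEGMENTS):
    """Scatter rows to segments, aggregate each slice, then merge the partials."""
    segments = distribute(rows, group_key, num_segments)
    with ThreadPoolExecutor(max_workers=num_segments) as pool:
        partials = list(
            pool.map(lambda seg: segment_sum(seg, group_key, value_key), segments)
        )
    merged = defaultdict(int)
    for partial in partials:
        for group, subtotal in partial.items():
            merged[group] += subtotal
    return dict(merged)


# Made-up sample data for illustration.
sales = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 7},
    {"region": "east", "amount": 3},
]
totals = parallel_sum(sales, "region", "amount")
```

The caller sees a single function call and a single result, which mirrors the point above: the work is sliced up behind the scenes, but the interface stays the same.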

What this means is you can have a really powerful data warehouse capability and data analytics capability in the one place, because Greenplum also supports things like MADlib and other tools to run queries and run data analytics very close to the data itself in a parallel way. Again, the secret of speed in most cases is parallelization. If you can run many small tasks at the same time, you get the outcome you want.

Greenplum, the Greenplum Database, is now being released under the Apache License 2.0 and is now available for you to enjoy, play with, use, experiment with, and explore as you like.

This represents over ten years of development and some two million lines of code, which is pretty exciting. It also includes the next generation query optimization technology, which has never been available commercially outside of Pivotal. That allows you to get unbelievably great performance from your queries. In fact, some big data queries run up to one thousand times faster. This is the Pivotal Query Optimizer. A huge amount of work has gone into this.

What this means now is that you have access to this technology that you can explore, contribute to, and use, and you can also get commercial support, of course, through Pivotal.

Now, what we’ve done is decided to take a stewardship role of this project, because we believe this is a strategic component of our big data efforts as an overall company. The project is maintained for reuse and collaboration with a broader community. We think that particularly the PostgreSQL community will be really active in this because this is where the Greenplum technology came from. We will continue to sponsor development, maintenance, and innovation for the Greenplum technology itself. This is what we’ve also done with technologies like Pivotal Cloud Foundry®, Spring, RabbitMQ, and Geode.

It’s really exciting. It’s a really exciting milestone because what we’re doing is moving away from the days of closed source vendor lock-in. We’re not accepting legacy models anymore. We’re not expecting customers to be locked into particular technology decisions. We’re giving them complete choice and complete flexibility.

In fact, at Pivotal we’ve open sourced all of our cloud and data products inside of ten months. That’s a pretty amazing step when you think about it. Some ten million lines of code have moved from the commercial proprietary sphere into a thriving open source ecosystem.

We’re pretty excited about that because it means that customers can do better things. You can be involved. You can drive the project in the direction you want, and you can contribute changes and improvements as you like.

Where do you go to do all this? Well, have a look at Greenplum.org. That’s g-r-e-e-n-p-l-u-m dot org where you can download the source code, you can join the mailing lists, and you can contribute to that, as well. Also, you can find the Twitter handle: @greenplum, on Twitter, obviously, and participate that way, as well.

Something to have a look at if you’re in the market for a big data solution with a lot of processing power, that you want to use SQL, that you want to use in-memory analytics for, that you want to deploy as a software component in the cloud, on your local premises, in a virtualized environment. You can make the choice and be part of what is promising to be a very exciting and thriving new community.

That’s a bit of a snapshot of the new Greenplum Database in its open sourced form. Go visit the website and enjoy.

As ever, if there are suggestions or things you’d like to hear on the podcast, you can make suggestions at podcasts@pivotal.io, and until next time from beautiful New Zealand, keep on building.

About the Author

Simon Elisha is CTO & Senior Manager of Field Engineering for Australia & New Zealand at Pivotal. With over 24 years of industry experience in everything from mainframes to the latest cloud architectures, Simon brings a refreshing and insightful view of the business value of IT. Passionate about technology, he is a pragmatist who looks for the best solution to the task at hand. He has held roles at EDS, PricewaterhouseCoopers, VERITAS Software, Hitachi Data Systems, Cisco Systems and Amazon Web Services.
