Parallel Postgres for enterprise analytics at scale

With improved transaction processing capability and support for streaming ingest, VMware Greenplum can address workloads across a spectrum of analytic and operational contexts, from traditional business intelligence to deep learning.

Greenplum is designed to run anywhere—on-premises or in public and private clouds—for easier installation, operation, and upgrades.


Analytics from BI to AI

Consolidate more workloads in a single environment

Greenplum reduces data silos by providing you with a single, scale-out environment for converging analytic and operational workloads, like streaming ingestion. Execute point queries, fast data ingestion, data science exploration, and long-running reporting queries with greater scale and concurrency.

Deploy anywhere

Run analytics on public and private clouds or on-premises

Greenplum provides your enterprise with flexibility and choice because it can be deployed on all major public and private cloud platforms, on-premises, and in sovereign clouds.

Open source innovation

Pre-integrated components for easier consumption

VMware Greenplum is based on PostgreSQL and the Greenplum Database project. It offers optional, use-case-specific extensions such as PostGIS for geospatial analysis and GPText (based on Apache Tika and Apache Solr) for document extraction, search, and natural language processing. These extensions are pre-integrated to ensure a consistent experience rather than a "wild west," do-it-yourself open source approach. Instead of depending on expensive proprietary databases, users can benefit from the contributions of a vibrant community of developers.

Enterprise data science

Streamline data science operations and simplify workflows

Tackle data science from experimentation to massive deployment with Apache MADlib, the open source library of in-cluster machine learning functions for the Postgres family of databases. MADlib with Greenplum provides multi-node, multi-GPU and deep learning capabilities. It also offers automation-friendly features such as model versioning, and the capability to push models from training to production via a REST API. Users avoid the pain of porting and re-coding analytical models.

“Whatever use case we can dream up and whatever ways we can think of to better understand the user, Greenplum allows us to do it.”

John Conley, Vice President of Data Warehousing, Conversant

Architecture


Greenplum architecture diagram

Features


Cloud-agnostic for flexible deployment

Greenplum is available on leading public cloud marketplaces—Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)—with “bring your own license” (BYOL) and hourly consumption models. It’s also available for VMware vSphere and OpenStack private clouds. Best of all, it’s the same Greenplum version and the same tools across all clouds for a consistent experience.

Value and performance in an appliance-like experience

Dell Greenplum Reference Architecture is the most performant way to run VMware Greenplum in an on-premises deployment. It’s a VMware-certified and supported blueprint for Dell hardware configurations that replace proprietary appliances. Users can also deploy Greenplum on HP- and Cisco-certified configurations, as well as their own commodity hardware.

Analytics from business intelligence to artificial intelligence

Machine learning, deep learning, graph, text, and statistical methods are all provided in one scale-out MPP database. Use geospatial analytics based on open source PostGIS, and text analytics based on Apache Solr with Greenplum’s GPText. Greenplum also offers extensive support for R and Python analytical libraries, as well as Keras and TensorFlow.

Handle streaming data and cloud data with ease

Greenplum includes integration with the Kafka ecosystem, certified by Confluent. Together with improved low-latency writes, this integration gives Greenplum fast event processing for streaming use cases. The ability to query Amazon S3 objects in place improves integration with cloud data.

Maximize uptime and protect data integrity

Greenplum has features for high availability, intelligent fault detection, and fast online differential recovery, as well as full and incremental backup and disaster recovery. Security and authentication features address enterprise policy and regulatory requirements.

Industry-leading performance

With a cost-based query optimizer designed for large-scale data workloads, Greenplum scales interactive and batch-mode analytics to petabyte-scale datasets without degrading query performance or throughput.
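To make "cost-based" concrete, here is a toy sketch of one decision an MPP planner faces when joining two distributed tables: rehash the misaligned tables on the join key, or copy the smaller table to every segment. This is not Greenplum's actual cost model. The names `TableStats` and `choose_join_motion` are hypothetical, and the only cost considered is the estimated number of rows sent over the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    """Hypothetical planner statistics for one table."""
    name: str
    rows: int
    distributed_on_join_key: bool


def rows_moved_redistribute(left: TableStats, right: TableStats) -> int:
    """Rows sent if each misaligned table is rehashed on the join key."""
    return sum(t.rows for t in (left, right) if not t.distributed_on_join_key)


def rows_moved_broadcast(small: TableStats, num_segments: int) -> int:
    """Rows sent if the smaller table is copied in full to every segment."""
    return small.rows * num_segments


def choose_join_motion(left: TableStats, right: TableStats, num_segments: int):
    """Return (strategy, estimated rows moved) for the cheapest option."""
    if left.distributed_on_join_key and right.distributed_on_join_key:
        # Matching rows already live on the same segment: nothing moves.
        return ("colocated", 0)
    small = min(left, right, key=lambda t: t.rows)
    options = {
        "redistribute": rows_moved_redistribute(left, right),
        "broadcast": rows_moved_broadcast(small, num_segments),
    }
    strategy = min(options, key=options.get)
    return (strategy, options[strategy])


facts = TableStats("facts", 1_000_000, False)
dim = TableStats("dim", 100, False)
plan = choose_join_motion(facts, dim, num_segments=8)  # -> ('broadcast', 800)
```

With a very small dimension table, copying it everywhere is far cheaper than moving a million fact rows. With two tables of similar size, the same function picks redistribution instead. A real optimizer weighs many more factors, such as CPU, memory, and data skew.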

Based on open source projects

Avoid proprietary vendor lock-in. The Greenplum Database open source project is 100% in alignment with the PostgreSQL community. All major VMware Greenplum contributions are part of the Greenplum Database project and share the same database core, including the MPP architecture, analytical interfaces, and security capabilities.

Massively parallel, highly concurrent architecture

Greenplum features a shared-nothing architecture that automates parallel processing of data and queries and petabyte-scale data ingestion. Its open source, cost-based query optimizer (GPORCA) was developed specifically to address advanced analytics, creating query plans that execute complex joins at breakthrough performance on large data volumes.
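The shared-nothing idea can be illustrated with a minimal sketch. It assumes rows are placed on segments by hashing a distribution key, each segment aggregates only its own rows, and a coordinator merges the partial results. The segment count, the `crc32` hash, and every function name here are assumptions chosen for illustration and do not reflect Greenplum internals.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_SEGMENTS = 4  # hypothetical cluster size


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Map a distribution key to a segment using a stable hash."""
    return crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments: int = NUM_SEGMENTS):
    """Place each (customer_id, amount) row on exactly one segment."""
    segments = [[] for _ in range(num_segments)]
    for customer_id, amount in rows:
        segments[segment_for(customer_id, num_segments)].append((customer_id, amount))
    return segments


def local_sum(segment_rows):
    """Work one segment does alone: a partial SUM(amount) GROUP BY customer_id."""
    totals = defaultdict(int)
    for customer_id, amount in segment_rows:
        totals[customer_id] += amount
    return dict(totals)


def parallel_group_sum(rows, num_segments: int = NUM_SEGMENTS):
    """Run the per-segment aggregates concurrently, then merge the results."""
    segments = distribute(rows, num_segments)
    with ThreadPoolExecutor(max_workers=num_segments) as pool:
        partials = list(pool.map(local_sum, segments))
    merged = {}
    for partial in partials:
        # Rows are distributed by the grouping key, so no key appears
        # in more than one partial result and a plain update is enough.
        merged.update(partial)
    return merged


ROWS = [("c1", 10), ("c2", 5), ("c1", 7), ("c3", 1), ("c2", 20)]
RESULT = parallel_group_sum(ROWS)  # -> {'c1': 17, 'c2': 25, 'c3': 1} (key order may vary)
```

Because the grouping key matches the distribution key, no segment needs data from another segment. That independence is what lets this style of architecture scale by adding segments.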

Use Cases

Enterprise analytics and AI

With support for advanced algorithms such as multilayer perceptrons and convolutional neural networks in Apache MADlib, users can begin to tackle cutting-edge use cases in speech recognition, image recognition, machine translation, and computer vision. With optional support for REST APIs, you can train, test, and deploy in a single language (SQL), which reduces errors when putting models into production at scale.

Flexible deployment on-premises or in the cloud

Move your analytics workloads to the platform of your choice under the terms and in the timeframe you choose. Deploy on private, sovereign, or public clouds (like AWS, Microsoft Azure, or GCP) or on-premises with GBB. Have the freedom to select the best platform for each project and workload based on ease of use, performance, and total cost of ownership (TCO).

Enterprise data warehouse modernization and replatforming

Replatform legacy enterprise data warehouses (EDWs) to replace expensive, proprietary databases. Modernize with the only open source-based, multi-cloud platform for analytics offering the full range of data warehouse functionality that your enterprise demands. Gain the power of an MPP system in conjunction with proven technology to reduce the cost and complexity of application migration.

Let's talk.

Contact us about VMware Greenplum.
