Introducing VMware Greenplum 7: Transform All Your Data to Actionable Insights, at Any Scale

September 27, 2023 Arnab Chakraborty

In today’s fast-paced business landscape, enterprises are constantly seeking ways to enhance their operations, streamline decision-making processes, and gain a competitive edge. The key to achieving these goals lies in harnessing the wealth of data at their disposal. However, this endeavor is not without its challenges. The volume, diversity, and sources of data are continually expanding, and the techniques for extracting value from this data are ever-evolving. 

This is where VMware Greenplum steps in. Greenplum is a unified analytics and artificial intelligence (AI) platform designed to help enterprises make the most of their data resources. Whether the data is structured, semi-structured, or unstructured, Greenplum provides a single platform that serves as the "single source of truth," and it now integrates with the latest large language model (LLM) approaches through support for parallel vector processing.

The power of unification

At the core of VMware Greenplum is an open source PostgreSQL foundation, which forms the basis for seamless integration of business intelligence (BI) and AI capabilities. This unique amalgamation of diverse tools and technologies, within a unified platform, equips enterprises with the ability to tackle complex challenges swiftly and efficiently—all from the familiar interface of a SQL database. 

Imagine a scenario where you need to intelligently search through vast volumes of customer feedback documents, merge this information with detailed customer transaction histories from online transaction processing (OLTP) systems, and then distill these insights into actionable business recommendations. This multifaceted task, which previously involved various data silos and disparate tools, can now be executed entirely within the Greenplum platform. The result? Improved operational efficiency and heightened responsiveness to customer needs.

A seamless journey from BI to AI

One of the remarkable attributes of Greenplum is its capacity to unify data analytics and AI requirements, facilitating a smooth transition from BI to AI applications. This transition can occur at any scale, whether you’re dealing with small datasets or vast data ecosystems at the petabyte scale. 

Greenplum’s versatility is further enhanced by its ability to adapt to the ever-changing data landscape. As the volume and variety of data continue to grow, and as new analytical techniques emerge, VMware Greenplum evolves in tandem. This helps enterprises remain at the forefront of data-driven decision making, continuously unlocking new insights and opportunities.

Introducing VMware Greenplum 7  

VMware recently announced VMware Greenplum 7 at VMware Explore. Today, we are making this next generation of VMware Greenplum available.

VMware Greenplum 7 epitomizes our commitment to creating and evolving an intrinsically secure, mature, and flexible SQL-based online analytical processing (OLAP) platform. This innovative platform introduces a slew of enhancements and additions, with an emphasis on cutting-edge resource management and sophisticated analytics capabilities for various data types, whether structured, semi-structured, or unstructured. 

VMware Greenplum 7 ushers in many important advancements around seamless data scalability, multi-workload handling, and deployment flexibility. 

What’s new in VMware Greenplum 7 

Check out the powerful new features being introduced in VMware Greenplum 7.

Open source and derivation from PostgreSQL 12: Built on an open source foundation, VMware Greenplum 7 harnesses the features, reliability, and flexibility of a modern PostgreSQL version. Compared to its predecessor, version 7 incorporates five years’ worth of PostgreSQL releases and is rooted in PostgreSQL version 12.

Multiple index types: VMware Greenplum 7 supports a broad spectrum of index types, including B-tree, Hash, Bitmap, Block Range Index (BRIN), text indices, geospatial indices, and AI vector indices. This range of options optimizes data retrieval and query performance. The Greenplum Query Optimizer, refined since 2009 and with a proven track record in version 6, is extended into version 7 with full index selection support.

Enhanced data federation with PXF: The Platform Extension Framework (PXF) in VMware Greenplum 7 has been improved to enable superior data federation. Businesses can now query datasets in Amazon Simple Storage Service (S3) object stores, Hadoop Distributed File System (HDFS), and other relational databases via JDBC. PXF leverages the Foreign Data Wrapper API from PostgreSQL to access remote data sources in parallel, offering an abstracted data model for managing security and for keeping statistics about the remote data to support query optimization.

Improved text search: VMware Greenplum 7 expands its text search capabilities, supporting both lexical and AI-powered semantic searches to deliver more accurate search results. Lexical search provides traditional keyword-based term search, while semantic search, powered by AI and vector embeddings, finds matching content based on semantic meaning.
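The difference between the two search styles can be illustrated with a minimal Python sketch. It is not Greenplum code: the documents, the three-dimensional "embeddings," and the function names are all made up for illustration, and a real system would obtain its vectors from an embedding model.

```python
import math

# Toy corpus. The 3-dimensional vectors are hypothetical stand-ins for
# embeddings that a real embedding model would produce.
DOCS = {
    "doc1": ("The delivery arrived late and the box was damaged", [0.9, 0.1, 0.0]),
    "doc2": ("Shipping took too long and the package was crushed", [0.8, 0.2, 0.1]),
    "doc3": ("Great prices and a friendly support team", [0.1, 0.9, 0.2]),
}


def lexical_search(query, docs):
    """Return ids of documents that share at least one keyword with the query."""
    terms = set(query.lower().split())
    return sorted(
        doc_id
        for doc_id, (text, _) in docs.items()
        if terms & set(text.lower().split())
    )


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def semantic_search(query_vec, docs, top_k=2):
    """Return the top_k document ids ranked by embedding similarity."""
    ranked = sorted(
        docs,
        key=lambda doc_id: cosine_similarity(query_vec, docs[doc_id][1]),
        reverse=True,
    )
    return ranked[:top_k]


# Keyword search only finds the document that literally contains the terms.
keyword_hits = lexical_search("late delivery", DOCS)

# A hypothetical embedding for the same query also surfaces doc2, which
# describes the same problem without sharing any keyword.
semantic_hits = semantic_search([0.85, 0.15, 0.05], DOCS)
```

In this toy data, the keyword query matches only the first document, while the vector query also retrieves the second one, which describes the same complaint in different words.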

Upgraded geospatial analytics: VMware Greenplum 7 has upgraded geospatial analytics capabilities with the integration of PostGIS version 3. This improvement significantly enhances the speed and feature-richness of geospatial queries.

Row-level permissions for security: This feature supplements the role-based security model and the table- and column-level permissions already present in VMware Greenplum.

Generated columns for enhanced data modeling: The introduction of generated columns in VMware Greenplum 7 allows for improved data abstraction and modeling, solving use cases such as feature-preserving data masking for security. 
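To make the masking use case concrete, here is a small Python sketch of the idea behind a generated column: a value that is always derived from another stored value and never written directly. It is an analogy rather than Greenplum syntax, and the CustomerRow class, the mask_email helper, and the masking rule itself are assumptions chosen for illustration.

```python
from dataclasses import dataclass


def mask_email(email: str) -> str:
    """Hide the local part but keep the domain, so grouping by domain still works."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


@dataclass(frozen=True)
class CustomerRow:
    """Hypothetical row with one sensitive base column."""

    customer_id: int
    email: str  # sensitive base value

    @property
    def email_masked(self) -> str:
        # Plays the role of a generated column: always computed from `email`,
        # so it cannot drift out of sync with the base value.
        return mask_email(self.email)


row = CustomerRow(customer_id=1, email="alice@example.com")
masked = row.email_masked
```

The masked value hides the identity while preserving a feature (the email domain) that analysts may still need, which is what "feature-preserving" refers to here.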

Improved DBA query features: Greenplum 7 brings a host of enhancements to DBA query features, including UPSERT support, user-defined functions with transactions, and improvements to ALTER TABLE that reduce data rewrites.
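As a rough illustration of what an upsert does, the sketch below uses Python's built-in sqlite3 module as a stand-in database, since SQLite accepts the same INSERT ... ON CONFLICT ... DO UPDATE form that PostgreSQL uses. That Greenplum 7 accepts this exact statement is an assumption based on its PostgreSQL 12 lineage, and the inventory table and receive_stock helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, qty INTEGER NOT NULL)"
)

# Insert a new row, or add to the existing quantity when the key already exists.
# `excluded` refers to the row that the INSERT attempted to write.
UPSERT_SQL = """
INSERT INTO inventory (sku, qty) VALUES (?, ?)
ON CONFLICT (sku) DO UPDATE SET qty = inventory.qty + excluded.qty
"""


def receive_stock(sku, qty):
    """Record received stock with a single statement, whether or not the SKU exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT_SQL, (sku, qty))


receive_stock("A-100", 5)
receive_stock("A-100", 3)  # conflicts on sku, so the quantities are summed
receive_stock("B-200", 7)

rows = conn.execute("SELECT sku, qty FROM inventory ORDER BY sku").fetchall()
```

Without an upsert, the same logic needs a separate existence check followed by either an INSERT or an UPDATE, which is more code and is prone to race conditions under concurrent writes.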

Enhanced semi-structured and unstructured data analysis: Greenplum 7 adds enhanced JSON and array manipulation functions for semi-structured data, alongside its existing XML document support. Full text search and text-based lexical search indices allow for efficient storage, indexing, and searching of text. Furthermore, vector embeddings provide a condensed and efficient representation of unstructured data, enabling similarity search for matching documents, images, and videos, including search across multiple languages.

PostgreSQL extension ecosystem: A wide range of PostgreSQL extensions is supported, including advanced password check, fuzzy string matching, HyperLogLog, ip4r for network data, isn for media data, nanosecond timestamps, sparse vector, tablefunc for pivoting, UUID for unique identifiers, and pgvector for AI vector embeddings.

Advanced resource management: Greenplum 7 introduces a set of resource management features designed to maintain robust performance under heavy loads.

VMware vSphere deployment model: Greenplum 7 can be deployed on bare metal or public cloud environments based on reference architectures. With version 7, Greenplum offers an automated deployment model that integrates seamlessly into the vSphere private cloud environment.

Multi-data center disaster recovery solution: As part of the multi-data center disaster recovery solution, data is replicated via transaction log archiving. This enables more efficient disaster recovery, with a lower recovery point objective (RPO) and recovery time objective (RTO) than in previous versions of Greenplum.

Benefits of VMware Greenplum  

The many benefits VMware Greenplum brings to the enterprise can be broken into four key areas: flexibility, speed and scale, productivity, and resilience.

Flexibility

Infrastructure versatility: VMware Greenplum offers remarkable flexibility in deployment, making it compatible with various infrastructure types. It is optimized for bare metal, public cloud, and vSphere-based private cloud environments. This means organizations can choose the infrastructure that best suits their needs without sacrificing performance or efficiency. 

Dedicated optimizations: Greenplum provides dedicated reference architectures for each of these environments, helping it integrate cleanly into different infrastructure setups and reducing deployment complexity.

Speed and scale

In-database analytics: Greenplum’s in-database analytics capabilities significantly accelerate the time-to-insight. This feature means data analysts and scientists can perform complex analytics directly within the database, eliminating the need for time-consuming data transfers. 

Petabyte-scale data handling: Greenplum is built to handle massive volumes of data, even at the petabyte level. This ensures that organizations can efficiently analyze and manage vast datasets, unlocking insights from their largest data repositories. 

Productivity

Data variety: Greenplum excels in managing diverse data types on a single platform. It seamlessly handles structured, semi-structured, and unstructured data, including text, images, videos, vectors, geospatial information, graphs, and voice data. This versatility enables organizations to consolidate their data sources, making it easier to analyze data from wherever it’s stored. 

Data accessibility: Greenplum’s capability to process and analyze data in various formats and from different sources increases productivity by reducing the time and effort required to pre-process and integrate data from multiple origins. 

Resilience

Proven foundation: Greenplum is built on the foundation of open source PostgreSQL, a time-tested and proven database platform. This improves reliability and stability for mission-critical applications and data workloads. 

Enhanced security: Greenplum incorporates enhanced security features, helping organizations safeguard their data. This includes authentication mechanisms, encryption options, and access control. 

Enterprise support: Greenplum offers robust enterprise-level support, giving organizations access to the assistance they need to manage and optimize their data platform. 

Disaster recovery: With features like remote disaster recovery, Greenplum provides mechanisms for data backup and recovery, minimizing downtime and data loss in the event of a disaster. 

With the introduction of this new version, VMware Greenplum is not just a platform; it’s a catalyst for transformation. It empowers enterprises to leverage their data assets to their full potential, driving operational efficiency, accelerating decision-making processes, and, ultimately, achieving excellence in customer responsiveness. As data continues to shape the future of business, Greenplum stands as a beacon of innovation, guiding enterprises on their journey from BI to AI and beyond. Embrace the power of unified data analytics and AI with Greenplum, and propel your enterprise into a future where data is the ultimate competitive advantage. 

Ready to give it a try? Get started today with VMware Greenplum 7. 

See how Greenplum 7 on Samsung’s Gen-5 NVMe drives establishes a new reference architecture that can have far-reaching implications for the future of big data, analytics, and data warehousing. 
