Greenplum for Kubernetes: A Demonstration of Managing Greenplum Database on Kubernetes
A Massively Parallel Processing Relational Database in the Cloud
Greenplum Database (GPDB) offers many features designed to support data scientists. Before a data scientist can use GPDB, however, a database administrator (DBA) must provision a cluster and install any required data science packages. Provisioning a GPDB cluster on bare metal requires a lengthy setup process, and scaling, recovering, and securing the cluster after deployment are also complex. Greenplum for Kubernetes (GP4K) abstracts away these complexities, simplifying and automating cluster management for users.
In this white paper, we introduce GP4K, which combines an opinionated deployment with a declarative manifest. We provide a brief overview of GP4K’s architecture and discuss its implementation. We also demonstrate the full life cycle of a managed cluster, from creation to retirement, including scale-up and self-healing, all with minimal DBA input.