A Look Back at Apache Geode Summit 2019: Expanded Caching Adoption Fuels Record Attendance

October 29, 2019 Jagdish Mirani

Why does Apache Geode™ Summit continue to attract enterprise practitioners in record numbers?

According to the nearly 500 attendees, it’s because the Summit broadens their horizons on the many uses of Apache Geode.  This is consistent with the project’s monthly downloads, which have doubled over the last year!

New to the Geode scene? Engineers around the world use Apache Geode to provide a database-like consistency model, reliable transaction processing and a shared-nothing architecture. It’s a popular in-memory technology to use when you need to maintain very low latency performance with high concurrency processing.

At this year’s Summit, experts from the Geode community shared their use cases, best practices, and lessons learned. We’ve compiled this handy recap with videos, highlights and embedded links to help you get caught up.

Breaking Open Apache Geode: How It Works and Why

Dan Smith - Principal Software Engineer - Pivotal, @drossmith

New to Geode? Want a deeper understanding of the project’s design choices? Dan explains how the needs of a highly available, low-latency, distributed system map to the capabilities of Geode.

Watch the video, and learn how Geode:

  • stores data and maintains consistency

  • partitions data for low-latency access

  • maintains data redundancy for handling node failures

  • rebalances data across a cluster when nodes are added or removed
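To make the partitioning and redundancy ideas above a bit more concrete, here is a toy sketch in plain Java. It is not Geode code and does not reflect Geode's actual placement algorithm: the class BucketPartitioner, its methods, the bucket count of 113, and the round-robin placement are all assumptions made for illustration. The one idea it tries to show is that a key maps to a fixed bucket, while the buckets (plus their redundant copies) are what get placed on members and moved when the cluster changes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of bucket-based partitioning with redundant copies.
// Hypothetical names throughout; this is not how Geode places buckets.
class BucketPartitioner {
    private final int bucketCount;
    private final int redundantCopies;
    private final List<String> members = new ArrayList<>();

    BucketPartitioner(int bucketCount, int redundantCopies, List<String> members) {
        this.bucketCount = bucketCount;
        this.redundantCopies = redundantCopies;
        this.members.addAll(members);
    }

    // A key always maps to the same bucket, regardless of cluster size.
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // Primary plus redundant hosts for a bucket: consecutive members, round-robin.
    List<String> hostsFor(int bucket) {
        int copies = Math.min(redundantCopies + 1, members.size());
        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            hosts.add(members.get((bucket + i) % members.size()));
        }
        return hosts;
    }

    // Adding a member changes where buckets live (a very crude "rebalance"),
    // but never which bucket a key belongs to.
    void addMember(String member) {
        members.add(member);
    }

    public static void main(String[] args) {
        BucketPartitioner p =
                new BucketPartitioner(113, 1, List.of("server1", "server2", "server3"));
        int bucket = p.bucketFor("customer-42");
        System.out.println("bucket " + bucket + " hosted on " + p.hostsFor(bucket));
        p.addMember("server4");
        System.out.println("after adding server4: " + p.hostsFor(bucket));
    }
}
```

A real rebalance would try to move as few buckets as possible; the modulo placement here moves many of them, which is exactly the kind of detail Dan's talk digs into.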


Introducing the Geode Native Client

Blake Bender - Staff Software Engineer, Pivotal, @ekalbredneb

Charlie Black - Product Manager, Pivotal, @charliemblack

Did you know Geode supports languages other than Java? It’s true. Blake and Charlie cover how Geode’s Native Client helps C++ and C# apps gain access to Geode servers. Watch this video, and learn all about the major improvements to the Native Client, as it switched from C++98 to the more modern C++11. You’ll also enjoy the hands-on demonstration of how to set up and talk to Geode via the Native Client libraries.

Software developers always want to focus on their code. The Geode Native Client can help Java, C++, and C# developers do just that! So take a look at the Native Client docs. Then, download the latest Geode release (v1.10); it includes the Native Client. 

Next up for the Native Client contributors: Node.js support!


Performance in Geode: How Fast is it, How is it Measured, and How Can it be Improved?

Helena Bales - Software Engineer, Pivotal, @hb7825483

This session was highly engaging, with lots of questions from the audience. This stands to reason, since enterprises choose Geode when lightning-fast performance matters.

Helena describes how the community improved our collective approach to performance benchmarking. The lynchpin of the effort: running tests in the public cloud against every commit to Geode, as well as running tests on demand. 

Thanks to the community’s new benchmarking tool, we could clearly see where performance was lagging. This yielded a to-do list of fixes that promised to take Geode’s performance to the next level. Among other things, the community worked on complicated refactors to remove scaling bottlenecks, and simple code changes to shave milliseconds off popular API calls.

We used the Geode benchmark to compare the performance of Geode 1.9.0 and Geode 1.10.0. This graph tells the story: 

On average the performance more than doubled! Need a reason to upgrade? This is it.

Want to learn more about the benchmark? Check out the GitHub repository.


Using Apache Geode: Lessons Learned at Southwest Airlines

Brian Dunlap - Solutions Architect, Southwest Airlines, @brianwdunlap

Brian is a consummate architect. Over the years, he has amassed deep knowledge about Geode. And he’s always willing to share his insights! In fact, Brian has presented at numerous conferences over the years, including the inaugural Geode Summit in 2016.


What did Brian have in store for this year? His list of 21 essential tips. These highly accessible learnings run the gamut: the best use of Geode technology, how to effectively plan your project, the best way to organize your teams, and strategies for empowering your people. He touches on cloud economics, infrastructure considerations, and a whole lot more! Not bad for a 30-minute presentation. Brian, you really should write a book on Geode. No pressure :-)


A Fireside Chat with Apache Geode Committers

Anthony Baker - Moderator and Engineering Director, Pivotal, @metatype

Panel time! The audience put several questions to project committers, a few of the good folks who write and push code to the Geode project.

Our favorite highlights:

  • There are several options for monitoring and managing Geode. The recent integration of Geode with Micrometer opens up access to a large number of monitoring tools (see next entry). Also, there is a new API in the works that simplifies configuration.

  • The amount of data that can be stored in a Geode cluster is dependent on several factors. To name a few: available memory and other system resources, quantity of data persisted to disk, data partitioning across the nodes, spare capacity needed for overhead, and much more. Rest assured, there are implementations that use thousands of nodes with very high volumes of data.

  • It’s easy to start contributing to Geode. Start with contributions to the documentation; your efforts will improve the community and get you familiar with the technology. From there, you will feel more comfortable contributing code, examples, and performance benchmarks. You should also subscribe to the Geode mailing lists.

  • The Geode community can help you make important technology choices.  Simply email a proposal to the Geode dev list, and feedback will flow back to you. Technology choices often come from Apache projects or from Spring, but are not limited to these sources.


Visualize Your Geode Metrics

Michael Oleske - Software Engineer, Pivotal, LinkedIn

Dale Emery - Principal Software Engineer, Pivotal, @dhemery

We’ve come a long way with how we capture Geode metrics. In the past, internal metrics were written to a local file in a proprietary format. Even worse, they were only viewable with a custom tool. The community told us “do better please!” Well, we listened!

Enter Micrometer. Operators can now route Geode metrics to a variety of external monitoring systems (such as Datadog, Dynatrace, New Relic, Wavefront, and Prometheus). As an abstraction for instrumentation, Micrometer allows you to instrument your code with dimensional metrics. An added bonus: you get a vendor-neutral interface, so you can decide on the monitoring system as a last step. Metrics can be viewed in real-time, and used to improve system performance. 

Michael and Dale explain how this works. It goes something like this: Geode measurements are captured in an internal meter registry. You can then choose a Micrometer meter registry tool that publishes to your favorite monitoring tool. Geode tracks health metrics such as memory usage, operation latency, and communication queue size.
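The flow described above (measurements land in one registry, which then forwards them to whichever backends you pick) can be sketched in a few lines of plain Java. To be clear, none of these types are Micrometer or Geode classes: ToyRegistry, InMemoryRegistry, CompositeRegistry, and the metric name "cache.gets" are hypothetical stand-ins, and the real libraries offer far more (timers, gauges, publishing intervals, and so on). The sketch only illustrates two ideas: dimensional metrics identified by a name plus tags, and instrumented code that stays unaware of which monitoring systems are attached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of a composite meter registry. Hypothetical names only;
// these are NOT Micrometer or Geode classes.
class MetricsSketch {

    interface ToyRegistry {
        void record(String name, Map<String, String> tags, double amount);
    }

    // Stands in for one monitoring backend; keeps a total per name + tag set.
    static class InMemoryRegistry implements ToyRegistry {
        final Map<String, Double> totals = new ConcurrentHashMap<>();

        @Override
        public void record(String name, Map<String, String> tags, double amount) {
            // Sort the tags so the same dimensions always identify the same series.
            String series = name + new TreeMap<>(tags);
            totals.merge(series, amount, Double::sum);
        }
    }

    // Instrumented code talks only to the composite; backends are attached later.
    static class CompositeRegistry implements ToyRegistry {
        private final List<ToyRegistry> downstream = new ArrayList<>();

        void add(ToyRegistry registry) {
            downstream.add(registry);
        }

        @Override
        public void record(String name, Map<String, String> tags, double amount) {
            for (ToyRegistry registry : downstream) {
                registry.record(name, tags, amount);
            }
        }
    }

    public static void main(String[] args) {
        CompositeRegistry composite = new CompositeRegistry();
        InMemoryRegistry backendA = new InMemoryRegistry();
        InMemoryRegistry backendB = new InMemoryRegistry();
        composite.add(backendA);
        composite.add(backendB);

        composite.record("cache.gets", Map.of("region", "orders", "result", "hit"), 1);
        composite.record("cache.gets", Map.of("region", "orders", "result", "miss"), 1);
        composite.record("cache.gets", Map.of("region", "orders", "result", "hit"), 1);

        System.out.println(backendA.totals);
        System.out.println(backendB.totals);
    }
}
```

Because the tags travel with every measurement, a backend can slice "cache.gets" by region or by result without the instrumented code knowing which slices will matter.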

Then, this duo demonstrates how to connect a Geode system to an external monitoring system. Take this guidance to heart; you can use this data to make better decisions about the health of your Geode deployment!

A couple of resources you’ll find useful are the wiki page on how to publish Geode metrics to external monitoring systems, and example code for adding a Micrometer registry to Geode.

Hungry for more background on Micrometer? Check out these two Micrometer sessions from this year’s SpringOne Platform:


Reactive Event Processing with Apache Geode

Bill Burcham - Software Engineer, Pivotal, @billburcham

Bill discusses Geode’s role in building reactive, non-blocking, event-driven applications. Importantly, he covers two scenarios: the role of Geode as a reactive consumer (i.e. a subscriber to events), and Geode as a reactive producer (a publisher of events).

He then demos a sample application that uses Project Reactor, a library for building non-blocking applications. A key topic Bill addresses: how Geode generates backpressure when the input stream exceeds Geode’s capacity. On the publishing side, Bill describes how Geode’s continuous query feature can be integrated with Project Reactor to publish only the data that has specific attributes defined by the consumer.
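If backpressure is new to you, the following sketch may help. It uses the JDK's own java.util.concurrent.Flow interfaces rather than Project Reactor or any Geode API, and the classes FilteredPublisher and SlowSubscriber are hypothetical. The publisher hands over only events that match a consumer-supplied predicate (a loose stand-in for a continuous query) and only as many as the subscriber has asked for. It is single-threaded and skips parts of the Reactive Streams rules (for example, rejecting non-positive requests), so treat it as an illustration, not a compliant implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.function.Predicate;

// Demand-driven delivery with the JDK Flow interfaces.
// Hypothetical classes; not Geode or Project Reactor code.
class BackpressureSketch {

    // Emits only matching events, and never more than the outstanding demand.
    static class FilteredPublisher implements Flow.Publisher<String> {
        private final List<String> events;
        private final Predicate<String> filter;

        FilteredPublisher(List<String> events, Predicate<String> filter) {
            this.events = events;
            this.filter = filter;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super String> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 0;
                private long demand = 0;
                private boolean draining = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    demand += n;
                    if (draining) return; // re-entrant call from onNext: just add demand
                    draining = true;
                    while (!done && demand > 0 && next < events.size()) {
                        String event = events.get(next++);
                        if (filter.test(event)) {
                            demand--;
                            subscriber.onNext(event);
                        }
                    }
                    if (!done && next >= events.size()) {
                        done = true;
                        subscriber.onComplete();
                    }
                    draining = false;
                }

                @Override
                public void cancel() {
                    done = true;
                }
            });
        }
    }

    // A consumer that signals its capacity explicitly; nothing arrives otherwise.
    static class SlowSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new ArrayList<>();
        boolean completed = false;
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
        @Override public void onNext(String item) { received.add(item); }
        @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
        @Override public void onComplete() { completed = true; }

        void pull(long n) { subscription.request(n); }
    }

    public static void main(String[] args) {
        List<String> events = List.of("order:1", "trade:1", "order:2", "order:3", "trade:2");
        FilteredPublisher publisher =
                new FilteredPublisher(events, e -> e.startsWith("order:"));
        SlowSubscriber subscriber = new SlowSubscriber();
        publisher.subscribe(subscriber);

        System.out.println(subscriber.received); // nothing requested yet
        subscriber.pull(2);
        System.out.println(subscriber.received);
        subscriber.pull(2);
        System.out.println(subscriber.received + " completed=" + subscriber.completed);
    }
}
```

The point to notice is that the consumer sets the pace: until pull is called, the publisher holds on to its events instead of flooding the subscriber.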

Want to know how reactive APIs fit with Geode? Bill has you covered, complete with code samples and a demo.


Data Serialization and CI/CD Techniques for Apache Geode

Jeff Cherng - Advisory Solution Architect, Pivotal, LinkedIn

Jeff explains the challenges of data serialization. He reviews how different approaches to this problem directly affect both the performance of an Apache Geode cluster and its Day 2 operations. Make the wrong choice here and things get thorny, with far-reaching implications for day-to-day cluster management.

Jeff says there are three popular techniques offered by Geode:

  • Java Serialization

  • Geode PDX Serialization

  • Geode Data Serialization

He goes on to detail their respective performance, ease of use, and compatibility.
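To get a feel for why the choice matters, here is a small JDK-only sketch that contrasts built-in Java serialization with a hand-written, field-by-field encoding of the same object. It does not use Geode's PDX or Data Serialization APIs, and it says nothing about their actual wire formats; the Customer type and the helper methods are hypothetical. It only shows the general trade-off: writing fields yourself is more work, but you control exactly which bytes go over the wire.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Built-in Java serialization versus a hand-written encoding.
// Plain JDK only; Customer and the helpers are hypothetical.
class SerializationSketch {

    static class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String name;

        Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Java serialization: no extra code, but class metadata travels with the data.
    static byte[] javaSerialize(Customer c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(c);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hand-written encoding: only the field values are written, in a fixed order.
    static byte[] manualSerialize(Customer c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeLong(c.id);
                out.writeUTF(c.name);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The reader must consume fields in the same order they were written.
    static Customer manualDeserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new Customer(in.readLong(), in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Customer c = new Customer(42L, "Ada");
        System.out.println("Java serialization: " + javaSerialize(c).length + " bytes");
        System.out.println("Hand-written:       " + manualSerialize(c).length + " bytes");
        System.out.println("Round trip name:    " + manualDeserialize(manualSerialize(c)).name);
    }
}
```

The flip side of the compact encoding is compatibility: reader and writer have to agree on field order, so adding a field means coordinating a change on every node, which is where Jeff's CI/CD discussion comes in.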

Your choice of serialization technique will impact the style of server-side components deployed to a cluster. Further, the selected data serialization technique impacts a CI/CD tool’s ability to make changes to every node. 

From there, Jeff shared very specific advice on the best way to get started in his “serialization magic” and “cloud ready recipes” discussions.

Finally, Jeff showcased Pivotal Cloud Cache, Pivotal’s commercial caching product based on Apache Geode, as an example of a specific configuration that improves developer experience, cluster performance, and Day 2 operations. 


High-Performance Data Processing with Spring Cloud Data Flow and Geode

Cahlen Humphreys - Cofounder, Enfuse.io, @cahlenhu

Tiffany Chang - Engineering Anchor, Enfuse.io, LinkedIn

Enfuse.io helps companies adopt streaming data techniques over traditional ETL processes. To best serve their clients, they needed a data grid layer. In this talk, Cahlen and Tiffany explain why they chose Geode. (Spoiler alert: It was because of Geode’s scalability, low-latency response, and support for event-driven applications.)

The presenters also described three useful things:

  • How Geode supports event processing, while minimizing additional integration infrastructure toil 

  • How to embed batch applications into a streaming data pipeline 

  • How you can solve data latency issues related to a legacy API


Scaling Beyond a Billion Transactions Per Day with Sub-Second Responses

Andrey Zolotov - Lead Software Engineer, Mastercard, @zdre

Gideon Low - Principal Data Transformation Architect, Pivotal, LinkedIn

Andrey and Gideon go through the journey of Mastercard’s transition to distributed data processing using Pivotal GemFire (our commercial product based on Apache Geode) and the challenges faced along the way. This transition was critical for their Decision Management Platform. And the stakes couldn’t be higher: The platform helps the company combat fraud and validate cardholder identity. The patented, Java-based platform has grown over the years to process over a billion transactions a day (you read that right - one billion!), while meeting strict SLAs. To make accurate decisions, the platform keeps historical and real-time aggregates for billions of parameters. 

In addition to describing the role of Geode in the Decision Management Platform, Andrey and Gideon cover hot technical topics like the use of server-side functions to enable scalable real-time compute, the pros and cons of delta propagation, and how to avoid update slowdowns due to replication latency. This is a must watch for enterprise practitioners! 
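For readers who haven't met delta propagation before, here is a minimal sketch of the idea in plain Java: after an update, send only the changed field instead of the whole object. This is not Geode code and not Mastercard's implementation. The Account class and its hasDelta, toDelta, and fromDelta methods are hypothetical names chosen for illustration, and the byte layout is made up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy illustration of delta propagation: ship only what changed.
// Hypothetical class and byte layout; not Geode's implementation.
class DeltaSketch {

    static class Account {
        final String owner;
        long balanceCents;
        private boolean balanceChanged = false;

        Account(String owner, long balanceCents) {
            this.owner = owner;
            this.balanceCents = balanceCents;
        }

        void deposit(long cents) {
            balanceCents += cents;
            balanceChanged = true;
        }

        boolean hasDelta() {
            return balanceChanged;
        }

        // Full-object encoding: what would travel without delta propagation.
        byte[] toFull() {
            byte[] name = owner.getBytes(StandardCharsets.UTF_8);
            return ByteBuffer.allocate(Integer.BYTES + name.length + Long.BYTES)
                    .putInt(name.length).put(name).putLong(balanceCents).array();
        }

        // Delta encoding: only the field that changed since the last send.
        byte[] toDelta() {
            balanceChanged = false;
            return ByteBuffer.allocate(Long.BYTES).putLong(balanceCents).array();
        }

        // Applied on a replica that already holds the unchanged fields.
        void fromDelta(byte[] delta) {
            balanceCents = ByteBuffer.wrap(delta).getLong();
        }
    }

    public static void main(String[] args) {
        Account primary = new Account("Ada Lovelace", 10_000);
        Account replica = new Account("Ada Lovelace", 10_000);

        primary.deposit(2_500);
        System.out.println("full object: " + primary.toFull().length + " bytes");
        if (primary.hasDelta()) {
            byte[] delta = primary.toDelta();
            System.out.println("delta:       " + delta.length + " bytes");
            replica.fromDelta(delta);
        }
        System.out.println("replica balance: " + replica.balanceCents);
    }
}
```

The saving grows with object size, but the sketch also hints at a cost: a delta is only meaningful to a receiver that already holds a consistent copy of the object, so anything that lacks one needs the full value instead.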


Scalable, Cloud-Native Data Applications by Example

John Blum - Principal Software Engineer, Pivotal, @john_blum

Luke Shannon - President and Founder - Phlyt, @lukewshannon

John and Luke describe how Geode can be used across various cloud-native data access patterns, such as caching (look-aside, in-line, multi-site), distributed compute, event stream processing, search, and system of record.
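Of the patterns listed above, look-aside caching is probably the easiest to picture in code. The sketch below is a plain Java illustration, not the presenters' demo: a ConcurrentHashMap stands in for the cache (where a real app might use a Geode region), a Function stands in for the slower system of record, and the LookAsideCache class is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Look-aside caching: the application checks the cache first and only
// goes to the system of record on a miss. Hypothetical names throughout.
class LookAsideSketch {

    static class LookAsideCache<K, V> {
        private final Map<K, V> cache = new ConcurrentHashMap<>();
        private final Function<K, V> systemOfRecord;

        LookAsideCache(Function<K, V> systemOfRecord) {
            this.systemOfRecord = systemOfRecord;
        }

        V get(K key) {
            V cached = cache.get(key);
            if (cached != null) {
                return cached; // cache hit: no trip to the backing store
            }
            V loaded = systemOfRecord.apply(key); // cache miss: load...
            if (loaded != null) {
                cache.put(key, loaded); // ...and remember for next time
            }
            return loaded;
        }

        // Call when the underlying record changes, so stale data is not served.
        void evict(K key) {
            cache.remove(key);
        }
    }

    public static void main(String[] args) {
        AtomicInteger databaseCalls = new AtomicInteger();
        LookAsideCache<Integer, String> customers = new LookAsideCache<>(id -> {
            databaseCalls.incrementAndGet(); // pretend this is a slow query
            return "customer-" + id;
        });

        customers.get(7);
        customers.get(7);
        System.out.println("database calls after two reads: " + databaseCalls.get());

        customers.evict(7);
        customers.get(7);
        System.out.println("database calls after evict + read: " + databaseCalls.get());
    }
}
```

In-line caching moves that load-on-miss logic out of the application and into the cache itself, which is one of the distinctions John and Luke walk through.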

The presenters then review a common path: moving from open source Geode to a commercial product, Pivotal Cloud Cache (a managed service on Pivotal Platform, based on Apache Geode) with little to no code or configuration changes. The move back to Geode, for local testing, is just as easy.

Developers, you’re in luck - this is a live coding session! John and Luke showcase a simple Spring Boot app and show how it can be enhanced with Geode. They also cover a new feature of start.spring.io, which allows you to easily and programmatically access Geode’s key features from your Spring Boot app.


Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment and Apache Geode

James Bedenbaugh - Advisory Data Solutions Architect, Pivotal, LinkedIn

Zachary Hansen - Data Transformation Solutions Architect, Pivotal, @zhansen15

You’ve reached a limit with your monolithic applications. Growing the scalability, availability, and performance is increasingly difficult. Now what do you do? 

Jim and Zach discuss a common remedy: Use Geode to move your application to a cloud-native model. Rewriting, refactoring, or migrating these legacy apps can be a daunting task. Two factors must be overcome: the complexities of the applications and data dependencies, as well as the challenges associated with time to market and budget.

Jim and Zach have found success with a focus on bounded domain contexts. This approach simplifies and streamlines the overall process. They also illustrate a few simple data movement patterns that can assist the effort.

See you in Seattle!

We’re already planning the next SpringOne Platform in Seattle, Sept 21-24, 2020. Stay tuned for details on Geode Summit 2020!

For posterity, it seems like a good idea to share a link to all the talks from our four Geode Summits. Here’s a treasure trove of videos covering a vast array of topics!

About the Author

Jagdish Mirani

Jagdish Mirani is an enterprise software executive with extensive experience in Product Management and Product Marketing. Currently he is in charge of Product Marketing for Pivotal's data services (Cloud Cache, MySQL, Redis, PostgreSQL). Prior to Pivotal, Jagdish was at Oracle for 10 years in their Data Warehousing and Business Intelligence groups. More recently, Jag was at AgilOne, a startup in the predictive marketing cloud space. Prior to AgilOne, Jag held various Business Intelligence roles at Business Objects (now part of SAP), Actuate (now part of OpenText), and NetSuite (now part of Oracle). Jagdish holds a B.S. in Electrical Engineering and Computer Science from Santa Clara University and an MBA from the U.C. Berkeley Haas School of Business.
