How Mastercard fights fraud with Apache Geode

December 20, 2019 Jagdish Mirani

With almost 200 million cardholders, Mastercard cares a lot about credit card fraud. 

Mastercard’s customer base is growing, and cardholders are using payment cards for more and more purchases. In fact, a recent survey found that 80% of respondents chose credit or debit cards as their primary method of payment. The same survey revealed that more people are opting to make credit cards their sole method of payment.

What’s at stake, given all this? Billions of dollars. Last year, the industry lost $24 billion due to payment card fraud worldwide. Fraud prevention is a top priority.

So how does Mastercard go about minimizing the risk from fraud? Andrey Zolotov, Lead Software Engineer at Mastercard, and Gideon Low, Principal Data Transformation Architect at Pivotal, spoke on this very topic at our recent SpringOne Platform 2019 event.

When you can’t compromise on performance and volume, use Apache Geode

Decision Management Platform (DMP) is at the center of Mastercard’s efforts to combat fraud and validate cardholder identity. Every swipe or tap with a Mastercard goes through DMP. The platform either approves the transaction or blocks it with an alert. DMP’s multipurpose transaction-processing engine is part of a broader framework that’s used by over 20 Mastercard products. 

However, scaling a mission-critical system like this can be a challenge. Mastercard required both large data volumes and fast response times. And when you can’t trade off volume for performance (and vice versa), there are bound to be growing pains.

The company was finally able to have it both ways with Apache Geode’s in-memory technology. Further, Mastercard engineers used design patterns to scale the data volume and transaction throughput of data-hungry applications without increasing latency.

Designing Geode for terabyte-scale

The breadth of a decisioning platform like DMP requires terabytes of shareable data. The situation gets even more complex because the system must handle hundreds of time-aware, aggregated, or computed variables. This additional data is used for running risk models and hundreds of decision rules. Mastercard estimates that they have over 30 billion time-aware aggregates!

For handling this scale of data, Zolotov and Low offer a number of considerations and best practices.

Vertical vs. horizontal scaling of nodes 

Mastercard has experimented with both approaches and found each useful. Vertical scaling, with a large heap per node, works well when paired with a pauseless JVM garbage collector. Horizontal scaling benefits from performance optimizations related to parallelism.

Data access scalability through co-location 

Mastercard co-locates related data whenever possible. For example, all the data related to a customer’s account lives on the same node. As such, a request for a customer’s data only hits one node. This simple design choice boosted read performance to 8 million reads per second. Another benefit: Mastercard could scale the cluster horizontally without increasing the number of network calls per request.
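In Geode, this kind of routing is typically expressed through a custom partition resolver so that entries sharing a routing key land in the same bucket. The plain-Python sketch below (hypothetical names and key format, not Geode’s actual API) illustrates the idea: hash only the account-id prefix of each key, so every entry for an account resolves to one partition.

```python
import hashlib

NUM_PARTITIONS = 113  # hypothetical bucket count

def partition_for(routing_key: str) -> int:
    """Map a routing key to a partition via a stable hash."""
    digest = hashlib.sha256(routing_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def colocated_partition(entry_key: str) -> int:
    """Route on the account-id prefix only, so every entry for an
    account (profile, aggregates, recent activity) lands on the
    same partition, and one node can serve the whole request."""
    account_id = entry_key.split(":", 1)[0]
    return partition_for(account_id)

# All entries for account 1234 resolve to the same partition:
keys = ["1234:profile", "1234:agg:7d", "1234:txn:latest"]
assert len({colocated_partition(k) for k in keys}) == 1
```

The payoff is that a fraud decision touching a dozen pieces of account state still makes a single network hop.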

Balanced data distribution 

The key (as in key-value store) used for data partitioning should result in an even distribution. Why? Simple: an unbalanced workload impedes scalability. Zolotov’s team spent considerable time choosing a partitioning key that spreads the workload evenly across the nodes in the cluster.
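One way to vet a candidate partitioning key before committing to it is to measure its skew: how much load the busiest partition carries relative to a perfectly even share. This sketch (plain Python, hypothetical key choices, not any Mastercard or Geode code) shows why a high-cardinality key balances well while a low-cardinality one creates hot spots:

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 113  # hypothetical bucket count

def partition_for(key: str) -> int:
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def skew(keys) -> float:
    """Ratio of the busiest partition to the ideal even share:
    1.0 is perfectly balanced; large values mean hot spots."""
    counts = Counter(partition_for(k) for k in keys)
    ideal = len(keys) / NUM_PARTITIONS
    return max(counts.values()) / ideal

# A high-cardinality key (an account id) spreads load almost evenly ...
accounts = [f"acct-{i}" for i in range(100_000)]
assert skew(accounts) < 1.5

# ... while a low-cardinality key (say, merchant country) piles the
# workload onto a couple of partitions and defeats scale-out.
countries = ["US"] * 90_000 + ["GB"] * 10_000
assert skew(countries) > 50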

Use byte array storage

Using byte array storage reduces the complexity of the Java object graph; it also happens to be Geode’s default storage model. As it turns out, Geode is an amazingly efficient byte array manager. Because a byte array is a single flat object rather than a web of references, the Java heap handles it cheaply and the garbage collector has far less to trace, so byte arrays scale more easily and lighten the garbage collection workload.
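The pattern is simply: serialize on write, deserialize on read, and keep only flat bytes in the store in between. This toy region (plain Python with JSON as a stand-in for Geode’s binary serialization; all names are hypothetical) illustrates the model:

```python
import json

class ByteArrayRegion:
    """Toy key-value region that stores values as flat byte arrays
    (mirroring Geode's default storage model) instead of live object
    graphs, so the heap holds one array per entry rather than a web
    of references the garbage collector must trace."""

    def __init__(self):
        self._store = {}  # str -> bytes

    def put(self, key: str, value: dict) -> None:
        self._store[key] = json.dumps(value).encode()  # serialize on write

    def get(self, key: str) -> dict:
        return json.loads(self._store[key])  # deserialize on read

region = ByteArrayRegion()
region.put("1234:profile", {"name": "A. Cardholder", "risk": 0.02})
assert isinstance(region._store["1234:profile"], bytes)  # flat bytes at rest
assert region.get("1234:profile")["risk"] == 0.02        # objects on demand
```

The trade is a small deserialization cost on each read in exchange for a heap the collector can sweep quickly.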

With these concepts in mind, Mastercard was able to grow its Geode cluster to 40 terabytes with sustained performance. How did it sidestep any hits to latency? Read on.

Optimizing Geode performance at scale

Mastercard aims to keep the average transaction latency to 50 milliseconds. This benchmark gives them some headroom in case any unexpected slowdowns occur. They wanted to meet this latency target at a throughput of 60,000 transactions per second. Achieving this level of performance requires sub-millisecond reads at several million reads per second.

Mastercard took full advantage of Geode’s built-in performance optimizations to achieve this. As the volume of data processed by DMP grew, the number of requests for data increased commensurately. Client requests would often pull large volumes of data over the network, an unwelcome performance bottleneck. Also, deserializing many entries at high transactions per second (TPS) can consume significant CPU cycles.

So what did the company do specifically? Once again, Andrey and Gideon give us four areas of interest:

Aggregates

DMP uses tens of billions of aggregates that are updated and consumed in real time for each transaction. Yes, precomputing them adds to the data volume. But it pays off big-time in performance: fraud rules read a ready-made value instead of recomputing it from raw transaction history.

Move the code to the data

Geode’s functions offload logic to the cluster itself. They parallelize execution, reduce the load on clients, and minimize network utilization. Because Geode’s functions are data aware, they are invoked only on the partitions that hold data needed for the request.
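In Geode, server-side functions are registered and invoked through the function execution service; the toy model below (plain Python, hypothetical names, not Geode’s actual API) captures the essential win: the function runs next to the data, and only a small result crosses the network instead of every entry.

```python
class Node:
    """Toy cluster node holding one partition's entries."""

    def __init__(self, entries: dict):
        self.entries = entries

    def execute(self, fn, keys):
        """Run the function next to the data and ship back only the
        small result, instead of sending every entry to the client."""
        selected = {k: self.entries[k] for k in keys if k in self.entries}
        return fn(selected)

def risk_sum(entries: dict) -> float:
    """Example server-side computation: total the amounts in scope."""
    return sum(v["amount"] for v in entries.values())

node = Node({
    "1234:txn:1": {"amount": 25.0},
    "1234:txn:2": {"amount": 40.0},
    "9999:txn:1": {"amount": 5.0},
})

# Only a single float crosses the network, not three full entries.
result = node.execute(risk_sum, ["1234:txn:1", "1234:txn:2"])
assert result == 65.0
```

Combined with the co-location pattern above, a data-aware function for one account runs on exactly one node.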

“[I]mplementing context-aware arguments resulted in 95% reduction in network traffic between clients and server nodes, along with 50% latency reduction and 40% CPU reduction on the Geode servers. Delta propagation resulted in an additional 50% reduction in peer-to-peer network traffic . . .” 

- Andrey Zolotov, Lead Software Engineer, Mastercard

In-memory contention management

The rate at which Mastercard updates data in DMP raises questions about contention and consistency. Contention occurs when multiple transactions try to update the same data at the same time. In this case, the transactions are serialized and completed atomically. Consistency has to do with updating all copies or related data simultaneously, so that all apps and users see the latest value of the data.

Mastercard’s challenge was to manage contention and maintain data consistency while processing fast, atomic, delta updates. Sure, Geode can store complex objects, but updates to object attributes can also be processed as delta updates. This way, you don’t impact (i.e., lock) access to other attributes that are not being updated.

Additionally, each transaction can involve hundreds of reads and writes. Efficient contention management allows Geode to atomically apply small changes to large entries hundreds of thousands of times per second. 
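Geode models this with its delta propagation mechanism, where an entry describes and applies only what changed rather than shipping the whole object. The toy region below (plain Python with per-key locks; hypothetical names, not Geode’s `Delta` interface) sketches the behavior being described: concurrent transactions serialize on the entry they touch, small changes apply atomically, and no update is lost.

```python
import threading

class DeltaRegion:
    """Toy region applying small delta updates atomically: two
    concurrent transactions touching the same large entry serialize
    on that entry's lock, so neither update is lost."""

    def __init__(self):
        self._entries = {}  # key -> dict of attributes
        self._locks = {}    # key -> per-entry lock

    def _lock_for(self, key: str) -> threading.Lock:
        return self._locks.setdefault(key, threading.Lock())

    def put(self, key: str, value: dict) -> None:
        self._entries[key] = value

    def apply_delta(self, key: str, field: str, amount: float) -> None:
        """Update one attribute in place; the rest of the (possibly
        large) entry is never copied or re-sent."""
        with self._lock_for(key):
            self._entries[key][field] += amount

    def get(self, key: str) -> dict:
        return self._entries[key]

region = DeltaRegion()
region.put("1234:agg", {"spend_1h": 0.0, "txn_count": 0})

def worker():
    for _ in range(1000):
        region.apply_delta("1234:agg", "spend_1h", 1.0)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All 4,000 concurrent delta updates landed; none were lost.
assert region.get("1234:agg")["spend_1h"] == 4000.0
```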

When performance and scalability matter, enterprises choose Geode

Of course, there are lots of nuances to the approaches above. These topics (and more) are covered in more detail by Andrey and Gideon in their SpringOne Platform 2019 session. 

Want to learn more? Watch this Geode Summit session on Geode performance benchmarking. In fact, the full set of presentations from all previous Geode Summits, and a whole lot more, can be found here.

You’ll get to know why Mastercard and so many others choose Geode for performance and scalability.

About the Author

Jagdish Mirani

Jagdish Mirani is an enterprise software executive with extensive experience in Product Management and Product Marketing. Currently he is in charge of Product Marketing for Pivotal's data services (Cloud Cache, MySQL, Redis, PostgreSQL). Prior to Pivotal, Jagdish was at Oracle for 10 years in their Data Warehousing and Business Intelligence groups. More recently, Jag was at AgilOne, a startup in the predictive marketing cloud space. Prior to AgilOne, Jag held various Business Intelligence roles at Business Objects (now part of SAP), Actuate (now part of OpenText), and NetSuite (now part of Oracle). Jagdish holds a B.S. in Electrical Engineering and Computer Science from Santa Clara University and an MBA from the U.C. Berkeley Haas School of Business.
