Cache Rules Everything Around Me: When to Use Redis and When to Use Pivotal Cloud Cache

June 21, 2018 Jagdish Mirani

One truth of product-market fit: usability and functionality are design tradeoffs. This is why maturing markets often evolve towards two centers of gravity: one focused on ease of use, the other on breadth and depth of functionality. Want a few examples? Camera phones are excellent for daily use. But most professional photographers use a DSLR. QuickBooks is an excellent lightweight accounting tool for smaller businesses. But a FORTUNE 100 company will prefer a much more functional product.

This is true for the caching market. Redis is favored as the “easy to use” choice, which is why it’s the third most popular NoSQL engine and the #1 in-memory key-value store. On the other hand, Redis was conceived as a simple, single-node cache; horizontal scalability and automated failover were not part of its original design.

To be fair, some Redis products have started to address enterprise scenarios. But when compared to a full-featured enterprise product like Pivotal Cloud Cache (PCC), the gaps are still evident.

Of course, when you have thousands of apps in an enterprise, you’re going to use both products frequently. To this end, Pivotal’s offerings meet the requirements of both centers of gravity: Redis for ease-of-use, and PCC for covering high-end enterprise features.

Let’s take a deep dive into the use cases for Redis and those for PCC.

When to Use Redis

Many Redis use cases are supported by the pre-packaged data structures that Redis provides. In fact, Redis is often referred to as a data structure store. Each Redis data structure has its own set of commands, which makes applying these data structures to business use cases direct and easy. Redis exposes these commands through its API, simplifying the code needed to store, access, and use data generated by applications. The table below shows several examples, mapping specific use cases to the relevant Redis data structures and commands (shown in parentheses).

| Use Case | Examples and Redis Data Structure Commands |
| --- | --- |
| Analytics | Map boolean information for a huge domain into a compact representation and use this for analytic queries. Ex: finding all members of a population with a specific attribute, such as all university students who are currently taking a course. (BITMAPS) |
| Leaderboards | Sets sorted by score, used in games to increase the level of competition among players by ranking them in a variety of ways, with the aim of generating more gameplay. (LISTS and SORTED SETS with ZADD, ZREVRANGE, ZRANK, LPUSH, LTRIM) |
| Latest items list | Social media applications (and increasingly, various types of web applications) show a list of the latest items on your home page. (LISTS with LPUSH, LTRIM) |
| Maintaining counters | Popular for maintaining visitor counts and up-votes, and for limiting certain activities (e.g. login attempts in a 5-minute window); such counters are used extensively in applications. Redis updates counters atomically. (INCR, INCRBY, GETSET) |
| Unique N items | Counting unique visitors to a website. (SETS with SADD, HYPERLOGLOG) |
| Messaging and task queues | Any asynchronous workflows based on queues. (LIST and SET) |
| Pub/Sub | Clients can publish or subscribe to messages by channel. (SUBSCRIBE, UNSUBSCRIBE, PUBLISH) |
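To make the mapping concrete, here is a minimal Python sketch of three rows from the table above: counters, leaderboards, and latest-items lists. It assumes the redis-py client library; `client` stands for a `redis.Redis(decode_responses=True)` connection, and the function names, key names, and limits are made up for illustration.

```python
# Sketch only: `client` is assumed to be a redis-py connection, e.g.
#   import redis; client = redis.Redis(decode_responses=True)
# All key names and default limits below are hypothetical.

def record_login_attempt(client, user_id, limit=5, window_seconds=300):
    """Count a login attempt; return True while the user is within the limit."""
    key = f"login_attempts:{user_id}"
    attempts = client.incr(key)              # INCR is atomic on the server
    if attempts == 1:
        # First attempt in this window: start the expiry clock.
        client.expire(key, window_seconds)
    return attempts <= limit


def set_score(client, board, player, score):
    """Record a player's score in a sorted set (ZADD)."""
    client.zadd(board, {player: score})


def top_players(client, board, n=10):
    """Return the n highest-scoring (player, score) pairs (ZREVRANGE)."""
    return client.zrevrange(board, 0, n - 1, withscores=True)


def push_latest(client, key, item, keep=100):
    """Prepend an item and cap the list at `keep` entries (LPUSH + LTRIM)."""
    client.lpush(key, item)
    client.ltrim(key, 0, keep - 1)


def latest(client, key, n=10):
    """Return the n most recent items (LRANGE)."""
    return client.lrange(key, 0, n - 1)
```

One caveat about this sketch: INCR and EXPIRE are sent as two separate commands, so a client that fails between them would leave a counter without an expiry. A production version would likely group them in a transaction or a server-side script.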

Redis is also an appropriate choice for pure caching use cases, with relatively few writes and where the unavailability of cached data does not negatively impact the availability of the overall application (i.e. the data can be read from another system). Other Redis use cases include distributed locks, user session store, cookie storage, search engines, ad targeting, forums, geo searches, and configuration management.
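The "pure caching" case described above is often implemented as a cache-aside read. The following is a minimal sketch under stated assumptions: `client` is a redis-py connection (only `get` and `set(..., ex=...)` are used), `load_from_source` is a hypothetical callable that reads from the system of record, and values are JSON-serializable. The point it illustrates is that a cache failure degrades to a slower read rather than an outage.

```python
import json


def get_with_cache(client, key, load_from_source, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the system of record."""
    try:
        cached = client.get(key)
    except Exception:
        # Sketch only: real code would catch the client's own error type
        # (for redis-py, redis.exceptions.RedisError) rather than Exception.
        cached = None

    if cached is not None:
        return json.loads(cached)

    # Cache miss or cache unavailable: read from the other system.
    value = load_from_source(key)

    try:
        # Best-effort write-back with a time-to-live in seconds.
        client.set(key, json.dumps(value), ex=ttl_seconds)
    except Exception:
        pass  # An unreachable cache must not fail the request.

    return value
```

Because every write to the cache is best-effort, this pattern suits read-heavy data that can tolerate being a little stale, which matches the "relatively few writes" condition above.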

Redis is also extensible: Redis Modules allow new data structures to be added, extending Redis to new use cases.

An added bonus: Redis has a wonderful community that’s a real asset for enterprise developers.

When to Use Pivotal Cloud Cache

PCC supports pure caching use cases. But it’s also a true data store. PCC has a large feature set including strong consistency within a cluster, support for multiple data centers, reliable event delivery, and a strong security model. If you think about it, you can consider PCC a framework for building high performance, stateful, scale-out systems.

Many PCC use cases are driven by the need for strong consistency. When does strong consistency matter? When your systems are “update oriented” rather than “insert oriented.” Without strong consistency in the face of updates, conflicting updates could be lost forever.

For these use cases, it is important that all updates are replicated synchronously to all backup members for the updated entry. That way even in the event of a network segmentation you will always get the most current and correct value for that updated entry. The only time you will be unable to access the most current and correct value is if there is a network segmentation that causes all servers hosting that entry to become unreachable.
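A toy model can show why synchronous replication matters for updates. The sketch below is not PCC or Redis code; the class and function names are hypothetical, and the "cluster" is just a few Python dictionaries. It contrasts acknowledging a write only after every backup has applied it with acknowledging immediately and replicating later.

```python
class ReplicatedEntryStore:
    """Toy store with one primary copy and a number of backup copies."""

    def __init__(self, backup_count, synchronous):
        self.primary = {}
        self.backups = [{} for _ in range(backup_count)]
        self.synchronous = synchronous
        self.backlog = []  # writes acknowledged but not yet on the backups

    def put(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Acknowledge only after every backup has applied the update.
            for backup in self.backups:
                backup[key] = value
        else:
            # Acknowledge immediately; the backups catch up later.
            self.backlog.append((key, value))

    def replicate_backlog(self):
        """Apply queued writes to the backups (asynchronous mode only)."""
        for key, value in self.backlog:
            for backup in self.backups:
                backup[key] = value
        self.backlog.clear()

    def fail_over(self):
        """Lose the primary and promote the first backup in its place."""
        self.primary = self.backups.pop(0)
        self.backlog.clear()  # unreplicated writes are lost with the primary

    def get(self, key):
        return self.primary.get(key)


def balance_after_failover(synchronous):
    store = ReplicatedEntryStore(backup_count=1, synchronous=synchronous)
    store.put("balance", 100)
    store.replicate_backlog()  # the first write reaches the backup
    store.put("balance", 70)   # an update, already acknowledged to the caller
    store.fail_over()          # the primary fails before any further catch-up
    return store.get("balance")
```

In this model, `balance_after_failover(True)` returns 70, while `balance_after_failover(False)` returns 100: the acknowledged update to the balance is silently lost when replication is asynchronous.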

PCC inherits a number of nifty features from Pivotal Cloud Foundry (PCF).

PCC runs atop PCF, so it benefits from four levels of high availability. This combination is a big deal if your use case requires:

  • Extremely high throughput

  • Low-latency

  • Transaction processing

  • Event processing

  • Compute grids

Based on scores of customer calls, we’ve developed this useful table that shows vertical-specific use cases for PCC:

| Vertical | Scenario |
| --- | --- |
| Accounting / Finance / Banking | Balances need to be consistent after each transaction |
| Online e-commerce | Personalized customer interactions (e.g. offers that require access to recent and accurate customer information) |
| Manufacturing | IoT sensor management use cases require consistent, up-to-date information for keeping track of the status of things and responding to anomalous behavior |
| Trading | Analyzing and executing trades in milliseconds requires quick access to correct pricing and risk information |
| Regulatory reporting | Accuracy of reporting and speed are both critical |

Every vertical has additional areas that require strong consistency: billing, logistics, inventory, and risk management.

Where else might PCC be a fit? Consider these workloads:

  • High-volume transaction systems with a lot of update contention. These require extremely high-throughput, low-latency transaction processing. Examples of this kind of use case are reservation systems (e.g. India / China Railways, Southwest Airlines) and telco activation systems.

  • Event processing systems that ingest very large streams of data, then perform rapid calculations. Examples of these include systems that examine each incoming transaction for fraud detection or each sensor reading for detecting anomalies in IoT systems.

  • Compute Grid: when processing must be done close to the data because it is infeasible to move the data out of the store into a processing node. This includes things like portfolio valuation, bond pricing, and risk management. For these compute-intensive use cases PCC executes distributed computations in parallel across multiple nodes in a cluster.
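The compute grid idea in the last bullet (send the function to the data, then combine small partial results) can be sketched in a few lines of Python. This is a stand-in for the pattern only, not the PCC API: the partitions are plain lists held in one process, threads play the role of cluster nodes, and all names and figures are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def value_partition(positions, prices):
    """Runs "next to" one partition: value only the positions stored there."""
    return sum(quantity * prices[symbol] for symbol, quantity in positions)


def portfolio_value(partitions, prices):
    """Run the valuation on every partition in parallel and add the partials."""
    worker_count = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partials = pool.map(value_partition, partitions, [prices] * len(partitions))
        return sum(partials)


# Hypothetical data: each inner list is the slice of the portfolio that one
# node holds, as (symbol, quantity) pairs.
example_partitions = [
    [("AAA", 10), ("BBB", 5)],
    [("AAA", 2)],
    [("CCC", 1)],
]
example_prices = {"AAA": 100, "BBB": 20, "CCC": 7}
```

Only one number per partition travels back to the caller, which is the design choice that makes this approach attractive when the underlying data is too large to move.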

So PCC is full-featured. How does it compare with Redis on the usability front? The gap is closing! The PCC product team has simplified the initial setup of the product. We streamlined this workflow by making a few common assumptions about various configuration parameters.

Navigating the Trade-Off Between Availability and Consistency

The availability and consistency characteristics of Redis and PCC are dependent on whether the system is configured as a singleton, a clustered system, or multiple clusters in different locations. Pivotal Redis for PCF is a singleton. Helpfully, the Pivotal Services Marketplace offers Redis partner products that can be clustered, or configured as interconnected clusters via a WAN link. These partner products include Redis Labs and a9s Redis. PCC is a clustered system; a singleton version is in the works and will soon be available.

PCC was conceived as a clustered system. Redis, on the other hand, was conceived as a singleton first. Clustering was added later. As a bolt-on feature, clustering in Redis is not as functional as it is in PCC where it was integrated from the start.

This table summarizes the most salient availability and consistency characteristics of each product in each configuration.

| Configuration | Availability | Consistency and Data Loss |
| --- | --- | --- |
| Single Node (Singleton) | Redis: limited by single node, no failover. PCC: singleton soon to be available, no failover. | Redis: singletons are consistent by definition. PCC: singleton soon to be available; consistent by definition. |
| Cluster of Nodes | Redis: asynchronous leader-follower replication for backup (active-passive). PCC: synchronous replication for strong consistency and failover. | Redis: (1) replication of data to followers is asynchronous and can result in data loss; (2) network partitions (split brain) can lead to multi-master errors, with data merge and consistency challenges. PCC: (1) strong consistency, because replication of data is synchronous; (2) better split-brain handling ensures that only the quorum side accepts writes. |
| Multi-Site, Multiple WAN Connected Clusters | Redis and PCC: multi-site replication for disaster recovery and/or geographic locality (active-active or active-passive). | Redis: a WAN outage can lead to multi-master for writes. Redis provides support for conflict-free replicated data types (CRDTs) for conflict resolution. CRDTs are specialized data structures with unique characteristics for handling data update conflicts. PCC: built-in, timestamp-based conflict resolver, plus custom resolvers. It can also store CRDTs. In addition, general guidance is to adopt one of many published collision avoidance patterns. |
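The two conflict-handling approaches named for the multi-site case can be illustrated generically. The Python sketch below shows a timestamp-based ("last write wins") resolver and the merge rule of a grow-only counter, one of the simplest CRDTs. It does not reproduce either product's implementation; the type and function names are hypothetical, and it assumes timestamps are comparable across sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: object
    timestamp: float  # time of the write, assumed comparable across sites
    site_id: str      # tie-breaker when two timestamps are equal


def resolve_lww(local, remote):
    """Timestamp-based resolution: keep the later write, drop the other."""
    return max(local, remote, key=lambda v: (v.timestamp, v.site_id))


def merge_g_counter(a, b):
    """Grow-only counter CRDT: each site increments only its own slot, and a
    merge takes the per-site maximum, so no increment is ever lost."""
    return {site: max(a.get(site, 0), b.get(site, 0)) for site in a.keys() | b.keys()}


def g_counter_value(counter):
    """The counter's value is the sum of all per-site slots."""
    return sum(counter.values())
```

The difference in behavior is the design trade-off: the timestamp resolver always discards one of the two conflicting writes, while the counter merge keeps the effect of both but only works for data that can be expressed in such a mergeable form.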

 

You’re Going to Use Redis and PCC, So Know When to Use Which

Redis offers the fastest path to a simple cache. Use it when enterprise-grade features are not a requirement. PCC, on the other hand, limits the trade-offs imposed by the CAP theorem by providing partition tolerance, strong consistency, and a good measure of availability (via PCF). PCC’s unique ability to cover these bases was explored by my colleague, Mike Stolz, in a recent blog post.

As always, your choice should be driven by the unique requirements of each use case. Now, you’re equipped to ask the right questions and assess each product on its merits.

About the Author

Jagdish Mirani

Jagdish Mirani is an enterprise software executive with extensive experience in Product Management and Product Marketing. Currently he is in charge of Product Marketing for Pivotal's data services (Cloud Cache, MySQL, Redis, PostgreSQL). Prior to Pivotal, Jagdish was at Oracle for 10 years in their Data Warehousing and Business Intelligence groups. More recently, Jag was at AgilOne, a startup in the predictive marketing cloud space. Prior to AgilOne, Jag held various Business Intelligence roles at Business Objects (now part of SAP), Actuate (now part of OpenText), and NetSuite (now part of Oracle). Jagdish holds a B.S. in Electrical Engineering and Computer Science from Santa Clara University and an MBA from the U.C. Berkeley Haas School of Business.
