One truth of product-market fit: usability and functionality are design tradeoffs. This is why maturing markets often evolve towards two centers of gravity: one focused on ease of use, the other on breadth and depth of functionality. Want a few examples? Camera phones are excellent for daily use, but most professional photographers use a DSLR. QuickBooks is an excellent lightweight accounting tool for smaller businesses, but a FORTUNE 100 company will prefer a much more functional product.
The same is true for the caching market. Redis is favored as the “easy to use” choice. This is why it’s the third most popular NoSQL engine, and the #1 in-memory key-value store. On the other hand, Redis was conceived as a simple cache, and horizontal scalability and automated failover were not part of its original design.
To be fair, some Redis products have started to address enterprise scenarios. But, when compared to a full-featured enterprise product like Pivotal Cloud Cache (PCC), the gaps are still evident.
Of course, when you have thousands of apps in an enterprise, you’re going to use both products frequently. To this end, Pivotal’s offerings meet the requirements of both centers of gravity: Redis for ease-of-use, and PCC for covering high-end enterprise features.
Let’s take a deep-dive into the use cases for Redis and those for PCC.
When to Use Redis
Many Redis use cases are supported by Redis's pre-packaged data structures. In fact, Redis is often referred to as a data structures store. Each Redis data structure has its own set of commands, which makes applying these data structures to business use cases direct and easy. Redis exposes these commands through its API, which simplifies the code needed to store, access, and use data generated by applications. The table below maps specific use cases to the relevant Redis data structures and commands, shown in parentheses.
| Use Case | Examples and Redis Data Structure Commands |
| --- | --- |
| Analytics | Map boolean information for a huge domain into a compact representation and use this for analytic queries. Ex: finding all members of a population with a specific attribute, such as all university students who are currently taking a course. (BITMAPS) |
| Leaderboards | Sets sorted by score, used in games to increase the level of competition among players by ranking them in a variety of ways, with the aim of generating more gameplay. (SORTED SETS) |
| Latest items list | Social media applications (and increasingly, various types of web applications) show a list of latest items in your home page. (LISTS with LPUSH, LTRIM) |
| Maintaining counters | Popular for maintaining visitor counts, up-votes, and limits on certain activities (login attempts in a 5 min window). Such counters are used extensively in applications. Redis updates counters atomically. (INCR, INCRBY, GETSET) |
| Unique N items | Counting unique visitors to a website. (SETS/SADD, HYPERLOGLOG) |
| Messaging and task queues | Any asynchronous workflows based on queues. (LIST and SET) |
| Pub/Sub | Clients can publish or subscribe to messages by channel. (SUBSCRIBE, UNSUBSCRIBE, PUBLISH) |
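As a rough illustration of how directly these commands map to use cases, the sketch below implements a login-attempt limiter (INCR, paired with the EXPIRE command to bound the window) and a latest-items list (LPUSH, LTRIM). It assumes a client object exposing redis-py-style methods (`incr`, `expire`, `lpush`, `ltrim`, `lrange`). The key names, limits, and the small in-memory `FakeClient` stand-in are hypothetical and exist only so the example runs without a server.

```python
class FakeClient:
    """Hypothetical in-memory stand-in for a Redis client (nothing really expires)."""

    def __init__(self):
        self.counters = {}
        self.lists = {}
        self.ttl = {}

    def incr(self, name, amount=1):
        self.counters[name] = self.counters.get(name, 0) + amount
        return self.counters[name]

    def expire(self, name, time):
        self.ttl[name] = time  # recorded only; the stand-in never evicts keys
        return True

    def lpush(self, name, *values):
        items = self.lists.setdefault(name, [])
        for value in values:
            items.insert(0, value)  # newest item goes to the head
        return len(items)

    def ltrim(self, name, start, end):
        # Supports non-negative indexes only, which is all this sketch needs.
        self.lists[name] = self.lists.get(name, [])[start:end + 1]
        return True

    def lrange(self, name, start, end):
        items = self.lists.get(name, [])
        return items[start:] if end == -1 else items[start:end + 1]


def allow_login(client, user, limit=5, window_seconds=300):
    """Return True while the user is within `limit` attempts per window."""
    key = f"login_attempts:{user}"
    attempts = client.incr(key)              # INCR
    if attempts == 1:
        client.expire(key, window_seconds)   # first attempt starts the window
    return attempts <= limit


def push_latest(client, item, keep=3):
    """Prepend an item and keep only the `keep` newest entries."""
    client.lpush("latest_items", item)          # LPUSH
    client.ltrim("latest_items", 0, keep - 1)   # LTRIM
    return client.lrange("latest_items", 0, -1)


client = FakeClient()
results = [allow_login(client, "alice") for _ in range(6)]
for item in ["a", "b", "c", "d"]:
    latest = push_latest(client, item)
print(results, latest)
```

Against a real server, the same two functions should work with a redis-py client passed in place of `FakeClient`, with the caveat that list values come back as bytes unless the client is configured to decode responses.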
Redis is also an appropriate choice for pure caching use cases, with relatively few writes and where the unavailability of cached data does not negatively impact the availability of the overall application (i.e. the data can be read from another system). Other Redis use cases include distributed locks, user session store, cookie storage, search engines, ad targeting, forums, geo searches, and configuration management.
Redis is also extensible: Redis Modules allow new data structures to be added, which opens Redis up to new use cases.
An added bonus: Redis has a wonderful community that’s a real asset for enterprise developers.
When to Use Pivotal Cloud Cache
PCC supports pure caching use cases. But it’s also a true data store. PCC has a large feature set including strong consistency within a cluster, support for multiple data centers, reliable event delivery, and a strong security model. If you think about it, you can consider PCC a framework for building high performance, stateful, scale-out systems.
Many PCC use cases are driven by the need for strong consistency. When does strong consistency matter? When your systems are “update oriented” rather than “insert oriented.” Without strong consistency in the face of updates, conflicting updates could be lost forever.
For these use cases, it is important that all updates are replicated synchronously to all backup members for the updated entry. That way even in the event of a network segmentation you will always get the most current and correct value for that updated entry. The only time you will be unable to access the most current and correct value is if there is a network segmentation that causes all servers hosting that entry to become unreachable.
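The effect of synchronous replication on update-oriented data can be sketched with a toy model. This is not PCC's implementation; the `Node` and `Cluster` classes, the seat-reservation key, and the failure scenario are all hypothetical, and the model only contrasts acknowledging a write after the backup has applied it with acknowledging it before.

```python
class Node:
    """A hypothetical cluster member holding a copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    def __init__(self, primary, backups, synchronous):
        self.primary = primary
        self.backups = backups
        self.synchronous = synchronous
        self.pending = []  # updates not yet shipped to backups (async mode)

    def put(self, key, value):
        self.primary.data[key] = value
        if self.synchronous:
            # Acknowledge only after every backup has applied the update.
            for backup in self.backups:
                backup.data[key] = value
        else:
            # Acknowledge immediately; backups catch up on the next flush.
            self.pending.append((key, value))

    def flush(self):
        """Ship queued updates to the backups (a no-op in synchronous mode)."""
        for key, value in self.pending:
            for backup in self.backups:
                backup.data[key] = value
        self.pending.clear()

    def fail_primary(self):
        """Lose the primary: queued updates vanish and a backup is promoted."""
        self.pending.clear()
        self.primary = self.backups.pop(0)

    def get(self, key):
        return self.primary.data.get(key)


def run(synchronous):
    cluster = Cluster(Node(), [Node()], synchronous)
    cluster.put("seat-12A", "available")
    cluster.flush()
    cluster.put("seat-12A", "reserved")  # acknowledged to the client
    cluster.fail_primary()               # primary dies before any further flush
    return cluster.get("seat-12A")


sync_value = run(synchronous=True)
async_value = run(synchronous=False)
print(sync_value, async_value)
```

In the synchronous run the promoted backup still reports the seat as reserved. In the asynchronous run the acknowledged update is lost and the seat appears available again, which is the kind of lost update the paragraph above warns about.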
PCC inherits a number of nifty features from Pivotal Cloud Foundry.
PCC runs atop PCF, so it benefits from four levels of high availability. This combination is a big deal if your use case requires:
- Extremely high throughput
- Low latency
- Transaction processing
- Event processing
- Compute grids
Based on scores of customer calls, we’ve developed this useful table that shows vertical-specific use cases for PCC:
Every vertical has additional areas that require strong consistency: billing, logistics, inventory, and risk management.
Where else might PCC be a fit? Consider these workloads:
- High volume transaction systems with a lot of update contention. These require extremely high throughput, low latency transaction processing. Examples of this kind of use case are reservation systems (e.g. India / China Railways, Southwest Airlines), and telco activation systems.
- Event processing systems that ingest very large streams of data, then perform rapid calculations. Examples of these include systems that examine each incoming transaction for fraud detection or each sensor reading for detecting anomalies in IoT systems.
- Compute Grid: when processing must be done close to the data because it is infeasible to move the data out of the store into a processing node. This includes things like portfolio valuation, bond pricing, and risk management. For these compute-intensive use cases PCC executes distributed computations in parallel across multiple nodes in a cluster.
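The compute grid pattern, where the function travels to the data rather than the data to the function, can be sketched as follows. This is a minimal single-process illustration, not PCC's function execution API: the `NODES` partitioning, the position tuples, and both function names are hypothetical, and threads stand in for cluster members.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitioning: each "node" holds its own slice of the positions
# as (symbol, quantity, price) tuples.
NODES = {
    "node-1": [("ACME", 100, 12.5), ("GLOBEX", 50, 40.0)],
    "node-2": [("INITECH", 200, 3.25)],
    "node-3": [("UMBRELLA", 10, 99.0), ("HOOLI", 5, 250.0)],
}


def value_local_positions(positions):
    """Runs 'on the node': values only the data that node already holds."""
    return sum(quantity * price for _, quantity, price in positions)


def portfolio_value(nodes):
    """Fan the function out to every node in parallel, then combine the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(value_local_positions, nodes.values()))
    return sum(partials)


total = portfolio_value(NODES)
print(total)
```

Only the small partial sums cross node boundaries here; the positions themselves never move, which is the point of processing close to the data.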
So PCC is full-featured. How does it compare with Redis on the usability front? The gap is closing! The PCC product team has simplified the initial setup of the product. We streamlined this workflow by making a few common assumptions about various configuration parameters.
Navigating the Trade-Off Between Availability and Consistency
The availability and consistency characteristics of Redis and PCC are dependent on whether the system is configured as a singleton, a clustered system, or multiple clusters in different locations. Pivotal Redis for PCF is a singleton. Helpfully, the Pivotal Services Marketplace offers Redis partner products that can be clustered, or configured as interconnected clusters via a WAN link. These partner products include Redis Labs and a9s Redis. PCC is a clustered system; a singleton version is in the works and will soon be available.
PCC was conceived as a clustered system. Redis, on the other hand, was conceived as a singleton first. Clustering was added later. As a bolt-on feature, clustering in Redis is not as functional as it is in PCC where it was integrated from the start.
This table summarizes the most salient availability and consistency characteristics of each product in each configuration.
| Configuration | Availability | Consistency and Data Loss |
| --- | --- | --- |
| Single Node (Singleton) | Redis: limited by single node, no failover. PCC: singleton soon to be available, no failover. | Redis: singletons are consistent by definition. PCC: singleton soon to be available; consistent by definition. |
| Cluster of Nodes | Redis: asynchronous leader-follower replication for backup (active-passive). PCC: synchronous replication for strong consistency and failover. | Redis: 1) Replication of data to followers is asynchronous, which can result in data loss. 2) Network partitions (split brain) can lead to multi-master errors, with data merge and consistency challenges. PCC: 1) Strong consistency, because replication of data is synchronous. 2) Better split brain handling ensures that only the quorum side accepts writes. |
| Multi-Site, Multiple WAN Connected Clusters | Redis and PCC: multi-site replication for disaster recovery and/or geographic locality (active-active, or active-passive). | Redis: a WAN outage can lead to multi-master for writes. Redis provides support for conflict-free replicated data types (CRDTs) for conflict resolution. CRDTs are specialized data structures with unique characteristics for handling data update conflicts. PCC: built-in, timestamp-based conflict resolver, plus custom resolvers. Can also store CRDTs. In addition, general guidance is to adopt one of many published collision avoidance patterns. |
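The two multi-site conflict-handling approaches named in the table can be sketched in a few lines. Neither snippet reflects the actual Redis or PCC implementation: `resolve_by_timestamp` is a hypothetical last-write-wins resolver over `(timestamp, value)` pairs, and `GCounter` is a textbook grow-only counter CRDT with made-up site names.

```python
def resolve_by_timestamp(local, remote):
    """Keep the update with the newer timestamp (a tie keeps the local value).

    Each update is a hypothetical (timestamp, value) pair.
    """
    return remote if remote[0] > local[0] else local


class GCounter:
    """Grow-only counter CRDT: one slot per site, merged with element-wise max."""

    def __init__(self, site):
        self.site = site
        self.counts = {}

    def increment(self, amount=1):
        # A site only ever writes to its own slot.
        self.counts[self.site] = self.counts.get(self.site, 0) + amount

    def merge(self, other):
        # Taking the max per slot makes merges safe to repeat and reorder.
        for site, count in other.counts.items():
            self.counts[site] = max(self.counts.get(site, 0), count)

    def value(self):
        return sum(self.counts.values())


# Two sites update the same entry while the WAN link is down.
site_a = (1700000005, "gold")
site_b = (1700000009, "platinum")
winner = resolve_by_timestamp(site_a, site_b)

# Two sites count events independently, then exchange state once reconnected.
east, west = GCounter("east"), GCounter("west")
east.increment(3)
west.increment(2)
east.merge(west)
west.merge(east)
print(winner, east.value(), west.value())
```

The timestamp resolver discards one of the two conflicting writes, while the counter converges on a total that includes both sites' increments. That difference is why CRDTs suit data such as counters and sets, and why a resolver (or a collision avoidance pattern) is still needed for arbitrary values.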
You’re Going to Use Redis and PCC, So Know When to Use Which
Redis offers the fastest path to a simple cache. Use it when enterprise-grade features are not a requirement. PCC, on the other hand, limits the trade-offs imposed by the CAP theorem by providing partition tolerance, strong consistency, and a good measure of availability (via PCF). PCC’s unique ability to cover these bases was explored by my colleague, Mike Stolz, in a recent blog post.
As always, your choice should be driven by the unique requirements of each use case. Now you’re equipped to ask the right questions and assess each product on its merits.
About the Author
More Content by Jagdish Mirani