Case Study: Performance Comparison for Scaling Session Cache at Southwest Airlines

November 7, 2013 Adam Bloom

In 2010, Southwest.com’s e-commerce engine was responsible for over $8.5 billion in revenue, and Pivotal GemFire was chosen to support the website’s session caching at several thousand reads and writes per second.

To provide some background, Southwest Airlines is a successful company operating at massive scale. With 40 consecutive years of profitability, the company coordinates over 3,200 flights and 37,000 employees to serve over 100 million passengers per year. In 2010 (around the time of their GemFire selection), Southwest Airlines had over $11 billion in passenger revenue, and 84% came from the website. Since revenue and profit are largely derived from the website, Southwest needed to rearchitect it to better handle dynamic scale and to add social media support to its web services.

Building the Requirements for Improved Session Cache

In planning for growth, Southwest’s IT department began looking for a solution to scale Southwest.com sessions and maintain their legendary customer service standards globally. In addition, they wanted to use cost-effective deployment models and continue their path towards a virtualized infrastructure. Previously, they had built a custom session solution in-house, but its high availability component was not suitable for virtualization, and they wanted a proven solution provided by a vendor. Given the maturity of cloud architectures within the IT landscape, the decision-making group also thought it was important to move to an architecture that was cloud-ready. Looking ahead, IT knew that session functionality would need to support applications other than the website, and the team had a goal of eventually moving to a distributed data center model with active-active state to achieve maximum uptime, even during upgrades. These requirements led Southwest to look at Pivotal GemFire.

From a development perspective, one of the main influences was that Pivotal GemFire was on the road to greater Spring-enablement, with capabilities like the newer Spring Data GemFire project, a way to potentially speed up a traditional RDBMS by 60x. At the time of choosing GemFire, Southwest’s team had already adopted the Spring Framework and had been using it for over five years. In addition, they had spent two years moving from a legacy EJB container to a Tomcat-based container. This also led Southwest down the path of using Pivotal tc Server.

The session caching environment handled significant volume, the kind that enters the realm of big, fast data. In the world of e-commerce, travel websites are more sophisticated than a simple shopping cart that sums product prices and calculates shipping. They can have multiple marketing integrations, significant promotion functionality, and loyalty capabilities, and they have to deal with more than products and prices. Flight and seat availability, regular price changes to optimize utilization, taxes, baggage, partner offers, and more all add complexity. Session state plays a key role in the user experience, and Southwest.com was seeing over 4,000 session reads and writes per second.

The GemFire Difference in Performance and Cost

In the case of Southwest, the legacy application architecture forced every read to include a write, which added to the challenge of scaling. Their scale tests used very high concurrency models in which over 1,000 application threads would each read and write several times per second. These requests were funneled into a 4-node GemFire cluster (16 cores total). The Pivotal engineering, support, and sales teams worked hand-in-hand with Southwest’s IT team to oversee the approach, tune region requests, and optimize garbage collection.
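To make the read-implies-write pattern concrete, here is a minimal sketch of a load model in which every session lookup also stores the session back. It is only an illustration: `SessionStore`, `get_with_writeback`, and `run_load` are hypothetical names, the store is a plain in-memory dictionary rather than GemFire, and the thread and request counts are scaled down from the figures above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SessionStore:
    """Hypothetical in-memory stand-in for a session cache (not the GemFire API)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()
        self.reads = 0
        self.writes = 0

    def get_with_writeback(self, session_id):
        """Legacy-style read: every lookup also writes the session back."""
        with self._lock:
            state = self._data.get(session_id, {})
            self.reads += 1
            self._data[session_id] = state  # the forced write on each read
            self.writes += 1
            return state


def worker(store, session_id, requests):
    # One application thread repeatedly touching its own session.
    for _ in range(requests):
        store.get_with_writeback(session_id)


def run_load(threads, requests_per_thread):
    """Run the scaled-down load and return the store with its counters."""
    store = SessionStore()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(worker, store, f"session-{i}", requests_per_thread)
            for i in range(threads)
        ]
        for future in futures:
            future.result()  # re-raise any worker exception
    return store


store = run_load(threads=50, requests_per_thread=20)
print(f"reads={store.reads} writes={store.writes}")  # reads=1000 writes=1000
```

Because each read carries a write, the write count always equals the read count, so a cache sized for read traffic alone would see its write load match it one for one.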

From a competitive perspective, GemFire’s architecture was much more cost-effective in terms of capital and operating expenses. Against multiple major competitors, our ability to make full use of 64-bit JVMs with 34 GB heaps per JVM made the comparison an obvious one. Some even referred to the competitive approach as an “operational nightmare.” See the table below for the comparison.

| Item | Pivotal GemFire | Competition |
| --- | --- | --- |
| Number of Servers per Data Center | 4 | 4 |
| Number of JVMs per Server | 1 | 70 |
| Heap Size per JVM | 34 GB | 2 GB |
| Available Heap Memory per JVM | Approx. 34 GB | 1.6 GB |
| Available Storage per JVM (50% ratio for churn on GemFire vs. 60%) | 17 GB | 0.96 GB |
| Total Storage Needed per Data Center | 34 GB | 66 GB |

Note: For GemFire, it is recommended that if you have a lot of churn (i.e., you do a lot of writes, which the website does), your cache size be 50% of the size of the heap.

As the table above shows, GemFire needed roughly half the storage per data center of the tested competitive configuration (34 GB versus 66 GB), and it ran 1 JVM per server instead of 70. With far fewer JVMs to operate, the GemFire cluster was a much more manageable size.
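The per-JVM rows of the comparison can be reproduced with a few lines of arithmetic. The sketch below takes the table's values at face value; the `Deployment` class and its field names are illustrative assumptions, and the "Total Storage Needed per Data Center" row is a figure from the comparison itself that this sketch does not attempt to derive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Hypothetical container for one column of the comparison table."""
    name: str
    servers_per_dc: int
    jvms_per_server: int
    available_heap_gb: float  # heap left for data in each JVM
    cache_ratio: float        # share of available heap used for cached data

    @property
    def jvm_count(self) -> int:
        # JVMs to operate in one data center.
        return self.servers_per_dc * self.jvms_per_server

    @property
    def storage_per_jvm_gb(self) -> float:
        # Usable cache storage in each JVM after applying the churn ratio.
        return self.available_heap_gb * self.cache_ratio


gemfire = Deployment("Pivotal GemFire", 4, 1, 34.0, 0.50)
competition = Deployment("Competition", 4, 70, 1.6, 0.60)

for d in (gemfire, competition):
    print(f"{d.name}: {d.jvm_count} JVMs, {d.storage_per_jvm_gb:.2f} GB per JVM")
```

Read this way, the table implies 4 JVMs per data center for GemFire against 280 for the competing configuration, with 17 GB and 0.96 GB of usable storage per JVM respectively.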

Additional Benefits

In addition to providing a virtualized, cost-effective, cloud-ready data grid, GemFire provided extremely low latency on reads with an ability to scale. This meant users experienced a much more responsive site, and Southwest had a path to grow. Though it is a distributed NoSQL solution, GemFire also supported high consistency of data, enabling the IT team to run the product across connected data centers. Lastly, GemFire provided a number of analysis capabilities that allow teams to understand the forensics of demand.

With these capabilities and more, GemFire has helped Southwest stay at the top of its industry.
