EMC Q&A Part 1: Reduced Costs, 10-200x Performance Improvements with Pivotal GemFire & SQLFire

March 24, 2014 Adam Bloom

Jim Nuzzo, Enterprise Architect, EMC

Increasing workloads by 10x on a smaller server footprint is a solid achievement. So is speeding up query responses by 200x. This article explains how EMC achieved both.

With over 25 years in IT, Jim Nuzzo is an enterprise architect and cloud platform manager at our parent corporation, EMC. Mr. Nuzzo was kind enough to have a two-part Q&A session with us about EMC’s use of Pivotal GemFire, Pivotal SQLFire, and Spring technologies. EMC has been relying on these technologies for over six years, well before Pivotal came into being.

Below is part one. In it, we hear about how GemFire was deployed as a website cache, reducing costs and increasing performance by 10x. Then, Nuzzo explains how SQLFire improved query responses from 10 minutes to 3 seconds and allowed EMC to join data across Greenplum, Oracle, and Microsoft databases.

Could you tell us more about how Pivotal GemFire was deployed on EMC.com?

Yes. EMC.com had been using a cache mechanism that was part of the web content management system. We replaced it with GemFire as the caching solution, and we saw a tremendous performance improvement.

What drove the change to get on Pivotal GemFire?

With the old solution, we were being forced into an approach that caused the existing cache to be evicted every ten minutes based on a time-to-live algorithm. So, we ran into a thundering herd problem. This is where, all of a sudden, you have huge spikes in load on the servers. We would have these every ten minutes. We really needed a time-to-idle approach, so that an item stayed in the cache while it was still being used and was refreshed only after it went idle. GemFire allowed us to do this. GemFire is also an extremely high-performance system based on an in-memory, distributed architecture. It’s fast, really fast. And it scales almost linearly as you add nodes.
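
To make the distinction between the two expiry policies concrete, here is a minimal sketch in Python. It is not GemFire's API or EMC's implementation; the class name ExpiringCache, its parameters, and the injectable clock are all hypothetical, chosen only for illustration. With time-to-live, an entry expires a fixed interval after it was written, so popular entries written together also expire together. With time-to-idle, every read resets the timer, so only entries nobody is requesting fall out of the cache.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Toy single-process cache (hypothetical, for illustration only).

    mode="ttl": an entry expires `timeout` seconds after it was written.
    mode="tti": an entry expires `timeout` seconds after it was last
                written or read.
    """

    def __init__(self, timeout: float, mode: str = "tti",
                 clock: Callable[[], float] = time.monotonic) -> None:
        if mode not in ("ttl", "tti"):
            raise ValueError("mode must be 'ttl' or 'tti'")
        self.timeout = timeout
        self.mode = mode
        self.clock = clock
        # key -> (value, time the expiry timer was last reset)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (value, self.clock())

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stamp = entry
        now = self.clock()
        if now - stamp >= self.timeout:
            # Expired: drop it so the caller reloads from the backend.
            del self._entries[key]
            return None
        if self.mode == "tti":
            # A read counts as activity, so the idle timer restarts.
            self._entries[key] = (value, now)
        return value
```

Under the time-to-live mode with a ten-minute timeout, a page that is requested constantly is still dropped ten minutes after it was cached, which is the pattern that produces the periodic load spikes described above. Under the time-to-idle mode, the same page stays cached for as long as requests keep arriving.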

What were the results with Pivotal GemFire?

The original system was only able to get about 48 concurrent users per server. With GemFire, we were able to achieve 500 concurrent users per server. Not only is this a more than 10x improvement, it is more than double the per-server SLA we expected. As a result of GemFire’s deployment, we were able to actually remove servers and still achieve our SLAs. We reduced both operating expenses and capital expenditures while improving performance.

Before Pivotal SQLFire was deployed on your support website, what problems did you face?

On our support website, customers can come in and look at all the information EMC has about their products—they can look at everything—the serial numbers for every piece of EMC equipment in their data centers worldwide. Besides acting as a register, this allows customers to perform self-service activities like finding service providers, reviewing firmware fixes, showing support history, seeing contract status, and more. Unfortunately, this data is not in one, unified backend system. It is spread across systems, and we were trying to aggregate data from across these systems. When we did joins across the three sources, it would cause timeouts. This meant we couldn’t present data to customers at all or it would take 10 minutes to process the results. The query time was unacceptable.

How did you use Pivotal SQLFire to solve the problem?

We are using Pivotal SQLFire to accelerate data out of three separate systems—Greenplum, Oracle, and Microsoft SQL Server. SQLFire provides an intermediate data model and aggregates the data from these three systems. We populate it with Spring Batch on a nightly basis and selectively allow dynamic queries in some cases.
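
The pattern described here, copying data from several backends into one in-memory SQL store and joining it there, can be sketched roughly as follows. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for SQLFire, plain lists as stand-ins for the three source systems, and hypothetical table and column names (assets, contracts, tickets). None of these reflect EMC's actual schema or its Spring Batch jobs.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_store(assets: List[Tuple[str, str]],
                contracts: List[Tuple[str, str]],
                tickets: List[Tuple[str, str]]) -> sqlite3.Connection:
    """Load extracts from three sources into one in-memory database.

    In the setup described above this would be a nightly batch load;
    here it is a single function call on hypothetical sample rows.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE assets    (serial TEXT PRIMARY KEY, customer TEXT);
        CREATE TABLE contracts (serial TEXT, status TEXT);
        CREATE TABLE tickets   (serial TEXT, ticket_id TEXT);
        CREATE INDEX idx_contracts_serial ON contracts(serial);
        CREATE INDEX idx_tickets_serial   ON tickets(serial);
    """)
    conn.executemany("INSERT INTO assets VALUES (?, ?)", assets)
    conn.executemany("INSERT INTO contracts VALUES (?, ?)", contracts)
    conn.executemany("INSERT INTO tickets VALUES (?, ?)", tickets)
    conn.commit()
    return conn


def customer_view(conn: sqlite3.Connection,
                  customer: str) -> List[Tuple[str, Optional[str], int]]:
    """Join all three tables locally: one row per serial number with
    its contract status and the number of support tickets."""
    return conn.execute("""
        SELECT a.serial, c.status, COUNT(t.ticket_id)
        FROM assets a
        LEFT JOIN contracts c ON c.serial = a.serial
        LEFT JOIN tickets   t ON t.serial = a.serial
        WHERE a.customer = ?
        GROUP BY a.serial, c.status
        ORDER BY a.serial
    """, (customer,)).fetchall()
```

Because the join runs against a single local store rather than across three remote databases at request time, the cost of reaching each backend is paid once during the load instead of on every customer query.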

What have been the results with Pivotal SQLFire?

The system’s performance has shown a drastic improvement. Before, queries could take about 10 minutes and often timed out. Now, they take no more than three seconds, which is a 200x performance improvement. We have this tremendous ability to do something we couldn’t do before. Now a customer can see support ticket history, software versions available, engineering releases, the status of contracts, and more. This shows up for all their equipment, even if they have hundreds or thousands of data centers with tens of thousands of serial numbers for EMC products.
