After spending about 18 months working hard at modernizing the data services Discover uses to access big, monolithic systems of record, I have some advice for anybody just getting started on that journey: Expect to be surprised, both by the simplicity of things that might seem hard and by the difficulty of things that might seem like no-brainers. And expect things to change along the way.
But do get started right away. Any early legwork is well worth it in the end. Here’s my experience in working with Pivotal to improve our application stack and data service architecture at Discover.
This Time, We Mean Business
So there I was, in June 2017, sitting in the annual technology strategy meeting listening to my CIO and vice president of architecture talk about our application modernization journey. I figured it would be like previous conversations we have had. As a company, we have had great success building scalable, flexible, consumer-facing applications around the edges and then connecting them to the plumbing of our legacy stack. This year’s message was different, though.
They were talking about going full stack and including the legacy platforms and applications in their journey to agile, CI/CD-based development. They were talking about building and deploying changes to production in 24 hours or less, not just for the skinny edge / web apps, but for core capabilities. It sounded great, if a bit optimistic.
I knew we were serious, and I knew it was the right and necessary thing to do. However, as a domain architect for a suite of monolithic, back-office systems of record with an average age of 10-plus years and production release schedules that ranged from semi-annually to monthly, I did start to freak out just a little bit. Mostly, I just had difficulty imagining how we might achieve it.
After I gave it some thought and consulted with my peers, we decided to focus on the integration layer between the dynamic, flexible systems of engagement and the more monolithic systems of record. Specifically, we wanted to stand up a series of functionally aligned, cloud-native data services in that middle layer. By choosing the right abstraction and creating smaller, more flexible, cloud-native bits only loosely coupled to the monoliths, we could enable speed and agility in that layer while insulating the back-end systems from the demands and strains of high-velocity change. At the same time, we could still adapt to enhancements and improvements in the systems of record themselves, exposing them to other business and developer stakeholders quickly and easily.
Our choice for this new layer: Java/Spring on Pivotal Cloud Foundry for our API logic, and Pivotal GemFire/Apache Geode for our data abstraction layer. Building the new bits turned out to be surprisingly easy. Most of the work focused on reverse-engineering the existing logic and data, and decomposing things in a logical but useful way.
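To make the idea of a loosely coupled data service a little more concrete, here is a minimal sketch of a look-aside read path in plain Java. It is an illustration only, not our actual code: the class name AccountDataService, the find method, and the String record type are all hypothetical, a ConcurrentHashMap stands in for what would be a GemFire/Geode region, and a simple function stands in for the call into the system of record.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical look-aside data service; every name here is illustrative. */
public class AccountDataService {
    // Stand-in for the data abstraction layer (a cache region in a real deployment).
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for a call into the monolithic system of record.
    private final Function<String, Optional<String>> systemOfRecord;
    // Counts how often the back end is actually reached.
    private final AtomicInteger backEndCalls = new AtomicInteger();

    public AccountDataService(Function<String, Optional<String>> systemOfRecord) {
        this.systemOfRecord = systemOfRecord;
    }

    /** Serve from the cache when possible; otherwise load from the back end and remember the result. */
    public Optional<String> find(String id) {
        String cached = cache.get(id);
        if (cached != null) {
            return Optional.of(cached);
        }
        backEndCalls.incrementAndGet();
        Optional<String> loaded = systemOfRecord.apply(id);
        loaded.ifPresent(value -> cache.put(id, value));
        return loaded;
    }

    public int backEndCalls() {
        return backEndCalls.get();
    }

    public static void main(String[] args) {
        // A fake system of record that knows a single account.
        AccountDataService service = new AccountDataService(
            id -> "42".equals(id) ? Optional.of("Ada") : Optional.empty());
        System.out.println(service.find("42"));
        System.out.println(service.find("42"));
        System.out.println("back-end calls: " + service.backEndCalls());
    }
}
```

The point of the design is that consumers only ever see the small service in the middle. Repeated reads are absorbed by the cache, so the system of record is shielded from consumer traffic, and the service can change on its own schedule.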
We also ended up changing how we thought about application development, in general, and transforming the processes and procedures we used to create and deploy working software. Previously, we would add onto or enhance the existing thing, and force every change through every control gate, process step, and manual governance checkpoint. We had “monolith” baked into our approach, not just into our technology.
Fast forward to our current state. It’s amazing how liberating it was when we switched to building small bits of new things that largely stood alone and that could be changed and deployed very quickly because they were only loosely connected to the big, old monolith.
Successful Modernization Means Putting In The Effort
As I look back on our efforts over the last year or so—and on an entire process that kicked off in earnest in June 2017—I realize that much of what I learned revolves around the mentality, process and approach rather than around the technology stack and tools. In that context, here are five lessons that stand out as particularly valuable for transforming how your developers work, not just your applications.
1. Spend The Time Up Front So You Can Start On The Right Foot
I was initially surprised by how long it seemed to take to set everything up; give everyone access; and, as teams, reach consensus on objectives, standards and processes. I wanted to jump right in. But, luckily, we had some Pivotal folks embedded in our group who insisted we follow the agile methodologies and started us out with a couple of onboarding sprints.
As I look back, that investment of time laid the groundwork for an amazing combination of speed and quality once we got going. And it helped break folks out of their old habits and ways of looking at things as long, slow, waterfall-type processes. We had trouble seeing the way things could be until we stripped away the blinders we’d always been wearing.
2. Learn By Doing
There’s something truly liberating about wading into unfamiliar territory. You’re forced to be open-minded, curious, flexible and collaborative, always asking yourself, “Is there a better way to do this that maybe I don’t know about?” And as we continuously learned new things, we immediately shared them with the team.
Waiting for someone else to pre-solve all our problems—and then give us mature solutions and patterns with clear and unambiguous instructions—was precisely what had been slowing us down and reducing innovation.
Because we were using tools and platforms (PCF and GemFire) that were new to our operations groups, as well, this really became an interesting exercise. We had to align work and priorities with them so that we could both be more successful. I knew it was working when I was invited to their agile ceremonies almost as de facto product owner. It meant that instead of them deciding what I needed and waiting until they were ready to deliver it to me, they were listening to what I needed and setting their priorities and schedules around helping add value more quickly.
3. Deliver Something, and Then Rework It As Needed
This approach is definitely easier and better than trying to make everything “perfect” on day one. It flows from the “definition of done” that the teams agree to in the onboarding phases. I quickly realized that most of the resistance to deploying something less than fully perfect came from the team’s “legacy” understanding of implementing changes, which was sometimes viewed as time-consuming, uncomfortable, and carrying the potential to trip over long-buried technical debt.
However, using a platform like PCF along with our CI/CD pipelines made the refactoring so easy that we were actually able to embrace the concept of continuous delivery. Here’s a great example: As we iterated with the platform team, the processes for deploying our APIs to production evolved. What started out as an hours-long process with multiple manual steps and hand-offs between groups ended up as an automated pipeline that needed minimal intervention and took only minutes.
When a bit of old code that we hadn’t touched for a while broke, we realized our deployment pipeline for that bit didn’t work anymore. Our initial reaction was to ask the platform team for an exception so that we could get in our production fix quickly—because in our traditional approach, changing how we’d deploy an application to production would take days or weeks. These days, we are able to spend just an hour or two refactoring our pipeline job and get the fix installed the right way that same day.
4. Keep Your Head Up and Adapt
Once we got into the process of working through our feature backlog, we realized that the features we thought would be the most valuable sometimes were not. As we adapted to other work in flight and the demands of the consumers of our data services, we dropped some things, started on other things instead, and didn’t let that bother us. The result is consumers banging down the door asking to consume our new APIs.
5. Backward-Compatibility Comes At A Cost
We tried to make our first new API plug invisibly into one of the biggest of our legacy, monolithic data services behind the scenes, and it didn’t go well. We ended up tripping over tons of unsurfaced technical debt in the legacy service and fixing newly exposed bugs for months after the install. From that point on, we committed to building out the new, more real-time capabilities only in the new bits and then requiring consumers to switch to the new bits if they wanted the new capabilities.
Full Speed Ahead In 2019
It’s amazing how far we’ve come in just 18 months, and how easy it was for our monolithic application developers to learn new things. Not only are they comfortable working with our PCF-based stack, but they have also become both experts in and advocates for that transformation.
I can’t wait to see what will happen in 2019, as we move further away from batch processes and into real-time processing, and as we successfully serve an ever-growing list of consumers for our modern data services.