Developing During a Pandemic: The Lessons We Learned

July 30, 2020 Joe Baguley

In January 2020, the UK recorded its first case of COVID-19. The world was facing the unknown, and the only apparent way to contain the virus was for every nation across the planet to shut down. As governments across the globe planned for the unknown, organizations rapidly began seeking ways to support efforts to save lives and restart the economies of these countries. One such endeavor was the vital role technology could play in contact tracing. Over the past few months, VMware has been pioneering the development of a contact-tracing app, which has provided valuable learnings and knowledge.

Testing and contact tracing have long been recognized as among the most effective ways to build a picture of how a virus spreads between humans, and were quickly implemented by nations around the world. Unlike in previous pandemics, however, today’s technology presents a unique opportunity to support what has traditionally been a manual task. Connected devices such as phones and tablets can provide scientists and healthcare workers with granular insights into the spread of the disease, insights that can help in healthcare response and allow scientific communities to learn more about viruses’ unknown behaviors.

As the UK went into lockdown, research by epidemiologists, mathematical modellers, and ethicists at Oxford University’s Nuffield Departments of Medicine and Population Health concluded that a contact tracing app was urgently needed to support health services in containing spiraling transmissions, targeting interventions, and keeping people safe as part of a test and trace system.

While other countries, such as Singapore, were working on their own apps, few organizations anywhere had in-depth experience of building a nationwide contact tracing app. VMware’s combination of software and services, which empower developers to deliver better software, faster, while enabling the highest level of security and operations, put us in a strong position to work with governments to develop such an app. We operate under a lean governance structure with an agile program management system, allowing us to steer the product direction and constantly focus teams on the most important deliverables. This approach allows us to adapt quickly to any of our customers’ technological requirements and securely build solutions regardless of the app development architecture.

At the beginning of March, VMware Pivotal Labs partnered with NHSX, the technology innovation arm of the UK’s National Health Service (NHS), to share insights on how software could help track the spread of the COVID-19 virus. In less than 24 hours, we had presented a proof of concept using low-energy Bluetooth technology on smartphones to detect users’ proximity. After further research, this led to NHSX spearheading a contact tracing app to deliver on the objective of identifying and notifying people who had been in contact with someone with the virus, while also capturing anonymized data to improve the scientific community’s understanding of how the virus spreads.

Together with NHSX development partners, we agreed that the NHS app would need to:

  • Work on as many of today’s smartphones as possible

  • Provide granular data that uncovered the distance and time it takes for one person to infect another and how this differs between symptomatic and asymptomatic people in order to improve scientists’ understanding of how the virus is transmitted

  • Be highly secure and enable complete privacy for users

The road to delivering an MVP

Our team set to work developing a minimum viable product (MVP) from the proof of concept. To achieve the best application design possible, we brought as many experts and technology advisors to the table as we could. We further supported our efforts by conducting extensive research and testing while also liaising with teams from around the world and learning from their experience of trying to create their own contact tracing apps. Our team worked directly with NHSX on the application design, continuously amending the code in response to the rapidly changing environment and requirements.

User privacy and security were of paramount importance throughout our process. We needed to offer a high level of privacy to users, but also to deliver critical data that supported scientists’ understanding of how the virus is transmitted. From the outset, the plan was that any work would be open source, both to harness and enable innovation by the tech community and to ensure maximum transparency. NHSX wanted to pursue a centralized model supported by a robust, highly scalable back end that could handle millions of records securely and anonymously. We could see from the demand for other test-and-trace efforts that citizens would sign up in large numbers upon release, so the back end needed to be able to scale on Day 1.

All data in the application would be pseudonymized to protect the identities of the phone owners—an approach endorsed by the National Cyber Security Centre (NCSC). The only information it would ask users to share upon installation was the first half of their postcode, which would help ensure public health resources were diverted to areas showing high rates of infection. The NCSC consulted on the management of citizen data, while an ethics board was appointed by NHSX, informed by political bodies and equalities groups.
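To make the registration step above concrete, here is a minimal Python sketch of what pseudonymized sign-up data could look like. It is an illustration under stated assumptions, not the NHS app's actual code: the function names, the record layout, and the use of a random 128-bit token as the pseudonym are all hypothetical. The only details taken from the text are that identities were pseudonymized and that users shared just the first half of their postcode.

```python
import secrets


def outward_code(postcode: str) -> str:
    """Return the first half (outward code) of a full UK postcode.

    Relies on the fact that the second half (inward code) is always
    the last three characters once spaces are removed.
    """
    compact = postcode.replace(" ", "").upper()
    if not 5 <= len(compact) <= 7:
        raise ValueError(f"unexpected postcode length: {postcode!r}")
    return compact[:-3]


def new_pseudonym() -> str:
    """Generate a random identifier that carries no personal data.

    Hypothetical choice: 16 random bytes rendered as 32 hex characters.
    """
    return secrets.token_hex(16)


def registration_record(postcode: str) -> dict:
    """Build the (hypothetical) record stored at installation time."""
    return {
        "pseudonym": new_pseudonym(),
        "postcode_district": outward_code(postcode),
    }


if __name__ == "__main__":
    print(registration_record("SW1A 1AA"))
```

The point of the sketch is that nothing in the stored record names the phone owner: the pseudonym is random, and the postcode district is coarse enough to guide regional resourcing without pinpointing an address.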

We established a socially distanced war room in our closed-off offices that allowed the VMware Pivotal Labs team to work on a centralized version of the app in conjunction with members of the wider app development team, including other government partners, which supported our team with assurance and testing and would ultimately run and manage the app from July onwards.

Testing roadblocks encountered along the way

In early March, the team encountered a major testing hurdle. Singapore, Germany, and France had all reported issues with Bluetooth on iPhones: a known limitation in Apple’s iOS was causing engineering challenges for contact tracing app developers. Unlike with Android devices, when two Apple devices had the application running in the background, they would not detect each other. We tried a number of ways to get around this, but would ultimately get “false negative” results when this scenario arose. Only later would Apple address this limitation, and only for apps built on the Google Apple Exposure Notification (GAEN) infrastructure. We’ll come back to that later.

In the meantime, the team was constantly testing and iterating to ensure the NHS app would work effectively upon release to the public. As more official virus symptoms needed to be added, the app had to be modified to take them into account—which was no small feat. When you’re deploying something for national use, it’s not just a case of plug and play or cut and paste, and each time we modified the app, we needed to run more simulations to test viability at a massive scale.

In early April, we delivered an MVP that was based on self-certified symptom reporting. The opportunity to move away from computer simulations and run a real-life test at the Royal Air Force base in Leeming, North Yorkshire was a crucial moment for us all. The test was created to simulate users that tested positive or reported symptoms, and phones that crossed paths were anonymously tracked. The initial trial was a success, and so the team moved to the next phase of real-world testing by beginning preparations for a large, public, first-phase launch on the Isle of Wight. This was another big step forward, as we were able to conduct far more extensive research as part of that trial, leveraging everything from focus groups to scenario testing in a variety of working environments, right down to the language used.

Just as we were preparing to launch the first phase on the Isle of Wight, the GAEN API was officially announced, potentially providing the answer to the iPhone challenge that we had been working to overcome. However, several unknowns still existed. The code wasn’t set to be available until mid-May and, at this point, no one really knew what the final launched functionality would look like. In the meantime, however, its decentralized approach immediately triggered conversation and debate globally.

Several months into the project and as GAEN became available, it was important to explore and test its capabilities as quickly as possible in order to gain insights into what would be the best possible approach to deliver an app that could track the virus and help restart society. It became clear in time, however, that while GAEN incorporated a fix to the background detection issues referred to earlier, it would supply far less granular reporting. For example, the NHS app could distinguish a low-risk situation, in which a user passed another individual on the street for a few seconds, from a higher-risk one, while the GAEN approach at that time couldn’t distinguish between that low-risk situation and the user hugging someone.
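The difference in granularity can be illustrated with a small Python sketch. This is a toy model, not the NHS app's or GAEN's actual risk algorithm: the `Reading` type, the `risk_score` function, the 2-metre cut-off, and the linear closeness weighting are all assumptions made for illustration, and none of them is epidemiologically validated. It only shows why having both distance and duration lets an app separate a brief pass in the street from prolonged close contact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    seconds: float     # time since the encounter began
    distance_m: float  # estimated distance, e.g. inferred from Bluetooth signal strength


def risk_score(readings: list[Reading], close_m: float = 2.0) -> float:
    """Sum the seconds spent closer than close_m, weighted by closeness.

    Each interval between consecutive readings contributes
    duration * (1 - distance / close_m), using the mean distance of
    the two readings; intervals at or beyond close_m contribute nothing.
    """
    ordered = sorted(readings, key=lambda r: r.seconds)
    score = 0.0
    for prev, cur in zip(ordered, ordered[1:]):
        duration = cur.seconds - prev.seconds
        distance = (prev.distance_m + cur.distance_m) / 2
        if distance < close_m:
            score += duration * (1 - distance / close_m)
    return score


if __name__ == "__main__":
    passing = [Reading(0, 1.5), Reading(5, 1.5)]    # five seconds, 1.5 m apart
    hugging = [Reading(0, 0.2), Reading(600, 0.2)]  # ten minutes, 0.2 m apart
    print(risk_score(passing), risk_score(hugging))
```

With only a coarse "was there contact?" signal, both encounters look the same; with distance and time, the second scores far higher than the first.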

Development concepts are rarely what is finally delivered, and changing course and being as flexible as possible is crucial to successful delivery. As the public health service learned more about the virus, the team made daily, and in some cases hourly, changes to ensure the best iteration of the app was put forward. For example, while many tests achieved the desired results across both Apple and Android devices, NHSX decided that until two Apple devices were able to communicate with each other, there were simply too many false negatives to release nationally.

Another significant shift for the team came when the focus of the NHS app switched from notifying people who had come into contact with those who’d reported officially recognized symptoms to notifying only people who had come into contact with those who had received positive tests. Following this change in direction, along with continued issues with accessing the internal API on Apple devices, it was decided with VMware’s help that basing the NHS app on GAEN was the more viable option.

The team continued to work to support NHSX following the decision. Ultimately, while the application release was put on hold, we were able to help the NHS and UK government build a more secure mobile and backend application that would scale nationally on launch day using Bluetooth Low Energy without significantly impacting a device’s battery life. At the end of June, we handed the development work on the app to the team at Zühlke Engineering to operate, as planned at the start of the project.

Our learnings

This year, we encountered an event no one in the world could have predicted.

The VMware team—which was composed of employees from the UK, Germany, France, Ireland, Spain, Singapore, Australia, Japan, the U.S., and Canada—worked around the clock to fight this virus. We came together to be a part of something much bigger than ourselves, and bigger than this project. We continue to collaborate with other companies, third-party experts, and global government bodies to help design the most viable way to track and manage this virus—and we continue to consult and advise different groups around the world based on the learnings and successes of our open source solution. A number of the learnings we share include:

  • Ensure balanced skill sets: We always advocate the importance of a balanced development team, but the significance of this approach was never more evident than when we shipped our MVP in just six weeks. Having designers, product managers, developers, platform engineers, and site reliability engineers on board from day one gave us the cross section of skills we needed to deliver quickly, without dependencies. The balanced team also extended to partners for security, epidemiological research, and ethics, so that all parties could contribute to the product, with the stakeholder setting priorities through the governance boards.

  • Right trumps fast: Throughout the project, we strove to maintain a constant balance between doing things right and doing them fast. Striking such a balance can lead to some challenging decision-making, but a key learning for the team was to default to doing the right thing, which avoided rework and wasted time down the line.

  • Speed up start and scale as needed: This project required an accelerated start and scale, meaning that some of our usual processes needed to be adapted for working large and fast rather than small and incremental. Implementing governance/growth boards provided a forum to ensure we could share progress with stakeholders, and seek clarification or renewed direction in as controlled a manner as possible.

  • Security has to be baked in: Security requirements must always be top of mind with an application of this nature, and in this case, security needed to be bullet-proof from Day 1. By partnering with NCSC and providing our own security expertise, we were able to design the architecture and security practices so they were intrinsic to the application.

  • Plan for scalability early: When undertaking a project in such unusual circumstances, plan for wide-scale adoption and use from the moment of release. To ensure that any challenges could be mitigated and planned for, we created a performance and scale team early on to make certain the product and platform would effectively work at the scale and demand anticipated.

Sharing these learnings and, in the process, amplifying the open source nature of this project is important to us as technologists. As a powerful example of this, VMware is a founder of the recently announced Linux Foundation Public Health (LFPH) initiative. The LFPH uses open source technologies to help public health authorities across the world combat COVID-19 and future epidemics, and it has launched with two hosted exposure notification projects that are currently being deployed in Canada, Ireland, and several U.S. states. While it will initially focus on exposure notification applications built on the GAEN system, the initiative will be able to expand to support all aspects of public health authorities’ testing, tracing, and isolation activities.

We’re immensely proud of what we have achieved, and plan to continue to achieve, through initiatives such as this. The experience of engineering in a pandemic, when everything is changing on a daily basis, illustrates once again the potential for technology to act as a force for good and the importance of collaborative innovation to lead the way—today and in the future.

 

This article may contain hyperlinks to non-VMware websites that are created and maintained by third parties who are solely responsible for the content on such websites.

Image courtesy of Mimi Thian via Unsplash.
