In this episode, we’re joined by special guest Sanjna Verma, Senior Product Manager at MuleSoft. In addition to her responsibilities delivering some of MuleSoft’s core products, Sanjna teamed up with colleagues from Salesforce, MuleSoft, and Tableau to deliver the COVID-19 Data Platform quickly after the pandemic hit. She recounts that experience, the insights it brought her, and her thoughts on other API-related topics.
When the COVID-19 pandemic went global in early 2020, Salesforce leadership issued a call to action, asking the company to see how its technology could be applied to help deal with the pandemic’s impacts. In response, a team of individuals from Salesforce, MuleSoft, and Tableau banded together to find out how they could build a data platform to serve a wide range of possible consumers. Sanjna Verma joined APIs Unplugged to recount the journey from its abrupt inception to its current state of handling hundreds of thousands of requests per day. The main insights are summarized below in three stages of that journey:
Stage #1: Finding the way
Sanjna’s entry into the world of APIs came in college, when she wanted to build a multi-destination routing system in Python. She was amazed at the rich set of “digital libraries” — APIs — available to help her shortcut this work. As she puts it, “Every experience I’ve had professionally, formally or informally, has always revolved around how to make the dissemination of technology and information easier. APIs have been the vehicle through which I get to share that narrative all the time.” Fast forward to 2020, and that same desire for digital exploration kicked in when San Francisco society locked down on March 6. Describing her mindset at that time, Sanjna said, “We can’t wander outside, so we may as well wander with our minds.” She joined her peers from Salesforce, MuleSoft, and Tableau with a common mission of using data to understand what was going on with COVID-19.
For API design, the best practice is to start with consumer needs and work back to the supporting data and functionality. However, in the unanticipated circumstances of the pandemic, urgency and volatility ruled out this approach. Instead, the team worked to collect as much data from as many sources as possible, and then make it consumable for the as-yet-unknown consumers. That meant ingesting APIs, CSVs, and PDFs from WHO, the New York Times, the COVID-19 Tracking Project, ECDC, and more. The data elements ranged from infection rates and hospitalizations to PPE and government policies. The team built a pipeline to acquire, cleanse, curate, and distribute this data through multiple channels (APIs, Tableau visualizations, the AWS data marketplace, and Salesforce’s Work.com) in a way that was usable by consumers ranging from college students to corporations to government agencies. The work paid off: the API has handled hundreds of thousands of requests per day and even survived a denial-of-service attack.
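To make the acquire, cleanse, and curate steps concrete, here is a minimal Python sketch of folding two differently shaped feeds into one common record. It is not the team's actual pipeline, which the episode describes as built on MuleSoft. The feed layouts, field names, the `CaseRecord` type, and the sample values are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    """Hypothetical common shape that every source is mapped into."""
    region: str
    date: str        # ISO 8601 date string
    confirmed: int
    source: str


def from_csv(text: str, source: str) -> list[CaseRecord]:
    """Acquire: parse a CSV feed with assumed columns region,date,cases."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        CaseRecord(r["region"].strip(), r["date"], int(r["cases"]), source)
        for r in rows
    ]


def from_json(text: str, source: str) -> list[CaseRecord]:
    """Acquire: parse a JSON feed with assumed keys country/reported_on/confirmed."""
    return [
        CaseRecord(item["country"].strip(), item["reported_on"],
                   int(item["confirmed"]), source)
        for item in json.loads(text)
    ]


def curate(records: list[CaseRecord]) -> list[CaseRecord]:
    """Cleanse and curate: drop negative counts, keep the first record
    seen for each (region, date), and return a stable ordering."""
    seen: dict[tuple[str, str], CaseRecord] = {}
    for rec in records:
        if rec.confirmed < 0:
            continue
        seen.setdefault((rec.region, rec.date), rec)
    return sorted(seen.values(), key=lambda r: (r.region, r.date))


# Made-up sample data, not real figures.
CSV_FEED = (
    "region,date,cases\n"
    "Region A,2020-04-01,10\n"
    "Region A,2020-04-01,10\n"
    "Region B,2020-04-01,-3\n"
)
JSON_FEED = '[{"country": " Region C ", "reported_on": "2020-04-01", "confirmed": "7"}]'

curated = curate(from_csv(CSV_FEED, "feed_a") + from_json(JSON_FEED, "feed_b"))
for rec in curated:
    print(rec)
```

In this sketch, each source gets its own small parser, while the cleansing rules live in one place and apply to every source equally.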
Stage #2: Building the team
Sanjna stressed the inspirational impact of the team, and especially the importance of having several dimensions of diversity. First of all, each of the constituent companies brought its own expertise: MuleSoft with data normalization and curation, Tableau with consumers’ data requirements, and Salesforce with healthcare domain expertise. In addition to these core team members, external subject matter experts from healthcare and government were consulted to ensure that data was interpreted correctly and that unconscious biases were overcome. Sanjna described an overriding principle of the effort as, “understanding that we might have some kind of bias in the interpretation of the numbers and ask for everyone’s help in figuring out what that meant.” Every effort was made to ensure the diversity of the team matched the diversity of the problem space.
The team was geographically and linguistically distributed as well. Regional understandings were needed to integrate the many data points into a canonical model that was accurate and understandable. The external domain experts played a crucial role in vetting this model, revising it repeatedly. To help consumers understand the API interactions, the team published the model. As Sanjna explains, “When you go to the library, you have to know the Dewey decimal system the first time to effectively find your book. We had to ensure that people could actually use these APIs and these visualizations to effectively find the data that they needed, but without giving them the Dewey decimal system.” At some points, the team even consulted with family and friends to further the universality of what they were developing.
Stage #3: Dealing with change
From the outset, the COVID-19 Data Platform team needed to deal with drastic and frequent change: shifting priorities on data types, changes to data source schemas, and the evolving needs of its diverse consumer base and distribution channels. The first way of handling this was to provide a stable base. As Sanjna says, “We wanted to build a foundation that other people could step on top of and do the things that they needed.” Being resilient to change required an architecture with clearly defined capabilities and points of demarcation. MuleSoft was used to ingest, cleanse, and normalize the data before propagating it to a unified data warehouse. The cleansed and normalized data was then made available to channels. MuleSoft was used as the API channel and Tableau as the visualization channel, with other access provided to Work.com and select partners. This API-led, decoupled architecture minimized the impact of changes such as back-end format revisions, canonical warehouse model updates, or the addition of new channels.
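The decoupling described above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the platform's real design: the `Warehouse` class, the adapter and channel functions, and every field name are hypothetical. The point it tries to show is that a change in an upstream format touches only one adapter, and that a new channel can be added without touching ingestion.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]  # canonical keys (assumed): region, date, confirmed


# Source adapters: the only code that knows upstream formats.
def adapt_feed_v1(raw: dict) -> Row:
    """Hypothetical original layout of an upstream feed."""
    return {"region": raw["Country"], "date": raw["Date"],
            "confirmed": int(raw["Cases"])}


def adapt_feed_v2(raw: dict) -> Row:
    """The same feed after a hypothetical upstream schema revision."""
    return {"region": raw["country_name"], "date": raw["report_date"],
            "confirmed": int(raw["case_count"])}


class Warehouse:
    """Stand-in for the unified store; holds canonical rows only."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def load(self, raw_rows: List[dict], adapter: Callable[[dict], Row]) -> None:
        self._rows.extend(adapter(r) for r in raw_rows)

    def query(self, region: str) -> List[Row]:
        return [r for r in self._rows if r["region"] == region]


# Channels: depend on the canonical model, never on a source format.
def api_channel(wh: Warehouse, region: str) -> dict:
    return {"region": region, "data": wh.query(region)}


def summary_channel(wh: Warehouse, region: str) -> str:
    rows = wh.query(region)
    total = sum(int(r["confirmed"]) for r in rows)
    return f"{region}: {total} confirmed across {len(rows)} reports"


# Made-up rows: one in the old layout, one in the revised layout.
warehouse = Warehouse()
warehouse.load([{"Country": "Region A", "Date": "2020-04-01", "Cases": "10"}],
               adapt_feed_v1)
warehouse.load([{"country_name": "Region A", "report_date": "2020-04-02",
                 "case_count": "12"}], adapt_feed_v2)

print(api_channel(warehouse, "Region A"))
print(summary_channel(warehouse, "Region A"))
```

Both channels return consistent results even though the two rows arrived in different upstream layouts, because only the adapters differ.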
Sanjna noted that constant communication was a key to handling the constant revisions. Continual questioning led to more accurate data and more consumable representations of that data. She stated that Murphy’s Law always comes true, and anticipating possible failures helped to mitigate their impact. At the same time, she cautioned against holding back from releasing things out of fear. “Sometimes you have to just let things go.” The evolvable approach taken by the team allowed them to navigate through periods of uncertainty, and provide a flexible yet usable data set available through multiple channels in multiple contexts.
There are many other insights to be gained from the episode. Give it a listen! Follow the podcast on SoundCloud or subscribe to our newsletter above to get summaries of the episodes.