In a blog post entitled “The 4 P’s of API Governance,” MuleSoft’s Matt McLarty shared that many people think of governance as a four-letter word. When discussing fundamental data principles to support, no one seems to bat an eye at adopting the “Data is an Asset” principle. It’s so fundamental that The Open Group Architecture Framework (TOGAF) includes it in its extensive example set of architecture principles and aligns it with two other data principles: Sharing and Accessibility.
But get the stakeholders of your program to stick around for the second half of a governance meeting and, together, you will likely have several four-letter words to choose from. The reality is that the governance landscape becomes increasingly complicated as teams take on new initiatives to modernize key line-of-business systems with APIs, cloud migration, and data migration.
For data governance to be successful, it needs to leverage the right combination of people, processes, and technology. Aligning to a “Data is an Asset” architecture principle means implementing a data governance strategy that protects, identifies, reuses, documents, and unifies the organization’s most valuable data. In this blog, I will share nine important steps to build a solid data governance strategy with an API-led connectivity approach.
Step 1: Protect your data
Protecting an asset seems like an obvious first step. That said, in the recent Netflix docuseries Heist we discover that millions of dollars in Kentucky bourbon barrels were left unsecured, and the results weren’t pretty. If our data is an asset, then we need to protect it. Otherwise, like in Heist, things will go wrong.
APIs can help a data governance program build security into data management. API policies, authentication, authorization, and encryption can do much of the heavy lifting here. If you follow the MuleSoft-recommended API-led connectivity approach, these controls can be applied to every Experience layer API, using API Manager to apply policies, authorize consumers, and enforce connectivity via two-way Transport Layer Security (TLS).
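To make that concrete, here is a minimal sketch, from the consumer’s point of view, of what a call to a governed Experience layer API might look like once two-way TLS and a client ID enforcement policy are in place. The endpoint URL, certificate file names, and credential header names are placeholders, not values from any particular deployment; your API Manager policy configuration determines the exact contract.

```python
import requests

# Hypothetical Experience layer API endpoint protected by API Manager policies
API_URL = "https://api.example.com/experience/orders/v1/orders"

# Two-way TLS: the client presents its own certificate and verifies the server's
CLIENT_CERT = ("client.crt", "client.key")   # client certificate and private key (placeholders)
CA_BUNDLE = "company-ca.pem"                 # CA bundle used to verify the server (placeholder)

# Client ID enforcement: credentials issued when the consumer was approved in API Manager
headers = {
    "client_id": "YOUR_CLIENT_ID",
    "client_secret": "YOUR_CLIENT_SECRET",
}

response = requests.get(API_URL, headers=headers, cert=CLIENT_CERT, verify=CA_BUNDLE)
response.raise_for_status()
print(response.json())
```

The point is not the ten lines of code; it is that every consumer of the asset goes through the same authenticated, encrypted, policy-governed front door.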
Step 2: Identify your most valuable data
Identify the single source of truth across your key data entities and lock it in. Spend some time determining which data entities provide the most value first. If you are a trading organization, you look to your trades, your counterparties, and your financial instruments. If you supply electricity to homes, your customer meters, meter data, and outage information may be the most valuable data. And if you are a purveyor of bourbon, then your bourbon inventory and supply chain are key.
Often companies find, through this process, that their key data is being updated, and sometimes created, in multiple systems by different people. This results in different versions of the truth. Lack of access control is often cited as the reason behind these multiple versions of the data. APIs can enable access control to ensure a single source of truth for your data entities.
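As an illustration of the idea rather than a prescription, the sketch below shows a toy write path in which only consumers approved by the governance program may modify a key entity. In a MuleSoft deployment this enforcement would typically sit in an API Manager policy in front of a System API rather than in application code; the routes, client identifiers, and in-memory store here are made up for illustration.

```python
from flask import Flask, request, jsonify, abort

app = Flask(__name__)

# Hypothetical in-memory stand-in for the system of record
customers = {}

# Consumers approved by the governance program to create or update customer records
WRITE_APPROVED_CLIENTS = {"crm-experience-api", "onboarding-process-api"}

@app.route("/customers/<customer_id>", methods=["PUT"])
def upsert_customer(customer_id):
    # Reject writes from any consumer not explicitly approved by data governance
    client_id = request.headers.get("client_id", "")
    if client_id not in WRITE_APPROVED_CLIENTS:
        abort(403, description="Client is not approved to modify customer data")
    customers[customer_id] = request.get_json()
    return jsonify(customers[customer_id]), 200

@app.route("/customers/<customer_id>", methods=["GET"])
def get_customer(customer_id):
    # Reads are open to any authenticated consumer; writes funnel through one governed path
    if customer_id not in customers:
        abort(404)
    return jsonify(customers[customer_id])
```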
Step 3: Monitor data reuse
Monitor how often your most valuable data is reused. An API can help share this data, making it available to multiple consumers. Sharing was another one of our desired principles to be supported by data governance. But just how often is this data being used to come up with some new product, analyze profitability, or locate supply chain issues?
Using API management reporting capabilities such as Anypoint Visualizer can give us insight into how many other components depend on a given System API. You should expect the usage of, and dependency on, these key data System APIs to grow over time as more and more line-of-business applications look to access this key data in the standard, secure, validated way approved by the data governance policies.
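If your platform lets you export API analytics events, even a small script can turn them into a reuse metric for the governance dashboard. The sketch below assumes a hypothetical CSV export with api_name and client_application columns; the real field names depend on your analytics tooling.

```python
import csv
from collections import Counter

# Hypothetical export of API analytics events downloaded from your API
# management platform; the column names are assumptions for illustration.
usage_by_consumer = Counter()

with open("api_analytics_export.csv", newline="") as f:
    for row in csv.DictReader(f):
        if row["api_name"] == "customers-system-api":
            usage_by_consumer[row["client_application"]] += 1

# A growing list of distinct consumers is a healthy sign that governed data is being reused
print(f"{len(usage_by_consumer)} consumers depend on customers-system-api")
for consumer, calls in usage_by_consumer.most_common():
    print(f"  {consumer}: {calls} calls")
```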
Step 4: Develop documentation
Develop your data dictionary with complementary API documentation. There are many ways to enable this communication, and developing and maintaining a data dictionary should be near the top of the list. The data dictionary documents the overall structure of the data, which can inform stakeholders, including users and developers, about proper usage.
API developers and system integrators can help improve the vocabulary by using Anypoint Exchange. Exchange provides the ability to share assets, resources, and documentation easily across your organization. Pro tip: be sure to establish clear standards and expectations for the documentation included in Exchange.
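One lightweight way to keep the data dictionary and the API documentation speaking the same language is to give each governed field a structured entry that both can reference. The fields and values below are purely illustrative, not a standard.

```python
from dataclasses import dataclass, field

@dataclass
class DataDictionaryEntry:
    """One governed field, documented once and referenced from the API documentation."""
    name: str
    definition: str
    data_type: str
    source_of_truth: str          # the System API that owns this field
    steward: str                  # the accountable data steward
    allowed_values: list = field(default_factory=list)

# Example entry for the electricity-supplier scenario from Step 2 (illustrative values)
meter_status = DataDictionaryEntry(
    name="meter_status",
    definition="Operational state of a customer electricity meter",
    data_type="string",
    source_of_truth="meters-system-api",
    steward="grid-operations",
    allowed_values=["ACTIVE", "INACTIVE", "FAULTED"],
)
```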
Step 5: Create an API-enabled application network
Begin to design and build out an API-enabled application network. The value of the application network comes with the reuse of data across applications while maintaining the integrity and quality of the data. Creating connectivity where it did not exist before begins to eliminate the need for swivel-chair integration and also has the potential to reduce replication and duplication of data.
Step 6: Enable your citizen integrators
Enable certain roles to build citizen integrations with the right tools. The IT demand curve laid against the IT capacity curve almost always shows an environment in which IT struggles just to keep up with operations and support. Necessity is the mother of invention, and many business users have found ways to compensate for the resource discrepancy. An early mentor of mine once said, “All businesses really want is Excel.” The reality is that Excel and shadow IT are endemic to most businesses. I’ve seen spreadsheets driving payroll, maintaining reference data, and calculating business investment deals. This raises the questions: what is the quality and timeliness of the data? How repeatable is that process, and does it scale? But neither IT nor data governance programs have been able to completely quell these rogue spreadsheet uprisings.
Over the past few years, as the term “citizen integration” began to take shape, many in IT (myself included) turned up their noses and said, “That certainly won’t work here; what would happen if we allowed them to build an integration flow, never mind the fact it’s already going on?”
Citizen integrators have long integrated data by their own means to achieve data accessibility, but API-enabled application networks and new tools can now provide better guidance and structure for this access. With the introduction of tools such as MuleSoft Composer, the right question perhaps should be, “Will this tool allow better data access, speed up data integration, and do all of this in a well-managed, secure environment?” The answer with MuleSoft Composer is, “Yes.”
Step 7: Have a unification plan
Moving to cloud-first strategies and architectures can have a big impact on existing data governance programs, and the transition to the cloud can leave some data in limbo. Where the system of record once sat in an enterprise application in the data center, a new initiative may have moved some of the data to a SaaS application on a cloud platform, splitting the system of record. The API-led connectivity approach can bring these two data sets together and present them as a single source. Work with the data stewards to document the paths and any special business rules for handling the historical data.
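A Process layer API is a natural place to stitch the split system of record back together. The sketch below assumes two hypothetical System APIs, one in front of the legacy on-premises application and one in front of the SaaS platform, plus a business rule agreed with the data stewards about which system owns which fields; all names and fields are illustrative.

```python
import requests

# Hypothetical System API endpoints; in practice these would be governed Mule System APIs
LEGACY_API = "https://api.example.com/system/legacy-erp/v1/customers/{id}"
SAAS_API = "https://api.example.com/system/crm-cloud/v1/customers/{id}"

def get_unified_customer(customer_id: str, headers: dict) -> dict:
    """Present one customer record even though the system of record is split."""
    legacy = requests.get(LEGACY_API.format(id=customer_id), headers=headers).json()
    saas = requests.get(SAAS_API.format(id=customer_id), headers=headers).json()

    # Example business rule agreed with the data stewards: historical billing data stays
    # authoritative in the legacy system, while current contact data lives in the SaaS CRM.
    return {
        "customerId": customer_id,
        "billingHistory": legacy.get("billingHistory", []),
        "contact": saas.get("contact", {}),
        "segment": saas.get("segment") or legacy.get("segment"),
    }
```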
Step 8: Prepare for the impact of big data
As organizations add data sources from the cloud, social media feeds, IoT devices and endpoints, and more, this influx of data can put a strain on data stewards and trustees. The System layer in the API-led connectivity approach is set up to address that strain and easily unlock the data for the Process layer. To develop System layer interfaces in front of this data, we need to consider performance, pagination, aggregation, and availability. Taking advantage of API streaming capabilities, like MuleSoft’s Amazon Kinesis connector, allows you to manage this data with the same policies and oversight described in the first four steps.
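For high-volume sources, pagination is one of the simplest System layer design decisions that protects both the backend and the consumers. Here is a minimal sketch of a paginated client; the endpoint, parameter names, and page size are assumptions to be settled in the API specification.

```python
import requests

# Hypothetical System API in front of a high-volume data source; the URL,
# parameter names, and page size below are assumptions for illustration.
READINGS_API = "https://api.example.com/system/meter-data/v1/readings"
PAGE_SIZE = 500

def iter_readings(headers: dict):
    """Yield meter readings one page at a time so consumers never pull the full set at once."""
    offset = 0
    while True:
        response = requests.get(
            READINGS_API,
            headers=headers,
            params={"limit": PAGE_SIZE, "offset": offset},
        )
        response.raise_for_status()
        page = response.json().get("items", [])
        if not page:
            break
        yield from page
        offset += PAGE_SIZE
```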
Step 9: Enable accessibility of your data
Utilize Anypoint DataGraph to enhance the accessibility of the data. As we connect the dots in our application network with data and services from inside and outside the organization, guided by the data dictionary, thorough specifications, and good API documentation in Exchange, we can define a unified schema. That schema can become a valuable tool for data stewards to explore the data across the application network, potentially providing a mechanism to identify data quality issues, master data issues, and even accessibility issues.
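Because DataGraph exposes the unified schema through GraphQL, a data steward (or a script run on their behalf) can traverse related entities that live behind different System APIs with a single query. The endpoint, credentials, and field names below are placeholders for illustration only; your unified schema defines the real types and relationships.

```python
import requests

# Placeholder endpoint for the unified schema; the URL, credentials,
# and field names are illustrative assumptions, not a real deployment.
DATAGRAPH_URL = "https://example-org.example.com/datagraph/graphql"

# One query can traverse entities owned by different System APIs, which makes
# gaps and inconsistencies easy for data stewards to spot.
query = """
query CustomerWithOrders($id: ID!) {
  customer(id: $id) {
    name
    segment
    orders {
      orderId
      status
      total
    }
  }
}
"""

response = requests.post(
    DATAGRAPH_URL,
    json={"query": query, "variables": {"id": "C-1001"}},
    headers={"client_id": "YOUR_CLIENT_ID", "client_secret": "YOUR_CLIENT_SECRET"},
)
response.raise_for_status()
print(response.json()["data"])
```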
Get started on your data governance journey
In 2006, Sir Tim Berners-Lee, the inventor of the World Wide Web, said, “Data is a precious thing and will last longer than the systems themselves.” The broader context was that customers should be given control of their data. Recognizing the value of your data, knowing that data will outlive us all, and giving data owners more control can all be catalysts for ensuring data management technology supports a data governance strategy that you L-O-V-E. Now there’s a four-letter word you didn’t think could come to mind with data governance.