A key element of any healthy API program is the ability to upgrade and migrate existing services in your ecosystem without causing fatal service disruptions. It takes a concerted effort to safely and successfully complete API migrations and, in my experience, organizations that can consistently upgrade their running systems share a common set of skills and employ similar techniques.
API migration principles
When migrating existing services and APIs to new technologies or architectures there are a handful of common skills and principles your organization needs to follow. The four I’ve seen consistently across all types of companies are: 1) one brick at a time, 2) refactor your landscape, 3) interfaces are forever, and 4) instant reversibility.
One brick at a time
Only change a single aspect of your ecosystem at a time. For example, if you need to refactor some of the code in an existing service, add new features to that service, and convert its output from XML to JSON, that is at least *three* sprint-release cycles. Changing one thing at a time makes it easier to complete a sprint, improves the likelihood that your release will succeed, and increases the chances you’ll be able to quickly solve problems that may crop up along the way.
Refactor your landscape
Essentially, you need to take the same techniques used to upgrade *code* (facades, the strangler pattern, and refactoring) and apply them to your application network. Refactoring allows you to change the “inside” of a component or subsystem without affecting the “outside.” The key to successful refactoring is making sure no one “outside” the system knows about it.
Interfaces are forever
Once you publish an API, you can’t easily “take it away.” That interface should be treated as something that will last forever. If you need to improve an interface, you can release a new one (running side-by-side with the existing one) and direct API consumers to use the new interface when they are ready to make the change. By adopting this “forever” principle, you’re assured of never breaking an existing API consumer when you make changes within your ecosystem.
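To make the side-by-side idea concrete, here is a minimal sketch. It assumes version-prefixed paths, which is only one of several ways to run two interfaces at once; the handler names, paths, and payload shapes are hypothetical.

```python
# Hypothetical handlers: names, paths, and payloads are illustrative only.
def get_customer_v1(customer_id: str) -> dict:
    """The originally published interface. It stays available and unchanged."""
    return {"id": customer_id, "name": "Ada Lovelace"}


def get_customer_v2(customer_id: str) -> dict:
    """The improved interface, released alongside v1 rather than replacing it."""
    return {"id": customer_id, "givenName": "Ada", "familyName": "Lovelace"}


# Both interfaces are registered at the same time, so existing consumers
# keep working and move to v2 only when they are ready.
INTERFACES = {
    "/v1/customers": get_customer_v1,
    "/v2/customers": get_customer_v2,
}


def handle(path: str, customer_id: str) -> dict:
    """Dispatch a request to whichever published interface the consumer chose."""
    return INTERFACES[path](customer_id)
```

A consumer built against /v1/customers never sees the new field names unless it chooses to call /v2/customers.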
Support instant reversibility
Successful migration projects have the ability to instantly reverse a production release. There are going to be times when you put an update into the system and something unexpected happens. In those moments, you need to be able to quickly and safely reverse the update and return the system to its previous health. That means restoring not just functionality but also data storage. Hint: the smaller the release, the easier it is to reverse.
Once you’re confident your team has mastered these skills, you are ready to start your API migration in earnest by applying what is called the S*T*A*R migration pattern.
The S*T*A*R migration pattern
A good way to deal with IT migrations is to have a repeatable process or pattern that you can teach to migration teams. With a well-known process, you have the ability to provide consistent guidance across the entire organization and create tracking methods to improve the observability of various migration efforts within your company. You also have the power to measure the effectiveness of your migration work and, whenever needed, modify the pattern based on the collective experience that builds up from teams working the process.
To that end, here’s a straightforward pattern that you can use as a starter for creating your own teachable, trackable, and measurable API migration process. It’s called the S*T*A*R pattern, an acronym that stands for Stabilize, Transform, Add, and Repeat.
Stabilize
The first step in a safe and successful API migration is to stabilize the existing traffic in your ecosystem. That means standing up “pass-through” proxies inside your ecosystem in front of whatever services you plan to update. Just like code facades that act as entry-points to functionality in a component, these stabilizer proxies become key entry-points into your API landscape.
Ideally, you should be able to add these proxies without disturbing any API consumer or provider in your system. However, you might need to update API consumers with new URLs that point to the stabilizer proxy. This change should happen only once: when you first install the pass-through proxy.
And it is important that you implement a simple pass-through proxy. Don’t try to make any changes to the request or response information at this time (see “One Brick at a Time” above). That way, the only thing that is changing is the URL, nothing else.
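As a rough illustration of what "simple pass-through" means, the sketch below models a service as a function from request to response and wraps it in a proxy that changes nothing. The Request and Response types, legacy_service, and make_stabilizer are all hypothetical names; a real stabilizer would do the same thing at the HTTP layer, typically with an API gateway or reverse proxy.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    body: bytes = b""


@dataclass(frozen=True)
class Response:
    status: int
    body: bytes = b""


# In this sketch a service is anything that turns a Request into a Response.
Service = Callable[[Request], Response]


def legacy_service(request: Request) -> Response:
    """Stand-in for the existing service that will sit behind the proxy."""
    return Response(200, b"<order id='17'/>")


def make_stabilizer(upstream: Service) -> Service:
    """Return a pass-through proxy: same request in, same response out."""
    def proxy(request: Request) -> Response:
        # Deliberately no rewriting of the request or the response.
        return upstream(request)
    return proxy


stabilizer = make_stabilizer(legacy_service)
```

From the consumer's point of view, calling stabilizer is indistinguishable from calling legacy_service directly, which is exactly the property you want at this stage.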
Finally, you can use this stabilizer pattern on any portion of your API ecosystem. Place it in front of just one server, in front of a collection of servers, etc. And, if you wish, you can create stabilizer clusters to make sure you maintain a robust and resilient API platform.
Once you have your stabilizer facade in place, you’re ready for the next step: transforming the services behind the proxy.
Transform
Once you have placed a stabilizer proxy between all API consumers and the service you wish to update, you can start to safely transform that service without breaking any API consumers. This transformation can be releasing a new version of an existing service or breaking an existing monolithic component into smaller, more scalable microservices.
For example, if you want to refactor an existing service (e.g. replace an old component written in C++ with a new one written in Java) you can:
a) Code the new version.
b) Deploy that new version as a side-by-side component (i.e. both old and new versions running in the ecosystem at the same time).
c) After you complete your tests for the new component, update the routing (at the stabilizer proxy) to point all API consumer traffic to the new version of the component.
If you want to break up a monolithic component, you just need to release small portions of the existing monolith (using the strangler pattern mentioned earlier) and, as you release each microservice, update the proxy routes to point to the new component. You can do this step-by-step until all the features of the monolith are released as microservices. Again, this should never cause any API consumers any problems.
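One way to picture this step-by-step carving is a route table at the proxy that sends everything to the monolith by default and overrides one path prefix at a time. This is a sketch under assumptions: the StranglerRoutes class, the prefixes, and the internal URLs are hypothetical, and real proxies offer their own routing configuration.

```python
class StranglerRoutes:
    """Route table that defaults to the monolith and carves out prefixes."""

    def __init__(self, monolith: str) -> None:
        self._monolith = monolith
        self._carved_out = {}  # path prefix -> microservice target

    def carve_out(self, prefix: str, microservice: str) -> None:
        """Point one slice of the monolith's paths at a new microservice."""
        self._carved_out[prefix.rstrip("/")] = microservice

    def resolve(self, path: str) -> str:
        """Return the target for a path, preferring the longest matching prefix."""
        matches = [
            prefix for prefix in self._carved_out
            # Match on whole path segments so /orders does not capture /orders-archive.
            if path == prefix or path.startswith(prefix + "/")
        ]
        if not matches:
            return self._monolith
        return self._carved_out[max(matches, key=len)]


routes = StranglerRoutes("http://monolith.internal")
routes.carve_out("/orders", "http://orders.internal")
```

Each later release adds one more carve_out call. Paths that have not been carved out yet keep resolving to the monolith, so consumers see no change.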
Of course, if any of this transformation work goes badly, you can use your instant reversibility skills to quickly switch the proxy routing from the newly released component back to the one that was already running in production. And that’s a reminder that, even when you release new services, keep the old ones around for a while in case you need them.
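The switch and its reversal can be sketched as a route that remembers its previous target. The SwitchableRoute class and the target URLs are hypothetical, and the sketch covers routing only; restoring data storage is a separate concern that it does not address.

```python
class SwitchableRoute:
    """A proxy route that can be pointed at a new target and reverted."""

    def __init__(self, target: str) -> None:
        self.active = target
        self._previous = None

    def switch_to(self, new_target: str) -> None:
        """Send traffic to the new component, remembering the old one."""
        self._previous = self.active
        self.active = new_target

    def roll_back(self) -> None:
        """Return traffic to the component that was running before the switch."""
        if self._previous is None:
            raise RuntimeError("no previous target to roll back to")
        self.active, self._previous = self._previous, None


route = SwitchableRoute("http://orders-cpp.internal")
route.switch_to("http://orders-java.internal")  # release the new version
route.roll_back()                               # something went wrong: revert
```

The rollback only works because the old component is still deployed and reachable, which is why the old services need to stay around for a while.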
Now that your team has experience transforming existing services, you’re ready for the next step: safely adding new functionality to your API landscape.
Add
After you have stable proxies in your ecosystem and you’ve had a chance to learn how to transform existing services behind those proxies without disrupting existing API consumers, you can finally start to add new features and functionality to your API platform. That includes adding new features to existing components, releasing entirely new components into your system, and even creating new service aggregation and orchestration components to make it easier for API consumers to get the functionality they need.
The recommended process for adding new functionality to your API landscape has three steps. First, build the new component. Second, release that component into your system without routing any production traffic to it, so you can run a final set of production-level tests. Third, add new routes to your proxy facade that make the new functionality available in production for API consumers.
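The second and third steps can be sketched as a small gate: the component is already deployed but has no route, a set of checks runs against it directly, and the route is added only if every check passes. The release function, the check names, and the URLs are hypothetical; the checks here are placeholders for whatever production-level tests your team runs.

```python
from typing import Callable, Dict, List

# A check takes the component's address and reports whether it passed.
Check = Callable[[str], bool]


def release(routes: Dict[str, str], path: str, target: str,
            checks: Dict[str, Check]) -> List[str]:
    """Expose target at path only if every check passes.

    Returns the names of the failed checks (empty when the route was added).
    """
    failed = [name for name, check in checks.items() if not check(target)]
    if not failed:
        # Only now does production traffic start reaching the new component.
        routes[path] = target
    return failed


proxy_routes: Dict[str, str] = {}
failures = release(
    proxy_routes,
    "/loyalty",
    "http://loyalty.internal",
    {"health": lambda target: True, "smoke": lambda target: True},
)
```

If any check fails, the route table is left untouched and consumers never see the new component.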
It is a good idea to slowly roll out new features in your API platform instead of doing it all in one big shot. For example, once a new component is up and running, make it available to a select set of teams — a small group that can help you test edge cases and work out any remaining bugs. Then, as you gain confidence in the resilience of your new component, you can open it up to more and more traffic until it is available for the entire enterprise.
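A gradual rollout like this can be sketched as a routing decision with two dials: a list of pilot teams and a percentage of everyone else. The function names and identifiers are hypothetical, and hashing the consumer ID is just one common way to keep each consumer's assignment stable between requests.

```python
import hashlib
from typing import Set


def bucket(consumer_id: str) -> int:
    """Map a consumer to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(consumer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def use_new_component(consumer_id: str, team: str,
                      pilot_teams: Set[str], rollout_percent: int) -> bool:
    """Decide whether this consumer's traffic goes to the new component."""
    if team in pilot_teams:
        # Pilot teams always get the new component so they can test edge cases.
        return True
    # Everyone else is admitted gradually as the percentage is raised.
    return bucket(consumer_id) < rollout_percent
```

Starting with a set of pilot teams and rollout_percent at 0, then raising the percentage toward 100, matches the progression described above. Setting it back to 0 is a quick way to pull traffic away again.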
And, while you are working out the final bugs in your new component, you can always use instant reversibility to undo recent changes and regain a stable ecosystem.
Repeat
The last step in the S*T*A*R pattern is to simply repeat all the above work at some new location in your API platform. It is a good idea to start with a small, rather trivial portion of your API ecosystem with one team and learn to work through the S*T*A*R steps there before tackling more mission-critical portions of your network.
Also, as you gain skills you can add more teams and expand the footprint of your migration plan accordingly. This process of building up your experience piece-by-piece is a great way to approach IT updates in general.
Making it observable
You can track your teams’ S*T*A*R progress using dashboards and metrics display screens. How many teams are engaged in some form of S*T*A*R work? How long does each phase take? Why are some teams faster than others? Are there some teams that seem to be lagging behind? How far along are your teams in completing their S*T*A*R journey?
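As a starting point for such a dashboard, a couple of these questions can be answered from a simple log of phase records. Everything in this sketch is hypothetical, including the record layout, the team names, and the dates, which are made-up sample data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample records: (team, phase, started, finished or None).
RECORDS = [
    ("payments", "stabilize", date(2024, 1, 8), date(2024, 1, 19)),
    ("payments", "transform", date(2024, 1, 22), None),
    ("catalog", "stabilize", date(2024, 1, 15), date(2024, 2, 2)),
]


def active_teams(records):
    """Teams that currently have a phase in progress."""
    return {team for team, _, _, finished in records if finished is None}


def average_days_per_phase(records):
    """Average duration in days of each phase, using completed phases only."""
    durations = defaultdict(list)
    for _, phase, started, finished in records:
        if finished is not None:
            durations[phase].append((finished - started).days)
    return {phase: mean(days) for phase, days in durations.items()}
```

Comparing each team's durations against the per-phase average is one simple way to spot teams that may need help.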
By tracking and publishing the progress of S*T*A*R teams within your company you can celebrate success, head off challenges, and generally spread knowledge and understanding up and down the entire organization.
Aiming for the S*T*A*Rs
Once you know what you want to change in your system, you can plan out the right stabilize, transform, add, and repeat strategy for that change. The pattern is not limited to API migrations: you can apply the same repeatable process to infrastructure changes such as moving from on-premises to the cloud, from VMs to containers, from compiled components to serverless functions, and so on.
Successfully implementing the S*T*A*R pattern at your company gives you a solid set of skills to handle all sorts of change in the future. To learn more about API best practices, check out MuleSoft’s API Strategy resources.