In the first blog post of this two-part series, we reviewed how our data access layer was built and how multi-tenancy data was passed around using domains. We also hinted at how difficult this was to actually get off the ground.
We had to do some fairly deep code dives to get domains to work for our purposes, since we quickly discovered that requests’ domains were getting lost somewhere in our code paths. We started opening the hood on Node.js and the libraries that we use and, after a lot of debugging, found a pair of critical issues that will affect just about any real-world system.
Open Source Collaboration
First, the knex library, which we use to talk to our back-end databases, was not domain-compatible. Because it uses a connection pool, connections aren’t created in the scope of a domain, so each one must be explicitly attached to the active domain when it is acquired from the pool and detached when it is returned. We resolved this issue by adding domain support to the underlying generic pool library and submitting that patch back to the community.
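The attach-on-acquire, detach-on-release idea can be sketched roughly as follows. This is not the actual patch: `TinyPool` is a hypothetical stand-in for a real pool library, and `tenantId` is an illustrative property. The only real APIs used are Node.js' `domain.add()`, `domain.remove()`, `domain.run()`, and `process.domain`.

```javascript
const domain = require('domain');
const { EventEmitter } = require('events');

// Hypothetical minimal pool; a real pool also handles sizing, timeouts, etc.
class TinyPool {
  constructor(connections) {
    this.idle = connections.slice();
  }

  acquire() {
    const conn = this.idle.pop();
    // process.domain is the currently active domain, if any.
    // Explicitly bind the pooled connection to it.
    if (conn && process.domain) {
      process.domain.add(conn);
    }
    return conn;
  }

  release(conn) {
    // Detach before the connection goes back to the pool, so the next
    // request does not inherit the previous request's domain.
    if (conn.domain) {
      conn.domain.remove(conn);
    }
    this.idle.push(conn);
  }
}

// Usage: the connection was created outside any domain.
const pool = new TinyPool([new EventEmitter()]);
const requestDomain = domain.create();
requestDomain.tenantId = 'tenant-42'; // illustrative per-request data

let seenTenant;
requestDomain.run(() => {
  const conn = pool.acquire();
  seenTenant = conn.domain.tenantId; // reachable via the attached domain
  pool.release(conn);
});
```

Without the explicit `add` call, a connection created at pool start-up would have no link to the request's domain at all, which matches the symptom of domains getting lost.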
Second, the bluebird library, which we use extensively to support asynchronous programming patterns, holds a reference to Node.js’ setTimeout implementation. The deeply tricky part is that when you use domains, setTimeout is replaced with a new function. So suddenly you find yourself in a world where bluebird uses a different version of setTimeout than your own code!
We discussed this with the Node.js core team, who graciously fixed the issue: Node.js no longer overrides the actual setTimeout reference, but an internal reference instead.
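The stale-reference problem is easy to reproduce in miniature. The sketch below does not use domains or bluebird; it simulates the situation by replacing the global setTimeout with a hypothetical wrapper (`wrappedSetTimeout`) after a "library" has already captured the original.

```javascript
// A library captures the timer function once, at load time.
const capturedSetTimeout = global.setTimeout;

// Later, the global is replaced. This wrapper is a hypothetical stand-in
// for the replacement function described in the text.
const originalSetTimeout = global.setTimeout;
let wrappedCalls = 0;
global.setTimeout = function wrappedSetTimeout(fn, ms, ...args) {
  wrappedCalls += 1;
  return originalSetTimeout(fn, ms, ...args);
};

// Application code looks up the global and gets the replacement...
clearTimeout(setTimeout(() => {}, 0));
// ...but the library's saved reference bypasses it entirely.
clearTimeout(capturedSetTimeout(() => {}, 0));

// The two code paths are no longer calling the same function.
const sameFunction = capturedSetTimeout === global.setTimeout;

// Restore the original so the sketch has no lasting side effects.
global.setTimeout = originalSetTimeout;
```

Only the application's call passes through the wrapper, so any bookkeeping the replacement performs is silently skipped for timers scheduled by the library.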
A lot of work went into making domains work in our system: the two changes above required a lot of investigation and a lot of work with the open source community. And the approach did, in fact, finally work. But, in the end, we decided to swallow our pride and remove domains from our codebase. Instead, we manually pass around a context object which holds the information that we previously stored in our domain. This is a bulletproof approach: it carries no risk of us having missed some crucial detail within one library or another, and no risk of a new library dependency causing issues because it does not integrate properly with domains.
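As a rough illustration of explicit context passing, here is a minimal sketch. All names (`createContext`, `findUsers`, `buildQuery`, `tenantId`) are hypothetical and not taken from our codebase, and the query builder is a stand-in that only returns a description of the query instead of talking to a database.

```javascript
// Hypothetical context object carrying what used to live on the domain.
function createContext(tenantId) {
  return Object.freeze({ tenantId });
}

// Every layer takes the context as its first argument and forwards it.
function findUsers(ctx, filter) {
  return buildQuery(ctx, 'users', filter);
}

// Stand-in for the data access layer: scopes every query to the tenant.
function buildQuery(ctx, table, filter) {
  if (!ctx || !ctx.tenantId) {
    // Fail loudly instead of silently running an unscoped query.
    throw new Error('missing tenant context');
  }
  return { tenant: ctx.tenantId, table, filter };
}

// Usage: the request handler creates the context once and hands it down.
const ctx = createContext('tenant-42');
const query = findUsers(ctx, { active: true });
```

The extra parameter is visible noise in every signature, but a missing context shows up as an immediate error at the call site, with no dependence on how any third-party library schedules its callbacks.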
Switching to manually passed contexts has turned out to be less cumbersome than we expected, although certainly not as beautiful as the domain-based approach. The decision was further validated when the Node.js team began discussing deprecating domains.
The lesson we will take away from this is to avoid overly exotic constructs and to work with Node.js’ single-threaded grain, rather than against it. We would still love to see a context-like construct made available, but only one that does not require active support from third-party libraries to work properly.