REST – REpresentational State Transfer, as defined in Roy Fielding’s thesis – is not a protocol, a standard, an API, a technology or a product. You cannot buy it, you can’t download and install it, and you don’t need to poke another hole in your firewall for it. Instead, REST lives at a level completely decoupled from any specific technology, protocol or product, because REST is merely an architectural style: a set of constraints and principles that should influence your system architecture decisions. When building a distributed system, you will end up with a RESTful architecture if you don’t just do whatever is technically possible, but instead voluntarily restrict yourself to what is allowed within the narrow confines of those constraints.
While working on a RESTful web-services platform, I came across a lot of discussions devoted to these REST constraints. Some argued their point merely by quoting from Mr. Fielding’s thesis; others, more ominously, warned about ‘bad things’ that could happen if the constraints are not followed to the letter.
But engineering is all about making the right compromises. Therefore, it is important to know the potential benefits of following a REST constraint and the issues you might have to deal with should you choose not to. Whenever we are told to restrict ourselves, it is only fair to ask “why?” That is the only way we as engineers, designers and architects can make an informed decision. So, in this small series of articles, I will explore those constraints with a firm focus on the actual, real-world benefits of each one.
Prologue: REST is not HTTP
There is one very important point to be made before we dive into the benefits of REST constraints: REST is not HTTP and HTTP is not REST. Using HTTP does not mean that your communication patterns are RESTful, and neither does a RESTful architecture mandate or even imply the use of HTTP. It just so happens that HTTP is particularly well-suited to implementing RESTful systems. That should not surprise us, since Roy Fielding was also a key contributor to the HTTP/1.0 and HTTP/1.1 RFCs. As a consequence, almost all real-world RESTful systems are based on HTTP, and in this article, too, we will use HTTP as the illustrative example.
Constraint: Standard methods and use of resources
Consequences if not followed:
- Inability to utilize standard Internet infrastructure elements properly (caches, proxies, load balancers, etc.)
- More costly, complex, fragile systems due to need to re-invent the wheel (for example, custom caching or security layers)
- Possibly lower adoption rate of APIs due to complex use, need for custom client software, possibly not suitable for easy sharing
This particular constraint is, in my opinion, “the big gun” of REST: It is the most visible constraint, and one of the hardest for practitioners with an SOA or RPC background to wrap their heads around.
The constraint here is that we restrict ourselves to using only standard methods (called ‘verbs’ in REST parlance) and to using those standard methods on particular, identified resources (which REST calls ‘nouns’). This is in sharp contrast to a traditional RPC or WS-* SOA approach, where arbitrary, custom actions are routinely defined and accessed.
To clarify, let’s see how this maps to HTTP: In HTTP we have the HTTP methods, such as ‘GET’, ‘POST’, ‘PUT’, ‘DELETE’ (and a few others), which take on the role of the REST verbs. And the URL path component usually can be equated with the REST noun, the resource we are operating on.
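As a small illustration of that mapping, the following Python sketch splits a request into its verb and noun. The helper name `verb_and_noun` is hypothetical, and the set of verbs is limited to the four methods named above, although HTTP defines a few others.

```python
from urllib.parse import urlsplit

# The four methods named in the text; HTTP defines a few others as well.
STANDARD_VERBS = {"GET", "POST", "PUT", "DELETE"}


def verb_and_noun(method: str, url: str) -> tuple[str, str]:
    """Split a request into its REST verb (HTTP method) and noun (URL path)."""
    verb = method.upper()
    if verb not in STANDARD_VERBS:
        raise ValueError(f"not one of the standard methods listed here: {method}")
    return verb, urlsplit(url).path


print(verb_and_noun("get", "http://example.com/customer/123"))
# ('GET', '/customer/123')
```

A custom action such as `GetCustomer` has no place in this picture: it is neither one of the standard verbs nor an identified resource.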
Why does this constraint exist? What are the benefits of using only standard methods? In a RESTful system the entire infrastructure – including intermediaries such as load-balancers, caches, proxies, etc. – becomes an active participant in the communication. If you don’t use standard methods, then those intermediaries – over which you often do not have any control – cannot actively participate in the communication. Why is this an issue?
Consider SOAP as an example of how to do it in a non-RESTful manner. In SOAP you send a detailed request description in XML to the server. The description may look like this (taken from the SOAP Wikipedia page and slightly shortened):
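As a stand-in for such a request, here is a minimal Python sketch that assembles one as raw text, so it can be inspected without a server. It is not the original listing: the host name, the `m` namespace URI and the stock symbol are assumptions for illustration, and the helper `build_soap_request` is hypothetical.

```python
# Hypothetical values: host, namespace and stock symbol are illustrative only.
HOST = "www.example.org"
NAMESPACE = "http://www.example.org/stock"
SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def build_soap_request(stock_name: str) -> str:
    """Return the raw text of an HTTP request carrying a SOAP envelope."""
    body = (
        '<?xml version="1.0"?>\n'
        f'<soap:Envelope xmlns:soap="{SOAP_ENV}">\n'
        "  <soap:Body>\n"
        f'    <m:GetStockPrice xmlns:m="{NAMESPACE}">\n'
        f"      <m:StockName>{stock_name}</m:StockName>\n"
        "    </m:GetStockPrice>\n"
        "  </soap:Body>\n"
        "</soap:Envelope>\n"
    )
    headers = (
        "POST /InStock HTTP/1.1\r\n"
        f"Host: {HOST}\r\n"
        "Content-Type: application/soap+xml; charset=utf-8\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
    )
    return headers + body


print(build_soap_request("IBM"))
```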
The HTTP POST method is used for every request, while a custom method is encoded within the XML body (‘GetStockPrice’ in this case). The URL path (‘/InStock’) is identifying a service endpoint, but is not used to identify the resource (the stock). There are now several issues with this approach:
Caching proxies generally do not cache the responses to POST requests, because POST is by convention used to change state on the server. Caches therefore cannot short-cut that client/server communication and always need to pass those requests through to the server. This particular request is ideally suited for the HTTP GET method (which is cacheable), but in SOAP that method is not used. To make matters worse, caches normally look only at the HTTP method and URL, not at the request body, when looking for a cache hit.
As a consequence, ‘the Internet’ and all its standard infrastructure elements would like to help us scale and perform better (by caching responses, for example), but are prevented from doing so. In effect, we are working against the Internet and not with it, thus forgoing the scalability benefits of standard Internet elements. With SOAP, for example, you need to implement your own custom caching layers.
In a RESTful system, however, standard off-the-shelf proxies, caches, load-balancers, etc. can be dropped into place wherever you see fit, generally without the need for any customization: if you restrict yourself to standard HTTP and its proper use, all those standard infrastructure elements, which understand only standard HTTP, instantly become useful for your application.
An additional problem is that with SOAP you cannot bookmark the URL, share it or click on it, and thus cannot use it in a simple manner. Instead, you need to have detailed knowledge about the implemented service and the information that needs to be sent to the server. This is a direct consequence of not using the URL path to identify the resource in question and of not restricting yourself to standard HTTP methods. If you restrict yourself to standard methods plus a resource identifier for your HTTP request, then a simple URL becomes an extremely effective encapsulation of all you need to know in order to access a resource. Thus, a bookmark (or a link emailed to you) is actually something usable.
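The caching behaviour described here can be sketched with a toy cache in Python. It is a simplified model rather than a real proxy: it stores only GET responses, keys them on the URL alone, and ignores the caching headers that real caches honor. The class `ToyCache`, the `origin` function and the `/stock/IBM` path are assumptions for illustration.

```python
class ToyCache:
    """Toy model of a shared HTTP cache; real caches also honor caching headers."""

    def __init__(self, origin):
        self.origin = origin      # callable(method, url, body) -> response
        self.store = {}           # cached GET responses, keyed by URL only
        self.origin_hits = 0      # how often the origin server was contacted

    def request(self, method, url, body=None):
        if method == "GET" and url in self.store:
            return self.store[url]            # cache hit: origin is not contacted
        self.origin_hits += 1
        response = self.origin(method, url, body)
        if method == "GET":
            self.store[url] = response        # the request body is never inspected
        return response


def origin(method, url, body):
    """Hypothetical origin server."""
    return f"price data for {url}"


# SOAP style: always POST, with the real operation hidden in the body.
soap_cache = ToyCache(origin)
for _ in range(3):
    soap_cache.request("POST", "/InStock", body="<GetStockPrice>IBM</GetStockPrice>")
print(soap_cache.origin_hits)  # 3

# Resource style: GET on a URL that identifies the stock.
rest_cache = ToyCache(origin)
for _ in range(3):
    rest_cache.request("GET", "/stock/IBM")
print(rest_cache.origin_hits)  # 1
```

The three POST requests all reach the origin, while the repeated GET is answered from the cache after the first request.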
Don’t assume that moving the method out of the request body and using GET instead of POST fixes the problem. The Internet is teeming with ill-conceived APIs of that sort, which encode all the request details as parameters in a URL. Assume you design an HTTP-based API to manage your customer list. You have requests like this:
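The exact URL layout is an assumption here. The Python sketch below builds two such requests against a hypothetical `http://example.com/api` endpoint with a `method` query parameter; only the method names `GetCustomer` and `DeleteCustomer` are taken from the discussion in this article.

```python
from urllib.parse import urlencode

BASE = "http://example.com/api"   # hypothetical endpoint


def rpc_url(method_name: str, **params) -> str:
    """Encode a custom method name and its arguments as query parameters."""
    return f"{BASE}?{urlencode({'method': method_name, **params})}"


get_url = rpc_url("GetCustomer", id=123)
delete_url = rpc_url("DeleteCustomer", id=123)
print(get_url)     # http://example.com/api?method=GetCustomer&id=123
print(delete_url)  # http://example.com/api?method=DeleteCustomer&id=123
```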
At first, this looks like an improvement: The URL can certainly be bookmarked. You can just click on it and get back something useful. Caching proxies can even be of help here, since the URL contains all they need to match up request and cached content. So what’s not to like about this?
In the end, we are still just using HTTP to tunnel our RPC requests! The standard HTTP methods are dismissed, since we are now always using GET. Instead, we opt to encode the custom method in the URL. Now assume the first request (‘GetCustomer’) is seen by a caching proxy. The response to that request is cached. But now the second request comes along. The caching proxy speaks HTTP, not English, and definitely does not know what all your different method names mean. Therefore, the cache has no idea what ‘DeleteCustomer’ actually does. As a result, if the cache then sees a ‘GetCustomer’ request again, it does not know that the customer no longer exists and will return stale results to the client. You can try to work your way around that by carefully managing the caching-related HTTP headers, but why make it difficult for yourself?
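The stale-result problem can be reproduced with a toy model in Python. Everything here is an assumption for illustration: the URL layout, the in-memory `customers` data, the `rpc_server` function and the `UrlKeyedCache` class, which stands in for a proxy that sees only GET requests and their URLs.

```python
customers = {123: "Alice"}   # hypothetical server-side data


def rpc_server(url: str) -> str:
    """Hypothetical origin server that reads the operation from the URL."""
    if "method=DeleteCustomer" in url:
        customers.pop(123, None)
        return "deleted"
    return customers.get(123, "404 Not Found")


class UrlKeyedCache:
    """Toy cache: stores GET responses by URL and knows nothing else."""

    def __init__(self):
        self.store = {}

    def get(self, url: str) -> str:
        if url not in self.store:
            self.store[url] = rpc_server(url)   # cache miss: ask the origin
        return self.store[url]


cache = UrlKeyedCache()
first = cache.get("http://example.com/api?method=GetCustomer&id=123")
cache.get("http://example.com/api?method=DeleteCustomer&id=123")
second = cache.get("http://example.com/api?method=GetCustomer&id=123")
print(first, second, customers)  # Alice Alice {}
```

The second `GetCustomer` call still returns the customer although the server has already deleted it, because nothing in the delete request tells the cache that the two URLs are related.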
In a RESTful system these requests would look like this:
http://example.com/customer/123 (request uses HTTP GET method)
http://example.com/customer/123 (request uses HTTP DELETE method)
The URL stays the same in both cases. Since DELETE is a standard HTTP method, a caching proxy knows that its copy of the result for this URL should be invalidated. The cache behaves correctly by default. The same goes for updates to the customer entity: While our non-RESTful API would have used something like an ‘UpdateCustomer’ method encoded in the URL, in a RESTful system you merely use the HTTP PUT method. PUT is also understood and dealt with correctly by caching proxies.
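A toy model in Python shows why the standard methods help. It is a sketch under simplifying assumptions, not how any particular proxy is implemented: the cache drops its stored copy of a URL whenever it forwards PUT, POST or DELETE for that URL, while real caches apply further conditions such as checking the response status. The names `origin`, `MethodAwareCache` and the in-memory `customers` data are hypothetical.

```python
customers = {"/customer/123": "Alice"}   # hypothetical server-side data


def origin(method: str, path: str, body=None) -> str:
    """Hypothetical origin server using standard methods on resources."""
    if method == "GET":
        return customers.get(path, "404 Not Found")
    if method == "PUT":
        customers[path] = body
        return "200 OK"
    if method == "DELETE":
        customers.pop(path, None)
        return "204 No Content"
    return "405 Method Not Allowed"


class MethodAwareCache:
    """Toy cache: serves stored GET responses, drops them on unsafe methods."""

    UNSAFE = {"PUT", "POST", "DELETE"}

    def __init__(self):
        self.store = {}

    def request(self, method: str, path: str, body=None) -> str:
        if method == "GET":
            if path not in self.store:
                self.store[path] = origin(method, path)
            return self.store[path]
        response = origin(method, path, body)
        if method in self.UNSAFE:
            self.store.pop(path, None)   # invalidate the cached copy for this URL
        return response


cache = MethodAwareCache()
before = cache.request("GET", "/customer/123")
cache.request("PUT", "/customer/123", body="Alice B.")
updated = cache.request("GET", "/customer/123")
cache.request("DELETE", "/customer/123")
after = cache.request("GET", "/customer/123")
print(before, "|", updated, "|", after)  # Alice | Alice B. | 404 Not Found
```

Because the method and the URL together tell the cache what happened to the resource, no application-specific knowledge is needed to keep the cached copy correct.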
Summary for the ‘standard methods and use of resources’ constraint
It is important to realize that many of the Internet’s infrastructure elements are active. They are either already deployed in ISPs or data centers or they are readily available for deployment in your own enterprise network in the form of excellent open source and commercial projects. These infrastructure elements know how to deal with HTTP correctly. Allowing this infrastructure to do its job means taking advantage of the inherent capabilities of the Internet, resulting in simpler, more scalable and easier to adopt applications and APIs. Restricting yourself to standard methods and resources allows these standard tools to work out of the box, without customization. That in turn allows your organization to react more quickly to changing requirements and modify server setups and system architectures, should the need arise.
If you choose to design your system in a non-RESTful manner, you are deliberately not allowing the Internet to work for you to its fullest potential. In effect, you either degrade the otherwise powerful, active infrastructure elements to mere routers, or you might even be forced to add hacks and workarounds to prevent them from doing what they do best, while adding layers of custom code to perform what the Internet already provides as inherent capabilities.
While the constraint for standard methods and use of resources is just one of the REST constraints, it is one of the most important ones to understand. It can take a while to change your mindset from RPC (encapsulated data and custom methods everywhere) to the way REST sees the world (small number of standard methods, data open in the form of resources). However, if you want to achieve true scalability, easier uptake of your API and organizational agility, it is well worth understanding where this constraint comes from.
Since this article is already quite long, we will discuss the remaining REST constraints – such as statelessness, HATEOAS and media types – in the next installment of this article series.