Christian Giacomi

An attempt at a small developer blog

REST Design principles

Design principles when developing REST APIs

Posted — Nov 2, 2019   Category — architecture 

Most of the developers I talk to eventually end up talking about REST, and they all want to build the best REST API possible. This of course leads to the usual discussions about what is a good REST API, or what should you do to ensure that your API is RESTful. Sometimes tensions run high and so I decided I wanted to write a post to address this.

This post is meant as a way to gather what I have come to see as the primary design principles behind REST. I will of course update the document to correct any gaps on my part or to add new ideas that could help anyone out there who is in doubt.

What is REST

REST or ‘Representational State Transfer’ is an architectural style that defines how a Web service should be designed and behave. REST defines a set of constraints which must be followed in order to provide greater interoperability between systems on the internet. Web services that follow these principles are said to be RESTful.

The term ‘Representational State Transfer’ was introduced by Roy Fielding in 2000 in chapter 5 of his doctoral dissertation.

One of the fundamental pillars behind REST is the formalization of the client-server architectural style as a constraint. Clients and servers follow the separation of concerns between “user interface concern and data storage concern”, which allows client and server systems to evolve independently of each other.

I will try to summarize each constraint and then show how this applies to HTTP and the way we can apply it to develop web services.

Client-server

The client-server constraint is based on the separation of concerns between the user interface on one side and data storage on the other. This separation allows us to improve the portability of the user interface across multiple platforms and to improve the scalability of the server components by simplifying them.

This separation, furthermore, allows for the different concerns to evolve independently of each other, which is vital to support internet scale requirements of multiple organizational domains.

Stateless

The stateless constraint postulates that the communication between client and server be stateless and each request from the client must contain all the required information for the server to perform the requested operation. Each operation on the server must be completely independent of any other request that may have preceded it.

No client context has to be stored on the server between requests, thus the client is solely responsible for storing and handling all application state.

By removing client state from the server we are able to simplify server logic and scale better. Furthermore requests on the server can occur in parallel without having to worry about interaction semantics.
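A minimal sketch of what this means for a request handler. The `ARTICLES` list, the `VALID_TOKENS` set and the request shape (a dict with `token`, `page` and `page_size`) are all hypothetical and do not come from any real framework. The point is only that every call carries its own credentials and paging position, so the server keeps nothing between calls:

```python
# Hypothetical data standing in for the server's resource store.
ARTICLES = [f"article-{n}" for n in range(1, 8)]

# Hypothetical credentials; a real API would verify a signed token instead.
VALID_TOKENS = {"secret-token"}


def list_articles(request: dict) -> dict:
    """Handle one request using only the data carried by that request."""
    if request.get("token") not in VALID_TOKENS:
        return {"status": 401, "body": None}
    # The paging position comes from the client, not from a server-side session.
    page = int(request.get("page", 1))
    page_size = int(request.get("page_size", 3))
    start = (page - 1) * page_size
    return {"status": 200, "body": ARTICLES[start:start + page_size]}
```

Because the handler reads nothing but its argument, the same request can be sent to any server instance, in any order, and get the same answer.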

Cache

The cache constraint is meant as a way to improve network efficiency by explicitly or implicitly labeling response data as cacheable or non-cacheable. Should a response be labeled as cacheable then the client is allowed to reuse the response data for equivalent requests.

By adding caching we aim to partially or completely eliminate interactions between client and server. The reduced number of interactions and reduced latency will greatly increase user perceived performance as well as efficiency and scalability of the different concerns.

One drawback of this is the decreased reliability of the cached data, as it might differ greatly from the server stored data. A second drawback is the added complexity of invalidating the cache when data changes.
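As a rough illustration of both the benefit and the staleness drawback, here is a small client-side cache. It assumes the server labels responses with a `Cache-Control` header and only understands `max-age` and `no-store`; the `ClientCache` class and the `fetch` callable are hypothetical names, and the current time is passed in explicitly to keep the sketch easy to follow:

```python
import re
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[Dict[str, str], str]  # (headers, body)


def max_age(headers: Dict[str, str]) -> Optional[int]:
    """Return the freshness lifetime in seconds, or None if not cacheable."""
    cache_control = headers.get("Cache-Control", "")
    if "no-store" in cache_control:
        return None
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


class ClientCache:
    def __init__(self, fetch: Callable[[str], Response]):
        self._fetch = fetch
        self._entries: Dict[str, Tuple[float, str]] = {}  # url -> (expires_at, body)

    def get(self, url: str, now: float) -> str:
        entry = self._entries.get(url)
        if entry and now < entry[0]:
            return entry[1]  # still fresh: no interaction with the server
        headers, body = self._fetch(url)
        age = max_age(headers)
        if age is not None:
            self._entries[url] = (now + age, body)
        else:
            self._entries.pop(url, None)
        return body
```

Until the entry expires, the client keeps returning the stored body even if the resource has changed on the server, which is exactly the reliability trade-off described above.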

Uniform interface

The uniform interface constraint is the fundamental pillar of a REST system. The constraint aims to simplify and decouple the architecture and allow each part to evolve independently. This constraint contains four principles:

1) Identification of resources

The key abstraction in REST is a Resource. A resource is anything that can be named (using Nouns) and maps to a set of entities. Each resource is unique and independent so that it can be identified and referenced using a resource identifier. The representation of a resource in the data stores is conceptually separate from the representations returned to the client.

2) Manipulation of resources through representations

Representations consist of data and metadata describing a resource, and for purposes of message integrity a representation can also contain metadata to describe the metadata.

Representations allow a client to hold enough information to display, modify or delete the resource.

3) Self-descriptive messages

Each message contains enough information to describe how it should be processed. A client receiving a response is fully capable of parsing the message and applying any business logic to its contents.

4) Hypermedia as the engine of application state

Starting from an initial URI, the client should be able to use the server-provided links to dynamically discover all the available actions and resources. The actions and resources available to the client may change dynamically as the client processes the returned responses, and a client should have no need for hard-coded information regarding the structure, actions and dynamics of the returned resource.
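To make the hypermedia principle a little more concrete, here is a sketch of a client that navigates by link relation instead of building URIs itself. The `_links` key, the relation names and the `follow` helper are illustrative assumptions (the layout is loosely inspired by HAL-style responses), not something REST itself prescribes:

```python
# Hypothetical representation returned by GET /articles/123456.
# The "_links" key and the relation names are illustrative assumptions.
article = {
    "id": 123456,
    "title": "REST Design principles",
    "_links": {
        "self": {"href": "/articles/123456"},
        "author": {"href": "/authors/42"},
        "collection": {"href": "/articles"},
    },
}


def follow(representation: dict, relation: str) -> str:
    """Return the URI the server advertised for the given link relation."""
    links = representation.get("_links", {})
    if relation not in links:
        # The action is simply not available in the current state.
        raise KeyError(f"server did not advertise a '{relation}' link")
    return links[relation]["href"]
```

A call such as `follow(article, "author")` yields the URI to request next, so the only thing the client has to know in advance is the name of the relation.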

Layered system

Layered systems allow for the introduction of proxies, gateways and firewalls at various points in the communication between client and server without changing the interfaces between the different components. This layered architecture allows for improved performance and scalability. REST's stateless constraint and self-descriptive messages enable the addition of intermediate processing.

Code-On-Demand

Code-on-demand is the ability of a client to receive code from the server, which enables the client to correctly process data without knowing the semantics of the data.

REST and HTTP

Having seen the six constraints that REST postulates, we can now look at how they are applied when dealing with HTTP. So let's start by looking at resources and move on from there.

Resources

As previously stated, everything is a Resource in REST. When starting to write a new API you should therefore begin by identifying the resources that your API will be dealing with. Anything that can be given a name is a good candidate for a resource. Remember that resources should be Nouns, not Verbs. Here are some examples:

Article
User
Author
Image
Bird

Uniform URI

Many developers make the mistake of designing their routes to manipulate their resources using verbs, which is typical of RPC. When designing a REST service one should focus on Nouns instead. For example, if one of the resources was an ‘Article’, then to get all articles you should have a route that looks like this

/articles

and NOT

/getAllArticles

The HTTP methods, mentioned further down, will help to distinguish the action you are performing on the ‘article’ resource. Also the convention is to use the plural form ‘articles’. This will make more sense once we look at the HTTP methods.

When referencing the entire collection of ‘Articles’ the URI would look like this

/articles

While to reference a specific article in the articles collection, you would have a route that looks like

/articles/123456789

So when talking about uniform URIs one should realize that the URIs of the resources in the API become uniform because they follow the same semantics. If we take the articles resource a little further and add ‘Author’ we can see that the resulting routes in our API follow the same pattern.

/articles
/articles/123456  

/authors
/authors/123456

This makes it easier for clients, since they do not need to wade through tons of documentation to understand how to use the API. All they need to know is which resources are exposed.
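The pattern is regular enough to be expressed in a couple of lines. The helper names below, `collection_uri` and `item_uri`, are made up for this sketch:

```python
def collection_uri(resource: str) -> str:
    """URI of a whole collection, e.g. /articles."""
    return f"/{resource}"


def item_uri(resource: str, identifier: int) -> str:
    """URI of a single resource inside its collection, e.g. /articles/123456."""
    return f"{collection_uri(resource)}/{identifier}"
```

The same two functions cover ‘articles’, ‘authors’ and any resource added later, which is the uniformity a client can rely on.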

Filtering, Sorting, Paging…

When looking at uniform URIs in a RESTful service, one thing you might be tempted to do when sending a request is to put control information such as filtering, sorting and/or paging in the HTTP request headers, or worse, in the request payload. Unfortunately this is not the best way to do so.

The better approach, which also plays nicely with the hypermedia constraint, is to include this control information as part of the route, in a query string. Like so:

/articles?author=chris

And although this is perfectly fine, it is not very flexible. The filter above would only work for exact matches. The problem is that URL parameters only have a key and a value, but filters can be composed of three components: the property name, the operator and the filter value.

LHS Brackets

One solution to the problem is the use of LHS brackets which are composed in this way

author[eq]=chris
status[ne]=past
count[gte]=200

With LHS brackets we can have as many operators as we need [lte], [gte], [eq], etc. The result would look something like this.

/articles?author[eq]=chris

LHS brackets can be a little more complicated to implement on the server but provide greater flexibility.
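A possible server-side parser for this syntax, sketched with the standard library only. The `Filter` tuple, the `parse_filters` function and the set of allowed operators are assumptions made for this example, and a key without brackets is treated as `eq`:

```python
import re
from typing import List, NamedTuple
from urllib.parse import parse_qsl, urlsplit


class Filter(NamedTuple):
    field: str
    operator: str
    value: str


# Matches keys such as "author[eq]"; the bracketed operator is optional.
_KEY = re.compile(r"^(?P<field>\w+)(?:\[(?P<op>\w+)\])?$")

# Assumed operator set; extend it to whatever the API supports.
ALLOWED_OPERATORS = {"eq", "ne", "gt", "gte", "lt", "lte"}


def parse_filters(url: str) -> List[Filter]:
    """Split every query parameter into (field, operator, value)."""
    filters = []
    for key, value in parse_qsl(urlsplit(url).query):
        match = _KEY.match(key)
        if match is None:
            raise ValueError(f"malformed filter key: {key!r}")
        operator = match.group("op") or "eq"
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator!r}")
        filters.append(Filter(match.group("field"), operator, value))
    return filters
```

A `ValueError` raised here would naturally translate into a 400 (BAD REQUEST) response. Values stay strings in this sketch; converting `200` to a number is left to whatever code knows the type of the field.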

HTTP methods

When creating a REST API we already mentioned that you need to use nouns, not verbs. So how do we express the intention of the route? How do we express the action to be carried out on the resource? That is where HTTP methods come into play. The important thing is to use the correct method for the correct action.

GET

The GET method is used ONLY to retrieve the resource and NOT to modify it or create it. GET endpoints are said to be safe endpoints as they do not change the state of the resource. Furthermore GET endpoints have to be idempotent. This means that making multiple identical requests must produce the same result each and every time until a change to the resource is carried out on the server.

GET requests should not contain a body, but should send any parameters needed for the request in a query string. An example of this could be filter or paging parameters.

If the resource is found in the API then the endpoint must return an HTTP response code 200 (OK) together with the body containing the resource in the format the client asked for.

If the resource is not found in the API then the endpoint must return an HTTP response code 404 (NOT FOUND).

If the API determines that the GET request is not properly formed then it must return an HTTP response code of 400 (BAD REQUEST).
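The three outcomes above can be sketched as a single function. The in-memory `ARTICLES` dict and the `get_article` name are hypothetical, and the function returns a plain (status code, body) pair rather than using any particular web framework:

```python
from typing import Optional, Tuple

# Hypothetical in-memory store standing in for the API's data layer.
ARTICLES = {123456: {"id": 123456, "title": "REST Design principles"}}


def get_article(raw_id: str) -> Tuple[int, Optional[dict]]:
    """Return (status code, body) for GET /articles/{raw_id}."""
    try:
        article_id = int(raw_id)
    except ValueError:
        return 400, None  # the request is not properly formed
    article = ARTICLES.get(article_id)
    if article is None:
        return 404, None  # well-formed request, but no such resource
    return 200, article  # safe: nothing on the server was changed
```

Calling it any number of times with the same argument leaves `ARTICLES` untouched and returns the same pair, which is what makes the endpoint both safe and idempotent.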

POST

The POST method is used ONLY to create a new resource. POST endpoints are not considered safe because they result in a change in the collection of resources in the API. POST requests are not idempotent because sending two identical POST requests will result in the creation of two different resources containing the same data, but with different ids.

If the resource has been created in the API then the endpoint must return an HTTP response code of 201 (CREATED) and contain the newly created resource in the response message and have a Location header.

If the POST endpoint returns a response before the resource, identifiable by a URI, has been created (for example because the work has been queued), then the endpoint must return an HTTP response code of either 200 (OK) or 202 (ACCEPTED).

Responses to POST are not cacheable, unless they contain the appropriate Cache-Control or Expires header fields.

If the API determines that the POST is not properly formed then it must return an HTTP response code of 400 (BAD REQUEST).
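A sketch of the synchronous case, assuming a hypothetical in-memory `ARTICLES` store, a made-up `create_article` function and a payload whose only required field is `title`. It returns a (status, headers, body) triple:

```python
import itertools
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store and id generator.
ARTICLES: Dict[int, dict] = {}
_next_id = itertools.count(1)


def create_article(payload: dict) -> Tuple[int, Dict[str, str], Optional[dict]]:
    """Return (status, headers, body) for POST /articles."""
    title = payload.get("title")
    if not isinstance(title, str) or not title:
        return 400, {}, None  # the request is not properly formed
    article_id = next(_next_id)
    article = {"id": article_id, "title": title}
    ARTICLES[article_id] = article
    # 201 plus a Location header pointing at the newly created resource.
    return 201, {"Location": f"/articles/{article_id}"}, article
```

Sending the same payload twice produces two articles with different ids and different `Location` values, which is why POST is not idempotent.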

PUT

The PUT method is used primarily to replace an existing resource or if the resource does not exist the API may choose to create a new resource. PUT endpoints are not considered safe, but they are considered idempotent.

If the resource is found in the API and successfully modified then the endpoint must return an HTTP response code of 200 (OK) or 204 (NO CONTENT).

If the resource is not found in the API, and the API does not create resources through PUT, then the endpoint must return an HTTP response code 404 (NOT FOUND).

If the resource is not found in the API but is created by a PUT endpoint then the endpoint must return an HTTP response code of 201 (CREATED) and contain the newly created resource in the response message.

The difference between POST and PUT is the target of the URI. POST URIs target the resource collection while PUT URIs target the individual resource. And of course POST endpoints are not meant to modify an existing resource.
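A minimal sketch of a PUT handler for an API that chooses to create missing resources. The `ARTICLES` store and the `put_article` function are hypothetical names for this example:

```python
from typing import Dict, Tuple

# Hypothetical in-memory store standing in for the API's data layer.
ARTICLES: Dict[int, dict] = {}


def put_article(article_id: int, payload: dict) -> Tuple[int, dict]:
    """Return (status, body) for PUT /articles/{article_id}."""
    existed = article_id in ARTICLES
    # Replace the whole resource: fields missing from the payload are dropped.
    # The id always comes from the URI, never from the payload.
    ARTICLES[article_id] = {**payload, "id": article_id}
    return (200 if existed else 201), ARTICLES[article_id]
```

The first call creates the article and answers 201; repeating it answers 200 and leaves the stored resource exactly as it was, which is the idempotency PUT promises. An API that prefers not to create resources this way would return 404 when `existed` is false.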

PATCH

The PATCH method is used primarily to partially update an existing resource. If the resource does not exist, the API may or may not choose to create a new one. PATCH endpoints are considered neither safe nor idempotent.

If the resource is found in the API and successfully modified then the endpoint must return an HTTP response code of 200 (OK) or 204 (NO CONTENT).

If the resource is found in the API but cannot be modified due to conflicting state then the endpoint must return an HTTP response code of 409 (CONFLICT).

If the resource is not found in the API then the endpoint must return an HTTP response code 404 (NOT FOUND).
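A sketch of these three outcomes, for an API that does not create resources through PATCH. The `ARTICLES` store, the `patch_article` function and the rule that a published article can no longer be retitled are all invented for the example:

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store standing in for the API's data layer.
ARTICLES: Dict[int, dict] = {
    123456: {"id": 123456, "title": "Draft title", "status": "draft"},
}


def patch_article(article_id: int, changes: dict) -> Tuple[int, Optional[dict]]:
    """Return (status, body) for PATCH /articles/{article_id}."""
    article = ARTICLES.get(article_id)
    if article is None:
        return 404, None
    # Made-up business rule used to illustrate a state conflict.
    if article["status"] == "published" and "title" in changes:
        return 409, None
    # Only the supplied fields change; everything else is kept as it was.
    article.update({k: v for k, v in changes.items() if k != "id"})
    return 200, article
```

Unlike PUT, the payload only has to contain the fields that change.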

DELETE

The DELETE method is used to delete an existing resource identified by the URI. DELETE endpoints are idempotent since calling delete on the same resource results in the same outcome.

If the resource is found in the API and successfully deleted then the endpoint must return an HTTP response code of 200 (OK) if the response includes an entity describing the status, or 202 (ACCEPTED) if the action has been queued or 204 (NO CONTENT) if the action has been performed but the response does not include an entity.

If the resource is not found in the API then the endpoint must return an HTTP response code 404 (NOT FOUND).
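A short sketch of the synchronous case with no response entity, again using a hypothetical `ARTICLES` store and a made-up `delete_article` function:

```python
from typing import Dict

# Hypothetical in-memory store standing in for the API's data layer.
ARTICLES: Dict[int, dict] = {123456: {"id": 123456, "title": "REST Design principles"}}


def delete_article(article_id: int) -> int:
    """Return the status code for DELETE /articles/{article_id}."""
    if article_id not in ARTICLES:
        return 404
    del ARTICLES[article_id]
    return 204  # performed, and the response carries no entity
```

Note that the second identical call answers 404 rather than 204. The endpoint is still idempotent, because idempotency is about the state left on the server, and that state is the same after one call or after many.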

Different representations

Resources can have multiple representations, and different clients might expect the resource to be represented differently according to their needs. A client can therefore send a request for a resource to be represented in a specific form. This is what is called Content Negotiation.

Content negotiation can be done using the Accept request header (the Content-Type header describes the body that is actually being sent, not the one being asked for)

Accept: application/json

or via URL patterns

https://api.domain.com/v1/articles/123456.xml
https://api.domain.com/v1/articles/123456.json
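A simplified sketch of header-based negotiation on the server, assuming the client states its preference in the `Accept` request header. The `RENDERERS` table, the `negotiate` and `represent` functions and the choice of JSON as the fallback for `*/*` are assumptions of this example; the parser only understands media types and `q` values, and the XML renderer does no escaping:

```python
import json
from typing import Optional, Tuple


def render_json(article: dict) -> str:
    return json.dumps(article, sort_keys=True)


def render_xml(article: dict) -> str:
    # Naive rendering for illustration only: values are not escaped.
    fields = "".join(f"<{k}>{v}</{k}>" for k, v in sorted(article.items()))
    return f"<article>{fields}</article>"


RENDERERS = {"application/json": render_json, "application/xml": render_xml}


def negotiate(accept: str) -> Optional[str]:
    """Pick the supported media type with the highest q value, or None."""
    candidates = []
    for part in accept.split(","):
        media_type, _, params = part.strip().partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                quality = float(value)
        candidates.append((quality, media_type.strip()))
    # sorted() is stable, so ties keep the order used in the header.
    for quality, media_type in sorted(candidates, key=lambda c: -c[0]):
        if quality <= 0:
            continue
        if media_type in RENDERERS:
            return media_type
        if media_type == "*/*":
            return "application/json"
    return None


def represent(article: dict, accept: str) -> Tuple[int, Optional[str], Optional[str]]:
    """Return (status, content type, body) for the requested representation."""
    media_type = negotiate(accept)
    if media_type is None:
        return 406, None, None  # 406 (NOT ACCEPTABLE)
    return 200, media_type, RENDERERS[media_type](article)
```

The same resource is stored once and rendered on demand, which mirrors the separation between a resource and its representations.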

References

Architectural Styles and the Design of Network-based Software Architectures.

The following documents together define the HTTP/1.1 protocol:
