GraphQL: Publish the Schema of Your Company, Not Your API

GraphQL is an interesting technology that promises a better developer and user experience, but its usefulness is easily mistaken for proof that it is an overall better technology than what came before it.

To get the most out of it, a specific organisational approach is needed. GraphQL works best when the whole organisation shares the same domain conventions. Domain Driven Design shines in a GraphQL environment, but changing a company’s mentality to embrace it takes a lot of effort.

In this article, I’ll describe how GraphQL differs from a more common REST API approach and how to ensure a successful coexistence or migration.

HTTP REST API

HTTP is the default way to expose a service over the internet. Practically every language and library understands it well, and its division into different verbs provides semantics for safe, idempotent and cacheable requests.

Let’s take a library API as an example. To add a new book or to publish a new review of a book, a REST API would look like this:

Method	Endpoint	Description
`POST`	`/books`	Create a new book
`POST`	`/books/:id/reviews`	Create a new review

This is fine in a typical CRUD-style API, but a GraphQL one can instead leverage the business’s domain naming:

type Mutation {
  # Input and return type names are illustrative
  publishBook(input: PublishBookInput!): Book!
  publishReview(input: PublishReviewInput!): Review!
}

Differences between REST and GraphQL

Your schema is your API

This is a significant shift in the way your endpoint exposes data. To ensure a well-structured schema, start from the business problems and build a cohesive, shared understanding of the business. Domain Driven Design is what allows an organisation to create that understanding and use it to define what the schema should look like.

Start by defining your domain events. What events happen in your business that are definable as a step forward? Create as many as you need. Once done, organise them in a timeline and group them by context. Each context will enable consistent use of the same words across all events, even when the same word means something different in a different context.

Some events will require business rules, and those rules will require data from internal or external sources. Define those rules and where the data is coming from. This will result in an action that will generate the new domain event.

This system will help define what structure your schema will have:

Internal resources are GraphQL Queries
External resources live outside of the schema and must be merged, together with the business rules, in the business layer logic
Actions are GraphQL Mutations, ready to be exposed on the schema
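
As a rough sketch of how an action ties rules and data together, the TypeScript below models publishing a review. Everything in it is hypothetical: the `Book` and `ReviewInput` shapes, the `isSpam` flag standing in for an answer from an external resource, and the `ReviewPublished` event name are assumptions made up for illustration.

```typescript
// Hypothetical shapes, invented for illustration only.
type Book = { id: string; title: string }; // internal resource (exposed as a Query)
type ReviewInput = { bookID: string; rating: number; text: string };
type ReviewPublished = { type: "ReviewPublished"; bookID: string; rating: number };

// The action (exposed as a Mutation): merges internal and external data,
// applies the business rules, and produces the new domain event.
function publishReview(
  input: ReviewInput,
  book: Book | undefined, // loaded from an internal source
  isSpam: boolean,        // answer from an external source, e.g. a moderation service
): ReviewPublished {
  if (!book) throw new Error(`Unknown book ${input.bookID}`);
  if (input.rating < 1 || input.rating > 5) {
    throw new Error("Rating must be between 1 and 5");
  }
  if (isSpam) throw new Error("Review rejected by moderation");
  return { type: "ReviewPublished", bookID: book.id, rating: input.rating };
}
```

The external data never appears on the schema; only the action and the internal resource would.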

Remember that your schema is an ever-evolving system. Embrace it, and don’t be afraid to change it when the changes improve your customers’ experience.

Common mistake when migrating from REST

Don’t simply move your CRUD verbs. GraphQL’s queries and mutations can use names that make sense to the business, not an API convention.

Consider a typical REST API:

Method	Endpoint	Description
`GET`	`/books`	Get all books
`GET`	`/books/:id`	Get a book
`POST`	`/books`	Create a new book
`POST`	`/books/:id/reviews`	Create a new review
`PATCH`	`/books/:id/reviews`	Update a review

A common first step is to use the same style that REST adopts:

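# Type names such as Book and ReviewInput are illustrative
type Query {
  getAllBooks: [Book!]!
  getBook(id: ID!): Book
}

type Mutation {
  createBook(input: BookInput!): Book!
  createReview(bookID: ID!, input: ReviewInput!): Review!
  updateReview(bookID: ID!, input: ReviewInput!): Review!
}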

There is no need to stick to REST verbs if they don’t make sense from a business perspective. It is better to use more straightforward and more effective names.
Updates are handled differently in GraphQL. Use one input object instead of splitting the request into an id and a body.

type Query {
  books: [Book!]!
  book(id: ID!): Book
}

type Mutation {
  # Include everything in one input object; bookID lives inside it.
  # Input type names are illustrative.
  createReview(input: CreateReviewInput!): Review!
  updateReview(input: UpdateReviewInput!): Review!
}

Versioning

In REST, new versions are needed to make clear what shape the returned data will have, but versioning is an anti-pattern in GraphQL. Multiple versions mean multiple schemas, which gets out of hand quickly. GraphQL’s way of dealing with change is to add new fields when necessary and mark the old ones as deprecated. There is no drawback to having many fields, as each client requests only the ones it needs.
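
A minimal sketch of that approach, in TypeScript: the schema gains a new field while the old one is marked with GraphQL’s built-in `@deprecated` directive, and both resolve from the same data. The `name` and `title` fields and the `BookRecord` shape are assumptions made up for this example.

```typescript
// Schema definition: `name` is the old field, `title` its replacement.
// @deprecated is built into GraphQL; the field names are illustrative.
const typeDefs = `
  type Book {
    id: ID!
    title: String!
    name: String! @deprecated(reason: "Use title instead.")
  }
`;

// Hypothetical shape of a book as stored by the business layer.
type BookRecord = { id: string; title: string };

// Resolver map in the shape most GraphQL servers accept: the deprecated
// field keeps working for old clients by returning the same data as the new one.
const resolvers = {
  Book: {
    title: (book: BookRecord): string => book.title,
    name: (book: BookRecord): string => book.title,
  },
};
```

Old clients keep querying `name` unchanged, while tooling that reads the schema can warn them to move to `title`.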

Authorisation

There are no particular differences here, but if the plan is to keep both REST and GraphQL endpoints, it is better to move the authorisation logic down into the business layer. This consolidates the checks in a single place and lets each endpoint care only about passing down the correct data.
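
The sketch below shows one way this could look, assuming a hypothetical `reviewer` role and simplified handler signatures rather than any real framework’s API. The check lives in one business-layer function, and both entry points only translate their inputs.

```typescript
// Hypothetical user and review shapes, for illustration only.
type CurrentUser = { id: string; roles: string[] };
type ReviewData = { bookID: string; text: string };

// Business layer: the single place where the authorisation rule lives.
function publishReview(user: CurrentUser | null, review: ReviewData): ReviewData {
  if (!user || user.roles.indexOf("reviewer") === -1) {
    throw new Error("Not authorised to publish reviews");
  }
  return review; // a real implementation would persist the review here
}

// REST handler: only maps the HTTP request onto the business call.
function restPostReview(
  user: CurrentUser | null,
  params: { id: string },
  body: { text: string },
): ReviewData {
  return publishReview(user, { bookID: params.id, text: body.text });
}

// GraphQL resolver: same business call, no duplicated checks.
const resolvers = {
  Mutation: {
    publishReview: (
      _parent: unknown,
      args: { input: ReviewData },
      context: { user: CurrentUser | null },
    ): ReviewData => publishReview(context.user, args.input),
  },
};
```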

Query Complexity

The freedom allowed by GraphQL is a double-edged sword. Clients can optimise their queries for their use case, but what they ask for in a single request might be onerous on your infrastructure. Before exposing your GraphQL endpoint to the public, load-test your clients’ use cases to decide how many records and how much nesting to allow.
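
One common guard is a limit on nesting depth. The following is a minimal sketch, assuming the query has already been parsed into a simplified tree; the `SelectionField` shape is hypothetical, and a real server would walk the AST produced by its GraphQL library instead.

```typescript
// A simplified, hypothetical view of a parsed query: each field may select sub-fields.
type SelectionField = { name: string; fields?: SelectionField[] };

// Depth of the deepest nesting below (and including) this field.
function depthOf(field: SelectionField): number {
  const children = field.fields || [];
  return 1 + children.reduce((max, child) => Math.max(max, depthOf(child)), 0);
}

// Reject queries nested more deeply than the limit found through load testing.
function assertWithinDepth(root: SelectionField, maxDepth: number): void {
  const depth = depthOf(root);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds the allowed maximum of ${maxDepth}`);
  }
}

// Equivalent of: books { id title reviews { id book { title } } }
const query: SelectionField = {
  name: "books",
  fields: [
    { name: "id" },
    { name: "title" },
    {
      name: "reviews",
      fields: [{ name: "id" }, { name: "book", fields: [{ name: "title" }] }],
    },
  ],
};
```

Here `depthOf(query)` is 4, so `assertWithinDepth(query, 3)` would throw while a limit of 4 would let the query through.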

Rate limiting

Rate limiting an API can also be challenging, because a single GraphQL request can be cheap or very expensive. Consider how each part of your schema responds to an increase in queries and implement business logic to prevent abuse.
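
As one possible shape for that logic, the sketch below charges each client a cost per query within a fixed time window instead of counting requests. The class name, the budget and the way costs are assigned are all assumptions; how to price each part of the schema is a decision for the business.

```typescript
// Fixed-window limiter that charges each client a cost per query.
// The budget and the costs are hypothetical numbers.
class CostRateLimiter {
  private windows = new Map<string, { startedAt: number; spent: number }>();
  private budget: number;
  private windowMs: number;

  constructor(budget: number, windowMs: number) {
    this.budget = budget;
    this.windowMs = windowMs;
  }

  // Returns true and records the cost if the client still has budget left.
  allow(clientId: string, cost: number, now: number = Date.now()): boolean {
    const current = this.windows.get(clientId);
    // Reuse the current window while it is still open, otherwise start a new one.
    const active =
      current && now - current.startedAt < this.windowMs
        ? current
        : { startedAt: now, spent: 0 };
    this.windows.set(clientId, active);
    if (active.spent + cost > this.budget) return false;
    active.spent += cost;
    return true;
  }
}

// Ten cost units per client per second.
const limiter = new CostRateLimiter(10, 1000);
```

A resolver or middleware could call `limiter.allow(clientId, cost)` before executing a query, with the cost derived from the same measurements used for query complexity.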

Caching

Unlike REST, caching in GraphQL doesn’t have a well-defined set of rules, and each client manages its own cache. If one or more services rely heavily on caching to function, think twice before moving them to GraphQL, as it is a delicate matter even when following best practices.

Client Side Caching

Client side caching, in clients like Apollo Client, relies on deduplication and normalisation to optimise the fetching and re-fetching of data. This system works similarly to an SQL database, leveraging the knowledge of the graph architecture and the information about the query to know what to load and what to skip.

Let’s assume this query:

query GetAllBooks {
  books {
    id
    title
  }
}

That returns this data:

{
  books: [
    {<book1>},
    {<book2>},
    {<book3>},
    {<...>},
  ]
}

By default, the normalisation process will split the data to allow it to be individually cached:

{
  id: 1,
  __typename: "Book",
  title: "Book1",
}

Apollo Client will use the __typename and id to create a unique pair that it will use to cache the object on the client, in a quick-access data structure, like this:

{
  "Book:1": { id: 1, title: "Book1" },
  "Book:<n>": { id: <n>, title: <title> },
}

Notes on client side caching

It’s important to have a unique identifier; without it, Apollo Client won’t be able to keep track of changes to the object.
The library creates an array that references the normalised data to keep the correct ordering in place.
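
To make the idea concrete, here is a simplified sketch of normalisation in TypeScript. It is not Apollo Client’s implementation; the `Entity` shape and the `normalise` function are hypothetical and only mirror the behaviour described above.

```typescript
// Hypothetical shape of an object returned by the server.
type Entity = { __typename: string; id: string | number; [field: string]: unknown };

// Stand-in for the quick-access store plus the ordered list of references.
type NormalisedResult = { store: Record<string, Entity>; refs: string[] };

// Split a list response into individually cached entries keyed by "__typename:id",
// keeping an array of keys so the original ordering is not lost.
function normalise(items: Entity[]): NormalisedResult {
  const store: Record<string, Entity> = {};
  const refs: string[] = [];
  for (const item of items) {
    const key = `${item.__typename}:${item.id}`;
    // Merging means a later result for the same object updates the one cached copy.
    store[key] = { ...store[key], ...item };
    refs.push(key);
  }
  return { store, refs };
}

const { store, refs } = normalise([
  { __typename: "Book", id: 1, title: "Book1" },
  { __typename: "Book", id: 2, title: "Book2" },
]);
```

Because every object is stored once under its key, an update to `Book:1` from any query is visible to every list that references it.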

Server Side Caching

Apollo Server supports server side caching. It can be applied to a type or a specific field by providing a maxAge value, which describes the number of seconds the cache will be considered valid.

These are the allowed properties to control the cache:

Name	Description
`maxAge`	Number of seconds the resource will be cached for. Defaults to 0.
`scope`	Can be PUBLIC or PRIVATE. Use PRIVATE to mark a response as specific to a single user. Defaults to PUBLIC.
`inheritMaxAge`	If set to true, the field inherits the maxAge value of its parent. Make sure not to provide maxAge if you plan to use inheritMaxAge.

These properties are set through a directive called @cacheControl. The previous query for books could be cached for one day by having a type Book with this value:

type Book @cacheControl(maxAge: 86400) {
  id: String!
  title: String!
}

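The caveat is that the response as a whole is only cached for as long as the field with the lowest maxAge value. For example, a query that returns the Review type below, which sets no maxAge and so defaults to 0, will not be cached even though it includes a Book: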

type Review {
  id: String!
  book: Book!
}

A few caching rules

Having maxAge set to 0 by default means nothing is cached out of the box. It is easier to start without cache and add it where needed, avoiding unexpected behaviours by caching too early.
If any field inside a query has a smaller maxAge value, the whole result will have that cache value associated with it.
If any field inside a query is set to PRIVATE, the whole response will be marked as PRIVATE.
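
The rules above can be sketched as a small function that combines the hints of every field in a query. This is a simplified model written for this article, not Apollo Server’s code; `CacheHint` and `overallPolicy` are hypothetical names.

```typescript
type CacheScope = "PUBLIC" | "PRIVATE";
type CacheHint = { maxAge: number; scope: CacheScope };

// Combine the hints of every field touched by a query into the policy for the
// whole response: the lowest maxAge wins, and a single PRIVATE field makes
// the response PRIVATE.
function overallPolicy(hints: CacheHint[]): CacheHint {
  if (hints.length === 0) return { maxAge: 0, scope: "PUBLIC" }; // nothing cached by default
  return {
    maxAge: Math.min(...hints.map((hint) => hint.maxAge)),
    scope: hints.some((hint) => hint.scope === "PRIVATE") ? "PRIVATE" : "PUBLIC",
  };
}

// A Review left at the default (maxAge 0) next to a Book cached for one day.
const policy = overallPolicy([
  { maxAge: 86400, scope: "PUBLIC" },
  { maxAge: 0, scope: "PUBLIC" },
]);
```

In this example the Review’s default of 0 wins, so the response is not cached at all despite the Book’s one-day hint.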

Here, I’ve discussed statically set cache. Note that there is a way to set the cache dynamically. This is only a small introduction to a deep topic.

Conclusion

This article has barely scratched the surface of what GraphQL can do for an organisation. Being able to organise and find data in the same way the business talks about it enables a more cohesive discussion between engineering and product. That benefit should not be considered an extra; it is the core of what GraphQL offers.