Schema management - how and why?

Peter Nycander

When you come to GraphQL from other kinds of APIs and attempt your first deprecation, you quickly realize that it can be pretty difficult to track usage of what you deprecated in a decent way. One of the main selling points of GraphQL is that you no longer have to maintain a versioned API, so how can we actually achieve this if we can't track deprecations? Enter: schema management tooling.

How schema management tools work

Schema management includes a broad set of features related to maintaining your GraphQL schema. I will go through some of the most useful ones here and explain how they work.

Integration with your server

In order to use any feature of a schema management tool, you need to connect it to your GraphQL server. That is usually done through some plugin system, using either an official plugin from the schema management provider or a third-party plugin. The plugins work by hooking into the lifecycle methods of a request. For example, a plugin might do the following and store the results temporarily during the request:

  • Register how long the entire request took
  • Register all relevant headers (like client name, client version)
  • Register the GraphQL operation string
  • Hook into every resolver function and measure the execution time
  • Register any runtime errors

After the response has been sent to the user, the plugin sends the collected data off to the schema management provider, which stores it for you to use later on in your day-to-day work.

It might also send off the current schema on server startup (or during CI) to the schema management provider.
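To make the request-tracing part more concrete, here is a minimal sketch of what such a plugin could look like. It is not the API of any real provider: the class and method names, the `client-name` and `client-version` header names, and the `send_report` callback are all assumptions made up for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class RequestTrace:
    """Everything the (hypothetical) plugin collects for one request."""
    operation: str
    client_name: str
    client_version: str
    started_at: float = 0.0
    duration_ms: float = 0.0
    resolver_ms: Dict[str, List[float]] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)


class TracingPlugin:
    """Hypothetical plugin that hooks into the lifecycle of a request."""

    def __init__(self, send_report: Callable[[RequestTrace], None]) -> None:
        # In a real plugin this would be a network call to the provider.
        self.send_report = send_report

    def request_started(self, operation: str, headers: Dict[str, str]) -> RequestTrace:
        # Register the operation string and the relevant headers.
        # The header names are assumptions for this sketch.
        return RequestTrace(
            operation=operation,
            client_name=headers.get("client-name", "unknown"),
            client_version=headers.get("client-version", "unknown"),
            started_at=time.perf_counter(),
        )

    def wrap_resolver(
        self, trace: RequestTrace, field_name: str, resolver: Callable[..., Any]
    ) -> Callable[..., Any]:
        # Measure the execution time of one resolver and record runtime errors.
        def timed(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return resolver(*args, **kwargs)
            except Exception as exc:
                trace.errors.append(f"{field_name}: {exc}")
                raise
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                trace.resolver_ms.setdefault(field_name, []).append(elapsed_ms)

        return timed

    def request_finished(self, trace: RequestTrace) -> None:
        # Called once the response has been sent to the user.
        trace.duration_ms = (time.perf_counter() - trace.started_at) * 1000
        self.send_report(trace)


# Usage: collect reports in a list instead of sending them anywhere.
reports: List[RequestTrace] = []
plugin = TracingPlugin(send_report=reports.append)
trace = plugin.request_started(
    "query { user { name } }",
    {"client-name": "web", "client-version": "1.2.0"},
)
resolve_name = plugin.wrap_resolver(trace, "User.name", lambda: "Ada")
resolve_name()
plugin.request_finished(trace)
```

A real plugin would typically batch reports and send them off the request path, so that the reporting itself does not slow down responses.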

Tracking deprecations

This is the feature I mentioned before. You have deprecated something and now you want to remove it.

Because we have integrated our server with a schema management tool, the tool now knows about every request that comes in to the server. The tool also knows about the current schema, and that schema includes information about which fields are deprecated. So now, the "only" thing left to do is to glue these pieces of information together.

When you log in to a schema management tool, you will typically find some place to see the current list of deprecated fields and the number of times those fields have been requested. The tool has glued the pieces of information together for you, and if the number of requests for a field is 0, you can feel safe removing it. This is often useful even if you control all consuming clients, for example if you have an SPA that a user just won't refresh, or an app that someone won't upgrade.
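The gluing itself can be sketched in a few lines. This is only an illustration under simplifying assumptions: fields are identified by hypothetical `"Type.field"` strings, and each reported request is reduced to the list of fields it selected. Real tools store far richer data.

```python
from typing import Dict, Iterable, List, Set


def deprecated_field_usage(
    deprecated_fields: Set[str],
    reported_requests: Iterable[List[str]],
) -> Dict[str, int]:
    """Count how many reported requests selected each deprecated field."""
    counts = {name: 0 for name in deprecated_fields}
    for selected_fields in reported_requests:
        # Count a field at most once per request.
        for name in set(selected_fields) & deprecated_fields:
            counts[name] += 1
    return counts


def safe_to_remove(usage: Dict[str, int]) -> List[str]:
    """Deprecated fields that no reported request has asked for."""
    return sorted(name for name, count in usage.items() if count == 0)


# Usage with made-up data: two deprecated fields, three reported requests.
usage = deprecated_field_usage(
    {"User.fullName", "User.nick"},
    [["User.id", "User.fullName"], ["User.fullName"], ["User.id"]],
)
removable = safe_to_remove(usage)
```

Note that a count of 0 only says something about the time window the reports cover, so a longer window gives more confidence.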

Continuous integration

During a continuous integration workflow, a schema management tool might be able to tell you if a pull request is about to break production. It typically works like this:

  • A GraphQL schema is generated from the code in your branch
  • The schema is sent off to the schema management tool
  • The schema management tool calculates a diff between production and your branch, and checks for breaking changes according to production traffic
  • The report with the diff and any breaking changes is sent back to the CI pipeline
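The check on the tool's side can be sketched roughly as follows. This is a heavily simplified illustration, not how any particular tool implements it: the schema is reduced to a hypothetical mapping from type name to field names, and the only change considered is a removed field. Real tools also look at changed types, arguments, nullability and more.

```python
from typing import Dict, List, Set

# Simplified schema representation (an assumption): type name -> field names.
Schema = Dict[str, Set[str]]


def removed_fields(production: Schema, branch: Schema) -> List[str]:
    """Fields that exist in production but not in the branch schema."""
    removed = []
    for type_name, fields in production.items():
        branch_fields = branch.get(type_name, set())
        for field_name in fields - branch_fields:
            removed.append(f"{type_name}.{field_name}")
    return sorted(removed)


def breaking_changes(
    production: Schema,
    branch: Schema,
    request_counts: Dict[str, int],
) -> List[str]:
    """Removed fields that production traffic still requests."""
    return [
        name
        for name in removed_fields(production, branch)
        if request_counts.get(name, 0) > 0
    ]


# Usage with made-up data: the branch removes User.nick and Post.title.
production = {"User": {"id", "name", "nick"}, "Post": {"id", "title"}}
branch = {"User": {"id", "name"}, "Post": {"id"}}
traffic = {"Post.title": 12, "User.nick": 0}

diff = removed_fields(production, branch)
breaking = breaking_changes(production, branch, traffic)
```

In this example the diff contains both removed fields, but only `Post.title` is reported as breaking, because it is the only one that production traffic still uses.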

Track field performance

A schema management tool can often show how fast all the different fields resolve, and also show the spread of different execution times. This requires the plugin to gather data on every single resolver function, which can add quite a bit of overhead. It works like this:

  • The plugin records the execution time of every field resolver and sends it as part of the report to the schema management tool
  • The schema management tool stores the execution time for that field, and all relevant metadata connected to that request
  • The schema management tool calculates relevant metrics from the aggregated raw data of those reports
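As an illustration of the last step, here is a minimal sketch of how such metrics could be calculated from stored resolver timings. The choice of metrics (count, median, 95th percentile, max) and the nearest-rank percentile method are my assumptions; actual tools may aggregate differently.

```python
import math
from typing import Dict, List


def percentile(sorted_values: List[float], p: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(durations_ms: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Per-field summary of the recorded resolver execution times."""
    summary = {}
    for field_name, values in durations_ms.items():
        if not values:
            continue  # nothing recorded for this field
        ordered = sorted(values)
        summary[field_name] = {
            "count": len(ordered),
            "p50": percentile(ordered, 50),
            "p95": percentile(ordered, 95),
            "max": ordered[-1],
        }
    return summary


# Usage with made-up timings (in milliseconds) for one field.
stats = summarize({"User.name": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})
```

Looking at percentiles rather than only an average is what lets you see the spread: a field can have a fast median and still be slow for a noticeable share of requests.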

Apollo Studio

Apollo Studio is the big actor when it comes to schema management, and it can be a great tool for some companies. Apollo Studio is the product of the Apollo startup, which raised a $130M Series D valued at over $1.5B in August 2021. Suffice it to say, Apollo is a big boy.

In my experience, if Apollo Studio is not a good fit, it has to do either with their pricing or with a lack of good support for the GraphQL runtime you want to use. I have tried using it with graphql-ruby, and it has not been a great experience. We have had to modify a broken third-party plugin and maintain the gRPC contract, and it is still only half-working.

Apollo has their own runtime in Apollo Server, and while that is actually my personal preference when it comes to runtimes, it is not a great fit for all companies. However, if you use Apollo Server, then integrating with Apollo Studio is trivial (you just include a few environment variables).

Hubburu

Hubburu is a new product developed by an experienced GraphQL developer with a background in high-load systems. It is ready for enterprise-level scale. Sign up here to join the alpha.