GraphQL at Braintree

GraphQL has had our attention at Braintree for a while now. With Facebook continuing to evolve the specification and reference implementation, interest within the developer community has been rapidly growing. GraphQL offers compelling features for API providers and API consumers, including (but not limited to):

  • Client-driven responses
  • Well-typed API schemas
  • Consistency in API design

To see if GraphQL would be a good fit for Braintree, we did some internal prototyping to test these features. Here’s what we found:

Client-driven responses

GraphQL intrigued us initially because of client-driven responses. Our merchants have a wide variety of integrations and, therefore, have many different uses for the data we send back. Some may need nothing more than the status of an action and the identifier of a resource, while others may need very detailed information about the response.

Serving both kinds of integrator is much harder with traditional REST APIs: much of the time we’re forced to return every attribute that any integrator may ever need. If we want to allow field selection, the solutions are ad hoc and become awkward with nested data. The result is larger responses and wasted work computing fields that many clients discard.

GraphQL can help immensely here because each integrator decides the exact data they want back. If you need a piece of data from a GraphQL query or mutation, you must ask for it specifically. This means we can avoid returning computationally expensive fields to users who are throwing the data away but keep them available for integrators that rely on the data.
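
To make the idea concrete, here is a minimal Python sketch of client-driven field selection. It is not a GraphQL implementation and not our production code: the field names (id, status, riskReport), the resolver functions and the execute helper are all hypothetical. It only shows that a resolver runs when, and only when, the client asks for its field.

```python
from typing import Any, Callable, Dict, List

# Records which resolvers actually ran, so the effect is observable.
calls: List[str] = []


def resolve_id(txn: Dict[str, Any]) -> str:
    calls.append("id")
    return txn["id"]


def resolve_status(txn: Dict[str, Any]) -> str:
    calls.append("status")
    return txn["status"]


def resolve_risk_report(txn: Dict[str, Any]) -> Dict[str, int]:
    # Stand-in for a computationally expensive field (hypothetical).
    calls.append("riskReport")
    return {"score": sum(ord(c) for c in txn["id"]) % 100}


# One resolver per field, keyed by the field name a client may request.
RESOLVERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "id": resolve_id,
    "status": resolve_status,
    "riskReport": resolve_risk_report,
}


def execute(selection: List[str], txn: Dict[str, Any]) -> Dict[str, Any]:
    """Run only the resolvers for the fields the client selected."""
    return {field: RESOLVERS[field](txn) for field in selection}


txn = {"id": "txn_123", "status": "SETTLED"}

# A client that only needs the identifier and status never pays for riskReport.
light = execute(["id", "status"], txn)
```

A client that does rely on the expensive field simply adds "riskReport" to its selection; nothing changes for anyone else.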

Well-typed API schemas

We also had a sense that the well-typed specification for GraphQL schemas would be a strong point. In retrospect, we undervalued this benefit. GraphQL’s type system is fantastic. It is simple and powerful and leaves other solutions like JSON Schema and OpenAPI in the dust.

The difference is most striking when dealing with polymorphic types. GraphQL provides two simple and familiar primitives: interfaces and union types. These ideas are easy to comprehend and very flexible. OpenAPI and JSON Schema attempt to use tools like ‘oneOf’ and discriminator values to mimic the behavior of these primitives but fall short. GraphQL also allows adding new scalars and building composite types very easily.

Additionally, a GraphQL schema lives directly with the code that executes it, and all queries are validated against the schema. All resolver code can therefore assume well-formed inputs, and this makes a huge difference. OpenAPI and JSON Schema can be used to validate incoming requests but do not do so by default. In our experience, the tooling available for doing so is incomplete and often broken.

Finally, GraphQL allows programmatic introspection of a schema. This leads to fantastic documentation that stays up to date with schema changes and great tools for API and documentation exploration, such as GraphiQL.

Consistency in API design

An extremely important yet subtle benefit of GraphQL is the elimination of a category of decisions about API design. You already know how your queries and mutations will be structured. You already know how to pass arguments to your fields. You already know how to return a collection. In each of these cases, the framework dictates a solution and you’re not forced to make an arbitrary choice.

It’s still challenging to pick optimal modeling for your types and the fields that you expose in your schema, but you’ll have more time to focus on these real modeling problems because you won’t have to bikeshed about whether a particular resource is nested or top-level.

The consistency with which GraphQL resolvers access data is also very important. Downstream services that expose data for consumption by GraphQL can do so in a very consistent manner. This makes it easier to define reasonable interfaces between GraphQL and the services that provide it with data.

Orchestration layer

The previously mentioned benefits would be enough to make a reasonable case for adopting GraphQL on their own. Our biggest surprise, however, was how perfectly suited GraphQL is for powering an orchestration layer.

GraphQL forces you to cleanly separate logic for retrieving data from logic that composes this data and presents it to the user. You write code for the former and your GraphQL framework handles the latter.

This separation of concerns is also ideal for an orchestration layer. We can write code to fetch data from a variety of sources (other downstream services, local data stores, in-memory data, etc.) and our GraphQL framework handles the composition of this data into the structure requested by the user. At Braintree we’re already orchestrating between several traditional REST services and gRPC services. The authors of these services don’t need to think about the GraphQL representation of their data and can instead focus on providing capabilities necessary to perform their specific task.

GraphQL also allows very fine-grained resolution of fields. You can write a resolver that fetches a huge collection of fields or a resolver that fetches the data required for just a single field. This means that we can iteratively introduce new services in the future and delegate resolution for individual fields (or sets of fields) to these services as they become available. Doing so greatly aids gradual decomposition of large applications in a manner that is transparent to our API consumers.
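
The coarse-versus-fine resolver idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: legacy_customer_service and loyalty_service stand in for downstream services, and FIELD_OVERRIDES is an invented registry, not a feature of any GraphQL framework. One coarse resolver supplies most fields, while a single field is delegated to a newer service.

```python
from typing import Any, Callable, Dict, List, Optional


def legacy_customer_service(customer_id: str) -> Dict[str, Any]:
    # Coarse-grained source: one call returns many fields at once.
    return {"id": customer_id, "email": "a@example.com", "loyaltyTier": "UNKNOWN"}


def loyalty_service(customer_id: str) -> str:
    # Newer, fine-grained source that owns exactly one field.
    return "GOLD"


# Fields whose resolution has been handed over to a dedicated service.
FIELD_OVERRIDES: Dict[str, Callable[[str], Any]] = {
    "loyaltyTier": loyalty_service,
}


def resolve_customer(customer_id: str, selection: List[str]) -> Dict[str, Any]:
    """Resolve each selected field from whichever service currently owns it."""
    base: Optional[Dict[str, Any]] = None  # fetched lazily, at most once
    result: Dict[str, Any] = {}
    for field in selection:
        if field in FIELD_OVERRIDES:
            result[field] = FIELD_OVERRIDES[field](customer_id)
        else:
            if base is None:
                base = legacy_customer_service(customer_id)
            result[field] = base[field]
    return result


customer = resolve_customer("cust_1", ["id", "email", "loyaltyTier"])
```

Moving another field to a new service means adding one entry to the registry. The response shape seen by API consumers stays the same.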

Opportunities for improvement

Errors. Seriously, errors are hard. Errors are always challenging in API design and GraphQL hasn’t done a lot to alleviate this concern. The spec is very light on recommendations when it comes to modeling errors, and simply says that ‘message’ is the only required field and that ‘locations’ should be provided if applicable. It is our opinion that this is not nearly enough data for integrators to handle errors programmatically.

There also seems to be contention within the GraphQL ecosystem as to whether multiple errors should be allowed for a single field. The spec is ambiguous and the reference implementation does not allow it. Some libraries do allow multiple errors per field, yet others do not. Our chosen framework does not allow multiple errors for the same field, but we knew this was a requirement for our users.

To support this requirement, we chose to add an ‘extensions’ field to our GraphQL errors. It contains a list of error detail objects. These detail objects include things like error codes, specific messages and input paths (so that errors can be correlated to the specific piece of input that caused them).
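
As an illustration of the general shape, the Python sketch below builds one GraphQL error that carries several detail objects in its extensions. The key name errorDetails, the error codes and the build_error helper are invented for this example and should not be read as the actual Braintree error format.

```python
from typing import Any, Dict, List


def build_error(
    message: str, path: List[str], details: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Build a single GraphQL-style error that carries several error details."""
    return {
        "message": message,  # the one field the spec requires
        "path": path,
        # Hypothetical key: the detail list lives under extensions.
        "extensions": {"errorDetails": details},
    }


error = build_error(
    "Input is invalid",
    ["createCustomer"],
    [
        {
            "code": "EMAIL_INVALID",  # machine-readable code (illustrative)
            "message": "Email is not a valid address",
            "inputPath": ["input", "email"],  # which piece of input caused it
        },
        {
            "code": "PHONE_TOO_LONG",
            "message": "Phone number is too long",
            "inputPath": ["input", "phone"],
        },
    ],
)

# A client can branch on codes and map each detail back to its form field.
codes = [detail["code"] for detail in error["extensions"]["errorDetails"]]
by_input = {
    ".".join(detail["inputPath"]): detail["message"]
    for detail in error["extensions"]["errorDetails"]
}
```

One GraphQL error per field is preserved, while the detail list lets an integrator see every problem with the input in a single response.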

This strategy is working fairly well for us but finding a reasonable error structure was easily the most challenging part of implementing GraphQL. We believe the spec should be more explicit about recommendations for errors. Furthermore, we think multiple errors per field should be allowed in the spec.

We considered adding “user errors” to the schema itself, but this forces our users to handle errors in two places depending on the type of error. We do not believe it should be the responsibility of the user to differentiate between errors produced by the GraphQL framework and errors produced by business logic during query resolution. This approach was a non-starter for us.

Conclusion

In the end, the pros outweighed the cons and we decided that GraphQL was a tool we wanted to use for API development at Braintree. Over the past three months, we’ve built our first public API using GraphQL and we’re slowly migrating production traffic to the new service. You may already be using it if you tokenize credit cards using our JavaScript SDKs.

We expect to continue expanding the surface area of our GraphQL API in the future. As we add more capabilities, we’ll be releasing public documentation for merchants to integrate with the GraphQL API directly. Until that time, expect to see more of our SDKs’ capabilities powered by GraphQL APIs.

Considerations for adopting GraphQL

Read everything from Apollo and Relay before picking a structure for your schema. They’ve distilled a bunch of hard lessons into framework choices. Understanding their choices will lead to a better schema for your users.

Be wary of non-nullable types, especially in responses. GraphQL’s concept of “partial success” is very helpful for API consumers, but it means you’ll often complete requests with only a subset of the requested fields. If any of the failed fields has a non-nullable type, you’re out of luck. You’ll have to fail up a level in the response object, potentially returning no useful information to your user. Be very sure you want a non-nullable type before introducing it into your schema.
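
The effect is easy to see in a toy model. The Python sketch below is a simplified, single-level imitation of how a failed field is handled, not the real execution algorithm. In GraphQL itself, the null propagates to the nearest nullable parent. The complete_object helper and the field names are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Each field maps to (resolver, is_non_null).
FieldMap = Dict[str, Tuple[Callable[[], Any], bool]]


def complete_object(
    fields: FieldMap,
) -> Tuple[Optional[Dict[str, Any]], List[Dict[str, Any]]]:
    """Resolve every field; a failed non-null field nulls the whole object."""
    data: Dict[str, Any] = {}
    errors: List[Dict[str, Any]] = []
    for name, (resolver, non_null) in fields.items():
        try:
            data[name] = resolver()
        except Exception as exc:
            errors.append({"message": str(exc), "path": [name]})
            if non_null:
                # A non-null field cannot be null, so the parent is lost.
                return None, errors
            data[name] = None  # nullable field: partial success
    return data, errors


def resolve_id() -> str:
    return "txn_123"


def resolve_risk_score() -> int:
    raise RuntimeError("risk service unavailable")


# riskScore declared nullable: the caller still gets the id.
partial, partial_errors = complete_object(
    {"id": (resolve_id, True), "riskScore": (resolve_risk_score, False)}
)

# riskScore declared non-null: the same failure discards the whole object.
lost, lost_errors = complete_object(
    {"id": (resolve_id, True), "riskScore": (resolve_risk_score, True)}
)
```

The same downstream failure yields a usable partial response in the first case and nothing at all in the second.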

Translate unexpected errors into a generic GraphQL “something went wrong” error. You don’t want to accidentally leak information about your system or the specific errors it has encountered. If your GraphQL library allows providing a custom error handler, use it to whitelist acceptable errors and return something generic for any unexpected errors produced during execution.
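
A minimal sketch of such a handler, in Python. The UserFacingError base class and the to_graphql_error function are assumptions for illustration. A real library will have its own hook for custom error handling, and the details will differ.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("graphql.errors")

GENERIC_MESSAGE = "Something went wrong"


class UserFacingError(Exception):
    """Hypothetical marker for errors that are safe to show to API consumers."""


class ValidationError(UserFacingError):
    pass


def to_graphql_error(exc: Exception) -> Dict[str, Any]:
    """Pass through allowed errors; replace everything else with a generic one."""
    if isinstance(exc, UserFacingError):
        return {"message": str(exc)}
    # Keep the real cause in internal logs only, never in the response.
    logger.error("unexpected error during execution: %r", exc)
    return {"message": GENERIC_MESSAGE}


safe = to_graphql_error(ValidationError("Email is not a valid address"))
masked = to_graphql_error(KeyError("db_password"))
```

Only errors that explicitly opt in by subclassing the marker type reach the client, so a new, unanticipated exception is masked by default.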

Do not ship production services without depth and/or complexity restrictions. You need to prevent users from intentionally or accidentally sending overly complex queries to your GraphQL framework. Most frameworks allow rejection of queries that are overly complex or too deeply nested.
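
To show what a depth restriction checks, here is a small Python sketch. It assumes a query's selection set has already been parsed into nested dictionaries, with an empty dictionary marking a leaf field. That representation, the QueryTooDeep exception and the limit of 3 are all illustrative choices. In practice you would rely on your framework's built-in mechanism.

```python
from typing import Any, Dict


class QueryTooDeep(Exception):
    """Raised when a selection nests deeper than the configured limit."""


def depth(selection: Dict[str, Any]) -> int:
    """Return the nesting depth of a selection; a leaf field has depth 1."""
    if not selection:
        return 0
    return 1 + max(depth(child) for child in selection.values())


def check_depth(selection: Dict[str, Any], max_depth: int) -> None:
    """Reject the query before execution if it nests too deeply."""
    actual = depth(selection)
    if actual > max_depth:
        raise QueryTooDeep(f"query depth {actual} exceeds limit {max_depth}")


# Roughly: { customer { id email } }
shallow = {"customer": {"id": {}, "email": {}}}

# A cyclic relationship lets a client nest as deeply as it likes.
deep = {"customer": {"transactions": {"customer": {"transactions": {"id": {}}}}}}

check_depth(shallow, max_depth=3)  # passes silently

try:
    check_depth(deep, max_depth=3)
    rejected = False
except QueryTooDeep:
    rejected = True
```

Depth is only one measure. A complexity limit would additionally weight each field, for example by how expensive it is to resolve.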

A quick thanks

We wanted to take a brief moment to thank the graphql-java team. They’ve produced a fantastic GraphQL framework and have been extremely responsive both on GitHub and Gitter. Their help allowed us to get our implementation right the first time and quickly deploy our API to production. We recommend that anyone considering a GraphQL implementation in Java check out the library.

The information in this blog post has been prepared by PayPal and is for informational and marketing purposes only. You should not act or refrain from acting on the basis of any content included in this blog post without seeking the appropriate professional advice. This blog post contains general information and may not reflect current developments or address your specific situation.

PayPal disclaims all liability for actions you take or fail to take based on any content in this blog. Although the information in this article has been gathered from sources believed to be reliable, no representation is made as to its accuracy. All product names, trademarks, logos, and brands are property of their respective owners. Use of these names, trademarks, logos, and brands does not imply endorsement by PayPal. This blog post is not an endorsement or recommendation of any third-party products or services of any kind.

***
Drew Olson is a Principal Engineer at Braintree.
