@Laurent Senta
IPFS is a significant piece of technology in the web3 ecosystem. The first implementation, Kubo, was started in 2015 and, as of early 2023, had a comprehensive but hard-to-maintain test suite. Recently, the team started solidifying the protocol stack by creating an official specification that would become the foundation for every new implementation.
The Kubo IPFS team reached out to us to co-design and build a testing framework that would help them test these new specifications and address the tech debt accumulated over the years.
In a few months, we built a new, user-friendly test framework. We ported over a thousand test cases and implemented a few dozen essential features like subdomain testing, fixtures support, and many more.
The Protocol Labs IPFS Engineering Team
Roughly, IPFS is a peer-to-peer data-sharing network: a new protocol for computers and users to share files online. You might think of it as an analogue of HTTP fit for a decentralised world.
The team is responsible for maintaining the original IPFS implementation, previously known as go-ipfs and renamed Kubo in 2022.
The team is also responsible for pushing the IPFS specifications efforts and encouraging the creation of new implementations.
The team needed an interoperability test suite for the IPFS Gateway Specifications
Some browsers, like Brave, have added native IPFS support, but for most users, the entry point to the IPFS network is an HTTP Gateway. These servers bridge the “regular internet” (http) to the decentralised, content-addressed network (ipfs).
Given its significance, the IPFS HTTP Gateway specifications are core to the team's interests. Most implementations should interoperate and agree on the correct behaviours. The most reliable way to ensure implementations are following the specifications is to provide a test suite that can be used to compare and validate different versions.
The team also had a treasure trove of conformance tests in Kubo’s integration test suite: dozens of test files containing hundreds of test cases and thousands of lines of code. These tests were written as shell scripts, which had become error-prone and challenging to maintain over time.
We co-designed the testing framework with the team, freeing their most critical engineers for other work
The team knew they needed an accessible conformance test suite to support the specification popularisation process but had no capacity to work on this. The skills required were different from the ones required to work on Kubo and on the specification itself. This is when they reached out to us for help with that effort.
We were able to co-design the test framework with them, gather stakeholder requirements, and build the desired solution while the core team could keep its focus on the core product (Kubo) and one of its key milestones for 2023 (fleshing out the IPFS specs).
Core problems
After a few back-and-forths with the team to evaluate their needs, we shared a proof of concept to validate the critical pieces of the solution. This exercise resulted in the following goals:
- Core Goal: Build a testing framework and a conformance test suite for the IPFS HTTP Gateway specification that can be maintained by “any” engineer working on the stack. This meant making sure that the test suite was approachable to engineers of various levels and familiar with different programming languages.
- Requirement: It should be easy for any new project to use this test suite in their own CI, with minimal cost and effort. This was critical to ensure the project would be used by “neighbouring” projects in the network and encourage new users to follow the specs.
- Requirement: While our initial PoC was written using JS, the team had been burnt by the maintenance cost of a NodeJS project before; being more familiar with Go, they requested tooling that would be natural for them to maintain.
- Acceptance Criteria: Kubo, the original implementation, already had more than a thousand tests written using shell scripts. Porting these tests would lower the team’s maintenance costs and prove our framework could cover a large number of cases.
We built a data-driven test suite using native Go test tooling, custom tooling, and a whole lot of reusable workflows
Workflow and Timeline
| Date | Milestone |
| --- | --- |
| February 2023 | First JavaScript proof of concept; data-driven approach validated with the team |
| March 2023 | Second Go proof of concept; stack & syntax validated with the team |
| April 2023 | Subdomain testing, JUnit report generation; onboarding of three HTTP gateway implementations: Boxo, Kubo, and Bifrost-Gateway |
| May 2023 | v0.1: DNSLink testing, dedicated string templating |
| June 2023 | v0.2: custom fixtures support (CAR files), CORS testing |
| July 2023 | v0.3: Kubo’s test suite completely ported to Gateway Conformance |
We designed a Data-Driven Test Suite
After our initial research, we decided to move forward with a declarative, data-driven approach. Our domain expertise suggested that starting from a data-driven test suite would make it easy to refactor, transform, translate, and rethink the test suite in the likely event of scope changes.
We experienced the benefits of this approach early on. We transitioned from the JavaScript proof of concept to a Golang prototype in one day. We "simply" translated our test definitions from one data format to another.
This also supported our requirement of being “language agnostic”. If we could expose data structure to test maintainers, then no internal knowledge of any language (Go, JavaScript, etc.) would be necessary. The constraint of not using any complex feature would be built into the framework.
After a few iterations, this is what a test case looked like:
```go
{
    Name: "GET Responses from Gateway should include CORS headers allowing JS from other origins to read the data cross-origin.",
    Request: Request().
        Path("/ipfs/{{id}}/", someFileID),
    Response: Expect().
        Headers(
            Header("Access-Control-Allow-Origin").Equals("*"),
            Header("Access-Control-Allow-Methods").Has("GET", "HEAD", "OPTIONS"),
            Header("Access-Control-Allow-Headers").Has("Content-Type", "Range", "User-Agent", "X-Requested-With"),
            Header("Access-Control-Expose-Headers").Has(
                "Content-Range",
                "Content-Length",
            ),
        ),
},
```
Our approach enables us to offer functions that can transform and generate additional tests as needed. Since the tests are just another form of data, they are easy to manipulate.
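As a minimal sketch of this idea (the `TestCase` type and `WithExpectedHeader` helper below are hypothetical names, not the framework's actual API), a transformation over tests-as-data is just a plain function from one slice of cases to another:

```go
package main

import "fmt"

// TestCase is a simplified stand-in for the framework's test definition:
// plain data describing a request path and the expected response headers.
// (Illustrative names only, not the framework's real API.)
type TestCase struct {
	Name    string
	Path    string
	Headers map[string]string // expected response headers
}

// WithExpectedHeader derives a new set of test cases with one extra
// expected header. Because tests are plain data, a transformation is
// just a function from []TestCase to []TestCase.
func WithExpectedHeader(tests []TestCase, key, value string) []TestCase {
	out := make([]TestCase, 0, len(tests))
	for _, tc := range tests {
		headers := map[string]string{key: value}
		for k, v := range tc.Headers {
			headers[k] = v
		}
		tc.Headers = headers
		out = append(out, tc)
	}
	return out
}

func main() {
	tests := []TestCase{
		{Name: "GET file", Path: "/ipfs/cid/", Headers: map[string]string{}},
	}
	// Expect the CORS header on every case in one pass.
	tests = WithExpectedHeader(tests, "Access-Control-Allow-Origin", "*")
	fmt.Println(tests[0].Headers["Access-Control-Allow-Origin"]) // *
}
```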
One notable use case involved Subdomain Gateways, a subset of the specification outlining how to serve data behind a subdomain. There are various ways to query a subdomain gateway, including different headers, proxies, and HTTP version combinations. Writing a test for each of these scenarios can be laborious and requires specialized knowledge.
However, our data-driven approach allows a maintainer to generate all the necessary test cases with a single function:
```go
tests = helpers.UnwrapSubdomainTests(t, tests)
```
This function takes a list of “simple” test cases (like the one above) and generates all the combinations required to fully test a subdomain gateway. This reduces the workload for maintainers and significantly lowers the risk of mistakes.
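The sketch below illustrates the idea behind such a generator, using hypothetical names and only two of the many variants rather than the framework's real implementation: each logical test fans out into the concrete request shapes a subdomain gateway must handle.

```go
package main

import "fmt"

// TestCase is an illustrative, simplified test definition (not the
// framework's real API).
type TestCase struct {
	Name string
	URL  string
	Host string // Host header override, if any
}

// unwrapSubdomainTests sketches the idea behind a helper like
// helpers.UnwrapSubdomainTests: from one logical test against a
// subdomain gateway, generate every request variant that must pass.
func unwrapSubdomainTests(tests []TestCase) []TestCase {
	var out []TestCase
	for _, tc := range tests {
		// Variant 1: request the subdomain URL directly.
		out = append(out, TestCase{
			Name: tc.Name + " (direct subdomain request)",
			URL:  "http://cid.ipfs.example.com/",
		})
		// Variant 2: request a proxy address, with the subdomain
		// carried in the Host header instead of the URL.
		out = append(out, TestCase{
			Name: tc.Name + " (proxy request with Host header)",
			URL:  "http://127.0.0.1:8080/",
			Host: "cid.ipfs.example.com",
		})
	}
	return out
}

func main() {
	tests := []TestCase{{Name: "GET file"}}
	tests = unwrapSubdomainTests(tests)
	fmt.Println(len(tests)) // 2
}
```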
We decided to rely on native Go testing and reuse our ecosystem of tooling
The test suite is accessible to any maintainer, whether they are a Go expert or not. Ultimately, a single command runs the test suite and generates a report. Internally, however, we don’t “just” create a binary; we rely on the `go test` tool built into the language.
This lets us reuse tools that we have already built for other Go projects. For example, we maintain a detailed reporting tool developers are familiar with and enjoy. And any new tools that we were going to build could be reused throughout our ecosystem.
As an additional benefit, this meant Go maintainers would be able to rely on workflows and IDE integrations they were familiar with to speed up running and testing the test suite.
Finally, we provided Reusable Workflows & other IPDX sweetness
When we approach a project, whether we create it or update it, we apply a set of tools to improve its maintainability. We stamp our IPDX Seal of Quality on it, if you will.
We installed our Changelog Driven Release workflow to make sure the effort required by maintainers to coordinate new version releases is minimal.
We wrote a custom-tailored workflow to build and distribute the Docker images containing the test suite runner. Because we wanted to let the team experiment with any version, fork, or branch of the test suite, we created a unique feature that allows them to run any version of our test suite in CI, building the code on the fly if needed or reusing a released version if it existed. This means ultimate flexibility for core maintainers and, at the same time, top-notch performance when using pre-built images.
We also took it upon ourselves to explore code generation with Large Language Models (LLM) techniques to breeze through porting hundreds of shell scripts to the new framework. We experimented with a few approaches and learned how to use them most effectively: our workflow (using Github Copilot) would propose new tests, and a human would accept or correct them. We found this was more reliable than porting all the tests manually or letting the LLM port all the tests in one go and reviewing thousands of lines worth of changesets at once.
Finally, the project is distributed as a Docker image. However, it is also designed to be used as a reusable workflow: a GitHub feature that lets a project use another project’s workflow in its own CI. This means we can push out improvements to the test suite without requiring our end-users to touch their code: zero-effort maintenance in practice.
The Test Suite is now a part of IPFS Community’s toolbox
The choices we describe above, and many more, got us to a place where the team is now rolling with the test suite and building heavily on top of it.
Closing the first release, we:
- Added the test suite to three core projects
- Removed more than X thousand lines of shell-based tests, simplifying the team’s day-to-day
- Produced reports with more than X thousand checks, which caught significant issues with how Kubo (and other gateway implementations) implemented the specification.
The first version of the Gateway Conformance project for IPFS is done! The tool is part of the team’s toolbox; they are adding new tests with each change to the specification and running it in CI.
However, that’s not the end! We continue helping the team, providing maintenance and support whenever they need new features for the framework or any other form of assistance.
A follow-up project is in the works to build a specification dashboard that helps maintainers follow along with the specification process.
Steve Loeppky, Engineering Manager, PL IPFS Engineering Team
🔍 We help teams focus on their product
Just as we turned thousands of shell scripts into a scalable, data-driven test suite for the IPFS team, we’d love to help you.