@Laurent Senta
Here's a story about productivity, tech debt, and putting developer experience first.
The context: together with Piotr, we help teams focus on their products. We take care of their CI.
In theory, it’s quite simple: there are quality standards in the industry; we install the tools to support them in our client's environments.
While the industry agrees on a bunch of standards, the solutions we implement often need to be tailored to the context of each team.
This story includes advice and examples on how we implemented tools to redeem 2 years’ worth of technical debt while supporting our client’s release schedule.
The Client
A team of five people, making custom hardware and using technologies such as AI, Rust, and mobile. Not a trivial stack for such a small team.
Some components have relatively mature developer tooling setups. Code and quality standards are high (meaning they have up-to-date development and code quality tooling). But other parts of the system have a less sophisticated setup, prone to gathering tech debt.
The team is suffering from “death by a thousand paper cuts”.
The Perfect Project for IPDX
This is why we founded IPDX: we relieve teams from CI and tooling work. We listen to their pains and identify their issues. We help them focus on their product.
It is even better when we work with great and motivated engineers who are “just” a little too busy to keep up with the CI improvements and upgrades.
The Mobile Application had some Code QA issues
The mobile application is maintained by a single person; it's a React Native app targeting iOS and Android. The team wants to:
- Speed up their CI workflows.
- Improve code quality by raising the requirements before merging code.
- Automate the application's releases.
(That's very standard)
We analyze their workflows, CI, and the general codebase. There's little automation, and the CI has been red for several months.
Here's what I found when I ran some tests on the React Native repository:
In an ideal world, the team would stop work and redeem this technical debt before any new development.
But that’s not possible; the team is releasing a new version soon. They have clients and feature requests to address. Our goal is efficiency, not blocking them.
IPDX Does Not Take Over Your Codebase
IPDX won’t change this code; the team calls us to spend a week with them to improve their workflows. During our Bootstrap Package, our focus is on DevEx improvement only.
Fixing each error would only cure the symptoms, not the root cause.
We aim to set up workflows to help the team meet the standards they want to achieve. Not to fix occasional bugs here and there (even if we do sometimes).
“Fix a developer’s bug and you get their CI green for a day. Give them the right workflows and teach them how to use them, you improve their experience for a lifetime”. — Someone from the IPDX Team
What would you do?
How We Approach the Problem
We have a few options:
- We ignore the problem and move on — ewww
- We take over the code base and pay off the technical debt — short-term
- We add CI workflows that incentivize the team to merge "clean" code — yes
It would not be feasible to delay new work until we have fixed all the lint, builds, and tests. There are over 300 issues that need to be manually fixed. That would be a lengthy and tedious process. Moreover, rushing through this could potentially introduce new errors.
We're operating in a startup environment, remember?
Golden Rule: the CI is Green When We Leave
The first and most important thing to understand is that your CI should be green. Continuous Integration is the team’s source of truth. No “It works on my machine” or “I can swear the tests were green this morning.”
- 🟢 CI is green: the code is OK
- ❌ CI is red: the code is not acceptable
- 🐛 Else, there is a problem with the CI
“CI is red, but the code is acceptable” is unsustainable. It implies that your code repository cannot be trusted or collaborated on.
Someone makes a mistake, forgets a check, introduces a real bug, and then broken code is released and pushed to production. That code starts impacting users and businesses.
You can always miss bugs with a green CI, but you have the means to improve. There is not much you can do if a red CI is part of your development process.
Our primary goal is to set up CI that acts as the source of truth for the team. "CI is green" should be synonymous with "this is the minimum to accept code".
But how do you keep CI green if there are 500+ bugs? How do you improve without taking a HUGE step that no one has time for?
This is where our job is ultimately more about people than technology. We consider each client’s context to build the right solution for them. It’s about the team more than code and standards. It’s also about the company, the product, and more.
We improve developer experience, not constrain developers
The CI is a tool; it's a step to ensure that the code will be good once it is in the application. We try not to impose additional constraints that developers would have to fight against.
We modified the workflows we usually set up to provide this developer with the tools they needed most.
1️⃣ We post comments in the Pull Request with a summary of the linting, tests, and build errors
We use marocchino/sticky-pull-request-comment to post a comment on the PR with the number of errors found. The action provides a recreate: true option that deletes the previous comment and posts a fresh one after each push, so the PR carries a single up-to-date comment instead of a pile of messages.
2️⃣ We (temporarily) accept quality errors
As long as the jobs run and post their comments successfully, they’re considered a success.
With GitHub Actions, this means marking some workflow steps with continue-on-error: true and using set +e in the script to allow errors.
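To illustrate the mechanism on its own, here is a minimal shell sketch. The function name run_tolerant and the log file argument are hypothetical and are not part of the client's workflow. It assumes a POSIX-style shell with `local` (bash works): with set +e the script keeps going after a failing command, so it can record the exit status and still finish successfully.

```shell
# Hypothetical helper: run a command, tolerate its failure, and report its status.
# Usage: run_tolerant <log_file> <command> [args...]
run_tolerant() {
  local log_file="$1"
  shift
  set +e                       # do not abort when the command fails
  "$@" > "$log_file" 2>&1      # capture the command's output for later reporting
  local status=$?              # exit status of the command we just ran
  echo "exit status: $status"  # recorded for information, not used to fail the step
  return 0                     # the helper itself always succeeds
}
```

For example, run_tolerant lint.txt yarn lint would save the linter output to lint.txt and print its exit status without failing the surrounding script.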
3️⃣ We post detailed error messages in the job’s summary.
Here, we use echo "message" >> $GITHUB_STEP_SUMMARY, which outputs nice summaries in the GitHub Actions summary view.
GitHub Actions Workflow Example
This is what the final workflow looks like (edited to highlight the useful parts you may wish to copy and paste).
name: Code Quality Checks
jobs:
  # ...
  lint:
    runs-on: ubuntu-latest
    # ...
    steps:
      - uses: actions/checkout@v4
      # ... setup ...
      - name: Stylish Lint
        continue-on-error: true # allow errors
        # output as a summary
        run: |
          set +e # do not exit on failure; failures are expected here
          yarn lint --format stylish | tee lint.txt
          echo "## Lint Summary" > $GITHUB_STEP_SUMMARY
          echo '```' >> $GITHUB_STEP_SUMMARY
          cat lint.txt >> $GITHUB_STEP_SUMMARY
          echo '```' >> $GITHUB_STEP_SUMMARY
      - name: Prepare Lint Comment
        if: github.event.pull_request
        continue-on-error: true
        # generate linting details in a file
        run: |
          set +e # do not exit on failure, failures are expected here
          yarn --silent lint --format unix --output-file output.txt
          echo -n "**Lint Result**: " > comment.md
          tail -n 1 output.txt >> comment.md
      - uses: marocchino/sticky-pull-request-comment@331f8f5b4215f0445d3c07b4967662a32a2d3e31 # v2.9.0
        # send the comment
        with:
          recreate: true
          header: lint
          path: comment.md
      # ...
The Right Solution is often not the Perfect Solution
The team is still merging code with failed tests, lint, and builds. But we laid down the foundations and went as far as possible in a single iteration.
- Our solo maintainer gets into the habit of merging green PRs. And it's rewarding for them to make improvements to reduce the error numbers.
- Other maintainers can quickly check the number of errors and review the details.
We have clear follow-up steps:
- Enforce “CI must be green before merging”.
- Maybe add a delta calculation to our workflow, so that a PR can maintain or reduce the number of errors but is rejected if it increases them.
- Fix the 600+ lint, build, and test errors, then move to a standard build & test workflow.
It will be up to the team to choose whether they can invest time in fixing the code now or would rather do it over time.
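The delta calculation from the follow-up steps could look roughly like the sketch below. This is an assumption about one possible shape, not something we have set up for this client: the function name check_delta is hypothetical, and how the two counts are obtained (for instance by running the linter on the base branch and on the PR head) is left out.

```shell
# Hypothetical helper: compare the error count of a PR with the base branch.
# Usage: check_delta <base_count> <pr_count>
# Succeeds when the PR keeps or lowers the count, fails when it raises it.
check_delta() {
  local base_count="$1"
  local pr_count="$2"
  if [ "$pr_count" -gt "$base_count" ]; then
    echo "errors increased: $base_count -> $pr_count"
    return 1
  fi
  echo "errors did not increase: $base_count -> $pr_count"
  return 0
}
```

A workflow step could call check_delta with the two counts and let its exit status decide whether the job passes. That keeps the CI green for PRs that hold the line or improve, and turns it red only when a PR adds new errors.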
Having DevEx expertise helps focus on the Human side of things
Every team functions differently. And while “off-the-shelf” tooling gets the job done, sometimes it’s worth having the expertise to build more efficient and more adapted solutions.
If you are trying to improve your DevEx, here are a few points we always keep in mind:
- The golden rule is never to work with a “red CI” workflow.
- Think long-term; ensure your workflow supports good habits but does not constrain developers.
- Adapt to your end-user (the developer) and their context.
Just as we helped this team pay off their tech debt, we’d love to help you.