sourcegraph/dev/buildchecker
Bolaji Olajide 20b858f6c3
fix(build-tracker): Failed back-compat doesn't count towards branch-locking quota (#63911)
Closes
[DINF-51](https://linear.app/sourcegraph/issue/DINF-51/failed-back-compat-doesnt-count-towards-branch-locking-quota)

## Context

If a back-compat step on main fails, the build is marked as having
failed. However, we don't treat that as a failure in build-tracker,
resulting in no #buildkite-main post and not counting towards failed
build quota for locking main.

This happened because the back-compat build wasn't linked to the main
Sourcegraph build in any way. When a back-compat build fails, the main
build does reflect that failure in its own status, but we do not use
that field when determining the status of a build, so it doesn't work
for our use case.

![CleanShot 2024-07-18 at 15 04
15@2x](https://github.com/user-attachments/assets/9553330a-ad98-45cc-b4ce-03a22ca1b99d)

Instead, we [walk through all the jobs associated with a build to
figure
out](https://sourcegraph.sourcegraph.com/github.com/sourcegraph/sourcegraph/-/blob/dev/build-tracker/main.go?L349-372)
whether the build has failed, has been fixed, or is passing.

Given this logic, we have to link the steps of any child builds that a
particular build triggers to their parent build.
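
As a rough illustration of the idea, here is a minimal sketch in which a parent build's status is derived from its own steps plus the steps of every build it triggers. The `Step` and `Build` types and the `allSteps` and `hasFailed` helpers are hypothetical and simplified; they are not the actual build-tracker types.

```go
package main

import "fmt"

// Step is a hypothetical, simplified view of a single Buildkite job.
type Step struct {
	Name   string
	Passed bool
}

// Build is a hypothetical build that may have triggered child builds,
// such as a back-compat build.
type Build struct {
	Number   int
	Steps    []Step
	Children []*Build
}

// allSteps returns the build's own steps followed by the steps of every
// triggered child build, recursively.
func allSteps(b *Build) []Step {
	steps := append([]Step{}, b.Steps...)
	for _, child := range b.Children {
		steps = append(steps, allSteps(child)...)
	}
	return steps
}

// hasFailed reports whether any step, including steps linked from child
// builds, did not pass.
func hasFailed(b *Build) bool {
	for _, s := range allSteps(b) {
		if !s.Passed {
			return true
		}
	}
	return false
}

func main() {
	backcompat := &Build{
		Number: 2,
		Steps:  []Step{{Name: "back-compat tests", Passed: false}},
	}
	parent := &Build{
		Number:   1,
		Steps:    []Step{{Name: "unit tests", Passed: true}},
		Children: []*Build{backcompat},
	}
	// The parent's own steps all pass, but the linked child step fails.
	fmt.Println(hasFailed(parent)) // true
}
```

Without the `Children` link, walking only the parent's own steps would report the build as passing, which is the behaviour described under Context.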

## Test plan

* Create a build that'll have backcompat failing
* The build tracker event associated with the main build will be
reported with a state of failed to buildkite.

![CleanShot 2024-07-18 at 15 10
45@2x](https://github.com/user-attachments/assets/1bf503ab-0020-47bf-9512-b3a9ee5d4e36)


2024-07-25 06:45:09 -05:00
testdata/TestRepoBranchLocker Update marketing website link in monorepo (#58449) 2023-12-01 22:22:34 +00:00
branch_test.go buildchecker: update lock/unlock api for buildchecker (#57581) 2023-10-13 11:01:33 +02:00
branch.go buildchecker: update lock/unlock api for buildchecker (#57581) 2023-10-13 11:01:33 +02:00
BUILD.bazel chore(ci): remove buildchecker sunday summary posts (#63289) 2024-06-17 13:05:39 +00:00
checker_test.go fix(buildchecker): do not consider cron builds (#38152) 2022-07-04 17:01:15 +02:00
checker.go chore(ci): remove buildchecker sunday summary posts (#63289) 2024-06-17 13:05:39 +00:00
failures_test.go chore(ci): remove buildchecker sunday summary posts (#63289) 2024-06-17 13:05:39 +00:00
failures.go chore(ci): remove buildchecker sunday summary posts (#63289) 2024-06-17 13:05:39 +00:00
main.go fix(build-tracker): Failed back-compat doesn't count towards branch-locking quota (#63911) 2024-07-25 06:45:09 -05:00
OWNERS chore: links/ownership devx->dev-infra (#58999) 2023-12-14 15:07:20 +00:00
README.md dev: remove deadlink from buildchecker (#59118) 2023-12-20 09:50:07 +00:00
run-check.sh buildchecker: require explicit 'check', add discussion channel (#29934) 2022-01-19 11:22:08 -08:00
slack_test.go chore(ci): remove buildchecker sunday summary posts (#63289) 2024-06-17 13:05:39 +00:00
slack.go chore(ci): remove buildchecker sunday summary posts (#63289) 2024-06-17 13:05:39 +00:00

buildchecker

buildchecker is designed to respond to periods of consecutive build failures on a Buildkite pipeline. Owned by the DevInfra team.

More documentation for Sourcegraph teammates is available in the CI incidents playbook.

Buildchecker is deployed as a GitHub Action.

Usage

Available commands:

Check

Checks for a series of build failures that exceed the configured threshold, locks the target branch, and posts various updates to Slack.

go run ./dev/buildchecker/ check # directly
./dev/buildchecker/run-check.sh  # using wrapper script

Also see the buildchecker GitHub Action workflow where buildchecker check is run on an automated basis.
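
The threshold check described above can be sketched roughly as follows. This is a simplified illustration, not the buildchecker implementation: the `Build` type and the `consecutiveFailures` and `shouldLock` helpers are hypothetical, builds are assumed to be ordered newest first, and the strict `>` comparison is an assumption based on the word "exceed".

```go
package main

import "fmt"

// Build is a hypothetical, simplified build record.
type Build struct {
	Commit string
	Failed bool
}

// consecutiveFailures counts failed builds starting from the newest build
// and stops at the first passing one. Builds must be ordered newest first.
func consecutiveFailures(builds []Build) int {
	n := 0
	for _, b := range builds {
		if !b.Failed {
			break
		}
		n++
	}
	return n
}

// shouldLock reports whether the current failure streak exceeds the
// threshold. The strict comparison is an assumption.
func shouldLock(builds []Build, threshold int) bool {
	return consecutiveFailures(builds) > threshold
}

func main() {
	builds := []Build{
		{Commit: "c4", Failed: true},
		{Commit: "c3", Failed: true},
		{Commit: "c2", Failed: true},
		{Commit: "c1", Failed: false},
	}
	fmt.Println(consecutiveFailures(builds)) // 3
	fmt.Println(shouldLock(builds, 2))       // true
	fmt.Println(shouldLock(builds, 3))       // false
}
```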

History

Writes aggregated historical data, including the builds it finds, to a few files.

go run ./dev/buildchecker \
  -buildkite.token=$BUILDKITE_TOKEN \
  -builds.write-to=".tmp/builds.json" \
  -csv.write-to=".tmp/" \
  -failures.timeout=999 \
  -created.from="2021-08-01" \
  history

To load builds from a file instead of fetching from Buildkite, use -builds.load-from="$FILE".

You can also send metrics to Honeycomb with -honeycomb.dataset and -honeycomb.token:

go run ./dev/buildchecker \
  -builds.load-from=".tmp/builds.json" \
  -failures.timeout=999 \
  -created.from="2021-08-01" \
  -honeycomb.dataset="buildkite-history" \
  -honeycomb.token=$HONEYCOMB_TOKEN \
  history

Tokens

Buildkite API token

Required for all buildchecker commands, except for buildchecker history when run with -builds.load-from.

  1. Go to your Buildkite personal settings
  2. Create a new token with the following permissions:
     • select the sourcegraph organization
     • read_builds
     • read_pipelines

Development

  • branch_test.go contains integration tests against the GitHub API. These normally run against recordings in testdata. To update the recordings, run the tests with the -update flag.
  • All other tests are strictly unit tests.