2nd attempt of #63111, a follow-up to
https://github.com/sourcegraph/sourcegraph/pull/63085
rules_oci 2.0 brings many performance improvements to oci_image
and oci_pull, which will benefit Sourcegraph. It will also make RBE
faster and put less load on the remote cache.
However, 2.0 makes some breaking changes, such as:
- oci_tarball's default output is no longer a tarball
- oci_image no longer compresses layers that are uncompressed, so somebody
has to make sure all `pkg_tar` targets have a `compression` attribute
set to compress them beforehand.
- there is no curl fallback, but this is fine for Sourcegraph as it
already uses Bazel 7.1.
I checked all targets that use oci_tarball as much as I could to make
sure nothing depends on the default tarball output of oci_tarball. There
was one target that used the default output; I left a TODO on it for
somebody else (somebody who is more on top of the repo) to tackle
**later**.
## Test plan
CI. Also run delivery on this PR (don't land those changes)
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
https://linear.app/sourcegraph/issue/DINF-111/rework-how-we-inject-version-in-our-artifacts
Pros:
- saves having to rebuild `bazel query 'kind("go_library", rdeps(//...,
//internal/version))' | wc -l` == 523 Go packages when stamp variables
cause a rebuild
- Cutting out GoLink action time when stamp changes but code is cached
Cons:
- Binaries themselves are no longer stamped, only knowing their version
info within the context of the docker image
- A tad extra complexity in internal/version/version.go to handle this
new divergence
---
Before:
```
$ bazel aquery --output=summary --include_commandline=false --include_artifacts=false --include_aspects=false --stamp 'inputs(".*volatile-status\.txt", //...)'
Action: 1
Genrule: 2
Rustc: 3
ConvertStatusToJson: 88
GoLink: 383
```
After:
```
$ bazel aquery --output=summary --include_commandline=false --include_artifacts=false --include_aspects=false --stamp 'inputs(".*volatile-status\.txt", //...)'
Mnemonics:
Genrule: 2
Action: 3
Rustc: 3
ConvertStatusToJson: 86
```
## Test plan
Lots of building & rebuilding with stamp flags, comparing execution logs
& times
Begin introducing an `upgrades` package in the appliance project. Later
it will power buttons that target upgrades via the site admin interface
as well as the maintenance UI.
This just initializes the package; a better logger etc. will be
implemented later.
## Test plan
Unit Tests
We have a number of docs links in the product that point to the old doc site.
Method:
- Search the repo for `docs.sourcegraph.com`
- Exclude the `doc/` dir, all test fixtures, and `CHANGELOG.md`
- For each, replace `docs.sourcegraph.com` with `sourcegraph.com/docs`
- Navigate to the resulting URL ensuring it's not a dead link, updating the URL if necessary
Many of the URLs updated are just comments, but since I'm doing a manual audit of each URL anyways, I felt it was worth it to update these while I was at it.
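The mechanical part of the method above (the host swap) can be sketched in Go as follows. This is only an illustration: `rewriteDocsURL` is a hypothetical helper, the example path is made up, and the dead-link check stayed a manual step.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteDocsURL is a hypothetical helper that swaps the old docs host for
// the new docs path prefix. It does not verify that the resulting URL
// resolves; in this PR that check was done by hand for each link.
func rewriteDocsURL(s string) string {
	return strings.ReplaceAll(s, "docs.sourcegraph.com", "sourcegraph.com/docs")
}

func main() {
	// The path below is illustrative only.
	fmt.Println(rewriteDocsURL("https://docs.sourcegraph.com/admin/updates"))
}
```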
* chore: drop unused go pkg
* go mod tidy + bazel run :gazelle-update-repos
---------
Co-authored-by: William Bezuidenhout <william.bezuidenhout@sourcegraph.com>
Cody no longer needs it and it is obsolete now!
Since App added a not-insignificant number of new concepts and alternative code paths, I decided to take some time and remove it from our codebase.
This PR removes ~21k lines of code. If we ever want parts of single binary (app), the redis kv alternatives, or the release pipeline for a native mac app back, we can look back at this PR and revert parts of it, but maintaining 21k lines of code and many code paths for which I had to delete a surprisingly small amount of tests justifies this move for me very well.
Technically, to some extent SG App and Cody App both still existed in the codebase, but we don't distribute either of them anymore, so IMO we shouldn't keep this weight in our code.
So.. here we go.
This should not affect any of the existing deployments; we only remove functionality that was special-cased for app.
This updates variable names, property names, env var names, etc., to call it "Cody App".
The entire diff was created by running the following commands:
```
fastmod -e go SourcegraphAppMode CodyAppMode
fastmod -e go,ts,tsx sourcegraphAppMode codyAppMode
fastmod -e ts,tsx isSourcegraphApp isCodyApp
fastmod -e ts,tsx,go,yaml,sh,js SOURCEGRAPH_APP CODY_APP
fastmod -e ts,tsx,go,json,mod,graphql,md,js 'Sourcegraph App\b' 'Cody App'
fastmod -e ts,tsx,go,json,mod,graphql,md,js 'Sourcegraph app\b' 'Cody app' # with a few changes skipped
```
If a frontend crashes due to OOM or any other reason, in a way that it
fails to set the `success` column in `upgrade_logs` to `false`, then all
frontends will block trying to claim the autoupgrade lock until the
status is manually set in postgres (or the table is truncated). With a
heartbeat mechanism, we can allow claiming the lock if the last lock
holder hasn't sent a heartbeat within a certain amount of time (here we
allow for 6 missed heartbeats).
## Test plan
Added unit tests for the claim query, tested in docker-compose

In certain concurrent situations, it's possible for some frontends to
block indefinitely if they're trying to autoupgrade when it has already
occurred and dependent services are already up and connected.
## Test plan
Tested locally by first bringing up everything in docker-compose besides
sourcegraph-frontend-0, and then bringing it up once everything is ready
Implements (basically) fully automated multi-version upgrades as an
optional startup step in the `frontend` service. It was put there for a
few reasons:
1) We can display a UI with whatever we want (not possible in init
containers)
2) We can support rolling upgrades, by booting up ready/health + conf
servers, which leads to old services shutting down, while blocking new
services from connecting until autoupgrade is complete
[brain
expansion](https://i.kym-cdn.com/entries/icons/original/000/014/401/tumblr_inline_mqrh26jIU11qz4rgp.jpg)
## Test plan
Expansive testing on a local docker-compose setup (from 3.37.0 to 5.0.5)
and some lighter testing on k3s (from 4.3.1 to 5.0.5)
---------
Co-authored-by: Eric Fritz <eric@sourcegraph.com>
The previous approach to enable race detection was too radical and
accidentally led to building our binaries with the race flag enabled, which
caused issues when building images down the line.
This happened because putting a `test --something` in bazelrc also sets
it on `build` which is absolutely not what we wanted. Usually folks get
this one working by having a `--stamp` config setting that fixes this
when releasing binaries, which we don't at this stage, as we're still
learning Bazel.
Luckily, this was caught swiftly. The current approach instead is
more granular: it makes the `go_test` rule use our own
variant, which injects the `race = "on"` attribute, but only on
`go_test`.
## Test plan
CI, being a main-dry-run, this will cover the container building jobs,
which were the ones failing.
---------
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
We weren't returning the values when performing the fallback query,
instead returning an error regardless
## Test plan
Tested in a full run of migrator
This column was first added in 5.0.0, so trying to run a migrator from a
version containing https://github.com/sourcegraph/sourcegraph/pull/48787
(no tagged release yet) or later on an instance pre-5.0.0 would result
in `checking auto upgrade: failed to get frontend version and
auto_upgrade state: ERROR: column \"auto_upgrade\" does not exist
(SQLSTATE 42703)`
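A hedged sketch of the fallback pattern involved, in Go: when the primary query fails with SQLSTATE 42703 (undefined column), run a fallback query and return its values rather than an error. All names here (`sqlStateError`, `fakePgError`, `getVersionAndAutoUpgrade`) are hypothetical stand-ins, not the migrator's actual code, and treating a missing column as `auto_upgrade = false` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// sqlStateError is a stand-in for a driver error that exposes a SQLSTATE code.
type sqlStateError interface {
	error
	SQLState() string
}

// fakePgError is a hypothetical error type used only for this illustration.
type fakePgError struct{ code, msg string }

func (e *fakePgError) Error() string    { return e.msg }
func (e *fakePgError) SQLState() string { return e.code }

// undefinedColumn is the PostgreSQL SQLSTATE for "undefined_column".
const undefinedColumn = "42703"

func isUndefinedColumn(err error) bool {
	var se sqlStateError
	return errors.As(err, &se) && se.SQLState() == undefinedColumn
}

// getVersionAndAutoUpgrade runs query and, if the auto_upgrade column does
// not exist yet (pre-5.0.0 schema), retries with fallback. The fallback's
// values are returned on success; auto upgrade is assumed to be off.
func getVersionAndAutoUpgrade(
	query func() (string, bool, error),
	fallback func() (string, error),
) (string, bool, error) {
	version, autoUpgrade, err := query()
	if err == nil {
		return version, autoUpgrade, nil
	}
	if !isUndefinedColumn(err) {
		return "", false, err
	}
	version, err = fallback()
	if err != nil {
		return "", false, err
	}
	return version, false, nil
}

func main() {
	query := func() (string, bool, error) {
		return "", false, &fakePgError{code: undefinedColumn, msg: `column "auto_upgrade" does not exist`}
	}
	fallback := func() (string, error) { return "3.33.0", nil }
	fmt.Println(getVersionAndAutoUpgrade(query, fallback))
}
```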
## Test plan
Tested locally on a full upgrade from 3.33.0 to 5.0.1
This PR adds graphQL resolver methods to allow handling of the
`auto_upgrade` db column via a toggle switch. It also creates logic to
handle the state of the `auto_upgrade` flag in the `migrator upgrade`
command.
Currently the flag defaults to `false` and cannot be
changed via the frontend or any other method besides manually altering
the table via SQL queries, meaning that `upgrade` will never hit the
`auto_upgrade` path here.
Toggle switch UI elements and API console methods have been separated
into another branch:
https://github.com/sourcegraph/sourcegraph/pull/50206
Finally this PR adds some visibility improvements to the UI and expands
drift by default.
_local dev env for testing_
```
export CODEINTEL_PG_ALLOW_SINGLE_DB=1
export PGUSER=sourcegraph
export PGPASSWORD=sourcegraph
export PGDATABASE=sourcegraph
```
## Test plan
`sg start` was used to check the UI site-admin updates page
`go install cmd/migrator && migrator upgrade` was used to check logic
for handling the `auto_upgrade` column state
---
Part of https://github.com/sourcegraph/sourcegraph/issues/48048
---------
Co-authored-by: DaedalusG <warrenbruceg@gmail.com>
Co-authored-by: Warren Gifford <warren@sourcegraph.com>
Co-authored-by: Thorsten Ball <mrnugget@gmail.com>
We embed a version identifier in our binaries. This adds it to both
tracing and metrics. For tracing we add the tracer tag
"service.version". For metrics we add the gauge "src_version" which has
a `version` label. This gauge can then be joined against other metrics[1].
This is useful for observing the impact of a version change on
metrics. Additionally when receiving debug information from customers
this helps us understand exactly which versions are being run.
[1]: https://www.robustperception.io/exposing-the-software-version-to-prometheus
This adds alerts when sourcegraph is out of date:
- Months 1 & 2 outdated: only the admin is notified. The focus is on upgrading for features/bug fixes.
- At 3 months, inform admins they may be missing important security bug fixes, and warn them that users will be notified at 4+ months.
- 4 months, continue to inform admins of security risks but only inform users of bug fixes.
- 6+ months, we additionally inform users of security risks, but we keep it just as a warning-level alert.
- At +1 year, warnings turn to errors.
See site_alerts_test.go for the complete spec.
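As a rough illustration of the schedule above, the mapping from months out of date to alert levels could look like the following Go sketch. The type and function names are hypothetical, and a few boundaries are assumptions: 0 months means no alert, month 5 behaves like month 4, and "+1 year" is read as 12 months or more. site_alerts_test.go remains the authoritative spec.

```go
package main

import "fmt"

type level string

const (
	levelNone    level = "none"
	levelWarning level = "warning"
	levelError   level = "error"
)

// outdatedAlert describes what admins and users see. The *Security fields
// say whether the message mentions missing security fixes.
type outdatedAlert struct {
	Admin, User                 level
	AdminSecurity, UserSecurity bool
}

// alertsFor maps months out of date to alerts (boundaries partly assumed).
func alertsFor(months int) outdatedAlert {
	switch {
	case months <= 0:
		return outdatedAlert{Admin: levelNone, User: levelNone}
	case months <= 2:
		return outdatedAlert{Admin: levelWarning, User: levelNone}
	case months == 3:
		return outdatedAlert{Admin: levelWarning, AdminSecurity: true, User: levelNone}
	case months <= 5:
		return outdatedAlert{Admin: levelWarning, AdminSecurity: true, User: levelWarning}
	case months < 12:
		return outdatedAlert{Admin: levelWarning, AdminSecurity: true, User: levelWarning, UserSecurity: true}
	default:
		return outdatedAlert{Admin: levelError, AdminSecurity: true, User: levelError, UserSecurity: true}
	}
}

func main() {
	for _, m := range []int{1, 3, 4, 6, 12} {
		fmt.Printf("%2d months: %+v\n", m, alertsFor(m))
	}
}
```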
It also removes animated alerts from Sourcegraph.
Co-authored-by: Stephen Gutekanst <stephen.gutekanst@gmail.com>
* frontend: Enforce upgrade policy
This commit changes the frontend to enforce our upgrade policy on
startup. A `versions` table is introduced, where the latest seen version
of a service in the Sourcegraph architecture is stored.
For the frontend, this is done on startup, before database migrations
are run. If an admin violates our upgrade policy, the frontend will
shut down with a descriptive error, pointing to our documentation.
In a future PR, we'll ensure our upgrade policy is respected by
other services. This will be done by sending the service's version
along to the frontend on calls to `api.InternalClient.WaitForFrontend`
during a service's startup procedure, and having the frontend respond
with an error if the upgrade is invalid, which the service should handle
by logging and shutting itself down.
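A minimal sketch of what such a startup check could look like, in Go. The policy itself is not spelled out in this description; the sketch assumes it is "upgrade at most one minor version at a time", and it follows the fixups below by allowing first runs, non-semantic versions, and all downgrades. The function names and the rule for crossing a major version are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor extracts major and minor from strings like "3.12.4".
// ok is false for non-semantic versions such as "dev".
func parseMajorMinor(v string) (int, int, bool) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, false
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, false
	}
	return major, minor, true
}

// isValidUpgrade reports whether moving from previous (the last version
// stored in the versions table, "" on first run) to latest is allowed.
func isValidUpgrade(previous, latest string) bool {
	if previous == "" {
		return true // first run: nothing recorded yet
	}
	pMaj, pMin, pOK := parseMajorMinor(previous)
	lMaj, lMin, lOK := parseMajorMinor(latest)
	if !pOK || !lOK {
		return true // non-semantic versions are not checked
	}
	switch {
	case pMaj > lMaj:
		return true // downgrade across a major version
	case pMaj == lMaj:
		// same or lower minor (rollback), or exactly one minor step up
		return pMin >= lMin || pMin == lMin-1
	case pMaj == lMaj-1:
		return lMin == 0 // assumption: only the first minor of the next major
	default:
		return false
	}
}

func main() {
	fmt.Println(isValidUpgrade("3.12.0", "3.13.1"))
	fmt.Println(isValidUpgrade("3.11.0", "3.13.0"))
}
```

On an invalid upgrade the frontend would log a descriptive error and exit before running database migrations.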
Fixes #7702
Follow-up in #8382
* fixup! Fix test
* fixup! Add integration tests
* fixup! Add comment
* fixup! Remove redundant import
* fixup! support all rollbacks and downgrades
* fixup! Fix typo
* fixup! Prevent panic on first run
* fixup! Support non semantic versions
* fixup! Add CHANGELOG entry
* fixup! ./dev/generate.sh
* fixup! Fix auto conflict resolution screw up
minversion is for ensuring developers are on a version of go which doesn't
introduce problems locally. It doesn't need to match the version we use in
CI. Currently minversion is higher than what is available from brew. So this
PR just lowers it to the actual minversion required for development purposes.