Closes CORE-99, closes CORE-176
This PR is based on (and also served as a PoC of) [RFC 962: MSP IAM
framework](https://docs.google.com/document/d/1ItJlQnpR5AHbrfAholZqjH8-8dPF1iQcKh99gE6SSjs/edit).
It comes with two main parts:
1. The initial version of the MSP IAM SDK:
`lib/managedservicesplatform/iam`
- Embeds the [OpenFGA server
implementation](https://github.com/openfga/openfga/tree/main/pkg/server)
and exposes a `ClientV1` for interacting with it.
- Automagically manages both MSP IAM's and OpenFGA's database
migrations upon initializing the `ClientV1`.

- Ensures the specified OpenFGA store and authorization model DSL
exist.
- Utility types and helpers to avoid easy mistakes (i.e. make the
relation tuples a bit more strongly-typed).
- Puts all types and pre-defined values together to simulate a
"central registry" and act as a forcing function for services to form
some sort of convention. When we later migrate the OpenFGA server to a
separate standalone service, there will be less of a headache
consolidating types/relations that have similar meanings but different
string literals.
1. The first use case of the MSP IAM:
`cmd/enterprise-portal/internal/subscriptionsservice`
- Added/updated RPCs:
- Listing enterprise subscriptions via permissions
- Update enterprise subscriptions to assign instance domains
- Update enterprise subscriptions membership to assign roles (and
permissions)
- A database table for enterprise subscriptions, only storing the extra
instance domains as Enterprise Portal is not the
writeable-source-of-truth.
## Other minor changes
- Moved `internal/redislock` to `lib/redislock` to be used in MSP IAM
SDK.
- Call `createdb ...` as part of `enterprise-portal` install script in
`sg.config.yaml` (`msp_iam` database is a hard requirement of MSP IAM
framework).
## Test plan
Tested with gRPC UI:
- `UpdateEnterpriseSubscription` to assign an instance domain
- `UpdateEnterpriseSubscriptionMembership` to assign roles
- `ListEnterpriseSubscriptions`:
- List by subscription ID
- List by instance domain
- List by view cody analytics permissions
---------
Co-authored-by: Robert Lin <robert@bobheadxi.dev>
Upgrades rules_oci from `1.4.3` to `1.7.6`, the latest 1.x release of
rules_oci before upgrading to rules_oci 2.x. Upgrading directly from
`1.4.3` to `2.0.0` is a big jump, because a lot has changed in between.
Signed-off-by: thesayyn <thesayyn@gmail.com>
## Test plan
I don't expect any breaking changes. Also, I am assuming the repo
already has test coverage for containers built with rules_oci.
## Changelog
Sandbox escapes be-gone
## Test plan
Tested in CI and locally with `bazel build //client/...` as well as a
lot of blood, sweat n tears tearing through failed sandboxes
## Changelog
CI started failing with a bunch of the following for every apko target
```
...
(10:27:41) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/cmd/batcheshelper/BUILD.bazel:78:11: Action cmd/batcheshelper/wolfi_base_apko failed: missing input file '@@batcheshelper_apko_lock//:lockfile_copy'
(10:27:41) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/cmd/batcheshelper/BUILD.bazel:78:11: Action cmd/batcheshelper/wolfi_base_apko failed: 1 input file(s) do not exist
...
```
This line seemed suspect, so let's replace the 🤨 symlink with an actual full copy instead: https://sourcegraph.com/github.com/chainguard-dev/rules_apko@1d78765293a0baf3f92ca49efa51d6c02b9c828e/-/blob/apko/translate_lock.bzl?L69
## Test plan
CI goes green again 😎
* wip
* gitserver (mostly) wolfi 4 bazel
* the big heck of all things
* Add rules_apko lock translation rules to WORKSPACE
* Call apko_repositories() more
* fix rules_apko to handle our shorter repo urls
* fix workspace from rebase, and missing locks
* visibility on wolfi_base_image
* hand-fix a lock coz apko lock is 🅱️roken
* remove chainguard repo+keyring from base
* update locks
* add chainguard repo+keychain to single server manifest
* unrelated fixes, server+grafana still h*cked
* fix postgres-exporter
* the big fix
* aws lib got bumped?
* downgrade sso-oidc? idk
* ignore wolfi locks from prettier
* dynamically do the locks with a reporule
* document and make nice :nails:
* bazel run @rules_apko//apko patch
* Fix .typo.typo
* Update tooling for end-to-end Bazel images (#61106)
* Update sg wolfi image to build using Bazel
* bazel run @rules_apko//apko patch
* Fix .typo.typo
* Add update-images and implement apko YAML change monitoring
* Use bazel apko and add support for additional repos
* Refactor sg wolfi
* Rework wolfi base image auto-update pipeline
* sg bazel configure
* [rough] Add --check flag to sg wolfi lock
* Refactor sg wolfi lock --check
* Simplify check and update apko lock hash operations
* Fix resolveImagePath when running in bazel
* Fixup logic error in CheckApkoLockHashes
* Tweak DoBaseImageBuild output
* Remove debug output
* Fix sg wolfi lock --check behaviour for all images
* Replace base image build step with apko lock --check
* Remove debug line
* Minor fixups for CI step
* Wrap with AnnotatedCmd
* Fixup annotation
* Update apko lockfiles
* Allow additional repos to be passed
* Update build-base-image.sh with bazel + add back to pipeline
* Ensure that modified base images are rebuilt
* Solve bazelception
* Remove timestamp for bit-level reproducibility
* Skip local keygen when running on buildkite
* Add workaround for lack of local repo support in rules_apko
* Run apkoOps first as it's quick and might fail
* Remove blocking allBaseImagesBuilt step
* Remove unused promethus-gcp image
* Add special cases to resolveImagePath
* Cleanly handle case where no bazel build path exists
This could happen in cases where a base image is only used outside of sourcegraph/sourcegraph,
or if you've added a new base image config but haven't added the associated Bazel scaffolding
* Add debugging around failing docker builds
* More debugging
* Normalise apko_lockfile to match repo.bzl
* Fixup apko docker call
* Try passing imageconfigdir differently to docker
* Run ls in different container
* Soft-fail when using legacy build in Buildkite
* Add missing include
* Workaround for building sourcegraph and sourcegraph-dev
* Add postgresql-client package to server
This contains createdb, which was recently moved from postgresql
* Inflate postgres-12-codeinsights image to avoid rules_apko errors
* Remove update line from yaml files
* Fix issue caused by moving base sourcegraph image
* Remove apk-tools from server
* Update lockfiles
* Address review feedback
* Remove debug lines
* fix unbound var
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
* go mod tidy + gazelle-update-repos after merging main
* Use aspect bazel cache
* Use Aspect bazel caching when calling bazel in bash and sg
* Append annotation
* Run apko lock on aspect agent
* Remove base image builds
Discussion in https://sourcegraph.slack.com/archives/C05EVRLQEUR/p1712307465660509
* Remove unused functionality
* Update BaseImageConfig comments
* Rewrite wolfi-images/README.md
* Add .apko/range.sh to .gitattributes
* Remove "wolfi" from :base_image and :base_tarball targets
* remove allowlist extras from debugging
* Tweak user instructions around package testing
* Add agent healthcheck to buildkite scripts
* prettier
* sg bazel configure
* bazel run //:gazelle-update-repos
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
Co-authored-by: Noah S-C <noah@sourcegraph.com>
As we were including the `go_binary`'s directly in the images, we couldn't set the x_defs attr in order to stamp. Given this, we patch each of the `go_binary` rules to include the x_defs.
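
For context, `x_defs` in rules_go is passed to the Go linker as `-X` flags, which overwrite package-level string variables at link time. The sketch below shows the Go side of that mechanism; the variable name `version` and its default value are assumptions for illustration and are not taken from zoekt.

```go
package main

import "fmt"

// version is a hypothetical stamp target. The linker can overwrite it
// with -X main.version=<value>; without stamping it keeps this default.
var version = "0.0.0+dev"

// versionString is what a binary might print for a --version flag.
func versionString() string {
	return "example " + version
}

func main() {
	fmt.Println(versionString())
}
```

Outside Bazel, the same effect can be had with `go build -ldflags "-X main.version=6.9.50"`.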
## Test plan
```sh
$ VERSION=6.9.50 bazel build @com_github_sourcegraph_zoekt//cmd/zoekt-webserver --stamp --workspace_status_command=./dev/bazel_stamp_vars.sh
...
$ strings bazel-bin/external/com_github_sourcegraph_zoekt/cmd/zoekt-webserver/zoekt-webserver_/zoekt-webserver | grep 8cf8
v0.0.0-20240327102325-8cf8887a903a
```
Workaround for https://github.com/sourcegraph/devx-support/issues/622. This is portable between both bsdtar and gnutar. On gnutar, this is the default, so this changes nothing for CI builds. This only changes behaviour in macOS with bsdtar.
It is unclear to me where a final solution will exist:
- An issue was opened upstream in docker/moby, but the latest opinion is that this is an issue with rules_oci _technically_ emitting docker-compatible formats that are incompatible with docker (I'm not 100% sure yet that docker itself can't create a tarball that would fail to `docker load`, but I don't want to subject Christoph to more experiments lol) https://github.com/moby/moby/issues/47517
- A PR exists in rules_oci to use a hermetic BSD tar instead of system tar (doesn't work on nixos though coz dynamic libraries :sadge:). It uses `mtree` format to add files, I don't know yet if that works around the xattr issue without also passing `--no-xattr` (my current belief is that it does not) https://github.com/bazel-contrib/rules_oci/pull/385
## Test plan
Had Christoph run `bazel run //cmd/batcheshelper:image_tarball`, which succeeded with this patch
Adds a new:
- gazelle generator
- rule + rule targets + catchall target
for generating go-mockgen mocks & testing for their being up-to-date.
Each go_mockgen macro invocation adds targets for generating mocks, copying to the source tree, as well as testing whether the current source tree mocks are up-to-date.
How to use this: `bazel run //dev:go_mockgen` for the catch-all, or `bazel run //some/target:generate_mocks` for an individual package, and `bazel test //some/target:generate_mocks_tests` to test for up-to-date-ness. There is no catch-all for testing.
This currently uses a fork of go-mockgen, with an open PR for upstream here: https://github.com/derision-test/go-mockgen/pull/50.
Closes https://github.com/sourcegraph/sourcegraph/issues/60099
## Test plan
Extensive testing during development, including the following cases:
- Deleting a generated file and its entry in a go_library/go_test `srcs` attribute list and then re-running `sg bazel configure`
- Adding a non-existent output directory to mockgen.test.yaml and running the bash one-liner emitted to prepare the workspace for rerunning `sg bazel configure`
The existing config tests a lot of existing paths anyway (creating mocks for a 3rd party library's interface, entries for a given output file in >1 config file etc)
Another step towards https://github.com/sourcegraph/sourcegraph/issues/59155. Previously, `bazel test //...` would error at analysis time on `//client/web/src/end-to-end:e2e` because it attempted to perform variable substitution for env vars, e.g. `"HEADLESS": "$(E2E_HEADLESS)"`, for values not defined via `--define` (we only set these explicitly in .aspect/bazelrc/ci.sourcegraph.bazelrc and some `sg` targets).
By leveraging https://bazel.build/rules/lib/builtins/actions#run.use_default_shell_env, we can allow the test to read values from `--action_env` while _also_ having explicit values set via `env` macro parameter. Previously, setting `env` macro parameter would completely shadow any `--action_env` values.
Unfortunately, we can't use `--test_env` for this, as `js_run_binary` is an action not a test (or something like that?).
We also can't do env renaming anymore, meaning we have to drop the `E2E_` prefix for some env vars. At least one script (`e2e_test.sh`) needed some reworking to accommodate that.

## Test plan
👁️ CI once again 👁️
Changes:
- Bumps hermetic_cc_toolchain to [v2.1.3](https://github.com/uber/hermetic_cc_toolchain/releases/tag/v2.1.3) for macOS specific fixes
- Removed now-unused `incompat-zig-linux-amd64` bazel config
- Removed now-true-by-default in bazel7 `--incompatible_enable_cc_toolchain_resolution`
- Added `--sandbox_add_mount_pair=/tmp` to `darwin-docker` bazel config as recommended for hermetic_cc_toolchain
- Bumps glibc version used in `darwin-docker` config to be closer to the version in our wolfi images
- zig 0.11.0 (and by extension, hermetic_cc_toolchain) only supports up to 2.34, while wolfi images use 2.37. This is close enough to be fine, an improvement over cross-compiling for 2.31
- Bumps zig to 0.12.0-dev build to fix occasional flakiness
- Brings rust-toolchains.toml in-line again with WORKSPACE
## Test plan
- `bazel build //docker-images/syntax-highlighter:syntect_server --config=darwin-docker` with various combinations of `bazel clean`, `bazel clean --expunge` and `rm -rf /tmp/zig-cache`
- `bazel test //docker-images/syntax-highlighter:image_test --config=darwin-docker`
Closes https://github.com/sourcegraph/sourcegraph/issues/54836
## Test plan
Brought dbconn into the dependency graph of //cmd/executor in a few ways:
1. directly
2. indirectly through a direct dependency
3. indirectly through a transitive dependency
4. inserting/removing from the graph in a few ways to try catch any Fact caching issues
* bump rules
* update rules_go, rules_buf, gazelle
* move proto deps around and update googleapis genproto
* gazelle update repos
* remove gazelle:resolve directives for @go_googleapis
* define rules_proto
* gazelle update repos again
* update github.com/grpc-ecosystem/grpc-gateway to 2.16.1
* 2.16.1 fixed to go_googleapis change so we don't have to patch it
anymore
* remove use of `@go_googleapis` and replaced it with @org_golang_google_genproto_googleapis_api
* remove patching of grpc_gateway_v2 as it is not needed
* bazel configure
* go mod tidy
Updates our reference to the zoekt repo which no longer has bazel. Adds
back in the patches we were previously using to ensure we're building
statically, as it was observed this affected indexing behaviour in
previous testing.
## Test plan
Tested with images built from this branch on a compose deployment, with
no observable issues.
Main motivation is to see the effect on performance for attribution
search.
Note that this included a bump in the otel version used since zoekt is
using a new version. This had a bit of a cascading effect on our third
party deps since they removed a package. So this bumps most 3rd party
packages that directly interacted with otel.
The changed commits are
7643f3b313...45f608ff95
- a176bde1a3 go get -u -t ./...
- e2e8aede00 Fix template documentation comments
- 25c1ea5177 all: observe missing Stats RegexpsConsidered and FlushReason
- 9abbb8b0d3 zoekt-indexserver: Prevent invalid config from causing an NPE
- 008a775ba8 zoekt-indexserver: use value format directive for bad conf warning
- f9d3a0e2e4 zoekt: add fgprof for full profiling
- 3d0bdd5c9c remove ngram offset code
- 45f608ff95 sort ngrams before looking them up
Test Plan: tested in the zoekt repo. Our CI will handle the dep updates.
I eyeballed them and they all look low risk.
Co-authored-by: Jean-Hadrien Chabran <jh@chabran.fr>
I tried this out because I was getting some test failures in go test
that looked like a dependency issue and was wondering if CI wasn't
picking it up. Let's see if CI is green.
Additionally I updated outdated references to how we used to run
gazelle.
Test Plan: CI
Reintroduces the same changes as
https://github.com/sourcegraph/sourcegraph/pull/51104 minus
syntax-highlighter which we're unable to compile with the right
toolchain at the moment.
Tested as a full main-dry-run, as well as running the stack with compose
and checking indexing and syntax-highlighting.
Executors are also built correctly.
## Test plan
CI + manual test via compose.
---------
Co-authored-by: Jean-Hadrien Chabran <jh@chabran.fr>