Picks up some analysis-phase performance improvements from rules_js 2.0.0-rc9. rules_js is very close to 2.0.0 final now; we're waiting on one last improvement that requires an API change for bzlmod.
## Test plan
CI
## Changelog
Second attempt at #63111, a follow-up to
https://github.com/sourcegraph/sourcegraph/pull/63085
rules_oci 2.0 brings a lot of performance improvements around oci_image
and oci_pull, which will benefit Sourcegraph. It will also make RBE
faster and put less load on the remote cache.
However, 2.0 makes some breaking changes:
- oci_tarball's default output is no longer a tarball.
- oci_image no longer compresses layers that are uncompressed; somebody
has to make sure all `pkg_tar` targets have a `compression` attribute
set so the layers are compressed beforehand.
- There is no curl fallback, but this is fine for Sourcegraph as it
already uses Bazel 7.1.
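For illustration, a minimal sketch of what the second breaking change asks for in a BUILD file (target names are placeholders; `compression` is the `pkg_tar` attribute the PR description refers to):

```starlark
load("@rules_pkg//pkg:tar.bzl", "pkg_tar")
load("@rules_oci//oci:defs.bzl", "oci_image")

pkg_tar(
    name = "app_layer",
    srcs = [":app"],
    # rules_oci 2.0 no longer compresses uncompressed layers for us,
    # so compress the layer tarball here instead.
    compression = "gzip",
)

oci_image(
    name = "image",
    base = "@wolfi_base",
    tars = [":app_layer"],
)
```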
I checked all targets that use oci_tarball as thoroughly as I could to
make sure nothing depends on its default tarball output. One target did
use the default output; I left a TODO for somebody else (somebody who is
more on top of the repo) to tackle **later**.
## Test plan
CI. Also run delivery on this PR (don't land those changes)
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
Previously we got soft warnings about permissions in badly packaged
npm packages:
```
(21:31:30) WARNING: Remote Cache: /mnt/ephemeral/output/__main__/execroot/__main__/bazel-out/k8-fastbuild/bin/node_modules/.aspect_rules_js/its-fine@1.1.1_react_18.1.0/node_modules/its-fine/src/index.tsx (Permission denied)
```
When trying to enable the compact execution log, this becomes a hard failure:
```
(14:44:58) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/BUILD.bazel:33:22: Extracting npm package its-fine@1.1.1_react_18.1.0 failed: IOException while logging spawn: /mnt/ephemeral/output/__main__/execroot/__main__/bazel-out/k8-fastbuild/bin/node_modules/.aspect_rules_js/its-fine@1.1.1_react_18.1.0/node_modules/its-fine/dist/index.cjs (Permission denied)
```
This bump should fix that.
## Test plan
CI still builds successfully
## Changelog
[Linear
Issue](https://linear.app/sourcegraph/project/claude-3-on-gcp-8c014e1a3506/overview)
This PR adds support for Anthropic models in the Google provider through
Google Vertex.
NOTE: The current code only supported the Google Gemini API and had
boilerplate code for Google Vertex (only for the Gemini model). This PR
adds proper Google Vertex support for Anthropic models, so the Google
provider can run in three different configurations:
1. Google Gemini API: this works, but only for chat and not for
completions, which is the intended behaviour for now.
2. Google Vertex API with Anthropic models: added in this PR and tested
for both chat and completions; it works great.
3. Google Vertex API with Gemini models: this doesn't work yet. It can
eventually be added, but we need a new decoder for the streaming
responses of the Gemini model through this API; we can take care of
that later.
Sense of urgency: this is a P0 because of enterprise requirements, so I
would appreciate a fast approval and merge.
## Test plan
- Run this branch for Cody instance ->
https://github.com/sourcegraph/cody/pull/4606
- Ask @arafatkatze to dm you the siteadmin config to make things work
- Check the logs and play with completions and chat
## Changelog
---------
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Co-authored-by: Beatrix <beatrix@sourcegraph.com>
Co-authored-by: Stephen Gutekanst <stephen@sourcegraph.com>
Follow-up to https://github.com/sourcegraph/sourcegraph/pull/63085
rules_oci 2.0 brings a lot of performance improvements around oci_image
and oci_pull, which will benefit Sourcegraph. It will also make RBE
faster and put less load on the remote cache.
However, 2.0 makes some breaking changes:
- oci_tarball's default output is no longer a tarball.
- oci_image no longer compresses layers that are uncompressed; somebody
has to make sure all `pkg_tar` targets have a `compression` attribute
set so the layers are compressed beforehand.
- There is no curl fallback, but this is fine for Sourcegraph as it
already uses Bazel 7.1.
I checked all targets that use oci_tarball as thoroughly as I could to
make sure nothing depends on its default tarball output. One target did
use the default output; I left a TODO for somebody else (somebody who is
more on top of the repo) to tackle later.
## Test plan
I am assuming that the repo has enough tests to catch potential problems
on CI. Also, somebody who knows the repo better should double-check my
changes.
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
Upgrades rules_oci from `1.4.3` to `1.7.6`, the latest 1.x release of
rules_oci, before upgrading to rules_oci 2.x. Upgrading directly from
`1.4.3` to `2.0.0` is a big jump, because a lot has changed in between.
Signed-off-by: thesayyn <thesayyn@gmail.com>
## Test plan
I don't expect any breaking changes. Also, I am assuming the repo
already has test coverage for containers built with rules_oci.
## Changelog
Sandbox escapes be-gone
## Test plan
Tested in CI and locally with `bazel build //client/...`, as well as a
lot of blood, sweat, and tears tearing through failed sandboxes.
## Changelog
`exclude_declarations_from_npm_packages` is no longer a recognized
option, as seen in the output:
```
ERROR: @aspect_rules_js//npm:exclude_declarations_from_npm_packages :: Unrecognized option: @aspect_rules_js//npm:exclude_declarations_from_npm_packages
```
Also upgraded to `rc3` while diagnosing this.
Closes https://github.com/sourcegraph/devx-support/issues/1005
## Test plan
Tested locally + CI
```
sg bazel test //internal/appliance/reconciler:reconciler_test
INFO: Invocation ID: 70da4295-36f2-43a8-a71e-9b11ae489657
WARNING: Build option --modify_execution_info has changed, discarding analysis cache (this can be expensive, see https://bazel.build/advanced/performance/iteration-speed).
INFO: Analyzed target //internal/appliance/reconciler:reconciler_test (0 packages loaded, 17313 targets configured).
INFO: Found 1 test target...
Target //internal/appliance/reconciler:reconciler_test up-to-date:
bazel-bin/internal/appliance/reconciler/reconciler_test_/reconciler_test
Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //internal/appliance/reconciler:reconciler_test up-to-date (nothing to build)
Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //internal/appliance/reconciler:reconciler_test up-to-date (nothing to build)
INFO: Elapsed time: 1.210s, Critical Path: 0.11s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
//internal/appliance/reconciler:reconciler_test (cached) PASSED in 19.2s
```
## Changelog
- Remove the deprecated option `exclude_declarations_from_npm_packages`
from local.bazelrc
- Update to rc3 of rules_js
Bumps rules_js (and friends) to the 2.0 RCs.
This brings in analysis-phase performance improvements, since npm package depsets are now much smaller. It also adds support for pnpm v9 and allows linking js_library targets as first-party deps instead of npm_package targets. See https://github.com/aspect-build/rules_js/issues/1671 for more details.
## Test plan
CI
## Changelog
See [RFC 885 Sourcegraph Enterprise Portal (go/enterprise-portal)](https://docs.google.com/document/d/1tiaW1IVKm_YSSYhH-z7Q8sv4HSO_YJ_Uu6eYDjX7uU4/edit#heading=h.tdaxc5h34u7q) - closes CORE-6. The only files requiring in-depth review are the `.proto` files, as everything else is generated:
- `lib/enterpriseportal/subscriptions/v1/subscriptions.proto`
- `lib/enterpriseportal/codyaccess/v1/codyaccess.proto`
This PR only introduces API definitions - implementation will come as subsequent PRs, tracked in the ["Launch Enterprise Portal" Linear project](https://linear.app/sourcegraph/project/launch-sourcegraph-enterprise-portal-ee5d9ea105c2).
Before reviewing the diffs, **please review this PR description in depth**.
### Design goals
This initial schema aims to help achieve CORE-97 by adding our initial "get subscription Cody access" capability, as well as our general Stage 1 goal of providing read-only access to our existing enterprise subscription mechanisms. In doing so, we can start to reshape the API in a way that accommodates future growth and addresses some debt we have accumulated over time, before the Stage 2 goal of having the new Enterprise Portal be the source of truth for all things subscriptions.
I am also aiming for a conservative approach with the Cody Gateway access related RPCs, to ease migration risks and allow for Cody teams to follow up quickly with more drastic changes in a V2 of the service after a Core-Services-driven migration to use the new service: https://github.com/sourcegraph/sourcegraph/pull/62263#issuecomment-2101874114
### Design overview
- **Multiple services**: Enterprise Portal aims to be the home of most Enterprise-related subscription and access management, but each component should be defined as a separate service to maintain clear boundaries between "core" capabilities and future extensions. One problem we see in `dotcom { productSubscriptions }` is that embedding additional concepts like Cody Gateway access makes the API surface unwieldy and brittle, and encourages an internal design that bundles everything together (the `product_subscriptions` table has 10 `cody_gateway_*` columns today). More concretely, this PR designs 2 services that Enterprise Portal will implement:
- `EnterprisePortalSubscriptionsService` (`subscriptions.proto`): subscriptions and licenses CRUD
- `EnterprisePortalCodyGatewayService` (`codygateway.proto`): Enterprise Cody Gateway access
- **Multiple protocols**: We use [ConnectRPC](https://connectrpc.com/) to generate traditional gRPC handlers for service-to-service use, but also a plain HTTP/1 "REST"-ish protocol (the ["Connect Protocol"](https://connectrpc.com/docs/protocol)) that works for web clients and simple integrations. Go bindings for the Connect protocol are generated into the `v1connect` subpackages.
- **Future licensing model/mechanism changes**: The _Subscription_ model is designed to remain static, but _Licenses_ are designed to accommodate future changes - `EnterpriseSubscriptionLicenseType` and `EnterpriseSubscriptionLicense` in this PR describe only the current type of license, referred to as "classic licenses", but we can extend this in the future for e.g. new models (refreshable licenses?), new products (Cody-only? PLG enterprise?), or existing problems (test instance licenses?).
- **Granular history**: Instead of `createdAt`, `isArchived`, `revokedAt`, and so on, the new API defines Kubernetes-style `conditions` for licenses and subscriptions to describe creation, archival, and revocation events, which can be more flexibly extended for future events and a lightweight audit log of major changes to a subscription or license. In particular, `revokedAt` already has a `revokedReason` - this allows us to extend these important events with additional metadata in a flexible manner.
- **Pagination**: I couldn't find a shared internal or off-the-shelf representation of pagination attributes, but each `List*` RPC describes `page_size`, `page_token`, and `next_page_token`.
- **Querying/filtering**: I couldn't find a strong standard for this either, but in general:
- `Get*` accepts `query` that is a `oneof`, with the goal of providing exact matches only.
- `List*` accepts `repeated filter`, where each `filter` is a `oneof` a set of strategies relevant to a particular `List*` RPC. Multiple filters are treated as `AND`-concatenated.
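A minimal sketch of the pagination and filtering shape described above (message and field names here are illustrative, not copied from the actual `.proto` files):

```protobuf
syntax = "proto3";

message ListEnterpriseSubscriptionsRequest {
  // Standard pagination attributes.
  int32 page_size = 1;
  string page_token = 2;
  // Multiple filters are AND-concatenated.
  repeated ListEnterpriseSubscriptionsFilter filters = 3;
}

message ListEnterpriseSubscriptionsFilter {
  // Each filter is a oneof over the strategies relevant to this RPC.
  oneof filter {
    string subscription_id = 1;
    bool is_archived = 2;
  }
}

message ListEnterpriseSubscriptionsResponse {
  // EnterpriseSubscription is defined elsewhere in subscriptions.proto.
  repeated EnterpriseSubscription subscriptions = 1;
  // Opaque token for fetching the next page; empty when exhausted.
  string next_page_token = 2;
}
```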
Some major changes from the existing model:
- **Downgrade the concept of "subscription access token"**: this was built for Cody Gateway, but I am not sure it has aged well: the mechanism is still tied to individual licenses, has not found new non-Cody-Gateway use cases (except for license checks, though those do not require an "access token" model either), and today still does not provide "true" access tokens, as they cannot be expired or managed properly. This PR relegates the concept to remain Cody-specific, as it effectively is today, so that we might be able to introduce a better subscription-wide model if the use case arises. Over time, we may want to make this even more opaque, relying entirely on zero-config instead (generating from license keys).
- **Subscriptions are no longer attached to a single dotcom user**: Most of these users today are not real users anyway, as our license creation process asks that you create a fake user account (["User account: [...] We create a company-level account for this."](https://handbook.sourcegraph.com/departments/technical-success/ce/process/license_keys/#license-key-mechanics)). The new API removes the concept entirely, in favour of a true user access management system in CORE-102.
- **Database/GraphQL IDs** are no longer exposed - we use external, prefixed UUIDs for representing entities over APIs in a universal manner.
- **Per-subscription Cody Gateway access no longer exposes `allowed models`**: I suggested this to @rafax in light of recent problems with propagating new models to Enterprise customers. He agreed that the general product direction is "model options as a selling point" - it no longer makes sense to configure these at a per-subscription level. Instead, the Cody Gateway service should configure globally allowed models directly, and each Sourcegraph instance can determine what models they trust. If we really need this back we can add it later, but for now I think this removal is the right direction.
### Direct translations
`cmd/cody-gateway/internal/dotcom/operations.graphql` defines our key dependencies for achieving CORE-97. The concepts referred to in `operations.graphql` translate to this new API as follows:
- `dotcom { productSubscriptionByAccessToken(accessToken) }`: `codygateway.v1.GetCodyGatewayAccess({ access_token })`
- `dotcom { productSubscriptions }`: `codygateway.v1.ListCodyGatewayAccess()`
- `fragment ProductSubscriptionState`:
- `id`: **n/a**
- `uuid`: `subscriptions.v1.EnterpriseSubscription.id`
- `account { username }`: `subscriptions.v1.EnterpriseSubscription.display_name`
- `isArchived`: `subscriptions.v1.EnterpriseSubscription.conditions`
- `codyGatewayAccess { ... }`: **separate RPC to `codygateway.v1.GetCodyGatewayAccess`**
- `activeLicense { ... }`: **separate RPC to `subscriptions.v1.ListEnterpriseSubscriptionLicenses`**
### Why `lib/enterpriseportal`?
We recently had to make the same move for Telemetry Gateway: #62061. Inevitably, there will be services living outside the monorepo that want to integrate with Enterprise Portal (one is already on our roadmap: Cody Analytics in https://github.com/sourcegraph/cody-analytics). Placing the definitions in `lib` lets us share generated bindings and some useful helpers while keeping things in the monorepo.
### Implications for Cody Clients
For now (and in the future), nothing is likely to change. Here's how I imagine things playing out:
```mermaid
graph TD
cc["Cody Clients"] -- unified API --> cg[services like Cody Gateway]
cg -- PLG users --> ssc[Self-Serve Cody]
cg -- Enterprise users --> ep[Enterprise Portal]
```
## Test plan
CI passes, the schemas can be generated by hand:
```
sg gen buf \
lib/enterpriseportal/subscriptions/v1/buf.gen.yaml \
lib/enterpriseportal/codyaccess/v1/buf.gen.yaml
```
---------
Co-authored-by: Joe Chen <joe@sourcegraph.com>
Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
chore(rel): bump minor for stitch graph + add support invalidating migrations repo rule (#62490)
* chore(bzl): allow to invalidate migrations repo rule
* chore(bzl): gen stitch graph for 5.4
Our Rust binaries (e.g. scip-ctags/syntect_server) were being built in some mix of opt and fastbuild mode[1]. Unlike Go, which has no release/debug mode flag, Rust requires opting into optimized release builds. We can do that automagically when building any oci_image target with the power of ✨ transitions ✨.
This has the side-effect of our Go binaries no longer being stripped & containing debug symbols, see https://github.com/bazelbuild/rules_go/issues/3917
Also to note, [in Cargo.toml we opt into debug symbols in release mode](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@nsc/bazel-release-mode-rust/-/blob/docker-images/syntax-highlighter/Cargo.toml?L67%3A11-70%3A9). Is this preserved by this PR for bazel[2]?
[1] `strings` on the binaries showed the 3rd-party crates having `k8-opt` filepath names e.g.
```
$ / # strings syntect_server | grep k8-
/tmp/bazel-working-directory/__main__/bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/crate_index__onig_sys-69.8.1/onig_sys_build_script_.runfiles/crate_index__onig_sys-69.8.1
```
but the final binaries (and the 1st-party crates) themselves were being built in fastbuild mode. See https://github.com/sourcegraph/devx-support/issues/790 for original point of contact
[2] It seems like it may be preserved, but I don't know how reliable this is for Rust binaries:
```
$ file bazel-bin/docker-images/syntax-highlighter/scip-ctags
bazel-bin/docker-images/syntax-highlighter/scip-ctags: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.0.0, with debug_info, not stripped
```
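The transition approach mentioned above can be sketched roughly like this (rule and attribute names are illustrative; the real wiring lives in the repo's image macros):

```starlark
# Hypothetical sketch: force -c opt via a Starlark transition when
# building a binary destined for an oci_image.
def _opt_transition_impl(settings, attr):
    return {"//command_line_option:compilation_mode": "opt"}

_opt_transition = transition(
    implementation = _opt_transition_impl,
    inputs = [],
    outputs = ["//command_line_option:compilation_mode"],
)

def _opt_binary_impl(ctx):
    # With a transition on the attr, ctx.attr.binary is a list of one
    # target built in the transitioned (opt) configuration; forward it.
    binary = ctx.attr.binary[0]
    return [DefaultInfo(files = binary[DefaultInfo].files)]

opt_binary = rule(
    implementation = _opt_binary_impl,
    attrs = {
        "binary": attr.label(cfg = _opt_transition),
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
```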
## Test plan
Tested for sourcegraph/scip-ctags image:
```
/ $ strings scip-ctags | grep "Could not parse file"
/ $ echo $?
1
```
CI started failing with a bunch of the following for every apko target
```
...
(10:27:41) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/cmd/batcheshelper/BUILD.bazel:78:11: Action cmd/batcheshelper/wolfi_base_apko failed: missing input file '@@batcheshelper_apko_lock//:lockfile_copy'
(10:27:41) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/cmd/batcheshelper/BUILD.bazel:78:11: Action cmd/batcheshelper/wolfi_base_apko failed: 1 input file(s) do not exist
...
```
This line seemed suspect, so let's replace it with an actual full copy instead of a 🤨 symlink: https://sourcegraph.com/github.com/chainguard-dev/rules_apko@1d78765293a0baf3f92ca49efa51d6c02b9c828e/-/blob/apko/translate_lock.bzl?L69
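Roughly, the swap inside the repository rule looks like this (function names are illustrative; the real change patches rules_apko's `translate_lock.bzl`):

```starlark
def _materialize_lockfile(rctx, lock_label):
    # Before: rctx.symlink(lock_label, "lockfile_copy") -- the symlink
    # points outside the external repo, so downstream actions see a
    # missing input file.
    # After: write a real copy of the lockfile into the external repo.
    rctx.file("lockfile_copy", rctx.read(rctx.path(lock_label)))
```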
## Test plan
CI goes green again 😎
* wip
* gitserver (mostly) wolfi 4 bazel
* the big heck of all things
* Add rules_apko lock translation rules to WORKSPACE
* Call apko_repositories() more
* fix rules_apko to handle our shorter repo urls
* fix workspace from rebase, and missing locks
* visibility on wolfi_base_image
* hand-fix a lock coz apko lock is 🅱️roken
* remove chainguard repo+keyring from base
* update locks
* add chainguard repo+keychain to single server manifest
* unrelated fixes, server+grafana still h*cked
* fix postgres-exporter
* the big fix
* aws lib got bumped?
* downgrade sso-oidc? idk
* ignore wolfi locks from prettier
* dynamically do the locks with a reporule
* document and make nice :nails:
* bazel run @rules_apko//apko patch
* Fix .typo.typo
* Update tooling for end-to-end Bazel images (#61106)
* Update sg wolfi image to build using Bazel
* bazel run @rules_apko//apko patch
* Fix .typo.typo
* Add update-images and implement apko YAML change monitoring
* Use bazel apko and add support for additional repos
* Refactor sg wolfi
* Rework wolfi base image auto-update pipeline
* sg bazel configure
* [rough] Add --check flag to sg wolfi lock
* Refactor sg wolfi lock --check
* Simplify check and update apko lock hash operations
* Fix resolveImagePath when running in bazel
* Fixup logic error in CheckApkoLockHashes
* Tweak DoBaseImageBuild output
* Remove debug output
* Fix sg wolfi lock --check behaviour for all images
* Replace base image build step with apko lock --check
* Remove debug line
* Minor fixups for CI step
* Wrap with AnnotatedCmd
* Fixup annotation
* Update apko lockfiles
* Allow additional repos to be passed
* Update build-base-image.sh with bazel + add back to pipeline
* Ensure that modified base images are rebuilt
* Solve bazelception
* Remove timestamp for bit-level reproducibility
* Skip local keygen when running on buildkite
* Add workaround for lack of local repo support in rules_apko
* Run apkoOps first as it's quick and might fail
* Remove blocking allBaseImagesBuilt step
* Remove unused promethus-gcp image
* Add special cases to resolveImagePath
* Cleanly handle case where no bazel build path exists
This could happen in cases where a base image is only used outside of sourcegraph/sourcegraph,
or if you've added a new base image config but haven't added the associated Bazel scaffolding
* Add debugging around failing docker builds
* More debugging
* Normalise apko_lockfile to match repo.bzl
* Fixup apko docker call
* Try passing imageconfigdir differently to docker
* Run ls in different container
* Soft-fail when using legacy build in Buildkite
* Add missing include
* Workaround for building sourcegraph and sourcegraph-dev
* Add postgresql-client package to server
This contains createdb, which was recently moved from postgresql
* Inflate postgres-12-codeinsights image to avoid rules_apko errors
* Remove update line from yaml files
* Fix issue caused by moving base sourcegraph image
* Remove apk-tools from server
* Update lockfiles
* Address review feedback
* Remove debug lines
* fix unbound var
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
* go mod tidy + gazelle-update-repos after merging main
* Use aspect bazel cache
* Use Aspect bazel caching when calling bazel in bash and sg
* Append annotation
* Run apko lock on aspect agent
* Remove base image builds
Discussion in https://sourcegraph.slack.com/archives/C05EVRLQEUR/p1712307465660509
* Remove unused functionality
* Update BaseImageConfig comments
* Rewrite wolfi-images/README.md
* Add .apko/range.sh to .gitattributes
* Remove "wolfi" from :base_image and :base_tarball targets
* remove allowlist extras from debugging
* Tweak user instructions around package testing
* Add agent healthcheck to buildkite scripts
* prettier
* sg bazel configure
* bazel run //:gazelle-update-repos
---------
Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
Co-authored-by: Noah S-C <noah@sourcegraph.com>
Revert "bazel: migrate dind dockerfile to rules_oci (#61788)"
This reverts commit e4de7a46c1.
Google Container Registry doesn't like this image for some reason, so we're gonna hold out until we've moved to Artifact Registry.
`bazel build` on Percy mocha targets (such as //client/web/src/integration:integration-tests) no longer results in actually running the test!
Originally, we used `js_run_binary` with `build_test` because `js_test` doesn't support stamping, and we need to be able to read volatile variables for Percy.
Then, we worked around https://github.com/bazelbuild/bazel/issues/16231 in https://github.com/sourcegraph/sourcegraph/pull/58505 by not explicitly depending on the stamp variables, but exploiting a bit of a hack to read them anyway (will this work with RBE?).
Now, given that we're not explicitly stamping and still using the hack, we can use `js_test` instead, so the tests no longer run as part of `bazel build` and only run under `bazel test` (as is good 😌).
It is apparently possible to work around https://github.com/bazelbuild/bazel/issues/16231 when using disk/remote caches, but only for local builds (so no remote builds), according to [this comment](https://github.com/bazelbuild/bazel/issues/16231#issuecomment-1772835555). However, we would still need:
1. `js_test` to support stamping, and
2. the workaround to also apply to remote execution (as we're considering that once it's supported in Aspect Workflows).
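The swap described above, roughly (target and file names are illustrative, not the actual targets under //client/...):

```starlark
load("@aspect_rules_js//js:defs.bzl", "js_test")

# Before: js_run_binary + build_test meant `bazel build` executed the
# integration tests as a build action.
# After: js_test only executes under `bazel test`.
js_test(
    name = "integration-tests",
    data = [":integration-test-sources"],
    entry_point = ":run-tests.js",
)
```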
TODO: update doc/dev/background-information/bazel/web_overview.md in the new docs repo.
## Test plan
CI 🎉
We use the GCS JSON API directly instead of gsutil or gcloud because:
- gsutil may spend up to ~1m20s trying to contact metadata.google.internal,
with no discovered way to disable that
- gcloud disallows unauthenticated access to even a public bucket
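For illustration, reading a public object through the JSON API needs nothing but curl (bucket and object names below are placeholders, not the ones the rule actually fetches):

```shell
# Build a GCS JSON API media URL for a public object. Slashes in the
# object name must be percent-encoded in the URL path.
bucket="my-public-bucket"
object="stitch/graph.json"
encoded=$(printf '%s' "$object" | sed 's|/|%2F|g')
url="https://storage.googleapis.com/storage/v1/b/${bucket}/o/${encoded}?alt=media"
echo "$url"
# curl -fsSL "$url"  # unauthenticated fetch works for public buckets
```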
TODO: in future iterations we can explore how to properly invalidate this. For now we can force a refresh with `bazel sync`.
## Test plan
`bazel run //internal/database/migration/shared:write_stitched_migration_graph` runs successfully and without a change to the file
Man, I don't even know. Why does google.golang.org/protobuf even depend on github.com/golang/protobuf if the latter is deprecated and explicitly states to use the former? https://github.com/golang/protobuf/issues/1596
## Test plan
go build & CI
Workaround for https://github.com/sourcegraph/devx-support/issues/622. This is portable between bsdtar and gnutar. On gnutar this is the default, so it changes nothing for CI builds; it only changes behaviour on macOS with bsdtar.
It is unclear to me where a final solution will land:
- An issue was opened upstream in docker/moby, but the latest opinion is that this is an issue with rules_oci _technically_ emitting Docker-compatible formats that are nonetheless incompatible with docker (I'm not 100% sure yet that docker itself can't create a tarball that would fail to `docker load`, but I don't want to subject Christoph to more experiments lol): https://github.com/moby/moby/issues/47517
- A PR exists in rules_oci to use a hermetic BSD tar instead of the system tar (doesn't work on NixOS though, because of dynamic libraries :sadge:). It uses the `mtree` format to add files; I don't know yet whether that works around the xattr issue without also passing `--no-xattr` (my current belief is that it does not): https://github.com/bazel-contrib/rules_oci/pull/385
## Test plan
Had Christoph run `bazel run //cmd/batcheshelper:image_tarball`, which succeeded with this patch
`//dev:go_mockgen` had `suggested_update_target` set to `//dev:write_all_generated` without actually being part of that list. There are two options here: update `suggested_update_target` to `//dev:go_mockgen` for go_mockgen targets, or add the former to the latter. We're going with the latter (we could also do both, but we want to direct people towards write_all_generated, even though it does more work in the worst case).
## Test plan
CI and `bazel run //dev:write_all_generated`
with_cfg.bzl is a lot of complicated Starlark. We can avoid having to understand it when debugging needs arise by doing what it does by hand, specific to what we need (aka super cut down and simplified).
Extracted from https://github.com/sourcegraph/sourcegraph/compare/jcjh/msp-bazel-delivery#diff-1a8a445b4ce2a72080eca8ae2a3ae24bc904175e9d9aa2dd1938a29746ae86a3 while we were debugging why AW delivery was being problematic (this wasn't the reason, but I made the change in case it _was_ causing issues, and it'd be more understandable to read).
## Test plan
`bazel run //cmd/batcheshelper:image_tarball && docker run batcheshelper:candidate --help`
Adds a new:
- gazelle generator
- rule + rule targets + catch-all target
for generating go-mockgen mocks and testing that they are up to date.
Each go_mockgen macro invocation adds targets for generating mocks, copying them to the source tree, and testing whether the current source-tree mocks are up to date.
How to use this: `bazel run //dev:go_mockgen` for the catch-all, or `bazel run //some/target:generate_mocks` for an individual package, and `bazel test //some/target:generate_mocks_tests` to test for up-to-date-ness. There is no catch-all for testing.
This currently uses a fork of go-mockgen, with an open PR for upstream here: https://github.com/derision-test/go-mockgen/pull/50.
Closes https://github.com/sourcegraph/sourcegraph/issues/60099
## Test plan
Extensive testing during development, including the following cases:
- Deleting a generated file and its entry in a go_library/go_test `srcs` attribute list and then re-running `sg bazel configure`
- Adding a non-existent output directory to mockgen.test.yaml and running the bash one-liner emitted to prepare the workspace for rerunning `sg bazel configure`
The existing config exercises a lot of existing paths anyway (creating mocks for a 3rd-party library's interface, entries for a given output file in >1 config file, etc.).
Does what it says on the tin.
Caveat:
As this doesn't use the built-in downloaders, it probably can't make use of the repository cache. While it won't refetch every single time (there is _some_ degree of caching), I'm not sure what will cause it to skip the cached copy and refresh. It's a very fast operation, though.
See https://github.com/bazelbuild/bazel/issues/19267
## Test plan
`bazel build //internal/database/migration/shared:generate_stitched_migration_graph`