Commit Graph

157 Commits

Author SHA1 Message Date
Greg Magolan
c60ae9cd67
build(bazel): upgrade to rules_oci 2.0.0-beta2 (#64364)
beta2 release came out yesterday. A few minor fixes on top of beta1.

https://github.com/bazel-contrib/rules_oci/releases/tag/v2.0.0-beta2

## Test plan

CI

## Changelog
2024-08-09 13:32:12 +01:00
Greg Magolan
119f17578c
build(bazel): upgrade to latest aspect bazel-lib and rules_js rules (#64365)
Some analysis phase performance improvements in rules_js 2.0.0-rc9 that
are worth picking up. rules_js is very close to 2.0.0 final now. Waiting
on one last improvement requiring an API change for bzlmod.

## Test plan

CI

## Changelog
2024-08-08 16:57:54 -07:00
Noah S-C
b9c4e2aae9
Revert "Revert "refactor: upgrade to rules_oci 2.0 (2nd attempt)"" (#64354)
Reverts sourcegraph/sourcegraph#64351

## Test plan

Need to test on main due to main-only CI steps (even with main dry-run)
2024-08-08 09:00:08 +00:00
Noah S-C
addba96f47
Revert "refactor: upgrade to rules_oci 2.0 (2nd attempt)" (#64351)
Reverts sourcegraph/sourcegraph#63829

Not working with Aspect Delivery

## Test plan

CI
2024-08-07 22:15:21 +00:00
Greg Magolan
be015c58c2
refactor: upgrade to rules_oci 2.0 (2nd attempt) (#63829)
2nd attempt of #63111, a follow-up to
https://github.com/sourcegraph/sourcegraph/pull/63085

rules_oci 2.0 brings a lot of performance improvements around oci_image
and oci_pull, which will benefit Sourcegraph. It will also make RBE
faster and put less load on the remote cache.

However, 2.0 makes some breaking changes like

- oci_tarball's default output is no longer a tarball
- oci_image no longer compresses layers that are uncompressed, so somebody
has to make sure all `pkg_tar` targets have a `compression` attribute
set to compress them beforehand.
- there is no curl fallback, but this is fine for sourcegraph as it
already uses bazel 7.1.

I checked all targets that use oci_tarball as thoroughly as I could to make
sure nothing depends on the default tarball output of oci_tarball. There
was one target that used the default output, for which I left a TODO for
somebody else (somebody who is more on top of the repo) to tackle
**later**.

## Test plan

CI. Also run delivery on this PR (don't land those changes)

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
2024-08-07 22:21:49 +01:00
Jean-Hadrien Chabran
0fd6235fa4
chore(rel): prepare stitch graph for 5.6 (#64343)
The only manual action for bumping a minor but we'll get there :) 

## Test plan

CI
2024-08-07 19:25:15 +02:00
Warren Gifford
cb19d6f0a9
release/bug: generate a new stitched migration graph (#63764)
This will correct the upgrade path for MVU plan creation

## Test plan

CI test


## Changelog


---------

Co-authored-by: Release Bot <107104610+sourcegraph-release-bot@users.noreply.github.com>
Co-authored-by: Jean-Hadrien Chabran <jean-hadrien.chabran@sourcegraph.com>
Co-authored-by: Anish Lakhwara <anish+git@lakhwara.com>
Co-authored-by: Jean-Hadrien Chabran <jh@chabran.fr>
Co-authored-by: Anish Lakhwara <anish+github@lakhwara.com>
2024-07-10 14:49:18 -07:00
Jean-Hadrien Chabran
087ad83995
chore(release): bump stitch graph generation (#63767)
Missing bit for the minor release version bump

## Test plan

CI

2024-07-10 13:36:02 -07:00
Jean-Hadrien Chabran
2645a9b04d
chore(migrator): bump migration archive (#63752)
Routine update, as this is still a manual process.

## Test plan

CI
2024-07-10 15:10:32 +02:00
Noah S-C
4021ec0aec
chore(bazel): bump rules_js to address permissions denied warning (#63419)
Previously we were having soft-warnings around permissions in badly made
npm packages
```
(21:31:30) WARNING: Remote Cache: /mnt/ephemeral/output/__main__/execroot/__main__/bazel-out/k8-fastbuild/bin/node_modules/.aspect_rules_js/its-fine@1.1.1_react_18.1.0/node_modules/its-fine/src/index.tsx (Permission denied)
```
When trying to enable compact execution log, this becomes a hard fail
```
(14:44:58) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/BUILD.bazel:33:22: Extracting npm package its-fine@1.1.1_react_18.1.0 failed: IOException while logging spawn: /mnt/ephemeral/output/__main__/execroot/__main__/bazel-out/k8-fastbuild/bin/node_modules/.aspect_rules_js/its-fine@1.1.1_react_18.1.0/node_modules/its-fine/dist/index.cjs (Permission denied)
```
This bump should fix that

## Test plan

CI still builds successfully


## Changelog
2024-06-21 19:31:53 +02:00
Ara
1a6a7f78bf
Adding Anthropic messages API support to the Google provider through Google vertex (#63282)
[Linear
Issue](https://linear.app/sourcegraph/project/claude-3-on-gcp-8c014e1a3506/overview)

This PR adds support for Anthropic models in the Google provider through
Google Vertex.
NOTE: Previously the code only supported the Google Gemini API and had
boilerplate code for Google Vertex (Gemini model only). This PR adds
proper Google Vertex support for Anthropic models, so the Google
provider can now run in 3 different configurations:
1. Google Gemini API (this works, but only for chat and not for
completions, which is the intended behaviour for now)
2. Google Vertex API, Anthropic model (added in this PR; tested and
working for both chat and completions)
3. Google Vertex API, Gemini model (this doesn't work yet; it can
eventually be added to this codebase, but it needs a new decoder for
the streaming responses of the Gemini model through this API, which we
can take care of later)

Sense of Urgency: This is a P0 because of enterprise requirements so I
would appreciate a fast approval and merging.



## Test plan
- Run this branch for Cody instance ->
https://github.com/sourcegraph/cody/pull/4606
- Ask @arafatkatze to dm you the siteadmin config to make things work
- Check the logs and play with completions and chat



## Changelog


---------

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Co-authored-by: Beatrix <beatrix@sourcegraph.com>
Co-authored-by: Stephen Gutekanst <stephen@sourcegraph.com>
2024-06-20 10:50:15 -07:00
Noah S-C
c8b583f8e6
Revert "refactor: upgrade to rules_oci 2.0" (#63200)
Reverts sourcegraph/sourcegraph#63111
Issue with jobs that only run on main

### Test plan

:wat:
2024-06-11 14:23:53 +02:00
Sahin Yort
c12fd6db87
chore(bazel): upgrade to rules_oci 2.0 (#63111)
Follow up https://github.com/sourcegraph/sourcegraph/pull/63085

rules_oci 2.0 brings a lot of performance improvements around oci_image
and oci_pull, which will benefit Sourcegraph. It will also make RBE
faster and put less load on the remote cache.

However, 2.0 makes some breaking changes like 

- oci_tarball's default output is no longer a tarball
- oci_image no longer compresses layers that are uncompressed, somebody
has to make sure all `pkg_tar` targets have a `compression` attribute
set to compress it beforehand.
- there is no curl fallback, but this is fine for sourcegraph as it
already uses bazel 7.1.

I checked all targets that use oci_tarball as thoroughly as I could to make
sure nothing depends on the default tarball output of oci_tarball. There
was one target that used the default output, for which I left a TODO for
somebody else (somebody who is more on top of the repo) to tackle later.

## Test plan

I am assuming that the repo has enough tests to catch potential problems
on CI. Also somebody who knows the repo better should double check my
changes.

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
2024-06-11 11:48:58 +00:00
Noah S-C
bb178ba729
chore(tooling): bump Go version to 1.22.4 (#63124)
Bump for @evict 

## Test plan

CI passes with no complaints

## Changelog

- Bumped version of Go used to build to 1.22.4
2024-06-06 15:19:03 +00:00
Sahin Yort
1425097105
chore(bazel): upgrade to latest rules_oci 1.x (#63085)
Upgrades rules_oci from `1.4.3` to `1.7.6`, the latest 1.x release of
rules_oci before upgrading to rules_oci 2.x. Upgrading directly from
`1.4.3` to `2.0.0` is a big jump, because a lot has changed in between.

Signed-off-by: thesayyn <thesayyn@gmail.com>

## Test plan

I don't expect any breaking changes. Also, I am assuming the repo
already has test coverage for containers built with rules_oci.


## Changelog
2024-06-05 14:35:37 +00:00
Noah S-C
4a93f29755
chore(bazel): enable rules_esbuild sandbox with object-inspect workaround (#61969)
Sandbox escapes be-gone

## Test plan

Tested in CI and locally with `bazel build //client/...` as well as a
lot of blood, sweat n tears tearing through failed sandboxes

## Changelog
2024-06-05 15:34:29 +01:00
William Bezuidenhout
224b3e5830
bazel: rules_js rc3 and remove deprecated "exclude_declarations" option (#63095)
`exclude_declarations_from_npm_packages` is no longer an option, as shown by
the output:
```
ERROR: @aspect_rules_js//npm:exclude_declarations_from_npm_packages :: Unrecognized option: @aspect_rules_js//npm:exclude_declarations_from_npm_packages
```

Also upgraded to `rc3` while diagnosing this.

Closes https://github.com/sourcegraph/devx-support/issues/1005
## Test plan
Tested locally + CI
```
sg bazel test //internal/appliance/reconciler:reconciler_test
INFO: Invocation ID: 70da4295-36f2-43a8-a71e-9b11ae489657
WARNING: Build option --modify_execution_info has changed, discarding analysis cache (this can be expensive, see https://bazel.build/advanced/performance/iteration-speed).
INFO: Analyzed target //internal/appliance/reconciler:reconciler_test (0 packages loaded, 17313 targets configured).
INFO: Found 1 test target...
Target //internal/appliance/reconciler:reconciler_test up-to-date:
  bazel-bin/internal/appliance/reconciler/reconciler_test_/reconciler_test
Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //internal/appliance/reconciler:reconciler_test up-to-date (nothing to build)
Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //internal/appliance/reconciler:reconciler_test up-to-date (nothing to build)
INFO: Elapsed time: 1.210s, Critical Path: 0.11s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
//internal/appliance/reconciler:reconciler_test                 (cached) PASSED in 19.2s
```

## Changelog
- remove deprecated option `exclude_declarations_from_npm_packages` from
local.bazelrc
- update to rc3 of rules_js
2024-06-05 10:33:37 +02:00
Greg Magolan
2d3d918ffa
chore(bazel): upgrade to rules_js 2.0 RC (#63022)
Bumps rules_js (and friends) to 2.0 RCs.

This brings in performance improvements for the analysis phase, since npm package depsets are now much smaller. It also adds support for pnpm v9 and allows for linking js_library targets as 1p deps instead of npm_package targets. See https://github.com/aspect-build/rules_js/issues/1671 for more details.

## Test plan

CI

## Changelog
2024-06-04 11:26:42 +00:00
Greg Magolan
a3afa08161
chore(bazel): bump to aspect_bazel_lib 2.7.7 (#63012) 2024-05-31 23:08:52 +01:00
Greg Magolan
bbae7a4954
build(bazel): bump to rules_esbuild 0.16.0 (#63005)
* build(bazel): pin bazel fetched esbuild version to 0.19.2

* build(bazel): bump to rules_esbuild 0.16.0

* Update WORKSPACE

Co-authored-by: Noah S-C <noah@sourcegraph.com>

---------

Co-authored-by: Noah S-C <noah@sourcegraph.com>
2024-05-31 11:20:23 -07:00
Varun Gandhi
b95e8fdc71
chore: Bump Rust version 1.73.0 -> 1.78.0 (#62921) 2024-05-28 14:53:52 +00:00
Greg Magolan
f7a2d5e380
bazel: bump to latest aspect_rules_js, rules_nodejs, aspect_bazel_lib, aspect_rules_ts, aspect_rules_swc (#62874) 2024-05-23 10:14:30 +02:00
William Bezuidenhout
9b990df62e
chore(ci): rules_buf 1.31 (#62718) 2024-05-16 08:15:20 +00:00
Robert Lin
d05d4d218f
lib/enterpriseportal: initial service API for RFC 885 (#62263)
See [RFC 885 Sourcegraph Enterprise Portal (go/enterprise-portal)](https://docs.google.com/document/d/1tiaW1IVKm_YSSYhH-z7Q8sv4HSO_YJ_Uu6eYDjX7uU4/edit#heading=h.tdaxc5h34u7q) - closes CORE-6. The only files requiring in-depth review are the `.proto` files, as everything else is generated:

- `lib/enterpriseportal/subscriptions/v1/subscriptions.proto`
- `lib/enterpriseportal/codyaccess/v1/codyaccess.proto`

This PR only introduces API definitions - implementation will come as subsequent PRs, tracked in the ["Launch Enterprise Portal" Linear project](https://linear.app/sourcegraph/project/launch-sourcegraph-enterprise-portal-ee5d9ea105c2).

Before reviewing the diffs, **please review this PR description in depth**.

### Design goals

This initial schema aims to help achieve CORE-97 by adding our initial "get subscription Cody access", as well as our general Stage 1 goal of providing read-only access to our existing enterprise subscription mechanisms. In doing so, we can start to reshape the API in a way that accommodates future growth and addresses some debt we have accumulated over time, before the Stage 2 goal of having the new Enterprise Portal be the source-of-truth for all things subscriptions.

I am also aiming for a conservative approach with the Cody Gateway access related RPCs, to ease migration risks and allow for Cody teams to follow up quickly with more drastic changes in a V2 of the service after a Core-Services-driven migration to use the new service: https://github.com/sourcegraph/sourcegraph/pull/62263#issuecomment-2101874114

### Design overview

- **Multiple services**: Enterprise Portal aims to be the home of most Enterprise-related subscription and access management, but each component should be defined as a separate service to maintain clear boundaries between "core" capabilities and future extensions. One problem we see in the `dotcom { productSubscriptions }` is the embedding of additional concepts like Cody Gateway access makes the API surface unwieldy and brittle, and encourages an internal design that bundles everything together (the `product_subscriptions` table has 10 `cody_gateway_*` columns today). More concretely, this PR designs 2 services that Enterprise Portal will implement:
  - `EnterprisePortalSubscriptionsService` (`subscriptions.proto`): subscriptions and licenses CRUD
  - `EnterprisePortalCodyGatewayService` (`codygateway.proto`): Enterprise Cody Gateway access
- **Multiple protocols**: We use [ConnectRPC](https://connectrpc.com/) to generate traditional gRPC handlers for service-to-service use, but also a plain HTTP/1 "REST"-ish protocol (the ["Connect Protocol"](https://connectrpc.com/docs/protocol)) that works for web clients and simple integrations. Go bindings for the Connect protocol are generated into the `v1connect` subpackages.
- **Future licensing model/mechanism changes**: The _Subscription_ model is designed to remain static, but _Licenses_ are designed to accommodate future changes -`EnterpriseSubscriptionLicenseType` and `EnterpriseSubscriptionLicense` in this PR describe only the current type of license, referred to as "classic licenses", but we can extend this in the future for e.g. new models (refreshable licenses?) or new products (Cody-only? PLG enterprise?), or existing problems (test instance licenses?)
- **Granular history**: Instead of `createdAt`, `isArchived`, `revokedAt`, and so on, the new API defines Kubernetes-style `conditions` for licenses and subscriptions to describe creation, archival, and revocation events. These can be extended more flexibly for future events and provide a lightweight audit log of major changes to a subscription or license. In particular, `revokedAt` already has a `revokedReason` - this allows us to extend these important events with additional metadata in a flexible manner.
- **Pagination**: I couldn't find a shared internal or off-the-shelf representation of pagination attributes, but each `List*` RPC describes `page_size`, `page_token`, and `next_page_token`
- **Querying/filtering**: I couldn't find a strong standard for this either, but in general:
  - `Get*` accepts `query` that is a `oneof`, with the goal of providing exact matches only.
  - `List*` accepts `repeated filter`, where each `filter` is a `oneof` a set of strategies relevant to a particular `List*` RPC. Multiple filters are treated as `AND`-concatenated.
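
The pagination and filtering conventions above can be sketched in a few lines of Python. This is an illustration only, not the generated API: the `Subscription` type, the `list_subscriptions` function, and the choice to encode the page token as a plain offset are all assumptions made for the example (a real service would more likely use an opaque token).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subscription:
    id: str
    display_name: str
    archived: bool


# A filter is any predicate over a subscription (hypothetical stand-in for
# the `oneof` filter strategies described above).
Filter = Callable[[Subscription], bool]


def list_subscriptions(
    items: List[Subscription],
    filters: List[Filter],
    page_size: int,
    page_token: str = "",
) -> Tuple[List[Subscription], str]:
    """Return one page of matches plus the token for the next page.

    Multiple filters are AND-concatenated: an item must satisfy all of them.
    An empty next_page_token means there are no further pages.
    """
    matched = [s for s in items if all(f(s) for f in filters)]
    start = int(page_token) if page_token else 0
    page = matched[start:start + page_size]
    next_start = start + len(page)
    next_page_token = str(next_start) if next_start < len(matched) else ""
    return page, next_page_token
```

A caller would keep passing the returned `next_page_token` back as `page_token` until it comes back empty.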

Some major changes from the existing model:

- **Downgrade the concept of "subscription access token"**: this was built for Cody Gateway but I am not sure it has aged well, as the mechanism is still tied to individual licenses, did not find new non-Cody-Gateway use cases (except for license checks, though those do not require an "access token" model either), and today are still not "true" access tokens as they cannot be expired/managed properly. This PR relegates the concept to remain Cody-specific as it effectively is today so that we might be able to introduce a better subscription-wide model if the use case arises. Over time, we may want to make this even more opaque, relying entirely on zero-config instead (generating from license keys).
- **Subscriptions are no longer attached to a single dotcom user**: Most of these users today are not real users anyway, as our license creation process asks that you create a fake user account (["User account: [...] We create a company-level account for this."](https://handbook.sourcegraph.com/departments/technical-success/ce/process/license_keys/#license-key-mechanics)). The new API removes the concept entirely, in favour of a true user access management system in CORE-102.
- **Database/GraphQL IDs** are no longer exposed - we use external, prefixed UUIDs for representing entities over APIs in a universal manner.
- **Per-subscription Cody Gateway access no longer exposes `allowed models`**: I suggested this to  @rafax in light of recent problems with propagating new models to Enterprise customers. He agreed that the general product direction is "model options as a selling point" - it no longer makes sense to configure these at a per-subscription level. Instead, the Cody Gateway service should configure globally allowed models directly, and each Sourcegraph instance can determine what models they trust. If we really need this back we can add it later, but for now I think this removal is the right direction.
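
As a rough illustration of the "external, prefixed UUIDs" idea, here is a small Python sketch. The `es` prefix, the underscore separator, and both function names are hypothetical; the actual prefixes and format are defined by the API, not by this example.

```python
import uuid


def make_external_id(prefix: str) -> str:
    """Create an external ID of the form '<prefix>_<uuid>'."""
    return f"{prefix}_{uuid.uuid4()}"


def parse_external_id(external_id: str, expected_prefix: str) -> uuid.UUID:
    """Validate the prefix and return the UUID part.

    Assumes the prefix itself contains no underscore; UUID strings never do.
    """
    prefix, sep, raw = external_id.partition("_")
    if not sep or prefix != expected_prefix:
        raise ValueError(f"expected prefix {expected_prefix!r} in {external_id!r}")
    return uuid.UUID(raw)
```

The prefix makes it obvious at a glance which kind of entity an ID refers to, and lets a handler reject an ID of the wrong type before touching the database.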

### Direct translations

`cmd/cody-gateway/internal/dotcom/operations.graphql` defines our key dependencies for achieving CORE-97. The concepts referred to in `operations.graphql` translate to this new API as follows:

- `dotcom { productSubscriptionByAccessToken(accessToken) }`: `codygateway.v1.GetCodyGatewayAccess({ access_token })`
- `dotcom { productSubscriptions }`: `codygateway.v1.ListCodyGatewayAccess()`
- `fragment ProductSubscriptionState`:
  - `id`: **n/a**
  - `uuid`: `subscriptions.v1.EnterpriseSubscription.id`
  - `account { username }`: `subscriptions.v1.EnterpriseSubscription.display_name`
  - `isArchived`: `subscriptions.v1.EnterpriseSubscription.conditions`
  - `codyGatewayAccess { ... }`: **separate RPC to `codygateway.v1.GetCodyGatewayAccess`**
  - `activeLicense { ... }`: **separate RPC to `subscriptions.v1.ListEnterpriseSubscriptionLicenses`**

### Why `lib/enterpriseportal`?

We recently had to move another Telemetry Gateway to `lib`: #62061. Inevitably, there will be services that live outside the monorepo that want to integrate with Enterprise Portal (one is on our roadmap: Cody Analytics in https://github.com/sourcegraph/cody-analytics). This allows us to share generated bindings and some useful helpers, while keeping things in the monorepo.

### Implications for Cody Clients

For now (and in the future), nothing is likely to change. Here's how I imagine things playing out:

```mermaid
graph TD
  cc["Cody Clients"] -- unified API --> cg[services like Cody Gateway]
  cg -- PLG users --> ssc[Self-Serve Cody]
  cg -- Enterprise users --> ep[Enterprise Portal]
```

## Test plan

CI passes, the schemas can be generated by hand:

```
sg gen buf \
  lib/enterpriseportal/subscriptions/v1/buf.gen.yaml \
  lib/enterpriseportal/codyaccess/v1/buf.gen.yaml
```

---------

Co-authored-by: Joe Chen <joe@sourcegraph.com>
Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
2024-05-15 12:58:55 -07:00
Jean-Hadrien Chabran
d488517383
chore(rel): bump minor for stitch graph + add support invalidating migrations repo rule (#62511)
chore(rel): bump minor for stitch graph + add support invalidating migrations repo rule (#62490)

* chore(bzl): allow to invalidate migrations repo rule

* chore(bzl): gen stitch graph for 5.4
2024-05-07 22:04:59 +00:00
Noah S-C
7896492d36
bazel: bump rules_go to 0.47.0 (#62147) 2024-04-25 17:08:10 +02:00
Noah S-C
ce6a366647
bazel: transition oci_image to opt/release mode for Rust (#61740)
Our Rust binaries (e.g. scip-ctags/syntect_server) were being built in some mix of opt & fastbuild mode[1]. Unlike with Go where there is no release/debug mode flag, Rust requires opting into optimized release builds. We can do that automagically when building any oci_image target with the power of transitions.

This has the side-effect of our Go binaries no longer being stripped & containing debug symbols, see https://github.com/bazelbuild/rules_go/issues/3917

Also to note, [in Cargo.toml we opt into debug symbols in release mode](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@nsc/bazel-release-mode-rust/-/blob/docker-images/syntax-highlighter/Cargo.toml?L67%3A11-70%3A9). Is this preserved by this PR for bazel[2]? 

[1] `strings` on the binaries showed the 3rd-party crates having `k8-opt` filepath names e.g.
```
$ / # strings syntect_server | grep k8-
/tmp/bazel-working-directory/__main__/bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/crate_index__onig_sys-69.8.1/onig_sys_build_script_.runfiles/crate_index__onig_sys-69.8.1
```
but the final binaries (and the 1st-party crates) themselves were being built in fastbuild mode. See https://github.com/sourcegraph/devx-support/issues/790 for original point of contact

[2] It seems like it may be preserved, but I don't know how reliable this is for Rust binaries
```
$ file bazel-bin/docker-images/syntax-highlighter/scip-ctags
bazel-bin/docker-images/syntax-highlighter/scip-ctags: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.0.0, with debug_info, not stripped
```

## Test plan

Tested for sourcegraph/scip-ctags image:
```
/ $ strings scip-ctags | grep "Could not parse file" 
/ $ echo $?
1
```
2024-04-15 17:19:17 +00:00
Noah S-C
88998e2c35
bazel: patch rules_apko to copy, not symlink, lockfile (#61877)
CI started failing with a bunch of the following for every apko target
```
...
(10:27:41) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/cmd/batcheshelper/BUILD.bazel:78:11: Action cmd/batcheshelper/wolfi_base_apko failed: missing input file '@@batcheshelper_apko_lock//:lockfile_copy'
(10:27:41) ERROR: /mnt/ephemeral/workdir/sourcegraph/sourcegraph/cmd/batcheshelper/BUILD.bazel:78:11: Action cmd/batcheshelper/wolfi_base_apko failed: 1 input file(s) do not exist
...
```

This line seemed suspect, so let's replace it with an actual full copy instead of a 🤨 symlink https://sourcegraph.com/github.com/chainguard-dev/rules_apko@1d78765293a0baf3f92ca49efa51d6c02b9c828e/-/blob/apko/translate_lock.bzl?L69

## Test plan

CI goes green again 😎
2024-04-15 11:07:16 +00:00
Noah S-C
c87793ac9b
Reapply "bazel: migrate dind dockerfile to rules_oci" (#61790) (#61824)
This reverts commit 52ac934abe.

Once we're off GCR and onto GAR 😎 

## Test plan

CI
2024-04-12 15:52:44 +00:00
Will Dollman
d56fa926dd
Build images end-to-end using Bazel v2 (#61845)
* wip

* gitserver (mostly) wolfi 4 bazel

* the big heck of all things

* Add rules_apko lock translation rules to WORKSPACE

* Call apko_repositories() more

* fix rules_apko to handle our shorter repo urls

* fix workspace from rebase, and missing locks

* visibility on wolfi_base_image

* hand-fix a lock coz apko lock is 🅱️roken

* remove chainguard repo+keyring from base

* update locks

* add chainguard repo+keychain to single server manifest

* unrelated fixes, server+grafana still h*cked

* fix postgres-exporter

* the big fix

* aws lib got bumped?

* downgrade sso-oidc? idk

* ignore wolfi locks from prettier

* dynamically do the locks with a reporule

* document and make nice :nails:

* bazel run @rules_apko//apko patch

* Fix .typo.typo

* Update tooling for end-to-end Bazel images (#61106)

* Update sg wolfi image to build using Bazel

* bazel run @rules_apko//apko patch

* Fix .typo.typo

* Add update-images and implement apko YAML change monitoring

* Use bazel apko and add support for additional repos

* Refactor sg wolfi

* Rework wolfi base image auto-update pipeline

* sg bazel configure

* [rough] Add --check flag to sg wolfi lock

* Refactor sg wolfi lock --check

* Simplify check and update apko lock hash operations

* Fix resolveImagePath when running in bazel

* Fixup logic error in CheckApkoLockHashes

* Tweak DoBaseImageBuild output

* Remove debug output

* Fix sg wolfi lock --check behaviour for all images

* Replace base image build step with apko lock --check

* Remove debug line

* Minor fixups for CI step

* Wrap with AnnotatedCmd

* Fixup annotation

* Update apko lockfiles

* Allow additional repos to be passed

* Update build-base-image.sh with bazel + add back to pipeline

* Ensure that modified base images are rebuilt

* Solve bazelception

* Remove timestamp for bit-level reproducibility

* Skip local keygen when running on buildkite

* Add workaround for lack of local repo support in rules_apko

* Run apkoOps first as it's quick and might fail

* Remove blocking allBaseImagesBuilt step

* Remove unused prometheus-gcp image

* Add special cases to resolveImagePath

* Cleanly handle case where no bazel build path exists

This could happen in cases where a base image is only used outside of sourcegraph/sourcegraph,
or if you've added a new base image config but haven't added the associated Bazel scaffolding

* Add debugging around failing docker builds

* More debugging

* Normalise apko_lockfile to match repo.bzl

* Fixup apko docker call

* Try passing imageconfigdir differently to docker

* Run ls in different container

* Soft-fail when using legacy build in Buildkite

* Add missing include

* Workaround for building sourcegraph and sourcegraph-dev

* Add postgresql-client package to server

This contains createdb, which was recently moved from postgresql

* Inflate postgres-12-codeinsights image to avoid rules_apko errors

* Remove update line from yaml files

* Fix issue caused by moving base sourcegraph image

* Remove apk-tools from server

* Update lockfiles

* Address review feedback

* Remove debug lines

* fix unbound var

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>

* go mod tidy + gazelle-update-repos after merging main

* Use aspect bazel cache

* Use Aspect bazel caching when calling bazel in bash and sg

* Append annotation

* Run apko lock on aspect agent

* Remove base image builds

Discussion in https://sourcegraph.slack.com/archives/C05EVRLQEUR/p1712307465660509

* Remove unused functionality

* Update BaseImageConfig comments

* Rewrite wolfi-images/README.md

* Add .apko/range.sh to .gitattributes

* Remove "wolfi" from :base_image and :base_tarball targets

* remove allowlist extras from debugging

* Tweak user instructions around package testing

* Add agent healthcheck to buildkite scripts

* prettier

* sg bazel configure

* bazel run //:gazelle-update-repos

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
Co-authored-by: Noah S-C <noah@sourcegraph.com>
2024-04-12 16:18:43 +01:00
Noah S-C
52ac934abe
Revert "bazel: migrate dind dockerfile to rules_oci" (#61790)
Revert "bazel: migrate dind dockerfile to rules_oci (#61788)"

This reverts commit e4de7a46c1.

Google Container Registry doesn't like this image for some reason, so we're gonna hold out until we're moved to Artifact Registry
2024-04-11 18:21:48 +02:00
Noah S-C
e4de7a46c1
bazel: migrate dind dockerfile to rules_oci (#61788)
From @willdollman's ask:
> when updating the underlying dind image, we update the [upstream image hash here](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@5352138e5ed0e6ab96c95417d90a3d65f28aa769/-/blob/docker-images/dind/Dockerfile?L1). But I don’t see how to get the output of our Dockerfile onto Docker Hub - [oci_deps.bzl](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@5352138e5ed0e6ab96c95417d90a3d65f28aa769/-/blob/dev/oci_deps.bzl?L221-225) and the [BUILD.bazel](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@5352138e5ed0e6ab96c95417d90a3d65f28aa769/-/blob/docker-images/dind/BUILD.bazel?L6) reference the image that’s already on Dockerhub and push to the same image name, which seems to be a circular dependency.
> tl;dr there doesn’t seem to be a way to use CI to push an updated dind image

To address this, I'm migrating from the Dockerfile + build.sh approach to the Bazel-native rules_oci approach (integrating with our existing ./dev/ci/push_all.sh script).

To remove the docker-{buildx,compose} files, given that we can't run `rm <file>` with rules_oci, I take the approach of how this is actually implemented in the OCI spec, using [whiteout files](https://github.com/opencontainers/image-spec/blob/main/layer.md#whiteouts).
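
For readers unfamiliar with whiteouts, here is a minimal Python sketch of the mechanism (not the code in this PR). In the OCI layer format, a file is deleted by adding an empty entry whose basename is the deleted file's name prefixed with `.wh.`. The helper names and any paths passed to them are hypothetical.

```python
import io
import posixpath
import tarfile
from typing import Iterable

WHITEOUT_PREFIX = ".wh."  # prefix defined by the OCI image layer spec


def whiteout_name(path: str) -> str:
    """Map a path to delete onto the whiteout entry that deletes it."""
    directory, base = posixpath.split(path.strip("/"))
    return posixpath.join(directory, WHITEOUT_PREFIX + base)


def build_whiteout_layer(paths_to_delete: Iterable[str]) -> bytes:
    """Build an uncompressed tar layer containing only empty whiteout files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in paths_to_delete:
            info = tarfile.TarInfo(name=whiteout_name(path))
            info.size = 0  # whiteout files are empty
            tar.addfile(info)
    return buf.getvalue()
```

Stacking a layer like this on top of the base image hides the listed files in the final filesystem, which is the effect `rm` would have had in a Dockerfile.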

## Test plan

`BUILDKITE_COMMIT=deadbeef VERSION=6.9.0 bazel run --stamp --workspace_status_command=./dev/bazel_stamp_vars.sh //docker-images/dind:image_tarball` to build&stamp&load the oci image

Then [dive](https://github.com/wagoodman/dive) to inspect the image to make sure it removed files as expected
![image](https://github.com/sourcegraph/sourcegraph/assets/18282288/069755ce-4c3c-40bc-9e62-6c77cd303fbe)
2024-04-11 16:42:28 +01:00
Noah S-C
118f627bac
Revert "bazel: bump rules_{oci,pkg}" (#61766)
Revert "bazel: bump rules_{go,oci,pkg} (#60510)"

This reverts commit 1ee85c27f5.
2024-04-10 23:10:40 +02:00
William Bezuidenhout
770eb9f8ce
bazel: update gazelle to 0.35.0 (#61680)
update gazelle to 0.35.0
2024-04-08 16:46:49 +02:00
Noah S-C
1ee85c27f5
bazel: bump rules_{go,oci,pkg} (#60510)
Updating the patch based on https://github.com/bazelbuild/rules_go/pull/3863 having been merged, the old patch doesn't apply cleanly to 0.46.0 anyway.

## Test plan

CI
2024-04-08 12:54:15 +00:00
Will Dollman
2c1d55c00e
Revert "Hackathon: Build images end-to-end using Bazel (#60785)" (#61644)
This reverts commit 44db6658b6.
2024-04-05 13:43:19 +00:00
Will Dollman
44db6658b6
Hackathon: Build images end-to-end using Bazel (#60785)
* wip

* gitserver (mostly) wolfi 4 bazel

* the big heck of all things

* Add rules_apko lock translation rules to WORKSPACE

* Call apko_repositories() more

* fix rules_apko to handle our shorter repo urls

* fix workspace from rebase, and missing locks

* visibility on wolfi_base_image

* hand-fix a lock coz apko lock is 🅱️roken

* remove chainguard repo+keyring from base

* update locks

* add chainguard repo+keychain to single server manifest

* unrelated fixes, server+grafana still h*cked

* fix postgres-exporter

* the big fix

* aws lib got bumped?

* downgrade sso-oidc? idk

* ignore wolfi locks from prettier

* dynamically do the locks with a reporule

* document and make nice :nails:

* bazel run @rules_apko//apko patch

* Fix .typo.typo

* Update tooling for end-to-end Bazel images (#61106)

* Update sg wolfi image to build using Bazel

* bazel run @rules_apko//apko patch

* Fix .typo.typo

* Add update-images and implement apko YAML change monitoring

* Use bazel apko and add support for additional repos

* Refactor sg wolfi

* Rework wolfi base image auto-update pipeline

* sg bazel configure

* [rough] Add --check flag to sg wolfi lock

* Refactor sg wolfi lock --check

* Simplify check and update apko lock hash operations

* Fix resolveImagePath when running in bazel

* Fixup logic error in CheckApkoLockHashes

* Tweak DoBaseImageBuild output

* Remove debug output

* Fix sg wolfi lock --check behaviour for all images

* Replace base image build step with apko lock --check

* Remove debug line

* Minor fixups for CI step

* Wrap with AnnotatedCmd

* Fixup annotation

* Update apko lockfiles

* Allow additional repos to be passed

* Update build-base-image.sh with bazel + add back to pipeline

* Ensure that modified base images are rebuilt

* Solve bazelception

* Remove timestamp for bit-level reproducibility

* Skip local keygen when running on buildkite

* Add workaround for lack of local repo support in rules_apko

* Run apkoOps first as it's quick and might fail

* Remove blocking allBaseImagesBuilt step

* Remove unused promethus-gcp image

* Add special cases to resolveImagePath

* Cleanly handle case where no bazel build path exists

This could happen in cases where a base image is only used outside of sourcegraph/sourcegraph,
or if you've added a new base image config but haven't added the associated Bazel scaffolding

* Add debugging around failing docker builds

* More debugging

* Normalise apko_lockfile to match repo.bzl

* Fixup apko docker call

* Try passing imageconfigdir differently to docker

* Run ls in different container

* Soft-fail when using legacy build in Buildkite

* Add missing include

* Workaround for building sourcegraph and sourcegraph-dev

* Add postgresql-client package to server

This contains createdb, which was recently moved from postgresql

* Inflate postgres-12-codeinsights image to avoid rules_apko errors

* Remove update line from yaml files

* Fix issue caused by moving base sourcegraph image

* Remove apk-tools from server

* Update lockfiles

* Address review feedback

* Remove debug lines

* fix unbound var

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>

* go mod tidy + gazelle-update-repos after merging main

* Use aspect bazel cache

* Use Aspect bazel caching when calling bazel in bash and sg

* Append annotation

* Run apko lock on aspect agent

* Remove base image builds

Discussion in https://sourcegraph.slack.com/archives/C05EVRLQEUR/p1712307465660509

* Remove unused functionality

* Update BaseImageConfig comments

* Rewrite wolfi-images/README.md

* Add .apko/range.sh to .gitattributes

* Remove "wolfi" from :base_image and :base_tarball targets

* remove allowlist extras from debugging

* Tweak user instructions around package testing

* Add agent healthcheck to buildkite scripts

* prettier

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
Co-authored-by: Noah S-C <noah@sourcegraph.com>
2024-04-05 13:57:45 +01:00
Noah S-C
75947b347f
bazel: refactor percy mocha tests to js_test instead of js_run_binary + build_test (#60983)
`bazel build` on percy mocha targets (such as //client/web/src/integration:integration-tests) no longer results in actually running the test!

Originally, we used `js_run_binary` with `build_test` because `js_test` doesn't support stamping, and we need to be able to read volatile variables for percy.
Then, we worked around https://github.com/bazelbuild/bazel/issues/16231 in https://github.com/sourcegraph/sourcegraph/pull/58505 by not explicitly depending on the stamp variables, but exploiting a bit of a hack to read them anyway (will this work with RBE?)
Now, given that we're not explicitly stamping and still using the hack, we can use `js_test` instead, so the tests no longer run as part of `bazel build` and only run under `bazel test` (as is good 😌)

It is apparently possible to work around https://github.com/bazelbuild/bazel/issues/16231 when using disk/remote caches, but only for local builds (so no remote builds) according to [the following comment](https://github.com/bazelbuild/bazel/issues/16231#issuecomment-1772835555), but we would still need: 
1. `js_test` to support stamping and
2. this workaround to also apply to remote execution (as we're considering that once it's supported in Aspect Workflows)
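
As background on what "reading stamp variables" involves: Bazel writes the output of the workspace status command to `bazel-out/stable-status.txt` and `bazel-out/volatile-status.txt`, one `KEY value` pair per line. The sketch below parses that format. It only illustrates the file format and is not the hack from the linked PR; the `VERSION` key in the sample is a hypothetical example.

```python
def parse_status(text: str) -> dict[str, str]:
    """Parse Bazel workspace status text: one 'KEY value' pair per line."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The key is everything up to the first space; the rest is the value.
        key, _, value = line.partition(" ")
        values[key] = value
    return values


if __name__ == "__main__":
    # Sample content; VERSION is a hypothetical key for illustration.
    sample = "BUILD_TIMESTAMP 1712307465\nVERSION 6.9.0\n"
    print(parse_status(sample))
```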

todo: update doc/dev/background-information/bazel/web_overview.md in new docs repo

## Test plan

CI 🎉
2024-03-26 10:58:20 +00:00
Noah S-C
d6a9269c48
bazel: rework schema migrations reporule without gsutil (#61295)
We use the GCS JSON API directly instead of gsutil or gcloud because:
- gsutil may spend up to ~1m20s trying to contact metadata.google.internal, and we haven't discovered a way to disable that
- gcloud disallows unauthenticated access even to a public bucket

TODO: in future iterations we can explore how to properly invalidate this. For now we can force a refresh with `bazel sync`
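
As a rough sketch of what "using the JSON API directly" means: an object in a public bucket can be downloaded with a plain unauthenticated GET against the `objects.get` endpoint with `alt=media`. The example below is in Python for illustration only (the actual repository rule is Starlark), and the bucket and object names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

GCS_JSON_API = "https://storage.googleapis.com/storage/v1"


def object_media_url(bucket: str, object_name: str) -> str:
    """Build the JSON API URL that returns an object's contents."""
    # Object names must be fully percent-encoded, including any '/'.
    encoded_bucket = quote(bucket, safe="")
    encoded_object = quote(object_name, safe="")
    return f"{GCS_JSON_API}/b/{encoded_bucket}/o/{encoded_object}?alt=media"


def fetch_public_object(bucket: str, object_name: str) -> bytes:
    """Download an object from a public bucket without credentials."""
    with urlopen(object_media_url(bucket, object_name)) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical bucket and object names, for illustration only.
    print(object_media_url("example-public-bucket", "migrations/archive.tar.gz"))
```

No metadata server lookup or credential discovery happens on this path, which sidesteps both problems listed above.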

## Test plan

`bazel run //internal/database/migration/shared:write_stitched_migration_graph` runs successfully and without a change to the file
2024-03-25 16:17:26 +00:00
Noah S-C
fac966d0b6
chore: bump golang/protobuf to 1.5.4 because of nonsense (#61127)
Man I don't even know. Why does google.golang.org/protobuf even depend on github.com/golang/protobuf if the latter is deprecated and explicitly states to use the former...? https://github.com/golang/protobuf/issues/1596

## Test plan

go build & CI
2024-03-13 23:57:03 +00:00
Jean-Hadrien Chabran
9f10c1cb3d
rfc795: new release process foundations (#60962)
---------

Co-authored-by: William Bezuidenhout <william.bezuidenhout@sourcegraph.com>
2024-03-12 17:12:22 +01:00
William Bezuidenhout
12957aedb0
bazel: buildifier 6.4.0 (#61025)
buildifier 6.4.0
2024-03-12 13:18:26 +02:00
Noah S-C
463f11a388
bazel: patch rules_oci to run tar with --no-xattrs (#61004)
Workaround for https://github.com/sourcegraph/devx-support/issues/622. This is portable between both bsdtar and gnutar. On gnutar, this is the default, so this changes nothing for CI builds. This only changes behaviour in macOS with bsdtar.

It is unclear to me where a final solution will exist:
- An issue was opened upstream in docker/moby, but the latest opinion is that this is an issue with rules_oci _technically_ emitting docker-compatible formats that are incompatible with docker (I'm not 100% sure yet that docker itself can't create a tarball that would fail to `docker load`, but I don't want to subject Christoph to more experiments lol) https://github.com/moby/moby/issues/47517
- A PR exists in rules_oci to use a hermetic BSD tar instead of system tar (doesn't work on nixos though coz dynamic libraries :sadge:). It uses `mtree` format to add files, I don't know yet if that works around the xattr issue without also passing `--no-xattrs` (my current belief is that it does not) https://github.com/bazel-contrib/rules_oci/pull/385

## Test plan

Had Christoph run `bazel run //cmd/batcheshelper:image_tarball`, which succeeded with this patch
2024-03-11 19:27:37 +00:00
Noah S-C
a514e773e0
bazel: include //dev:go_mockgen in //dev:write_all_generated (#60963)
`//dev:go_mockgen` had `suggested_update_target` set to `//dev:write_all_generated` without it actually being part of that list. There are two options here: update `suggested_update_target` to `//dev:go_mockgen` for go_mockgen targets, or add the former to the latter. We're going with the latter here (we could also do both, but want to direct people towards write_all_generated even though it does more work in the worst case)

## Test plan

CI and `bazel run //dev:write_all_generated`
2024-03-11 14:23:29 +00:00
Noah S-C
cb7034680d
bump to Go 1.22.1 (#60902)
🚀 💎 🙌 🚙 

## Test plan

CI
2024-03-06 17:38:43 -07:00
Noah S-C
6ee3e17796
bazel: transition oci_image ourselves instead of via with_cfg.bzl (#60896)
with_cfg.bzl is a lot of complicated starlark which we can avoid having to try to understand when debugging needs arise by doing what it does by hand and specific to what we need (aka super cut down and simplified).

Extracted from https://github.com/sourcegraph/sourcegraph/compare/jcjh/msp-bazel-delivery#diff-1a8a445b4ce2a72080eca8ae2a3ae24bc904175e9d9aa2dd1938a29746ae86a3 while we were debugging why AW delivery was being problematic (this wasn't the reason, but I made this change in case it _was_ causing issues, and it'd be more understandable to read)

## Test plan

`bazel run //cmd/batcheshelper:image_tarball && docker run batcheshelper:candidate --help`
2024-03-06 16:05:38 +00:00
Noah S-C
15146c770a
bump aspect bazel-lib to 1.40.3 (#60869)
Allows us to drop some patches 🎉 

https://sourcegraph.slack.com/archives/C04BWU1519D/p1709664254285619

## Test plan

CI
2024-03-06 13:07:24 +00:00
Noah S-C
98e0f75d1e
bazel: use transitions to apply cross-compile platform automatically to oci_image (#60569)
Removes the need to pass `--config=docker-darwin` through the following mechanisms:

1. `--enable_platform_specific_config` to enable certain flags on macos only e.g. `--extra_toolchains @zig_sdk//toolchain:linux_amd64_gnu.2.34` and `--sandbox_add_mount_pair=/tmp` (see [.bazelrc change](https://github.com/sourcegraph/sourcegraph/pull/60569/files?file-filters%5B%5D=dotfile&show-viewed-files=true))
2. Apply a transition (using https://github.com/fmeum/with_cfg.bzl, please view [the following great video on it](https://www.youtube.com/watch?v=U5bdQRQY-io)) on `oci_image` targets when on the `@platforms//os:macos` platform to transition to the `@zig_sdk//platform:linux_amd64` platform. 
	- This will start at `oci_image` targets and propagate down to e.g. `go_{binary,library}` etc targets with the "transitioned" platform configuration, resulting in them being built with the transitioned-to platform
3. Remove `darwin_docker_e2e_go` config_setting and `darwin-docker` bool_flag.
	- These aren't necessary anymore, as the places where these were used were not in the transitive closure rooted at an `oci_image` target, meaning they wouldn't be transitioned.

To review, view [the following (filtered) files](https://github.com/sourcegraph/sourcegraph/pull/60569/files?file-filters%5B%5D=.bzl&file-filters%5B%5D=.sh&file-filters%5B%5D=.yaml&file-filters%5B%5D=No+extension&file-filters%5B%5D=dotfile&show-viewed-files=true)  along with [the root BUILD.bazel](https://github.com/sourcegraph/sourcegraph/pull/60569/files#diff-7fc57714ef13c3325ce2a1130202edced92fcccc0c6db34a72f7b57f60d552a3). All the other files are just changing the `load` statements from `@rules_oci` to `//dev:oci_defs.bzl`

## Test plan

CI, checked image locally and `sg test bazel-backend-integration` & `sg test bazel-e2e`
2024-02-20 13:57:56 +00:00
Noah S-C
19d9cfc73b
bazel: native go-mockgen in Bazel (#60386)
Adds a new:
- gazelle generator
- rule + rule targets + catchall target
for generating go-mockgen mocks & testing for their being up-to-date.

Each go_mockgen macro invocation adds targets for generating mocks, copying to the source tree, as well as testing whether the current source tree mocks are up-to-date.

How to use this: `bazel run //dev:go_mockgen` for the catch-all, or `bazel run //some/target:generate_mocks` for an individual package, and `bazel test //some/target:generate_mocks_tests` to test for up-to-date-ness. There is no catch-all for testing

This currently uses a fork of go-mockgen, with an open PR for upstream here: https://github.com/derision-test/go-mockgen/pull/50.

Closes https://github.com/sourcegraph/sourcegraph/issues/60099

## Test plan

Extensive testing during development, including the following cases:
- Deleting a generated file and its entry in a go_library/go_test `srcs` attribute list and then re-running `sg bazel configure`
- Adding a non-existent output directory to mockgen.test.yaml and running the bash one-liner emitted to prepare the workspace for rerunning `sg bazel configure`

The existing config tests a lot of existing paths anyway (creating mocks for a 3rd party library's interface, entries for a given output file in >1 config file etc)
2024-02-16 13:26:48 +00:00
Noah S-C
ba9d2e0ca2
bazel: move schema migrations fetching from GCS to bazel repository (#59879)
Does what it says on the tin

Caveat:
As this doesn't use the built-in downloaders, this probably can't make use of the repository cache. While it won't refetch it every single time (there is _some_ degree of caching), I'm not sure what will cause it to not use the cached one and refresh it. It's a very fast operation though.
See https://github.com/bazelbuild/bazel/issues/19267

## Test plan

`bazel build //internal/database/migration/shared:generate_stitched_migration_graph`
2024-02-14 17:40:39 +00:00