Commit Graph

3530 Commits

Author SHA1 Message Date
Michael Bahr
e85028b8bd
fix: update links for dev docs (#62758)
* fix: license checker info is in docs-legacy

* fix: update remaining dev links
2024-05-17 13:47:34 +02:00
James McNamara
5022b91a43
chore(localenv): disable tsc declaration files (#62680)
* try enabling flag to disable tsc declaration files

* enable flag locally

* enable flag for local dev

* remove flag
2024-05-17 10:40:14 +02:00
Varun Gandhi
b93b82b020
docs: Add link to Entitle permission (#62728) 2024-05-17 15:18:50 +08:00
sourcegraph-release-guild-bot
f4ec7e52d8
msp: add commit attribute to rollout (#62742)
Co-authored-by: James Cotter <jamescotter@pm.me>
2024-05-16 20:35:34 +01:00
Robert Lin
6c9e620913
sg msp: only generate skaffold assets if last stage of rollouts (#62736)
#62704 introduced a regression due to the changing of the semantics of `rollouts` configuration in code: previously, only the final stage would get it, but with #62704 this became available on all environments, and to infer the final stage a nil-safe helper `rollout.IsFinalStage()` was introduced.

This change fixes a missed check migration that causes additional assets to be incorrectly generated for non-final environments.
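
As a rough illustration of what a nil-safe helper like this looks like, here is a minimal sketch. Only the name `IsFinalStage` comes from the description above; the `Rollout` struct and its fields are hypothetical stand-ins, not the actual MSP spec types.

```go
package main

import "fmt"

// Rollout is a hypothetical stand-in for an environment's rollout configuration.
type Rollout struct {
	Stages  []string // ordered environment IDs in the rollout pipeline
	Current string   // the environment this configuration is attached to
}

// IsFinalStage reports whether the environment is the last rollout stage.
// It is safe to call on a nil receiver: no rollout configuration means the
// environment is never treated as the final stage.
func (r *Rollout) IsFinalStage() bool {
	if r == nil || len(r.Stages) == 0 {
		return false
	}
	return r.Stages[len(r.Stages)-1] == r.Current
}

func main() {
	var none *Rollout // environment without any rollout configuration
	test := &Rollout{Stages: []string{"test", "prod"}, Current: "test"}
	prod := &Rollout{Stages: []string{"test", "prod"}, Current: "prod"}
	fmt.Println(none.IsFinalStage(), test.IsFinalStage(), prod.IsFinalStage())
}
```

Callers that previously checked only for the presence of a rollout configuration would instead gate asset generation on `IsFinalStage()`.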

## Test plan

`sg msp generate -all`
2024-05-16 10:39:04 -07:00
William Bezuidenhout
ec7cbeec0f
feat(sg/repoferee): add security command with repo-report subcommand to fetch latest repoferee report (#62735)
* add security command to get repoferee report

* sg: adds security with `repo-report` command to fetch latest repoferee
report

* bazel
2024-05-16 19:21:13 +02:00
Robert Lin
6c59b02534
feat/msp: do not use tfvars file outside of deploy-type 'subscription' (#62704)
Closes CORE-121

The dependency on the generated `tfvars` file is frustrating for first-time MSP setup because it currently requires `-stable=false` to update, and doesn't actually serve any purpose for deploy types other than `subscription` (which uses it to isolate image changes that happen via GitHub Actions). This makes it so that we don't generate, or depend on, the dynamic `tfvars` file unless you are using `subscription`.

I've also added a rollout spec configuration, `initialImageTag`, to make the initial tag we provision environments with configurable (as some services might not publish `insiders` images) - see the docstring.

## Test plan

Inspect output of `sg msp generate -all`
2024-05-16 09:43:47 -07:00
Noah S-C
9b6ba7741e
bazel: transcribe test ownership to bazel tags (#62664) 2024-05-16 15:51:16 +01:00
James Cotter
d1404951eb
sg/msp: fix CustomTargetType reference in Target definition (#62727) 2024-05-16 13:32:37 +01:00
James Cotter
75356f8606
sg/msp: clarify repository annotation meaning in delivery pipeline (#62703)
PR feedback from: https://github.com/sourcegraph/sourcegraph/pull/62702
2024-05-15 21:46:39 +01:00
Robert Lin
d05d4d218f
lib/enterpriseportal: initial service API for RFC 885 (#62263)
See [RFC 885 Sourcegraph Enterprise Portal (go/enterprise-portal)](https://docs.google.com/document/d/1tiaW1IVKm_YSSYhH-z7Q8sv4HSO_YJ_Uu6eYDjX7uU4/edit#heading=h.tdaxc5h34u7q) - closes CORE-6. The only files requiring in-depth review are the `.proto` files, as everything else is generated:

- `lib/enterpriseportal/subscriptions/v1/subscriptions.proto`
- `lib/enterpriseportal/codyaccess/v1/codyaccess.proto`

This PR only introduces API definitions - implementation will come as subsequent PRs, tracked in the ["Launch Enterprise Portal" Linear project](https://linear.app/sourcegraph/project/launch-sourcegraph-enterprise-portal-ee5d9ea105c2).

Before reviewing the diffs, **please review this PR description in depth**.

### Design goals

This initial schema aims to help achieve CORE-97 by adding our initial "get subscription Cody access", as well as our general Stage 1 goal of providing read-only access to our existing enterprise subscription mechanisms. In doing so, we can start to reshape the API in a way that accommodates future growth and addresses some debt we have accumulated over time, before the Stage 2 goal of having the new Enterprise Portal be the source-of-truth for all things subscriptions.

I am also aiming for a conservative approach with the Cody Gateway access related RPCs, to ease migration risks and allow for Cody teams to follow up quickly with more drastic changes in a V2 of the service after a Core-Services-driven migration to use the new service: https://github.com/sourcegraph/sourcegraph/pull/62263#issuecomment-2101874114

### Design overview

- **Multiple services**: Enterprise Portal aims to be the home of most Enterprise-related subscription and access management, but each component should be defined as a separate service to maintain clear boundaries between "core" capabilities and future extensions. One problem we see in the `dotcom { productSubscriptions }` is the embedding of additional concepts like Cody Gateway access makes the API surface unwieldy and brittle, and encourages an internal design that bundles everything together (the `product_subscriptions` table has 10 `cody_gateway_*` columns today). More concretely, this PR designs 2 services that Enterprise Portal will implement:
  - `EnterprisePortalSubscriptionsService` (`subscriptions.proto`): subscriptions and licenses CRUD
  - `EnterprisePortalCodyGatewayService` (`codygateway.proto`): Enterprise Cody Gateway access
- **Multiple protocols**: We use [ConnectRPC](https://connectrpc.com/) to generate traditional gRPC handlers for service-to-service use, but also a plain HTTP/1 "REST"-ish protocol (the ["Connect Protocol"](https://connectrpc.com/docs/protocol)) that works for web clients and simple integrations. Go bindings for the Connect protocol are generated into the `v1connect` subpackages.
- **Future licensing model/mechanism changes**: The _Subscription_ model is designed to remain static, but _Licenses_ are designed to accommodate future changes - `EnterpriseSubscriptionLicenseType` and `EnterpriseSubscriptionLicense` in this PR describe only the current type of license, referred to as "classic licenses", but we can extend this in the future for e.g. new models (refreshable licenses?), new products (Cody-only? PLG enterprise?), or existing problems (test instance licenses?).
- **Granular history**: Instead of `createdAt`, `isArchived`, `revokedAt`, and so on, the new API defines Kubernetes-style `conditions` for licenses and subscriptions to describe creation, archival, and revocation events respectively. These can be extended more flexibly for future events, and double as a lightweight audit log of major changes to a subscription or license. In particular, `revokedAt` already has a `revokedReason` - conditions allow us to extend these important events with additional metadata in a flexible manner.
- **Pagination**: I couldn't find a shared internal or off-the-shelf representation of pagination attributes, but each `List*` RPC describes `page_size`, `page_token`, and `next_page_token`
- **Querying/filtering**: I couldn't find a strong standard for this either, but in general:
  - `Get*` accepts `query` that is a `oneof`, with the goal of providing exact matches only.
  - `List*` accepts `repeated filter`, where each `filter` is a `oneof` a set of strategies relevant to a particular `List*` RPC. Multiple filters are treated as `AND`-concatenated.

Some major changes from the existing model:

- **Downgrade the concept of "subscription access token"**: this was built for Cody Gateway but I am not sure it has aged well, as the mechanism is still tied to individual licenses, has not found new non-Cody-Gateway use cases (except for license checks, though those do not require an "access token" model either), and the tokens are still not "true" access tokens today, as they cannot be expired or managed properly. This PR relegates the concept to remain Cody-specific, as it effectively is today, so that we might be able to introduce a better subscription-wide model if the use case arises. Over time, we may want to make this even more opaque, relying entirely on zero-config instead (generating from license keys).
- **Subscriptions are no longer attached to a single dotcom user**: Most of these users today are not real users anyway, as our license creation process asks that you create a fake user account (["User account: [...] We create a company-level account for this."](https://handbook.sourcegraph.com/departments/technical-success/ce/process/license_keys/#license-key-mechanics)). The new API removes the concept entirely, in favour of a true user access management system in CORE-102.
- **Database/GraphQL IDs** are no longer exposed - we use external, prefixed UUIDs for representing entities over APIs in a universal manner.
- **Per-subscription Cody Gateway access no longer exposes `allowed models`**: I suggested this to @rafax in light of recent problems with propagating new models to Enterprise customers. He agreed that the general product direction is "model options as a selling point" - it no longer makes sense to configure these at a per-subscription level. Instead, the Cody Gateway service should configure globally allowed models directly, and each Sourcegraph instance can determine what models they trust. If we really need this back we can add it later, but for now I think this removal is the right direction.

### Direct translations

`cmd/cody-gateway/internal/dotcom/operations.graphql` defines our key dependencies for achieving CORE-97. The concepts referred to in `operations.graphql` translate to this new API as follows:

- `dotcom { productSubscriptionByAccessToken(accessToken) }`: `codygateway.v1.GetCodyGatewayAccess({ access_token })`
- `dotcom { productSubscriptions }`: `codygateway.v1.ListCodyGatewayAccess()`
- `fragment ProductSubscriptionState`:
  - `id`: **n/a**
  - `uuid`: `subscriptions.v1.EnterpriseSubscription.id`
  - `account { username }`: `subscriptions.v1.EnterpriseSubscription.display_name`
  - `isArchived`: `subscriptions.v1.EnterpriseSubscription.conditions`
  - `codyGatewayAccess { ... }`: **separate RPC to `codygateway.v1.GetCodyGatewayAccess`**
  - `activeLicense { ... }`: **separate RPC to `subscriptions.v1.ListEnterpriseSubscriptionLicenses`**

### Why `lib/enterpriseportal`?

We recently had to move another Telemetry Gateway to `lib`: #62061. Inevitably, there will be services that live outside the monorepo that want to integrate with Enterprise Portal (one is on our roadmap: Cody Analytics in https://github.com/sourcegraph/cody-analytics). This allows us to share generated bindings and some useful helpers, while keeping things in the monorepo.

### Implications for Cody Clients

For now (and in the future), nothing is likely to change. Here's how I imagine things playing out:

```mermaid
graph TD
  cc["Cody Clients"] -- unified API --> cg[services like Cody Gateway]
  cg -- PLG users --> ssc[Self-Serve Cody]
  cg -- Enterprise users --> ep[Enterprise Portal]
```

## Test plan

CI passes, the schemas can be generated by hand:

```
sg gen buf \
  lib/enterpriseportal/subscriptions/v1/buf.gen.yaml \
  lib/enterpriseportal/codyaccess/v1/buf.gen.yaml
```

---------

Co-authored-by: Joe Chen <joe@sourcegraph.com>
Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
2024-05-15 12:58:55 -07:00
James Cotter
3b394e7954
sg/msp: add repo annotation to delivery pipeline (#62702) 2024-05-15 12:35:00 -07:00
Julie Tibshirani
41db4044ae
Symbols: new backend integration test (#62686)
This PR creates a new GraphQL integration test file focused on symbol search.
It exercises the same searches the web client uses for code navigation.

In a follow-up, we will add cases for older commits and enable Rockskip.
2024-05-15 17:32:54 +01:00
Robert Lin
cb15cea2b0
msp/cloudrun: use GA launch stage (#62685)
VPC direct egress is now GA: see example in https://registry.terraform.io/providers/hashicorp/google/5.29.0/docs/resources/cloud_run_v2_service#example-usage---cloudrunv2-service-directvpc and https://cloud.google.com/run/docs/configuring/vpc-direct-vpc

This also fixes the infinite `GA` -> `BETA` drift we have in TFC
2024-05-15 17:32:54 +01:00
Robert Lin
cc6cfd8499
msp/rollouts: remove Cloud Deploy target import (#62687)
Now that #62644 (CORE-23) is rolled out, this import block is no longer needed (and may even be disruptive when provisioning new rollout pipelines). The change was rolled out in:

- https://github.com/sourcegraph/managed-services/pull/1416
- https://github.com/sourcegraph/managed-services/pull/1417
- https://github.com/sourcegraph/managed-services/pull/1403

## Test plan

n/a
2024-05-15 17:32:54 +01:00
Noah S-C
d96745d78a
build-tracker: include error if failing to write to bigquery (#62699)
Without this, this error won't be logged to Sentry, resulting in us missing it unless we check GCP

## Test plan

Discussed with @jac
2024-05-15 17:32:53 +01:00
Noah S-C
0260dff81f
bazel: migrate legacy postgres-12 dockerfile to rules_oci (#61963) 2024-05-15 17:32:53 +01:00
William Bezuidenhout
fbb72fbdbc
sg: cloud - move all cloud ephemeral commands to a sub command ephemeral (#62569)
* move all ephemeral commands to subcommand `ephemeral` or `eph`

* bazel
2024-05-15 15:10:15 +02:00
Julie Tibshirani
066cb1e7b2
Docs: update integration test instructions (#62679)
This PR updates the README for GraphQL integration tests to encourage devs to
run them in CI instead of running locally. It also updates the local
instructions to use the `backend_integration_test` target.
2024-05-14 19:06:54 -07:00
Robert Lin
456315b54d
msp/rollouts: use new in-terraform custom target provisioning (#62644)
Closes CORE-23 - this change removes the manual `gcloud deploy apply` step previously required to enable MSP rollouts, thanks to a recent release of the Google Terraform provider.

## Test plan

https://github.com/sourcegraph/managed-services/pull/1403
2024-05-14 18:51:33 -07:00
Robert Lin
2d3b2c29f5
sg msp: add category flag for 'tfc sync' (#62675)
Enables staged rollouts of workspace updates.

## Test plan

```
sg msp tfc sync -all -category=test
```
2024-05-14 18:49:56 -07:00
Robert Lin
7308d16db9
msp/terraform: upgrade to 1.7.5 (#62650)
According to https://developer.hashicorp.com/terraform/language/v1.7.x/upgrade-guides this should be compatible with our current version, 1.3.10

We need to upgrade to use `import` blocks (TF 1.5), which will make https://github.com/sourcegraph/sourcegraph/pull/62644 and CORE-23 capable of a smooth rollout (otherwise we encounter conflict with the previously hand-deployed resources).

This also requires our CDKTF modules to be regenerated with the new Terraform version: https://github.com/sourcegraph/managed-services-platform-cdktf/pull/10

## Test plan

n/a - will do a staged rollout per https://www.notion.so/sourcegraph/MSP-infrastructure-upgrades-1808e7e45bd54f419dd93af542d99238?pvs=4
2024-05-14 12:33:06 -07:00
James Cotter
1d2076fc87
sg/msp: fix typo in exernal_health_check description (#62659) 2024-05-14 12:31:02 -07:00
Noah S-C
e2814f5fdc
build-tracker: include timestamp in agent state change events (#62670)
🙃  would be useful to have..

## Test plan

Confirmed bigquery code handles time.Time natively by inspecting the code
2024-05-14 18:03:17 +00:00
Michael Lin
0d2f0e7fb7
dev/linearhooks: add /-/healthz endpoint (#62646) 2024-05-14 00:25:10 +00:00
Robert Lin
71555cc0b1
msp/operationdocs: fix bad formatting (#62641)
Noticed some leftover awkward formatting from https://github.com/sourcegraph/sourcegraph/pull/62607

## Test plan

Golden tests
2024-05-13 14:31:01 -07:00
James Cotter
cf9bcb3d80
sg/msp: upgrade sentry (#62636) 2024-05-13 10:38:18 -07:00
Noah S-C
2c1fc163e6
build-tracker: fix handling of agent webhooks (#62632)
So the graphql API differs quite a bit 🙃 not using that as a reference anymore

## Test plan

updated unit test based on live data
2024-05-13 14:55:20 +00:00
Noah S-C
9a5eae8035
build-tracker: use repeated type for agent queues + deref strings (#62627)
TIL BigQuery has a native "repeated" type, so we don't have to comma-separate this out : )

Not impacting any existing data, as the current version isn't live yet (due to an MSP misconfiguration and the BigQuery tables not being created yet)

## Test plan

CI and live
2024-05-13 12:00:55 +00:00
William Bezuidenhout
478742e0b4
sg: cloud - remove wip notice for ephemeral commands (#62568) 2024-05-13 12:50:53 +02:00
Jean-Hadrien Chabran
2a09fe27db
chore(ci): bump backcompat target to 5.4.0 (#62623)
* chore(ci): bump backcompat target to 5.4.0
2024-05-13 11:37:11 +02:00
Noah S-C
0be15f8983
build-tracker: emit agent state-change webhook events to BigQuery (#62598)
Track when agents come on & offline in build-tracker

Closes https://github.com/sourcegraph/sourcegraph/issues/61275

## Test plan

Added unit test, the rest will be decided by the Prod Gods
2024-05-12 16:20:04 +02:00
Robert Lin
fdf0bf9a02
msp/operationdocs: add incident response starter guide, Notion-specific formatting (#62607)
Closes CORE-20: adds a small per-service "incident response" section near the alerts reference section of each service, providing some simple starter context and linking to other relevant guidance.

This change also makes some Notion-oriented formatting tweaks: putting all paragraphs on a single line (because of https://github.com/sourcegraph/notionreposync/issues/9) and also rendering callouts with appropriate background colors (https://github.com/sourcegraph/notionreposync/pull/11).

## Test plan

Golden tests, roll out to Notion:

```sh
GITHUB_ACTIONS=true sg msp ops generate-handbook-pages
```

Incident response:

![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/d07e0071-870f-4acb-b4a4-2246b40850a3)

Callouts:

![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/6ec7dbea-cafd-40e0-b50c-780c4e9cbd22)
2024-05-10 23:56:41 +00:00
Robert Lin
7b6dd9080e
msp: centralize and expose locations configuration (#62604)
This change adds a `locations: { gcpRegion: "...", gcpLocation: "..." }` configuration to centralize all location-related options. `gcpRegion` specifies regional preferences, while `gcpLocation` specifies multi-regional preferences (for resources that support it - only BigQuery in most cases).
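
For illustration, the shape of this configuration could be modelled as below. This is only a sketch: the Go type names are hypothetical, JSON is used as a stand-in for the real spec format, the example values are made up, and where the block sits within the actual service spec is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Locations mirrors the two keys described above.
type Locations struct {
	GCPRegion   string `json:"gcpRegion"`   // regional preference
	GCPLocation string `json:"gcpLocation"` // multi-regional preference
}

// Spec is a hypothetical wrapper holding only the locations block.
type Spec struct {
	Locations Locations `json:"locations"`
}

// parseSpec decodes a spec fragment containing a locations block.
func parseSpec(raw []byte) (Spec, error) {
	var s Spec
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// Example values are illustrative only.
	s, err := parseSpec([]byte(`{"locations": {"gcpRegion": "us-central1", "gcpLocation": "US"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s.Locations)
}
```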

Closes CORE-24 - see issue for some context.

## Test plan

```
sg msp generate -all # no diff
```

```
sg msp schema -output='../managed-services/schema/service.schema.json'
```
2024-05-10 15:50:07 -07:00
James Cotter
2d5ed2e735
sg/msp: add cloud deploy pubsub notifications (#62596)
---------

Co-authored-by: Joe Chen <joe@sourcegraph.com>
Co-authored-by: Robert Lin <robert@bobheadxi.dev>
2024-05-10 22:51:47 +01:00
Robert Lin
022b4ad95f
msp/terraformcloud: add option to respect existing run mode (#62580)
When using https://github.com/sourcegraph/sourcegraph/pull/62565, we override test environments that are in CLI mode, which can cause infra to be rolled out by surprise via VCS mode on switch - this change adds an option to respect the existing run mode configuration via `-workspace-run-mode=ignore`.

Thread: https://sourcegraph.slack.com/archives/C06JENN2QBF/p1715256898022469?thread_ts=1715251558.736709&cid=C06JENN2QBF

## Test plan

```
sg msp tfc sync -all
👉 Syncing all environments for all services, including setting ALL workspaces to use run mode "vcs" (use '-workspace-run-mode=ignore' to respect the existing run mode) - are you sure? (y/N)  N
 aborting
Projects/sourcegraph/managed-services 1 » sg msp tfc sync -all -workspace-run-mode=ignore
👉 Syncing all environments for all services - are you sure? (y/N)  y
// ...
```
2024-05-09 14:57:40 -07:00
Robert Lin
4d6455996c
msp: add infra and runtime support for job checkins (#62508)
Closes CORE-21 - allows jobs to register check-ins using Sentry when they are configured as cron jobs: https://docs.sentry.io/product/crons/, for a nice view of "is my job running or nah" without using GCP's less-than-beautiful console views

1. Adds the configured schedule and deadline as environment variables for MSP jobs
2. Adds a contract mechanism for checking in, for example:
```go
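	func work(ctx context.Context) (err error) {
		done, err := c.Diagnostics.JobExecutionCheckIn(ctx)
		if err != nil { /* failed to register check-in */ }
		// Wrap in a closure so done sees the final value of err; a plain
		// `defer done(err)` would evaluate err here, at the defer statement.
		defer func() { done(err) }()

		// ... do work
	}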
```
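
The interplay between the named return value and the deferred callback in this pattern can be tried out in isolation. The sketch below is self-contained and uses a hypothetical `checkIn` function in place of the real contract API; it only demonstrates that a deferred closure observes the final value of `err`.

```go
package main

import (
	"errors"
	"fmt"
)

// checkIn is a hypothetical stand-in for a check-in call: it returns a done
// callback that records the job's final error into *record.
func checkIn(record *error) (done func(error), err error) {
	return func(e error) { *record = e }, nil
}

func work(record *error) (err error) {
	done, err := checkIn(record)
	if err != nil {
		return err
	}
	// The closure reads err when work returns, so it observes the final value.
	// Arguments of a deferred call are evaluated at the defer statement, so
	// `defer done(err)` would pass the value err holds here (nil).
	defer func() { done(err) }()

	return errors.New("job failed")
}

func main() {
	var recorded error
	_ = work(&recorded)
	fmt.Println(recorded)
}
```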

## Test plan

```sh
TestJobExecutionCheckIn_SENTRY_DSN='...' go test -v ./runtime/contract
```

![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/8998af89-e74a-44a5-939a-92c8b63ea262)

In Slack:

![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/0677e2db-5a33-4751-ae86-d43e5b1e159f)

It appears the message is not currently customizable: https://develop.sentry.dev/sdk/check-ins/

---------

Co-authored-by: Joe Chen <joe@sourcegraph.com>
2024-05-09 10:11:48 -07:00
Robert Lin
a4b128f84b
sg msp tfc sync: support applying to all services (#62565) 2024-05-09 10:01:42 -07:00
Robert Lin
1463f6724f
msp/terraformcloud: grant 'sso' team read access to MSP workspaces (#62559) 2024-05-09 09:53:20 -07:00
Varun Gandhi
a447b6121a
chore: Refactor codeintel middleware code (#62573)
Fixes GRAPH-570

Adds new unit tests and updates integration test.
2024-05-09 12:48:12 +00:00
William Bezuidenhout
5dbca2397b
sg+release: push releases to the cloud ephemeral registry too (#62566)
* push promoted release to the cloud ephemeral registry

* push internal releases to the cloud ephemeral registry

* update dev/ci/images go.mod
2024-05-09 14:36:02 +02:00
William Bezuidenhout
de33a21871
sg: cloud ephemeral - add build and upgrade commands (#62533)
* fix list instances

* fixups

* add upgrade command

* add build ephemeral command

* add suggestions

* actually send the upgrade request

* fix lease handling of 0 values

* bazel

* review comments

- fix spelling and formatting

---------
Co-authored-by: Noah S-C <noah@sourcegraph.com>

* fixup
2024-05-09 11:52:41 +02:00
Robert Lin
a7416695cd
sg gen buf: fix completions (#62555)
The command accepts `buf.gen.yaml` files, but the current completion provides generated Go files instead. The output is also provided in absolute-path format, which is difficult to read. This change addresses both issues, though the completion is a bit slow still.

## Test plan

Before (incorrect completions):

![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/007cc580-3e2d-4c61-bd45-9326e775e48d)

After:

```sh
go build -o ./sg ./dev/sg && ./sg install -f -p=false # install globally
```

<img width="883" alt="image" src="https://github.com/sourcegraph/sourcegraph/assets/23356519/7c1831bb-bffc-4918-bee2-7ecf65386cc8">

Running `sg gen buf` against some of the suggested targets works as expected
2024-05-08 10:58:55 -07:00
Michael Lin
99dae350d5
dev/linearhooks: add support for project modifier (#62521) 2024-05-08 17:15:58 +00:00
Bolaji Olajide
c960789f9d
release: return an error when promoting a dev release (#62541)
return an error when promoting a dev release
2024-05-08 09:26:53 -05:00
William Bezuidenhout
826db2cb61
sg: cloud ephemeral check if deployment exists already (#62456)
* fix list instances

* handle instance not found

* use spec values

* handle cancel of build when things fail

* rebase and refactor

* do not cancel build and improve messaging

* remove cancelling of build

* fixup

* sg: cloud ephemeral fetch license key from gcp secrets manager (#62479)

* fix list instances

* create local dev const and use it with secrets

* fetch ephemeral license key from secrets

* Update dev/sg/internal/cloud/client.go
2024-05-08 11:56:00 +02:00
Varun Gandhi
1dc8230edc
chore: Rename function to reduce confusion (#62522)
Previously, there were two similarly named functions
moveToNewTeam and moveIssueToTeam, this renames the first
one to identifyTeamToMoveTo
2024-05-07 19:31:40 -07:00
Jean-Hadrien Chabran
d488517383
chore(rel): bump minor for stitch graph + add support invalidating migrations repo rule (#62511)
chore(rel): bump minor for stitch graph + add support invalidating migrations repo rule (#62490)

* chore(bzl): allow to invalidate migrations repo rule

* chore(bzl): gen stitch graph for 5.4
2024-05-07 22:04:59 +00:00
William Bezuidenhout
f7fb02e0e3
sg+ci: cloud ephemeral annotation (#62489)
* force cloud ephemeral run

* invoke cloud ephemeral annotation

* do not actually push stuff

* update annotation

* use emoji syntax

* fixup

fix wording

* remove debugging
2024-05-07 17:57:54 +02:00
William Bezuidenhout
bd7d44be00
sg: skip honey event duration if event is nil (#62476)
* sg: skip honey event duration if event is nil

* remove timeCheck

* Update dev/sg/linters/linters.go

Co-authored-by: Noah S-C <noah@sourcegraph.com>

---------

Co-authored-by: Noah S-C <noah@sourcegraph.com>
2024-05-07 17:56:29 +02:00