The background publisher was started regardless of whether analytics was
disabled. This PR makes it so that we only publish analytics if
it is enabled.
To make it work and not duplicate the disabled analytics check, I moved
the usershell + background context creation to happen earlier.
## Test plan
CI and tested locally
## Changelog
* sg - only start the analytics background publisher when analytics are
enabled
---------
Co-authored-by: Jean-Hadrien Chabran <jh@chabran.fr>
**chore(appliance): extract constant for configmap name**
To the reconciler, this is just a value, but to higher-level packages
like appliance, there is a single configmap that is an entity. Let's
make sure all high-level orchestration packages can reference our name
for it. This could itself be extracted to injected config if there was a
motivation for it.
**chore(appliance): extract NewRandomNamespace() in k8senvtest**
From reconciler tests, so that we can reuse it in self-update tests.
**feat(appliance): self-update**
Add a worker thread to the appliance that periodically polls release
registry for newer versions, and updates its own Kubernetes deployment.
If the APPLIANCE_DEPLOYMENT_NAME environment variable is not set, this
feature is disabled. This PR will be accompanied by one to the
appliance's helm chart to add this variable by default.
**fix(appliance): only self-update 2 minor versions above deployed SG**
**chore(appliance): self-update integration test extra case**
Check that self-update doesn't run when SG is not yet deployed.
https://linear.app/sourcegraph/issue/REL-212/appliance-can-self-upgrade
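The "2 minor versions above deployed SG" rule could be sketched roughly as follows. This is an illustration only: the `version` type, `parse`, and `pickUpdate` are hypothetical names, and restricting candidates to the same major version is an assumption that the description above does not state.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type version struct{ major, minor, patch int }

// parse reads a plain "X.Y.Z" version (optional "v" prefix).
func parse(s string) (version, error) {
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("invalid version %q", s)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil {
			return version{}, fmt.Errorf("invalid version %q: %w", s, err)
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, nil
}

func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	if v.minor != o.minor {
		return v.minor < o.minor
	}
	return v.patch < o.patch
}

// pickUpdate returns the newest candidate that is newer than current and at
// most maxMinorAhead minor versions above the deployed Sourcegraph version.
// Candidates with a different major version are skipped (an assumption).
func pickUpdate(current, deployed version, candidates []version, maxMinorAhead int) (version, bool) {
	best, found := current, false
	for _, c := range candidates {
		if c.major != deployed.major || c.minor > deployed.minor+maxMinorAhead {
			continue
		}
		if best.less(c) {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	deployed, _ := parse("5.4.0")
	candidates := []version{{5, 5, 1}, {5, 6, 0}, {5, 7, 0}}
	next, ok := pickUpdate(deployed, deployed, candidates, 2)
	fmt.Println(next, ok)
}
```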
Removes the `sg telemetry` command that pertains to the legacy V1
exporter that is specific to Cloud instances.
I got asked about this recently, and especially with the new `sg
analytics` for usage of the `sg` CLI, this has the potential to be
pretty confusing.
Part of https://linear.app/sourcegraph/issue/CORE-104
## Test plan
n/a
## Changelog
- `sg`: the deprecated `sg telemetry` command for allowlisting export of
V1 telemetry from Cloud instances has been removed. Use telemetry V2
instead.
The Docker images executor, executor-kubernetes, and bundled-executor have
reported high/critical CVE-2024-24790 and CVE-2023-45288 in the
golang stdlib. Upon testing, src version 5.3.0 was using `1.20.x` as per
e8e79e0311
This pull request attempts to upgrade src version to 5.4.0
## Test plan
- CI 🟢
- src version should report 5.4.0 (I built the image locally and tested
it)
`docker run --platform linux/amd64 -it --entrypoint /bin/sh
executor:candidate`
## Changelog
Upgrade src-cli version to 5.4.0 to address CVE-2024-24790 and
CVE-2023-45288
Currently, if a Cloud Ephemeral build is triggered, it is triggered on the
`main` sourcegraph pipeline. Once a build is triggered and a commit is
subsequently pushed, the previous build is cancelled, which means the
Cloud Ephemeral build is cancelled, leading to a failed deployment.
In this PR, we instead trigger a build on the Cloud Ephemeral pipeline,
which is the _exact_ same pipeline as `sourcegraph` main but:
- sets the pipeline env to always have `CLOUD_EPHEMERAL=true`
- does not cancel previous builds
## Test plan
https://buildkite.com/sourcegraph/cloud-ephemeral/builds/1
## Changelog
* `sg cloud eph` will now trigger builds on the `cloud-ephemeral`
pipeline
This PR restructures the packages to move all symbols-only code into the
symbols service. This helps to reason better about which service is
accessing what datastores.
Test plan:
Just moved code, compiler and CI are happy.
Patches CVE-2024-24790 by upgrading to the 27.0.3 tag. However, the patched
version has CVE-2024-24791 😟 and it doesn't have a patch.
## Test plan
Build and test image locally.
### Instructions to build and test locally
- Go to `dev/oci_deps.bzl`
- Find the current tag, for example `docker:26.1.3-dind`
- Go to the Docker registry, search for an updated tag, and grab one, for
example `docker:27.0.3-dind`
- Run `docker pull --platform linux/amd64 docker:27.0.3-dind`
- Add `platforms = ["linux/amd64"],` to the oci_pull for building and
testing locally
```bzl
oci_pull(
name = "upstream_dind_base",
digest = "sha256:2632da0d24924b179adf1c2e6f4ea6fb866747e84baea6b2ffaa8bff982ce102",
platforms = ["linux/amd64"],
)
```
- Run `sg images build dind`
- For testing, run `docker run --rm -it --entrypoint /bin/sh -v
/var/run/docker.sock:/var/run/docker.sock dind:candidate`
- Test docker commands and pull and run image for testing
## Changelog
- Upgraded dind to 27.0.3 to patch CVE-2024-24790 vulnerability
We missed during review that we were not using the `open` helper, which
picks the right method for the current OS. This means that `sg
analytics` doesn't work on Linux as is.
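For context, an OS-aware open helper typically looks something like the sketch below. `openCommand` is a hypothetical name and this is not the helper from the `sg` codebase; it only illustrates why hardcoding the macOS `open` binary breaks on Linux.

```go
package main

import (
	"fmt"
	"runtime"
)

// openCommand returns the program (and any leading arguments) conventionally
// used to open a URL in the default browser on the given GOOS value.
// Hypothetical helper for illustration.
func openCommand(goos string) (string, []string) {
	switch goos {
	case "darwin":
		return "open", nil
	case "windows":
		return "rundll32", []string{"url.dll,FileProtocolHandler"}
	default:
		// Linux and most other Unix desktops.
		return "xdg-open", nil
	}
}

func main() {
	name, args := openCommand(runtime.GOOS)
	fmt.Println(name, args)
}
```

A caller would append the URL to the returned arguments and run the result with `os/exec`.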
## Test plan
Locally tested.
Removes the existing `sg analytics` command and replaces it with a
one-per-invocation, SQLite-backed approach. This is local storage for
invocation events before they're pushed to BigQuery.
## Test plan
```
sqlite> select * from analytics;
0190792e-af38-751a-b93e-8481290a18b6|1|{"args":[],"command":"sg help","flags":{"help":null,"sg":null},"nargs":0,"end_time":"2024-07-03T15:20:21.069837706Z","success":true}
0190792f-4e2b-7c35-98d6-ad73cab82391|1|{"args":["dotcom"],"command":"sg live","flags":{"live":null,"sg":null},"nargs":1,"end_time":"2024-07-03T15:21:04.563232429Z","success":true}
```
## Changelog
---------
Co-authored-by: William Bezuidenhout <william.bezuidenhout@sourcegraph.com>
Drive by fix, dropped a few names who left the company and simplified
commands.
See DINF-106
Before: `sg teammate time|details olaf`
After: `sg teammate olaf` (shows both of the above)
## Test plan
Locally tested + CI.
Adds a new `postgreSQL.logicalReplication` configuration to allow MSP to
generate prerequisite setup for integration with Datastream:
https://cloud.google.com/datastream/docs/sources-postgresql. Integration
with Datastream allows the Data Analytics team to self-serve data
enrichment needs for the Telemetry V2 pipeline.
Enabling this feature entails downtime (Cloud SQL instance restart), so
enabling the logical replication feature at the Cloud SQL level
(`cloudsql.logical_decoding`) is gated behind
`postgreSQL.logicalReplication: {}`.
Setting up the required stuff in Postgres is a bit complicated,
requiring 3 Postgres provider instances:
1. The default admin one, authenticated with our admin user
2. New: a workload identity provider, using
https://github.com/cyrilgdn/terraform-provider-postgresql/pull/448 /
https://github.com/sourcegraph/managed-services-platform-cdktf/pull/11.
This is required for creating a publication on selected tables, which
requires being owner of said table. Because tables are created by
application using e.g. auto-migrate, the workload identity is always the
table owner, so we need to impersonate the IAM user
3. New: a "replication user" which is created with the replication
permission. Replication seems to not be a propagated permission so we
need a role/user that has replication enabled.
A bit more context scattered here and there in the docstrings.
Beyond the Postgres configuration we also introduce some additional
resources to enable easy Datastream configuration:
1. Datastream Private Connection, which peers to the service private
network
2. Cloud SQL Proxy VM, which only allows connections to `:5432` from the
range specified in 1, allowing a connection to the Cloud SQL instance
3. Datastream Connection Profile attached to 1
From there, data team can click-ops or manage the Datastream Stream and
BigQuery destination on their own.
Closes CORE-165
Closes CORE-212
Sample config:
```yaml
resources:
postgreSQL:
databases:
- "primary"
logicalReplication:
publications:
- name: testing
database: primary
tables:
- users
```
## Test plan
https://github.com/sourcegraph/managed-services/pull/1569
## Changelog
- MSP services can now configure `postgreSQL.logicalReplication` to
enable Data Analytics team to replicate selected database tables into
BigQuery.
Fixes DINF-82; This was very much a rabbithole. A few things:
- The race that @bobheadxi mentioned here
https://github.com/sourcegraph/sourcegraph/pull/63405#discussion_r1648180713
wasn't from `*output.Output` being unsafe, but `outputtest.Buffer` as it
happened again (see
[DINF-82](https://linear.app/sourcegraph/issue/DINF-82/devsgsg-test-failed-with-a-detected-race-condition))
- There is something messed up with `cmds.start()`, which sometimes ends up
printing the command output _after_ the exit message instead of before.
- The crude `sort.Strings(want|have)` that was there already fixes that.
- And without the sleep, it's possible to read the output from the
`outputtest.Buffer` before the command outputs get written to it.
- The `time.Sleep(300 * time.Millisecond)` _mitigates/hides_ that
problem.
At least, this shouldn't blow up in CI and buys us time to fix the whole
thing. We're tracking this in DINF-104. And out of 200 runs, I also
stumbled on a race in `progress_tty`, tracked in DINF-105 (that package
was originally meant to be used by `src-cli` and was re-used for `sg` 3
years ago).
I'm pretty unhappy about the solution, but a bandage is better than
nothing. Ideally, we should really reconsider dropping
`std.Output` entirely in `sg` and use the good stuff from
github.com/charmbracelet instead, because we don't want to spend too much
time on arcane terminal things ourselves. But I'm much more concerned about
the concurrency issues mentioned above.
## Test plan
CI + `sg bazel test //dev/sg:sg_test --runs_per_test=100`
Closes https://linear.app/sourcegraph/issue/SRC-410/race-in-gitserver-observability
This PR adds a mutex to the internal/observation.ErrCollector type that makes it safe to use across multiple goroutines.
(This could quite easily happen, as the FinishFunc's OnCancel method runs the logic that accesses/modifies ErrReporter in a separate goroutine:)
fa46a26f7a/internal/observation/observation.go (L156-L170)
## Test plan
CI now passes and doesn't report race conditions
## Changelog
- Fixed a thread-safety issue in the internal/observation.ErrCollector type
@chrsmith suggested this idea, which I like very much as well.
Pretty straightforward:
- if you're adding something you really don't want to commit and suspect
your future self to forget about it, you can add `FORBIDCOMMIT` anywhere
in your changes, and precommit will prevent you from accidentally
committing it.
- check is case insensitive.
I went for this instead of `NOCOMMIT` because that string could
legitimately be used, for example in a var holding the number of commits.
And we don't really want to add a pragma just to disable the check for
that string either.
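The check itself can be quite small. Here is a sketch of a case-insensitive scan, assuming the staged changes are available as a string; `findMarker` is a hypothetical name and the real pre-commit hook may be implemented differently:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const marker = "FORBIDCOMMIT"

// findMarker returns the 1-based numbers of the lines in text that contain
// the marker, ignoring case.
func findMarker(text string) []int {
	var hits []int
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		if strings.Contains(strings.ToUpper(sc.Text()), marker) {
			hits = append(hits, n)
		}
	}
	return hits
}

func main() {
	changes := "x := 1\n// forbidcommit: temporary debug hack\nnumCommits := 3\n"
	fmt.Println(findMarker(changes))
}
```

Note that `numCommits` on the third line does not trip the check, which is the motivation for not choosing `NOCOMMIT` as the marker.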
## Test plan

Small improvement as reported here
https://github.com/sourcegraph/devx-support/issues/1068
## Test plan
Tested locally
```
sourcegraph on wb/sg-bazel/rust-hint [$!+?] via 🐹 v1.22.4 via ❄️ impure (sourcegraph-dev-env) took 9m54s
❯ CARGO_BAZEL_ISOLATED=0 CARGO_BAZEL_REPIN_ONLY=crate_index go run ./dev/sg bazel configure rustdeps
✱ Invoking the following Bazel generating categories: rustdeps
👉 running command "bazel sync --only=crate_index"
sourcegraph on wb/sg-bazel/rust-hint [$!+?] via 🐹 v1.22.4 via ❄️ impure (sourcegraph-dev-env) took 51s
❯ CARGO_BAZEL_ISOLATED=1 CARGO_BAZEL_REPIN_ONLY=crate_index go run ./dev/sg bazel configure rustdeps
✱ Invoking the following Bazel generating categories: rustdeps
👉 running command "bazel sync --only=crate_index"
💡 pro-tip: run with CARGO_BAZEL_ISOLATED=0 for faster (but less sandboxed) repinning.
```
## Changelog
* sg - conditionally show protips when running `sg bazel`
Using `append` on a variable, then sharing that variable, surprisingly
seems to cause nondeterministic behaviour in the flags. This makes the
shared flag set a function so that each command gets its own set to
append to.
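One plausible mechanism for this kind of surprise (an assumption, since the PR does not spell out the root cause) is slice aliasing: when the shared slice has spare capacity, every `append` writes into the same backing array. The sketch below uses made-up flag names to show the problem and the function-based fix:

```go
package main

import "fmt"

// sharedFlags has spare capacity, so appends to it reuse one backing array.
var sharedFlags = append(make([]string, 0, 4), "verbose")

// newFlags returns a fresh slice on every call, so callers never share storage.
func newFlags() []string { return []string{"verbose"} }

// aliased appends to the shared slice twice; the second append overwrites
// the element written by the first.
func aliased() (string, string) {
	a := append(sharedFlags, "list-only")
	b := append(sharedFlags, "update-only")
	return a[1], b[1]
}

// isolated appends to independent slices, so each keeps its own element.
func isolated() (string, string) {
	a := append(newFlags(), "list-only")
	b := append(newFlags(), "update-only")
	return a[1], b[1]
}

func main() {
	fmt.Println(aliased())  // update-only update-only
	fmt.Println(isolated()) // list-only update-only
}
```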
## Test plan
`sg enterprise subscription list -h` now has the correct flags
Currently the matrix is hardcoded in the msp repo.
Service operators can forget to add or remove their service from the
list.
GitHub supports dynamically generating the matrix from a previous job's
output
([example](https://josh-ops.com/posts/github-actions-dynamic-matrix/)).
This PR adds an `sg msp subscription-matrix` command that generates
the matrix we need.
Part of CORE-202
## Test plan
Output
```
{"service":[{"id":"cloud-ops","env":"prod","category":"internal"},{"id":"gatekeeper","env":"prod","category":"internal"},{"id":"linearhooks","env":"prod","category":"internal"}]}
```
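A sketch of how output in that shape could be produced. The `service` and `matrix` types and `buildMatrix` are hypothetical; only the JSON field names are taken from the sample output above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// service mirrors one entry of the sample output.
type service struct {
	ID       string `json:"id"`
	Env      string `json:"env"`
	Category string `json:"category"`
}

type matrix struct {
	Service []service `json:"service"`
}

// buildMatrix renders the services as a single-line JSON matrix.
func buildMatrix(services []service) (string, error) {
	if services == nil {
		// Emit an empty array rather than null.
		services = []service{}
	}
	out, err := json.Marshal(matrix{Service: services})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := buildMatrix([]service{
		{ID: "cloud-ops", Env: "prod", Category: "internal"},
		{ID: "gatekeeper", Env: "prod", Category: "internal"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```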
Makes destructive updates usable in automation, such as GitHub Actions
## Test plan
```
sg enterprise subscription update-membership -subscription-instance-domain='bobheadxi.dev' --auto-approve '...'
```
This makes it easier to run Sourcegraph in local dev by compiling a few
key services (frontend, searcher, repo-updater, gitserver, and worker)
into a single Go binary and running that.
Compared to `sg start` (which compiles and runs ~10 services), it's
faster to start up (by ~10% or a few seconds), takes a lot less memory
and CPU when running, has less log noise, and rebuilds faster. It is
slower to recompile for changes just to `frontend` because it needs to
link in more code on each recompile, but it's faster for most other Go
changes that require recompilation of multiple services.
This is only intended for local dev as a convenience. There may be
different behavior in this mode that could result in problems when your
code runs in the normal deployment. Usually our e2e tests should catch
this, but to be safe, you should run in the usual mode if you are making
sensitive cross-service changes.
Partially reverts "svcmain: Simplify service setup (#61903)" (commit
9541032292).
## Test plan
Existing tests cover any regressions to existing behavior. This new
behavior is for local dev only.
We currently don't publish images from the new-style patch release
branches like `5.4.5099`, as this is all performed using the new release
tooling.
In order to improve the release process, we (Security) would like to run
a daily scan of the current set of images built from the patch release
branch. Currently we only scan images built from `main`, but these
slowly diverge from the patch release branch in the 2 week window
between a monthly release and the patch release.
To give a specific example, we currently have no easy/automated way to
scan images from the `5.4.5099` branch that a release will be cut from
this afternoon until the release team run the internal release process.
This PR updates the pipeline so that whenever a new commit is pushed to
the patch release branch, it will publish a new set of images and
include the tag `<branch>-insiders`. Currently just pushing to
us.gcr.io, but equally could push to dockerhub.
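The tagging rule can be sketched as below. The regular expression is an assumption based on the `5.4.5099` example, and `insidersTag` is a hypothetical helper; the real pipeline's branch pattern may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// patchReleaseBranch is an assumed pattern for branches such as "5.4.5099".
var patchReleaseBranch = regexp.MustCompile(`^\d+\.\d+\.\d+$`)

// insidersTag returns the extra "<branch>-insiders" tag for patch release
// branches, and false for any other branch.
func insidersTag(branch string) (string, bool) {
	if !patchReleaseBranch.MatchString(branch) {
		return "", false
	}
	return branch + "-insiders", true
}

func main() {
	for _, b := range []string{"5.4.5099", "main"} {
		tag, ok := insidersTag(b)
		fmt.Printf("%s -> %q (%v)\n", b, tag, ok)
	}
}
```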
Example of the jobfile for a matching branch after this PR:
`bazel --bazelrc=/tmp/aspect-generated.bazelrc
--bazelrc=.aspect/bazelrc/ci.sourcegraph.bazelrc run
//cmd/batcheshelper:candidate_push --stamp
--workspace_status_command=./dev/bazel_stamp_vars.sh -- --tag
dc438648b0 --tag dc438648b0cc_2024-06-20 --tag dc438648b0cc_279230
--tag will/5.4.9999-insiders --repository
us.gcr.io/sourcegraph-dev/batcheshelper && echo -e
'<tr><td>batcheshelper</td><td><code>us.gcr.io/sourcegraph-dev</code></td><td><code>dc438648b0cc</code>,
<code>dc438648b0cc_2024-06-20</code>, <code>dc438648b0cc_279230</code>,
<code>will/5.4.9999-insiders</code></td></tr>'
>>./annotations/pushed_images.md`
[Example buildkite
run](https://buildkite.com/sourcegraph/sourcegraph/builds/279230#_)
where the pattern was updated to match this branch, and pushing
non-candidate images was disabled.
This resolves one part of
[SEC-1734](https://linear.app/sourcegraph/issue/SEC-1734/scan-images-from-patch-release-branches)
## Test plan
- Manual testing of buildkite pipeline
## Changelog
Tired of seeing the go toolchain being easier to use than nix.
Test Plan: nix develop on linux amd64 and macbook arm64 followed by
running "go test ./internal/search" working. Also confirming that "go
env GOROOT" points into the nix store.
Closes CORE-99, closes CORE-176
This PR is based on (and also served as a PoC of) [RFC 962: MSP IAM
framework](https://docs.google.com/document/d/1ItJlQnpR5AHbrfAholZqjH8-8dPF1iQcKh99gE6SSjs/edit).
It comes with two main parts:
1. The initial version of the MSP IAM SDK:
`lib/managedservicesplatform/iam`
- Embeds the [OpenFGA server
implementation](https://github.com/openfga/openfga/tree/main/pkg/server)
and exposes a `ClientV1` for interacting with it.
- Automagically manages both MSP IAM's and OpenFGA's database
migrations upon initializing the `ClientV1`.
- Ensures the specified OpenFGA store and authorization model DSL
exist.
- Utility types and helpers to avoid easy mistakes (i.e. make the
relation tuples a bit more strongly-typed).
- Decided to put all types and pre-defined values together to simulate a
"central registry" that acts as a forcing function for services to form
some sort of convention. Then, when we migrate the OpenFGA server to a
separate standalone service, it will be less of a headache to
consolidate types/relations that have similar meanings but different
string literals.
2. The first use case of the MSP IAM:
`cmd/enterprise-portal/internal/subscriptionsservice`
- Added/updated RPCs:
- Listing enterprise subscriptions via permissions
- Update enterprise subscriptions to assign instance domains
- Update enterprise subscriptions membership to assign roles (and
permissions)
- A database table for enterprise subscriptions, only storing the extra
instance domains as Enterprise Portal is not the
writeable-source-of-truth.
## Other minor changes
- Moved `internal/redislock` to `lib/redislock` to be used in MSP IAM
SDK.
- Call `createdb ...` as part of `enterprise-portal` install script in
`sg.config.yaml` (`msp_iam` database is a hard requirement of MSP IAM
framework).
## Test plan
Tested with gRPC UI:
- `UpdateEnterpriseSubscription` to assign an instance domain
- `UpdateEnterpriseSubscriptionMembership` to assign roles
- `ListEnterpriseSubscriptions`:
- List by subscription ID
- List by instance domain
- List by view cody analytics permissions
---------
Co-authored-by: Robert Lin <robert@bobheadxi.dev>
The search console page is broken, is not used or maintained, and is
only referenced by a series of blog posts years ago. We have product
support to remove it.
Follow-up to https://github.com/sourcegraph/sourcegraph/pull/63320 as I
noticed that the `UsageText` didn't include `sg db default-site-admin`.
Additionally, it was quite verbose without providing much info, so I
just dropped it in favour of highlighting notable commands.
Adds a subcommand to `sg db` called `default-site-admin` that creates a
site-admin user with user:pass `sourcegraph:sourcegraph` and a
predefined hard-coded token
`sgp_local_f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0`
## Test plan
`go run ./dev/sg -- db default-site-admin` with clean database
`" "` after having run that (when everything should be set)
`" "` when user exists but token doesn't
## Changelog
This patch does a few things:
- Adds `go-enry` packages to depguard, so that people do not
accidentally use enry APIs instead of the corresponding APIs
in the `languages` package.
- Adds more tests for different functions in the languages package
to ensure mutual consistency in how language<->extension mappings
are handled.
- Adds tests for enry upgrades
- Adds comments with IDs so that related parts in the code can be
pieced together easily