This PR is a follow-up to the improvements we made in the [SAMS repo](https://github.com/sourcegraph/sourcegraph-accounts/pull/199), which allow call sites to pass down a context (primarily to indicate a deadline and, of course, cancellation if desired) and collect the error returned from `background.Routine`'s `Stop` method.
Note that I did not adopt returning an error from the `Start` method because I realized that in the monorepo the more common (and arguably the desired) pattern is for the call to `Start` to block until `Stop` is called. Collecting errors from `Start` methods as return values would therefore be meaningless, and it would also complicate the design and semantics more than necessary.
All usages of `background.Routine` and `background.CombinedRoutines` are updated. I DID NOT try to interpret the code logic or improve anything beyond fixing compile and test errors.
The only file that contains the core change is [`lib/background/background.go`](https://github.com/sourcegraph/sourcegraph/pull/62136/files#diff-65c3228388620e91f8c22d91c18faac3f985fc67d64b08612df18fa7c04fafcd).
We currently don't call `Stop` on the pub/sub client, which is pretty important because I think it actually does buffering by default: https://sourcegraph.com/github.com/googleapis/google-cloud-go@6eb769621618a965abeabf11e6315bdb8be9b050/-/blob/pubsub/topic.go?L122-137
This change adds a `TopicPublisher` interface with write-only methods so that we don't accidentally `Stop` the client at call sites, and adds a `Stop` to telemetry-gateway. I'll follow up with another change to update Pings to the MSP runtime and apply a `Stop` there as well.
* log: remove use of description parameter in Scoped
* temporarily point to sglog branch
* bazel configure + gazelle
* remove additional use of description param
* use latest versions of zoekt,log,mountinfo
* go.mod
This change adds:
- telemetry export background jobs: flagged behind `TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR`, default empty => disabled
- telemetry redaction: configured in package `internal/telemetry/sensitivemetadataallowlist`
- telemetry-gateway service receiving events and forwarding them to a pub/sub topic (or just logging them, as configured in local dev)
- utilities for easily creating an event recorder: `internal/telemetry/telemetryrecorder`
Notes:
- all changes are feature-flagged to some degree, off by default, so the merge should be fairly low-risk.
- we decided that transmitting the full license key continues to be the way to go. we transmit it once per stream and attach it on all events in the telemetry-gateway. there is no auth mechanism at the moment
- GraphQL return type `EventLog.Source` is now a plain string instead of string enum. This should not be a breaking change in our clients, but must be made so that our generated V2 events do not break requesting of event logs
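As an illustration of the "default empty => disabled" gating mentioned for `TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR`, a minimal sketch follows. The `exporterConfig` type and `loadExporterConfig` helper are hypothetical names, not the ones used in this change; only the environment variable name comes from the description above.

```go
package main

import (
	"fmt"
	"os"
)

// exportAddrEnv is the variable named in this PR description.
const exportAddrEnv = "TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR"

// exporterConfig is a hypothetical config holder for this sketch.
type exporterConfig struct {
	ExportAddr string
}

// Enabled reports whether export jobs should be started at all.
// An empty address (the default) means the feature stays off.
func (c exporterConfig) Enabled() bool {
	return c.ExportAddr != ""
}

// loadExporterConfig takes the lookup function as an argument so the
// gating logic can be exercised without touching the real environment.
func loadExporterConfig(getenv func(string) string) exporterConfig {
	return exporterConfig{ExportAddr: getenv(exportAddrEnv)}
}

func main() {
	cfg := loadExporterConfig(os.Getenv)
	if !cfg.Enabled() {
		fmt.Println("telemetry export disabled")
		return
	}
	fmt.Println("telemetry export enabled, target:", cfg.ExportAddr)
}
```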
Stacked on https://github.com/sourcegraph/sourcegraph/pull/56520
Closes https://github.com/sourcegraph/sourcegraph/issues/56289
Closes https://github.com/sourcegraph/sourcegraph/issues/56287
## Test plan
Add an override to make the export super frequent:
```
env:
  TELEMETRY_GATEWAY_EXPORTER_EXPORT_INTERVAL: "10s"
  TELEMETRY_GATEWAY_EXPORTER_EXPORTED_EVENTS_RETENTION: "5m"
```
Start sourcegraph:
```
sg start
```
Enable `telemetry-export` featureflag (from https://github.com/sourcegraph/sourcegraph/pull/56520)
Emit some events in GraphQL:
```gql
mutation {
  telemetry {
    recordEvents(events:[{
      feature:"foobar"
      action:"view"
      source:{
        client:"WEB"
      }
      parameters:{
        version:0
      }
    }]) {
      alwaysNil
    }
  }
}
```
See series of log events:
```
[ worker] INFO worker.telemetrygateway-exporter telemetrygatewayexporter/telemetrygatewayexporter.go:61 Telemetry Gateway export enabled - initializing background routines
[ worker] INFO worker.telemetrygateway-exporter telemetrygatewayexporter/exporter.go:99 exporting events {"maxBatchSize": 10000, "count": 1}
[telemetry-g...y] INFO telemetry-gateway.pubsub pubsub/topic.go:115 Publish {"TraceId": "7852903434f0d2f647d397ee83b4009b", "SpanId": "8d945234bccf319b", "message": "{\"event\":{\"id\":\"dc96ae84-4ac4-4760-968f-0a0307b8bb3d\",\"timestamp\":\"2023-09-19T01:57:13.590266Z\",\"feature\":\"foobar\", ....
```
Build:
```
export VERSION="insiders"
bazel run //cmd/telemetry-gateway:candidate_push --config darwin-docker --stamp --workspace_status_command=./dev/bazel_stamp_vars.sh -- --tag $VERSION --repository us.gcr.io/sourcegraph-dev/telemetry-gateway
```
Deploy: https://github.com/sourcegraph/managed-services/pull/7
Add override:
```yaml
env:
  # Port required. TODO: What's the best way to provide gRPC addresses, such that a
  # localhost address is also possible?
  TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR: "https://telemetry-gateway.sgdev.org:443"
```
Repeat the above (`sg start` and emit some events):
```
[ worker] INFO worker.telemetrygateway-exporter telemetrygatewayexporter/exporter.go:94 exporting events {"maxBatchSize": 10000, "count": 6}
[ worker] INFO worker.telemetrygateway-exporter telemetrygatewayexporter/exporter.go:113 events exported {"maxBatchSize": 10000, "succeeded": 6}
[ worker] INFO worker.telemetrygateway-exporter telemetrygatewayexporter/exporter.go:94 exporting events {"maxBatchSize": 10000, "count": 1}
[ worker] INFO worker.telemetrygateway-exporter telemetrygatewayexporter/exporter.go:113 events exported {"maxBatchSize": 10000, "succeeded": 1}
```
This PR refactors the `internal/updatecheck` package a bit so that the
global state storing the Pub/Sub client (the result of the
preceding PR https://github.com/sourcegraph/sourcegraph/pull/55524) is
removed, and the client can be passed down to lower-level helper
functions as an argument.
This is necessary for
https://github.com/sourcegraph/sourcegraph/pull/55467 to hold a Pub/Sub
client on the call site in order to use `pubsub.TopicClient`'s helper
functions (i.e. `Ping`) while still being able to reuse most of the code
logic from the `internal/updatecheck` package.
## Test plan
CI
This PR refactors the `internal/pubsub` package to be more modular,
among other things:
1. Let the call sites decide the client lifecycle.
   - It looks like more global state is created because the call sites
     holding that state need to be refactored as well, but that is out
     of scope for this PR.
1. Expose the client so more helpful methods can be made available.
   - Here only the `Ping` method is added, which will later be used for
     health-checking purposes.
1. We were creating a new "publish stream" for every single message we
publish, which is the "anti-pattern" based on the docstring of the
`Topic` method and my reading of its implementation:

1. Also see inline review comments.
## Test plan
1. Request access to the GCP project telligentsourcegraph through
Entitle
1. Add the following to the `sg.config.overwrite.yaml`:
```yaml
commandsets:
  dotcom:
    env:
      PUBSUB_PROJECT_ID: 'telligentsourcegraph'
      PUBSUB_TOPIC_ID: 'server-update-checks-test' # for pings
      PUBSUB_DOTCOM_EVENTS_TOPIC_ID: 'server-update-checks-test' # for event logging
```
1. Run `sg start dotcom`
1. For event logging
- Pull the message and visit the https://sourcegraph.test:3443
<img width="1546" alt="CleanShot 2023-08-02 at 11 09 17@2x"
src="https://github.com/sourcegraph/sourcegraph/assets/2946214/a3c102b1-293b-4cc2-9c2c-028ac613fdf9">
1. For pings
- Pull the message and do `curl 'https://sourcegraph.test:3443/.api/updates?site=df0eed23-0e8c-4721-9849-147d20d59911&version=6.0.1'`
<img width="581" alt="CleanShot 2023-08-02 at 11 11 19@2x"
src="https://github.com/sourcegraph/sourcegraph/assets/2946214/185c24ab-2837-4090-8248-a701487c8dc0">
Instead of relying on the global environment variable
GOOGLE_APPLICATION_CREDENTIALS, which is used by every Google Cloud
library.
Once we roll this out and remove the explicit
GOOGLE_APPLICATION_CREDENTIALS from deploy-sourcegraph-dot-com,
profiling should work.
We currently use gopkg.in/inconshreveable/log15.v2, which points to
github.com/inconshreveable/log15. However, goimports recently started inserting
the github.com domain instead of the gopkg.in domain. This also seems to be the
preferred import path based on the documentation / import paths in the log15.