# Documentation for how to override sg configuration for local development:
# https://github.com/sourcegraph/sourcegraph/blob/main/doc/dev/background-information/sg/index.md#configuration
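# A minimal illustration of an override (values are examples only; the exact
# merge semantics are described in the doc linked above): a gitignored
# sg.config.overwrite.yaml that mirrors this file's structure, e.g.
#
#   env:
#     SRC_LOG_LEVEL: debug
#     SRC_TRACE_LOG: true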
env:
PGPORT: 5432
PGHOST: localhost
PGUSER: sourcegraph
PGPASSWORD: sourcegraph
PGDATABASE: sourcegraph
PGSSLMODE: disable
SG_DEV_MIGRATE_ON_APPLICATION_STARTUP: 'true'
INSECURE_DEV: true
SRC_REPOS_DIR: $HOME/.sourcegraph/repos
SRC_LOG_LEVEL: info
SRC_LOG_FORMAT: condensed
SRC_TRACE_LOG: false
# Set this to true to show an iTerm link to the file:line where the log message came from
SRC_LOG_SOURCE_LINK: false
# Use two gitserver instances in local dev
SRC_GIT_SERVER_1: 127.0.0.1:3501
SRC_GIT_SERVER_2: 127.0.0.1:3502
SRC_GIT_SERVERS: 127.0.0.1:3501 127.0.0.1:3502
# Enable sharded indexed search mode:
INDEXED_SEARCH_SERVERS: localhost:3070 localhost:3071
GO111MODULE: 'on'
DEPLOY_TYPE: dev
SRC_HTTP_ADDR: ':3082'
# These may not need to be set explicitly; TODO: confirm and remove if unused.
SEARCHER_URL: http://127.0.0.1:3181
REPO_UPDATER_URL: http://127.0.0.1:3182
REDIS_ENDPOINT: 127.0.0.1:6379
SYMBOLS_URL: http://localhost:3184
EMBEDDINGS_URL: http://localhost:9991
SRC_SYNTECT_SERVER: http://localhost:9238
SRC_FRONTEND_INTERNAL: localhost:3090
GRAFANA_SERVER_URL: http://localhost:3370
PROMETHEUS_URL: http://localhost:9090
JAEGER_SERVER_URL: http://localhost:16686
SRC_DEVELOPMENT: 'true'
SRC_PROF_HTTP: ''
SRC_PROF_SERVICES: |
[
{ "Name": "frontend", "Host": "127.0.0.1:6063" },
{ "Name": "gitserver-0", "Host": "127.0.0.1:3551" },
{ "Name": "gitserver-1", "Host": "127.0.0.1:3552" },
{ "Name": "searcher", "Host": "127.0.0.1:6069" },
{ "Name": "symbols", "Host": "127.0.0.1:6071" },
{ "Name": "repo-updater", "Host": "127.0.0.1:6074" },
{ "Name": "codeintel-worker", "Host": "127.0.0.1:6088" },
{ "Name": "worker", "Host": "127.0.0.1:6089" },
{ "Name": "worker-executors", "Host": "127.0.0.1:6996" },
{ "Name": "embeddings", "Host": "127.0.0.1:6099" },
{ "Name": "zoekt-index-0", "Host": "127.0.0.1:6072" },
{ "Name": "zoekt-index-1", "Host": "127.0.0.1:6073" },
{ "Name": "syntactic-code-intel-worker-0", "Host": "127.0.0.1:6075" },
{ "Name": "syntactic-code-intel-worker-1", "Host": "127.0.0.1:6076" },
{ "Name": "zoekt-web-0", "Host": "127.0.0.1:3070", "DefaultPath": "/debug/requests/" },
{ "Name": "zoekt-web-1", "Host": "127.0.0.1:3071", "DefaultPath": "/debug/requests/" }
]
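# The entries above point at each service's debug/profiling listener. Assuming
# the standard Go net/http/pprof handlers are mounted there (an assumption for
# illustration, not verified here), a heap profile could be pulled with:
#   go tool pprof http://127.0.0.1:6063/debug/pprof/heap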
# Settings/config
SITE_CONFIG_FILE: ./dev/site-config.json
SITE_CONFIG_ALLOW_EDITS: true
GLOBAL_SETTINGS_FILE: ./dev/global-settings.json
GLOBAL_SETTINGS_ALLOW_EDITS: true
# Point codeintel to the `frontend` database in development
CODEINTEL_PGPORT: $PGPORT
CODEINTEL_PGHOST: $PGHOST
CODEINTEL_PGUSER: $PGUSER
CODEINTEL_PGPASSWORD: $PGPASSWORD
CODEINTEL_PGDATABASE: $PGDATABASE
CODEINTEL_PGSSLMODE: $PGSSLMODE
CODEINTEL_PGDATASOURCE: $PGDATASOURCE
CODEINTEL_PG_ALLOW_SINGLE_DB: true
# Required for `frontend` and `web` commands
SOURCEGRAPH_HTTPS_DOMAIN: sourcegraph.test
SOURCEGRAPH_HTTPS_PORT: 3443
# Required for `web` commands
NODE_OPTIONS: '--max_old_space_size=8192'
# Default `NODE_ENV` to `development`
NODE_ENV: development
# Required for codeintel uploadstore
PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT: http://localhost:9000
PRECISE_CODE_INTEL_UPLOAD_BACKEND: blobstore
# Required for embeddings job upload
EMBEDDINGS_UPLOAD_AWS_ENDPOINT: http://localhost:9000
# Required for upload of search job results
SEARCH_JOBS_UPLOAD_AWS_ENDPOINT: http://localhost:9000
# Point code insights to the `frontend` database in development
CODEINSIGHTS_PGPORT: $PGPORT
CODEINSIGHTS_PGHOST: $PGHOST
CODEINSIGHTS_PGUSER: $PGUSER
CODEINSIGHTS_PGPASSWORD: $PGPASSWORD
CODEINSIGHTS_PGDATABASE: $PGDATABASE
CODEINSIGHTS_PGSSLMODE: $PGSSLMODE
CODEINSIGHTS_PGDATASOURCE: $PGDATASOURCE
DB_STARTUP_TIMEOUT: 120s # codeinsights-db needs more time to start in some instances.
# Disable code insights by default
DISABLE_CODE_INSIGHTS_HISTORICAL: true
DISABLE_CODE_INSIGHTS: true
# OpenTelemetry in dev - use a single http/json endpoint
# OTEL_EXPORTER_OTLP_ENDPOINT: http://127.0.0.1:4318
# OTEL_EXPORTER_OTLP_PROTOCOL: http/json
# Enable gRPC Web UI for debugging
GRPC_WEB_UI_ENABLED: 'true'
# Enable full protobuf message logging when an internal error occurs
SRC_GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_ENABLED: 'true'
SRC_GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_JSON_TRUNCATION_SIZE_BYTES: '1KB'
SRC_GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_HANDLING_MAX_MESSAGE_SIZE_BYTES: '100MB'
## zoekt-specific message logging
GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_ENABLED: 'true'
GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_JSON_TRUNCATION_SIZE_BYTES: '1KB'
GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_HANDLING_MAX_MESSAGE_SIZE_BYTES: '100MB'
# Telemetry V2 export configuration. By default, this points to a test
# instance (go/msp-ops/telemetry-gateway#dev). Set the following:
#
# TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR: 'http://127.0.0.1:6080'
#
# in 'sg.config.overwrite.yaml' to point to a locally running Telemetry
# Gateway instead (via 'sg run telemetry-gateway')
TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR: "https://telemetry-gateway.sgdev.org:443"
SRC_TELEMETRY_EVENTS_EXPORT_ALL: 'true'
# By default, allow temporary edits to external services.
EXTSVC_CONFIG_ALLOW_EDITS: true
commands:
server:
description: Run an all-in-one sourcegraph/server image
cmd: ./dev/run-server-image.sh
env:
TAG: insiders
CLEAN: 'true'
DATA: '/tmp/sourcegraph-data'
URL: 'http://localhost:7080'
frontend:
description: Frontend
cmd: |
# TODO: This should be fixed
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
# If EXTSVC_CONFIG_FILE is *unset*, set a default.
export EXTSVC_CONFIG_FILE=${EXTSVC_CONFIG_FILE-'../dev-private/enterprise/dev/external-services-config.json'}
.bin/frontend
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/frontend github.com/sourcegraph/sourcegraph/cmd/frontend
checkBinary: .bin/frontend
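# Note on the DELVE toggle in the install script above: building with
# -gcflags 'all=-N -l' disables optimizations and inlining so Delve can set
# breakpoints reliably. A usage sketch (assuming sg passes DELVE through to
# the install step):
#   DELVE=1 sg run frontend
#   dlv attach "$(pgrep -f .bin/frontend)"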
env:
CONFIGURATION_MODE: server
USE_ENHANCED_LANGUAGE_DETECTION: false
SITE_CONFIG_FILE: '../dev-private/enterprise/dev/site-config.json'
SITE_CONFIG_ESCAPE_HATCH_PATH: '$HOME/.sourcegraph/site-config.json'
# frontend processes need this to be set so that the paths to the assets are rendered correctly
WEB_BUILDER_DEV_SERVER: 1
watch:
- lib
- internal
- cmd/frontend
gitserver-template: &gitserver_template
cmd: .bin/gitserver
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/gitserver github.com/sourcegraph/sourcegraph/cmd/gitserver
checkBinary: .bin/gitserver
env:
HOSTNAME: 127.0.0.1:3178
watch:
- lib
- internal
- cmd/gitserver
# This is only here to stay backwards-compatible with people's custom
# `sg.config.overwrite.yaml` files
gitserver:
<<: *gitserver_template
gitserver-0:
<<: *gitserver_template
env:
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3501
GITSERVER_ADDR: 127.0.0.1:3501
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_1
SRC_PROF_HTTP: 127.0.0.1:3551
gitserver-1:
<<: *gitserver_template
env:
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3502
GITSERVER_ADDR: 127.0.0.1:3502
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_2
SRC_PROF_HTTP: 127.0.0.1:3552
repo-updater:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/repo-updater
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/repo-updater github.com/sourcegraph/sourcegraph/cmd/repo-updater
checkBinary: .bin/repo-updater
watch:
- lib
- internal
- cmd/repo-updater
symbols:
cmd: .bin/symbols
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
# Ensure scip-ctags-dev is installed to avoid prompting the user to
# install it manually.
if [ ! -f "$(./dev/scip-ctags-install.sh which)" ]; then
./dev/scip-ctags-install.sh
fi
go build -gcflags="$GCFLAGS" -o .bin/symbols github.com/sourcegraph/sourcegraph/cmd/symbols
checkBinary: .bin/symbols
env:
CTAGS_COMMAND: dev/universal-ctags-dev
SCIP_CTAGS_COMMAND: dev/scip-ctags-dev
CTAGS_PROCESSES: 2
USE_ROCKSKIP: 'false'
watch:
- lib
- internal
- cmd/symbols
- internal/rockskip
embeddings:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/embeddings
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/embeddings github.com/sourcegraph/sourcegraph/cmd/embeddings
checkBinary: .bin/embeddings
watch:
- lib
- internal
- cmd/embeddings
- internal/embeddings
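# Local test sketch for this service (mutation name as defined by the
# embeddings GraphQL API; repo name illustrative): configure the `embeddings`
# object in the site config, run `sg start embeddings`, then in /api/console:
#   mutation {
#     scheduleRepositoriesForEmbedding(repoNames: ["github.com/sourcegraph/handbook"]) { __typename }
#   }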
worker:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/worker
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/worker github.com/sourcegraph/sourcegraph/cmd/worker
checkBinary: .bin/worker
watch:
- lib
- internal
- cmd/worker
cody-gateway:
cmd: |
.bin/cody-gateway
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/cody-gateway github.com/sourcegraph/sourcegraph/cmd/cody-gateway
checkBinary: .bin/cody-gateway
env:
CODY_GATEWAY_ANTHROPIC_ACCESS_TOKEN: foobar
# Set in override if you want to test local Cody Gateway: https://docs-legacy.sourcegraph.com/dev/how-to/cody_gateway
CODY_GATEWAY_DOTCOM_ACCESS_TOKEN: ''
CODY_GATEWAY_DOTCOM_API_URL: https://sourcegraph.test:3443/.api/graphql
CODY_GATEWAY_ALLOW_ANONYMOUS: true
CODY_GATEWAY_DIAGNOSTICS_SECRET: sekret
# Set in override if you want to test Embeddings with local Cody Gateway: http://go/embeddings-api-token-link
CODY_GATEWAY_SOURCEGRAPH_EMBEDDINGS_API_TOKEN: sekret
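      # As a sketch, a local sg.config.overwrite.yaml entry overriding the two
      # tokens above might look like this (token values are placeholders, and
      # the nesting assumes the same commands layout as this file):
      #
      #   commands:
      #     cody-gateway:
      #       env:
      #         CODY_GATEWAY_DOTCOM_ACCESS_TOKEN: 'sgp_...'
      #         CODY_GATEWAY_SOURCEGRAPH_EMBEDDINGS_API_TOKEN: '...'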
SRC_LOG_LEVEL: info
# Enables metrics in dev via debugserver
SRC_PROF_HTTP: '127.0.0.1:6098'
watch:
- lib
- internal
- cmd/cody-gateway
telemetry-gateway:
cmd: |
# Telemetry Gateway needs this to parse and validate incoming license keys.
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/telemetry-gateway
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/telemetry-gateway github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway
checkBinary: .bin/telemetry-gateway
env:
PORT: '6080'
DIAGNOSTICS_SECRET: sekret
TELEMETRY_GATEWAY_EVENTS_PUBSUB_ENABLED: false
SRC_LOG_LEVEL: info
GRPC_WEB_UI_ENABLED: true
# Set for convenience - use real values in sg.config.overwrite.yaml if you
# are interacting with RPCs that enforce SAMS M2M auth. See
# https://github.com/sourcegraph/accounts.sourcegraph.com/wiki/Operators-Cheat-Sheet#create-a-new-idp-client
TELEMETRY_GATEWAY_SAMS_CLIENT_ID: 'foo'
TELEMETRY_GATEWAY_SAMS_CLIENT_SECRET: 'bar'
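      # For example, a hypothetical override with real credentials could look
      # like this (IDs/secrets are placeholders):
      #
      #   commands:
      #     telemetry-gateway:
      #       env:
      #         TELEMETRY_GATEWAY_SAMS_CLIENT_ID: 'sams_cid_...'
      #         TELEMETRY_GATEWAY_SAMS_CLIENT_SECRET: 'sams_cs_...'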
watch:
- lib
- internal
- cmd/telemetry-gateway
- internal/telemetrygateway
pings:
cmd: |
.bin/pings
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/pings github.com/sourcegraph/sourcegraph/cmd/pings
checkBinary: .bin/pings
env:
PORT: '6080'
SRC_LOG_LEVEL: info
DIAGNOSTICS_SECRET: 'lifeisgood'
PINGS_PUBSUB_PROJECT_ID: 'telligentsourcegraph'
PINGS_PUBSUB_TOPIC_ID: 'server-update-checks-test'
HUBSPOT_ACCESS_TOKEN: ''
# Enables metrics in dev via debugserver
SRC_PROF_HTTP: '127.0.0.1:7011'
watch:
- lib
- internal
- cmd/pings
msp-example:
cmd: .bin/msp-example
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/msp-example github.com/sourcegraph/sourcegraph/cmd/msp-example
checkBinary: .bin/msp-example
env:
PORT: '9080'
DIAGNOSTICS_SECRET: sekret
SRC_LOG_LEVEL: debug
STATELESS_MODE: 'true'
watch:
- cmd/msp-example
- lib/managedservicesplatform
enterprise-portal:
cmd: |
# Connect to local development database, with the assumption that it will
# have dotcom database tables.
export DOTCOM_PGDSN_OVERRIDE="postgres://$PGUSER:$PGPASSWORD@$PGHOST:$PGPORT/$PGDATABASE?sslmode=$PGSSLMODE"
.bin/enterprise-portal
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/enterprise-portal github.com/sourcegraph/sourcegraph/cmd/enterprise-portal
checkBinary: .bin/enterprise-portal
env:
PORT: '6081'
DIAGNOSTICS_SECRET: sekret
SRC_LOG_LEVEL: debug
GRPC_WEB_UI_ENABLED: 'true'
    # Connects to the local database, so include all licenses from the local DB
DOTCOM_INCLUDE_PRODUCTION_LICENSES: 'true'
# Used for authentication
SAMS_URL: https://accounts.sgdev.org
# client name: 'enterprise-portal-local-dev'
ENTERPRISE_PORTAL_SAMS_CLIENT_ID: "sams_cid_018fc125-5a92-70fa-8dee-2c6df3adc100"
externalSecrets:
ENTERPRISE_PORTAL_SAMS_CLIENT_SECRET:
project: sourcegraph-local-dev
name: ENTERPRISE_PORTAL_LOCAL_SAMS_CLIENT_SECRET
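      # sg resolves externalSecrets at startup, fetching the named secret from
      # the given Google Cloud project and exposing it to the process as an
      # environment variable, so the real client secret never lives in this file.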
watch:
- lib
- cmd/enterprise-portal
searcher:
cmd: .bin/searcher
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/searcher github.com/sourcegraph/sourcegraph/cmd/searcher
checkBinary: .bin/searcher
watch:
- lib
- internal
- cmd/searcher
caddy:
ignoreStdout: true
ignoreStderr: true
cmd: .bin/caddy_${CADDY_VERSION} run --watch --config=dev/Caddyfile
install_func: installCaddy
env:
CADDY_VERSION: 2.7.3
web:
description: Enterprise version of the web app
cmd: pnpm --filter @sourcegraph/web dev
install: |
pnpm install
pnpm run generate
env:
ENABLE_OPEN_TELEMETRY: true
# Needed so that node can ping the caddy server
NODE_TLS_REJECT_UNAUTHORIZED: 0
web-sveltekit:
description: Enterprise version of the web sveltekit app
cmd: pnpm --filter @sourcegraph/web-sveltekit dev:enterprise
install: |
pnpm install
web-standalone-http:
description: Standalone web frontend (dev) with API proxy to a configurable URL
cmd: pnpm --filter @sourcegraph/web serve:dev --color
install: |
pnpm install
pnpm run generate
env:
WEB_BUILDER_SERVE_INDEX: true
SOURCEGRAPH_API_URL: https://sourcegraph.sourcegraph.com
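      # To develop against a different backend, a plausible local override is,
      # e.g. (URL is an example):
      #
      #   commands:
      #     web-standalone-http:
      #       env:
      #         SOURCEGRAPH_API_URL: https://sourcegraph.test:3443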
web-standalone-http-prod:
description: Standalone web frontend (production) with API proxy to a configurable URL
cmd: pnpm --filter @sourcegraph/web serve:prod
install: pnpm --filter @sourcegraph/web run build
env:
NODE_ENV: production
WEB_BUILDER_SERVE_INDEX: true
SOURCEGRAPH_API_URL: https://k8s.sgdev.org
web-integration-build:
description: Build development web application for integration tests
cmd: pnpm --filter @sourcegraph/web run build
env:
INTEGRATION_TESTS: true
web-integration-build-prod:
description: Build production web application for integration tests
cmd: pnpm --filter @sourcegraph/web run build
env:
INTEGRATION_TESTS: true
NODE_ENV: production
web-sveltekit-standalone:
description: Standalone SvelteKit web frontend (dev) with API proxy to a configurable URL
cmd: pnpm --filter @sourcegraph/web-sveltekit run dev
install: |
pnpm install
pnpm generate
web-sveltekit-prod-watch:
description: Builds the prod version of the SvelteKit web app and rebuilds on changes
cmd: pnpm --filter @sourcegraph/web-sveltekit run build --watch
install: |
pnpm install
pnpm generate
docsite:
description: Docsite instance serving the docs
env:
RUN_SCRIPT_NAME: .bin/bazel_run_docsite.sh
cmd: |
    # We run docsite via a bazel-generated script rather than `bazel run` directly:
    # on SIGINT, bazel would get killed but docsite wouldn't be terminated properly.
    # --script_path tells bazel to write out a script that runs docsite, and sg runs
    # that script instead, so any signal gets propagated and docsite shuts down cleanly.
#
# We also specifically put this in .bin, since that directory is gitignored, otherwise the run script is left
# around and currently there is no clean way to remove it - even using a bash trap doesn't work, since the trap
# never gets executed due to sg running the script.
bazel run --script_path=${RUN_SCRIPT_NAME} --noshow_progress --noshow_loading_progress //doc:serve
./${RUN_SCRIPT_NAME}
syntax-highlighter:
ignoreStdout: true
ignoreStderr: true
cmd: |
docker run --name=syntax-highlighter --rm -p9238:9238 \
-e WORKERS=1 -e ROCKET_ADDRESS=0.0.0.0 \
sourcegraph/syntax-highlighter:insiders
install: |
# Remove containers by the old name, too.
docker inspect syntect_server >/dev/null 2>&1 && docker rm -f syntect_server || true
docker inspect syntax-highlighter >/dev/null 2>&1 && docker rm -f syntax-highlighter || true
    # Pull the latest syntax-highlighter insiders image, only during install, but
# skip if OFFLINE=true is set.
if [[ "$OFFLINE" != "true" ]]; then
docker pull -q sourcegraph/syntax-highlighter:insiders
fi
zoekt-indexserver-template: &zoekt_indexserver_template
cmd: |
env PATH="${PWD}/.bin:$PATH" .bin/zoekt-sourcegraph-indexserver \
-sourcegraph_url 'http://localhost:3090' \
-index "$HOME/.sourcegraph/zoekt/index-$ZOEKT_NUM" \
-hostname "localhost:$ZOEKT_HOSTNAME_PORT" \
-interval 1m \
-listen "127.0.0.1:$ZOEKT_LISTEN_PORT" \
-cpu_fraction 0.25
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
mkdir -p .bin
export GOBIN="${PWD}/.bin"
go install -gcflags="$GCFLAGS" github.com/sourcegraph/zoekt/cmd/zoekt-archive-index
go install -gcflags="$GCFLAGS" github.com/sourcegraph/zoekt/cmd/zoekt-git-index
go install -gcflags="$GCFLAGS" github.com/sourcegraph/zoekt/cmd/zoekt-sourcegraph-indexserver
checkBinary: .bin/zoekt-sourcegraph-indexserver
env: &zoektenv
CTAGS_COMMAND: dev/universal-ctags-dev
SCIP_CTAGS_COMMAND: dev/scip-ctags-dev
GRPC_ENABLED: true
zoekt-index-0:
<<: *zoekt_indexserver_template
env:
<<: *zoektenv
ZOEKT_NUM: 0
ZOEKT_HOSTNAME_PORT: 3070
ZOEKT_LISTEN_PORT: 6072
zoekt-index-1:
<<: *zoekt_indexserver_template
env:
<<: *zoektenv
ZOEKT_NUM: 1
ZOEKT_HOSTNAME_PORT: 3071
ZOEKT_LISTEN_PORT: 6073
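  # The templates above make adding shards mechanical. A hypothetical third shard
  # would continue the port sequence; it would also need a matching zoekt-web-2
  # webserver listening on 127.0.0.1:3072 and a localhost:3072 entry in the
  # INDEXED_SEARCH_SERVERS list:
  #
  #   zoekt-index-2:
  #     <<: *zoekt_indexserver_template
  #     env:
  #       <<: *zoektenv
  #       ZOEKT_NUM: 2
  #       ZOEKT_HOSTNAME_PORT: 3072
  #       ZOEKT_LISTEN_PORT: 6074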
zoekt-web-template: &zoekt_webserver_template
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
mkdir -p .bin
env GOBIN="${PWD}/.bin" go install -gcflags="$GCFLAGS" github.com/sourcegraph/zoekt/cmd/zoekt-webserver
checkBinary: .bin/zoekt-webserver
env:
JAEGER_DISABLED: true
OPENTELEMETRY_DISABLED: false
GOGC: 25
zoekt-web-0:
<<: *zoekt_webserver_template
cmd: env PATH="${PWD}/.bin:$PATH" .bin/zoekt-webserver -index "$HOME/.sourcegraph/zoekt/index-0" -pprof -rpc -indexserver_proxy -listen "127.0.0.1:3070"
zoekt-web-1:
<<: *zoekt_webserver_template
cmd: env PATH="${PWD}/.bin:$PATH" .bin/zoekt-webserver -index "$HOME/.sourcegraph/zoekt/index-1" -pprof -rpc -indexserver_proxy -listen "127.0.0.1:3071"
codeintel-worker:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/codeintel-worker
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/codeintel-worker github.com/sourcegraph/sourcegraph/cmd/precise-code-intel-worker
checkBinary: .bin/codeintel-worker
watch:
- lib
- internal
- cmd/precise-code-intel-worker
- lib/codeintel
syntactic-codeintel-worker-template: &syntactic_codeintel_worker_template
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/syntactic-code-intel-worker
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
if [ ! -f $(./dev/scip-syntax-install.sh which) ]; then
echo "Building scip-syntax"
./dev/scip-syntax-install.sh
fi
echo "Building codeintel-outkline-scip-worker"
go build -gcflags="$GCFLAGS" -o .bin/syntactic-code-intel-worker github.com/sourcegraph/sourcegraph/cmd/syntactic-code-intel-worker
checkBinary: .bin/syntactic-code-intel-worker
watch:
- lib
- internal
- cmd/syntactic-code-intel-worker
- lib/codeintel
env:
SCIP_SYNTAX_PATH: dev/scip-syntax-dev
syntactic-code-intel-worker-0:
<<: *syntactic_codeintel_worker_template
env:
SYNTACTIC_CODE_INTEL_WORKER_ADDR: 127.0.0.1:6075
syntactic-code-intel-worker-1:
<<: *syntactic_codeintel_worker_template
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/syntactic-code-intel-worker
env:
SYNTACTIC_CODE_INTEL_WORKER_ADDR: 127.0.0.1:6076
executor-template:
&executor_template # TMPDIR is set here so it's not set in the `install` process, which would trip up `go build`.
cmd: |
env TMPDIR="$HOME/.sourcegraph/executor-temp" .bin/executor
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/executor github.com/sourcegraph/sourcegraph/cmd/executor
checkBinary: .bin/executor
env:
# Required for frontend and executor to communicate
EXECUTOR_FRONTEND_URL: http://localhost:3080
# Must match the secret defined in the site config.
EXECUTOR_FRONTEND_PASSWORD: hunter2hunter2hunter2
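      # i.e. the site configuration is expected to contain something like:
      #
      #   "executors.accessToken": "hunter2hunter2hunter2"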
# Disable firecracker inside executor in dev
EXECUTOR_USE_FIRECRACKER: false
EXECUTOR_QUEUE_NAME: TEMPLATE
watch:
- lib
- internal
- cmd/executor
executor-kubernetes-template: &executor_kubernetes_template
cmd: |
cd $MANIFEST_PATH
cleanup() {
kubectl delete jobs --all
kubectl delete -f .
}
kubectl delete -f . --ignore-not-found
kubectl apply -f .
trap cleanup EXIT SIGINT
while true; do
sleep 1
done
install: |
bazel run //cmd/executor-kubernetes:image_tarball
env:
IMAGE: executor-kubernetes:candidate
# TODO: This is required but should only be set on M1 Macs.
PLATFORM: linux/arm64
watch:
- lib
- internal
- cmd/executor
codeintel-executor:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/indexer-temp" .bin/executor
env:
EXECUTOR_QUEUE_NAME: codeintel
# If you want to use this, either start it with `sg run codeintel-executor-firecracker` or
# modify the `commandsets.codeintel` in your local `sg.config.overwrite.yaml`
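# A minimal sg.config.overwrite.yaml sketch (the command list shown here is
# illustrative, not the full default set):
#
#   commandsets:
#     codeintel:
#       commands:
#         - codeintel-executor-firecracker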
codeintel-executor-firecracker:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/codeintel-executor-temp" \
sudo --preserve-env=TMPDIR,EXECUTOR_QUEUE_NAME,EXECUTOR_FRONTEND_URL,EXECUTOR_FRONTEND_PASSWORD,EXECUTOR_USE_FIRECRACKER \
.bin/executor
env:
EXECUTOR_USE_FIRECRACKER: true
EXECUTOR_QUEUE_NAME: codeintel
codeintel-executor-kubernetes:
<<: *executor_kubernetes_template
env:
MANIFEST_PATH: ./cmd/executor/kubernetes/codeintel
batches-executor:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/batches-executor-temp" .bin/executor
env:
EXECUTOR_QUEUE_NAME: batches
EXECUTOR_MAXIMUM_NUM_JOBS: 8
# If you want to use this, either start it with `sg run batches-executor-firecracker` or
# modify the `commandsets.batches` in your local `sg.config.overwrite.yaml`
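# For example, a hypothetical overwrite:
#
#   commandsets:
#     batches:
#       commands:
#         - batches-executor-firecracker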
batches-executor-firecracker:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/batches-executor-temp" \
sudo --preserve-env=TMPDIR,EXECUTOR_QUEUE_NAME,EXECUTOR_FRONTEND_URL,EXECUTOR_FRONTEND_PASSWORD,EXECUTOR_USE_FIRECRACKER \
.bin/executor
env:
EXECUTOR_USE_FIRECRACKER: true
EXECUTOR_QUEUE_NAME: batches
batches-executor-kubernetes:
<<: *executor_kubernetes_template
env:
MANIFEST_PATH: ./cmd/executor/kubernetes/batches
# This command rebuilds the batcheshelper image every time its source changes.
batcheshelper-builder:
# Nothing to run for this, we just want to re-run the install script every time.
cmd: exit 0
install: |
bazel build //cmd/batcheshelper:image_tarball
docker load --input $(bazel cquery //cmd/batcheshelper:image_tarball --output=files)
env:
IMAGE: batcheshelper:candidate
# TODO: This is required but should only be set on M1 Macs.
PLATFORM: linux/arm64
watch:
- cmd/batcheshelper
- lib/batches
continueWatchOnExit: true
multiqueue-executor:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/multiqueue-executor-temp" .bin/executor
env:
EXECUTOR_QUEUE_NAME: ''
EXECUTOR_QUEUE_NAMES: 'codeintel,batches'
EXECUTOR_MAXIMUM_NUM_JOBS: 8
blobstore:
cmd: .bin/blobstore
install: |
# Ensure the old blobstore Docker container is not running
docker rm -f blobstore
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/blobstore github.com/sourcegraph/sourcegraph/cmd/blobstore
checkBinary: .bin/blobstore
watch:
- lib
- internal
- cmd/blobstore
env:
BLOBSTORE_DATA_DIR: $HOME/.sourcegraph-dev/data/blobstore-go
redis-postgres:
# Add the following overwrites to your sg.config.overwrite.yaml to use the docker-compose
# database:
#
# env:
# PGHOST: localhost
# PGPASSWORD: sourcegraph
# PGUSER: sourcegraph
#
# You could also add an overwrite to add `redis-postgres` to the relevant command set(s).
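#
# For example (the command set name here is illustrative):
#
#   commandsets:
#     default:
#       commands:
#         - redis-postgres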
description: Dockerized version of redis and postgres
cmd: docker-compose -f dev/redis-postgres.yml up $COMPOSE_ARGS
env:
COMPOSE_ARGS: --force-recreate
jaeger:
cmd: |
echo "Jaeger will be available on http://localhost:16686/-/debug/jaeger/search"
.bin/jaeger-all-in-one-${JAEGER_VERSION} --log-level ${JAEGER_LOG_LEVEL}
install_func: installJaeger
env:
JAEGER_VERSION: 1.45.0
JAEGER_DISK: $HOME/.sourcegraph-dev/data/jaeger
JAEGER_LOG_LEVEL: error
QUERY_BASE_PATH: /-/debug/jaeger
grafana:
cmd: |
if [[ $(uname) == "Linux" ]]; then
# Linux needs an extra arg to support host.docker.internal, which is how Grafana connects
# to the Prometheus backend.
ADD_HOST_FLAG="--add-host=host.docker.internal:host-gateway"
# Docker users on Linux will generally be using direct user mapping, which
# means that they'll want the data in the volume mount to be owned by the
# same user as is running this script. Fortunately, the Grafana container
# doesn't really care what user it runs as, so long as it can write to
# /var/lib/grafana.
DOCKER_USER="--user=$UID"
fi
echo "Grafana: serving on http://localhost:${PORT}"
echo "Grafana: note that logs are piped to ${GRAFANA_LOG_FILE}"
docker run --rm ${DOCKER_USER} \
--name=${CONTAINER} \
--cpus=1 \
--memory=1g \
-p 0.0.0.0:3370:3370 ${ADD_HOST_FLAG} \
-v "${GRAFANA_DISK}":/var/lib/grafana \
-v "$(pwd)"/dev/grafana/all:/sg_config_grafana/provisioning/datasources \
grafana:candidate >"${GRAFANA_LOG_FILE}" 2>&1
install: |
mkdir -p "${GRAFANA_DISK}"
mkdir -p "$(dirname ${GRAFANA_LOG_FILE})"
docker inspect $CONTAINER >/dev/null 2>&1 && docker rm -f $CONTAINER
bazel build //docker-images/grafana:image_tarball
docker load --input $(bazel cquery //docker-images/grafana:image_tarball --output=files)
env:
GRAFANA_DISK: $HOME/.sourcegraph-dev/data/grafana
# Log file location: since we log outside of the Docker container, we should
# log somewhere that's _not_ ~/.sourcegraph-dev/data/grafana, since that gets
# volume mounted into the container and therefore has its own ownership
# semantics.
# Now for the actual logging. Grafana's output gets sent to stdout and stderr.
# We want to capture that output, but because it's fairly noisy, don't want to
# display it in the normal case.
GRAFANA_LOG_FILE: $HOME/.sourcegraph-dev/logs/grafana/grafana.log
IMAGE: grafana:candidate
CONTAINER: grafana
PORT: 3370
# docker containers must access things via docker host on non-linux platforms
DOCKER_USER: ''
ADD_HOST_FLAG: ''
CACHE: false
prometheus:
cmd: |
if [[ $(uname) == "Linux" ]]; then
DOCKER_USER="--user=$UID"
# Frontend generally runs outside of Docker, so to access it we need to be
# able to access ports on the host. --net=host is a very dirty way of
# enabling this.
DOCKER_NET="--net=host"
SRC_FRONTEND_INTERNAL="localhost:3090"
fi
echo "Prometheus: serving on http://localhost:${PORT}"
echo "Prometheus: note that logs are piped to ${PROMETHEUS_LOG_FILE}"
docker run --rm ${DOCKER_NET} ${DOCKER_USER} \
--name=${CONTAINER} \
--cpus=1 \
--memory=4g \
-p 0.0.0.0:9090:9090 \
-v "${PROMETHEUS_DISK}":/prometheus \
-v "$(pwd)/${CONFIG_DIR}":/sg_prometheus_add_ons \
-e SRC_FRONTEND_INTERNAL="${SRC_FRONTEND_INTERNAL}" \
-e DISABLE_SOURCEGRAPH_CONFIG="${DISABLE_SOURCEGRAPH_CONFIG:-""}" \
-e DISABLE_ALERTMANAGER="${DISABLE_ALERTMANAGER:-""}" \
-e PROMETHEUS_ADDITIONAL_FLAGS="--web.enable-lifecycle --web.enable-admin-api" \
${IMAGE} >"${PROMETHEUS_LOG_FILE}" 2>&1
install: |
mkdir -p "${PROMETHEUS_DISK}"
mkdir -p "$(dirname ${PROMETHEUS_LOG_FILE})"
docker inspect $CONTAINER >/dev/null 2>&1 && docker rm -f $CONTAINER
if [[ $(uname) == "Linux" ]]; then
PROM_TARGETS="dev/prometheus/linux/prometheus_targets.yml"
fi
cp ${PROM_TARGETS} "${CONFIG_DIR}"/prometheus_targets.yml
bazel build //docker-images/prometheus:image_tarball
docker load --input $(bazel cquery //docker-images/prometheus:image_tarball --output=files)
env:
PROMETHEUS_DISK: $HOME/.sourcegraph-dev/data/prometheus
# See comment above for `grafana`
PROMETHEUS_LOG_FILE: $HOME/.sourcegraph-dev/logs/prometheus/prometheus.log
IMAGE: prometheus:candidate
CONTAINER: prometheus
PORT: 9090
CONFIG_DIR: docker-images/prometheus/config
DOCKER_USER: ''
DOCKER_NET: ''
PROM_TARGETS: dev/prometheus/all/prometheus_targets.yml
SRC_FRONTEND_INTERNAL: host.docker.internal:3090
ADD_HOST_FLAG: ''
DISABLE_SOURCEGRAPH_CONFIG: false
postgres_exporter:
cmd: |
if [[ $(uname) == "Linux" ]]; then
# Linux needs an extra arg to support host.docker.internal, which is how the
# exporter connects to the Postgres database on the host.
ADD_HOST_FLAG="--add-host=host.docker.internal:host-gateway"
fi
# Use psql to read the effective values for PG* env vars (instead of, e.g., hardcoding the default
# values).
get_pg_env() { psql -c '\set' | grep "$1" | cut -f 2 -d "'"; }
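# `psql -c '\set'` prints lines like: HOST = 'localhost'. grep picks the
# matching variable and cut extracts the single-quoted value.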
PGHOST=${PGHOST-$(get_pg_env HOST)}
PGUSER=${PGUSER-$(get_pg_env USER)}
PGPORT=${PGPORT-$(get_pg_env PORT)}
# We need to be able to query the migration_logs table.
PGDATABASE=${PGDATABASE-$(get_pg_env DBNAME)}
ADJUSTED_HOST=${PGHOST:-127.0.0.1}
if [[ ("$ADJUSTED_HOST" == "localhost" || "$ADJUSTED_HOST" == "127.0.0.1" || -f "$ADJUSTED_HOST") && "$OSTYPE" != "linux-gnu" ]]; then
ADJUSTED_HOST="host.docker.internal"
fi
NET_ARG=""
DATA_SOURCE_NAME="postgresql://${PGUSER}:${PGPASSWORD}@${ADJUSTED_HOST}:${PGPORT}/${PGDATABASE}?sslmode=${PGSSLMODE:-disable}"
if [[ "$OSTYPE" == "linux-gnu" ]]; then
NET_ARG="--net=host"
DATA_SOURCE_NAME="postgresql://${PGUSER}:${PGPASSWORD}@${ADJUSTED_HOST}:${PGPORT}/${PGDATABASE}?sslmode=${PGSSLMODE:-disable}"
fi
echo "postgres_exporter: serving on http://localhost:${PORT}"
docker run --rm ${NET_ARG} ${DOCKER_USER} \
--name=${CONTAINER} \
-e DATA_SOURCE_NAME="${DATA_SOURCE_NAME}" \
--cpus=1 \
--memory=1g \
-p 0.0.0.0:9187:9187 ${ADD_HOST_FLAG} \
"${IMAGE}"
install: |
docker inspect $CONTAINER >/dev/null 2>&1 && docker rm -f $CONTAINER
bazel build //docker-images/postgres_exporter:image_tarball
docker load --input $(bazel cquery //docker-images/postgres_exporter:image_tarball --output=files)
env:
IMAGE: postgres-exporter:candidate
CONTAINER: postgres_exporter
# docker containers must access things via docker host on non-linux platforms
DOCKER_USER: ''
ADD_HOST_FLAG: ''
monitoring-generator:
cmd: echo "monitoring-generator is deprecated, please run 'sg generate go' or 'bazel run //dev:write_all_generated' instead"
env:
otel-collector:
install: |
bazel build //docker-images/opentelemetry-collector:image_tarball
docker load --input $(bazel cquery //docker-images/opentelemetry-collector:image_tarball --output=files)
description: OpenTelemetry collector
cmd: |
JAEGER_HOST='host.docker.internal'
if [[ $(uname) == "Linux" ]]; then
# Jaeger generally runs outside of Docker, so to access it we need to be
# able to access ports on the host, because the Docker host only exists on
# MacOS. --net=host is a very dirty way of enabling this.
DOCKER_NET="--net=host"
JAEGER_HOST="localhost"
fi
docker container rm -f otel-collector
docker run --rm --name=otel-collector $DOCKER_NET $DOCKER_ARGS \
-p 4317:4317 -p 4318:4318 -p 55679:55679 -p 55670:55670 \
-p 8888:8888 \
-e JAEGER_HOST=$JAEGER_HOST \
-e HONEYCOMB_API_KEY=$HONEYCOMB_API_KEY \
-e HONEYCOMB_DATASET=$HONEYCOMB_DATASET \
$IMAGE --config "/etc/otel-collector/$CONFIGURATION_FILE"
env:
IMAGE: opentelemetry-collector:candidate
# Overwrite the following in sg.config.overwrite.yaml, based on which collector
# config you are using - see docker-images/opentelemetry-collector for more details.
CONFIGURATION_FILE: 'configs/jaeger.yaml'
# HONEYCOMB_API_KEY: ''
# HONEYCOMB_DATASET: ''
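# For example, a hypothetical overwrite that ships traces to Honeycomb (the
# config file name is an assumption; see docker-images/opentelemetry-collector
# for the available configs):
#
#   commands:
#     otel-collector:
#       env:
#         CONFIGURATION_FILE: 'configs/honeycomb.yaml'
#         HONEYCOMB_API_KEY: '<your-api-key>'
#         HONEYCOMB_DATASET: '<your-dataset>'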
storybook:
cmd: pnpm storybook
install: pnpm install
# This executes `env`, a utility that prints the process environment. It can
# be used to debug which global vars `sg` uses.
debug-env:
description: Debug env vars
cmd: env
bext:
cmd: pnpm --filter @sourcegraph/browser dev
install: pnpm install
bazelCommands:
blobstore:
target: //cmd/blobstore
env:
BLOBSTORE_DATA_DIR: $HOME/.sourcegraph-dev/data/blobstore-go
cody-gateway:
target: //cmd/cody-gateway
env:
CODY_GATEWAY_ANTHROPIC_ACCESS_TOKEN: foobar
# Set in override if you want to test local Cody Gateway: https://docs-legacy.sourcegraph.com/dev/how-to/cody_gateway
CODY_GATEWAY_DOTCOM_ACCESS_TOKEN: ''
CODY_GATEWAY_DOTCOM_API_URL: https://sourcegraph.test:3443/.api/graphql
CODY_GATEWAY_ALLOW_ANONYMOUS: true
CODY_GATEWAY_DIAGNOSTICS_SECRET: sekret
# Set in override if you want to test Embeddings with local Cody Gateway: http://go/embeddings-api-token-link
CODY_GATEWAY_SOURCEGRAPH_EMBEDDINGS_API_TOKEN: sekret
SRC_LOG_LEVEL: info
# Enables metrics in dev via debugserver
SRC_PROF_HTTP: '127.0.0.1:6098'
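# A hypothetical sg.config.overwrite.yaml sketch for testing against a local
# Cody Gateway (the token value is a placeholder):
#
#   bazelCommands:
#     cody-gateway:
#       env:
#         CODY_GATEWAY_DOTCOM_ACCESS_TOKEN: '<dotcom-access-token>'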
docsite:
runTarget: //doc:serve
searcher:
target: //cmd/searcher
syntax-highlighter:
target: //docker-images/syntax-highlighter:syntect_server
ignoreStdout: true
ignoreStderr: true
env:
# Environment copied from Dockerfile
WORKERS: '1'
ROCKET_ENV: 'production'
ROCKET_LIMITS: '{json=10485760}'
ROCKET_SECRET_KEY: 'SeerutKeyIsI7releuantAndknvsuZPluaseIgnorYA='
ROCKET_KEEP_ALIVE: '0'
ROCKET_PORT: '9238'
QUIET: 'true'
frontend:
description: Enterprise frontend
target: //cmd/frontend
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
# If EXTSVC_CONFIG_FILE is *unset*, set a default.
export EXTSVC_CONFIG_FILE=${EXTSVC_CONFIG_FILE-'../dev-private/enterprise/dev/external-services-config.json'}
env:
CONFIGURATION_MODE: server
USE_ENHANCED_LANGUAGE_DETECTION: false
SITE_CONFIG_FILE: '../dev-private/enterprise/dev/site-config.json'
SITE_CONFIG_ESCAPE_HATCH_PATH: '$HOME/.sourcegraph/site-config.json'
# Frontend processes need this to be set so that the paths to the assets are rendered correctly
WEB_BUILDER_DEV_SERVER: 1
worker:
target: //cmd/worker
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
repo-updater:
target: //cmd/repo-updater
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
symbols:
target: //cmd/symbols
checkBinary: .bin/symbols
env:
CTAGS_COMMAND: dev/universal-ctags-dev
SCIP_CTAGS_COMMAND: dev/scip-ctags-dev
CTAGS_PROCESSES: 2
USE_ROCKSKIP: 'false'
gitserver-template: &gitserver_bazel_template
target: //cmd/gitserver
env: &gitserverenv
HOSTNAME: 127.0.0.1:3178
# This is only here to stay backwards-compatible with people's custom
# `sg.config.overwrite.yaml` files
gitserver:
<<: *gitserver_bazel_template
gitserver-0:
<<: *gitserver_bazel_template
env:
<<: *gitserverenv
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3501
GITSERVER_ADDR: 127.0.0.1:3501
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_1
SRC_PROF_HTTP: 127.0.0.1:3551
gitserver-1:
<<: *gitserver_bazel_template
env:
<<: *gitserverenv
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3502
GITSERVER_ADDR: 127.0.0.1:3502
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_2
SRC_PROF_HTTP: 127.0.0.1:3552
codeintel-worker:
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
target: //cmd/precise-code-intel-worker
executor-template: &executor_template_bazel
target: //cmd/executor
env:
EXECUTOR_QUEUE_NAME: TEMPLATE
TMPDIR: $HOME/.sourcegraph/executor-temp
# Required for frontend and executor to communicate
EXECUTOR_FRONTEND_URL: http://localhost:3080
# Must match the secret defined in the site config.
EXECUTOR_FRONTEND_PASSWORD: hunter2hunter2hunter2
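# (In dev-private's site config this presumably corresponds to the
# "executors.accessToken" setting, e.g. "executors.accessToken": "hunter2hunter2hunter2".)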
# Disable firecracker inside executor in dev
EXECUTOR_USE_FIRECRACKER: false
codeintel-executor:
<<: *executor_template_bazel
env:
EXECUTOR_QUEUE_NAME: codeintel
TMPDIR: $HOME/.sourcegraph/indexer-temp
dockerCommands:
batcheshelper-builder:
# Nothing to run for this; we just want to re-run the install script every time.
cmd: exit 0
target: //cmd/batcheshelper:image_tarball
image: batcheshelper:candidate
env:
# TODO: This is required but should only be set on M1 Macs.
PLATFORM: linux/arm64
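# On non-ARM machines you would presumably override this (e.g. in
# sg.config.overwrite.yaml) to linux/amd64, or leave it unset.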
continueWatchOnExit: true
grafana:
target: //docker-images/grafana:image_tarball
docker:
image: grafana:candidate
ports:
- 3370
flags:
cpus: 1
memory: 1g
volumes:
- from: $HOME/.sourcegraph-dev/data/grafana
to: /var/lib/grafana
- from: $(pwd)/dev/grafana/all
to: /sg_config_grafana/provisioning/datasources
linux:
flags:
# Linux needs an extra arg to support host.docker.internal, which is how Grafana
# connects to the Prometheus backend.
add-host: host.docker.internal:host-gateway
# Docker users on Linux will generally be using direct user mapping, which
# means that they'll want the data in the volume mount to be owned by the
# same user as is running this script. Fortunately, the Grafana container
# doesn't really care what user it runs as, so long as it can write to
# /var/lib/grafana.
user: $UID
# Log file location: since we log outside of the Docker container, we should
# log somewhere that's _not_ ~/.sourcegraph-dev/data/grafana, since that
# directory gets volume-mounted into the container and therefore has its own
# ownership semantics.
# Grafana's output goes to stdout and stderr; we capture it here, but because
# it's fairly noisy, we don't display it in the normal case.
logfile: $HOME/.sourcegraph-dev/logs/grafana/grafana.log
env:
# Docker containers must access things via the Docker host on non-Linux platforms
CACHE: false
otel-collector:
target: //docker-images/opentelemetry-collector:image_tarball
description: OpenTelemetry collector
args: '--config "/etc/otel-collector/$CONFIGURATION_FILE"'
docker:
image: opentelemetry-collector:candidate
ports:
- 4317
- 4318
- 55679
- 55670
- 8888
linux:
flags:
# Jaeger generally runs outside of Docker, so to reach it we need access to
# ports on the host, and the host.docker.internal alias only exists on
# macOS. --net=host is a very dirty way of enabling this.
net: host
env:
JAEGER_HOST: localhost
env:
JAEGER_HOST: host.docker.internal
# Overwrite the following in sg.config.overwrite.yaml, based on which collector
# config you are using - see docker-images/opentelemetry-collector for more details.
CONFIGURATION_FILE: 'configs/jaeger.yaml'
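# A minimal sg.config.overwrite.yaml entry might look like the sketch below
# (the config name is hypothetical; pick one that exists under
# docker-images/opentelemetry-collector/configs):
#   dockerCommands:
#     otel-collector:
#       env:
#         CONFIGURATION_FILE: 'configs/honeycomb.yaml'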
postgres_exporter:
target: //docker-images/postgres_exporter:image_tarball
docker:
image: postgres-exporter:candidate
flags:
cpus: 1
memory: 1g
ports:
- 9187
linux:
flags:
# Linux needs an extra arg to support host.docker.internal, which is how
# postgres_exporter connects to the Postgres database on the host.
add-host: host.docker.internal:host-gateway
net: host
precmd: |
# Use psql to read the effective values for PG* env vars (instead of, e.g., hardcoding the default
# values).
get_pg_env() { psql -c '\set' | grep "$1" | cut -f 2 -d "'"; }
PGHOST=${PGHOST-$(get_pg_env HOST)}
PGUSER=${PGUSER-$(get_pg_env USER)}
PGPORT=${PGPORT-$(get_pg_env PORT)}
# We need to be able to query the migration_logs table.
PGDATABASE=${PGDATABASE-$(get_pg_env DBNAME)}
ADJUSTED_HOST=${PGHOST:-127.0.0.1}
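# On non-Linux hosts the container cannot reach the host's Postgres via
# localhost (or a unix socket path), so point it at host.docker.internal instead.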
if [[ ("$ADJUSTED_HOST" == "localhost" || "$ADJUSTED_HOST" == "127.0.0.1" || -f "$ADJUSTED_HOST") && "$OSTYPE" != "linux-gnu" ]]; then
ADJUSTED_HOST="host.docker.internal"
fi
env:
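# With the default local dev settings this typically expands to something like
# postgresql://sourcegraph:sourcegraph@127.0.0.1:5432/sourcegraph?sslmode=disable
# (with host.docker.internal in place of 127.0.0.1 on macOS).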
DATA_SOURCE_NAME: postgresql://${PGUSER}:${PGPASSWORD}@${ADJUSTED_HOST}:${PGPORT}/${PGDATABASE}?sslmode=${PGSSLMODE:-disable}
prometheus:
target: //docker-images/prometheus:image_tarball
logfile: $HOME/.sourcegraph-dev/logs/prometheus/prometheus.log
docker:
image: prometheus:candidate
volumes:
- from: $HOME/.sourcegraph-dev/data/prometheus
to: /prometheus
- from: $(pwd)/$CONFIG_DIR
to: /sg_prometheus_add_ons
flags:
cpus: 1
memory: 4g
ports:
- 9090
linux:
flags:
net: host
user: $UID
env:
PROM_TARGETS: dev/prometheus/linux/prometheus_targets.yml
SRC_FRONTEND_INTERNAL: localhost:3090
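# Copy the Linux-specific scrape targets into the config dir that gets
# volume-mounted into the container.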
precmd: cp ${PROM_TARGETS} "${CONFIG_DIR}"/prometheus_targets.yml
env:
CONFIG_DIR: docker-images/prometheus/config
PROM_TARGETS: dev/prometheus/all/prometheus_targets.yml
SRC_FRONTEND_INTERNAL: host.docker.internal:3090
DISABLE_SOURCEGRAPH_CONFIG: false
DISABLE_ALERTMANAGER: false
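# --web.enable-lifecycle lets us reload the Prometheus config at runtime via
# POST /-/reload; --web.enable-admin-api exposes admin endpoints such as
# snapshots and series deletion.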
PROMETHEUS_ADDITIONAL_FLAGS: '--web.enable-lifecycle --web.enable-admin-api'
syntax-highlighter:
ignoreStdout: true
ignoreStderr: true
docker:
image: sourcegraph/syntax-highlighter:insiders
pull: true
ports:
- 9238
env:
WORKERS: 1
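# Listen on all interfaces so the published container port is reachable from
# the host (syntax-highlighter is a Rocket server, hence the ROCKET_* vars).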
ROCKET_ADDRESS: 0.0.0.0
#
# CommandSets ################################################################
#
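# The default set is what `sg start` runs when no command set is specified.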
defaultCommandset: enterprise
commandsets:
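# Start this set with `sg start enterprise-bazel`: bazel builds all binaries up
# front, then iBazel watches the dependency tree and rebuilds changed binaries
# in the background; the running services pick up the new binary and restart
# gracefully.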
enterprise-bazel: &enterprise_bazel_set
checks:
- redis
- postgres
- git
- bazelisk
- ibazel
- dev-private
bazelCommands:
- blobstore
- docsite
- frontend
- worker
- repo-updater
- gitserver-0
- gitserver-1
- searcher
- symbols
# - syntax-highlighter
commands:
- web
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- caddy
# If you modify this command set, please consider also updating the dotcom runset.
enterprise: &enterprise_set
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- caddy
- symbols
# TODO https://github.com/sourcegraph/devx-support/issues/537
# - docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- embeddings
env:
DISABLE_CODE_INSIGHTS_HISTORICAL: false
DISABLE_CODE_INSIGHTS: false
enterprise-e2e:
<<: *enterprise_set
env:
# A set EXTSVC_CONFIG_FILE would prevent the e2e test suite from adding
# additional code host connections, so clear it here.
EXTSVC_CONFIG_FILE: ''
dotcom:
# This is 95% the enterprise runset, with the addition of Cody Gateway.
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- embeddings
- cody-gateway
env:
SOURCEGRAPHDOTCOM_MODE: true
codeintel-bazel: &codeintel_bazel_set
checks:
- docker
- redis
- postgres
- git
- bazelisk
- ibazel
- dev-private
bazelCommands:
- blobstore
- frontend
- worker
- repo-updater
- gitserver-0
- gitserver-1
- searcher
- symbols
- syntax-highlighter
- codeintel-worker
- codeintel-executor
commands:
- web
- docsite
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- caddy
- jaeger
- grafana
- prometheus
codeintel-syntactic:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- web
- worker
- blobstore
- repo-updater
- gitserver-0
- gitserver-1
- syntactic-code-intel-worker-0
- syntactic-code-intel-worker-1
codeintel:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- codeintel-executor
# - otel-collector
- jaeger
- grafana
- prometheus
codeintel-kubernetes:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- codeintel-executor-kubernetes
# - otel-collector
- jaeger
- grafana
- prometheus
enterprise-codeintel:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- codeintel-executor
- otel-collector
- jaeger
- grafana
- prometheus
enterprise-codeintel-multi-queue-executor:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- multiqueue-executor
# - otel-collector
- jaeger
- grafana
- prometheus
enterprise-codeintel-bazel:
<<: *codeintel_bazel_set
enterprise-codeinsights:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
env:
DISABLE_CODE_INSIGHTS_HISTORICAL: false
DISABLE_CODE_INSIGHTS: false
api-only:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- gitserver-0
- gitserver-1
- searcher
- symbols
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
batches:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- batches-executor
- batcheshelper-builder
batches-kubernetes:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- batches-executor-kubernetes
- batcheshelper-builder
iam:
checks:
- docker
- redis
- postgres
- git
- dev-private
commands:
- frontend
- repo-updater
- web
- gitserver-0
- gitserver-1
- caddy
monitoring:
checks:
- docker
commands:
- jaeger
dockerCommands:
- otel-collector
- prometheus
- grafana
- postgres_exporter
monitoring-og:
checks:
- docker
commands:
- jaeger
- otel-collector
- prometheus
- grafana
- postgres_exporter
monitoring-alerts:
checks:
- docker
- redis
- postgres
commands:
- prometheus
- grafana
# For generated alerts docs
- docsite
# For the alerting integration with frontend
- frontend
- web
- caddy
web-standalone:
commands:
- web-standalone-http
- caddy
web-sveltekit-standalone:
commands:
- web-sveltekit-standalone
- caddy
env:
SK_PORT: 3080
web-standalone-prod:
commands:
- web-standalone-http-prod
- caddy
# For testing our OpenTelemetry stack
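# Run with `sg start otel`; the docker check below means Docker must be running.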
otel:
checks:
- docker
commands:
- otel-collector
- jaeger
single-program:
checks:
- git
- dev-private
commands:
- sourcegraph
- web
- caddy
env:
DISABLE_CODE_INSIGHTS: false
PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT: http://localhost:49000
EMBEDDINGS_UPLOAD_AWS_ENDPOINT: http://localhost:49000
USE_EMBEDDED_POSTGRESQL: false
cody-gateway:
checks:
- redis
commands:
- cody-gateway
cody-gateway-bazel:
checks:
- redis
bazelCommands:
- cody-gateway
enterprise-bazel-sveltekit:
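# Inherits everything from the enterprise-bazel set via YAML merge; only adds SVELTEKIT=true below.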
<<: *enterprise_bazel_set
env:
SVELTEKIT: true
enterprise-sveltekit:
<<: *enterprise_set
# Keep in sync with &enterprise_set.commands
commands:
- frontend
- worker
- repo-updater
- web
- web-sveltekit
- gitserver-0
- gitserver-1
- searcher
- caddy
- symbols
# TODO https://github.com/sourcegraph/devx-support/issues/537
# - docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- embeddings
env:
SVELTEKIT: true
tests:
# These can be run with `sg test [name]`
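# e.g. `sg test backend` or `sg test bazel-e2e`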
backend:
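# Runs `go test` over all packages; extra args passed to `sg test backend` presumably replace defaultArgs.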
cmd: go test
defaultArgs: ./...
bazel-backend-integration:
cmd: |
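# The exports below pull the credentials this suite needs from Google Secret
# Manager; running it locally requires `gcloud` authenticated with read
# access to the sourcegraph-ci project.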
export GHE_GITHUB_TOKEN=$(gcloud secrets versions access latest --secret=GHE_GITHUB_TOKEN --quiet --project=sourcegraph-ci)
export GH_TOKEN=$(gcloud secrets versions access latest --secret=GITHUB_TOKEN --quiet --project=sourcegraph-ci)
export BITBUCKET_SERVER_USERNAME=$(gcloud secrets versions access latest --secret=BITBUCKET_SERVER_USERNAME --quiet --project=sourcegraph-ci)
export BITBUCKET_SERVER_TOKEN=$(gcloud secrets versions access latest --secret=BITBUCKET_SERVER_TOKEN --quiet --project=sourcegraph-ci)
export BITBUCKET_SERVER_URL=$(gcloud secrets versions access latest --secret=BITBUCKET_SERVER_URL --quiet --project=sourcegraph-ci)
export PERFORCE_PASSWORD=$(gcloud secrets versions access latest --secret=PERFORCE_PASSWORD --quiet --project=sourcegraph-ci)
export PERFORCE_USER=$(gcloud secrets versions access latest --secret=PERFORCE_USER --quiet --project=sourcegraph-ci)
export PERFORCE_PORT=$(gcloud secrets versions access latest --secret=PERFORCE_PORT --quiet --project=sourcegraph-ci)
export SOURCEGRAPH_LICENSE_KEY=$(gcloud secrets versions access latest --secret=SOURCEGRAPH_LICENSE_KEY --quiet --project=sourcegraph-ci)
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(gcloud secrets versions access latest --secret=SOURCEGRAPH_LICENSE_GENERATION_KEY --quiet --project=sourcegraph-ci)
bazel test //testing:backend_integration_test --verbose_failures --sandbox_debug
bazel-e2e:
cmd: |
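# Secrets are pulled from Google Secret Manager, as in bazel-backend-integration above.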
export GHE_GITHUB_TOKEN=$(gcloud secrets versions access latest --secret=GHE_GITHUB_TOKEN --quiet --project=sourcegraph-ci)
export GH_TOKEN=$(gcloud secrets versions access latest --secret=GITHUB_TOKEN --quiet --project=sourcegraph-ci)
export SOURCEGRAPH_LICENSE_KEY=$(gcloud secrets versions access latest --secret=SOURCEGRAPH_LICENSE_KEY --quiet --project=sourcegraph-ci)
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(gcloud secrets versions access latest --secret=SOURCEGRAPH_LICENSE_GENERATION_KEY --quiet --project=sourcegraph-ci)
bazel test //testing:e2e_test --test_env=HEADLESS=false --test_env=SOURCEGRAPH_BASE_URL="http://localhost:7080" --test_env=GHE_GITHUB_TOKEN=$GHE_GITHUB_TOKEN --test_env=GH_TOKEN=$GH_TOKEN --test_env=DISPLAY=$DISPLAY
bazel-web-integration:
cmd: |
export GH_TOKEN=$(gcloud secrets versions access latest --secret=GITHUB_TOKEN --quiet --project=sourcegraph-ci)
export PERCY_TOKEN=$(gcloud secrets versions access latest --secret=PERCY_TOKEN --quiet --project=sourcegraph-ci)
bazel test //client/web/src/integration:integration-tests --test_env=HEADLESS=false --test_env=SOURCEGRAPH_BASE_URL="http://localhost:7080" --test_env=GH_TOKEN=$GH_TOKEN --test_env=DISPLAY=$DISPLAY --test_env=PERCY_TOKEN=$PERCY_TOKEN
backend-integration:
cmd: cd dev/gqltest && go test -long -base-url $BASE_URL -email $EMAIL -username $USERNAME -password $PASSWORD ./gqltest
env:
# These are defaults. They can be overwritten by setting the env vars when
# running the command.
BASE_URL: 'http://localhost:3080'
EMAIL: 'joe@sourcegraph.com'
PASSWORD: '12345'
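# Note: USERNAME (used by the cmd above) has no default here and must be set
# inline, e.g. USERNAME=<admin-user> sg test backend-integration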
bext:
cmd: pnpm --filter @sourcegraph/browser test
bext-build:
cmd: EXTENSION_PERMISSIONS_ALL_URLS=true pnpm --filter @sourcegraph/browser build
bext-integration:
cmd: pnpm --filter @sourcegraph/browser test-integration
bext-e2e:
cmd: pnpm --filter @sourcegraph/browser mocha ./src/end-to-end/github.test.ts ./src/end-to-end/gitlab.test.ts
env:
SOURCEGRAPH_BASE_URL: https://sourcegraph.com
client:
cmd: pnpm run test
docsite:
cmd: .bin/docsite_${DOCSITE_VERSION} check ./doc
env:
DOCSITE_VERSION: v1.9.4 # When bumping, update DOCSITE_VERSION everywhere it appears (including outside this repo)
web-e2e:
preamble: |
A Sourcegraph instance must already be running for these tests to work, most
commonly with: `sg start enterprise-e2e`
See more details: https://docs-legacy.sourcegraph.com/dev/how-to/testing#running-end-to-end-tests
cmd: pnpm test-e2e
env:
TEST_USER_EMAIL: test@sourcegraph.com
TEST_USER_PASSWORD: supersecurepassword
SOURCEGRAPH_BASE_URL: https://sourcegraph.test:3443
BROWSER: chrome
externalSecrets:
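# Presumably resolved by sg from Google Secret Manager at run time (cf. the gcloud exports in the bazel tests above).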
GH_TOKEN:
project: 'sourcegraph-ci'
name: 'BUILDKITE_GITHUBDOTCOM_TOKEN'
web-regression:
preamble: |
A Sourcegraph instance must already be running for these tests to work, most
commonly with: `sg start enterprise-e2e`
See more details: https://docs-legacy.sourcegraph.com/dev/how-to/testing#running-regression-tests
cmd: pnpm test-regression
env:
SOURCEGRAPH_SUDO_USER: test
SOURCEGRAPH_BASE_URL: https://sourcegraph.test:3443
TEST_USER_PASSWORD: supersecurepassword
BROWSER: chrome
web-integration:
preamble: |
The web application must be built beforehand for these tests to work, most
commonly with `sg run web-integration-build`, or `sg run web-integration-build-prod` for a production build.
See more details: https://docs-legacy.sourcegraph.com/dev/how-to/testing#running-integration-tests
cmd: pnpm test-integration
web-integration:debug:
preamble: |
A Sourcegraph instance must already be running for these tests to work, most
commonly with: `sg start web-standalone`
See more details: https://docs-legacy.sourcegraph.com/dev/how-to/testing#running-integration-tests
cmd: pnpm test-integration:debug