# Documentation for how to override sg configuration for local development:
# https://github.com/sourcegraph/sourcegraph/blob/main/doc/dev/background-information/sg/index.md#configuration
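#
# A minimal sketch of such an override (an assumption based on the docs linked
# above and the `sg.config.overwrite.yaml` mechanism referenced further down in
# this file; the keys shown are illustrative):
#
#   env:
#     SRC_LOG_LEVEL: debug
#   commands:
#     frontend:
#       env:
#         SITE_CONFIG_FILE: ./my-site-config.json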
env:
PGPORT: 5432
PGHOST: localhost
PGUSER: sourcegraph
PGPASSWORD: sourcegraph
PGDATABASE: sourcegraph
PGSSLMODE: disable
SG_DEV_MIGRATE_ON_APPLICATION_STARTUP: "true"
INSECURE_DEV: true
SRC_REPOS_DIR: $HOME/.sourcegraph/repos
SRC_LOG_LEVEL: info
SRC_LOG_FORMAT: condensed
SRC_TRACE_LOG: false
# Set this to true to show an iTerm link to the file:line where the log message came from
SRC_LOG_SOURCE_LINK: false
# Use two gitserver instances in local dev
SRC_GIT_SERVER_1: 127.0.0.1:3501
SRC_GIT_SERVER_2: 127.0.0.1:3502
SRC_GIT_SERVERS: 127.0.0.1:3501 127.0.0.1:3502
# Enable sharded indexed search mode:
INDEXED_SEARCH_SERVERS: localhost:3070 localhost:3071
GO111MODULE: "on"
DEPLOY_TYPE: dev
SRC_HTTP_ADDR: ":3082"
# These may be unnecessary; the services likely default to these addresses anyway.
SEARCHER_URL: http://127.0.0.1:3181
REPO_UPDATER_URL: http://127.0.0.1:3182
REDIS_ENDPOINT: 127.0.0.1:6379
SYMBOLS_URL: http://localhost:3184
EMBEDDINGS_URL: http://localhost:9991
SRC_SYNTECT_SERVER: http://localhost:9238
SRC_FRONTEND_INTERNAL: localhost:3090
GRAFANA_SERVER_URL: http://localhost:3370
PROMETHEUS_URL: http://localhost:9090
JAEGER_SERVER_URL: http://localhost:16686
SRC_GRPC_ENABLE_CONF: "true"
SRC_DEVELOPMENT: "true"
SRC_PROF_HTTP: ""
SRC_PROF_SERVICES: |
[
{ "Name": "frontend", "Host": "127.0.0.1:6063" },
{ "Name": "gitserver-0", "Host": "127.0.0.1:3551" },
{ "Name": "gitserver-1", "Host": "127.0.0.1:3552" },
{ "Name": "searcher", "Host": "127.0.0.1:6069" },
{ "Name": "symbols", "Host": "127.0.0.1:6071" },
{ "Name": "repo-updater", "Host": "127.0.0.1:6074" },
{ "Name": "codeintel-worker", "Host": "127.0.0.1:6088" },
{ "Name": "worker", "Host": "127.0.0.1:6089" },
{ "Name": "worker-executors", "Host": "127.0.0.1:6996" },
{ "Name": "embeddings", "Host": "127.0.0.1:6099" },
{ "Name": "zoekt-index-0", "Host": "127.0.0.1:6072" },
{ "Name": "zoekt-index-1", "Host": "127.0.0.1:6073" },
{ "Name": "zoekt-web-0", "Host": "127.0.0.1:3070", "DefaultPath": "/debug/requests/" },
{ "Name": "zoekt-web-1", "Host": "127.0.0.1:3071", "DefaultPath": "/debug/requests/" }
]
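# Each entry above points at a service's debugserver. As a usage sketch
# (assuming the standard Go net/http/pprof routes are mounted, which is what
# these debug servers conventionally expose):
#
#   go tool pprof http://127.0.0.1:6063/debug/pprof/heap   # frontend heap profile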
# Settings/config
SITE_CONFIG_FILE: ./dev/site-config.json
SITE_CONFIG_ALLOW_EDITS: true
GLOBAL_SETTINGS_FILE: ./dev/global-settings.json
GLOBAL_SETTINGS_ALLOW_EDITS: true
# Point codeintel to the `frontend` database in development
CODEINTEL_PGPORT: $PGPORT
CODEINTEL_PGHOST: $PGHOST
CODEINTEL_PGUSER: $PGUSER
CODEINTEL_PGPASSWORD: $PGPASSWORD
CODEINTEL_PGDATABASE: $PGDATABASE
CODEINTEL_PGSSLMODE: $PGSSLMODE
CODEINTEL_PGDATASOURCE: $PGDATASOURCE
CODEINTEL_PG_ALLOW_SINGLE_DB: true
# Required for `frontend` and `web` commands
SOURCEGRAPH_HTTPS_DOMAIN: sourcegraph.test
SOURCEGRAPH_HTTPS_PORT: 3443
# Required for `web` commands
NODE_OPTIONS: "--max_old_space_size=8192"
# Default `NODE_ENV` to `development`
NODE_ENV: development
# Required for codeintel uploadstore
PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT: http://localhost:9000
PRECISE_CODE_INTEL_UPLOAD_BACKEND: blobstore
# Required for embeddings job upload
EMBEDDINGS_UPLOAD_AWS_ENDPOINT: http://localhost:9000
# Required for upload of search job results
SEARCH_JOBS_UPLOAD_AWS_ENDPOINT: http://localhost:9000
# Disable auto-indexing the CNCF repo group (this only works in Cloud)
# This setting will be going away soon
DISABLE_CNCF: notonmybox
# Point code insights to the `frontend` database in development
CODEINSIGHTS_PGPORT: $PGPORT
CODEINSIGHTS_PGHOST: $PGHOST
CODEINSIGHTS_PGUSER: $PGUSER
CODEINSIGHTS_PGPASSWORD: $PGPASSWORD
CODEINSIGHTS_PGDATABASE: $PGDATABASE
CODEINSIGHTS_PGSSLMODE: $PGSSLMODE
CODEINSIGHTS_PGDATASOURCE: $PGDATASOURCE
DB_STARTUP_TIMEOUT: 120s # codeinsights-db needs more time to start in some instances.
# Disable code insights by default
DISABLE_CODE_INSIGHTS_HISTORICAL: true
DISABLE_CODE_INSIGHTS: true
# # OpenTelemetry in dev - use single http/json endpoint
# OTEL_EXPORTER_OTLP_ENDPOINT: http://127.0.0.1:4318
# OTEL_EXPORTER_OTLP_PROTOCOL: http/json
# Enable gRPC Web UI for debugging
GRPC_WEB_UI_ENABLED: "true"
# Enable full protobuf message logging when an internal error occurs
SRC_GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_ENABLED: "true"
SRC_GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_JSON_TRUNCATION_SIZE_BYTES: "1KB"
SRC_GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_HANDLING_MAX_MESSAGE_SIZE_BYTES: "100MB"
## zoekt-specific message logging
GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_ENABLED: "true"
GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_JSON_TRUNCATION_SIZE_BYTES: "1KB"
GRPC_INTERNAL_ERROR_LOGGING_LOG_PROTOBUF_MESSAGES_HANDLING_MAX_MESSAGE_SIZE_BYTES: "100MB"
TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR: "http://127.0.0.1:10080"
commands:
server:
description: Run an all-in-one sourcegraph/server image
cmd: ./dev/run-server-image.sh
env:
TAG: insiders
CLEAN: "true"
DATA: "/tmp/sourcegraph-data"
URL: "http://localhost:7080"
frontend:
description: Frontend
cmd: |
# TODO: This should be fixed
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
# If EXTSVC_CONFIG_FILE is *unset*, set a default.
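# (`${VAR-default}` substitutes only when VAR is unset; unlike `${VAR:-default}`,
# an explicitly empty value is preserved.)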
export EXTSVC_CONFIG_FILE=${EXTSVC_CONFIG_FILE-'../dev-private/enterprise/dev/external-services-config.json'}
.bin/frontend
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/frontend github.com/sourcegraph/sourcegraph/cmd/frontend
checkBinary: .bin/frontend
env:
CONFIGURATION_MODE: server
USE_ENHANCED_LANGUAGE_DETECTION: false
SITE_CONFIG_FILE: "../dev-private/enterprise/dev/site-config.json"
SITE_CONFIG_ESCAPE_HATCH_PATH: "$HOME/.sourcegraph/site-config.json"
# frontend processes need this so that the paths to the assets are rendered correctly
WEBPACK_DEV_SERVER: 1
watch:
- lib
- internal
- cmd/frontend
gitserver-template: &gitserver_template
cmd: .bin/gitserver
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/gitserver github.com/sourcegraph/sourcegraph/cmd/gitserver
checkBinary: .bin/gitserver
env: &gitserverenv
HOSTNAME: 127.0.0.1:3178
watch:
- lib
- internal
- cmd/gitserver
# This is only here to stay backwards-compatible with people's custom
# `sg.config.overwrite.yaml` files
gitserver:
<<: *gitserver_template
gitserver-0:
<<: *gitserver_template
env:
<<: *gitserverenv
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3501
GITSERVER_ADDR: 127.0.0.1:3501
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_1
SRC_PROF_HTTP: 127.0.0.1:3551
gitserver-1:
<<: *gitserver_template
env:
<<: *gitserverenv
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3502
GITSERVER_ADDR: 127.0.0.1:3502
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_2
SRC_PROF_HTTP: 127.0.0.1:3552
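# A hypothetical third shard would follow the same template; the ports and
# repos dir below are illustrative, and SRC_GIT_SERVERS in the top-level env
# block would need the new address appended:
#
#   gitserver-2:
#     <<: *gitserver_template
#     env:
#       <<: *gitserverenv
#       GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3503
#       GITSERVER_ADDR: 127.0.0.1:3503
#       SRC_REPOS_DIR: $HOME/.sourcegraph/repos_3
#       SRC_PROF_HTTP: 127.0.0.1:3553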
repo-updater:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/repo-updater
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/repo-updater github.com/sourcegraph/sourcegraph/cmd/repo-updater
checkBinary: .bin/repo-updater
watch:
- lib
- internal
- cmd/repo-updater
symbols:
cmd: .bin/symbols
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/symbols github.com/sourcegraph/sourcegraph/enterprise/cmd/symbols
checkBinary: .bin/symbols
env:
CTAGS_COMMAND: dev/universal-ctags-dev
SCIP_CTAGS_COMMAND: dev/scip-ctags-dev
CTAGS_PROCESSES: 2
USE_ROCKSKIP: "false"
watch:
- lib
- internal
- cmd/symbols
- enterprise/cmd/symbols
- internal/rockskip
embeddings:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/embeddings
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/embeddings github.com/sourcegraph/sourcegraph/enterprise/cmd/embeddings
checkBinary: .bin/embeddings
watch:
- lib
- internal
- enterprise/cmd/embeddings
- internal/embeddings
qdrant:
cmd: |
docker run -p 6333:6333 -p 6334:6334 \
-v $HOME/.sourcegraph-dev/data/qdrant_data:/data \
-e QDRANT__SERVICE__GRPC_PORT="6334" \
-e QDRANT__LOG_LEVEL=INFO \
-e QDRANT__STORAGE__STORAGE_PATH=/data \
-e QDRANT__STORAGE__SNAPSHOTS_PATH=/data \
-e QDRANT_INIT_FILE_PATH=/data/.qdrant-initialized \
--entrypoint /usr/local/bin/qdrant \
sourcegraph/qdrant:insiders
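# Port 6334 is the gRPC port configured above; 6333 is Qdrant's default HTTP
# API port (an assumption based on upstream Qdrant defaults, not set here).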
worker:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/worker
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/worker github.com/sourcegraph/sourcegraph/cmd/worker
watch:
- lib
- internal
- cmd/worker
cody-gateway:
cmd: |
.bin/cody-gateway
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/cody-gateway github.com/sourcegraph/sourcegraph/cmd/cody-gateway
checkBinary: .bin/cody-gateway
env:
CODY_GATEWAY_ANTHROPIC_ACCESS_TOKEN: foobar
# Set in override if you want to test local Cody Gateway: https://docs.sourcegraph.com/dev/how-to/cody_gateway
CODY_GATEWAY_DOTCOM_ACCESS_TOKEN: ""
CODY_GATEWAY_DOTCOM_API_URL: https://sourcegraph.test:3443/.api/graphql
CODY_GATEWAY_ALLOW_ANONYMOUS: true
CODY_GATEWAY_DIAGNOSTICS_SECRET: sekret
SRC_LOG_LEVEL: info
# Enables metrics in dev via debugserver
SRC_PROF_HTTP: "127.0.0.1:6098"
watch:
- lib
- internal
- cmd/cody-gateway
telemetry-gateway:
cmd: |
# Telemetry Gateway needs this to parse and validate incoming license keys.
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/telemetry-gateway
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/telemetry-gateway github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway
checkBinary: .bin/telemetry-gateway
env:
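# PORT matches TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR (http://127.0.0.1:10080)
# in the top-level env block, so the local worker exports to this service.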
PORT: "10080"
DIAGNOSTICS_SECRET: sekret
TELEMETRY_GATEWAY_EVENTS_PUBSUB_ENABLED: false
SRC_LOG_LEVEL: info
# Enables metrics in dev via debugserver
SRC_PROF_HTTP: "127.0.0.1:6080"
GRPC_WEB_UI_ENABLED: true
watch:
- lib
- internal
- cmd/telemetry-gateway
- internal/telemetrygateway
pings:
cmd: |
.bin/pings
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/pings github.com/sourcegraph/sourcegraph/cmd/pings
checkBinary: .bin/pings
env:
SRC_LOG_LEVEL: info
DIAGNOSTICS_SECRET: 'lifeisgood'
PINGS_PUBSUB_PROJECT_ID: 'telligentsourcegraph'
PINGS_PUBSUB_TOPIC_ID: 'server-update-checks-test'
HUBSPOT_ACCESS_TOKEN: ''
# Enables metrics in dev via debugserver
SRC_PROF_HTTP: "127.0.0.1:7011"
watch:
- lib
- internal
- cmd/pings
searcher:
cmd: .bin/searcher
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/searcher github.com/sourcegraph/sourcegraph/cmd/searcher
checkBinary: .bin/searcher
watch:
- lib
- internal
- cmd/searcher
caddy:
ignoreStdout: true
ignoreStderr: true
cmd: .bin/caddy_${CADDY_VERSION} run --watch --config=dev/Caddyfile
install_func: installCaddy
env:
CADDY_VERSION: 2.7.3
web:
description: Enterprise version of the web app
cmd: ./node_modules/.bin/gulp --color dev
install: pnpm install
env:
ENABLE_OPEN_TELEMETRY: true
web-standalone-http:
description: Standalone web frontend (dev) with API proxy to a configurable URL
cmd: pnpm --filter @sourcegraph/web serve:dev --color
install: |
pnpm install
pnpm generate
env:
WEBPACK_SERVE_INDEX: true
SOURCEGRAPH_API_URL: https://k8s.sgdev.org
web-standalone-http-prod:
description: Standalone web frontend (production) with API proxy to a configurable URL
cmd: pnpm --filter @sourcegraph/web serve:prod
install: pnpm --filter @sourcegraph/web run build
env:
NODE_ENV: production
WEBPACK_SERVE_INDEX: true
SOURCEGRAPH_API_URL: https://k8s.sgdev.org
web-integration-build:
description: Build development web application for integration tests
cmd: pnpm --filter @sourcegraph/web run build
env:
INTEGRATION_TESTS: true
web-integration-build-prod:
description: Build production web application for integration tests
cmd: pnpm --filter @sourcegraph/web run build
env:
INTEGRATION_TESTS: true
NODE_ENV: production
docsite:
description: Docsite instance serving the docs
cmd: bazel run --noshow_progress --noshow_loading_progress //doc:serve
syntax-highlighter:
ignoreStdout: true
ignoreStderr: true
cmd: |
docker run --name=syntax-highlighter --rm -p9238:9238 \
-e WORKERS=1 -e ROCKET_ADDRESS=0.0.0.0 \
sourcegraph/syntax-highlighter:insiders
install: |
# Remove containers by the old name, too.
docker inspect syntect_server >/dev/null 2>&1 && docker rm -f syntect_server || true
docker inspect syntax-highlighter >/dev/null 2>&1 && docker rm -f syntax-highlighter || true
      # Pull the latest syntax-highlighter insiders image during install, but
      # skip it if OFFLINE=true is set.
if [[ "$OFFLINE" != "true" ]]; then
docker pull -q sourcegraph/syntax-highlighter:insiders
fi
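      # e.g. setting OFFLINE=true in the environment (or via an env overwrite) should skip the pull.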
zoekt-indexserver-template: &zoekt_indexserver_template
cmd: |
env PATH="${PWD}/.bin:$PATH" .bin/zoekt-sourcegraph-indexserver \
-sourcegraph_url 'http://localhost:3090' \
-index "$HOME/.sourcegraph/zoekt/index-$ZOEKT_NUM" \
-hostname "localhost:$ZOEKT_HOSTNAME_PORT" \
-interval 1m \
-listen "127.0.0.1:$ZOEKT_LISTEN_PORT" \
-cpu_fraction 0.25
install: |
mkdir -p .bin
export GOBIN="${PWD}/.bin"
go install github.com/sourcegraph/zoekt/cmd/zoekt-archive-index
go install github.com/sourcegraph/zoekt/cmd/zoekt-git-index
go install github.com/sourcegraph/zoekt/cmd/zoekt-sourcegraph-indexserver
checkBinary: .bin/zoekt-sourcegraph-indexserver
env: &zoektenv
CTAGS_COMMAND: dev/universal-ctags-dev
SCIP_CTAGS_COMMAND: dev/scip-ctags-dev
GRPC_ENABLED: true
zoekt-index-0:
<<: *zoekt_indexserver_template
env:
<<: *zoektenv
ZOEKT_NUM: 0
ZOEKT_HOSTNAME_PORT: 3070
ZOEKT_LISTEN_PORT: 6072
zoekt-index-1:
<<: *zoekt_indexserver_template
env:
<<: *zoektenv
ZOEKT_NUM: 1
ZOEKT_HOSTNAME_PORT: 3071
ZOEKT_LISTEN_PORT: 6073
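  # A sketch of how a third shard could reuse the template within this file
  # (port numbers are assumptions; INDEXED_SEARCH_SERVERS and a matching
  # zoekt-web-2 webserver would also be needed for the frontend to query it):
  #
  # zoekt-index-2:
  #   <<: *zoekt_indexserver_template
  #   env:
  #     <<: *zoektenv
  #     ZOEKT_NUM: 2
  #     ZOEKT_HOSTNAME_PORT: 3072
  #     ZOEKT_LISTEN_PORT: 6074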
zoekt-web-template: &zoekt_webserver_template
install: |
mkdir -p .bin
env GOBIN="${PWD}/.bin" go install github.com/sourcegraph/zoekt/cmd/zoekt-webserver
checkBinary: .bin/zoekt-webserver
env:
JAEGER_DISABLED: true
OPENTELEMETRY_DISABLED: false
GOGC: 25
zoekt-web-0:
<<: *zoekt_webserver_template
cmd: env PATH="${PWD}/.bin:$PATH" .bin/zoekt-webserver -index "$HOME/.sourcegraph/zoekt/index-0" -pprof -rpc -indexserver_proxy -listen "127.0.0.1:3070"
zoekt-web-1:
<<: *zoekt_webserver_template
cmd: env PATH="${PWD}/.bin:$PATH" .bin/zoekt-webserver -index "$HOME/.sourcegraph/zoekt/index-1" -pprof -rpc -indexserver_proxy -listen "127.0.0.1:3071"
codeintel-worker:
cmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
.bin/codeintel-worker
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/codeintel-worker github.com/sourcegraph/sourcegraph/cmd/precise-code-intel-worker
checkBinary: .bin/codeintel-worker
watch:
- lib
- internal
- cmd/precise-code-intel-worker
- lib/codeintel
executor-template:
&executor_template # TMPDIR is set here so it's not set in the `install` process, which would trip up `go build`.
cmd: |
env TMPDIR="$HOME/.sourcegraph/executor-temp" .bin/executor
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/executor github.com/sourcegraph/sourcegraph/cmd/executor
checkBinary: .bin/executor
env:
# Required for frontend and executor to communicate
EXECUTOR_FRONTEND_URL: http://localhost:3080
# Must match the secret defined in the site config.
EXECUTOR_FRONTEND_PASSWORD: hunter2hunter2hunter2
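      # For reference, a sketch of the matching site-config entry (key name assumed):
      #   "executors.accessToken": "hunter2hunter2hunter2"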
# Disable firecracker inside executor in dev
EXECUTOR_USE_FIRECRACKER: false
EXECUTOR_QUEUE_NAME: TEMPLATE
watch:
- lib
- internal
- cmd/executor
executor-kubernetes-template: &executor_kubernetes_template
cmd: |
cd $MANIFEST_PATH
cleanup() {
kubectl delete jobs --all
kubectl delete -f .
}
kubectl delete -f . --ignore-not-found
kubectl apply -f .
trap cleanup EXIT SIGINT
while true; do
sleep 1
done
install: |
if [[ $(uname) == "Linux" ]]; then
bazel build //cmd/executor-kubernetes:image_tarball
docker load --input $(bazel cquery //cmd/executor-kubernetes:image_tarball --output=files)
else
bazel build //cmd/executor-kubernetes:image_tarball --config darwin-docker
docker load --input $(bazel cquery //cmd/executor-kubernetes:image_tarball --config darwin-docker --output=files)
fi
env:
IMAGE: executor-kubernetes:candidate
# TODO: This is required but should only be set on M1 Macs.
PLATFORM: linux/arm64
watch:
- lib
- internal
- cmd/executor
codeintel-executor:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/indexer-temp" .bin/executor
env:
EXECUTOR_QUEUE_NAME: codeintel
  # If you want to use this, either start it with `sg run codeintel-executor-firecracker` or
  # modify `commandsets.codeintel` in your local `sg.config.overwrite.yaml`
codeintel-executor-firecracker:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/codeintel-executor-temp" \
sudo --preserve-env=TMPDIR,EXECUTOR_QUEUE_NAME,EXECUTOR_FRONTEND_URL,EXECUTOR_FRONTEND_PASSWORD,EXECUTOR_USE_FIRECRACKER \
.bin/executor
env:
EXECUTOR_USE_FIRECRACKER: true
EXECUTOR_QUEUE_NAME: codeintel
codeintel-executor-kubernetes:
<<: *executor_kubernetes_template
env:
MANIFEST_PATH: ./cmd/executor/kubernetes/codeintel
batches-executor:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/batches-executor-temp" .bin/executor
env:
EXECUTOR_QUEUE_NAME: batches
EXECUTOR_MAXIMUM_NUM_JOBS: 8
# If you want to use this, either start it with `sg run batches-executor-firecracker` or
# modify the `commandsets.batches` in your local `sg.config.overwrite.yaml`
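  # For example, a hypothetical sg.config.overwrite.yaml snippet (commandset shape assumed):
  #
  # commandsets:
  #   batches:
  #     commands:
  #       - batches-executor-firecracker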
batches-executor-firecracker:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/batches-executor-temp" \
sudo --preserve-env=TMPDIR,EXECUTOR_QUEUE_NAME,EXECUTOR_FRONTEND_URL,EXECUTOR_FRONTEND_PASSWORD,EXECUTOR_USE_FIRECRACKER \
.bin/executor
env:
EXECUTOR_USE_FIRECRACKER: true
EXECUTOR_QUEUE_NAME: batches
batches-executor-kubernetes:
<<: *executor_kubernetes_template
env:
MANIFEST_PATH: ./cmd/executor/kubernetes/batches
  # This tool rebuilds the batcheshelper image every time its source changes.
batcheshelper-builder:
# Nothing to run for this, we just want to re-run the install script every time.
cmd: exit 0
install: |
if [[ $(uname) == "Linux" ]]; then
bazel build //cmd/batcheshelper:image_tarball
docker load --input $(bazel cquery //cmd/batcheshelper:image_tarball --output=files)
else
bazel build //cmd/batcheshelper:image_tarball --config darwin-docker
docker load --input $(bazel cquery //cmd/batcheshelper:image_tarball --config darwin-docker --output=files)
fi
env:
IMAGE: batcheshelper:candidate
# TODO: This is required but should only be set on M1 Macs.
PLATFORM: linux/arm64
watch:
- cmd/batcheshelper
- lib/batches
continueWatchOnExit: true
multiqueue-executor:
<<: *executor_template
cmd: |
env TMPDIR="$HOME/.sourcegraph/multiqueue-executor-temp" .bin/executor
env:
EXECUTOR_QUEUE_NAME: ""
EXECUTOR_QUEUE_NAMES: "codeintel,batches"
EXECUTOR_MAXIMUM_NUM_JOBS: 8
blobstore:
cmd: .bin/blobstore
install: |
# Ensure the old blobstore Docker container is not running
docker rm -f blobstore
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -o .bin/blobstore github.com/sourcegraph/sourcegraph/cmd/blobstore
checkBinary: .bin/blobstore
watch:
- lib
- internal
- cmd/blobstore
env:
BLOBSTORE_DATA_DIR: $HOME/.sourcegraph-dev/data/blobstore-go
redis-postgres:
# Add the following overwrites to your sg.config.overwrite.yaml to use the docker-compose
# database:
#
# env:
# PGHOST: localhost
# PGPASSWORD: sourcegraph
# PGUSER: sourcegraph
#
# You could also add an overwrite to add `redis-postgres` to the relevant command set(s).
description: Dockerized version of redis and postgres
cmd: docker-compose -f dev/redis-postgres.yml up $COMPOSE_ARGS
env:
COMPOSE_ARGS: --force-recreate
jaeger:
cmd: |
echo "Jaeger will be available on http://localhost:16686/-/debug/jaeger/search"
.bin/jaeger-all-in-one-${JAEGER_VERSION} --log-level ${JAEGER_LOG_LEVEL}
install_func: installJaeger
env:
JAEGER_VERSION: 1.45.0
JAEGER_DISK: $HOME/.sourcegraph-dev/data/jaeger
JAEGER_LOG_LEVEL: error
QUERY_BASE_PATH: /-/debug/jaeger
grafana:
cmd: |
if [[ $(uname) == "Linux" ]]; then
      # Linux needs an extra arg to support host.docker.internal, which is how grafana
      # connects to the prometheus backend.
ADD_HOST_FLAG="--add-host=host.docker.internal:host-gateway"
# Docker users on Linux will generally be using direct user mapping, which
# means that they'll want the data in the volume mount to be owned by the
# same user as is running this script. Fortunately, the Grafana container
# doesn't really care what user it runs as, so long as it can write to
# /var/lib/grafana.
DOCKER_USER="--user=$UID"
fi
echo "Grafana: serving on http://localhost:${PORT}"
echo "Grafana: note that logs are piped to ${GRAFANA_LOG_FILE}"
docker run --rm ${DOCKER_USER} \
--name=${CONTAINER} \
--cpus=1 \
--memory=1g \
-p 0.0.0.0:3370:3370 ${ADD_HOST_FLAG} \
-v "${GRAFANA_DISK}":/var/lib/grafana \
-v "$(pwd)"/dev/grafana/all:/sg_config_grafana/provisioning/datasources \
grafana:candidate >"${GRAFANA_LOG_FILE}" 2>&1
install: |
mkdir -p "${GRAFANA_DISK}"
mkdir -p "$(dirname ${GRAFANA_LOG_FILE})"
docker inspect $CONTAINER >/dev/null 2>&1 && docker rm -f $CONTAINER
bazel build //docker-images/grafana:image_tarball
docker load --input $(bazel cquery //docker-images/grafana:image_tarball --output=files)
env:
GRAFANA_DISK: $HOME/.sourcegraph-dev/data/grafana
# Log file location: since we log outside of the Docker container, we should
# log somewhere that's _not_ ~/.sourcegraph-dev/data/grafana, since that gets
# volume mounted into the container and therefore has its own ownership
# semantics.
# Now for the actual logging. Grafana's output gets sent to stdout and stderr.
# We want to capture that output, but because it's fairly noisy, don't want to
# display it in the normal case.
GRAFANA_LOG_FILE: $HOME/.sourcegraph-dev/logs/grafana/grafana.log
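      # e.g. follow it with: tail -f "$HOME/.sourcegraph-dev/logs/grafana/grafana.log"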
IMAGE: grafana:candidate
CONTAINER: grafana
PORT: 3370
# docker containers must access things via docker host on non-linux platforms
DOCKER_USER: ""
ADD_HOST_FLAG: ""
CACHE: false
prometheus:
cmd: |
if [[ $(uname) == "Linux" ]]; then
DOCKER_USER="--user=$UID"
# Frontend generally runs outside of Docker, so to access it we need to be
# able to access ports on the host. --net=host is a very dirty way of
# enabling this.
DOCKER_NET="--net=host"
SRC_FRONTEND_INTERNAL="localhost:3090"
fi
echo "Prometheus: serving on http://localhost:${PORT}"
echo "Prometheus: note that logs are piped to ${PROMETHEUS_LOG_FILE}"
docker run --rm ${DOCKER_NET} ${DOCKER_USER} \
--name=${CONTAINER} \
--cpus=1 \
--memory=4g \
-p 0.0.0.0:9090:9090 \
-v "${PROMETHEUS_DISK}":/prometheus \
-v "$(pwd)/${CONFIG_DIR}":/sg_prometheus_add_ons \
-e SRC_FRONTEND_INTERNAL="${SRC_FRONTEND_INTERNAL}" \
-e DISABLE_SOURCEGRAPH_CONFIG="${DISABLE_SOURCEGRAPH_CONFIG:-""}" \
-e DISABLE_ALERTMANAGER="${DISABLE_ALERTMANAGER:-""}" \
-e PROMETHEUS_ADDITIONAL_FLAGS="--web.enable-lifecycle --web.enable-admin-api" \
${IMAGE} >"${PROMETHEUS_LOG_FILE}" 2>&1
install: |
mkdir -p "${PROMETHEUS_DISK}"
mkdir -p "$(dirname ${PROMETHEUS_LOG_FILE})"
docker inspect $CONTAINER >/dev/null 2>&1 && docker rm -f $CONTAINER
if [[ $(uname) == "Linux" ]]; then
PROM_TARGETS="dev/prometheus/linux/prometheus_targets.yml"
fi
cp ${PROM_TARGETS} "${CONFIG_DIR}"/prometheus_targets.yml
if [[ $(uname) == "Linux" ]]; then
bazel build //docker-images/prometheus:image_tarball
docker load --input $(bazel cquery //docker-images/prometheus:image_tarball --output=files)
else
bazel build //docker-images/prometheus:image_tarball --config darwin-docker
docker load --input $(bazel cquery //docker-images/prometheus:image_tarball --config darwin-docker --output=files)
fi
env:
PROMETHEUS_DISK: $HOME/.sourcegraph-dev/data/prometheus
# See comment above for `grafana`
PROMETHEUS_LOG_FILE: $HOME/.sourcegraph-dev/logs/prometheus/prometheus.log
IMAGE: prometheus:candidate
CONTAINER: prometheus
PORT: 9090
CONFIG_DIR: docker-images/prometheus/config
DOCKER_USER: ""
DOCKER_NET: ""
PROM_TARGETS: dev/prometheus/all/prometheus_targets.yml
SRC_FRONTEND_INTERNAL: host.docker.internal:3090
ADD_HOST_FLAG: ""
DISABLE_SOURCEGRAPH_CONFIG: false
postgres_exporter:
cmd: |
if [[ $(uname) == "Linux" ]]; then
      # Linux needs an extra arg to support host.docker.internal, which is how the
      # exporter connects to Postgres running on the host.
ADD_HOST_FLAG="--add-host=host.docker.internal:host-gateway"
fi
# Use psql to read the effective values for PG* env vars (instead of, e.g., hardcoding the default
# values).
get_pg_env() { psql -c '\set' | grep "$1" | cut -f 2 -d "'"; }
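      # `psql -c '\set'` prints lines like `HOST = 'localhost'`, so grep + cut
      # extract the value between the single quotes.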
PGHOST=${PGHOST-$(get_pg_env HOST)}
PGUSER=${PGUSER-$(get_pg_env USER)}
PGPORT=${PGPORT-$(get_pg_env PORT)}
      # we need to be able to query the migration_logs table
PGDATABASE=${PGDATABASE-$(get_pg_env DBNAME)}
ADJUSTED_HOST=${PGHOST:-127.0.0.1}
if [[ ("$ADJUSTED_HOST" == "localhost" || "$ADJUSTED_HOST" == "127.0.0.1" || -f "$ADJUSTED_HOST") && "$OSTYPE" != "linux-gnu" ]]; then
ADJUSTED_HOST="host.docker.internal"
fi
      NET_ARG=""
      if [[ "$OSTYPE" == "linux-gnu" ]]; then
        NET_ARG="--net=host"
      fi
      DATA_SOURCE_NAME="postgresql://${PGUSER}:${PGPASSWORD}@${ADJUSTED_HOST}:${PGPORT}/${PGDATABASE}?sslmode=${PGSSLMODE:-disable}"
echo "postgres_exporter: serving on http://localhost:${PORT}"
      docker run --rm ${NET_ARG} ${DOCKER_USER} \
--name=${CONTAINER} \
-e DATA_SOURCE_NAME="${DATA_SOURCE_NAME}" \
--cpus=1 \
--memory=1g \
-p 0.0.0.0:9187:9187 ${ADD_HOST_FLAG} \
"${IMAGE}"
install: |
docker inspect $CONTAINER >/dev/null 2>&1 && docker rm -f $CONTAINER
bazel build //docker-images/postgres_exporter:image_tarball
docker load --input $(bazel cquery //docker-images/postgres_exporter:image_tarball --output=files)
env:
IMAGE: postgres-exporter:candidate
CONTAINER: postgres_exporter
# docker containers must access things via docker host on non-linux platforms
DOCKER_USER: ""
ADD_HOST_FLAG: ""
monitoring-generator:
cmd: echo "monitoring-generator is deprecated, please run 'sg generate go' or 'bazel run //dev:write_all_generated' instead"
env:
loki:
cmd: |
echo "Loki: serving on http://localhost:3100"
echo "Loki: note that logs are piped to ${LOKI_LOG_FILE}"
docker run --rm --name=loki \
-p 3100:3100 -v $LOKI_DISK:/loki \
index.docker.io/grafana/loki:$LOKI_VERSION >"${LOKI_LOG_FILE}" 2>&1
install: |
mkdir -p "${LOKI_DISK}"
mkdir -p "$(dirname ${LOKI_LOG_FILE})"
docker pull index.docker.io/grafana/loki:$LOKI_VERSION
env:
LOKI_DISK: $HOME/.sourcegraph-dev/data/loki
LOKI_VERSION: "2.3.0"
LOKI_LOG_FILE: $HOME/.sourcegraph-dev/logs/loki/loki.log
otel-collector:
install: |
if [[ $(uname) == "Linux" ]]; then
bazel build //docker-images/opentelemetry-collector:image_tarball
docker load --input $(bazel cquery //docker-images/opentelemetry-collector:image_tarball --output=files)
else
bazel build //docker-images/opentelemetry-collector:image_tarball --config darwin-docker
docker load --input $(bazel cquery //docker-images/opentelemetry-collector:image_tarball --config darwin-docker --output=files)
fi
description: OpenTelemetry collector
cmd: |
JAEGER_HOST='host.docker.internal'
if [[ $(uname) == "Linux" ]]; then
        # Jaeger generally runs outside of Docker, so to reach it we need access to
        # ports on the host; host.docker.internal only exists on macOS (and Windows).
        # --net=host is a very dirty way of enabling this on Linux.
DOCKER_NET="--net=host"
JAEGER_HOST="localhost"
fi
docker container rm -f otel-collector
docker run --rm --name=otel-collector $DOCKER_NET $DOCKER_ARGS \
-p 4317:4317 -p 4318:4318 -p 55679:55679 -p 55670:55670 \
-p 8888:8888 \
-e JAEGER_HOST=$JAEGER_HOST \
-e HONEYCOMB_API_KEY=$HONEYCOMB_API_KEY \
-e HONEYCOMB_DATASET=$HONEYCOMB_DATASET \
$IMAGE --config "/etc/otel-collector/$CONFIGURATION_FILE"
env:
IMAGE: opentelemetry-collector:candidate
# Overwrite the following in sg.config.overwrite.yaml, based on which collector
# config you are using - see docker-images/opentelemetry-collector for more details.
CONFIGURATION_FILE: "configs/jaeger.yaml"
# HONEYCOMB_API_KEY: ''
# HONEYCOMB_DATASET: ''
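      # For example, a hypothetical sg.config.overwrite.yaml snippet for exporting to
      # Honeycomb (config file name is an assumption):
      #
      # commands:
      #   otel-collector:
      #     env:
      #       CONFIGURATION_FILE: "configs/honeycomb.yaml"
      #       HONEYCOMB_API_KEY: "..."
      #       HONEYCOMB_DATASET: "dev"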
storybook:
cmd: pnpm storybook
install: pnpm install
  # This will execute `env`, a utility that prints the process environment. It can
  # be used to debug which global vars `sg` uses.
debug-env:
description: Debug env vars
cmd: env
bext:
cmd: pnpm --filter @sourcegraph/browser dev
install: pnpm install
sourcegraph: &sourcegraph_command
description: Single-program distribution
cmd: |
unset SRC_GIT_SERVERS INDEXED_SEARCH_SERVERS REDIS_ENDPOINT
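      # Single-program mode provides these services in-process, so the
      # multi-instance addresses set at the top of this file are cleared.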
# TODO: This should be fixed
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
# If EXTSVC_CONFIG_FILE is *unset*, set a default.
export EXTSVC_CONFIG_FILE=${EXTSVC_CONFIG_FILE-'../dev-private/enterprise/dev/external-services-config.json'}
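      # Note: ${VAR-default} (no colon) substitutes only when VAR is unset, so an
      # explicitly empty EXTSVC_CONFIG_FILE is preserved rather than replaced.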
.bin/sourcegraph
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -ldflags="-X github.com/sourcegraph/sourcegraph/internal/conf/deploy.forceType=single-program" -o .bin/sourcegraph github.com/sourcegraph/sourcegraph/cmd/sourcegraph
checkBinary: .bin/sourcegraph
env:
SITE_CONFIG_FILE: "../dev-private/enterprise/dev/site-config.json"
SITE_CONFIG_ESCAPE_HATCH_PATH: "$HOME/.sourcegraph/site-config.json"
WEBPACK_DEV_SERVER: 1
watch:
- cmd
- enterprise
- internal
- lib
- schema
cody-app:
<<: *sourcegraph_command
description: Cody App
install: |
if [ -n "$DELVE" ]; then
export GCFLAGS='all=-N -l'
fi
go build -gcflags="$GCFLAGS" -ldflags="-X github.com/sourcegraph/sourcegraph/internal/conf/deploy.forceType=app" -o .bin/sourcegraph github.com/sourcegraph/sourcegraph/cmd/sourcegraph
tauri:
description: App shell (Tauri)
cmd: pnpm tauri dev --config src-tauri/tauri.dev.conf.json
bazelCommands:
blobstore:
target: //cmd/blobstore:blobstore
searcher:
target: //cmd/searcher
syntax-highlighter:
target: //docker-images/syntax-highlighter:syntect_server
ignoreStdout: true
ignoreStderr: true
env:
# Environment copied from Dockerfile
WORKERS: "1"
ROCKET_ENV: "production"
ROCKET_LIMITS: "{json=10485760}"
ROCKET_SECRET_KEY: "SeerutKeyIsI7releuantAndknvsuZPluaseIgnorYA="
ROCKET_KEEP_ALIVE: "0"
ROCKET_PORT: "9238"
QUIET: "true"
frontend:
description: Enterprise frontend
target: //cmd/frontend
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
# If EXTSVC_CONFIG_FILE is *unset*, set a default.
export EXTSVC_CONFIG_FILE=${EXTSVC_CONFIG_FILE-'../dev-private/enterprise/dev/external-services-config.json'}
env:
CONFIGURATION_MODE: server
USE_ENHANCED_LANGUAGE_DETECTION: false
SITE_CONFIG_FILE: "../dev-private/enterprise/dev/site-config.json"
SITE_CONFIG_ESCAPE_HATCH_PATH: "$HOME/.sourcegraph/site-config.json"
      # frontend processes need this to be set so that asset paths are rendered correctly
WEBPACK_DEV_SERVER: 1
worker:
target: //cmd/worker
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
repo-updater:
target: //cmd/repo-updater
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
symbols:
target: //enterprise/cmd/symbols
checkBinary: .bin/symbols
env:
CTAGS_COMMAND: dev/universal-ctags-dev
SCIP_CTAGS_COMMAND: dev/scip-ctags-dev
CTAGS_PROCESSES: 2
USE_ROCKSKIP: "false"
gitserver-template: &gitserver_bazel_template
target: //cmd/gitserver
env: &gitserverenv
HOSTNAME: 127.0.0.1:3178
# This is only here to stay backwards-compatible with people's custom
# `sg.config.overwrite.yaml` files
gitserver:
<<: *gitserver_bazel_template
gitserver-0:
<<: *gitserver_bazel_template
env:
<<: *gitserverenv
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3501
GITSERVER_ADDR: 127.0.0.1:3501
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_1
SRC_PROF_HTTP: 127.0.0.1:3551
gitserver-1:
<<: *gitserver_bazel_template
env:
<<: *gitserverenv
GITSERVER_EXTERNAL_ADDR: 127.0.0.1:3502
GITSERVER_ADDR: 127.0.0.1:3502
SRC_REPOS_DIR: $HOME/.sourcegraph/repos_2
SRC_PROF_HTTP: 127.0.0.1:3552
codeintel-worker:
precmd: |
export SOURCEGRAPH_LICENSE_GENERATION_KEY=$(cat ../dev-private/enterprise/dev/test-license-generation-key.pem)
target: //cmd/precise-code-intel-worker
executor-template: &executor_template_bazel
target: //cmd/executor
env:
EXECUTOR_QUEUE_NAME: TEMPLATE
TMPDIR: $HOME/.sourcegraph/executor-temp
# Required for frontend and executor to communicate
EXECUTOR_FRONTEND_URL: http://localhost:3080
# Must match the secret defined in the site config.
EXECUTOR_FRONTEND_PASSWORD: hunter2hunter2hunter2
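# A minimal sketch of the matching site configuration entry; the exact key
# name is an assumption based on the standard executors setup:
#   "executors.accessToken": "hunter2hunter2hunter2"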
# Disable Firecracker inside the executor in dev
EXECUTOR_USE_FIRECRACKER: false
codeintel-executor:
<<: *executor_template_bazel
env:
EXECUTOR_QUEUE_NAME: codeintel
TMPDIR: $HOME/.sourcegraph/indexer-temp
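# A hypothetical sketch of deriving a further executor from the template via
# the YAML merge key (`<<`); the names below are illustrative only:
#   my-executor:
#     <<: *executor_template_bazel
#     env:
#       EXECUTOR_QUEUE_NAME: myqueue
#       TMPDIR: $HOME/.sourcegraph/my-executor-temp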
#
# CommandSets ################################################################
#
defaultCommandset: enterprise
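# Usage sketch: `sg start` launches the default command set above, while
# `sg start <name>` (e.g. `sg start enterprise-bazel`) launches a named one.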
commandsets:
enterprise-bazel: &enterprise_bazel_set
requiresDevPrivate: true
checks:
- redis
- postgres
- git
- bazelisk
- ibazel
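# `bazelCommands` entries are built up front by a one-shot `[ bazel]` step
# (the equivalent of each command's old `install` phase, run all at once),
# then rebuilt in the background by `[ iBazel]`, which watches the filesystem
# via the Bazel dependency tree and overwrites the binaries in place; running
# services detect the overwrite and restart gracefully during rebuilds.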
bazelCommands:
- blobstore
- frontend
- worker
- repo-updater
- gitserver-0
- gitserver-1
- searcher
- symbols
- syntax-highlighter
commands:
- web
- docsite
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- caddy
# If you modify this command set, please consider also updating the dotcom command set below.
enterprise: &enterprise_set
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- caddy
- symbols
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- embeddings
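# Telemetry Gateway export only activates when
# TELEMETRY_GATEWAY_EXPORTER_EXPORT_ADDR is set (empty and disabled by default).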
- telemetry-gateway
env:
DISABLE_CODE_INSIGHTS_HISTORICAL: false
DISABLE_CODE_INSIGHTS: false
enterprise-e2e:
<<: *enterprise_set
env:
# Setting EXTSVC_CONFIG_FILE (even to an empty value) prevents the e2e test
# suite from adding additional external service connections.
EXTSVC_CONFIG_FILE: ""
dotcom:
# This is 95% the enterprise command set, with the addition of Cody Gateway.
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- cody-gateway
- embeddings
- telemetry-gateway
env:
SOURCEGRAPHDOTCOM_MODE: true
codeintel-bazel: &codeintel_bazel_set
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
- bazelisk
- ibazel
bazelCommands:
- blobstore
- frontend
- worker
- repo-updater
- gitserver-0
- gitserver-1
- searcher
- symbols
- syntax-highlighter
- codeintel-worker
- codeintel-executor
commands:
- web
- docsite
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- caddy
- jaeger
- grafana
- prometheus
codeintel:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- codeintel-executor
# - otel-collector
- jaeger
- grafana
- prometheus
codeintel-kubernetes:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- codeintel-executor-kubernetes
# - otel-collector
- jaeger
- grafana
- prometheus
enterprise-codeintel:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
- codeintel-executor
# - otel-collector
- jaeger
- grafana
- prometheus
enterprise-codeintel-multi-queue-executor:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- codeintel-worker
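# Dequeues from multiple queues (e.g. batches and codeintel) in one process.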
- multiqueue-executor
# - otel-collector
- jaeger
- grafana
- prometheus
enterprise-codeintel-bazel:
<<: *codeintel_bazel_set
enterprise-codeinsights:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
env:
DISABLE_CODE_INSIGHTS_HISTORICAL: false
DISABLE_CODE_INSIGHTS: false
api-only:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- gitserver-0
- gitserver-1
- searcher
- symbols
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
batches:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- batches-executor
- batcheshelper-builder
batches-kubernetes:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- symbols
- caddy
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- batches-executor-kubernetes
- batcheshelper-builder
iam:
requiresDevPrivate: true
checks:
- docker
- redis
- postgres
- git
commands:
- frontend
- repo-updater
- web
- gitserver-0
- gitserver-1
- caddy
monitoring:
checks:
- docker
commands:
- jaeger
- otel-collector
- prometheus
- grafana
- postgres_exporter
monitoring-alerts:
checks:
- docker
- redis
- postgres
commands:
- prometheus
- grafana
# For generated alerts docs
- docsite
# For the alerting integration with frontend
- frontend
- web
- caddy
web-standalone:
commands:
- web-standalone-http
- caddy
web-standalone-prod:
commands:
- web-standalone-http-prod
- caddy
# For testing our OpenTelemetry stack
otel:
checks:
- docker
commands:
- otel-collector
- jaeger
single-program:
requiresDevPrivate: true
checks:
- git
commands:
- sourcegraph
- web
- caddy
env:
DISABLE_CODE_INSIGHTS: false
PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT: http://localhost:49000
EMBEDDINGS_UPLOAD_AWS_ENDPOINT: http://localhost:49000
USE_EMBEDDED_POSTGRESQL: false
app:
requiresDevPrivate: true
checks:
- git
commands:
- cody-app
- docsite
- web
- caddy
- tauri
env:
DISABLE_CODE_INSIGHTS: true
CODY_APP: 1
EXTSVC_CONFIG_ALLOW_EDITS: true
PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT: http://localhost:49000
EMBEDDINGS_UPLOAD_AWS_ENDPOINT: http://localhost:49000
cody-gateway:
checks:
- redis
commands:
- cody-gateway
qdrant:
commands:
- qdrant
- frontend
- worker
- repo-updater
- web
- gitserver-0
- gitserver-1
- searcher
- caddy
- symbols
- docsite
- syntax-highlighter
- zoekt-index-0
- zoekt-index-1
- zoekt-web-0
- zoekt-web-1
- blobstore
- embeddings
env:
QDRANT_ENDPOINT: 'localhost:6334'
tests:
# These can be run with `sg test [name]`
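# e.g. `sg test backend` or `sg test backend-integration`.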
backend:
cmd: go test
defaultArgs: ./...
backend-integration:
cmd: cd dev/gqltest && go test -long -base-url $BASE_URL -email $EMAIL -username $USERNAME -password $PASSWORD ./gqltest
env:
# These are defaults. They can be overwritten by setting the env vars when
# running the command.
BASE_URL: "http://localhost:3080"
EMAIL: "joe@sourcegraph.com"
PASSWORD: "12345"
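# A sketch of overriding the defaults inline when invoking the test:
#   BASE_URL=https://sourcegraph.test:3443 EMAIL=me@example.com sg test backend-integration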
bext:
cmd: pnpm --filter @sourcegraph/browser test
bext-build:
cmd: EXTENSION_PERMISSIONS_ALL_URLS=true pnpm --filter @sourcegraph/browser build
bext-integration:
cmd: pnpm --filter @sourcegraph/browser test-integration
bext-e2e:
cmd: pnpm --filter @sourcegraph/browser mocha ./src/end-to-end/github.test.ts ./src/end-to-end/gitlab.test.ts
env:
SOURCEGRAPH_BASE_URL: https://sourcegraph.com
client:
cmd: pnpm jest --testPathIgnorePatterns end-to-end regression integration storybook
docsite:
cmd: .bin/docsite_${DOCSITE_VERSION} check ./doc
env:
DOCSITE_VERSION: v1.9.4 # When bumping DOCSITE_VERSION, update it everywhere (including outside this repo)
web-e2e:
preamble: |
A Sourcegraph instance must already be running for these tests to work, most
commonly with: `sg start enterprise-e2e`
See more details: https://docs.sourcegraph.com/dev/how-to/testing#running-end-to-end-tests
cmd: pnpm test-e2e
env:
TEST_USER_EMAIL: test@sourcegraph.com
TEST_USER_PASSWORD: supersecurepassword
SOURCEGRAPH_BASE_URL: https://sourcegraph.test:3443
BROWSER: chrome
external_secrets:
GH_TOKEN:
project: "sourcegraph-ci"
name: "BUILDKITE_GITHUBDOTCOM_TOKEN"
web-regression:
preamble: |
A Sourcegraph instance must already be running for these tests to work, most
commonly with: `sg start enterprise-e2e`
See more details: https://docs.sourcegraph.com/dev/how-to/testing#running-regression-tests
cmd: pnpm test-regression
env:
SOURCEGRAPH_SUDO_USER: test
SOURCEGRAPH_BASE_URL: https://sourcegraph.test:3443
TEST_USER_PASSWORD: supersecurepassword
BROWSER: chrome
web-integration:
preamble: |
A web application build must exist for these tests to work, most
commonly with: `sg run web-integration-build` or `sg run web-integration-build-prod` for production build.
See more details: https://docs.sourcegraph.com/dev/how-to/testing#running-integration-tests
cmd: pnpm test-integration
web-integration:debug:
preamble: |
A Sourcegraph instance must already be running for these tests to work, most
commonly with: `sg start web-standalone`
See more details: https://docs.sourcegraph.com/dev/how-to/testing#running-integration-tests
cmd: pnpm test-integration:debug