sourcegraph

mirror of https://github.com/sourcegraph/sourcegraph.git synced 2026-02-06 12:51:55 +00:00

Code AI platform with Code Search & Cody

code-intelligence code-search lsif-enabled open-source repo-type-main sourcegraph

Stephen Gutekanst dca1b9694d self hosted models (#63899 )

This PR is stacked on top of all the prior work @chrsmith has done for shuffling configuration data around; it implements the new "Self hosted models" functionality.

## Configuration

Configuring a Sourcegraph instance to use self-hosted models basically involves adding some configuration like this to the site config (if you set `modelConfiguration`, you are opting in to the new system, which is in early access):

```
// Setting this field means we are opting into the new Cody model configuration system.
"modelConfiguration": {
  // Disable use of Sourcegraph's servers for model discovery
  "sourcegraph": null,

  // Create two model providers
  "providerOverrides": [
    {
      // Our first model provider "mistral" will be a Huggingface TGI deployment which hosts our
      // mistral model for chat functionality.
      "id": "mistral",
      "displayName": "Mistral",
      "serverSideConfig": {
        "type": "huggingface-tgi",
        "endpoints": [{"url": "https://mistral.example.com/v1"}]
      },
    },
    {
      // Our second model provider "bigcode" will be a Huggingface TGI deployment which hosts our
      // bigcode/starcoder model for code completion functionality.
      "id": "bigcode",
      "displayName": "Bigcode",
      "serverSideConfig": {
        "type": "huggingface-tgi",
        "endpoints": [{"url": "http://starcoder.example.com/v1"}]
      }
    }
  ],

  // Make these two models available to Cody users
  "modelOverridesRecommendedSettings": [
    "mistral::v1::mixtral-8x7b-instruct",
    "bigcode::v1::starcoder2-7b"
  ],

  // Configure which models Cody will use by default
  "defaultModels": {
    "chat": "mistral::v1::mixtral-8x7b-instruct",
    "fastChat": "mistral::v1::mixtral-8x7b-instruct",
    "codeCompletion": "bigcode::v1::starcoder2-7b"
  }
}
```

More advanced configurations are possible; the above is our blessed configuration for today.

## Hosting models

Another major component of this work is starting to build up recommendations around how to self-host models, which ones to use, how to configure them, etc.
For now, we've been testing with these two on a machine with dual A100s:

* Huggingface TGI (a Docker container for model inference, which provides an OpenAI-compatible API and is widely popular)
* Two models:
  * Starcoder2 for code completion; specifically `bigcode/starcoder2-15b` with `eetq` 8-bit quantization.
  * Mixtral 8x7b instruct for chat; specifically `casperhansen/mixtral-instruct-awq`, which uses `awq` 4-bit quantization.

This is our 'starter' configuration. Other models - specifically other starcoder 2 and mixtral instruct models - certainly work too, and higher parameter versions may of course provide better results.

Documentation for how to deploy Huggingface TGI, suggested configuration, and debugging tips is coming soon.

## Advanced configuration

As part of this effort, I have added a quite extensive set of configuration knobs to the client side model configuration (see `type ClientSideModelConfigOpenAICompatible` in this PR).

Some of these configuration options are needed for things to work at a basic level, while others (e.g. prompt customization) are not needed for basic functionality, but are very important for customers interested in self-hosting their own models.

Today, Cody clients have a number of different _autocomplete provider implementations_ which tie the model-specific logic that enables autocomplete to a provider. For example, if you use a GPT model through Azure OpenAI, the autocomplete provider for that is entirely different from what you'd get if you used a GPT model through OpenAI officially. This can lead to some subtle issues for us, and so it is worth exploring ways to have a _generalized autocomplete provider_. Since with self-hosted models we _must_ address this problem, these configuration knobs fed to the client from the server are a pathway to doing that - initially just for self-hosted models, but in the future possibly generalized to other providers.
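The model references used in the configuration above (for example `mistral::v1::mixtral-8x7b-instruct`) appear to consist of three `::`-separated parts. A minimal sketch of splitting and validating such a reference follows, assuming the parts are a provider ID, an API version, and a model ID. The `ModelRef` type and `ParseModelRef` function are hypothetical names for illustration and are not taken from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelRef is a hypothetical type for this sketch. The field meanings
// (provider, API version, model) are an assumption based on the example
// references shown in the site config, not on the PR's code.
type ModelRef struct {
	ProviderID string
	APIVersion string
	ModelID    string
}

// ParseModelRef splits a reference of the form "provider::version::model"
// and rejects anything that does not have exactly three non-empty parts.
func ParseModelRef(s string) (ModelRef, error) {
	parts := strings.Split(s, "::")
	if len(parts) != 3 {
		return ModelRef{}, fmt.Errorf("model ref %q: want 3 parts separated by \"::\", got %d", s, len(parts))
	}
	for _, p := range parts {
		if p == "" {
			return ModelRef{}, fmt.Errorf("model ref %q: empty part", s)
		}
	}
	return ModelRef{ProviderID: parts[0], APIVersion: parts[1], ModelID: parts[2]}, nil
}

func main() {
	refs := []string{
		"mistral::v1::mixtral-8x7b-instruct",
		"bigcode::v1::starcoder2-7b",
		"mistral::mixtral", // malformed on purpose
	}
	for _, s := range refs {
		ref, err := ParseModelRef(s)
		if err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Printf("provider=%s version=%s model=%s\n", ref.ProviderID, ref.APIVersion, ref.ModelID)
	}
}
```

A check like this could be used to confirm that every entry under `defaultModels` names a provider that also appears in `providerOverrides` before saving the site config.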
## Debugging facilities

Working with customers in the past to use OpenAI-compatible APIs, we've learned that debugging can be quite a pain. If you can't see what requests the Sourcegraph backend is making, and what it is getting back, it can be quite painful to debug.

This PR implements quite extensive logging, and a `debugConnections` flag which can be turned on to enable logging of the actual request payloads and responses. This is critical when a customer is trying to add support for a new model, their own custom OpenAI API service, etc.

## Robustness

Working with customers in the past, we also learned that various parts of our backend `openai` provider were not super robust. For example, [if more than one message was present it was a fatal error](https://github.com/sourcegraph/sourcegraph/blob/main/internal/completions/client/openai/openai.go#L305), or if the SSE stream yielded `{"error"}` payloads, they would be ignored. Similarly, the SSE event stream parser we use is heavily tailored towards [the exact response structure](https://github.com/sourcegraph/sourcegraph/blob/main/internal/completions/client/openai/decoder.go#L15-L19) which OpenAI's official API returns, and is therefore quite brittle if connecting to a different SSE stream.

For this work, I have _started by forking_ our `internal/completions/client/openai` - and made a number of major improvements to it to make it more robust, handle errors better, etc.
I have also replaced the usage of a custom SSE event stream parser - which was not spec compliant and brittle - with a proper SSE event stream parser that recently popped up in the Go community: https://github.com/tmaxmax/go-sse

My intention is that after more extensive testing, this new `internal/completions/client/openaicompatible` provider will be more robust, more correct, and all around better than `internal/completions/client/openai` (and possibly the azure one) so that we can just supersede those with this new `openaicompatible` one entirely.

## Client implementation

Much of the work done in this PR is just "let the site admin configure things, and broadcast that config to the client through the new model config system."

Actually getting the clients to respect the new configuration is a task I am tackling in future `sourcegraph/cody` PRs.

## Test plan

1. This change currently lacks any unit/regression tests; that is a major noteworthy point. I will follow up with those in a future PR.
   * However, these changes are incredibly isolated, clearly only affecting customers who opt in to this new self-hosted models configuration.
   * Most of the heavy lifting (SSE streaming, shuffling data around) is done in other well-tested codebases.
2. Manual testing has played a big role here, specifically:
   * Running a dev instance with the new configuration, actually connected to Huggingface TGI deployed on a remote server.
   * Using the new `debugConnections` mechanism (which customers would use) to directly confirm requests are going to the right places, with the right data and payloads.
   * Confirming with a new client (changes not yet landed) that autocomplete and chat functionality work.

Can we use more testing? Hell yeah, and I'm going to add it soon. Does it work quite well and have small room for error? Also yes.

## Changelog

Cody Enterprise: added a new configuration for self-hosting models. Reach out to support if you would like to use this feature, as it is in early access.

---------

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>		2024-07-19 01:34:02 +00:00
.apko	Build images end-to-end using Bazel v2 (#61845 )	2024-04-12 16:18:43 +01:00
.aspect	fix(ci): only emit bazel execlog artifact for 'test' commands (#63916 )	2024-07-18 15:17:12 +01:00
.buildkite	chore(ci): remove Percy visual tests (#63515 )	2024-06-27 16:20:06 +02:00
.github	pr-auditor: use pr-auditor from devx-service (#63847 )	2024-07-16 11:10:36 +02:00
.vscode	feat(search): Make search aware of perforce changelist id mapping (#63563 )	2024-07-09 14:01:05 -04:00
client	Prompt Library (#63872 )	2024-07-18 16:04:55 -07:00
cmd	self hosted models (#63899 )	2024-07-19 01:34:02 +00:00
dev	Revert "fix(sg): resolve overwrite `env` ordering in sg (#63838 )" (#63924 )	2024-07-18 20:46:35 +00:00
doc	chore/sg: remove 'sg telemetry' and related docs (#63763 )	2024-07-10 17:25:04 -07:00
docker-images	chore/otel-collector: upgrade to v0.103.0, remove jaegerexporter (#63171 )	2024-07-10 09:01:41 -07:00
internal	self hosted models (#63899 )	2024-07-19 01:34:02 +00:00
lib	chore/lib/telemetrygateway: fixup Dial helper (#63862 )	2024-07-16 20:38:53 +00:00
migrations	Prompt Library (#63872 )	2024-07-18 16:04:55 -07:00
monitoring	feat/lib/telemetrygateway: expose simple Dial (#63810 )	2024-07-15 10:45:10 -07:00
schema	self hosted models (#63899 )	2024-07-19 01:34:02 +00:00
testing	feat/bazel: `//cmd/{frontend,server}` targets that don't include client bundle for backend integration tests (#62877 )	2024-05-28 14:32:48 +01:00
third_party	enterprise-portal: implement basic MSP IAM and RPCs (#63173 )	2024-06-19 21:46:48 -04:00
third-party-licenses	Chore: remove gorilla/schema (#63738 )	2024-07-10 15:36:37 +00:00
tools	Chore(release): Calendar Updates (#63583 )	2024-07-02 10:42:12 -04:00
ui/assets	feat/bazel: `//cmd/{frontend,server}` targets that don't include client bundle for backend integration tests (#62877 )	2024-05-28 14:32:48 +01:00
wolfi-images	fix(build): update wolfi image lock for otel (#63755 )	2024-07-10 10:23:11 -07:00
wolfi-packages	chore/otel-collector: upgrade to v0.103.0, remove jaegerexporter (#63171 )	2024-07-10 09:01:41 -07:00
.bazel_fix_commands.json	SG Start Bazel Improvements Take 2 (#60687 )	2024-03-05 01:44:21 -08:00
.bazelignore	Convert Appliance Maintenance UI to Bazel (#63661 )	2024-07-10 13:47:18 +02:00
.bazeliskrc	chore: upgrade to Aspect CLI 5.8.19 (#59203 )	2024-01-02 15:13:24 +01:00
.bazelrc	feat(ci): Adds playwright tests for sveltekit to bazel (#62560 )	2024-06-06 12:45:05 -06:00
.bazelversion	chore(bazel): bump to 7.2.0 (#63226 )	2024-06-12 13:25:18 +00:00
.dockerignore	use esbuild for client/web builds (#57365 )	2023-10-23 10:59:06 -07:00
.editorconfig	proto: Add editorconfig to ident using two spaces (#57281 )	2023-10-03 00:39:42 +00:00
.eslintrc.js	various improvements to saved searches (#63539 )	2024-07-15 20:12:34 +00:00
.gitattributes	dev/linearhooks: add POC (#62367 )	2024-05-07 00:14:05 -07:00
.gitignore	chore(ci): emit compact executon log in CI (#63420 )	2024-06-21 19:50:35 +01:00
.graphqlrc.yml
.hadolint.yaml	bump comby version to 1.7.1 (#35830 )	2022-05-20 20:12:01 -07:00
.mailmap	mailmap: add entries for Eric and Renovate (#50966 )	2023-04-25 09:42:22 +02:00
.mocharc.js	reapply "switch from jest to vitest for faster, simpler tests (#57886 )" (#58145 )	2023-11-07 12:00:18 +02:00
.npmrc	pnpm: remove update notifier message (#51630 )	2023-05-10 08:53:39 +02:00
.pre-commit-config.yaml	chore(local): add FORBIDCOMMIT pragma to prevent accidental commits (#63581 )	2024-07-01 18:27:26 +00:00
.prettierignore	feat/dotcom: use Enterprise Portal for Cody Gateway usage (#63653 )	2024-07-10 19:22:08 +00:00
.stylelintignore	rework plugin structure and implement frontside blogpost (#46883 )	2023-02-15 11:49:51 +02:00
.stylelintrc.json	web: drop `bootstrap` depenedency (#41401 )	2022-09-07 03:11:26 -07:00
.swcrc	use swc instead of babel for faster bazel typescript transpilation (#57912 )	2023-11-02 22:49:03 -07:00
.tool-versions	chore(tooling): bump Go version to 1.22.4 (#63124 )	2024-06-06 15:19:03 +00:00
.trivyignore
BUILD.bazel	symbols: Make symbols specific code internal (#63736 )	2024-07-10 01:26:22 +02:00
CHANGELOG.md	feat(code insights): language stats speed improvements by using archive loading (#62946 )	2024-07-18 08:40:48 +02:00
CODENOTIFY	nix: update pnpm hash (#51512 )	2023-05-05 12:51:59 +00:00
CONTRIBUTING.md	fix: update links for dev docs (#62758 )	2024-05-17 13:47:34 +02:00
deps.bzl	self hosted models (#63899 )	2024-07-19 01:34:02 +00:00
doc.go
eslint-relative-formatter.js	bazel: implement custom ESLint Bazel rule (#52062 )	2023-05-22 04:05:45 -07:00
flake.lock	nix: bump to bazel 7.1 (#61326 )	2024-03-22 16:57:50 +00:00
flake.nix	nix: use go1.22.4 (#63372 )	2024-06-20 11:12:17 +02:00
gen.go	chore: fixup go-mockgen run statement (#61028 )	2024-03-12 13:06:36 +00:00
go.mod	self hosted models (#63899 )	2024-07-19 01:34:02 +00:00
go.sum	self hosted models (#63899 )	2024-07-19 01:34:02 +00:00
graphql-schema-linter.config.js
LICENSE	relicense all paths other than MIT licensed code, client/cody*, jetbrains, VS code, and browser extension to enterprise (#53345 ) (#53345 )	2023-06-13 10:28:11 -07:00
LICENSE.enterprise	Update Enterprise license copyright notice (#62467 )	2024-05-06 17:35:32 +00:00
linter_deps.bzl	chore: Remove redundant loop captures (#62264 )	2024-04-30 07:57:21 -06:00
mockgen.temp.yaml	Prompt Library (#63872 )	2024-07-18 16:04:55 -07:00
mockgen.test.yaml	feat(appliance): self-update (#63780 )	2024-07-11 17:59:39 +01:00
mockgen.yaml	bazel: native go-mockgen in Bazel (#60386 )	2024-02-16 13:26:48 +00:00
nogo_config.json	chore: Remove redundant loop captures (#62264 )	2024-04-30 07:57:21 -06:00
package.json	Upgrade cody web experimental package to 0.2.7 (#63863 )	2024-07-16 18:43:53 -03:00
pnpm-lock.yaml	Upgrade cody web experimental package to 0.2.7 (#63863 )	2024-07-16 18:43:53 -03:00
pnpm-workspace.yaml	Convert Appliance Maintenance UI to Bazel (#63661 )	2024-07-10 13:47:18 +02:00
postcss.config.js
prettier.config.js	clean up Cody CSS to increase shareability and improve display in web app (#50279 )	2023-04-03 12:29:05 -07:00
README.md	chore: remove broken link in README (#63256 )	2024-06-13 22:22:56 +00:00
release.yaml	feat(ci): Trigger security scanner from release pipeline (#63280 )	2024-06-19 19:16:36 +00:00
renovate.json	chore(ci): disable renovate (#63313 )	2024-06-19 13:17:15 +02:00
SECURITY.md
service-catalog.yaml	lib/servicecatalog: init to distribute catalog (#46999 )	2023-01-26 17:22:27 -08:00
sg.config.yaml	chore/otel-collector: upgrade to v0.103.0, remove jaegerexporter (#63171 )	2024-07-10 09:01:41 -07:00
shell.nix	bazel: use pgutil binaries from GCS instead of from the host (#61741 )	2024-04-11 18:00:21 +01:00
stamp_tags.bzl	Switch to OCI/Wolfi based image (#52693 )	2023-06-02 12:12:52 +02:00
tsconfig.base.json	web: fix pnpm-lock issue (#47478 )	2023-02-09 22:04:31 -08:00
tsconfig.json	release: drop legacy release tooling (#61220 )	2024-04-09 14:29:35 -05:00
vitest.shared.ts	make pagination hooks store filter & query params in URL, not just pagination params (#63744 )	2024-07-15 19:17:59 +00:00
vitest.workspace.ts	vitest: Fix workspace config wrt client/web/ (#58397 )	2023-11-17 08:22:46 +00:00
WORKSPACE	release/bug: generate a new stitched migration graph (#63764 )	2024-07-10 14:49:18 -07:00

README.md

Docs • Contributing • Twitter • Discord

Sourcegraph makes it easy to read, write, and fix code—even in big, complex codebases.

Code search: Search all of your repositories across all branches and all code hosts.
Code intelligence: Navigate code, find references, see code owners, trace history, and more.
Fix and refactor: Roll out large-scale changes to many repositories at once and track big migrations.

Getting started

Development

Refer to the Developing Sourcegraph guide to get started.

Documentation

The doc directory has additional documentation for developing and understanding Sourcegraph:

Architecture: high-level architecture
Database setup: database best practices
Go style guide
Documentation style guide
GraphQL API: useful tips when modifying the GraphQL API

Contributing

License

This repository contains primarily non-OSS-licensed files. See LICENSE.