sourcegraph

mirror of https://github.com/sourcegraph/sourcegraph.git synced 2026-02-06 15:31:48 +00:00

Author	SHA1	Message	Date
Julie Tibshirani	c222523fa5	Redis: remove some direct pool usages (#64447 ) We want to discourage direct usage of the Redis pool in favor of routing all calls through the main `KeyValue` interface. This PR removes several usages of `KeyValue.Pool`. To do so, it adds "PING" and "MGET" to the `KeyValue` interface.	2024-08-14 13:39:10 +03:00
Julie Tibshirani	ca6e72fe18	Redis: remove RedisKeyValue constructor (#64442 ) This PR removes the `redispool.RedisKeyValue` constructor in favor of the `New...KeyValue` methods, which do not take a pool directly. This way callers won't create a `Pool` reference, allowing us to track all direct pool usage through `KeyValue.Pool()`. This also simplifies a few things: * Tests now use `NewTestKeyValue` instead of dialing up localhost directly * We can remove duplicated Redis connection logic in Cody Gateway	2024-08-14 11:24:32 +03:00
Robert Lin	cec288dc89	fix/enterpriseportal, fix/codygateway: zero-value durations and missing active licenses (#64378 ) This change ensures we correctly handle: 1. In Enterprise Portal, where no active license is available, we return ratelimit=0 intervalduration=0, from the source `PLAN` (as this is determined by the lack of a plan) 2. In Cody Gateway, where intervalduration=0, we do not grant access to that feature ## Test plan Unit tests	2024-08-12 11:49:54 -07:00
Noah S-C	b9c4e2aae9	Revert "Revert "refactor: upgrade to rules_oci 2.0 (2nd attempt)"" (#64354 ) Reverts sourcegraph/sourcegraph#64351 ## Test plan Need to test on main due to main-only CI steps (even with main dry-run)	2024-08-08 09:00:08 +00:00
Noah S-C	addba96f47	Revert "refactor: upgrade to rules_oci 2.0 (2nd attempt)" (#64351 ) Reverts sourcegraph/sourcegraph#63829 Not working with Aspect Delivery ## Test plan CI	2024-08-07 22:15:21 +00:00
Greg Magolan	be015c58c2	refactor: upgrade to rules_oci 2.0 (2nd attempt) (#63829 ) 2nd attempt of #63111, a follow up https://github.com/sourcegraph/sourcegraph/pull/63085 rules_oci 2.0 brings a lot of performance improvement around oci_image and oci_pull, which will benefit Sourcegraph. It will also make RBE faster and have less load on remote cache. However, 2.0 makes some breaking changes like - oci_tarball's default output is no longer a tarball - oci_image no longer compresses layers that are uncompressed, somebody has to make sure all `pkg_tar` targets have a `compression` attribute set to compress it beforehand. - there is no curl fallback, but this is fine for sourcegraph as it already uses bazel 7.1. I checked all targets that use oci_tarball as much as i could to make sure nothing depends on the default tarball output of oci_tarball. there was one target which used the default output which i put a TODO for somebody else (somebody who is more on top of the repo) to tackle later. ## Test plan CI. Also run delivery on this PR (don't land those changes) --------- Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>	2024-08-07 22:21:49 +01:00
Robert Lin	eedc12e789	feat/enterpriseportal: import data from dotcom (#63858 ) Adds a background job that can periodically import subscriptions, licenses, and Cody Gateway access from dotcom. Note that subscriptions and licenses cannot be deleted, so we don't need to worry about that. Additionally licenses cannot be updated, so we only need to worry about creation and revocation. The importer can be configured with `DOTCOM_IMPORT_INTERVAL` - if zero, the importer is disabled. Closes https://linear.app/sourcegraph/issue/CORE-216 ## Test plan ``` DOTCOM_IMPORT_INTERVAL=10s sg start dotcom ``` Look for `service.importer` logs. Play around in https://sourcegraph.test:3443/site-admin/dotcom/product/subscriptions/ to create and edit subscriptions, licenses, and Cody Gateway access. Watch them show up in the database: ``` psql -d sourcegraph sourcegraph# select * from enterprise_portal_susbscriptions; sourcegraph# select * from enterprise_portal_susbscription_licenses; sourcegraph# select * from enterprise_portal_cody_gateway_access; ``` --------- Co-authored-by: James Cotter <35706755+jac@users.noreply.github.com>	2024-08-07 11:44:18 -07:00
Taras Yemets	d19aa106f9	feat(cody): add circuit breaker to handle timed-out requests and rate limit hits (#64133 ) Closes https://linear.app/sourcegraph/issue/CODY-2758/[autocomplete-latency]-add-circuit-breaker-in-cody-gateway-to-handle This PR introduces an in-memory model availability tracker to ensure we do not send subsequent requests to currently unavailable LLM providers. The tracker maintains a history of error records for each model. It evaluates these records to determine whether the model is available for new requests. The evaluation follows these steps: 1. For every request to an upstream provider, the tracker records any errors that occur. Specifically, it logs timeout errors (when a request exceeds its deadline) and responses with a 429 status code (Too Many Requests). 2. These error records are stored in a circular buffer for each model. This buffer holds a fixed number of records, ensuring efficient memory usage. 3. The tracker calculates the failure ratio by analyzing the stored records. It checks the percentage of errors within a specified evaluation window and compares this against the total number of recent requests. 4. Based on the calculated failure ratio, the tracker decides whether the model is available: - Model Unavailable: If the ratio of failures (timeouts or 429 status codes) exceeds a predefined threshold (X%), the model is marked as unavailable. In this state, the system does not send new requests to the upstream provider. When a model is unavailable, the system immediately returns an error status code, typically a 503 Service Unavailable, to the client. This informs the client that the service is temporarily unavailable due to upstream issues. - Model Available: If the failure ratio is within acceptable limits, the system proceeds with sending the request to the upstream provider. 
This PR suggests considering a model unavailable if 95% of the last 100 requests within the past minute either time out or return a 429 status code. I am not sure about these exact values and suggest them as a starting point for discussion. ## Test plan - Added unit tests - CI ## Changelog	2024-08-05 11:38:58 +00:00
Beatrix	468a01a3ab	cody-gateway: handle missing Google response (#63895 ) - Logs a warning instead of returning an error when the Google response is missing - This prevents the API from returning an error when the Google response is empty, which can happen when Google is not happy with the question due to safety issues. We will only log a decoder error as error. ## Test plan only the log level was updated for empty responses. the function behavior was not changed. ## Changelog cody-gateway: log missing Google response as warning	2024-07-24 17:58:19 +00:00
Rafał Gajdulewicz	8bd9c5d1a4	Add counter for traced requests to Fireworks (#63953 ) Adds a otel counter to measure how many traced requests we sent to Fireworks from Cody Gateway - [context](https://sourcegraph.slack.com/archives/C0729T2PBV2/p1721385268801079?thread_ts=1721029684.522409&cid=C0729T2PBV2) ## Test plan - tested locally by sending a request without the `X-Fireworks-Genie` header, with the header but with value != `true` and with the header and value `true` and observing `fireworks-traced-requests` Prometheus metric	2024-07-19 14:32:19 +00:00
Taras Yemets	26df35a69f	fix(cody): use client-provided timeout for completions requests (#63875 ) Closes [CODY-2775](https://linear.app/sourcegraph/issue/CODY-2775/%5Bautocomplete-latency%5D-apply-the-same-timeout-on-the-cody-gateway-side) Enables client control over the request processing timeout on the server (both Sourcegraph backend and Cody Gateway). The context timeout is set to the value provided in the `X-Timeout-Ms` header of the client request. If the header is not provided, the default context timeout is used (1 minute on both Sourcegraph backend and Cody Gateway). Previously, we only had a default timeout on the Sourcegraph backend side (8 minutes). Corresponding client change: - https://github.com/sourcegraph/cody/pull/4921 ## Test plan - Manually tested and confirmed that if the request contains the `X-Timeout-Ms` header, its value is used. If not, the default maximum request duration is applied. - CI ## Changelog - Use the provided timeout from request parameters if available; otherwise use the default maximum request duration (8 minutes)	2024-07-19 14:17:02 +00:00
Kevin Chen	59ec1e034e	Update flagging.go	2024-07-16 07:15:40 -07:00
Kevin Chen	9d8140ee5f	updated error messaging for blocked requests	2024-07-16 07:15:40 -07:00
Hitesh Sagtani	660d6866b5	change model identifier for finetuned deepseek model (#63817 ) ## Context 1. Change model identifier for deepseek-coder-v2 model for fine-tuned models. ## Test plan ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-stack-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false, "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions' ``` ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-logs-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false, "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions' ```	2024-07-14 13:47:50 +00:00
Chris Smith	02c07df176	feat/cody: Refactor completions API to use new modelconfig (support more models) (#63797 ) This PR is what the past dozen or so [cleanup](https://github.com/sourcegraph/sourcegraph/pull/63359), [refactoring](https://github.com/sourcegraph/sourcegraph/pull/63731), and [test](https://github.com/sourcegraph/sourcegraph/pull/63761) PRs were all about: using the new `modelconfig` system for the completion APIs. This will enable users to: - Use the new site config schema for specifying LLM configuration, added in https://github.com/sourcegraph/sourcegraph/pull/63654. Sourcegraph admins who use these new site config options will be able to support many more LLM models and providers than is possible using the older "completions" site config. - For Cody Enterprise users, we no longer ignore the `CodyCompletionRequest.Model` field. And now support users specifying any LLM model (provided it is "supported" by the Sourcegraph instance). Beyond those two things, everything should continue to work like before. With any existing "completions" configuration data being converted into the `modelconfig` system (see https://github.com/sourcegraph/sourcegraph/pull/63533). ## Overview In order to understand how this all fits together, I'd suggest reviewing this PR commit-by-commit. ### [Update internal/completions to use modelconfig](`e6b7eb171e`) The first change was to update the code we use to serve LLM completions. (Various implementations of the `types.CompletionsProvider` interface.) The key changes here were as follows: 1. Update the `CompletionRequest` type to include the `ModelConfigInfo` field (to make the new Provider and Model-specific configuration data available.) 2. Rename the `CompletionRequest.Model` field to `CompletionRequest.RequestedModel`. (But with a JSON annotation to maintain compatibility with existing callers.) This is to catch any bugs related to using the field directly, since that is now almost guaranteed to be a mistake. (See below.) 
With these changes, all of the `CompletionProvider`s were updated to reflect these changes. - Any situation where we used the `CompletionRequest.Parameters.RequestedModel` should now refer to `CompletionRequest.ModelConfigInfo.Model.ModelName`. The "model name" being the thing that should be passed to the API provider, e.g. `gpt-3.5-turbo`. - In some situations (`azureopenai`) we needed to rely on the Model ID as a more human-friendly identifier. This isn't 100% accurate, but will match the behavior we have today. A long doc comment calls out the details of what is wrong with that. - In other situations (`awsbedrock`, `azureopenai`) we read the new `modelconfig` data to configure the API provider (e.g. `Azure.UseDeprecatedAPI`), or surface model-specific metadata (e.g. AWS Provisioned Throughput ARNs). While the code is a little clunky to avoid larger refactoring, this is the heart and soul of how we will be writing new completion providers in the future. That is, taking specific configuration bags with whatever data that is required. ### [Fix bugs in modelconfig](`75a51d8cb5`) While we had lots of tests for converting the existing "completions" site config data into the `modelconfig.ModelConfiguration` structure, there were a couple of subtle bugs that I found while testing the larger change. The updated unit tests and comments should make that clear. ### [Update frontend/internal/httpapi/completions to use modelconfig](`084793e08f`) The final step was to update the HTTP endpoints that serve the completion requests. There weren't any logic changes here, just refactoring how we lookup the required data. (e.g. converting the user's requested model into an actual model found in the site configuration.) We support Cody clients sending either "legacy mrefs" of the form `provider/model` like before, or the newer mref `provider::apiversion::model`. Although it will likely be a while before Cody clients are updated to only use the newer-style model references. 
The existing unit tests for the completions APIs just worked, which was the plan. But for the few changes that were required I've added comments to explain the situation. ### [Fix: Support requesting models just by their ID](`99715feba6`) > ... We support Cody clients sending either "legacy mrefs" of the form `provider/model` like before ... Yeah, so apparently I lied 😅 . After doing more testing, the extension _also_ sends requests where the requested model is just `"model"`. (Without the provider prefix.) So that now works too. And we just blindly match "gpt-3.5-turbo" to the first mref with the matching model ID, such as "anthropic::unknown::gpt-3.5-turbo". ## Test plan Existing unit tests pass, added a few tests. And manually tested my Sg instance configured to act as both "dotcom" mode and a prototypical Cody Enterprise instance. ## Changelog Update the Cody APIs for chat or code completions to use the "new style" model configuration. This allows for great flexibility in configuring LLM providers and exposing new models, but also allows Cody Enterprise users to select different models for chats. This will warrant a longer, more detailed changelog entry for the patch release next week. As this unlocks many other exciting features.	2024-07-12 12:15:31 -07:00
Erik Seliger	a32b6131f3	codygateway: Use only one redis pool and make REDIS_ENDPOINT a clear requirement in config (#63625 ) Currently, nothing really tells that Cody Gateway needs redis, the env var for finding the address is hidden somewhere deep in the redispool package. In practice, we only use one redis instance, but at some point we started using both redispool.Cache and redispool.Store, which means we maintain two connection pools, leading to more than expected connections. Test plan: Code review and CI.	2024-07-10 01:54:24 +02:00
Erik Seliger	169db11ce6	rcache: Explicitly pass redis pool to use (#63644 ) Recently, this was refactored to also allow using the redispool.Store. However, that makes it very implicit to know where something is being written, so instead we pass down the pool instance at instantiation. This also gives a slightly better overview of where redispool is actually required. Test plan: CI passes.	2024-07-10 01:23:19 +02:00
Hitesh Sagtani	eb16d802a3	adding deepseek-v2 and deepseek fine-tuned model trained on symbol graph context (#63702 ) ## Context 1. Adds support for deepseek-coder-v2 model and added fine-tuned on deepseek coder. ## Test plan ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-stack-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false, "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions' ``` ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-logs-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false, "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions' ```	2024-07-09 17:57:16 +05:30
Robert Lin	fcfdba7e7b	fix/codygateway: tweak enterprise-portal dial options (#63692 ) As titled - we ran into 1 occurrence of a sync failure again: https://sourcegraph.slack.com/archives/C076472745A/p1720278447544509 ## Test plan n/a	2024-07-08 13:21:26 -07:00
Beatrix	806ff434a6	feat(cody-gateway): add support for Gemini models with context cache (#63413 ) PART OF https://linear.app/sourcegraph/issue/CODY-2451 CLOSE https://linear.app/sourcegraph/issue/CODY-2513 - Add Gemini 1.5 Flash 001 and Gemini 1.5 Pro 001 models to the config and allowed models lists - These fixed stable versions support context caching, as noted in the [Google Gemini API docs](https://ai.google.dev/gemini-api/docs/caching?lang=node) ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/1857c853-7c8d-4446-a991-4bc6a39e6065) NEXT: Implement context caching in the codebase. Right now, using the newly added models alone does not work with context caching. ## Test plan No feature changes. Adding new model to allow list. ## Changelog	2024-07-03 09:28:13 -07:00
Robert Lin	ad03371193	fix/cody-gateway: use keepalive/idle timeout options for Enterprise Portal (#63605 ) See https://linear.app/sourcegraph/issue/CORE-203 - it seems the default keepalive and idle options are quite aggressive about keeping idle connections around without verifying them. This change tries to configure some options to ensure idle connections aren't retained for a long time. ## Test plan `sg start cody-gateway` with `CODY_GATEWAY_ENTERPRISE_PORTAL_URL: https://enterprise-portal.sgdev.org:443` in override ``` [ cody-gateway] INFO cody-gateway.sources.worker.handler actor/source.go:154 Completed sync {"TraceId": "b30602af4f6f3269ed438c86ca37edcc", "SpanId": "154e7f8114b27cab", "handle.timeout": "2m0s", "source": "dotcom-product-subscriptions", "sync_duration": "800.20575ms", "seen": 161} [ cody-gateway] INFO cody-gateway.sources.worker.handler actor/source.go:165 All sources synced {"TraceId": "b30602af4f6f3269ed438c86ca37edcc", "SpanId": "8f226cc9f0159b70", "handle.timeout": "2m0s"} ```	2024-07-03 08:40:50 -07:00
Robert Lin	5f37089303	chore/codygatewayevents: extract into standalone package for reuse, split up internal/codygateway (#63528 ) Allows us to directly reuse the Cody Gateway usage queries so that they can be served directly from Enterprise Portal (https://github.com/sourcegraph/sourcegraph/pull/63531). To enable this we also need to split up the monolithic `internal/codygateway` package so that not all roads lead back to `conf`: - `internal/codygateway`: Client mechanisms - `internal/codygateway/codygatewayevents`: Cody Gateway events service + related consts - `internal/codygateway/codygatewayactor`: Cody Gateway actor types Part of https://linear.app/sourcegraph/issue/CORE-201 ## Test plan n/a	2024-06-28 12:03:16 -07:00
Ólafur Páll Geirsson	a426134fd4	Gateway: forward X-Fireworks-Genie header from client (#63460 ) Previously, there was no way to enable the "tracing" feature from Fireworks https://readme.fireworks.ai/docs/enabling-tracing This PR solves the problem by forwarding the `X-Fireworks-Genie` HTTP header to Fireworks if this HTTP header is set by the Gateway client. Fixes CODY-2555 ## Test plan N/A ## Changelog	2024-06-25 02:49:26 +00:00
Quinn Slack	91bc23d8e1	support fast, simple `sg start single-program-experimental-blame-sqs` for local dev (#63435 ) This makes it easier to run Sourcegraph in local dev by compiling a few key services (frontend, searcher, repo-updater, gitserver, and worker) into a single Go binary and running that. Compared to `sg start` (which compiles and runs ~10 services), it's faster to start up (by ~10% or a few seconds), takes a lot less memory and CPU when running, has less log noise, and rebuilds faster. It is slower to recompile for changes just to `frontend` because it needs to link in more code on each recompile, but it's faster for most other Go changes that require recompilation of multiple services. This is only intended for local dev as a convenience. There may be different behavior in this mode that could result in problems when your code runs in the normal deployment. Usually our e2e tests should catch this, but to be safe, you should run in the usual mode if you are making sensitive cross-service changes. Partially reverts "svcmain: Simplify service setup (#61903)" (commit `9541032292`). ## Test plan Existing tests cover any regressions to existing behavior. This new behavior is for local dev only.	2024-06-24 21:12:47 +00:00
Beatrix	b3fe6dceb6	fix(cody-gateway): getAPIURL before transformBody (#63406 ) Fix an issue where the requestBody is used after transformBody has been executed. ## Test plan 1. Start Cody Gateway locally 2. Start SG local dev instance 3. Connect SG local dev instance to your local Cody Gateway instance 4. Set Gemini Flash as your chatModel 5. Connect Cody to your local dev instance 6. Ask Cody a question and verify you are getting a response ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/fbce22f9-8531-4f6e-8eb7-5c6b26e0a9fa) ## Changelog	2024-06-20 16:12:16 -07:00
Beatrix	18c7ba8dac	Cody Gateway: New Claude 3.5 Sonnet model (#63395 ) CONTEXT: https://sourcegraph.slack.com/archives/C05AGQYD528/p1718898110684289?thread_ts=1718896254.676939&cid=C05AGQYD528 CLOSE https://linear.app/sourcegraph/issue/CODY-2177 Adding new Claude 3.5 Sonnet (`claude-3.5-sonnet-20240620`) to the Cody Gateway allow list. Model ID based on Anthropic Console: ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/6f27b24f-a7f5-4b3f-85a9-c0eed1babe9b) Claude 3.5 Sonnet is Live on [s0.dev](http://s0.dev/) to confirm this is the correct model ID ## Test plan Verify you can use the new model through Cody Gateway ## Changelog feature(plg): new Claude 3.5 Sonnet model support for Cody Pro users	2024-06-20 09:46:33 -07:00
Chris Smith	46134524c7	refactor(cody): Reshape the `CompletionsClient` interface (#63358 ) This PR refactors the `CompletionsClient` interface, and all the corresponding call sites. There is no functional change, beyond bundling several function parameters into a new type. See `internal/completions/types/types.go`. But the gist is putting three parameters into a single `CompletionRequest` type. ```diff Complete( context.Context, log.Logger - CompletionsFeature, - CompletionsVersion, - CompletionRequestParameters + CompletionRequest ) (CompletionResponse, error) ``` ## Why? As part of reworking the codepath between receiving a completion request, dispatching it to the right `CompletionsClient` implementation, and serving the request, I need some "hooks" to inject new information. In a future PR I plan on adding a `ServerSideModelConfig` as another field to the `CompletionRequest`, so that when the `CompletionClient`'s implementation is trying to serve that request it has any additional data it needs. (For example, the AWS Bedrock provisioned capacity ARN, etc.) ## Test plan Updated existing tests, relying on CI/CD for any other issues. ## Changelog NA, just some under the hood refactoring that shouldn't impact any functionality.	2024-06-19 19:17:32 -07:00
Rafał Gajdulewicz	b7dd61769c	Use math/rand/v2 (#63346 ) Switches code from https://github.com/sourcegraph/sourcegraph/pull/63315 to use [math/rand/v2](https://pkg.go.dev/math/rand/v2) as suggested by @keegancsmith. ## Test plan - tested locally with a backend throwing random 404	2024-06-19 15:31:07 +00:00
Robert Lin	557b4df0ed	chore/deps: upgrade grpc, prometheus/common (#63328 ) This change extracts the unrelated transitive upgrades of https://github.com/sourcegraph/sourcegraph/pull/63171 (CORE-177) into a separate PR. I'm making this because @unknwon ran into issues with the exact same dependencies in https://github.com/sourcegraph/sourcegraph/pull/63171#issuecomment-2157694545. The change consists of upgrades to: - `google.golang.org/grpc` - there's a deprecation of `grpc.DialContext` that we agreed in #63171 to keep for now. - removing our `replace` directive on `github.com/prometheus/common` and upgrading it. This is safe to do because our Alertmanager version is already way ahead, and the reason this has a `replace` is outdated now. ## Test plan CI, nothing blows up on `sg start` and I can click around and do a bit of searching	2024-06-19 09:55:44 -04:00
Rafał Gajdulewicz	c68cd521cf	Retry 404 errors from Triton (#63315 ) Currently, when SMEGA scales down a pod, it will return 404 for a period of time, and that 404 will get translated into a 500 response from Cody Gateway. This PR implements (exponential) retries (with jitter), attempting to send the same request to a different Triton pod. Closes AI-86. Related to AI-31, AI-87. ## Test plan - tested locally with a backend that throws random 404	2024-06-19 13:43:56 +01:00
Beatrix	0c777bac41	fix(cody-gateway): streaming google endpoint (#63306 ) Issue: Currently, the ShouldStream() method always returns false because the Stream value is removed before it is passed into the Handler. To fix this, we will store the original googleRequest.Stream value if it's true so that ShouldStream() will return the correct Stream value. We will also use the transformBody method to remove the Stream value before we send it to Google API. Here is the expected behaviour after the stream is fixed: https://github.com/sourcegraph/sourcegraph/assets/68532117/8324fb8c-0625-4579-b0e9-0abfc9858961 Also confirmed it works with both Cody Gateway and BYOK: ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/9fe60423-a05b-412d-812a-f34cd812d9dc) ## Test plan Always stream Cody Gateway's requests for Google Gemini models as we haven't implemented Code Completion feature on the client side. 
### Non-stream request ``` ❯ curl 'https://sourcegraph.test:3443/.api/completions/code' -i \ -X POST \ -H 'authorization: token LOCALTOKEN' \ --data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000}' HTTP/2 200 access-control-allow-credentials: true access-control-allow-origin: alt-svc: h3=":3443"; ma=2592000 cache-control: no-cache, max-age=0 content-type: text/plain; charset=utf-8 date: Tue, 18 Jun 2024 21:05:38 GMT server: Caddy server: Caddy set-cookie: sourcegraphDeviceId=d4fa7789-2442-472a-b425-a68372d27944; Expires=Wed, 18 Jun 2025 21:05:36 GMT; Secure vary: Cookie, Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie x-content-type-options: nosniff x-frame-options: DENY x-powered-by: Express x-trace: 00f998a2a2e1b6895687ad7cc567b41c x-trace-span: da9c93d16415b94f x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/00f998a2a2e1b6895687ad7cc567b41c x-xss-protection: 1; mode=block content-length: 147 {"completion":"I am a large language model, trained by Google. 
\n\nHere's what that means:\n\n* Large Language Model: I'm","stopReason":"STOP"}% ``` ### Streaming request: ``` ❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \ -X POST \ -H 'authorization: token $LOCALTOKEN' \ --data-raw '{"stream":true,"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":1000,"temperature":0,"stopSequences":[],"timeoutMs":5000}' HTTP/2 200 access-control-allow-credentials: true access-control-allow-origin: alt-svc: h3=":3443"; ma=2592000 cache-control: no-cache content-type: text/event-stream date: Tue, 18 Jun 2024 21:07:02 GMT server: Caddy server: Caddy set-cookie: sourcegraphDeviceId=38b45f36-d237-4f8d-8242-a63fcc801a32; Expires=Wed, 18 Jun 2025 21:06:59 GMT; Secure vary: Cookie, Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie x-accel-buffering: no x-content-type-options: nosniff x-frame-options: DENY x-powered-by: Express x-trace: 984932973626e14f7cb0ce7e8e470717 x-trace-span: d285179cfb744e08 x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/984932973626e14f7cb0ce7e8e470717 x-xss-protection: 1; mode=block event: completion data: {"completion":"I","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. \n\nHere's what","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* I am not a person. I am a computer","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* I am not a person. I am a computer program designed to process and generate human-like text. \n* I learn from data. I was trained on a massive dataset of text and code,","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* I am not a person. 
I am a computer program designed to process and generate human-like text. \n* I learn from data. I was trained on a massive dataset of text and code, which allows me to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.\n* *I am still","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n I am not a person. I am a computer program designed to process and generate human-like text. \n* I learn from data. I was trained on a massive dataset of text and code, which allows me to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.\n* I am still under development. I am constantly learning and improving, but I am not perfect and can sometimes make mistakes.\n\nHow can I help you today? \n","stopReason":"STOP"} event: done data: {} ```	2024-06-18 22:25:26 +00:00
Chris Smith	692cecad6e	fix(cody-gateway): Fix Google flagging configuration (#63305 ) We recently noticed that 100% of all LLM requests routed to Google-provided LLMs were getting flagged, and for the same reasons ["high_max_tokens_to_sample","blocked_phrase"]. After FAR, FAR more head scratching than I care to admit to, I realized the problem: we were never actually initializing the `Google.FlaggingConfig` settings. So when we were inspecting Gemini requests for potential abuse, we were comparing them against the zero-state for `flaggingConfig`. i.e. does this prompt have a higher `MaxTokensToSample` than 0? 🤦 Super easy mistake to make. We now confirm that _something_ is in the `flaggingConfig` before assuming it is legitimate. Fixes https://github.com/sourcegraph/abuse-ban-bot/issues/32. ## Test plan CI/CD	2024-06-17 14:16:29 -07:00
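The zero-value pitfall described in #63305 can be reproduced in a few lines. The sketch below is an assumption-laden illustration: the `flaggingConfig` fields, the `isZero` helper, and `flagReasons` are invented for this example, and only the two reason strings come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// flaggingConfig is a hypothetical, reduced version of a flagging configuration.
type flaggingConfig struct {
	MaxTokensToSample int
	BlockedPhrases    []string
}

// isZero reports whether the config was never initialized.
func (c flaggingConfig) isZero() bool {
	return c.MaxTokensToSample == 0 && len(c.BlockedPhrases) == 0
}

// flagReasons returns the reasons a request would be flagged. An
// uninitialized config flags nothing, instead of comparing against 0.
func flagReasons(cfg flaggingConfig, prompt string, maxTokens int) []string {
	if cfg.isZero() {
		return nil
	}
	var reasons []string
	if cfg.MaxTokensToSample > 0 && maxTokens > cfg.MaxTokensToSample {
		reasons = append(reasons, "high_max_tokens_to_sample")
	}
	lowered := strings.ToLower(prompt)
	for _, phrase := range cfg.BlockedPhrases {
		if strings.Contains(lowered, strings.ToLower(phrase)) {
			reasons = append(reasons, "blocked_phrase")
			break
		}
	}
	return reasons
}

func main() {
	// Without the isZero guard, any positive maxTokens would exceed 0 and be flagged.
	fmt.Println(flagReasons(flaggingConfig{}, "Who are you?", 1000))
	fmt.Println(flagReasons(flaggingConfig{MaxTokensToSample: 500}, "Who are you?", 1000))
}
```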
David Veszelovszki	7452324ea5	fix(cody-gateway): Disable flagging Google requests (#63295 ) - Fixes https://github.com/sourcegraph/sourcegraph/issues/63294 This PR turns off flagging for all Google models entirely. ## Test plan Not tested yet.	2024-06-17 16:16:32 +00:00
Hitesh Sagtani	d01358ff12	adding deepseek and lang specific mixtral for completions ab experiment (#63283 ) ## Context 1. Adds support for deepseek-coder model and updated Mixtral finetuned-FIM models identifiers hosted on Fireworks. 2. Client side pull request: https://github.com/sourcegraph/cody/pull/4577 ## Test plan ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "accounts/sourcegraph/models/deepseek-coder-7b-base", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false}' -H 'X-sourcegraph-feature: code_completions' ``` ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "accounts/sourcegraph/models/custom-deepseek-1p3b-base-hf-version", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false}' -H 'X-sourcegraph-feature: code_completions' ``` ``` curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-mixtral", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false}' -H 'X-sourcegraph-feature: code_completions' ```	2024-06-17 16:28:24 +05:30
Beatrix	8bf288e153	fix(cody-gateway): Improve prompt and request validation for gemini (#63258 ) This PR aims to fix the issue where Cody Gateway returns an error about the unsupported `Stream` field showing up in the request: ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/9653287f-c53b-4301-9419-a3212db1521f) Changes included: - Moved the google-specific types from google.go to a new google_types.go file - Reorganized the types to improve readability and maintainability - Removed unused fields and methods from the types - Aligned the types with the latest Google API documentation Also fixes an issue with the google provider for BYOK customers where the last assistant message is empty, which is the default format sent from clients (e.g. VS Code): ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/a1941ead-518d-469b-8971-85f42f8b833e) This PR addresses this issue by removing the last assistant message if it's empty during the prompt building step to make it more robust: - Validate that the input messages are not empty - Ensure the first message is a non-empty assistant message - Skip empty assistant messages at the end of the prompt - Disallow the same speaker role in consecutive messages - Add tests for the various validation cases ## Test plan Update Site Config to connect to google API directly in your local instance, and then log into your local instance in VS Code. Verify the responses are correct with no errors. ![Screenshot 2024-06-14 at 10 03 41 AM](https://github.com/sourcegraph/sourcegraph/assets/68532117/d687f9c0-9d25-41bb-9907-e16af53bb09e) --------- Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>	2024-06-14 11:58:42 -07:00
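Some of the validation rules listed in #63258 can be sketched as below. This is not the Cody Gateway implementation: the `message` type, the speaker strings, and `buildPrompt` are hypothetical, and the rule about the first message is deliberately left out because the log does not give enough detail to reproduce it faithfully.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// message is a hypothetical, reduced chat message.
type message struct {
	Speaker string // e.g. "human" or "assistant"
	Text    string
}

// buildPrompt applies three of the checks from the PR description:
// reject empty input, drop a trailing empty assistant message, and
// reject consecutive messages from the same speaker.
func buildPrompt(msgs []message) ([]message, error) {
	if len(msgs) == 0 {
		return nil, errors.New("no messages provided")
	}
	// Clients may append an empty assistant message as a placeholder.
	last := msgs[len(msgs)-1]
	if last.Speaker == "assistant" && strings.TrimSpace(last.Text) == "" {
		msgs = msgs[:len(msgs)-1]
	}
	if len(msgs) == 0 {
		return nil, errors.New("no non-empty messages provided")
	}
	for i := 1; i < len(msgs); i++ {
		if msgs[i].Speaker == msgs[i-1].Speaker {
			return nil, fmt.Errorf("consecutive messages from speaker %q at index %d", msgs[i].Speaker, i)
		}
	}
	return msgs, nil
}

func main() {
	prompt, err := buildPrompt([]message{
		{Speaker: "human", Text: "Who are you?"},
		{Speaker: "assistant", Text: ""},
	})
	fmt.Println(len(prompt), err)
}
```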
Jan Hartman	c59ece1fd2	Use 8B version of llama3 in metadata generation (#63212 ) In offline evals this seems as good as 70b while being a fair bit faster. ## Test plan Tested locally.	2024-06-13 13:40:56 +02:00
Beatrix	e1551657b1	Cody Gateway: Add support for Google non-streaming endpoint (#63166 ) Add support for non-stream request for Google Gemini provider - Added `Stream` field to `googleRequest` struct to enable streaming completions - Added `SymtemInstruction` field to `googleRequest` struct to allow setting system instructions - Updated `GoogleHandlerMethods.validateRequest` to allow `FeatureEmbeddings` instead of `FeatureCodeCompletions` - Updated `GoogleHandlerMethods.getRequestMetadata` to return the `Stream` field - Updated `GoogleGatewayFeatureClient.GetRequest` to handle streaming for both `FeatureCodeCompletions` and `FeatureChatCompletions` - Removed unsupported feature checks in `googleCompletionStreamClient` - Added Gemini 1.5 Flash and Gemini 1.0 Pro to autocomplete allowed list (but not supported by clients atm) ## Test plan Unit tests updated for non-stream request. To manually test this: 1. In your Sourcegraph local instance's Site Config, add the following: ``` "completions": { "accessToken": "REDACTED", "chatModel": "gemini-1.5-pro-latest", "completionModel": "google/gemini-1.5-flash-latest", "provider": "google", ``` Note: You can get the accessToken for Gemini API in 1Password. 2. 
After saving the site config with the above change, run the following curl command that hits the code endpoint: ``` curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \ -X POST \ -H 'authorization: token $YOUR_LOCAL_TOKEN' \ --data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"gemini-1.5-pro-latest"}' ``` Output: ``` ❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \ -X POST \ -H 'authorization: token $YOUR_LOCAL_TOKEN' \ --data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"gemini-1.5-pro-latest"}' HTTP/2 200 access-control-allow-credentials: true access-control-allow-origin: alt-svc: h3=":3443"; ma=2592000 cache-control: no-cache, max-age=0 content-type: text/plain; charset=utf-8 date: Tue, 11 Jun 2024 17:02:19 GMT server: Caddy server: Caddy vary: Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie x-content-type-options: nosniff x-frame-options: DENY x-powered-by: Express x-trace: e11a2ce292639414dd2ccdfcbfa89611 x-trace-span: 9457aa0dd0e09b6c x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/e11a2ce292639414dd2ccdfcbfa89611 x-xss-protection: 1; mode=block content-length: 154 {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* I am a computer program: I","stopReason":"MAX_TOKENS"}% ```	2024-06-11 10:54:27 -07:00
Beatrix	d288874197	Cody Gateway: handle streams with trailing newline in Gemini response (#63172 ) CONTEXT: https://sourcegraph.slack.com/archives/C05ABRRGB0B/p1717790701356599 Fix an issue where a valid Gemini response ends with new lines, causing a false alert. ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/6b69ff2e-b88e-435b-a50e-27eaa5e31bf9) Issue: We are seeing `*errutil.leafError: no Google response found` in Sentry when I run a command or chat in VS Code using google as provider. Cause: Currently the code (added by me) skips to the last line of the stream response to determine whether the response is valid, which is a problem because the stream API can end the response with an empty line, in which case our current logic fails. Changes included in this PR: - Modify `parseGoogleTokenUsage` function to find the last non-empty line in the stream, to handle cases where the stream ends with a newline - Add a test case to cover the scenario where the stream ends with a newline This change modifies the behavior of the `parseGoogleTokenUsage` function to handle streams with trailing newlines, which would fail even if the response is valid because Gemini adds a new line to the end of their stream. ## Test plan Make a curl command to the Gemini streaming API to confirm the response ends with a new line: ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/c0ff0c4a-40dd-4d49-91d3-1b9f01f5c2b9) ```sh curl https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:streamGenerateContent\?alt=sse\&key\=$GOOGLE_API_KEY \ -H 'Content-Type: application/json' \ -X POST \ -d '{ "contents": [ {"role":"user", "parts":[{ "text": "Write the first line of a story about a magic backpack."}]}, {"role": "model", "parts":[{ "text": "In the bustling city of Meadow brook, lived a young girl named Sophie. She was a bright and curious soul with an imaginative mind."}]}, {"role": "user", "parts":[{ "text": "Can you set it in a quiet village in 1600s France?"}]}, ] }' 2> /dev/null ``` Copied the response to the test file and confirmed our current test would fail with the same error message: ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/0485d38f-0d3c-40de-b641-a679b16fd1f4) This is now handled by the newly added test with the actual stream response returned by calling the Gemini API.	2024-06-08 15:54:06 -07:00
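The core of the #63172 fix, scanning backwards for the last non-empty line instead of blindly taking the last line, can be sketched as follows. `lastNonEmptyLine` is a hypothetical helper for illustration; the real `parseGoogleTokenUsage` is not shown in this log, and the sample payload is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// lastNonEmptyLine returns the last line of stream that contains anything
// other than whitespace, and false if there is no such line.
func lastNonEmptyLine(stream string) (string, bool) {
	lines := strings.Split(stream, "\n")
	for i := len(lines) - 1; i >= 0; i-- {
		if line := strings.TrimSpace(lines[i]); line != "" {
			return line, true
		}
	}
	return "", false
}

func main() {
	// A server-sent-events style body that ends with blank lines.
	stream := "data: {\"chunk\":1}\r\n\r\ndata: {\"chunk\":2}\r\n\r\n"
	line, ok := lastNonEmptyLine(stream)
	fmt.Printf("%q %v\n", line, ok)
}
```

Taking `lines[len(lines)-1]` directly would yield an empty string for this input, which is the false alert the commit describes.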
Robert Lin	177bdae83b	chore/cody-gateway: instrument removeUnseenTokens (#63169 ) Sync spans seem to hang for quite a while after getting data from the source - this adds instrumentation on the potential cause, `removeUnseenTokens`, so that we can get a bit more detail from the traces: ![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/76dcd0ca-e316-4f67-ae52-0efca25a8da7) ## Test plan n/a	2024-06-07 14:25:08 -07:00
Beatrix	d0add88218	feat(cody-gateway): add Google Gemini stable models to allowed models (#63163 ) This change adds the stable versions of the Google Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini Pro models to the list of allowed models in the Cody Gateway configuration and API endpoints. The changes are made in the following files: - `cmd/cody-gateway/shared/config/config.go` - `cmd/frontend/internal/dotcom/productsubscription/codygateway_dotcom_user.go` - `cmd/frontend/internal/httpapi/completions/chat.go` - `internal/conf/computed.go` The new models are added to the allowed model lists and also set as the default chat, fast chat, and completion models when the configuration is not explicitly set. This change ensures that users default to the stable versions of the Google Gemini models, while the latest versions remain available for use in the Cody Gateway application. ## Test plan Changes are covered by current tests.	2024-06-07 14:20:22 -07:00
Robert Lin	7e9d8ec8dc	feat/cody-gateway: use Enterprise Portal for actor/productsubscriptions (#62934 ) Migrates Cody Gateway to use the new Enterprise Portal's "read-only" APIs. For the most part, this is an in-place replacement - a lot of the diff is in testing and minor changes. Some changes, such as the removal of model allowlists, were made down the PR stack in https://github.com/sourcegraph/sourcegraph/pull/62911. At a high level, we replace the data requested by `cmd/cody-gateway/internal/dotcom/operations.graphql` with Enterprise Portal RPCs: - `codyaccessv1.GetCodyGatewayAccess` - `codyaccessv1.ListCodyGatewayAccesses` Use cases that previously required retrieving the active license tags now: 1. Use the display name provided by the Cody Access API https://github.com/sourcegraph/sourcegraph/pull/62968 2. Depend on the connected Enterprise Portal dev instance to only return dev subscriptions https://github.com/sourcegraph/sourcegraph/pull/62966 Closes https://linear.app/sourcegraph/issue/CORE-98 Related to https://linear.app/sourcegraph/issue/CORE-135 (https://github.com/sourcegraph/sourcegraph/pull/62909, https://github.com/sourcegraph/sourcegraph/pull/62911) Related to https://linear.app/sourcegraph/issue/CORE-97 ## Local development This change also adds Enterprise Portal to `sg start dotcom`. For local development, we set up Cody Gateway to connect to Enterprise Portal such that zero configuration is needed - all the required secrets are sourced from the `sourcegraph-local-dev` GCP project automatically when you run `sg start dotcom`, and local Cody Gateway will talk to local Enterprise Portal to do the Enterprise subscriptions sync. This is actually an upgrade from the current experience where you need to provide Cody Gateway a Sourcegraph user access token to test Enterprise locally, though the Sourcegraph user access token is still required for the PLG actor source. 
The credential is configured in https://console.cloud.google.com/security/secret-manager/secret/SG_LOCAL_DEV_SAMS_CLIENT_SECRET/overview?project=sourcegraph-local-dev, and I've included documentation in the secret annotation about what it is for and what to do with it: ![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/c61ad4e0-3b75-408d-a930-076a414336fb) ## Rollout plan I will open PRs to set up the necessary configuration for Cody Gateway dev and prod. Once reviews taper down I'll cut an image from this branch and deploy it to Cody Gateway dev, and monitor it closely + do some manual testing. Once verified, I'll land this change and monitor a rollout to production. Cody Gateway dev SAMS client: https://github.com/sourcegraph/infrastructure/pull/6108 Cody Gateway prod SAMS client update (this one already exists): ``` accounts=> UPDATE idp_clients SET scopes = scopes \|\| '["enterprise_portal::subscription::read", "enterprise_portal::codyaccess::read"]'::jsonb WHERE id = 'sams_cid_018ea062-479e-7342-9473-66645e616cbf'; UPDATE 1 accounts=> select name, scopes from idp_clients WHERE name = 'Cody Gateway (prod)'; name \| scopes ---------------------+---------------------------------------------------------------------------------------------------------------------------------- Cody Gateway (prod) \| ["openid", "profile", "email", "offline_access", "enterprise_portal::subscription::read", "enterprise_portal::codyaccess::read"] (1 row) ``` Configuring the target Enterprise Portal instances: https://github.com/sourcegraph/infrastructure/pull/6127 ## Test plan Start the new `dotcom` runset, now including Enterprise Portal, and observe logs from both `enterprise-portal` and `cody-gateway`: ``` sg start dotcom ``` I reused the test plan from https://github.com/sourcegraph/sourcegraph/pull/62911: set up Cody Gateway external dependency secrets, then set up an enterprise subscription + license with a high seat count (for a high quota), and force a Cody 
Gateway sync: ``` curl -v -H 'Authorization: bearer sekret' http://localhost:9992/-/actor/sync-all-sources ``` This should indicate the new sync against "local dotcom" fetches the correct number of actors and whatnot. Using the local enterprise subscription's access token, we run the QA test suite: ```sh $ bazel test --runs_per_test=2 --test_output=all //cmd/cody-gateway/qa:qa_test --test_env=E2E_GATEWAY_ENDPOINT=http://localhost:9992 --test_env=E2E_GATEWAY_TOKEN=$TOKEN INFO: Analyzed target //cmd/cody-gateway/qa:qa_test (0 packages loaded, 0 targets configured). INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 1 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 1 of 2): PASS ================================================================================ INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 2 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 2 of 2): PASS ================================================================================ INFO: Found 1 test target... Target //cmd/cody-gateway/qa:qa_test up-to-date: bazel-bin/cmd/cody-gateway/qa/qa_test_/qa_test Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) INFO: Elapsed time: 13.653s, Critical Path: 13.38s INFO: 7 processes: 1 internal, 6 darwin-sandbox. INFO: Build completed successfully, 7 total actions //cmd/cody-gateway/qa:qa_test PASSED in 11.7s Stats over 2 runs: max = 11.7s, min = 11.7s, avg = 11.7s, dev = 0.0s Executed 1 out of 1 test: 1 test passes. ```	2024-06-07 11:46:01 -07:00
Jan Hartman	18bdafac78	Cody Gateway embeddings: powering with generated metadata - take 2 (#63112 ) Reverts sourcegraph/sourcegraph#63098 and fixes the problems with conf loading and Fireworks API responses. ## Test plan Try running locally and then with feature flag.	2024-06-07 12:33:11 +02:00
James McNamara	4077b3ec22	feat(ci): Adds playwright tests for sveltekit to bazel (#62560 ) This runs playwright tests with bazel. This changes how the app is served in the tests, specifically playwright will intercept all network calls to the local server and serve the static assets directly or serve root index.html file if nothing is matched. --------- Co-authored-by: bahrmichael <michael.bahr@sourcegraph.com> Co-authored-by: Jean-Hadrien Chabran <jh@chabran.fr> Co-authored-by: Michael Bahr <1830132+bahrmichael@users.noreply.github.com> Co-authored-by: Jean-Hadrien Chabran <jean-hadrien.chabran@sourcegraph.com> Co-authored-by: Camden Cheek <camden@ccheek.com>	2024-06-06 12:45:05 -06:00
Varun Gandhi	2955bb6cfb	chore: Change errors.HasType to respect multi-errors (#63024 ) With this patch, the `errors.HasType` API behaves similar to `Is` and `As`, where it checks the full error tree instead of just checking a linearized version of it, as cockroachdb/errors's `HasType` implementation does not respect multi-errors. As a consequence, a bunch of relationships between HasType and Is/As that you'd intuitively expect to hold are now true; see changes to `invariants_test.go`.	2024-06-06 13:02:14 +00:00
Rafał Gajdulewicz	08a1c6a6f6	Revert "Cody Gateway embeddings: powering with generated metadata" (#63098 ) Reverts sourcegraph/sourcegraph#63000 - this makes Cody Gateway hang (and fail to listen on 9992 port).	2024-06-05 14:56:03 +02:00
Jan Hartman	4327bf8fc1	Cody Gateway embeddings: powering with generated metadata (#63000 ) Connected to https://github.com/sourcegraph/bfg-private/pull/189 and https://github.com/sourcegraph/cody/pull/4414. We're introducing a hacky MVP to enable embeddings being powered by metadata that's generated from code. This PR is the bare minimum to make this work on CG. We plan to trigger metadata generation only if we're using a new (fake) model (this comes in via a feature flag) and if the request isn't a real-time query, but is a background indexing request. The implementation is really hacky, but is also really minimal. ## Test plan Testing locally through a feature flag.	2024-06-05 13:33:10 +02:00
Beatrix	f2590cbb36	Cody Gateway: Add Gemini models to PLG and Enterprise users (#63053 ) CLOSE https://github.com/sourcegraph/cody-issues/issues/211 & https://github.com/sourcegraph/cody-issues/issues/412 UNBLOCK https://github.com/sourcegraph/cody/pull/4360 * Add support for Google Gemini AI models as chat completions provider * Add new `google` package to handle Google Generative AI client * Update `client.go` and `codygateway.go` to handle the new Google provider * Set default models for chat, fast chat, and completions when Google is the configured provider * Add gemini-pro to the allowed list ## Test plan For Enterprise instances using google as provider: 1. In your Sourcegraph local instance's Site Config, add the following: ``` "accessToken": "REDACTED", "chatModel": "gemini-1.5-pro-latest", "provider": "google", ``` Note: You can get the accessToken for Gemini API in 1Password. 2. 
After saving the site config with the above change, run the following curl command: ``` curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \ -X POST \ -H 'authorization: token $LOCAL_INSTANCE_TOKEN' \ --data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":true,"model":"gemini-1.5-pro-latest"}' ``` 3. Expected Output: ``` ❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \ -X POST \ -H 'authorization: token <REDACTED>' \ --data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":true,"model":"gemini-1.5-pro-latest"}' HTTP/2 200 access-control-allow-credentials: true access-control-allow-origin: alt-svc: h3=":3443"; ma=2592000 cache-control: no-cache content-type: text/event-stream date: Tue, 04 Jun 2024 05:45:33 GMT server: Caddy server: Caddy vary: Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie x-accel-buffering: no x-content-type-options: nosniff x-frame-options: DENY x-powered-by: Express x-trace: d4b1f02a3e2882a3d52331335d217b03 x-trace-span: 728ec33860d3b5e6 x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/d4b1f02a3e2882a3d52331335d217b03 x-xss-protection: 1; mode=block event: completion data: {"completion":"I","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. \n\nThink of me as","stopReason":"STOP"} event: completion data: {"completion":"I am a large language model, trained by Google. 
\n\nThink of me as a computer program that can understand and generate human-like text.","stopReason":"MAX_TOKENS"} event: done data: {} ``` Verified locally: ![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/2e6c914d-7a77-4484-b693-16bbc394518c) #### Before Cody Gateway returns `no client known for upstream provider google` ```sh curl -X 'POST' -d '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":true,"model":"google/gemini-1.5-pro-latest"}' -H 'Accept: application/json' -H 'Authorization: token $YOUR_DOTCOM_TOKEN' -H 'Content-Type: application/json' 'https://sourcegraph.com/.api/completions/stream' event: error data: {"error":"no client known for upstream provider google"} event: done data: { ``` ## Changelog Added support for Google as an LLM provider for Cody, with the following models available through Cody Gateway: Gemini Pro (`gemini-pro-latest`), Gemini 1.5 Flash (`gemini-1.5-flash-latest`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`).	2024-06-04 23:46:36 +00:00
Robert Lin	f952ceb8da	feat/cody-gateway: use wildcard for enterprise allowlists (#62911 ) This change makes Cody Gateway always apply a wildcard model allowlist, irrespective of the model allowlist configured for an Enterprise subscription in dotcom (see #62909). The next PR in the stack, https://github.com/sourcegraph/sourcegraph/pull/62912, makes the GraphQL queries return similar results, and removes model allowlists from the subscription management UI. Closes https://linear.app/sourcegraph/issue/CORE-135 ### Context In https://sourcegraph.slack.com/archives/C05SZB829D0/p1715638980052279 we shared a decision we landed on as part of #62263: > Ignoring (then removing) per-subscription model allowlists: As part of the API discussions, we've also surfaced some opportunities for improvements - to make it easier to roll out new models to Enterprise, we're not including per-subscription model allowlists in the new API, and as part of the Cody Gateway migration (by end-of-June), we will update Cody Gateway to stop enforcing per-subscription model allowlists. Cody Gateway will still retain a Cody-Gateway-wide model allowlist. [@chrsmith](https://sourcegraph.slack.com/team/U061QHKUBJ8) is working on a broader design here and will have more to share on this later. This means there is one less thing for us to migrate as part of https://github.com/sourcegraph/sourcegraph/pull/62934, and avoids the need to add an API field that will be removed shortly post-migration. As part of this, rolling out new models to Enterprise customers no longer requires additional code/override changes. ## Test plan Set up Cody Gateway locally as documented, then `sg start dotcom`. 
Set up an enterprise subscription + license with a high seat count (for a high quota), and force a Cody Gateway sync: ``` curl -v -H 'Authorization: bearer sekret' http://localhost:9992/-/actor/sync-all-sources ``` Verify we are using wildcard allowlist: ```sh $ redis-cli -p 6379 get 'v2:product-subscriptions:v2:slk_...' "{\"key\":\"slk_...\",\"id\":\"6ad033f4-c6da-43a9-95ef-f653bf59aaac\",\"name\":\"bobheadxi\",\"accessEnabled\":true,\"endpointAccess\":{\"/v1/attribution\":true},\"rateLimits\":{\"chat_completions\":{\"allowedModels\":[\"*\"],\"limit\":660,\"interval\":86400000000000,\"concurrentRequests\":330,\"concurrentRequestsInterval\":10000000000},\"code_completions\":{\"allowedModels\":[\"*\"],\"limit\":66000,\"interval\":86400000000000,\"concurrentRequests\":33000,\"concurrentRequestsInterval\":10000000000},\"embeddings\":{\"allowedModels\":[\"*\"],\"limit\":220000000,\"interval\":86400000000000,\"concurrentRequests\":110000000,\"concurrentRequestsInterval\":10000000000}},\"lastUpdated\":\"2024-05-24T20:28:58.283296Z\"}" ``` Using the local enterprise subscription's access token, we run the QA test suite: ```sh $ bazel test --runs_per_test=2 --test_output=all //cmd/cody-gateway/qa:qa_test --test_env=E2E_GATEWAY_ENDPOINT=http://localhost:9992 --test_env=E2E_GATEWAY_TOKEN=$TOKEN INFO: Analyzed target //cmd/cody-gateway/qa:qa_test (0 packages loaded, 0 targets configured). INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 1 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 1 of 2): PASS ================================================================================ INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 2 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 2 of 2): PASS ================================================================================ INFO: Found 1 test target... 
Target //cmd/cody-gateway/qa:qa_test up-to-date: bazel-bin/cmd/cody-gateway/qa/qa_test_/qa_test Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) INFO: Elapsed time: 13.653s, Critical Path: 13.38s INFO: 7 processes: 1 internal, 6 darwin-sandbox. INFO: Build completed successfully, 7 total actions //cmd/cody-gateway/qa:qa_test PASSED in 11.7s Stats over 2 runs: max = 11.7s, min = 11.7s, avg = 11.7s, dev = 0.0s Executed 1 out of 1 test: 1 test passes. ```	2024-06-04 22:29:20 +00:00
Chris Smith	c4b5c73260	feat(cody-gateway): Add FLAGGED_MODEL_NAMES check (#63013 ) * Cody Gateway: Add FLAGGED_MODEL_NAMES check * Update cmd/cody-gateway/internal/httpapi/completions/flagging.go Co-authored-by: Quinn Slack <quinn@slack.org> --------- Co-authored-by: Quinn Slack <quinn@slack.org>	2024-05-31 20:12:27 +00:00
Robert Lin	5833a98185	feat/cody-gateway: support wildcard models (#62909 ) In https://sourcegraph.slack.com/archives/C05SZB829D0/p1715638980052279 we shared a decision we landed on as part of #62263: > Ignoring (then removing) per-subscription model allowlists: As part of the API discussions, we've also surfaced some opportunities for improvements - to make it easier to roll out new models to Enterprise, we're not including per-subscription model allowlists in the new API, and as part of the Cody Gateway migration (by end-of-June), we will update Cody Gateway to stop enforcing per-subscription model allowlists. Cody Gateway will still retain a Cody-Gateway-wide model allowlist. [@chrsmith](https://sourcegraph.slack.com/team/U061QHKUBJ8) is working on a broader design here and will have more to share on this later. To support this, we first need to extend Cody Gateway's model allowlist enforcement to respect a notion of "allow all models that are allowed in Cody Gateway". To ensure models are explicitly provided today, an empty `AllowedModels` is considered invalid, so we add a special single-element-slice-`*` configuration that can be used to indicate an actor's rate limit allows all models (`prefixedMasterAllowlist`). This change also somewhat unifies the way we enforce allowed models in various places by introducing `(RateLimit).EvaluateAllowedModels(...)` as the unified way to construct the final allowlist for a given rate limit. I'm planning to roll this out before rolling out actual functionality changes (https://github.com/sourcegraph/sourcegraph/pull/62911) to ensure changes in cached rate limits don't end up confusing an older revision of Cody Gateway that doesn't yet support wildcard models. With #62911, rolling out new models to Enterprise customers no longer requires additional code/override changes. 
Part of https://linear.app/sourcegraph/issue/CORE-135 ## Test plan Unit tests, and E2E test of this in https://github.com/sourcegraph/sourcegraph/pull/62911	2024-05-31 13:09:01 -07:00


214 Commits