Commit Graph

214 Commits

Author SHA1 Message Date
Julie Tibshirani
c222523fa5
Redis: remove some direct pool usages (#64447)
We want to discourage direct usage of the Redis pool in favor of routing
all calls through the main `KeyValue` interface. This PR removes several
usages of `KeyValue.Pool`. To do so, it adds "PING" and "MGET" to the
`KeyValue` interface.
2024-08-14 13:39:10 +03:00
Julie Tibshirani
ca6e72fe18
Redis: remove RedisKeyValue constructor (#64442)
This PR removes the `redispool.RedisKeyValue` constructor in favor of
the `New...KeyValue` methods, which do not take a pool directly. This
way callers won't create a `Pool` reference, allowing us to track all
direct pool usage through `KeyValue.Pool()`.

This also simplifies a few things:
* Tests now use `NewTestKeyValue` instead of dialing up localhost
directly
* We can remove duplicated Redis connection logic in Cody Gateway
2024-08-14 11:24:32 +03:00
Robert Lin
cec288dc89
fix/enterpriseportal, fix/codygateway: zero-value durations and missing active licenses (#64378)
This change ensures we correctly handle:

1. In Enterprise Portal, where no active license is available, we return
ratelimit=0 intervalduration=0, from the source `PLAN` (as this is
determined by the lack of a plan)
2. In Cody Gateway, where intervalduration=0, we do not grant access to
that feature
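
A minimal sketch of the zero-value check described in item 2, assuming a simplified `rateLimit` type and a `grantsAccess` helper (both hypothetical names, not the actual Cody Gateway types):

```go
package main

import (
	"fmt"
	"time"
)

// rateLimit is a hypothetical, simplified stand-in for a feature's rate limit.
type rateLimit struct {
	Limit    int64
	Interval time.Duration
}

// grantsAccess reports whether the rate limit grants any access.
// A zero interval means no access, as described above; treating a
// non-positive limit the same way is an assumption of this sketch.
func grantsAccess(rl rateLimit) bool {
	return rl.Limit > 0 && rl.Interval > 0
}

func main() {
	noLicense := rateLimit{} // ratelimit=0, intervalduration=0
	active := rateLimit{Limit: 1000, Interval: 24 * time.Hour}
	fmt.Println(grantsAccess(noLicense), grantsAccess(active))
}
```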

## Test plan

Unit tests
2024-08-12 11:49:54 -07:00
Noah S-C
b9c4e2aae9
Revert "Revert "refactor: upgrade to rules_oci 2.0 (2nd attempt)"" (#64354)
Reverts sourcegraph/sourcegraph#64351

## Test plan

Need to test on main due to main-only CI steps (even with main dry-run)
2024-08-08 09:00:08 +00:00
Noah S-C
addba96f47
Revert "refactor: upgrade to rules_oci 2.0 (2nd attempt)" (#64351)
Reverts sourcegraph/sourcegraph#63829

Not working with Aspect Delivery

## Test plan

CI
2024-08-07 22:15:21 +00:00
Greg Magolan
be015c58c2
refactor: upgrade to rules_oci 2.0 (2nd attempt) (#63829)
2nd attempt of #63111, a follow up
https://github.com/sourcegraph/sourcegraph/pull/63085

rules_oci 2.0 brings a lot of performance improvement around oci_image
and oci_pull, which will benefit Sourcegraph. It will also make RBE
faster and have less load on remote cache.

However, 2.0 makes some breaking changes like

- oci_tarball's default output is no longer a tarball
- oci_image no longer compresses layers that are uncompressed, somebody
has to make sure all `pkg_tar` targets have a `compression` attribute
set to compress it beforehand.
- there is no curl fallback, but this is fine for sourcegraph as it
already uses bazel 7.1.

I checked all targets that use oci_tarball as thoroughly as I could to make
sure nothing depends on the default tarball output of oci_tarball. There
was one target that used the default output, for which I added a TODO for
somebody else (somebody who is more on top of the repo) to tackle
**later**.

## Test plan

CI. Also run delivery on this PR (don't land those changes)

---------

Co-authored-by: Noah Santschi-Cooney <noah@santschi-cooney.ch>
2024-08-07 22:21:49 +01:00
Robert Lin
eedc12e789
feat/enterpriseportal: import data from dotcom (#63858)
Adds a background job that can periodically import subscriptions,
licenses, and Cody Gateway access from dotcom.

Note that subscriptions and licenses cannot be deleted, so we don't need
to worry about that. Additionally licenses cannot be updated, so we only
need to worry about creation and revocation.

The importer can be configured with `DOTCOM_IMPORT_INTERVAL` - if zero,
the importer is disabled.
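
As a rough illustration of the "zero disables the importer" rule, the sketch below parses a `DOTCOM_IMPORT_INTERVAL`-style value. The `importerConfig` helper is hypothetical, and treating an empty value as disabled is an assumption:

```go
package main

import (
	"fmt"
	"time"
)

// importerConfig parses an interval string such as "10s" and reports the
// interval and whether the importer should run. A zero interval disables it.
func importerConfig(raw string) (interval time.Duration, enabled bool, err error) {
	if raw == "" {
		// Assumption: an unset value behaves like zero.
		return 0, false, nil
	}
	interval, err = time.ParseDuration(raw)
	if err != nil {
		return 0, false, fmt.Errorf("invalid import interval %q: %w", raw, err)
	}
	return interval, interval > 0, nil
}

func main() {
	for _, raw := range []string{"10s", "0", ""} {
		interval, enabled, err := importerConfig(raw)
		fmt.Printf("raw=%q interval=%s enabled=%t err=%v\n", raw, interval, enabled, err)
	}
}
```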

Closes https://linear.app/sourcegraph/issue/CORE-216
## Test plan

```
DOTCOM_IMPORT_INTERVAL=10s sg start dotcom
```

Look for `service.importer` logs. Play around in
https://sourcegraph.test:3443/site-admin/dotcom/product/subscriptions/
to create and edit subscriptions, licenses, and Cody Gateway access.
Watch them show up in the database:

```
psql -d sourcegraph
sourcegraph# select * from enterprise_portal_subscriptions;
sourcegraph# select * from enterprise_portal_subscription_licenses;
sourcegraph# select * from enterprise_portal_cody_gateway_access;
```

---------

Co-authored-by: James Cotter <35706755+jac@users.noreply.github.com>
2024-08-07 11:44:18 -07:00
Taras Yemets
d19aa106f9
feat(cody): add circuit breaker to handle timed-out requests and rate limit hits (#64133)
Closes
https://linear.app/sourcegraph/issue/CODY-2758/[autocomplete-latency]-add-circuit-breaker-in-cody-gateway-to-handle

This PR introduces an in-memory model availability tracker to ensure we
do not send subsequent requests to currently unavailable LLM
providers.

The tracker maintains a history of error records for each model. It
evaluates these records to determine whether the model is available for
new requests. The evaluation follows these steps:
1. For every request to an upstream provider, the tracker records any
errors that occur. Specifically, it logs timeout errors (when a request
exceeds its deadline) and responses with a 429 status code (Too Many
Requests).
2. These error records are stored in a circular buffer for each model.
This buffer holds a fixed number of records, ensuring efficient memory
usage.
3. The tracker calculates the failure ratio by analyzing the stored
records. It checks the percentage of errors within a specified
evaluation window and compares this against the total number of recent
requests.
4. Based on the calculated failure ratio, the tracker decides whether
the model is available:
- Model Unavailable: If the ratio of failures (timeouts or 429 status
codes) exceeds a predefined threshold (X%), the model is marked as
unavailable. In this state, the system does not send new requests to the
upstream provider.
When a model is unavailable, the system immediately returns an error
status code, typically a 503 Service Unavailable, to the client. This
informs the client that the service is temporarily unavailable due to
upstream issues.
- Model Available: If the failure ratio is within acceptable limits, the
system proceeds with sending the request to the upstream provider.

This PR suggests considering a model unavailable if **95% of the last
100 requests within the past minute** either time out or return a 429
status code. I am not sure about these exact values and suggest them as
a starting point for discussion.
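
The evaluation steps above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the `tracker` type, its method names, and the `>=` comparison against the threshold are assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// evaluationWindow mirrors the "past minute" suggested above.
const evaluationWindow = time.Minute

// record is one observed request outcome for a model.
type record struct {
	at     time.Time
	failed bool // true for a timeout or a 429 response
}

// tracker keeps a fixed-size circular buffer of recent outcomes for one model.
type tracker struct {
	mu        sync.Mutex
	records   []record
	next      int // index of the slot to overwrite next
	filled    int // number of slots holding real data
	window    time.Duration
	threshold float64 // failure ratio at which the model is unavailable
}

func newTracker(size int, window time.Duration, threshold float64) *tracker {
	return &tracker{records: make([]record, size), window: window, threshold: threshold}
}

// Record stores an outcome, overwriting the oldest entry once the buffer is full.
func (t *tracker) Record(failed bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.records[t.next] = record{at: time.Now(), failed: failed}
	t.next = (t.next + 1) % len(t.records)
	if t.filled < len(t.records) {
		t.filled++
	}
}

// Available reports whether the failure ratio inside the window is below the threshold.
func (t *tracker) Available() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := time.Now()
	var total, failures int
	for i := 0; i < t.filled; i++ {
		r := t.records[i]
		if now.Sub(r.at) > t.window {
			continue // too old to count
		}
		total++
		if r.failed {
			failures++
		}
	}
	if total == 0 {
		return true // no recent data: assume the model is available
	}
	return float64(failures)/float64(total) < t.threshold
}

func main() {
	tr := newTracker(100, evaluationWindow, 0.95)
	for i := 0; i < 100; i++ {
		tr.Record(true)
	}
	fmt.Println("available:", tr.Available())
}
```

In this sketch a single failed request with no other recent traffic already trips the breaker; a real tracker would probably also want a minimum sample size before marking a model unavailable.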

## Test plan
- Added unit tests
- CI
2024-08-05 11:38:58 +00:00
Beatrix
468a01a3ab
cody-gateway: handle missing Google response (#63895)
- Logs a warning instead of returning an error when the Google response
is missing
- This prevents the API from returning an error when the Google response
is empty, which can happen when Google is not happy with the question
due to safety issues. We will only log a decoder error as error.


## Test plan


Only the log level was updated for empty responses. The function
behavior was not changed.

## Changelog


cody-gateway: log missing Google response as warning
2024-07-24 17:58:19 +00:00
Rafał Gajdulewicz
8bd9c5d1a4
Add counter for traced requests to Fireworks (#63953)
Adds an OTel counter to measure how many traced requests we sent to
Fireworks from Cody Gateway -
[context](https://sourcegraph.slack.com/archives/C0729T2PBV2/p1721385268801079?thread_ts=1721029684.522409&cid=C0729T2PBV2)

## Test plan

- tested locally by sending a request without the `X-Fireworks-Genie`
header, with the header but with value != `true` and with the header and
value `true` and observing `fireworks-traced-requests` Prometheus metric
2024-07-19 14:32:19 +00:00
Taras Yemets
26df35a69f
fix(cody): use client-provided timeout for completions requests (#63875)
Closes
[CODY-2775](https://linear.app/sourcegraph/issue/CODY-2775/%5Bautocomplete-latency%5D-apply-the-same-timeout-on-the-cody-gateway-side)



Enables client control over the request processing timeout on the server
(both Sourcegraph backend and Cody Gateway). The context timeout is set
to the value provided in the `X-Timeout-Ms` header of the client
request. If the header is not provided, the default context timeout is
used (1 minute on both Sourcegraph backend and Cody Gateway).

Previously, we only had a default timeout on the Sourcegraph backend
side (8 minutes).
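
A minimal sketch of deriving the context timeout from the `X-Timeout-Ms` header, under stated assumptions: the helper names are hypothetical, and falling back to the default on malformed or non-positive values is a guess rather than confirmed behavior.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// defaultTimeout is the fallback used when the client sends no usable value.
const defaultTimeout = time.Minute

// timeoutFromHeaderValue converts a raw X-Timeout-Ms value into a duration.
func timeoutFromHeaderValue(raw string) time.Duration {
	ms, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || ms <= 0 {
		// Covers a missing header (empty string) as well as invalid values.
		return defaultTimeout
	}
	return time.Duration(ms) * time.Millisecond
}

// withRequestTimeout returns a context bounded by the client-provided timeout.
func withRequestTimeout(r *http.Request) (context.Context, context.CancelFunc) {
	timeout := timeoutFromHeaderValue(r.Header.Get("X-Timeout-Ms"))
	return context.WithTimeout(r.Context(), timeout)
}

func main() {
	// The URL is a placeholder; no request is actually sent.
	req, err := http.NewRequest(http.MethodPost, "http://localhost/completions", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("X-Timeout-Ms", "1500")

	ctx, cancel := withRequestTimeout(req)
	defer cancel()
	_, hasDeadline := ctx.Deadline()
	fmt.Println(timeoutFromHeaderValue("1500"), hasDeadline)
}
```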

Corresponding client change:
- https://github.com/sourcegraph/cody/pull/4921



## Test plan
- Manually tested and confirmed that if the request contains the
`X-Timeout-Ms` header, its value is used. If not, the default maximum
request duration is applied.
- CI


## Changelog
- Use the provided timeout from request parameters if available;
otherwise use the default maximum request duration (8 minutes)

2024-07-19 14:17:02 +00:00
Kevin Chen
59ec1e034e
Update flagging.go
2024-07-16 07:15:40 -07:00
Kevin Chen
9d8140ee5f
updated error messaging for blocked requests
2024-07-16 07:15:40 -07:00
Hitesh Sagtani
660d6866b5
change model identifier for finetuned deepseek model (#63817)
## Context
1. Change model identifier for deepseek-coder-v2 model for fine-tuned
models.

## Test plan
```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-stack-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions'
```

```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-logs-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions'
```
2024-07-14 13:47:50 +00:00
Chris Smith
02c07df176
feat/cody: Refactor completions API to use new modelconfig (support more models) (#63797)
This PR is what the past dozen or so
[cleanup](https://github.com/sourcegraph/sourcegraph/pull/63359),
[refactoring](https://github.com/sourcegraph/sourcegraph/pull/63731),
and [test](https://github.com/sourcegraph/sourcegraph/pull/63761) PRs
were all about: using the new `modelconfig` system for the completion
APIs.

This will enable users to:

- Use the new site config schema for specifying LLM configuration, added
in https://github.com/sourcegraph/sourcegraph/pull/63654. Sourcegraph
admins who use these new site config options will be able to support
many more LLM models and providers than is possible using the older
"completions" site config.
- For Cody Enterprise users, we no longer ignore the
`CodyCompletionRequest.Model` field. And now support users specifying
any LLM model (provided it is "supported" by the Sourcegraph instance).

Beyond those two things, everything should continue to work as before,
with any existing "completions" configuration data being converted into
the `modelconfig` system (see
https://github.com/sourcegraph/sourcegraph/pull/63533).

## Overview

In order to understand how this all fits together, I'd suggest reviewing
this PR commit-by-commit.

### [Update internal/completions to use
modelconfig](e6b7eb171e)

The first change was to update the code we use to serve LLM completions.
(Various implementations of the `types.CompletionsProvider` interface.)

The key changes here were as follows:

1. Update the `CompletionRequest` type to include the `ModelConfigInfo`
field (to make the new Provider and Model-specific configuration data
available.)
2. Rename the `CompletionRequest.Model` field to
`CompletionRequest.RequestedModel`. (But with a JSON annotation to
maintain compatibility with existing callers.) This is to catch any bugs
related to using the field directly, since that is now almost guaranteed
to be a mistake. (See below.)

With these changes, all of the `CompletionProvider`s were updated to
reflect these changes.

- Any situation where we used the
`CompletionRequest.Parameters.RequestedModel` should now refer to
`CompletionRequest.ModelConfigInfo.Model.ModelName`. The "model name"
being the thing that should be passed to the API provider, e.g.
`gpt-3.5-turbo`.
- In some situations (`azureopenai`) we needed to rely on the Model ID
as a more human-friendly identifier. This isn't 100% accurate, but will
match the behavior we have today. A long doc comment calls out the
details of what is wrong with that.
- In other situations (`awsbedrock`, `azureopenai`) we read the new
`modelconfig` data to configure the API provider (e.g.
`Azure.UseDeprecatedAPI`), or surface model-specific metadata (e.g. AWS
Provisioned Throughput ARNs). While the code is a little clunky to avoid
larger refactoring, this is the heart and soul of how we will be writing
new completion providers in the future. That is, taking specific
configuration bags with whatever data that is required.

### [Fix bugs in
modelconfig](75a51d8cb5)

While we had lots of tests for converting the existing "completions"
site config data into the `modelconfig.ModelConfiguration` structure,
there were a couple of subtle bugs that I found while testing the larger
change.

The updated unit tests and comments should make that clear.

### [Update frontend/internal/httpapi/completions to use
modelconfig](084793e08f)

The final step was to update the HTTP endpoints that serve the
completion requests. There weren't any logic changes here, just
refactoring how we lookup the required data. (e.g. converting the user's
requested model into an actual model found in the site configuration.)

We support Cody clients sending either "legacy mrefs" of the form
`provider/model` like before, or the newer mref
`provider::apiversion::model`. Although it will likely be a while before
Cody clients are updated to only use the newer-style model references.

The existing unit tests for the completions APIs just worked, which was
the plan. But for the few changes that were required I've added comments
to explain the situation.

### [Fix: Support requesting models just by their
ID](99715feba6)

> ... We support Cody clients sending either "legacy mrefs" of the form
`provider/model` like before ...

Yeah, so apparently I lied 😅 . After doing more testing, the extension
_also_ sends requests where the requested model is just `"model"`.
(Without the provider prefix.)

So that now works too. And we just blindly match "gpt-3.5-turbo" to the
first mref with the matching model ID, such as
"anthropic::unknown::gpt-3.5-turbo".
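
The three accepted request forms (newer `provider::apiversion::model`, legacy `provider/model`, and a bare model ID) could be handled along these lines. This is a hedged sketch: `modelRef`, `parseModelRef`, and `resolve` are hypothetical names, and the real `modelconfig` types differ.

```go
package main

import (
	"fmt"
	"strings"
)

// modelRef is a hypothetical, simplified model reference.
type modelRef struct {
	Provider   string
	APIVersion string
	Model      string
}

// parseModelRef accepts "provider::apiversion::model", "provider/model",
// or a bare "model" and fills in whichever parts are present.
func parseModelRef(s string) (modelRef, error) {
	if s == "" {
		return modelRef{}, fmt.Errorf("empty model reference")
	}
	if strings.Contains(s, "::") {
		parts := strings.Split(s, "::")
		if len(parts) != 3 {
			return modelRef{}, fmt.Errorf("malformed model reference %q", s)
		}
		return modelRef{Provider: parts[0], APIVersion: parts[1], Model: parts[2]}, nil
	}
	if provider, model, ok := strings.Cut(s, "/"); ok {
		return modelRef{Provider: provider, Model: model}, nil
	}
	return modelRef{Model: s}, nil
}

// resolve returns the first supported model matching every part the request specified.
func resolve(requested modelRef, supported []modelRef) (modelRef, bool) {
	for _, m := range supported {
		if m.Model != requested.Model {
			continue
		}
		if requested.Provider != "" && m.Provider != requested.Provider {
			continue
		}
		if requested.APIVersion != "" && m.APIVersion != requested.APIVersion {
			continue
		}
		return m, true
	}
	return modelRef{}, false
}

func main() {
	supported := []modelRef{{Provider: "openai", APIVersion: "unknown", Model: "gpt-3.5-turbo"}}
	for _, raw := range []string{"openai::unknown::gpt-3.5-turbo", "openai/gpt-3.5-turbo", "gpt-3.5-turbo"} {
		ref, err := parseModelRef(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		match, ok := resolve(ref, supported)
		fmt.Println(raw, "->", match, ok)
	}
}
```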

## Test plan

Existing unit tests pass, added a few tests. And manually tested my Sg
instance configured to act as both "dotcom" mode and a prototypical Cody
Enterprise instance.

## Changelog

Update the Cody APIs for chat or code completions to use the "new style"
model configuration. This allows for greater flexibility in configuring
LLM providers and exposing new models, and also allows Cody Enterprise
users to select different models for chats.

This will warrant a longer, more detailed changelog entry for the patch
release next week, as this unlocks many other exciting features.
2024-07-12 12:15:31 -07:00
Erik Seliger
a32b6131f3
codygateway: Use only one redis pool and make REDIS_ENDPOINT a clear requirement in config (#63625)
Currently, nothing clearly indicates that Cody Gateway needs Redis: the env
var for finding the address is hidden somewhere deep in the redispool
package.
In practice, we only use one redis instance, but at some point we
started using both redispool.Cache and redispool.Store, which means we
maintain two connection pools, leading to more than expected
connections.

Test plan:

Code review and CI.
2024-07-10 01:54:24 +02:00
Erik Seliger
169db11ce6
rcache: Explicitly pass redis pool to use (#63644)
Recently, this was refactored to also allow using the redispool.Store.
However, that makes it implicit where something is being written, so
instead we pass down the pool instance at instantiation.

This also gives a slightly better overview of where redispool is
actually required.

Test plan: CI passes.
2024-07-10 01:23:19 +02:00
Hitesh Sagtani
eb16d802a3
adding deepseek-v2 and deepseek fine-tuned model trained on symbol graph context (#63702)
## Context
1. Adds support for the deepseek-coder-v2 model and for fine-tuned models
based on deepseek coder.

## Test plan
```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-stack-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions'
```

```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-deepseek-logs-trained", "stop_sequences": ["\n\n"], "prompt": "const value = ", "languageId": "python"}' -H 'X-sourcegraph-feature: code_completions'
```
2024-07-09 17:57:16 +05:30
Robert Lin
fcfdba7e7b
fix/codygateway: tweak enterprise-portal dial options (#63692)
As titled - we ran into 1 occurrence of a sync failure again:
https://sourcegraph.slack.com/archives/C076472745A/p1720278447544509

## Test plan

n/a
2024-07-08 13:21:26 -07:00
Beatrix
806ff434a6
feat(cody-gateway): add support for Gemini models with context cache (#63413)
PART OF https://linear.app/sourcegraph/issue/CODY-2451
CLOSE https://linear.app/sourcegraph/issue/CODY-2513

- Add Gemini 1.5 Flash 001 and Gemini 1.5 Pro 001 models to the config
and allowed models lists
- These fixed stable versions support context caching, as noted in the
[Google Gemini API
docs](https://ai.google.dev/gemini-api/docs/caching?lang=node)


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/1857c853-7c8d-4446-a991-4bc6a39e6065)


NEXT: Implement context caching in the codebase. Right now, using the
newly added models alone does not enable context caching.



## Test plan


No feature changes. Adding new model to allow list.

2024-07-03 09:28:13 -07:00
Robert Lin
ad03371193
fix/cody-gateway: use keepalive/idle timeout options for Enterprise Portal (#63605)
See https://linear.app/sourcegraph/issue/CORE-203 - it seems the default
keepalive and idle options are quite aggressive about keeping idle
connections around without verifying them. This change tries to
configure some options to ensure idle connections aren't retained for a
long time.

## Test plan

`sg start cody-gateway` with `CODY_GATEWAY_ENTERPRISE_PORTAL_URL:
https://enterprise-portal.sgdev.org:443` in override

```
[   cody-gateway] INFO cody-gateway.sources.worker.handler actor/source.go:154 Completed sync {"TraceId": "b30602af4f6f3269ed438c86ca37edcc", "SpanId": "154e7f8114b27cab", "handle.timeout": "2m0s", "source": "dotcom-product-subscriptions", "sync_duration": "800.20575ms", "seen": 161}
[   cody-gateway] INFO cody-gateway.sources.worker.handler actor/source.go:165 All sources synced {"TraceId": "b30602af4f6f3269ed438c86ca37edcc", "SpanId": "8f226cc9f0159b70", "handle.timeout": "2m0s"}
```
2024-07-03 08:40:50 -07:00
Robert Lin
5f37089303
chore/codygatewayevents: extract into standalone package for reuse, split up internal/codygateway (#63528)
Allows us to directly reuse the Cody Gateway usage queries so that they
can be served directly from Enterprise Portal
(https://github.com/sourcegraph/sourcegraph/pull/63531). To enable this
we also need to split up the monolithic `internal/codygateway` package
so that not all roads lead back to `conf`:

- `internal/codygateway`: Client mechanisms
- `internal/codygateway/codygatewayevents`: Cody Gateway events service
+ related consts
- `internal/codygateway/codygatewayactor`: Cody Gateway actor types

Part of https://linear.app/sourcegraph/issue/CORE-201

## Test plan

n/a
2024-06-28 12:03:16 -07:00
Ólafur Páll Geirsson
a426134fd4
Gateway: forward X-Fireworks-Genie header from client (#63460)
Previously, there was no way to enable the "tracing" feature from
Fireworks https://readme.fireworks.ai/docs/enabling-tracing This PR
solves the problem by forwarding the `X-Fireworks-Genie` HTTP header to
Fireworks if this HTTP header is set by the Gateway client.
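
A minimal sketch of the forwarding rule, assuming a hypothetical `forwardTracingHeader` helper that copies the header from the client request onto the upstream request only when the client set it:

```go
package main

import (
	"fmt"
	"net/http"
)

// fireworksTracingHeader is the header named in the description above.
const fireworksTracingHeader = "X-Fireworks-Genie"

// forwardTracingHeader copies the tracing header from the client request
// headers to the upstream request headers. It reports whether it forwarded.
func forwardTracingHeader(upstream, client http.Header) bool {
	value := client.Get(fireworksTracingHeader)
	if value == "" {
		return false // header not set by the client: nothing to forward
	}
	upstream.Set(fireworksTracingHeader, value)
	return true
}

func main() {
	client := http.Header{}
	client.Set(fireworksTracingHeader, "true")

	upstream := http.Header{}
	forwarded := forwardTracingHeader(upstream, client)
	fmt.Println(forwarded, upstream.Get(fireworksTracingHeader))
}
```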

Fixes CODY-2555



## Test plan

N/A

2024-06-25 02:49:26 +00:00
Quinn Slack
91bc23d8e1
support fast, simple sg start single-program-experimental-blame-sqs for local dev (#63435)
This makes it easier to run Sourcegraph in local dev by compiling a few
key services (frontend, searcher, repo-updater, gitserver, and worker)
into a single Go binary and running that.

Compared to `sg start` (which compiles and runs ~10 services), it's
faster to start up (by ~10% or a few seconds), takes a lot less memory
and CPU when running, has less log noise, and rebuilds faster. It is
slower to recompile for changes just to `frontend` because it needs to
link in more code on each recompile, but it's faster for most other Go
changes that require recompilation of multiple services.

This is only intended for local dev as a convenience. There may be
different behavior in this mode that could result in problems when your
code runs in the normal deployment. Usually our e2e tests should catch
this, but to be safe, you should run in the usual mode if you are making
sensitive cross-service changes.

Partially reverts "svcmain: Simplify service setup (#61903)" (commit
9541032292).


## Test plan

Existing tests cover any regressions to existing behavior. This new
behavior is for local dev only.
2024-06-24 21:12:47 +00:00
Beatrix
b3fe6dceb6
fix(cody-gateway): getAPIURL before transformBody (#63406)

Fix an issue where the requestBody is used after transformBody has been
executed.


## Test plan


1. Start Cody Gateway locally
2. Start SG local dev instance
3. Connect SG local dev instance to your local Cody Gateway instance
4. Set Gemini Flash as your chatModel
5. Connect Cody to your local dev instance
6. Ask Cody a question and verify you are getting a response


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/fbce22f9-8531-4f6e-8eb7-5c6b26e0a9fa)



2024-06-20 16:12:16 -07:00
Beatrix
18c7ba8dac
Cody Gateway: New Claude 3.5 Sonnet model (#63395)
CONTEXT:
https://sourcegraph.slack.com/archives/C05AGQYD528/p1718898110684289?thread_ts=1718896254.676939&cid=C05AGQYD528

CLOSE https://linear.app/sourcegraph/issue/CODY-2177


Adding new Claude 3.5 Sonnet (`claude-3.5-sonnet-20240620`) to the Cody
Gateway allow list.
 
Model ID based on Anthropic Console:


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/6f27b24f-a7f5-4b3f-85a9-c0eed1babe9b)

Claude 3.5 Sonnet is Live on [s0.dev](http://s0.dev/) to confirm this is
the correct model ID



## Test plan


Verify you can use the new model through Cody Gateway


## Changelog

feature(plg): new Claude 3.5 Sonnet model support for Cody Pro users
2024-06-20 09:46:33 -07:00
Chris Smith
46134524c7
refactor(cody): Reshape the CompletionsClient interface (#63358)
This PR refactors the `CompletionsClient` interface, and all the
corresponding call sites. There is no functional change, beyond bundling
several function parameters into a new type.

See `internal/completions/types/types.go`. The gist is that three
parameters are bundled into a single `CompletionRequest` type.

```diff
  Complete(
      context.Context,
      log.Logger,
-     CompletionsFeature,
-     CompletionsVersion,
-     CompletionRequestParameters,
+     CompletionRequest,
  ) (*CompletionResponse, error)
```

## Why?

As part of reworking the codepath between receiving a completion
request, dispatching it to the right `CompletionsClient` implementation,
and serving the request, I need some "hooks" to inject new information.

In a future PR I plan on adding a `*ServerSideModelConfig` as another
field to the `CompletionRequest`, so that when the `CompletionsClient`'s
implementation is trying to serve that request it has any additional
data it needs. (For example, the AWS Bedrock provisioned capacity ARN,
etc.)

## Test plan

Updated existing tests, relying on CI/CD for any other issues.

## Changelog

NA, just some under the hood refactoring that shouldn't impact any
functionality.
2024-06-19 19:17:32 -07:00
Rafał Gajdulewicz
b7dd61769c
Use math/rand/v2 (#63346)
Switches code from https://github.com/sourcegraph/sourcegraph/pull/63315
to use [math/rand/v2](https://pkg.go.dev/math/rand/v2) as suggested by
@keegancsmith.


## Test plan

- tested locally with a backend throwing random 404s
2024-06-19 15:31:07 +00:00
Robert Lin
557b4df0ed
chore/deps: upgrade grpc, prometheus/common (#63328)
This change extracts the unrelated transitive upgrades of
https://github.com/sourcegraph/sourcegraph/pull/63171 (CORE-177) into a
separate PR. I'm making this because @unknwon ran into issues with the
exact same dependencies in
https://github.com/sourcegraph/sourcegraph/pull/63171#issuecomment-2157694545.

The change consists of upgrades to:

- `google.golang.org/grpc` - there's a deprecation of `grpc.DialContext`
that we agreed in #63171 to keep for now.
- removing our `replace` directive on `github.com/prometheus/common` and
upgrading it. This is safe to do because our Alertmanager version is
already way ahead, and the reason this has a `replace` is outdated now.

## Test plan

CI, nothing blows up on `sg start` and I can click around and do a bit
of searching
2024-06-19 09:55:44 -04:00
Rafał Gajdulewicz
c68cd521cf
Retry 404 errors from Triton (#63315)
Currently, when SMEGA scales down a pod, it will return 404 for a period
of time, and that 404 will get translated into a 500 response from Cody
Gateway. This PR implements (exponential) retries (with jitter),
attempting to send the same request to a different Triton pod.

Closes AI-86.

Related to AI-31, AI-87.

## Test plan

- tested locally with a backend that throws random 404s
2024-06-19 13:43:56 +01:00
Beatrix
0c777bac41
fix(cody-gateway): streaming google endpoint (#63306)
<!-- 💡 To write a useful PR description, make sure that your description
covers:
- WHAT this PR is changing:
    - How was it PREVIOUSLY.
    - How it will be from NOW on.
- WHY this PR is needed.
- CONTEXT, i.e. to which initiative, project or RFC it belongs.

The structure of the description doesn't matter as much as covering
these points, so use
your best judgement based on your context.
Learn how to write good pull request description:
https://www.notion.so/sourcegraph/Write-a-good-pull-request-description-610a7fd3e613496eb76f450db5a49b6e?pvs=4
-->

Issue: Currently, the ShouldStream() method always returns false
because the Stream value is removed before the request is passed into the
Handler.

To fix this, we will store the original googleRequest.Stream value if
it's true so that ShouldStream() will return the correct Stream value.
We will also use the transformBody method to remove the Stream value
before we send it to Google API.

Here is the expected behaviour after the stream is fixed:


https://github.com/sourcegraph/sourcegraph/assets/68532117/8324fb8c-0625-4579-b0e9-0abfc9858961


Also confirmed it works with both Cody Gateway and BYOK:


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/9fe60423-a05b-412d-812a-f34cd812d9dc)


## Test plan

<!-- All pull requests REQUIRE a test plan:
https://docs-legacy.sourcegraph.com/dev/background-information/testing_principles
-->

Always stream Cody Gateway's requests for Google Gemini models, as we
haven't implemented the Code Completion feature on the client side.

### Non-stream request

```
❯ curl 'https://sourcegraph.test:3443/.api/completions/code' -i \
-X POST \
-H 'authorization: token LOCALTOKEN' \
--data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000}'
HTTP/2 200
access-control-allow-credentials: true
access-control-allow-origin:
alt-svc: h3=":3443"; ma=2592000
cache-control: no-cache, max-age=0
content-type: text/plain; charset=utf-8
date: Tue, 18 Jun 2024 21:05:38 GMT
server: Caddy
server: Caddy
set-cookie: sourcegraphDeviceId=d4fa7789-2442-472a-b425-a68372d27944; Expires=Wed, 18 Jun 2025 21:05:36 GMT; Secure
vary: Cookie, Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie
x-content-type-options: nosniff
x-frame-options: DENY
x-powered-by: Express
x-trace: 00f998a2a2e1b6895687ad7cc567b41c
x-trace-span: da9c93d16415b94f
x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/00f998a2a2e1b6895687ad7cc567b41c
x-xss-protection: 1; mode=block
content-length: 147

{"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **Large Language Model:** I'm","stopReason":"STOP"}%
```

### Streaming request: 

```
❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \
-X POST \
-H 'authorization: token $LOCALTOKEN' \
--data-raw '{"stream":true,"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":1000,"temperature":0,"stopSequences":[],"timeoutMs":5000}'

HTTP/2 200
access-control-allow-credentials: true
access-control-allow-origin:
alt-svc: h3=":3443"; ma=2592000
cache-control: no-cache
content-type: text/event-stream
date: Tue, 18 Jun 2024 21:07:02 GMT
server: Caddy
server: Caddy
set-cookie: sourcegraphDeviceId=38b45f36-d237-4f8d-8242-a63fcc801a32; Expires=Wed, 18 Jun 2025 21:06:59 GMT; Secure
vary: Cookie, Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie
x-accel-buffering: no
x-content-type-options: nosniff
x-frame-options: DENY
x-powered-by: Express
x-trace: 984932973626e14f7cb0ce7e8e470717
x-trace-span: d285179cfb744e08
x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/984932973626e14f7cb0ce7e8e470717
x-xss-protection: 1; mode=block

event: completion
data: {"completion":"I","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer program designed to process and generate human-like text. \n* **I learn from data.** I was trained on a massive dataset of text and code,","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer program designed to process and generate human-like text. \n* **I learn from data.** I was trained on a massive dataset of text and code, which allows me to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.\n* **I am still","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer program designed to process and generate human-like text. \n* **I learn from data.** I was trained on a massive dataset of text and code, which allows me to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.\n* **I am still under development.** I am constantly learning and improving, but I am not perfect and can sometimes make mistakes.\n\nHow can I help you today? \n","stopReason":"STOP"}

event: done
data: {}
```

## Changelog

2024-06-18 22:25:26 +00:00
Chris Smith
692cecad6e
fix(cody-gateway): Fix Google flagging configuration (#63305)
We recently noticed that 100% of all LLM requests routed to
Google-provided LLMs were getting flagged, and for the same reasons
["high_max_tokens_to_sample","blocked_phrase"].

After FAR, FAR more head scratching than I care to admit to, I realized
the problem: we were never actually initializing the
`Google.FlaggingConfig` settings. So when we were inspecting Gemini
requests for potential abuse, we were comparing them against the
zero-state for `flaggingConfig`. i.e. does this prompt have a higher
`MaxTokensToSample` than 0?

🤦 Super easy mistake to make. We now confirm that _something_ is in the
`flaggingConfig` before assuming it is legitimate.
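
A minimal sketch of this failure mode and the guard against it. The type, field names, and `flagReasons` function here are hypothetical simplifications, not the actual Cody Gateway code; only the two reason strings come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// flaggingConfig is a simplified stand-in for the real configuration type.
type flaggingConfig struct {
	MaxTokensToSample int
	BlockedPhrases    []string
}

// isConfigured reports whether anything was set. A zero-value config means
// "never initialized", not "flag everything".
func (c flaggingConfig) isConfigured() bool {
	return c.MaxTokensToSample > 0 || len(c.BlockedPhrases) > 0
}

// flagReasons returns the reasons a request looks abusive, or nil.
func flagReasons(cfg flaggingConfig, prompt string, maxTokensToSample int) []string {
	if !cfg.isConfigured() {
		// Without this guard, every request exceeds a limit of 0.
		return nil
	}
	var reasons []string
	if cfg.MaxTokensToSample > 0 && maxTokensToSample > cfg.MaxTokensToSample {
		reasons = append(reasons, "high_max_tokens_to_sample")
	}
	for _, phrase := range cfg.BlockedPhrases {
		if strings.Contains(prompt, phrase) {
			reasons = append(reasons, "blocked_phrase")
			break
		}
	}
	return reasons
}

func main() {
	fmt.Println(flagReasons(flaggingConfig{}, "Who are you?", 30))
	fmt.Println(flagReasons(flaggingConfig{MaxTokensToSample: 10}, "Who are you?", 30))
}
```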

Fixes https://github.com/sourcegraph/abuse-ban-bot/issues/32.

## Test plan

CI/CD
2024-06-17 14:16:29 -07:00
David Veszelovszki
7452324ea5
fix(cody-gateway): Disable flagging Google requests (#63295)
- Fixes https://github.com/sourcegraph/sourcegraph/issues/63294

This PR turns off flagging for all Google models entirely.

## Test plan

Not tested yet.
2024-06-17 16:16:32 +00:00
Hitesh Sagtani
d01358ff12
adding deepseek and lang specific mixtral for completions ab experiment (#63283)
## Context
1. Adds support for the deepseek-coder model and updated identifiers for the Mixtral
finetuned-FIM models hosted on Fireworks.
2. Client side pull request:
https://github.com/sourcegraph/cody/pull/4577

## Test plan
```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "accounts/sourcegraph/models/deepseek-coder-7b-base", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false}' -H 'X-sourcegraph-feature: code_completions'
```

```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "accounts/sourcegraph/models/custom-deepseek-1p3b-base-hf-version", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false}' -H 'X-sourcegraph-feature: code_completions'
```

```
curl -vS -X POST http://localhost:9992/v1/completions/fireworks -H 'Authorization: bearer <SGD_TOKEN>' -d '{"stream":false,"max_tokens":50, "model": "fim-lang-specific-model-mixtral", "stop_sequences": ["\n\n"], "prompt": "const value = ", "stream":false}' -H 'X-sourcegraph-feature: code_completions'
```
2024-06-17 16:28:24 +05:30
Beatrix
8bf288e153
fix(cody-gateway): Improve prompt and request validation for gemini (#63258)
This PR fixes an issue where Cody Gateway returns an error
about the unsupported `Stream` field appearing in the request:


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/9653287f-c53b-4301-9419-a3212db1521f)


Changes included: 

- Moved the google-specific types from google.go to a new
google_types.go file
- Reorganized the types to improve readability and maintainability
- Removed unused fields and methods from the types
- Aligned the types with the latest Google API documentation


Also fixes an issue with the Google provider for BYOK customers where the last
assistant message is empty, which is the default format sent from
clients (e.g. VS Code):


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/a1941ead-518d-469b-8971-85f42f8b833e)

This PR addresses this issue by removing the last assistant message if
it's empty during the prompt building step to make it more robust:

- Validate that the input messages are not empty
- Ensure the first message is a non-empty assistant message
- Skip empty assistant messages at the end of the prompt
- Disallow consecutive messages with the same speaker role
- Add tests for the various validation cases
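
Some of the checks listed above can be sketched as follows. This is an illustrative approximation, not the PR's implementation: the `message` type, the `buildPrompt` name, and the error messages are assumptions, and the first-message rule is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// message is a simplified stand-in for the completion message type.
type message struct {
	Speaker string // "human" or "assistant"
	Text    string
}

// buildPrompt validates the messages and drops a trailing empty assistant
// message, which clients such as VS Code send by default.
func buildPrompt(msgs []message) ([]message, error) {
	if len(msgs) == 0 {
		return nil, errors.New("no messages provided")
	}
	if last := msgs[len(msgs)-1]; last.Speaker == "assistant" && last.Text == "" {
		msgs = msgs[:len(msgs)-1]
	}
	if len(msgs) == 0 {
		return nil, errors.New("no non-empty messages provided")
	}
	for i := 1; i < len(msgs); i++ {
		if msgs[i].Speaker == msgs[i-1].Speaker {
			return nil, fmt.Errorf("consecutive messages from speaker %q", msgs[i].Speaker)
		}
	}
	return msgs, nil
}

func main() {
	prompt, err := buildPrompt([]message{
		{Speaker: "human", Text: "Who are you?"},
		{Speaker: "assistant", Text: ""},
	})
	fmt.Println(len(prompt), err)
}
```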



## Test plan


Update Site Config to connect to the Google API directly in your local
instance, and then log into your local instance in VS Code. Verify the
responses are correct with no errors.

![Screenshot 2024-06-14 at 10 03 41 AM](https://github.com/sourcegraph/sourcegraph/assets/68532117/d687f9c0-9d25-41bb-9907-e16af53bb09e)


## Changelog


---------

Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
2024-06-14 11:58:42 -07:00
Jan Hartman
c59ece1fd2
Use 8B version of llama3 in metadata generation (#63212)
In offline evals this seems as good as 70b while being a fair bit
faster.

## Test plan
Tested locally.
2024-06-13 13:40:56 +02:00
Beatrix
e1551657b1
Cody Gateway: Add support for Google non-streaming endpoint (#63166)
Add support for non-streaming requests to the Google Gemini provider.

- Added `Stream` field to `googleRequest` struct to enable streaming
completions
- Added `SymtemInstruction` field to `googleRequest` struct to allow
setting system instructions
- Updated `GoogleHandlerMethods.validateRequest` to allow
`FeatureEmbeddings` instead of `FeatureCodeCompletions`
- Updated `GoogleHandlerMethods.getRequestMetadata` to return the
`Stream` field
- Updated `GoogleGatewayFeatureClient.GetRequest` to handle streaming
for both `FeatureCodeCompletions` and `FeatureChatCompletions`
- Removed unsupported feature checks in `googleCompletionStreamClient`
- Added Gemini 1.5 Flash and Gemini 1.0 Pro to autocomplete allowed list
(but not supported by clients atm)



## Test plan


Unit tests updated for non-stream request.

To manually test this:

1. In your Sourcegraph local instance's Site Config, add the following:

```
  "completions": {
    "accessToken": "REDACTED",
    "chatModel": "gemini-1.5-pro-latest",
    "completionModel": "google/gemini-1.5-flash-latest",
    "provider": "google",
```

Note: You can get the accessToken for Gemini API in 1Password.

2. After saving the site config with the above change, run the following
curl command that hits the code endpoint:

```
curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \
-X POST \
-H "authorization: token $YOUR_LOCAL_TOKEN" \
--data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"gemini-1.5-pro-latest"}'
```

Output:
```
❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \
-X POST \
-H 'authorization: token $YOUR_LOCAL_TOKEN' \
--data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"gemini-1.5-pro-latest"}'
HTTP/2 200
access-control-allow-credentials: true
access-control-allow-origin:
alt-svc: h3=":3443"; ma=2592000
cache-control: no-cache, max-age=0
content-type: text/plain; charset=utf-8
date: Tue, 11 Jun 2024 17:02:19 GMT
server: Caddy
server: Caddy
vary: Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie
x-content-type-options: nosniff
x-frame-options: DENY
x-powered-by: Express
x-trace: e11a2ce292639414dd2ccdfcbfa89611
x-trace-span: 9457aa0dd0e09b6c
x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/e11a2ce292639414dd2ccdfcbfa89611
x-xss-protection: 1; mode=block
content-length: 154

{"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am a computer program:** I","stopReason":"MAX_TOKENS"}%
```

## Changelog

2024-06-11 10:54:27 -07:00
Beatrix
d288874197
Cody Gateway: handle streams with trailing newline in Gemini response (#63172)
CONTEXT:
https://sourcegraph.slack.com/archives/C05ABRRGB0B/p1717790701356599

Fix an issue where a valid Gemini response ends with new lines, causing
a false alert.


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/6b69ff2e-b88e-435b-a50e-27eaa5e31bf9)

Issue: We are seeing `*errutil.leafError: no Google response found` in
Sentry complaining when I run a command or chat in VS Code using google
as provider.

Cause: Currently the code (added by me) skips to the last line of
the stream response to determine whether the response is valid, which
is a problem because the stream API can end the response with an
empty line, on which our current logic fails.

Changes included in this PR:

- Modify `parseGoogleTokenUsage` function to find the last non-empty
line in the stream, to handle cases where the stream ends with a newline
- Add a test case to cover the scenario where the stream ends with a
newline

This change modifies the behavior of the `parseGoogleTokenUsage`
function to handle streams with trailing newlines, which would fail even
if the response is valid because Gemini adds a new line to the end of
their stream.
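
The core of the fix, finding the last non-empty line, can be sketched as below. `lastNonEmptyLine` is a hypothetical helper name; the real `parseGoogleTokenUsage` goes on to decode that line, which is not shown here.

```go
package main

import (
	"bytes"
	"fmt"
)

// lastNonEmptyLine returns the last line of stream that contains
// non-whitespace content, or nil if there is none.
func lastNonEmptyLine(stream []byte) []byte {
	lines := bytes.Split(stream, []byte("\n"))
	// Walk backwards so trailing blank lines are skipped.
	for i := len(lines) - 1; i >= 0; i-- {
		if line := bytes.TrimSpace(lines[i]); len(line) > 0 {
			return line
		}
	}
	return nil
}

func main() {
	// A stream that ends with blank lines, as the Gemini API may return.
	stream := []byte("data: {\"a\":1}\n\ndata: {\"b\":2}\n\n")
	fmt.Printf("%s\n", lastNonEmptyLine(stream))
}
```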



## Test plan


Make a curl command to the Gemini streaming API to confirm the response
ends with a new line:


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/c0ff0c4a-40dd-4d49-91d3-1b9f01f5c2b9)

```sh
curl https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:streamGenerateContent\?alt=sse\&key\=$GOOGLE_API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {"role":"user",
         "parts":[{
           "text": "Write the first line of a story about a magic backpack."}]},
        {"role": "model",
         "parts":[{
           "text": "In the bustling city of Meadow brook, lived a young girl named Sophie. She was a bright and curious soul with an imaginative mind."}]},
        {"role": "user",
         "parts":[{
           "text": "Can you set it in a quiet village in 1600s France?"}]},
      ]
    }' 2> /dev/null
```

Copied the response to the test file and confirmed our current test
would fail with the same error message:


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/0485d38f-0d3c-40de-b641-a679b16fd1f4)

This is now handled by the newly added test, which uses the actual stream
response returned by calling the Gemini API.


## Changelog

2024-06-08 15:54:06 -07:00
Robert Lin
177bdae83b
chore/cody-gateway: instrument removeUnseenTokens (#63169)
Sync spans seem to hang for quite a while after getting data from the
source - this adds instrumentation on the potential cause,
`removeUnseenTokens`, so that we can get a bit more detail from the
traces:


![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/76dcd0ca-e316-4f67-ae52-0efca25a8da7)


## Test plan

n/a
2024-06-07 14:25:08 -07:00
Beatrix
d0add88218
feat(cody-gateway): add Google Gemini stable models to allowed models (#63163)
This change adds the stable versions of the Google Gemini 1.5 Flash,
Gemini 1.5 Pro, and Gemini Pro models to the list of allowed models in
the Cody Gateway configuration and API endpoints.

The changes are made in the following files:
- `cmd/cody-gateway/shared/config/config.go`
-
`cmd/frontend/internal/dotcom/productsubscription/codygateway_dotcom_user.go`
- `cmd/frontend/internal/httpapi/completions/chat.go`
- `internal/conf/computed.go`

The new models are added to the allowed model lists and also set as the
default chat, fast chat, and completion models when the configuration is
not explicitly set.

This change ensures that users default to the stable versions
of the Google Gemini models, while the latest versions remain
available for use in the Cody Gateway application.



## Test plan


Changes are covered by current tests.

## Changelog

2024-06-07 14:20:22 -07:00
Robert Lin
7e9d8ec8dc
feat/cody-gateway: use Enterprise Portal for actor/productsubscriptions (#62934)
Migrates Cody Gateway to use the new Enterprise Portal's "read-only"
APIs. For the most part, this is an in-place replacement - a lot of the
diff is in testing and minor changes. Some changes, such as the removal
of model allowlists, were made down the PR stack in
https://github.com/sourcegraph/sourcegraph/pull/62911.

At a high level, we replace the data requested by
`cmd/cody-gateway/internal/dotcom/operations.graphql`
with Enterprise Portal RPCs:

- `codyaccessv1.GetCodyGatewayAccess`
- `codyaccessv1.ListCodyGatewayAccesses`

Use cases that previously required retrieving the active license tags
now:

1. Use the display name provided by the Cody Access API
https://github.com/sourcegraph/sourcegraph/pull/62968
2. Depend on the connected Enterprise Portal dev instance to only return
dev subscriptions https://github.com/sourcegraph/sourcegraph/pull/62966

Closes https://linear.app/sourcegraph/issue/CORE-98
Related to https://linear.app/sourcegraph/issue/CORE-135
(https://github.com/sourcegraph/sourcegraph/pull/62909,
https://github.com/sourcegraph/sourcegraph/pull/62911)
Related to https://linear.app/sourcegraph/issue/CORE-97

## Local development

This change also adds Enterprise Portal to `sg start dotcom`. For local
development, we set up Cody Gateway to connect to Enterprise Portal such
that zero configuration is needed - all the required secrets are sourced
from the `sourcegraph-local-dev` GCP project automatically when you run
`sg start dotcom`, and local Cody Gateway will talk to local Enterprise
Portal to do the Enterprise subscriptions sync.

This is actually an upgrade from the current experience where you need
to provide Cody Gateway a Sourcegraph user access token to test
Enterprise locally, though the Sourcegraph user access token is still
required for the PLG actor source.

The credential is configured in
https://console.cloud.google.com/security/secret-manager/secret/SG_LOCAL_DEV_SAMS_CLIENT_SECRET/overview?project=sourcegraph-local-dev,
and I've included documentation in the secret annotation about what it
is for and what to do with it:


![image](https://github.com/sourcegraph/sourcegraph/assets/23356519/c61ad4e0-3b75-408d-a930-076a414336fb)

## Rollout plan

I will open PRs to set up the necessary configuration for Cody Gateway
dev and prod. Once reviews taper down I'll cut an image from this branch
and deploy it to Cody Gateway dev, and monitor it closely + do some
manual testing. Once verified, I'll land this change and monitor a
rollout to production.

Cody Gateway dev SAMS client:
https://github.com/sourcegraph/infrastructure/pull/6108
Cody Gateway prod SAMS client update (this one already exists):

```
accounts=> UPDATE idp_clients
SET scopes = scopes || '["enterprise_portal::subscription::read", "enterprise_portal::codyaccess::read"]'::jsonb
WHERE id = 'sams_cid_018ea062-479e-7342-9473-66645e616cbf';
UPDATE 1
accounts=> select name, scopes from idp_clients WHERE name = 'Cody Gateway (prod)';
        name         |                                                              scopes                                                              
---------------------+----------------------------------------------------------------------------------------------------------------------------------
 Cody Gateway (prod) | ["openid", "profile", "email", "offline_access", "enterprise_portal::subscription::read", "enterprise_portal::codyaccess::read"]
(1 row)
```

Configuring the target Enterprise Portal instances:
https://github.com/sourcegraph/infrastructure/pull/6127

## Test plan

Start the new `dotcom` runset, now including Enterprise Portal, and
observe logs from both `enterprise-portal` and `cody-gateway`:

```
sg start dotcom
```

I reused the test plan from
https://github.com/sourcegraph/sourcegraph/pull/62911: set up Cody
Gateway external dependency secrets, then set up an enterprise
subscription + license with a high seat count (for a high quota), and
force a Cody Gateway sync:

```
curl -v -H 'Authorization: bearer sekret' http://localhost:9992/-/actor/sync-all-sources
```

This should indicate the new sync against "local dotcom" fetches the
correct number of actors and whatnot.

Using the local enterprise subscription's access token, we run the QA
test suite:

```sh
$ bazel test --runs_per_test=2 --test_output=all //cmd/cody-gateway/qa:qa_test --test_env=E2E_GATEWAY_ENDPOINT=http://localhost:9992 --test_env=E2E_GATEWAY_TOKEN=$TOKEN
INFO: Analyzed target //cmd/cody-gateway/qa:qa_test (0 packages loaded, 0 targets configured).
INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 1 of 2):
==================== Test output for //cmd/cody-gateway/qa:qa_test (run 1 of 2):
PASS
================================================================================
INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 2 of 2):
==================== Test output for //cmd/cody-gateway/qa:qa_test (run 2 of 2):
PASS
================================================================================
INFO: Found 1 test target...
Target //cmd/cody-gateway/qa:qa_test up-to-date:
  bazel-bin/cmd/cody-gateway/qa/qa_test_/qa_test
Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build)
Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build)
INFO: Elapsed time: 13.653s, Critical Path: 13.38s
INFO: 7 processes: 1 internal, 6 darwin-sandbox.
INFO: Build completed successfully, 7 total actions
//cmd/cody-gateway/qa:qa_test                                            PASSED in 11.7s
  Stats over 2 runs: max = 11.7s, min = 11.7s, avg = 11.7s, dev = 0.0s

Executed 1 out of 1 test: 1 test passes.
```
2024-06-07 11:46:01 -07:00
Jan Hartman
18bdafac78
Cody Gateway embeddings: powering with generated metadata - take 2 (#63112)
Reverts sourcegraph/sourcegraph#63098 and fixes the problems with conf
loading and Fireworks API responses.

## Test plan
Try running locally and then with feature flag.
2024-06-07 12:33:11 +02:00
James McNamara
4077b3ec22
feat(ci): Adds playwright tests for sveltekit to bazel (#62560)
This runs the Playwright tests with Bazel. It changes how the app is
served in the tests: Playwright intercepts all network calls to the
local server and serves the static assets directly, or serves the root
index.html file if nothing matches.

---------

Co-authored-by: bahrmichael <michael.bahr@sourcegraph.com>
Co-authored-by: Jean-Hadrien Chabran <jh@chabran.fr>
Co-authored-by: Michael Bahr <1830132+bahrmichael@users.noreply.github.com>
Co-authored-by: Jean-Hadrien Chabran <jean-hadrien.chabran@sourcegraph.com>
Co-authored-by: Camden Cheek <camden@ccheek.com>
2024-06-06 12:45:05 -06:00
Varun Gandhi
2955bb6cfb
chore: Change errors.HasType to respect multi-errors (#63024)
With this patch, the `errors.HasType` API behaves similarly to `Is` and `As`,
where it checks the full error tree instead of just checking a linearized version
of it, as cockroachdb/errors's `HasType` implementation does not respect
multi-errors.

As a consequence, a bunch of relationships between HasType and Is/As that
you'd intuitively expect to hold are now true; see changes to `invariants_test.go`.
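
As a rough illustration of the difference, here is a minimal standard-library sketch of a type check that walks the full error tree. It is not the `lib/errors` implementation: `hasType`, `notFoundError`, and `buildExample` are hypothetical names, and the comparison by concrete type via `reflect` is an assumption about what "has type" means here.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// notFoundError is a hypothetical error type used only for this example.
type notFoundError struct{ name string }

func (e *notFoundError) Error() string { return e.name + " not found" }

// hasType reports whether any error in err's tree has the same concrete
// type as target. Unlike a check over a linearized chain, it descends into
// every branch of a multi-error (anything exposing Unwrap() []error).
func hasType(err, target error) bool {
	if err == nil {
		return false
	}
	if reflect.TypeOf(err) == reflect.TypeOf(target) {
		return true
	}
	switch x := err.(type) {
	case interface{ Unwrap() []error }:
		for _, child := range x.Unwrap() {
			if hasType(child, target) {
				return true
			}
		}
	case interface{ Unwrap() error }:
		return hasType(x.Unwrap(), target)
	}
	return false
}

// buildExample returns a multi-error whose second branch wraps a *notFoundError.
func buildExample() error {
	return errors.Join(
		errors.New("disk full"),
		fmt.Errorf("lookup: %w", &notFoundError{name: "repo"}),
	)
}

func main() {
	fmt.Println(hasType(buildExample(), &notFoundError{}))      // true
	fmt.Println(hasType(errors.New("plain"), &notFoundError{})) // false
}
```

With a tree-aware check like this, a type that `errors.As` can find in some branch is also reported by the type check, which is the kind of relationship the invariants test is described as covering.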
2024-06-06 13:02:14 +00:00
Rafał Gajdulewicz
08a1c6a6f6
Revert "Cody Gateway embeddings: powering with generated metadata" (#63098)
Reverts sourcegraph/sourcegraph#63000, which makes Cody Gateway hang
(and fail to listen on port 9992).
2024-06-05 14:56:03 +02:00
Jan Hartman
4327bf8fc1
Cody Gateway embeddings: powering with generated metadata (#63000)
Connected to https://github.com/sourcegraph/bfg-private/pull/189 and
https://github.com/sourcegraph/cody/pull/4414.

We're introducing a hacky MVP to enable embeddings being powered by
metadata that's generated from code. This PR is the bare minimum to make
this work on CG. We plan to trigger metadata generation only if we're
using a new (fake) model (this comes in via a feature flag) and if the
request isn't a real-time query, but is a background indexing request.
The implementation is really hacky, but is also really minimal.

## Test plan
Testing locally through a feature flag.
2024-06-05 13:33:10 +02:00
Beatrix
f2590cbb36
Cody Gateway: Add Gemini models to PLG and Enterprise users (#63053)
CLOSE https://github.com/sourcegraph/cody-issues/issues/211 &
https://github.com/sourcegraph/cody-issues/issues/412
UNBLOCK https://github.com/sourcegraph/cody/pull/4360

* Add support for Google Gemini AI models as chat completions provider
* Add new `google` package to handle Google Generative AI client
* Update `client.go` and `codygateway.go` to handle the new Google
provider
* Set default models for chat, fast chat, and completions when Google is
the configured provider
* Add gemini-pro to the allowed list

## Test plan

For Enterprise instances using google as provider:

1. In your local Sourcegraph instance's Site Config, add the following:

```
    "accessToken": "REDACTED",
    "chatModel": "gemini-1.5-pro-latest",
    "provider": "google",
```

Note: You can get the accessToken for Gemini API in 1Password.

2. After saving the site config with the above change, run the following
curl command:

```
curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \
-X POST \
-H "authorization: token $LOCAL_INSTANCE_TOKEN" \
--data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":true,"model":"gemini-1.5-pro-latest"}'
```

3. Expected Output:

```
❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \
-X POST \
-H 'authorization: token <REDACTED>' \
--data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":true,"model":"gemini-1.5-pro-latest"}'

HTTP/2 200
access-control-allow-credentials: true
access-control-allow-origin:
alt-svc: h3=":3443"; ma=2592000
cache-control: no-cache
content-type: text/event-stream
date: Tue, 04 Jun 2024 05:45:33 GMT
server: Caddy
server: Caddy
vary: Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie
x-accel-buffering: no
x-content-type-options: nosniff
x-frame-options: DENY
x-powered-by: Express
x-trace: d4b1f02a3e2882a3d52331335d217b03
x-trace-span: 728ec33860d3b5e6
x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/d4b1f02a3e2882a3d52331335d217b03
x-xss-protection: 1; mode=block

event: completion
data: {"completion":"I","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nThink of me as","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nThink of me as a computer program that can understand and generate human-like text.","stopReason":"MAX_TOKENS"}

event: done
data: {}
```

Verified locally:


![image](https://github.com/sourcegraph/sourcegraph/assets/68532117/2e6c914d-7a77-4484-b693-16bbc394518c)

#### Before

Cody Gateway returns `no client known for upstream provider google`

```sh
curl -X 'POST' -d '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000,"stream":true,"model":"google/gemini-1.5-pro-latest"}' -H 'Accept: application/json' -H "Authorization: token $YOUR_DOTCOM_TOKEN" -H 'Content-Type: application/json' 'https://sourcegraph.com/.api/completions/stream'

event: error
data: {"error":"no client known for upstream provider google"}

event: done
data: {
```

## Changelog

Added support for Google as an LLM provider for Cody, with the following
models available through Cody Gateway: Gemini Pro (`gemini-pro-latest`),
Gemini 1.5 Flash (`gemini-1.5-flash-latest`), and Gemini 1.5 Pro
(`gemini-1.5-pro-latest`).
2024-06-04 23:46:36 +00:00
Robert Lin
f952ceb8da
feat/cody-gateway: use wildcard for enterprise allowlists (#62911)
This change makes Cody Gateway always apply a wildcard model allowlist,
irrespective of the model allowlist configured for an Enterprise
subscription in dotcom (see #62909).

The next PR in the stack,
https://github.com/sourcegraph/sourcegraph/pull/62912, makes the GraphQL
queries return similar results, and removes model allowlists from the
subscription management UI.

Closes https://linear.app/sourcegraph/issue/CORE-135

### Context

In https://sourcegraph.slack.com/archives/C05SZB829D0/p1715638980052279
we shared a decision we landed on as part of #62263:

> Ignoring (then removing) per-subscription model allowlists: As part of
the API discussions, we've also surfaced some opportunities for
improvements - to make it easier to roll out new models to Enterprise,
we're not including per-subscription model allowlists in the new API,
and as part of the Cody Gateway migration (by end-of-June), we will
update Cody Gateway to stop enforcing per-subscription model allowlists.
Cody Gateway will still retain a Cody-Gateway-wide model allowlist.
[@chrsmith](https://sourcegraph.slack.com/team/U061QHKUBJ8) is working
on a broader design here and will have more to share on this later.

This means there is one less thing for us to migrate as part of
https://github.com/sourcegraph/sourcegraph/pull/62934, and avoids the
need to add an API field that will be removed shortly post-migration.

As part of this, rolling out new models to Enterprise customers no
longer requires additional code/override changes.

## Test plan

Set up Cody Gateway locally as documented, then `sg start dotcom`. Set
up an enterprise subscription + license with a high seat count (for a
high quota), and force a Cody Gateway sync:

```
curl -v -H 'Authorization: bearer sekret' http://localhost:9992/-/actor/sync-all-sources
```

Verify we are using wildcard allowlist:

```sh
$ redis-cli -p 6379 get 'v2:product-subscriptions:v2:slk_...'
"{\"key\":\"slk_...\",\"id\":\"6ad033f4-c6da-43a9-95ef-f653bf59aaac\",\"name\":\"bobheadxi\",\"accessEnabled\":true,\"endpointAccess\":{\"/v1/attribution\":true},\"rateLimits\":{\"chat_completions\":{\"allowedModels\":[\"*\"],\"limit\":660,\"interval\":86400000000000,\"concurrentRequests\":330,\"concurrentRequestsInterval\":10000000000},\"code_completions\":{\"allowedModels\":[\"*\"],\"limit\":66000,\"interval\":86400000000000,\"concurrentRequests\":33000,\"concurrentRequestsInterval\":10000000000},\"embeddings\":{\"allowedModels\":[\"*\"],\"limit\":220000000,\"interval\":86400000000000,\"concurrentRequests\":110000000,\"concurrentRequestsInterval\":10000000000}},\"lastUpdated\":\"2024-05-24T20:28:58.283296Z\"}"
```

Using the local enterprise subscription's access token, we run the QA
test suite:

```sh
$ bazel test --runs_per_test=2 --test_output=all //cmd/cody-gateway/qa:qa_test --test_env=E2E_GATEWAY_ENDPOINT=http://localhost:9992 --test_env=E2E_GATEWAY_TOKEN=$TOKEN
INFO: Analyzed target //cmd/cody-gateway/qa:qa_test (0 packages loaded, 0 targets configured).
INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 1 of 2):
==================== Test output for //cmd/cody-gateway/qa:qa_test (run 1 of 2):
PASS
================================================================================
INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 2 of 2):
==================== Test output for //cmd/cody-gateway/qa:qa_test (run 2 of 2):
PASS
================================================================================
INFO: Found 1 test target...
Target //cmd/cody-gateway/qa:qa_test up-to-date:
  bazel-bin/cmd/cody-gateway/qa/qa_test_/qa_test
Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build)
Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build)
INFO: Elapsed time: 13.653s, Critical Path: 13.38s
INFO: 7 processes: 1 internal, 6 darwin-sandbox.
INFO: Build completed successfully, 7 total actions
//cmd/cody-gateway/qa:qa_test                                            PASSED in 11.7s
  Stats over 2 runs: max = 11.7s, min = 11.7s, avg = 11.7s, dev = 0.0s

Executed 1 out of 1 test: 1 test passes.
```
2024-06-04 22:29:20 +00:00
Chris Smith
c4b5c73260
feat(cody-gateway): Add FLAGGED_MODEL_NAMES check (#63013)
* Cody Gateway: Add FLAGGED_MODEL_NAMES check

* Update cmd/cody-gateway/internal/httpapi/completions/flagging.go

Co-authored-by: Quinn Slack <quinn@slack.org>

---------

Co-authored-by: Quinn Slack <quinn@slack.org>
2024-05-31 20:12:27 +00:00
Robert Lin
5833a98185
feat/cody-gateway: support wildcard models (#62909)
In https://sourcegraph.slack.com/archives/C05SZB829D0/p1715638980052279 we shared a decision we landed on as part of #62263:

> Ignoring (then removing) per-subscription model allowlists: As part of the API discussions, we've also surfaced some opportunities for improvements - to make it easier to roll out new models to Enterprise, we're not including per-subscription model allowlists in the new API, and as part of the Cody Gateway migration (by end-of-June), we will update Cody Gateway to stop enforcing per-subscription model allowlists. Cody Gateway will still retain a Cody-Gateway-wide model allowlist. [@chrsmith](https://sourcegraph.slack.com/team/U061QHKUBJ8) is working on a broader design here and will have more to share on this later.

To support this, we first need to extend Cody Gateway's model allowlist enforcement to respect a notion of "allow all models that are allowed in Cody Gateway". To ensure models are explicitly provided today, an empty `AllowedModels` is considered invalid, so we add a special single-element-slice-`*` configuration that can be used to indicate an actor's rate limit allows all models (`prefixedMasterAllowlist`).

This change also unifies somewhat the way we enforce allowed models in various places by introducing `(*RateLimit).EvaluateAllowedModels(...)` as the unified way to construct the final allowlist for a given rate limit.
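
The wildcard handling described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `RateLimit` struct is pared down to one field, the method signature is guessed, the model names are made up, and treating a non-wildcard allowlist as an intersection with the Gateway-wide allowlist is an assumption about how the final list is built.

```go
package main

import "fmt"

// RateLimit is a pared-down, hypothetical stand-in for an actor's rate limit.
type RateLimit struct {
	AllowedModels []string
}

// EvaluateAllowedModels returns the effective allowlist for this rate limit.
// A single-element slice containing "*" means "every model allowed by the
// Gateway-wide allowlist". Otherwise (assumed here) only models present in
// both lists are allowed.
func (r *RateLimit) EvaluateAllowedModels(prefixedMasterAllowlist []string) []string {
	if len(r.AllowedModels) == 1 && r.AllowedModels[0] == "*" {
		// Copy so callers cannot mutate the shared master allowlist.
		return append([]string(nil), prefixedMasterAllowlist...)
	}
	master := make(map[string]struct{}, len(prefixedMasterAllowlist))
	for _, m := range prefixedMasterAllowlist {
		master[m] = struct{}{}
	}
	var allowed []string
	for _, m := range r.AllowedModels {
		if _, ok := master[m]; ok {
			allowed = append(allowed, m)
		}
	}
	return allowed
}

func main() {
	// Hypothetical provider-prefixed model names.
	master := []string{"provider-a/model-x", "provider-b/model-y"}

	wildcard := &RateLimit{AllowedModels: []string{"*"}}
	explicit := &RateLimit{AllowedModels: []string{"provider-a/model-x", "provider-c/model-z"}}

	fmt.Println(wildcard.EvaluateAllowedModels(master)) // [provider-a/model-x provider-b/model-y]
	fmt.Println(explicit.EvaluateAllowedModels(master)) // [provider-a/model-x]
}
```

Keeping the wildcard as an explicit one-element value, rather than treating an empty list as "allow all", preserves the existing rule that an empty `AllowedModels` is invalid.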

I'm planning to roll this out before rolling out actual functionality changes (https://github.com/sourcegraph/sourcegraph/pull/62911) to ensure changes in cached rate limits don't end up confusing an older revision of Cody Gateway that doesn't yet support wildcard models. With #62911, rolling out new models to Enterprise customers no longer requires additional code/override changes.

Part of https://linear.app/sourcegraph/issue/CORE-135

## Test plan

Unit tests, and E2E test of this in https://github.com/sourcegraph/sourcegraph/pull/62911
2024-05-31 13:09:01 -07:00