ALTER TABLE with a foreign key constraint will _always_ add the
constraint, which means we always take a table lock even if this
migration has already run. So we check if the column exists first.
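A sketch of what the guard might look like, using `repo` and the
`tenant_id` column with its foreign key to `tenants(id)` purely as an
illustrative example:

```sql
DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1
        FROM information_schema.columns
        WHERE table_name = 'repo' AND column_name = 'tenant_id'
    ) THEN
        -- Only this branch takes the lock needed to add the column and constraint.
        ALTER TABLE repo ADD COLUMN tenant_id integer REFERENCES tenants(id);
    END IF;
END
$$;
```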
The motivation for this change was me noticing slow migrations against
S2, even though it had been migrated already. This speeds it up since it
avoids a bunch of locks.
In testing I kept running into issues with the lsif_* tables. We agreed
to only migrate those later.
Test Plan: ran against my local instance and S2 multiple times, plus CI.
Also manually ran the down and up migrations against dotcom.
The improvement to the triggers I tried to ship yesterday failed on
dotcom. This is because it has the same issue as the tenant_id column:
we can't acquire the ShareRowExclusiveLock for multiple important
tables in a transaction without deadlocking. We use the same technique
where we commit and restart the transaction. This is safe to do since,
even if it fails, the migration is idempotent and a partial migration
still works.
Additionally, after reading the documentation on triggers more deeply
[1], I noticed we can add a WHEN clause to triggers. We already have
this condition in the trigger functions, but a WHEN clause should be
more efficient.
I didn't want to remove the condition from the functions just yet, since
that would make the migration more risky under partial failure.
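A sketch of what the per-table step might look like (trigger and
function names are taken from the `\d+ repo` output below; the exact
statements in the migration file are my assumption):

```sql
DROP TRIGGER IF EXISTS trig_delete_user_repo_permissions_on_repo_soft_delete ON repo;
CREATE TRIGGER trig_delete_user_repo_permissions_on_repo_soft_delete
    AFTER UPDATE OF deleted_at ON repo
    FOR EACH ROW
    -- The WHEN clause skips the function call entirely unless this update
    -- actually soft-deletes the row.
    WHEN (NEW.deleted_at IS NOT NULL AND OLD.deleted_at IS NULL)
    EXECUTE FUNCTION delete_user_repo_permissions_on_repo_soft_delete();

-- Commit and restart the migration's transaction before moving on to the
-- next table, so we never hold ShareRowExclusiveLock on several hot tables
-- at once.
COMMIT AND CHAIN;
```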
## Test Plan
CI. Also added BEGIN and COMMIT to the migration, then ran it locally with
``` shell
psql -f migrations/frontend/1723557836_faster_soft_delete_trigger_for_permissions/up.sql
```
Then inspected the DB state for repo and saw a good-looking trigger come
back:
```
\d+ repo
...snip
Triggers:
trig_delete_user_repo_permissions_on_repo_soft_delete AFTER UPDATE
OF deleted_at ON repo FOR EACH ROW WHEN (new.deleted_at IS NOT NULL
AND old.deleted_at IS NULL) EXECUTE FUNCTION
delete_user_repo_permissions_on_repo_soft_delete()
```
Additionally, before landing I am going to manually run this migration on
S2 first, then on dotcom, to see how it behaves under real traffic and at
scale.
We believe the migration to add tenant_id is failing on dotcom due to
this trigger, which runs for any possible change to a row. It only
needs to run when the deleted_at column changes, so this migration
adjusts each trigger to only run if deleted_at is changed.
We don't need to change the trigger functions, since they only read
deleted_at.
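A sketch of the per-table adjustment (names are taken from the trigger
output shown earlier in this log; the WHEN clause visible there was
added in the later change above):

```sql
DROP TRIGGER IF EXISTS trig_delete_user_repo_permissions_on_repo_soft_delete ON repo;
-- Previously: AFTER UPDATE ON repo, i.e. the trigger fired for every row update.
CREATE TRIGGER trig_delete_user_repo_permissions_on_repo_soft_delete
    AFTER UPDATE OF deleted_at ON repo
    FOR EACH ROW
    EXECUTE FUNCTION delete_user_repo_permissions_on_repo_soft_delete();
```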
If this doesn't resolve our issue, we will need to look into
statement-level triggers for all our triggers.
Test Plan: CI and then will test safely on larger instances.
We dug into our Go Postgres driver: when executing a migration, it
is executed as a "simple query". In this case Postgres automatically
wraps the collection of statements in a transaction, unless it contains
transaction statements. So our last attempt at removing the transaction
failed.
In this attempt we use COMMIT AND CHAIN after each table alter. This
commits the current transaction and then starts a new one.
From the perspective of the Go driver, it is as if there was only one
transaction. We then switch the migration to using a transaction to
ensure the Go driver cleans up the Postgres connection in case of
failure.
I.e. if a query manually starts a transaction and does not clean it up,
the connection will be marked as broken for the next client that gets
the connection from the pool. By wrapping it in Go's transaction code
the connection will be properly cleaned up.
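A sketch of the resulting migration shape. The table names are
hypothetical, and the tenant_id foreign key is just the example from the
surrounding commits:

```sql
ALTER TABLE table_a ADD COLUMN IF NOT EXISTS tenant_id integer REFERENCES tenants(id);
-- Commit the current transaction (releasing the locks taken so far) and
-- immediately start a new one with the same characteristics.
COMMIT AND CHAIN;
ALTER TABLE table_b ADD COLUMN IF NOT EXISTS tenant_id integer REFERENCES tenants(id);
COMMIT AND CHAIN;
ALTER TABLE table_c ADD COLUMN IF NOT EXISTS tenant_id integer REFERENCES tenants(id);
```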
Test Plan: All continuous environments have already succeeded or failed
on this migration number, so we will manually run this again against
them with the migrator code to ensure the same code paths. If they
succeed we will keep the code as is, otherwise we will roll back.
Additionally we did lots of ad hoc testing to understand the
characteristics of Go and its transaction handling.
Co-authored-by: Erik Seliger <erikseliger@me.com>
We hit a deadlock when deploying this migration to S2. This is because
within the migration's transaction we likely didn't obtain table
locks in the same order as a transaction in our application code.
So this commit introduces a new migration metadata field,
"noTransaction". The migrator documentation says you should create a
migration per needed transaction. However, this would require us to
create hundreds of migrations. We believe the better approach is to
introduce this field and barely advertise it.
Reading the code which actually runs migrations, there is no extra
logic done outside of BEGIN; run_migration; COMMIT; so this change is
safe.
We update the migrations to avoid duplicating the function name we
introduce, in case something goes wrong (now that, without a wrapping
transaction, the function name could leak out).
Test Plan: The actual migrations are tested by go test. I added a test
assertion that we don't call Transact, but to be honest that is super
sketchy. However, we couldn't find any test fixtures which actually run
against the DB, so a proper test would require a much deeper investment
than the simplicity of the code change warrants.
Co-authored-by: Erik Seliger <erikseliger@me.com>
This reverts commit 43e06cef0c.
This migration deadlocked on S2. We didn't expect to have processes
taking table locks, but apparently we do. Reverting for now; we will
manually fix up S2 once the build has gone through. We probably need to
make this one migration per table :rip:
## Test plan
Revert.
We introduce the tenant_id column to every table without any magic
enforcement or magic default value. Instead we default to NULL and plan
to have an out-of-band migration. From our testing this change is
instant since it is only a table metadata change. According to the
documentation this sort of ALTER TABLE has been instant since Postgres 11.
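A minimal sketch of the shape of this migration for a couple of tables
(the table names here are only illustrative; the real migration repeats
this for every table):

```sql
-- Nullable column, no DEFAULT, no NOT NULL, no foreign key: adding it is a
-- catalog-only change, which is why it is effectively instant on Postgres 11+.
ALTER TABLE repo  ADD COLUMN IF NOT EXISTS tenant_id integer;
ALTER TABLE users ADD COLUMN IF NOT EXISTS tenant_id integer;
```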
Test Plan: ran this migration against a clone of the s2 database.
---------
Co-authored-by: Keegan Carruthers-Smith <keegan.csmith@gmail.com>
In some deployments, these are running in separate instances. This
migration makes sure that we can still reference the tenants from the
tenant_id column reliably everywhere.
It's critical that this migration is a noop on instances that already
have this table, so please double check me here during code review.
Test plan: Integration and E2E tests still pass, as this should be a
noop.
This PR adds a basic table for storing tenants in the database, and automatically inserts tenant 1 for existing instances.
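A minimal sketch of what such a migration could look like. The exact
column set is my assumption; only the table name and the seed row for
tenant 1 come from the description above:

```sql
CREATE TABLE IF NOT EXISTS tenants (
    id         bigint PRIMARY KEY,
    name       text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Existing instances get a default tenant with id 1.
INSERT INTO tenants (id, name)
VALUES (1, 'default')
ON CONFLICT (id) DO NOTHING;
```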
Test plan: Adds a new, not yet used table. The migration is very simple, so it shouldn't cause any issues for existing instances.
Fixes SPLF-119
This adds a background job to Search Jobs that periodically scans for
finished jobs to aggregate the status, upload logs, and clean up the
tables. This drastically reduces the size of the tables and improves the
performance of the Search Jobs GQL API.
For example, with this change, a finished search job on .com only has 1
entry in the database, whereas before it could have several million if
we searched each repository.
Notes:
- the diff seems larger than it actually is. I left a couple of comments
to help the reviewers.
## Test plan:
- new unit tests
- manual testing: I ran a couple of search jobs locally (with the
  janitor job interval set to 1 min) and checked that
  - logs are uploaded to `blobstore-go/buckets/search-jobs`
  - repo jobs are deleted from `exhaustive_repo_jobs`
  - logs are served from the blobstore after the janitor ran
  - downloading logs while the job is running still works
## Changelog
The new background job drastically reduces the size of the
`exhaustive_*` tables and improves performance of the Search Jobs GQL
API.
Closes SRCH-741
Closes SRCH-716
This PR moves the GitHub App installation code from a redirect-based flow
to a webhook-based approach. We expect that the GitHub server calls the
webhook when the installation is ready, and therefore we shouldn't see
the errors explained in the issues above.
To handle the potential delay until the webhook is called and the
credential is set up, I added a scrappy info notice that the user should
refresh their page:
<img width="928" alt="Screenshot 2024-07-24 at 13 48 24"
src="https://github.com/user-attachments/assets/4d298f2a-d7b8-423b-9e2f-2ae53fbce1ac">
Below is what you see after you refresh (or if the webhook was called
before the user was redirected back to the settings):
<img width="929" alt="Screenshot 2024-07-24 at 13 50 14"
src="https://github.com/user-attachments/assets/b6826158-8561-476d-b20e-e36f8cfb86fd">
I'm able to create PRs for sourcegraph-testing with an app that was
created this way.
<img width="1171" alt="Screenshot 2024-07-24 at 16 16 06"
src="https://github.com/user-attachments/assets/86e20acb-136f-4a46-a33b-bdfdd0d51d71">
I'm seeing an error when getting an access token with a personal github
app to run a batch change, but that will be handled with another PR.
<img width="1053" alt="Screenshot 2024-07-24 at 16 38 38"
src="https://github.com/user-attachments/assets/5655ba91-1ae4-453a-8d5c-1bcdbe34bc17">
## Test plan
Manual testing locally, and more testing to be done on S2 where we have
a more production-like environment.
## Changelog
- When installing a GitHub app for batch changes, the instance now waits
for a callback from GitHub to complete the installation to avoid issues
from eventual consistency.
---------
Co-authored-by: Peter Guy <peter.guy@sourcegraph.com>
This adds support for searching repo metadata with a regex pattern.
Background: repo metadata is a useful feature for shoehorning
business-specific information into the search query language. It allows
tagging repos with arbitrary metadata (think ownership info, quality
info, 3rd-party system IDs, etc.). This ends up being a useful escape
hatch to shim in functionality that is not natively supported in
Sourcegraph.
However it's currently limited to searching with an exact key/value
pair. We've had a few requests to extend this to allow searching by
pattern because it enables ingesting semi-structured data and making it
searchable.
This adds the ability to use a `/.../`-delimited regex pattern to match
against both keys and values. For example,
`repo:has.meta(team:/^my\/org/)`
The Prompt Library lets you create, share, and browse chat prompts for
use with Cody. Prompts are owned by users or organizations, and site
admins can make prompts public so that all users on the instance can see
and use them.
A prompt is just plain text for now, and you can see a list of prompts
in your Prompt Library from within Cody chat
(https://github.com/sourcegraph/cody/pull/4903).
See https://www.loom.com/share/f3124269300c481ebfcbd0a1e300be1b.
Depends on https://github.com/sourcegraph/cody/pull/4903.

## Test plan
Add a prompt on the web. Ensure you can access it from Cody.
## Changelog
- The Prompt Library lets you create, share, and browse chat prompts for
use with Cody. Prompts are owned by users or organizations, and site
admins can make prompts public so that all users on the instance can see
and use them. To use a prompt from your Prompt Library in Cody, select
it in the **Prompts** dropdown in the Cody chat message field.
**Public saved searches will let us make global saved searches for
dotcom and for customers to help them discover and share awesome search
queries!**
Saved searches now have:
- Visibility (public vs. secret). Only site admins may make a saved
search public. Secret saved searches are visible only to their owners
(either a user, or all members of the owning org). A public saved search
can be viewed by everyone on the instance.
- Draft status: If a saved search's "draft" checkbox is checked, that
means that other people shouldn't use that saved search yet. You're
still working on it.
- Timestamps: The last user to update a saved search and the creator of
the saved search are now recorded.
Also adds a lot more tests for saved search UI and backend code.



## Test plan
Create a saved search. Ensure it's in secret visibility to begin with.
As a site admin, make it public. Ensure other users can view it, and no
edit buttons are shown. Try changing visibility back and forth.
## Changelog
- Saved searches can now be made public (by site admins), which means
all users can view them. This is a great way to share useful search
queries with all users of a Sourcegraph instance.
- Saved searches can be marked as a "draft", which is a gentle indicator
that other people shouldn't use it yet.
This PR modifies the sub_repo_permissions table to store the IP
addresses associated with each Perforce path rule, as part of the
IP-based sub-repo permissions work.
The new IP column is implemented as a `text[]`, similar to the paths
column. The IP address associated with `paths[0]` is stored in the ips
column in `ips[0]`.
For example, the following protections table
```
Protections:
read user emily * //depot/elm_proj/...
write group devgrp * //...
write user * 192.168.41.0/24 -//...
write user * [2001:db8:1:2::]/64 -//...
write user joe * -//...
write user lisag * -//depot/...
write user lisag * //depot/doc/...
super user edk * //...
```
turns into the following rows in the sub_repo_permissions table
| repo_id | user_id | version | updated_at | paths | ips |
|---------|---------|---------|------------|-------|-----|
| 1 | 1 | 1 | 2023-07-01 10:00:00 | {`"//depot/elm_proj/..."`} | {`"*"`} |
| 1 | 2 | 1 | 2023-07-01 10:00:00 | {`"//..."`} | {`"*"`} |
| 1 | 3 | 1 | 2023-07-01 10:00:00 | {`"-//..."`} | {`"192.168.41.0/24"`} |
| 1 | 4 | 1 | 2023-07-01 10:00:00 | {`"-//..."`} | {`"[2001:db8:1:2::]/64"`} |
| 1 | 5 | 1 | 2023-07-01 10:00:00 | {`"-//..."`} | {`"*"`} |
| 1 | 6 | 1 | 2023-07-01 10:00:00 | {`"-//depot/..."`, `"//depot/doc/..."`} | {`"*"`, `"*"`} |
| 1 | 7 | 1 | 2023-07-01 10:00:00 | {`"//..."`} | {`"*"`} |
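A minimal sketch of the schema change this implies, assuming the new
column simply mirrors `paths` as a `text[]`:

```sql
-- ips[i] holds the IP/CIDR (or "*") that applies to paths[i].
ALTER TABLE sub_repo_permissions ADD COLUMN IF NOT EXISTS ips text[];
```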
## Test plan
The unit tests are in the sub_repo_permissions store PR that is built
on top of this PR:
https://app.graphite.dev/github/pr/sourcegraph/sourcegraph/63811/internal-database-sub_repo_permissions-modify-store-to-be-able-to-insert-ip-based-permissions
## Changelog
- The sub_repo_permissions table now has an ips column to store the IP
address associated with each path rule.
- Remove long-deprecated and long-ineffective notifications for saved
searches (removed in
de8ae5ee28
2.5 years ago). Note that code monitors were the replacement for saved
searches and work great.
- Clean up UI.
- Make the UI global instead of in the user/org area.
- Convert React class components to function components.
- Add default `patterntype:` because it's required.
- Use `useQuery` and `useMutation` instead of `requestGraphQL`.
- Use a single namespace `owner` GraphQL arg instead of separating out
`userID` and `orgID`.
- Clean up GraphQL resolver code and factor out common auth checking.
- Support transferring ownership of saved searches among owners (the
user's own user account and the orgs they're a member of).
(I know this is not in Svelte.)
SECURITY: There is one substantive change. Site admins may now view any
user's and any org's saved searches. This is so that they can audit and
delete them if needed.




## Test plan
Try creating, updating, and deleting saved searches, and transferring
ownership of them.
## Changelog
- Improved the saved searches feature, which lets you save search
queries to easily reuse them later and share them with other people in
an organization.
- Added the ability to transfer ownership of a saved search to a user's
organizations or from an organization to a user's account.
- Removed the long-deprecated and ineffective `search.savedQueries`
settings field. You can manage saved searches in a user's or
organization's profile area (e.g., at `/user/searches`).
Relates to https://github.com/sourcegraph/sourcegraph/pull/63472
This changes the default patternType of Notebooks from "standard" to
"keyword". As a consequence, all new notebooks will default to keyword
search. Existing notebooks will keep using standard search.
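A minimal sketch of the migration, assuming the `notebooks.pattern_type`
column described further down in this log; only the column default
changes, so existing rows keep their stored value:

```sql
ALTER TABLE notebooks ALTER COLUMN pattern_type SET DEFAULT 'keyword';
```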
Test plan:
- I ran the migration locally and verified that new notebooks use
"keyword" as default.
This PR refactors insights to support the upcoming Keyword Search GA.
New insights are always persisted with `patternType:`.
Queries of existing insights are updated with `patternType:standard
<query>` if they don't already specify a pattern type. This reflects the
current default of the Stream API and thus merely makes the pattern
type explicit.
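A sketch of the backfill, under the assumption that insight queries live
in an `insight_series.query` column (the real table and column names may
differ):

```sql
-- Prepend an explicit pattern type to queries that don't already carry one.
UPDATE insight_series
SET query = 'patternType:standard ' || query
WHERE query !~* 'patterntype:';
```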
Notes:
- The Insights UI has a lot of query input fields, but there are only 2
fields which allow the user to freely specify a pattern type: the
interactive insight on the landing page and the query input field in the
"Track Changes" creation form.
Test plan:
- new unit test
- existing tests pass without changes
- manual testing:
- I created a couple of insights and checked that the pattern type is
persisted in the db.
- I experimented with different default pattern types.
- I tested the workflow to create an insight from the search results
page.
- I created a "language" insights and ran the migration to make sure we
don't set pattern type in those cases.
This PR refactors notebooks to support the upcoming Keyword Search GA. The main goal is to make it easier to switch to a new default pattern type without breaking existing notebooks.
**Before**
- pattern type and version were hardcoded in several places
**After**
- Each notebook has a read-only pattern type as determined by the new column `notebooks.pattern_type` (defaults to "standard").
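A minimal sketch of the column addition; the `text` type is my
assumption, while the column name and default come from the line above:

```sql
ALTER TABLE notebooks
    ADD COLUMN IF NOT EXISTS pattern_type text NOT NULL DEFAULT 'standard';
```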
**Notes**
- Notebooks call the Stream API via various helper functions. `patternType` and `version` are both required parameters, which is redundant, because version acts as a default pattern type already. I left a TODO in the code for that. I don't want to change this as part of this PR because the change would get very big and affect too much code outside of Notebooks.
- We support rendering notebooks with `.snb.md` extension. Unlike notebooks stored in our db, we cannot migrate those files.
**Q&A**
Q: How does this help for Keyword Search GA?
A: Once we default to keyword search, we can change the default of `notebooks.pattern_type` from "standard" to "keyword". Existing notebooks will still work with "standard". New Notebooks will use "keyword".
Q: How can customers migrate existing notebooks to a new version?
A: Use the existing "Copy to My Notebooks" function of Notebooks. The copied notebook will have the current default pattern type.
Test plan:
- existing tests pass
- manual testing
- I created a couple of notebooks with all the different block types and verified via the network tab that all requests to the Stream API have the proper pattern type. I played around with different values in `notebooks.pattern_type` to make sure that the request parameters change.
When a user's external account entry for SCIM has been deleted in the
past, updating the user again fails: the soft-deleted record is found
and then the insertion fails.
This PR fixes that by adjusting the unique index to only consider
non-soft-deleted external accounts.
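A sketch of the index adjustment. The index and column names are
assumptions; the substantive change is the partial-index predicate that
excludes soft-deleted rows:

```sql
DROP INDEX IF EXISTS user_external_accounts_account;
CREATE UNIQUE INDEX user_external_accounts_account
    ON user_external_accounts (service_type, service_id, client_id, account_id)
    WHERE deleted_at IS NULL;
```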
Closes SRC-383
Test plan:
Added a test suite for the DB method - it was untested before :(
The search jobs tables are missing an index on the state column that
most worker tables have; the dequeue query on dotcom takes 12.07s at
the moment.
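A sketch of the missing index, assuming the worker table is
`exhaustive_search_jobs` (the other search-jobs worker tables would get
the same treatment):

```sql
CREATE INDEX IF NOT EXISTS exhaustive_search_jobs_state
    ON exhaustive_search_jobs (state);
```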
Test plan:
`sg migration up --db frontend; sg migration undo`
Previously we failed a code monitor search if any of the revisions could not be resolved. For example, this happens if an empty repo is added to Sourcegraph. Now we log the error and continue with the job. The log message is shown in the Code Monitors log view.
This fixes two small pain points with automated treatment of feature flags:
1) A user cannot add an override unless the feature flag has already been created and a default has been set. However, in all our code, we require the caller to specify a default if the feature flag doesn't exist, so this is not a necessary limitation. It's also particularly annoying because only site admins can create a feature flag.
2) In order to update an override, a user has to first fetch the override to see if it exists, then update it if it exists or create it if it doesn't. This modifies create to just overwrite if the override already exists.
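In SQL terms, point 2 turns the create path into an upsert. A sketch
with hypothetical table and column names, assuming a unique constraint
on (namespace_user_id, flag_name):

```sql
INSERT INTO feature_flag_overrides (namespace_user_id, flag_name, flag_value)
VALUES (42, 'my-feature', true)
ON CONFLICT (namespace_user_id, flag_name)
DO UPDATE SET flag_value = excluded.flag_value;
```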
These fields were used in the cloud v1 implementation, but are no longer used. All the records have been removed from Sourcegraph.com. As a precaution, I've added ANOTHER round of deletes here.
Closes #43115
We have a number of docs links in the product that point to the old doc site.
Method:
- Search the repo for `docs.sourcegraph.com`
- Exclude the `doc/` dir, all test fixtures, and `CHANGELOG.md`
- For each, replace `docs.sourcegraph.com` with `sourcegraph.com/docs`
- Navigate to the resulting URL ensuring it's not a dead link, updating the URL if necessary
Many of the URLs updated are just comments, but since I'm doing a manual audit of each URL anyway, I felt it was worth updating these while I was at it.
Adds a basic schema for syntactic code intel indexes (up and down migration)
Sets up everything required to use dbworker interface
Adds a stub indexing worker implementation
Adds tests to ensure conformance with dbworker interface (run against real postgres)
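A rough sketch of the kind of queue table the dbworker interface
expects. The table name and domain columns are hypothetical, and the
state/bookkeeping columns follow the convention used by other worker
tables, so treat them as assumptions:

```sql
CREATE TABLE IF NOT EXISTS syntactic_indexing_jobs (
    id              bigserial PRIMARY KEY,
    -- Domain columns (hypothetical):
    repository_id   integer NOT NULL,
    commit          text NOT NULL,
    -- Bookkeeping columns conventionally required by dbworker:
    state           text NOT NULL DEFAULT 'queued',
    failure_message text,
    queued_at       timestamptz NOT NULL DEFAULT now(),
    started_at      timestamptz,
    finished_at     timestamptz,
    process_after   timestamptz,
    num_resets      integer NOT NULL DEFAULT 0,
    num_failures    integer NOT NULL DEFAULT 0,
    execution_logs  json[],
    worker_hostname text NOT NULL DEFAULT '',
    cancel          boolean NOT NULL DEFAULT false
);

CREATE INDEX IF NOT EXISTS syntactic_indexing_jobs_state
    ON syntactic_indexing_jobs (state);
```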
* Add telemetry events for file preview and filters panels
* Generate allow list event migrations
* Fix prettier and eslint problems
* Fix eslint and rebase fixes
* Fix failing search jobs site_test.go
* Fix telemetry service missing prop problem
Implement expiry for access tokens; by default all newly created access tokens will expire. There is site configuration available at auth.accessToken to enable tokens without expiration and to configure the expiration values (in days) that are presented to users and selected by default.
Co-authored-by: Erik Seliger <erikseliger@me.com>
As part of the embeddings policy framework, a worker periodically checks what
repos can be embedded. For every candidate repo, it queries the DB to see if
there's a new revision to embed. This runs every minute and becomes
increasingly expensive as the jobs table fills up with more entries over time.
This change makes small optimizations to improve this:
* Add an index to make selecting on `repo_id` and `revision` much faster
(see the sketch after this list)
* Check the repos every 5 minutes instead of 1 minute. This shouldn't make a
huge difference in user experience, since by default embeddings jobs aren't
allowed to be scheduled within 24h of the last run
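A sketch of the first bullet's index, assuming the jobs table is
`repo_embedding_jobs`:

```sql
CREATE INDEX IF NOT EXISTS repo_embedding_jobs_repo_id_revision
    ON repo_embedding_jobs (repo_id, revision);
```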
This removes contention around the single table. If you have a small
number of shards and lots of repositories that are cloned/recloned, they
all try to write to the same table concurrently.
What this does is change the `gitserver_repos_statistics` table to work
like `repo_statistics`.
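I'm assuming `repo_statistics` follows the usual append-only delta
pattern; if so, the change amounts to roughly the sketch below (column
names are hypothetical): writers insert small delta rows instead of all
updating one row per shard, and readers aggregate.

```sql
CREATE TABLE IF NOT EXISTS gitserver_repos_statistics (
    shard_id text NOT NULL,
    total    bigint NOT NULL DEFAULT 0,
    cloning  bigint NOT NULL DEFAULT 0,
    cloned   bigint NOT NULL DEFAULT 0
);

-- A shard records "one repo finished cloning" as a delta row:
INSERT INTO gitserver_repos_statistics (shard_id, total, cloning, cloned)
VALUES ('gitserver-1', 0, -1, 1);

-- Readers aggregate the deltas:
SELECT shard_id, SUM(total) AS total, SUM(cloning) AS cloning, SUM(cloned) AS cloned
FROM gitserver_repos_statistics
GROUP BY shard_id;
```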
Co-authored-by: Erik Seliger <erikseliger@me.com>
Fix the slow "list indexers" GraphQL endpoint, which is especially
problematic on the upload/index list view when not scoped to a
particular repository.
Previously, querying repos by topic was pretty messy, especially after we added support for GitLab topics. We had to dig down into the metadata, which is not particularly nicely structured JSON for GitHub, and we had to do this both for the index and for queries.
This adds a generated column `topics`, which isolates the complexity of extracting the topics from the metadata while maintaining normalization. It replaces the codehost-specific indexes with a single GIN index on that column so that querying for repos with a specific topic is both fast and easy.
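A sketch of the idea, assuming (for illustration only) that topics live
under a `topics` key of the `metadata` JSON; the real extraction is
codehost-specific:

```sql
-- Generated columns require an immutable expression, so the extraction is
-- wrapped in an immutable helper function.
CREATE OR REPLACE FUNCTION extract_topics(metadata jsonb) RETURNS text[]
LANGUAGE sql IMMUTABLE AS $$
    SELECT CASE
        WHEN jsonb_typeof(metadata->'topics') = 'array'
            THEN ARRAY(SELECT jsonb_array_elements_text(metadata->'topics'))
        ELSE '{}'::text[]
    END
$$;

ALTER TABLE repo
    ADD COLUMN IF NOT EXISTS topics text[]
    GENERATED ALWAYS AS (extract_topics(metadata)) STORED;

-- One GIN index replaces the codehost-specific indexes; topic lookups become
-- a fast array-containment check.
CREATE INDEX IF NOT EXISTS repo_topics_idx ON repo USING GIN (topics);
```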
Gazelle wrongly assumes that all files embedded with
embedsrcs are present on disk, while the go_library rule
fully supports using targets that generate those inputs.
This has been confusing for teammates. Alas, the only option
we have is to add a gazelle:ignore pragma so we don't get the
confusing warning.
I added a comment in that BUILD file to reflect that and to inform
teammates that if they add a new Go file there, they have
to update the rule manually.