sourcegraph

mirror of https://github.com/sourcegraph/sourcegraph.git synced 2026-02-06 18:31:54 +00:00

Michael Bahr f61e637062 feat(code insights): language stats speed improvements by using archive loading (#62946)

We previously improved the performance of Language Stats Insights by introducing parallel requests to gitserver: https://github.com/sourcegraph/sourcegraph/pull/62011

This PR replaces the previous approach, where we would iterate through and request each file from gitserver, with an approach where we request just one archive. This eliminates a lot of network traffic and gives us an additional(!) performance improvement of 70-90%. Even repositories like chromium (42GB) can now be processed (on my machine in just one minute).

---

Caching:

We dropped most of the caching and kept only the top-level caching (repo@commit). This means that we only need to compute the language stats once per commit, and subsequent users/requests can see the cached data. We dropped the file/directory level caching because (1) the code to do that got very complex and (2) we can assume that most repositories are able to compute within the 5 minute timeout (which can be increased via the environment variable `GET_INVENTORY_TIMEOUT`).

The timeout is no longer bound to the user's request. Before, the frontend would request the stats up to three times to let the computation move forward and pick up where the last request aborted. While we still have this frontend retry mechanism, we don't have to worry about an abort-and-continue mechanism in the backend.

---

Credits for the code to @eseliger: https://github.com/sourcegraph/sourcegraph/issues/62019#issuecomment-2119278481

I've taken the diff and updated the caching methods to allow for more advanced use cases should we decide to introduce more caching. We can take that out again if the current caching is sufficient.

Todos:

- [x] Check if CI passes, manual testing seems to be fine
- [x] Verify that insights are cached at the top level

---

Test data:

- sourcegraph/sourcegraph: 9.07s (main) -> 1.44s (current): 74% better
- facebook/react: 17.52s (main) -> 0.87s (current): 95% better
- godotengine/godot: 28.92s (main) -> 1.98s (current): 93% better
- chromium/chromium: ~1 minute: 100% better, because it didn't compute before

## Changelog

- Language stats queries now request one archive from gitserver instead of individual file requests. This leads to a huge performance improvement. Even extra large repositories like chromium are now able to compute within one minute. Previously they timed out.

## Test plan

- New unit tests
- Plenty of manual testing

2024-07-18 08:40:48 +02:00
appliance	chore(appliance): Stub out react UI expected URIs and JSON API (#63741)	2024-07-15 14:48:38 -04:00
batcheshelper	bazel: transcribe test ownership to bazel tags (#62664)	2024-05-16 15:51:16 +01:00
blobstore	bazel: transcribe test ownership to bazel tags (#62664)	2024-05-16 15:51:16 +01:00
bundled-executor	bazel: transcribe test ownership to bazel tags (#62664)	2024-05-16 15:51:16 +01:00
cody-gateway	Update flagging.go	2024-07-16 07:15:40 -07:00
cody-gateway-config	Several fixes around merging modelconfig, and the current Cody Gateway data (#63814)	2024-07-15 17:14:28 +00:00
embeddings	lib/background: upgrade `Routine` interface with context and errors (#62136)	2024-05-24 10:04:55 -04:00
enterprise-portal	fix/enterpriseportal: drop old gorm fk constraints (#63864)	2024-07-17 14:16:02 -07:00
executor	chore/deps: upgrade grpc, prometheus/common (#63328)	2024-06-19 09:55:44 -04:00
executor-kubernetes	bazel: transcribe test ownership to bazel tags (#62664)	2024-05-16 15:51:16 +01:00
frontend	feat(code insights): language stats speed improvements by using archive loading (#62946)	2024-07-18 08:40:48 +02:00
gitserver	Update comment and decode bytes instead (#63754)	2024-07-11 09:40:51 +02:00
loadtest	chore(bazel): update ownership tags to increase coverage (#63001)	2024-05-31 14:10:29 +00:00
migrator	chore(ci): conditionally stamp genrules (#63204)	2024-06-12 15:04:43 +02:00
msp-example	msp/runtime: split contract into JobContract and ServiceContract (#63494)	2024-06-26 19:46:10 +00:00
pings	msp/runtime: split contract into JobContract and ServiceContract (#63494)	2024-06-26 19:46:10 +00:00
precise-code-intel-worker	chore: Change errors.HasType to respect multi-errors (#63024)	2024-06-06 13:02:14 +00:00
repo-updater	scheduler: Simplify query for uncloned repos (#63681)	2024-07-10 02:24:32 +02:00
searcher	Structural search: fix precise lang filtering (#63791)	2024-07-15 09:20:21 +02:00
server	feat/bazel: `//cmd/{frontend,server}` targets that don't include client bundle for backend integration tests (#62877)	2024-05-28 14:32:48 +01:00
sourcegraph	support fast, simple `sg start single-program-experimental-blame-sqs` for local dev (#63435)	2024-06-24 21:12:47 +00:00
symbols	symbols: Make symbols specific code internal (#63736)	2024-07-10 01:26:22 +02:00
syntactic-code-intel-worker	Syntactic indexing produce scip files (#63580)	2024-07-09 13:49:55 +02:00
telemetry-gateway	chore/telemetrygateway: gracefully handle sams introspectToken cancelation (#63809)	2024-07-15 10:45:00 -07:00
worker	chore(worker): disable jobs based on ENVs (#63853)	2024-07-16 18:07:22 +02:00
README.md	Reminder to keep architecture diagram in-sync (#36869)	2022-06-08 19:40:36 -07:00

README.md

This directory contains Sourcegraph services and binaries.

When a service is added or removed, or when a service's dependencies change, update our architecture diagram.