mirror of
https://github.com/sourcegraph/sourcegraph.git
synced 2026-02-06 18:31:54 +00:00
We previously improved the performance of Language Stats Insights by introducing parallel requests to gitserver: https://github.com/sourcegraph/sourcegraph/pull/62011

This PR replaces the previous approach, where we would iterate through and request each file from gitserver, with an approach where we request just one archive. This eliminates a lot of network traffic and gives us an additional(!) performance improvement of 70-90%. Even repositories like chromium (42GB) can now be processed (on my machine in just one minute).

---

Caching: We dropped most of the caching and kept only the top-level caching (repo@commit). This means that we only need to compute the language stats once per commit, and subsequent users/requests see the cached data. We dropped the file/directory-level caching because (1) the code to do that got very complex and (2) we can assume that most repositories are able to compute within the 5-minute timeout (which can be increased via the environment variable `GET_INVENTORY_TIMEOUT`).

The timeout is no longer bound to the user's request. Before, the frontend would request the stats up to three times to let the computation move forward and pick up where the last request aborted. While we still have this frontend retry mechanism, we no longer need an abort-and-continue mechanism in the backend.

---

Credits for the code to @eseliger: https://github.com/sourcegraph/sourcegraph/issues/62019#issuecomment-2119278481

I've taken the diff and updated the caching methods to allow for more advanced use cases should we decide to introduce more caching. We can take that out again if the current caching is sufficient.

Todos:

- [x] Check if CI passes, manual testing seems to be fine
- [x] Verify that insights are cached at the top level

---

Test data:

- sourcegraph/sourcegraph: 9.07s (main) -> 1.44s (current): 74% better
- facebook/react: 17.52s (main) -> 0.87s (current): 95% better
- godotengine/godot: 28.92s (main) -> 1.98s (current): 93% better
- chromium/chromium: ~1 minute: 100% better, because it didn't compute before

## Changelog

- Language stats queries now request one archive from gitserver instead of issuing individual file requests. This leads to a huge performance improvement. Even extra-large repositories like chromium can now be computed within one minute. Previously they timed out.

## Test plan

- New unit tests
- Plenty of manual testing

||
|---|---|---|
| appliance | ||
| batcheshelper | ||
| blobstore | ||
| bundled-executor | ||
| cody-gateway | ||
| cody-gateway-config | ||
| embeddings | ||
| enterprise-portal | ||
| executor | ||
| executor-kubernetes | ||
| frontend | ||
| gitserver | ||
| loadtest | ||
| migrator | ||
| msp-example | ||
| pings | ||
| precise-code-intel-worker | ||
| repo-updater | ||
| searcher | ||
| server | ||
| sourcegraph | ||
| symbols | ||
| syntactic-code-intel-worker | ||
| telemetry-gateway | ||
| worker | ||
| README.md | ||
This directory contains Sourcegraph services and binaries.
When a service is added or removed, or when a service's dependencies change, update our architecture diagram.
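
The commit message at the top of this page describes computing language stats from a single archive instead of requesting each file separately. The idea can be illustrated with a minimal Go sketch that walks a tar stream once and sums file sizes per extension. This is not the Sourcegraph implementation: the function names `tallyByExtension` and `buildArchive` are hypothetical, and grouping by file extension is a deliberate simplification standing in for real language detection.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"io"
	"path"
)

// tallyByExtension reads a tar stream in a single pass and returns the total
// size in bytes of regular files, grouped by file extension.
// Hypothetical helper: extension grouping is a stand-in for language detection.
func tallyByExtension(r io.Reader) (map[string]int64, error) {
	totals := make(map[string]int64)
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			// End of archive: every entry has been visited exactly once.
			return totals, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg {
			// Skip directories, symlinks, and other non-file entries.
			continue
		}
		ext := path.Ext(hdr.Name)
		if ext == "" {
			ext = "(none)"
		}
		// The header already carries the size, so the body need not be read.
		totals[ext] += hdr.Size
	}
}

// buildArchive creates a small in-memory tar archive for demonstration.
// Hypothetical helper: in practice the archive would come from a remote service.
func buildArchive(files map[string]string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for name, body := range files {
		hdr := &tar.Header{
			Name:     name,
			Mode:     0o644,
			Size:     int64(len(body)),
			Typeflag: tar.TypeReg,
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}
```

Passing the buffer returned by `buildArchive` to `tallyByExtension` yields one map for the whole archive. A result like this could then be stored under a single repo@commit key, which is the top-level caching the commit message says was kept.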