schema: Document perf scaling for occurrences(...) (#62941)

Fixes GRAPH-634
Varun Gandhi 2024-06-04 18:07:44 +08:00 committed by GitHub
parent a23d241c6d
commit 20c1c15608

@@ -118,7 +118,26 @@ type CodeGraphData {
"""
Occurrences are guaranteed to be sorted by range. It is possible
for there to be multiple occurrences for the same exact source range.
At the moment, using higher values of 'first:' should not cause
significantly worse performance.
As an example, for a large C++ codebase, here are the percentiles
for the number of occurrences per document.
| Percentile | Occurrences count |
| ---------- | ----------------- |
| 50 | 241 |
| 90 | 2443 |
| 95 | 4554 |
| 99 | 15869 |
| 99.9 | 94465 |
| 100 | 707850 |
"""
# Tip: One can get the percentiles above using the scip CLI's 'stats'
# subcommand. (https://github.com/sourcegraph/scip/blob/main/docs/CLI.md#scip-stats)
#
# TODO(issue: GRAPH-635): Allow passing a filter here to only get
# occurrences for specific lines.
occurrences(first: Int, after: String): SCIPOccurrenceConnection
}
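
A client could page through `occurrences(first:, after:)` roughly as in the sketch below. It assumes each page returns a list of occurrences plus an end cursor that is absent on the last page, in the usual Relay connection style. The schema excerpt above does not show the fields of `SCIPOccurrenceConnection`, so that shape, `iter_occurrences`, and `make_fake_fetcher` are all hypothetical. The fake fetcher stands in for a real GraphQL request so the loop can be run without a server.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical page shape: (occurrences, end_cursor).
# end_cursor is None when there are no more pages.
Page = Tuple[List[Dict], Optional[str]]


def iter_occurrences(
    fetch_page: Callable[[int, Optional[str]], Page],
    first: int = 1000,
) -> Iterator[Dict]:
    """Yield every occurrence by following cursors until the last page."""
    after: Optional[str] = None
    while True:
        nodes, end_cursor = fetch_page(first, after)
        yield from nodes
        if end_cursor is None:
            return
        after = end_cursor


def make_fake_fetcher(total: int) -> Callable[[int, Optional[str]], Page]:
    """Build an in-memory stand-in for the GraphQL call (assumption, not the real API)."""
    # Occurrences are kept sorted by range, mirroring the documented guarantee.
    data = [{"range": [i, 0, i, 1]} for i in range(total)]

    def fetch_page(first: int, after: Optional[str]) -> Page:
        start = int(after) if after is not None else 0
        end = min(start + first, len(data))
        cursor = str(end) if end < len(data) else None
        return data[start:end], cursor

    return fetch_page
```

With a page size of 1000, a document at the 90th percentile in the table above (2443 occurrences) takes three requests. Since higher values of 'first:' are not expected to be significantly slower, a larger page size mainly reduces the number of round trips.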