sourcegraph/cmd
Ryan Slade 9214e6656f
repo-updater: Add critical alerts (#14530)
* repo-updater: Add critical alerts

The number of user-added repos is greater than 90% of our hard limit of
200k.
The number of external services added is greater than 20k.
No external services have synced in more than 8 hours. By default we
back off for up to 8 hours, so anything higher than this indicates a
problem.
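
The three conditions above can be sketched as plain threshold checks. This is a minimal illustration only: the numeric thresholds come from the commit message, but every identifier (`snapshot`, `firingAlerts`, the alert names) is hypothetical, and the real alerts are defined through the monitoring package in monitoring/repo_updater.go rather than in code like this.

```go
package main

import (
	"fmt"
	"time"
)

// Thresholds taken from the commit message; all identifiers are hypothetical.
const (
	userReposHardLimit  = 200000                      // hard limit on user-added repos
	userReposAlertAbove = userReposHardLimit * 9 / 10 // 90% of the hard limit
	externalServicesMax = 20000                       // alert above this many external services
	maxSyncBackoff      = 8 * time.Hour               // default maximum sync backoff
)

// snapshot is a hypothetical point-in-time view of repo-updater state.
type snapshot struct {
	userAddedRepos   int
	externalServices int
	sinceLastSync    time.Duration
}

// firingAlerts returns the names of the alerts whose condition holds for s.
// Each comparison is strict, matching "greater than" / "more than" in the text.
func firingAlerts(s snapshot) []string {
	var alerts []string
	if s.userAddedRepos > userReposAlertAbove {
		alerts = append(alerts, "user_added_repos")
	}
	if s.externalServices > externalServicesMax {
		alerts = append(alerts, "external_services_total")
	}
	if s.sinceLastSync > maxSyncBackoff {
		alerts = append(alerts, "no_recent_sync")
	}
	return alerts
}

func main() {
	// Example values are made up for illustration.
	s := snapshot{userAddedRepos: 185000, externalServices: 1200, sinceLastSync: 9 * time.Hour}
	fmt.Println(firingAlerts(s))
}
```

Because the comparisons are strict, a value sitting exactly on a threshold (for example, a last sync exactly 8 hours ago) does not fire in this sketch.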

* syncer_sync_start should be a rate

Alert when we've performed <= 1 syncs for 8 hours.

Ideally we want this to be <= 0 but our monitoring package doesn't allow
this. Therefore we need to make this a warning as it'll fire for
instances that only have 1 external service defined.

* Fill in possible solutions and remove unnecessary panel

* Adjust rate

* Update monitoring/repo_updater.go

Co-authored-by: Tomás Senart <tomas@sourcegraph.com>

* Update monitoring/repo_updater.go

Co-authored-by: Robert Lin <robert@bobheadxi.dev>

* Update possible error descriptions

* Fix alert

* Switch alert to critical

Co-authored-by: Tomás Senart <tomas@sourcegraph.com>
Co-authored-by: Robert Lin <robert@bobheadxi.dev>
2020-10-09 15:58:35 +02:00
frontend     | search: indexed config API supports multiple repo query arguments (#14558) | 2020-10-09 15:08:00 +02:00
github-proxy | trace: Move init body into a function called explicitly (#13610) | 2020-09-02 20:44:36 -05:00
gitserver    | all: add keegancsmith to CODENOTIFY for many pkgs (#14241) | 2020-09-29 11:03:53 +02:00
loadtest     | all: update alpine 3.10 -> 3.12 (#13248) | 2020-08-21 16:29:48 -07:00
query-runner | trace: Move init body into a function called explicitly (#13610) | 2020-09-02 20:44:36 -05:00
repo-updater | repo-updater: Add critical alerts (#14530) | 2020-10-09 15:58:35 +02:00
searcher     | CODENOTIFY: beyang subscriptions (#14396) | 2020-10-03 15:46:42 -07:00
server       | Remove non-OSS GraphQL, INI, TOML, Perforce syntax highlighting. (#14456) - take two (#14465) | 2020-10-07 12:46:40 -07:00
symbols      | all: add keegancsmith to CODENOTIFY for many pkgs (#14241) | 2020-09-29 11:03:53 +02:00