Robert Lin
294d4b47f8
telemetry-gateway: wrap publish errors in error details for Sentry ( #61895 )
...
We aggregate all errors on a single log entry to get accurate representations of issues in Sentry, while not generating thousands of log entries at the same time. Because this means we only get higher-level logger context, we must annotate the errors directly with some hidden details to preserve Sentry grouping while adding context for diagnostics.
We do this by using the `WithSafeDetails` helper from cockroachdb/errors, which annotates an error with details that are _not_ rendered by `err.Error()` but _are_ included in Sentry reports (via the `Message` field). Currently, we annotate the event ID, feature, action, and source (see example output below).
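To make the idea concrete, here is a minimal stdlib-only sketch of an error annotated with details that stay out of `Error()` but can still be picked up by a reporter. It is not the cockroachdb/errors implementation; `detailedError`, `withDetails`, `reportMessage`, and `examplePublishError` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// detailedError is a hypothetical stand-in for an error annotated with
// hidden details. It is not the cockroachdb/errors implementation.
type detailedError struct {
	cause   error
	details string
}

// Error renders only the cause, so the details never change the message
// that callers (or message-based grouping) see.
func (e *detailedError) Error() string { return e.cause.Error() }

// Unwrap keeps the annotated error transparent to errors.Is and errors.As.
func (e *detailedError) Unwrap() error { return e.cause }

// withDetails attaches details to err without changing err.Error().
func withDetails(err error, details string) error {
	if err == nil {
		return nil
	}
	return &detailedError{cause: err, details: details}
}

// reportMessage emulates a reporter that appends every hidden detail found
// in the error chain to the visible message.
func reportMessage(err error) string {
	msg := err.Error()
	for e := err; e != nil; e = errors.Unwrap(e) {
		if d, ok := e.(*detailedError); ok {
			msg += "\n" + d.details
		}
	}
	return msg
}

// examplePublishError builds an annotated error similar in shape to the
// errors exercised by the test in this commit.
func examplePublishError() error {
	base := errors.New("event publish failed")
	return withDetails(base, `feature:"feature_0" action:"action_0" id:"id_0"`)
}

func main() {
	err := examplePublishError()
	fmt.Println(err.Error())        // visible message only
	fmt.Println(reportMessage(err)) // message plus hidden details
}
```

Because every annotated error still renders as `event publish failed`, anything that groups on the message sees a single issue, while the per-event details remain available in the report.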
Context on this request: https://sourcegraph.slack.com/archives/C06CCJR4K9R/p1713054354443579
## Test plan
_Super_ jank unit tests that try to emulate how we build Sentry reports, relying on knowledge of how our Sentry report building works internally. I opened https://github.com/sourcegraph/log/issues/65 in case there are new use cases for this in the future. Running the test with `-v` demonstrates the output:
```
$ go test -timeout 30s -run ^TestSummarizeFailedEvents$ github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server -v
=== RUN TestSummarizeFailedEvents
=== RUN TestSummarizeFailedEvents/all_failed
=== RUN TestSummarizeFailedEvents/all_failed/Sentry_report
publish_events_test.go:83: Sentry Error message for field "error.0":
publish_events_test.go:38: event publish failed
(1) feature:"feature_0" action:"action_0" id:"id_0" server:{version:"TestSummarizeFailedEvents/all_failed"} client:{name:"test_client"}
Wraps: (2) attached stack trace
-- stack trace:
| github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
| /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
| testing.tRunner
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
| runtime.goexit
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
Wraps: (3) event publish failed
Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
-- report composition:
*errutil.leafError: event publish failed
publish_events_test.go:38: *withstack.withStack (top exception)
*safedetails.withSafeDetails: feature:"feature_0" action:"action_0" id:"id_0" server:{version:"TestSummarizeFailedEvents/all_failed"} client:{name:"test_client"}
publish_events_test.go:83: Sentry Error message for field "error.1":
publish_events_test.go:38: event publish failed
(1) feature:"feature_1" action:"action_1" id:"id_1" server:{version:"TestSummarizeFailedEvents/all_failed"}
Wraps: (2) attached stack trace
-- stack trace:
| github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
| /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
| testing.tRunner
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
| runtime.goexit
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
Wraps: (3) event publish failed
Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
-- report composition:
*errutil.leafError: event publish failed
publish_events_test.go:38: *withstack.withStack (top exception)
*safedetails.withSafeDetails: feature:"feature_1" action:"action_1" id:"id_1" server:{version:"TestSummarizeFailedEvents/all_failed"}
publish_events_test.go:83: Sentry Error message for field "error.2":
publish_events_test.go:38: event publish failed
(1) feature:"feature_2" action:"action_2" id:"id_2" server:{version:"TestSummarizeFailedEvents/all_failed"} client:{name:"test_client"}
Wraps: (2) attached stack trace
-- stack trace:
| github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
| /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
| testing.tRunner
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
| runtime.goexit
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
Wraps: (3) event publish failed
Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
-- report composition:
*errutil.leafError: event publish failed
publish_events_test.go:38: *withstack.withStack (top exception)
*safedetails.withSafeDetails: feature:"feature_2" action:"action_2" id:"id_2" server:{version:"TestSummarizeFailedEvents/all_failed"} client:{name:"test_client"}
publish_events_test.go:83: Sentry Error message for field "error.3":
publish_events_test.go:38: event publish failed
(1) feature:"feature_3" action:"action_3" id:"id_3" server:{version:"TestSummarizeFailedEvents/all_failed"}
Wraps: (2) attached stack trace
-- stack trace:
| github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
| /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
| testing.tRunner
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
| runtime.goexit
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
Wraps: (3) event publish failed
Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
-- report composition:
*errutil.leafError: event publish failed
publish_events_test.go:38: *withstack.withStack (top exception)
*safedetails.withSafeDetails: feature:"feature_3" action:"action_3" id:"id_3" server:{version:"TestSummarizeFailedEvents/all_failed"}
publish_events_test.go:83: Sentry Error message for field "error.4":
publish_events_test.go:38: event publish failed
(1) feature:"feature_4" action:"action_4" id:"id_4" server:{version:"TestSummarizeFailedEvents/all_failed"} client:{name:"test_client"}
Wraps: (2) attached stack trace
-- stack trace:
| github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
| /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
| testing.tRunner
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
| runtime.goexit
| /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
Wraps: (3) event publish failed
Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
-- report composition:
*errutil.leafError: event publish failed
publish_events_test.go:38: *withstack.withStack (top exception)
*safedetails.withSafeDetails: feature:"feature_4" action:"action_4" id:"id_4" server:{version:"TestSummarizeFailedEvents/all_failed"} client:{name:"test_client"}
=== RUN TestSummarizeFailedEvents/some_failed
=== RUN TestSummarizeFailedEvents/all_succeeded
=== RUN TestSummarizeFailedEvents/all_succeeded_(large_set)
--- PASS: TestSummarizeFailedEvents (0.23s)
--- PASS: TestSummarizeFailedEvents/all_failed (0.22s)
--- PASS: TestSummarizeFailedEvents/all_failed/Sentry_report (0.00s)
--- PASS: TestSummarizeFailedEvents/some_failed (0.00s)
--- PASS: TestSummarizeFailedEvents/all_succeeded (0.00s)
--- PASS: TestSummarizeFailedEvents/all_succeeded_(large_set) (0.00s)
PASS
ok github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server (cached)
```
2024-04-15 23:19:29 +00:00
Jean-Hadrien Chabran
a9c3f8ce9a
chore: links/ownership devx->dev-infra ( #58999 )
2023-12-14 15:07:20 +00:00
Noah S-C
e13606a86f
migrator: exit 0 if autoupgrade set on up ( #53175 )
...
When auto-upgrade intent is signaled, we want the frontend to do the
migration instead of migrator. Because migrator runs `up` by default
(unless manually changed to something else or invoked), we check before
that command runs whether we should auto-upgrade, and exit with 0 if so.
## Test plan
Tested locally in docker-compose setup
2023-06-13 20:51:09 +00:00
Robert Lin
ba29f2da26
lib/errors: document redaction, upgrade error deps ( #53109 )
...
Did you know: all args to `errors.Newf`, `errors.Wrapf`, etc. are
considered sensitive, and redacted before errors are reported to Sentry?
That's where all the error reports that look useless come from:
```
x
x
x
x
x
```
This change documents how this works based on cockroachdb error docs,
and also registers a set of arg types that can automatically be
considered safe based on what is configured in cockroachdb itself
(basically most primitive types except string, such as ints for status
codes and whatnot).
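As a rough illustration of the behaviour described above (not the cockroachdb/errors implementation), the sketch below redacts every format argument unless its type is on a small allowlist. The names `redactArgs` and `reportf`, the `x` placeholder, and the exact set of safe types are assumptions made for this example.

```go
package main

import "fmt"

// redactArgs replaces each argument with a placeholder unless its type is
// treated as safe. The allowlist here is an assumption for illustration:
// common primitive types except string.
func redactArgs(args ...any) []any {
	out := make([]any, len(args))
	for i, a := range args {
		switch a.(type) {
		case bool, int, int32, int64, uint, uint32, uint64, float32, float64:
			out[i] = a // considered safe, kept as-is
		default:
			out[i] = "x" // everything else, including strings, is redacted
		}
	}
	return out
}

// reportf formats a message the way a redacting reporter might see it.
func reportf(format string, args ...any) string {
	return fmt.Sprintf(format, redactArgs(args...)...)
}

func main() {
	fmt.Println(reportf("request to %s failed with status %d",
		"https://example.com/secret", 503))
}
```

The format string itself is kept intact, so a report stays readable as long as the interesting values are of an allowlisted type.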
Aside: PII is important to consider as we build out multi-tenant
services like Cody Gateway, and important if we want to consider using
nice third-party tools like Honeycomb/Sentry for Cloud instances in the
future.
Related: https://github.com/sourcegraph/sourcegraph/issues/51998
## Test plan
CI
2023-06-09 08:39:40 -07:00
Jean-Hadrien Chabran
3d36d34b3d
ci: re-enable race detection ( #52776 )
...
The previous approach to enable race detection was too radical and
accidentally led to building our binaries with the race flag enabled,
which caused issues when building images down the line.
This happened because putting a `test --something` in bazelrc also sets
it on `build` which is absolutely not what we wanted. Usually folks get
this one working by having a `--stamp` config setting that fixes this
when releasing binaries, which we don't at this stage, as we're still
learning Bazel.
Luckily, this was caught swiftly. The current approach instead is more
granular: it makes the `go_test` rule use our own variant, which injects
the `race = "on"` attribute, but only on `go_test`.
## Test plan
CI, being a main-dry-run, this will cover the container building jobs,
which were the ones failing.
---------
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
2023-06-05 20:41:47 +02:00
William Bezuidenhout
31e9d31220
bazel: add depguard as a nogo linter ( #50585 )
...
Add depguard as a nogo linter
## Test plan
* tested locally and made some fixes on the code it found
* green ci
2023-04-13 14:19:45 +02:00
Dave Try
2b8fa079f0
bazel: fix buf files ( #49444 )
...
fix protoc-gen-go version
2023-03-15 20:21:38 +00:00
Dave Try
293385d5dd
bazel: update timeouts to suppress warnings ( #49399 )
...
Updates all of the BUILD fields with timeouts to suppress warnings and
reduce log spam.
## Test plan
Green CI
2023-03-15 15:04:16 +02:00
Jean-Hadrien Chabran
bc5490c4bb
bazel: introduce build files for Go ( #46770 )
2023-01-23 14:00:01 +01:00
Camden Cheek
1d2ae644a7
Backend: replace uses of errors.Group with lib/group ( #42787 )
2022-10-11 10:31:22 -06:00
Keegan Carruthers-Smith
27569d1fc7
all: run gofmt -s -w from 1.19 ( #41629 )
...
gofmt in go1.19 does a lot of reformatting of godoc strings, mostly to
make them more consistent around lists.
Test Plan: CI
2022-09-13 07:44:06 +00:00
Idan Varsano
c407226c05
Repo Sync: If all errors are warnings, delete repos with bad permissions ( #40690 )
...
* Prevent stopping sync if querying for repos in a Bitbucket project key returns a 'fatal' error
2022-08-24 19:26:12 +00:00
Indradhanush Gupta
e2ae15ea28
lib/errors: Assert warning against nil error in tests ( #40631 )
2022-08-22 21:31:43 +05:30
Ryan Slade
e49cda4784
errors: Add package level IsWarning function ( #40656 )
...
This makes it easier to check if an error is a warning without
resorting to errors.As
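A minimal sketch of what such a package-level check could look like, using only the standard library. The `warning` interface, `warningError` type, and helper names are hypothetical and not taken from `lib/errors`.

```go
package main

import (
	"errors"
	"fmt"
)

// warning is a hypothetical interface marking an error as warning-level.
type warning interface {
	error
	IsWarning() bool
}

// warningError wraps an error and marks it as a warning.
type warningError struct{ err error }

func (w *warningError) Error() string   { return w.err.Error() }
func (w *warningError) Unwrap() error   { return w.err }
func (w *warningError) IsWarning() bool { return true }

// newWarning marks err as a warning.
func newWarning(err error) error { return &warningError{err: err} }

// isWarning hides the errors.As boilerplate from callers: it reports
// whether any error in err's chain is a warning.
func isWarning(err error) bool {
	var w warning
	return errors.As(err, &w) && w.IsWarning()
}

var (
	errBase    = errors.New("repo has bad permissions")
	errWrapped = fmt.Errorf("sync failed: %w", newWarning(errBase))
)

func main() {
	fmt.Println(isWarning(errWrapped)) // true: a warning is in the chain
	fmt.Println(isWarning(errBase))    // false: plain error
}
```

Call sites then read `if isWarning(err) { ... }` instead of declaring a target variable and calling `errors.As` themselves.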
2022-08-22 10:56:54 +02:00
Indradhanush Gupta
432313cfee
lib/errors: Simplify ClassifiedError type to Warning ( #38885 )
...
If we are trying to expose two "levels" of errors, one being warning
and another error, it is simpler to just expose a custom type Warning
to replicate the warning level of errors, and all other errors are
just errors.
Co-authored-by: Robert Lin <robert@bobheadxi.dev>
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
2022-08-03 22:40:28 +05:30
Robert Lin
d75a3733d4
lib/errors: add interfaces for typed errors ( #39256 )
...
Introducing custom error types can be tricky. I realized there are no code-level assertions or docs on `Is` and `As` implementations, so this PR introduces some interfaces to guide implementers. We write them ourselves because the standard library does not provide them, and doing so lets us attach better docstrings.
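As a sketch of the pattern (the interface and type names here are hypothetical, not the ones added in this PR), such interfaces can be paired with compile-time assertions so that a custom error type with a mistyped `Is` or `As` method fails to build:

```go
package main

import (
	"errors"
	"fmt"
)

// isser and asser are hypothetical names for interfaces that document the
// optional methods consulted by errors.Is and errors.As.
type isser interface {
	error
	// Is reports whether this error should be treated as equal to target.
	Is(target error) bool
}

type asser interface {
	error
	// As sets target and returns true if this error can be represented as
	// the type target points to.
	As(target any) bool
}

type statusError struct{ code int }

func (e *statusError) Error() string { return fmt.Sprintf("status %d", e.code) }

type notFoundError struct{ resource string }

func (e *notFoundError) Error() string { return e.resource + " not found" }

// Is matches any *notFoundError, regardless of which resource it names.
func (e *notFoundError) Is(target error) bool {
	_, ok := target.(*notFoundError)
	return ok
}

// As lets callers view a notFoundError as a *statusError with code 404.
func (e *notFoundError) As(target any) bool {
	if t, ok := target.(**statusError); ok {
		*t = &statusError{code: 404}
		return true
	}
	return false
}

// Compile-time assertions: the build fails if a signature drifts.
var (
	_ isser = (*notFoundError)(nil)
	_ asser = (*notFoundError)(nil)
)

var errLookup = fmt.Errorf("lookup: %w", &notFoundError{resource: "repo"})

func isNotFound(err error) bool { return errors.Is(err, &notFoundError{}) }

func statusCode(err error) (int, bool) {
	var se *statusError
	if errors.As(err, &se) {
		return se.code, true
	}
	return 0, false
}

func main() {
	fmt.Println(isNotFound(errLookup))
	fmt.Println(statusCode(errLookup))
}
```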
2022-07-25 19:59:14 -04:00
Robert Lin
2d98b0901f
trace, tracer, errors, observation: add devx to CODENOTIFY ( #38897 )
2022-07-18 18:19:13 +02:00
Indradhanush Gupta
089f53c5d8
lib/errors: Add new classifiedError type ( #38671 )
2022-07-13 16:22:28 +00:00
Keegan Carruthers-Smith
11a534cc78
lib: use any instead of interface{} ( #35121 )
...
Now that lib is on go1.18 we can use the type alias "any" instead of "interface{}".
Test Plan: cd lib && go test ./...
2022-05-09 14:55:38 +00:00
Camden Cheek
f2e2a244cd
ignore any context errors in code monitors ( #35064 )
2022-05-07 12:13:24 -04:00
Robert Lin
fe428b20e2
lib/errors: new MultiError error type and utilities ( #31466 )
...
Wholesale migration away from go-multierror into a custom multierror implementation that is fully compatible with cockroachdb/errors, prints all errors, can be introspected with Is, As, and friends, and more. The new MultiError type is only available as an interface.
Co-authored-by: Camden Cheek <camden@ccheek.com>
2022-02-18 11:07:02 -08:00
Camden Cheek
233a93abc6
Add helper predicates for HasType and Is ( #30975 )
...
This adds a couple of small helpers to make working with errors.Ignore a
little easier. Now, if you just want to ignore a certain error type,
instead of creating a function for it, you can just do something like
```
err = errors.Ignore(err, errors.IsPred(context.Canceled))
```
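For illustration, predicate constructors of this kind can be sketched with the standard library alone. The names below are hypothetical, and the generic `hasTypePred` is an assumption of this example rather than the signature added in this commit.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io/fs"
)

// errorPredicate is a hypothetical name for a function that decides
// whether an error matches some condition.
type errorPredicate func(error) bool

// isPred returns a predicate matching errors for which errors.Is(err,
// target) is true.
func isPred(target error) errorPredicate {
	return func(err error) bool { return errors.Is(err, target) }
}

// hasTypePred returns a predicate matching errors whose chain contains an
// error of type T.
func hasTypePred[T error]() errorPredicate {
	return func(err error) bool {
		var target T
		return errors.As(err, &target)
	}
}

var (
	isCanceled  = isPred(context.Canceled)
	isPathError = hasTypePred[*fs.PathError]()

	errFetch = fmt.Errorf("fetch: %w", context.Canceled)
	errOpen  = fmt.Errorf("read config: %w",
		&fs.PathError{Op: "open", Path: "/tmp/config", Err: fs.ErrNotExist})
)

func main() {
	fmt.Println(isCanceled(errFetch), isCanceled(errOpen))
	fmt.Println(isPathError(errFetch), isPathError(errOpen))
}
```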
2022-02-10 16:47:46 +00:00
Camden Cheek
f5a2e023cb
Add errors.Ignore() ( #30930 )
...
This adds the new helper function `Ignore` to our `errors` package. This
allows us to ignore errors based on a given predicate function in a way
that takes into account aggregated errors in MultiError. It recursively
unwraps to multierrors and filters out all child errors that match the
predicate.
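The filtering described above can be sketched as follows. This is a simplified stand-in, not the `lib/errors` code: `multiError`, `ignore`, and the sentinel errors are hypothetical, and this version returns a fresh aggregate rather than preserving any wrapper around it.

```go
package main

import (
	"errors"
	"fmt"
)

// multiError is a hypothetical minimal aggregate of several errors.
type multiError struct{ errs []error }

func (m *multiError) Error() string {
	msg := fmt.Sprintf("%d errors occurred:", len(m.errs))
	for _, e := range m.errs {
		msg += "\n\t* " + e.Error()
	}
	return msg
}

// ignore returns nil if err matches pred. If err is (or wraps) a
// multiError, each child is filtered recursively and only the children
// that survive are kept; nil is returned when none survive.
func ignore(err error, pred func(error) bool) error {
	if err == nil {
		return nil
	}
	var multi *multiError
	if errors.As(err, &multi) {
		var kept []error
		for _, child := range multi.errs {
			if c := ignore(child, pred); c != nil {
				kept = append(kept, c)
			}
		}
		if len(kept) == 0 {
			return nil
		}
		return &multiError{errs: kept}
	}
	if pred(err) {
		return nil
	}
	return err
}

var (
	errCanceled = errors.New("context canceled")
	errDisk     = errors.New("disk full")
)

// isCanceled is the predicate used in the example below.
func isCanceled(err error) bool { return errors.Is(err, errCanceled) }

func main() {
	mixed := &multiError{errs: []error{errCanceled, errDisk}}
	fmt.Println(ignore(mixed, isCanceled))
	fmt.Println(ignore(errCanceled, isCanceled) == nil)
}
```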
2022-02-09 16:15:45 -07:00
Eric Fritz
7148009913
errors: Introduce internal package ( #30558 )
2022-02-07 15:03:45 +00:00