sourcegraph/lib/errors
Robert Lin 294d4b47f8
telemetry-gateway: wrap publish errors in error details for Sentry (#61895)
We aggregate all publish errors into a single log entry so that Sentry gets an accurate representation of each issue without us generating thousands of log entries at once. Because this means we only get higher-level logger context, we must annotate the errors directly with some hidden details, which preserves Sentry grouping while adding context for diagnostics.

We do this with the `WithSafeDetails` helper from `cockroachdb/errors`, which annotates an error with details that are _not_ rendered by `err.Error()` but _are_ included in Sentry reports (via the `Message` field). Currently, we annotate the event ID, feature, action, and source (see the example output below).

Context on this request: https://sourcegraph.slack.com/archives/C06CCJR4K9R/p1713054354443579

## Test plan

_Super_ jank unit tests that try to emulate how we build Sentry reports, relying on knowledge of how our Sentry report building works internally. I opened https://github.com/sourcegraph/log/issues/65 in case there are new use cases for this in the future. Running the test with `-v` demonstrates the output:

```
$ go test -timeout 30s -run ^TestSummarizeFailedEvents$ github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server -v
=== RUN   TestSummarizeFailedEvents
=== RUN   TestSummarizeFailedEvents/all_failed
=== RUN   TestSummarizeFailedEvents/all_failed/Sentry_report
    publish_events_test.go:83: Sentry Error message for field "error.0":
        
        publish_events_test.go:38: event publish failed
        (1) feature:"feature_0" action:"action_0" id:"id_0" server:{version:"TestSummarizeFailedEvents/all_failed"}  client:{name:"test_client"}
        Wraps: (2) attached stack trace
          -- stack trace:
          | github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
          |     /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
          | testing.tRunner
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
          | runtime.goexit
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
        Wraps: (3) event publish failed
        Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
        -- report composition:
        *errutil.leafError: event publish failed
        publish_events_test.go:38: *withstack.withStack (top exception)
        *safedetails.withSafeDetails: feature:"feature_0" action:"action_0" id:"id_0" server:{version:"TestSummarizeFailedEvents/all_failed"}  client:{name:"test_client"}
        
    publish_events_test.go:83: Sentry Error message for field "error.1":
        
        publish_events_test.go:38: event publish failed
        (1) feature:"feature_1" action:"action_1" id:"id_1" server:{version:"TestSummarizeFailedEvents/all_failed"}
        Wraps: (2) attached stack trace
          -- stack trace:
          | github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
          |     /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
          | testing.tRunner
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
          | runtime.goexit
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
        Wraps: (3) event publish failed
        Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
        -- report composition:
        *errutil.leafError: event publish failed
        publish_events_test.go:38: *withstack.withStack (top exception)
        *safedetails.withSafeDetails: feature:"feature_1" action:"action_1" id:"id_1" server:{version:"TestSummarizeFailedEvents/all_failed"}
        
    publish_events_test.go:83: Sentry Error message for field "error.2":
        
        publish_events_test.go:38: event publish failed
        (1) feature:"feature_2" action:"action_2" id:"id_2" server:{version:"TestSummarizeFailedEvents/all_failed"}  client:{name:"test_client"}
        Wraps: (2) attached stack trace
          -- stack trace:
          | github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
          |     /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
          | testing.tRunner
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
          | runtime.goexit
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
        Wraps: (3) event publish failed
        Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
        -- report composition:
        *errutil.leafError: event publish failed
        publish_events_test.go:38: *withstack.withStack (top exception)
        *safedetails.withSafeDetails: feature:"feature_2" action:"action_2" id:"id_2" server:{version:"TestSummarizeFailedEvents/all_failed"}  client:{name:"test_client"}
        
    publish_events_test.go:83: Sentry Error message for field "error.3":
        
        publish_events_test.go:38: event publish failed
        (1) feature:"feature_3" action:"action_3" id:"id_3" server:{version:"TestSummarizeFailedEvents/all_failed"}
        Wraps: (2) attached stack trace
          -- stack trace:
          | github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
          |     /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
          | testing.tRunner
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
          | runtime.goexit
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
        Wraps: (3) event publish failed
        Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
        -- report composition:
        *errutil.leafError: event publish failed
        publish_events_test.go:38: *withstack.withStack (top exception)
        *safedetails.withSafeDetails: feature:"feature_3" action:"action_3" id:"id_3" server:{version:"TestSummarizeFailedEvents/all_failed"}
        
    publish_events_test.go:83: Sentry Error message for field "error.4":
        
        publish_events_test.go:38: event publish failed
        (1) feature:"feature_4" action:"action_4" id:"id_4" server:{version:"TestSummarizeFailedEvents/all_failed"}  client:{name:"test_client"}
        Wraps: (2) attached stack trace
          -- stack trace:
          | github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server.TestSummarizeFailedEvents.func1
          |     /Users/robert@sourcegraph.com/Projects/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server/publish_events_test.go:38
          | testing.tRunner
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/testing/testing.go:1689
          | runtime.goexit
          |     /Users/robert@sourcegraph.com/.asdf/installs/golang/1.22.1/go/src/runtime/asm_arm64.s:1222
        Wraps: (3) event publish failed
        Error types: (1) *safedetails.withSafeDetails (2) *withstack.withStack (3) *errutil.leafError
        -- report composition:
        *errutil.leafError: event publish failed
        publish_events_test.go:38: *withstack.withStack (top exception)
        *safedetails.withSafeDetails: feature:"feature_4" action:"action_4" id:"id_4" server:{version:"TestSummarizeFailedEvents/all_failed"}  client:{name:"test_client"}
        
=== RUN   TestSummarizeFailedEvents/some_failed
=== RUN   TestSummarizeFailedEvents/all_succeeded
=== RUN   TestSummarizeFailedEvents/all_succeeded_(large_set)
--- PASS: TestSummarizeFailedEvents (0.23s)
    --- PASS: TestSummarizeFailedEvents/all_failed (0.22s)
        --- PASS: TestSummarizeFailedEvents/all_failed/Sentry_report (0.00s)
    --- PASS: TestSummarizeFailedEvents/some_failed (0.00s)
    --- PASS: TestSummarizeFailedEvents/all_succeeded (0.00s)
    --- PASS: TestSummarizeFailedEvents/all_succeeded_(large_set) (0.00s)
PASS
ok      github.com/sourcegraph/sourcegraph/cmd/telemetry-gateway/internal/server        (cached)
```
2024-04-15 23:19:29 +00:00

| File | Last commit | Date |
| --- | --- | --- |
| BUILD.bazel | migrator: exit 0 if autoupgrade set on up (#53175) | 2023-06-13 20:51:09 +00:00 |
| cockroach.go | telemetry-gateway: wrap publish errors in error details for Sentry (#61895) | 2024-04-15 23:19:29 +00:00 |
| CODENOTIFY | chore: links/ownership devx->dev-infra (#58999) | 2023-12-14 15:07:20 +00:00 |
| errors_test.go | lib/errors: document redaction, upgrade error deps (#53109) | 2023-06-09 08:39:40 -07:00 |
| errors.go | lib/errors: add interfaces for typed errors (#39256) | 2022-07-25 19:59:14 -04:00 |
| filter_test.go | Add helper predicates for HasType and Is (#30975) | 2022-02-10 16:47:46 +00:00 |
| filter.go | ignore any context errors in code monitors (#35064) | 2022-05-07 12:13:24 -04:00 |
| multi_error.go | lib/errors: add interfaces for typed errors (#39256) | 2022-07-25 19:59:14 -04:00 |
| postgres.go | migrator: exit 0 if autoupgrade set on up (#53175) | 2023-06-13 20:51:09 +00:00 |
| warning_test.go | lib/errors: Assert warning against nil error in tests (#40631) | 2022-08-22 21:31:43 +05:30 |
| warning.go | all: run gofmt -s -w from 1.19 (#41629) | 2022-09-13 07:44:06 +00:00 |