mirror of
https://github.com/sourcegraph/sourcegraph.git
synced 2026-02-06 18:11:48 +00:00
This change updates our internal tracer to enforce policy only via a `Sampler` implementation. This has the following benefits:

1. Even when a trace should not be sampled, contexts are still populated with valid spans, rather than no-op ones. This is important to make use of trace IDs for non-tracing purposes, e.g. https://github.com/sourcegraph/sourcegraph/pull/57774 and https://github.com/sourcegraph/sourcegraph/pull/58060
2. We enforce trace policies the way they were meant to be enforced in OpenTelemetry: by simply indicating that the span should not be exported.

This was not possible before because OpenTracing did not use context propagation, hence we did not have a way to use trace policy flags set in context - but thanks to @camdencheek's work removing OpenTracing entirely, we can now do this in a more idiomatic fashion.

Thanks to this, I've removed a few places that prevented trace context from being populated based on trace policy (HTTP and GraphQL middleware, and `internal/trace`). This delegates sampling decisions to the sampler, and ensures we accept valid trace context everywhere.

## Test plan

Unit tests on a TracerProvider configured with the new sampler.

Manual testing:

```
sg run jaeger otel-collector
sg start
```

Setting `observability.tracing.debug` to `true` we can see logs indicating the desired traits for non-`ShouldTrace` traces:

```
[ worker] INFO tracer tracer/logged_otel.go:63 Start {"spanName": "workerutil.dbworker.store.insights_query_runner_jobs_store.dequeue", "isRecording": false, "isSampled": false, "isValid": true}
```

With `observability.tracing.sampling` set to `none`, running a search with `&trace=1` only gives us spans from zoekt, which seems to have always been outside our rules here.

With `observability.tracing.sampling` set to `selective`, running a search with `&trace=1` gives us a full trace.

With `observability.tracing.sampling` set to `all`, Jaeger instantly gets loads of traces, and in logs we see:

```
[ worker] INFO tracer tracer/logged_otel.go:63 Start {"spanName": "workerutil.dbworker.store.exhaustive_search_worker_store.dequeue", "isRecording": true, "isSampled": true, "isValid": true}
```

---------

Co-authored-by: William Bezuidenhout <william.bezuidenhout@sourcegraph.com>
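The sampling rules described above can be sketched as a small, stdlib-only model. This is not the code from this change: `Policy`, `Decision`, `SpanTraits`, `decide`, and `traitsFor` are hypothetical stand-ins for the `policy` package and the OpenTelemetry SDK types. The model assumes, based on the debug logs shown in the test plan, that a dropped span still carries a valid span context and is only marked as not recording and not sampled.

```go
package main

import "fmt"

// Policy is a hypothetical stand-in for the global trace policy
// (the values mirror the `observability.tracing.sampling` settings).
type Policy string

const (
	TraceNone      Policy = "none"
	TraceSelective Policy = "selective"
	TraceAll       Policy = "all"
)

// Decision is a hypothetical stand-in for a sampler's result.
type Decision int

const (
	Drop Decision = iota
	RecordAndSample
)

// SpanTraits mirrors the three fields printed in the debug logs.
type SpanTraits struct {
	IsRecording bool
	IsSampled   bool
	IsValid     bool
}

// decide models the policy check: "all" always samples, "none" never
// samples, and anything else is treated as "selective", which samples
// only when the parent context asked for a trace.
func decide(global Policy, parentWantsTrace bool) Decision {
	switch global {
	case TraceAll:
		return RecordAndSample
	case TraceNone:
		return Drop
	default:
		if parentWantsTrace {
			return RecordAndSample
		}
	}
	return Drop
}

// traitsFor models the assumption that a span is always valid, and that
// the decision only controls whether it is recorded and exported.
func traitsFor(d Decision) SpanTraits {
	sampled := d == RecordAndSample
	return SpanTraits{IsRecording: sampled, IsSampled: sampled, IsValid: true}
}

func main() {
	for _, p := range []Policy{TraceNone, TraceSelective, TraceAll} {
		for _, parent := range []bool{false, true} {
			fmt.Printf("policy=%s parentWantsTrace=%t traits=%+v\n",
				p, parent, traitsFor(decide(p, parent)))
		}
	}
}
```

In this model the `selective` policy is the only one where the parent context matters, which corresponds to the `&trace=1` behaviour in the manual tests above.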
47 lines
1.4 KiB
Go
package tracer

import (
	oteltracesdk "go.opentelemetry.io/otel/sdk/trace"

	"github.com/sourcegraph/sourcegraph/internal/trace/policy"
)

var (
	// Use upstream samplers to ensure we return the right thing in our
	// custom Sampler implementation.
	alwaysSampleSampler = oteltracesdk.AlwaysSample()
	neverSampleSampler  = oteltracesdk.NeverSample()
)

// tracePolicySampler implements the oteltracesdk.Sampler interface and
// indicates whether a trace should be sampled or not based on the global trace
// policy and comparing it against the policy indicated in the parent context
// where relevant.
type tracePolicySampler struct{}

var _ oteltracesdk.Sampler = tracePolicySampler{}

func (tracePolicySampler) ShouldSample(p oteltracesdk.SamplingParameters) oteltracesdk.SamplingResult {
	switch policy.GetTracePolicy() {
	case policy.TraceAll:
		// Retain and export all events.
		return alwaysSampleSampler.ShouldSample(p)

	case policy.TraceNone:
		// Drop all events.
		return neverSampleSampler.ShouldSample(p)

	default:
		// By default, enforce policy.TraceSelective, which means that we only
		// sample if the parent context is marked for tracing.
		if policy.ShouldTrace(p.ParentContext) {
			return alwaysSampleSampler.ShouldSample(p)
		}
	}

	// Otherwise, indicate this span should be dropped and not exported.
	return neverSampleSampler.ShouldSample(p)
}

func (tracePolicySampler) Description() string { return "internal/tracer.tracePolicySampler" }