From 382be7fbf9f728cc1426bd74b44859c4eb2e0623 Mon Sep 17 00:00:00 2001 From: Robert Lin Date: Wed, 7 Sep 2022 10:29:52 -0700 Subject: [PATCH] tracing: enable opentelemetry by default, remove legacy values (#41242) Now that all our core deployment methods bundle OpenTelemetry Collector by default (#40456, #40457, #40455), we can now switch over to enabling OpenTelemetry by default. This change also: - removes the default `traceURL` template, since the Jaeger instance it points to by default no longer exists on most deployments. - fixes the behaviour of `observability.tracing` config, and adds testing - refactors some things in `internal/trace` to be OpenTelemetry-first Closes #39399 --- CHANGELOG.md | 3 + doc/admin/observability/opentelemetry.md | 2 +- doc/admin/observability/tracing.md | 48 +++++----- internal/trace/url.go | 3 +- internal/tracer/switchable_ot.go | 42 ++++----- internal/tracer/switchable_otel.go | 32 ++++--- internal/tracer/tracer.go | 106 +++++------------------ internal/tracer/watch.go | 93 ++++++++++++++++++++ internal/tracer/watch_test.go | 90 +++++++++++++++++++ schema/schema.go | 8 +- schema/site.schema.json | 26 +++--- 11 files changed, 297 insertions(+), 156 deletions(-) create mode 100644 internal/tracer/watch.go create mode 100644 internal/tracer/watch_test.go diff --git a/CHANGELOG.md b/CHANGELOG.md index 9bc8b8de436..fa29561404a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,8 @@ All notable changes to Sourcegraph are documented in this file. - [Sourcegraph with Kubernetes (without Helm)](https://docs.sourcegraph.com/admin/deploy/kubernetes): The `jaeger-agent` sidecar has been replaced by an [OpenTelemetry Collector](https://docs.sourcegraph.com/admin/observability/opentelemetry) DaemonSet + Deployment configuration. The bundled Jaeger instance is now disabled by default, instead of enabled. [#40456](https://github.com/sourcegraph/sourcegraph/issues/40456) - [Sourcegraph with Docker Compose](https://docs.sourcegraph.com/admin/deploy/docker-compose): The `jaeger` service has been replaced by an [OpenTelemetry Collector](https://docs.sourcegraph.com/admin/observability/opentelemetry) service. The bundled Jaeger instance is now disabled by default, instead of enabled. [#40455](https://github.com/sourcegraph/sourcegraph/issues/40455) +- `"observability.tracing": { "type": "opentelemetry" }` is now the default tracer type. To revert to existing behaviour, set `"type": "jaeger"` instead. The legacy values `"type": "opentracing"` and `"type": "datadog"` have been removed. [#41242](https://github.com/sourcegraph/sourcegraph/pull/41242) +- `"observability.tracing": { "urlTemplate": "" }` is now the default, and if `"urlTemplate"` is left empty, no trace URLs are generated. To revert to existing behaviour, set `"urlTemplate": "{{ .ExternalURL }}/-/debug/jaeger/trace/{{ .TraceID }}"` instead. [#41242](https://github.com/sourcegraph/sourcegraph/pull/41242) ### Fixed @@ -33,6 +35,7 @@ All notable changes to Sourcegraph are documented in this file. - The recommended [src-cli](https://github.com/sourcegraph/src-cli) version is now reported consistently. [#39468](https://github.com/sourcegraph/sourcegraph/issues/39468) - A performance issue affecting structural search causing results to not stream. It is much faster now. [#40872](https://github.com/sourcegraph/sourcegraph/pull/40872) - An issue where the saved search input box reports an invalid pattern type for `standard`, which is now valid. [#41068](https://github.com/sourcegraph/sourcegraph/pull/41068) +- Fixed a bug where setting `"observability.tracing": {}` would disable tracing, when the intended behaviour is to default to tracing with `"sampling": "selective"` enabled by default. [#41242](https://github.com/sourcegraph/sourcegraph/pull/41242) - The performance, stability, and latency of search predicates like `repo:has.file()`, `repo:has.content()`, and `file:has.content()` have been dramatically improved. [#418](https://github.com/sourcegraph/zoekt/pull/418), [#40239](https://github.com/sourcegraph/sourcegraph/pull/40239), [#38988](https://github.com/sourcegraph/sourcegraph/pull/38988), [#39501](https://github.com/sourcegraph/sourcegraph/pull/39501) ### Removed diff --git a/doc/admin/observability/opentelemetry.md b/doc/admin/observability/opentelemetry.md index d7691616da7..f4bd56e5f07 100644 --- a/doc/admin/observability/opentelemetry.md +++ b/doc/admin/observability/opentelemetry.md @@ -1,6 +1,6 @@ # OpenTelemetry -Experimental Sourcegraph 4.0+ +Sourcegraph 4.0+ > WARNING: Sourcegraph is actively working on implementing [OpenTelemetry](https://opentelemetry.io/) for all observability data. The first - and currently only - [signal](https://opentelemetry.io/docs/concepts/signals/) to be fully integrated is [tracing](./tracing.md). diff --git a/doc/admin/observability/tracing.md b/doc/admin/observability/tracing.md index 2e74811f896..797c8f60933 100644 --- a/doc/admin/observability/tracing.md +++ b/doc/admin/observability/tracing.md @@ -13,8 +13,8 @@ Note that the policies above are implemented at an application level - to sample We support the following tracing backend types: -* [`"type": "jaeger"`](#jaeger) (default) -* [`"type": "opentelemetry"`](#opentelemetry) Experimental +* [`"type": "opentelemetry"`](#opentelemetry) (default) +* [`"type": "jaeger"`](#jaeger) In addition, we also export some tracing [via net/trace](#nettrace). @@ -33,19 +33,32 @@ The response headers of the response will now include an `x-trace` entry, which ## Tracing backends Tracing backends can be configured for Sourcegraph to export traces to. +We support exporting traces via [OpenTelemetry](#opentelemetry) (recommended), or directly to [Jaeger](#jaeger). + +### OpenTelemetry + +To learn about exporting traces to various backends using OpenTelemetry, review our [OpenTelemetry documentation](./opentelemetry.md). +Once configured, you can set up a `urlTemplate` that points to your traces backend. +For example, if you [export your traces to Honeycomb](./opentelemetry.md#otlp-compatible-backends), your configuration might look like: + +```json +{ + "observability.tracing": { + "type": "opentelemetry", + "urlTemplate": "https://ui.honeycomb.io/$ORG/environments/$DATASET/trace?trace_id={{ .TraceID }}" + } +} +``` + +You can test the exporter by [tracing a search query](#trace-a-search-query). ### Jaeger To configure Jaeger, first ensure Jeager is running: -* **Single Docker container:** Jaeger will be integrated into the Sourcegraph single Docker container starting in 3.16. -* **Docker Compose:** Jaeger is deployed if you use the provided `docker-compose.yaml`. Access it at - port 16686 on the Sourcegraph node. One way to do this is to add an Ingress rule exposing port - 16686 to public Internet traffic from your IP, then navigate to `http://${NODE_IP}:16686` in your - browser. You must also [enable tracing](../deploy/docker-compose/index.md#enable-tracing). -* **Kubernetes:** Jaeger is already deployed, unless you explicitly removed it from the Sourcegraph - manifest. Jaeger can be accessed from the admin UI under Maintenance/Tracing. Or by running `kubectl port-forward svc/jaeger-query 16686` and going to - `http://localhost:16686` in your browser. +* **Single Docker container:** Deploy a separate Jaeger instance and configure it with [Jaeger client environment variables](https://github.com/jaegertracing/jaeger-client-go#environment-variables). +* **Docker Compose:** See the relevant [enable the bundled Jaeger deployment guide](../deploy/docker-compose/operations.md#enable-the-bundled-jaeger-deployment) +* **Kubernetes:** See the relevant [enable the bundled Jaeger deployment guide](../deploy/kubernetes/operations.md#enable-the-bundled-jaeger-deployment) The Jaeger UI should look something like this: @@ -56,7 +69,8 @@ Then, configure Jaeger as your tracing backend in site configuration: ```json { "observability.tracing": { - "type": "jaeger" + "type": "jaeger", + "urlTemplate": "{{ .ExternalURL }}/-/debug/jaeger/trace/{{ .TraceID }}" } } ``` @@ -83,18 +97,12 @@ algorithm to root-cause issues with Jaeger: 1. Report this information to Sourcegraph by screenshotting the relevant trace or by downloading the trace JSON. - -### OpenTelemetry - -Experimental - -To learn about configuring Sourcegraph to make use of OpenTelemetry tracing, review our [OpenTelemetry documentation](./opentelemetry.md). - ### net/trace Sourcegraph uses the [`net/trace`](https://pkg.go.dev/golang.org/x/net/trace) package in its backend -services. This provides simple tracing information within a single process. It can be used as an -alternative when Jaeger is not available or as a supplement to Jaeger. +services, in addition to the other tracing mechanisms listed above. +This provides simple tracing information within a single process. +It can be used as an alternative when Jaeger is not available or as a supplement to Jaeger. Site admins can access `net/trace` information at https://sourcegraph.example.com/-/debug/. From there, click **Requests** to view the traces for that service. diff --git a/internal/trace/url.go b/internal/trace/url.go index 99bd52cd078..c89e202afcd 100644 --- a/internal/trace/url.go +++ b/internal/trace/url.go @@ -15,8 +15,7 @@ func URL(traceID string, querier conftypes.SiteConfigQuerier) string { c := querier.SiteConfig() tracing := c.ObservabilityTracing if tracing == nil || tracing.UrlTemplate == "" { - // Default template - return strings.TrimSuffix(c.ExternalURL, "/") + "/-/debug/jaeger/trace/" + traceID + return "" } tpl, err := template.New("traceURL").Parse(tracing.UrlTemplate) diff --git a/internal/tracer/switchable_ot.go b/internal/tracer/switchable_ot.go index 8e81d502761..c4ebc6d436f 100644 --- a/internal/tracer/switchable_ot.go +++ b/internal/tracer/switchable_ot.go @@ -2,82 +2,74 @@ package tracer import ( "fmt" - "io" "sync" "github.com/opentracing/opentracing-go" "github.com/sourcegraph/log" ) -// switchableTracer implements opentracing.Tracer, and is used to configure the global +// switchableOTTracer implements opentracing.Tracer, and is used to configure the global // tracer implementations. It is set as a global tracer so that all opentracing usages // will end up using this tracer. // // The underlying opentracer used is switchable (set via the `set` method), so as to // support live configuration. -type switchableTracer struct { - mu sync.RWMutex - tracer opentracing.Tracer - tracerCloser io.Closer +type switchableOTTracer struct { + mu sync.RWMutex + tracer opentracing.Tracer - log bool + debug bool logger log.Logger } -var _ opentracing.Tracer = &switchableTracer{} +var _ opentracing.Tracer = &switchableOTTracer{} -func newSwitchableOTTracer(logger log.Logger) *switchableTracer { +func newSwitchableOTTracer(logger log.Logger) *switchableOTTracer { var t opentracing.NoopTracer - return &switchableTracer{ + return &switchableOTTracer{ tracer: t, logger: logger.With(log.String("tracer", fmt.Sprintf("%T", t))).AddCallerSkip(1), } } -func (t *switchableTracer) StartSpan(operationName string, opts ...opentracing.StartSpanOption) opentracing.Span { +func (t *switchableOTTracer) StartSpan(operationName string, opts ...opentracing.StartSpanOption) opentracing.Span { t.mu.RLock() defer t.mu.RUnlock() - if t.log { + if t.debug { t.logger.Info("StartSpan", log.String("operationName", operationName)) } return t.tracer.StartSpan(operationName, opts...) } -func (t *switchableTracer) Inject(sm opentracing.SpanContext, format any, carrier any) error { +func (t *switchableOTTracer) Inject(sm opentracing.SpanContext, format any, carrier any) error { t.mu.RLock() defer t.mu.RUnlock() - if t.log { + if t.debug { t.logger.Info("Inject") } return t.tracer.Inject(sm, format, carrier) } -func (t *switchableTracer) Extract(format any, carrier any) (opentracing.SpanContext, error) { +func (t *switchableOTTracer) Extract(format any, carrier any) (opentracing.SpanContext, error) { t.mu.RLock() defer t.mu.RUnlock() - if t.log { + if t.debug { t.logger.Info("Extract") } return t.tracer.Extract(format, carrier) } -func (t *switchableTracer) set( +func (t *switchableOTTracer) set( logger log.Logger, tracer opentracing.Tracer, - tracerCloser io.Closer, - shouldLog bool, + debug bool, ) { t.mu.Lock() defer t.mu.Unlock() - if tc := t.tracerCloser; tc != nil { - // Close the old tracerCloser outside the critical zone - go tc.Close() - } - t.tracerCloser = tracerCloser t.tracer = tracer - t.log = shouldLog + t.debug = debug t.logger = logger.With(log.String("tracer", fmt.Sprintf("%T", tracer))).AddCallerSkip(1) t.logger.Info("tracer set") diff --git a/internal/tracer/switchable_otel.go b/internal/tracer/switchable_otel.go index 9a72bd6fbf9..2abf1fc1ec9 100644 --- a/internal/tracer/switchable_otel.go +++ b/internal/tracer/switchable_otel.go @@ -2,6 +2,7 @@ package tracer import ( "fmt" + "io" "sync/atomic" "github.com/sourcegraph/log" @@ -13,17 +14,18 @@ import ( // tracer implementations. It is set as a global tracer so that all opentracing usages // will end up using this tracer. // -// The underlying opentracer used is switchable (set via the `set` method), so as to +// The underlying tracer provider used is switchable (set via the `set` method), so as to // support live configuration. type switchableOtelTracerProvider struct { - logger log.Logger - noopProvider oteltrace.TracerProvider + logger log.Logger - v *atomic.Value + // current caries the *otelTracerProvider currently associated with this provider. + current *atomic.Value } -type otelTracerProvider struct { +type otelTracerProviderCarrier struct { provider oteltrace.TracerProvider + closer io.Closer debug bool } @@ -31,15 +33,15 @@ var _ oteltrace.TracerProvider = &switchableOtelTracerProvider{} func newSwitchableOtelTracerProvider(logger log.Logger) *switchableOtelTracerProvider { var v atomic.Value - v.Store(&otelTracerProvider{ + v.Store(&otelTracerProviderCarrier{ provider: oteltrace.NewNoopTracerProvider(), debug: false, }) - return &switchableOtelTracerProvider{logger: logger, v: &v} + return &switchableOtelTracerProvider{logger: logger, current: &v} } func (s *switchableOtelTracerProvider) Tracer(instrumentationName string, opts ...oteltrace.TracerOption) oteltrace.Tracer { - val := s.v.Load().(*otelTracerProvider) // must be initialized + val := s.current.Load().(*otelTracerProviderCarrier) // must be initialized if val.debug { s.logger.Info("Tracer", log.String("provider", fmt.Sprintf("%T", val.provider))) @@ -47,16 +49,26 @@ func (s *switchableOtelTracerProvider) Tracer(instrumentationName string, opts . return val.provider.Tracer(instrumentationName, opts...) } -func (s *switchableOtelTracerProvider) set(provider oteltrace.TracerProvider, debug bool) { +func (s *switchableOtelTracerProvider) set(provider oteltrace.TracerProvider, closer io.Closer, debug bool) { if debug { s.logger.Info("set", log.String("provider", fmt.Sprintf("%T", provider))) } + + // Shut down previous provider + if previous := s.current.Load().(*otelTracerProviderCarrier); previous.closer != nil { + go previous.closer.Close() // non-blocking + } + + // Ensure we default to a valid tracer if provider == nil { provider = oteltrace.NewNoopTracerProvider() } - s.v.Store(&otelTracerProvider{ + + // Update the value + s.current.Store(&otelTracerProviderCarrier{ provider: provider, + closer: closer, debug: debug, }) } diff --git a/internal/tracer/tracer.go b/internal/tracer/tracer.go index 72ecbf8c178..175f3e58687 100644 --- a/internal/tracer/tracer.go +++ b/internal/tracer/tracer.go @@ -14,11 +14,12 @@ import ( oteltrace "go.opentelemetry.io/otel/trace" "go.uber.org/automaxprocs/maxprocs" + "github.com/sourcegraph/sourcegraph/lib/errors" + "github.com/sourcegraph/sourcegraph/internal/conf" "github.com/sourcegraph/sourcegraph/internal/conf/conftypes" "github.com/sourcegraph/sourcegraph/internal/env" "github.com/sourcegraph/sourcegraph/internal/hostname" - "github.com/sourcegraph/sourcegraph/internal/trace/policy" "github.com/sourcegraph/sourcegraph/internal/version" "github.com/sourcegraph/sourcegraph/internal/tracer/internal/exporters" @@ -38,22 +39,22 @@ type TracerType string const ( None TracerType = "none" - // Jaeger and openTracing should be treated as analagous - the 'opentracing' moniker - // is for backwards compatibility only, 'jaeger' is more correct because we export - // Jaeger traces in 'opentracing' mode because 'opentracing' itself is an implementation - // detail, it does not have a wire protocol. - Jaeger TracerType = "jaeger" - openTracing TracerType = "opentracing" + // Jaeger exports traces over the Jaeger thrift protocol. + Jaeger TracerType = "jaeger" // OpenTelemetry exports traces over OTLP. OpenTelemetry TracerType = "opentelemetry" ) +// DefaultTracerType is the default tracer type if not explicitly set by the user and +// some trace policy is enabled. +const DefaultTracerType = OpenTelemetry + // isSetByUser returns true if the TracerType is one supported by the schema // should be kept in sync with ObservabilityTracing.Type in schema/site.schema.json func (t TracerType) isSetByUser() bool { switch t { - case openTracing, Jaeger, OpenTelemetry: + case Jaeger, OpenTelemetry: return true } return false @@ -88,87 +89,21 @@ func Init(logger log.Logger, c conftypes.WatchableSiteConfig) { // initTracer is a helper that should be called exactly once (from Init). func initTracer(logger log.Logger, opts *options, c conftypes.WatchableSiteConfig) { - // Initialize global, hot-swappable implementations of OpenTracing and OpenTelemetry + // Initialize global, hot-swappable implementations of OpenTelemetry and OpenTracing // tracing. - globalOTTracer := newSwitchableOTTracer(logger.Scoped("ot.global", "the global OpenTracing tracer")) - opentracing.SetGlobalTracer(globalOTTracer) globalOTelTracerProvider := newSwitchableOtelTracerProvider(logger.Scoped("otel.global", "the global OpenTelemetry tracer")) otel.SetTracerProvider(globalOTelTracerProvider) + globalOTTracer := newSwitchableOTTracer(logger.Scoped("ot.global", "the global OpenTracing tracer")) + opentracing.SetGlobalTracer(globalOTTracer) // Initially everything is disabled since we haven't read conf yet. This variable is // also updated to compare against new version of configuration. - oldOpts := options{ - resource: opts.resource, - // the values below may change + go c.Watch(newConfWatcher(logger, c, globalOTelTracerProvider, globalOTTracer, options{ + resource: opts.resource, TracerType: None, debug: false, externalURL: "", - } - - // Watch loop - go c.Watch(func() { - var ( - siteConfig = c.SiteConfig() - debug = false - setTracer = None - ) - - if tracingConfig := siteConfig.ObservabilityTracing; tracingConfig != nil { - debug = tracingConfig.Debug - - // If sampling policy is set, update the strategy and set our tracer to be - // Jaeger by default. - previousPolicy := policy.GetTracePolicy() - switch p := policy.TracePolicy(tracingConfig.Sampling); p { - case policy.TraceAll, policy.TraceSelective: - policy.SetTracePolicy(p) - // enable the defualt tracer type. TODO in 4.0, this should be OpenTelemetry - setTracer = Jaeger - default: - policy.SetTracePolicy(policy.TraceNone) - } - if newPolicy := policy.GetTracePolicy(); newPolicy != previousPolicy { - logger.Info("updating TracePolicy", - log.String("oldValue", string(previousPolicy)), - log.String("newValue", string(newPolicy))) - } - - // If the tracer type is configured, also set the tracer type - if t := TracerType(tracingConfig.Type); t.isSetByUser() { - setTracer = t - } - } - - opts := options{ - TracerType: setTracer, - externalURL: siteConfig.ExternalURL, - debug: debug, - // Stays the same - resource: oldOpts.resource, - } - if opts == oldOpts { - // Nothing changed - return - } - - // update old opts for comparison - oldOpts = opts - - // create new tracer providers - tracerLogger := logger.With( - log.String("tracerType", string(opts.TracerType)), - log.Bool("debug", opts.debug)) - otImpl, otelImpl, closer, err := newTracer(tracerLogger, &opts) - if err != nil { - tracerLogger.Warn("failed to initialize tracer", log.Error(err)) - return - } - - // update global tracers. for now, we let the OT tracer handle shutdown when - // switching (we always switch in tandem, so this is fine) - globalOTTracer.set(tracerLogger, otImpl, closer, opts.debug) - globalOTelTracerProvider.set(otelImpl, opts.debug) - }) + })) // Contribute validation for tracing package conf.ContributeWarning(func(c conftypes.SiteConfigQuerier) conf.Problems { @@ -183,17 +118,22 @@ func initTracer(logger log.Logger, opts *options, c conftypes.WatchableSiteConfi }) } -// newTracer creates OpenTelemetry and OpenTracing tracers based on opts +// newTracer creates OpenTelemetry and OpenTracing tracers based on opts. It always returns +// valid tracers. func newTracer(logger log.Logger, opts *options) (opentracing.Tracer, oteltrace.TracerProvider, io.Closer, error) { logger.Debug("configuring tracer") var exporter oteltracesdk.SpanExporter var err error switch opts.TracerType { - case Jaeger, openTracing: - exporter, err = exporters.NewJaegerExporter() case OpenTelemetry: exporter, err = exporters.NewOTLPExporter(context.Background(), logger) + + case Jaeger: + exporter, err = exporters.NewJaegerExporter() + + default: + err = errors.Newf("unknown tracer type %q", opts.TracerType) } if err != nil || exporter == nil { diff --git a/internal/tracer/watch.go b/internal/tracer/watch.go new file mode 100644 index 00000000000..732f223256f --- /dev/null +++ b/internal/tracer/watch.go @@ -0,0 +1,93 @@ +package tracer + +import ( + "github.com/sourcegraph/log" + + "github.com/sourcegraph/sourcegraph/internal/conf/conftypes" + "github.com/sourcegraph/sourcegraph/internal/trace/policy" +) + +func newConfWatcher( + logger log.Logger, + c conftypes.SiteConfigQuerier, + otelProvider *switchableOtelTracerProvider, + otTracer *switchableOTTracer, + initialOpts options, +) func() { + // always keep a reference to our existing options to determine if an update is needed + oldOpts := initialOpts + + // return function to be called on every conf update + return func() { + var ( + siteConfig = c.SiteConfig() + tracingConfig = siteConfig.ObservabilityTracing + previousPolicy = policy.GetTracePolicy() + setTracerType = None + debug = false + ) + + // If 'observability.tracing: {}', try to enable tracing by default + if tracingConfig != nil { + // If sampling policy is set, update the strategy and set a default TracerType + var newPolicy policy.TracePolicy + switch p := policy.TracePolicy(tracingConfig.Sampling); p { + case policy.TraceNone, policy.TraceAll, policy.TraceSelective: + // Set supported policy types + newPolicy = p + default: + // Default to selective + newPolicy = policy.TraceSelective + } + + // Set and log our new trace policy + if newPolicy != previousPolicy { + policy.SetTracePolicy(newPolicy) + logger.Info("updated TracePolicy", + log.String("previous", string(previousPolicy)), + log.String("new", string(newPolicy))) + } + + // If the tracer type is configured, also set the tracer type. Otherwise, set + // a default tracer type, unless the desired policy is none. + if t := TracerType(tracingConfig.Type); t.isSetByUser() { + setTracerType = t + } else if newPolicy != policy.TraceNone { + setTracerType = DefaultTracerType + } + + // Configure debug mode + debug = tracingConfig.Debug + } + + // collect options + opts := options{ + TracerType: setTracerType, + externalURL: siteConfig.ExternalURL, + debug: debug, + // Stays the same + resource: oldOpts.resource, + } + if opts == oldOpts { + // Nothing changed + return + } + + // update old opts for comparison + oldOpts = opts + + // create new tracer providers + tracerLogger := logger.With( + log.String("tracerType", string(opts.TracerType)), + log.Bool("debug", opts.debug)) + otImpl, otelImpl, closer, err := newTracer(tracerLogger, &opts) + if err != nil { + tracerLogger.Warn("failed to initialize tracer", log.Error(err)) + // do not return - we still want to update tracers + } + + // update global tracers + otelProvider.set(otelImpl, closer, opts.debug) + otTracer.set(tracerLogger, otImpl, opts.debug) + } +} diff --git a/internal/tracer/watch_test.go b/internal/tracer/watch_test.go new file mode 100644 index 00000000000..16e9238e9d3 --- /dev/null +++ b/internal/tracer/watch_test.go @@ -0,0 +1,90 @@ +package tracer + +import ( + "testing" + + "github.com/opentracing/opentracing-go" + "github.com/sourcegraph/log/logtest" + "github.com/stretchr/testify/assert" + oteltrace "go.opentelemetry.io/otel/trace" + + "github.com/sourcegraph/sourcegraph/internal/conf/conftypes" + "github.com/sourcegraph/sourcegraph/schema" +) + +type mockConfig struct { + get func() schema.SiteConfiguration +} + +var _ conftypes.SiteConfigQuerier = &mockConfig{} + +func (m *mockConfig) SiteConfig() schema.SiteConfiguration { return m.get() } + +func TestConfigWatcher(t *testing.T) { + var ( + logger = logtest.Scoped(t) + switchableOTel = newSwitchableOtelTracerProvider(logger.Scoped("otel", "")) + switchableOT = newSwitchableOTTracer(logger.Scoped("ot", "")) + confQuerier = &mockConfig{} + ) + + update := newConfWatcher( + logger, + confQuerier, + switchableOTel, + switchableOT, + options{}, + ) + + t.Run("tracing disabled", func(t *testing.T) { + confQuerier.get = func() schema.SiteConfiguration { + return schema.SiteConfiguration{} + } + + update() + + // should all be no-op + assert.Equal(t, oteltrace.NewNoopTracerProvider().Tracer(""), switchableOTel.Tracer("")) + assert.Equal(t, opentracing.NoopTracer{}, switchableOT.tracer) + }) + + t.Run("enable tracing with 'observability.tracing: {}'", func(t *testing.T) { + confQuerier.get = func() schema.SiteConfiguration { + return schema.SiteConfiguration{ + ObservabilityTracing: &schema.ObservabilityTracing{}, + } + } + + // fetch updated conf + update() + + // should not be no-op + assert.NotEqual(t, oteltrace.NewNoopTracerProvider().Tracer(""), switchableOTel.Tracer("")) + assert.NotEqual(t, opentracing.NoopTracer{}, switchableOT.tracer) + + // should have debug set to false + assert.False(t, switchableOTel.current.Load().(*otelTracerProviderCarrier).debug) + assert.False(t, switchableOT.debug) + }) + + t.Run("update tracing with debug", func(t *testing.T) { + confQuerier.get = func() schema.SiteConfiguration { + return schema.SiteConfiguration{ + ObservabilityTracing: &schema.ObservabilityTracing{ + Debug: true, + }, + } + } + + // fetch updated conf + update() + + // should not be no-op + assert.NotEqual(t, oteltrace.NewNoopTracerProvider().Tracer(""), switchableOTel.Tracer("")) + assert.NotEqual(t, opentracing.NoopTracer{}, switchableOT.tracer) + + // should have debug set + assert.True(t, switchableOTel.current.Load().(*otelTracerProviderCarrier).debug) + assert.True(t, switchableOT.debug) + }) +} diff --git a/schema/schema.go b/schema/schema.go index ecb7070d703..07261a8dc10 100644 --- a/schema/schema.go +++ b/schema/schema.go @@ -1306,11 +1306,11 @@ type ObservabilityClient struct { type ObservabilityTracing struct { // Debug description: Turns on debug logging of tracing client requests. This can be useful for debugging connectivity issues between the tracing client and tracing backend, the performance overhead of tracing, and other issues related to the use of distributed tracing. May have performance implications in production. Debug bool `json:"debug,omitempty"` - // Sampling description: Determines the requests for which distributed traces are recorded. "none" (default) turns off tracing entirely. "selective" sends traces whenever `?trace=1` is present in the URL. "all" sends traces on every request. Note that this only affects the behavior of the distributed tracing client. An appropriate tracing backend must be running for traces to be collected (for "opentracing", a Jaeger instance must be running as described in the Sourcegraph installation instructions). Additional downsampling can be configured in tracing backend (for Jaeger, see https://www.jaegertracing.io/docs/1.17/sampling). + // Sampling description: Determines the conditions under distributed traces are recorded. "none" turns off tracing entirely. "selective" (default) sends traces whenever `?trace=1` is present in the URL. "all" sends traces on every request. Note that this only affects the behavior of the distributed tracing client. An appropriate tracing backend must be running for traces to be collected (for "opentracing", a Jaeger instance must be running as described in the Sourcegraph installation instructions). Additional downsampling can be configured in tracing backend (for Jaeger, see https://www.jaegertracing.io/docs/1.17/sampling). Sampling string `json:"sampling,omitempty"` - // Type description: Determines what tracing provider to enable. For "jaeger", a Jaeger instance is required. For "opentelemetry" (EXPERIMENTAL), the required backend is a OpenTelemetry collector instance. "datadog" and "opentracing" options are deprecated, and the configuration option will be removed in a future release. + // Type description: Determines what tracing provider to enable. For "opentelemetry", the required backend is an OpenTelemetry collector instance (deployed by default with Sourcegraph). For "jaeger", a Jaeger instance is required to be configured via Jaeger client environment variables: https://github.com/jaegertracing/jaeger-client-go#environment-variables Type string `json:"type,omitempty"` - // UrlTemplate description: Template for linking to trace URLs - '{{ .TraceID }}' is replaced with the trace ID, and {{ .ExternalURL }} is replaced with the value of 'externalURL'. + // UrlTemplate description: Template for linking to trace URLs - '{{ .TraceID }}' is replaced with the trace ID, and {{ .ExternalURL }} is replaced with the value of 'externalURL'. If none is set, no links are generated. UrlTemplate string `json:"urlTemplate,omitempty"` } @@ -1356,7 +1356,7 @@ type OpenIDConnectAuthProvider struct { // OpenTelemetry description: Configuration for the client OpenTelemetry exporter type OpenTelemetry struct { - // Endpoint description: OpenTelemetry tracing collector endpoint + // Endpoint description: OpenTelemetry tracing collector endpoint. By default, Sourcegraph's "/-/debug/otlp" endpoint forwards data to the configured collector backend. Endpoint string `json:"endpoint,omitempty"` } diff --git a/schema/site.schema.json b/schema/site.schema.json index 221debf1669..7cdf5c9f15d 100644 --- a/schema/site.schema.json +++ b/schema/site.schema.json @@ -1197,7 +1197,7 @@ "type": "object", "properties": { "endpoint": { - "description": "OpenTelemetry tracing collector endpoint", + "description": "OpenTelemetry tracing collector endpoint. By default, Sourcegraph's \"/-/debug/otlp\" endpoint forwards data to the configured collector backend.", "type": "string", "examples": ["/-/debug/otlp", "https://COLLECTOR_ENDPOINT"], "default": "/-/debug/otlp" @@ -1210,28 +1210,32 @@ "description": "Controls the settings for distributed tracing.", "type": "object", "properties": { - "type": { - "description": "Determines what tracing provider to enable. For \"jaeger\", a Jaeger instance is required. For \"opentelemetry\" (EXPERIMENTAL), the required backend is a OpenTelemetry collector instance. \"datadog\" and \"opentracing\" options are deprecated, and the configuration option will be removed in a future release.", - "type": "string", - "enum": ["jaeger", "opentelemetry", "opentracing", "datadog"], - "default": "jaeger" - }, "sampling": { - "description": "Determines the requests for which distributed traces are recorded. \"none\" (default) turns off tracing entirely. \"selective\" sends traces whenever `?trace=1` is present in the URL. \"all\" sends traces on every request. Note that this only affects the behavior of the distributed tracing client. An appropriate tracing backend must be running for traces to be collected (for \"opentracing\", a Jaeger instance must be running as described in the Sourcegraph installation instructions). Additional downsampling can be configured in tracing backend (for Jaeger, see https://www.jaegertracing.io/docs/1.17/sampling).", + "description": "Determines the conditions under distributed traces are recorded. \"none\" turns off tracing entirely. \"selective\" (default) sends traces whenever `?trace=1` is present in the URL. \"all\" sends traces on every request. Note that this only affects the behavior of the distributed tracing client. An appropriate tracing backend must be running for traces to be collected (for \"opentracing\", a Jaeger instance must be running as described in the Sourcegraph installation instructions). Additional downsampling can be configured in tracing backend (for Jaeger, see https://www.jaegertracing.io/docs/1.17/sampling).", "type": "string", "enum": ["selective", "all", "none"], "default": "selective" }, + "type": { + "description": "Determines what tracing provider to enable. For \"opentelemetry\", the required backend is an OpenTelemetry collector instance (deployed by default with Sourcegraph). For \"jaeger\", a Jaeger instance is required to be configured via Jaeger client environment variables: https://github.com/jaegertracing/jaeger-client-go#environment-variables", + "type": "string", + "enum": ["opentelemetry", "jaeger"], + "default": "opentelemetry" + }, "debug": { "description": "Turns on debug logging of tracing client requests. This can be useful for debugging connectivity issues between the tracing client and tracing backend, the performance overhead of tracing, and other issues related to the use of distributed tracing. May have performance implications in production.", "type": "boolean", "default": false }, "urlTemplate": { - "description": "Template for linking to trace URLs - '{{ .TraceID }}' is replaced with the trace ID, and {{ .ExternalURL }} is replaced with the value of 'externalURL'.", + "description": "Template for linking to trace URLs - '{{ .TraceID }}' is replaced with the trace ID, and {{ .ExternalURL }} is replaced with the value of 'externalURL'. If none is set, no links are generated.", "type": "string", - "default": "{{ .ExternalURL }}/-/debug/jaeger/trace/{{ .TraceID }}", - "examples": ["https://ui.honeycomb.io/$ORG/environments/$DATASET/trace?trace_id={{ .TraceID }}"] + "examples": [ + "https://ui.honeycomb.io/$ORG/environments/$DATASET/trace?trace_id={{ .TraceID }}", + "https://console.cloud.google.com/traces/list?tid={{ .TraceID }}&project=$PROJECT", + "https://sourcegraph.grafana.net/explore?orgId=1&left=[\"now-1h\",\"now\",\"grafanacloud-sourcegraph-traces\",{\"query\":\"{{ .TraceID }}\",\"queryType\":\"traceId\"}]", + "{{ .ExternalURL }}/-/debug/jaeger/trace/{{ .TraceID }}" + ] } } },