02/20/26

Best Tracing Tools for Go Backends in 2026

Compare instrumentation approaches and tracing backends for Go

14 Min Read

Setting up tracing in a Go backend means making two decisions that are easy to conflate: how you instrument your code to generate spans, and where you send those spans for storage and querying. Most comparison guides focus on the backend (Jaeger vs Datadog vs Honeycomb), but in practice it's the instrumentation side that determines whether your team ends up with traces that are useful for debugging or traces that show "something called something" with no context.

This guide covers both halves. We'll look at how each instrumentation approach works in Go, compare the major tracing backends, and help you pick the combination that fits your project. We'll start with the manual approach using OpenTelemetry, then look at how modern frameworks can eliminate most of this work entirely.

The two halves of tracing

A trace records a request's path through your system. Each unit of work along the way becomes a span: an API call, a database query, a Pub/Sub message, a service-to-service call. Spans nest to show causality: this HTTP handler ran that SQL query, which was followed by that cache write.

To get traces, you need two things:

  1. Instrumentation: code that creates spans at the right points in your application. This is where the data comes from.
  2. A tracing backend: a system that collects, stores, and lets you query those spans. This is where you look at the data.

Most Go teams spend more time wrangling instrumentation than evaluating backends. The backend choice matters, but it's a more straightforward comparison. Instrumentation is where the real tradeoffs live.

Instrumentation approaches

OpenTelemetry Go SDK

OpenTelemetry (OTel) is the CNCF standard for vendor-neutral telemetry. It provides APIs and SDKs for traces, metrics, and logs across languages. In Go, the SDK integrates with context.Context for propagation, which makes it more natural than in languages without a first-class context mechanism. (For a full walkthrough, see our OpenTelemetry Go setup guide.)

A typical setup involves initializing a tracer provider with an OTLP exporter, a service name resource, and a batch span processor, pulling in packages from go.opentelemetry.io/otel, otel/sdk/trace, otel/exporters/otlp, otel/sdk/resource, and otel/semconv. Then you wrap your HTTP handlers with otelhttp:

    import "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"

    mux := http.NewServeMux()
    mux.HandleFunc("/orders", handleCreateOrder)
    handler := otelhttp.NewHandler(mux, "server")
    http.ListenAndServe(":8080", handler)

Database calls need their own wrapping. The otelsql package wraps database/sql drivers:

    import "github.com/XSAM/otelsql"

    db, err := otelsql.Open("postgres", connStr,
        otelsql.WithAttributes(semconv.DBSystemPostgreSQL),
    )

For your own business logic and service-to-service calls, you create manual spans:

    var tracer = otel.Tracer("order-service")

    func processOrder(ctx context.Context, orderID string) error {
        ctx, span := tracer.Start(ctx, "processOrder")
        defer span.End()
        span.SetAttributes(attribute.String("order.id", orderID))

        items, err := fetchItems(ctx, orderID)
        if err != nil {
            span.RecordError(err)
            span.SetStatus(codes.Error, err.Error())
            return err
        }
        return chargePayment(ctx, orderID, calculateTotal(items))
    }

Go's context.Context carrying the span makes propagation cleaner than in Node.js, where async context can be fragile. But you still need to thread ctx through every function call, and any function that doesn't accept a context breaks the chain.

OTel is vendor-neutral with mature Go support. It works with any backend that accepts OTLP, and Go's context.Context makes propagation more reliable than in many other languages. The tradeoff is that it requires wrapping every HTTP handler, every database driver, and every outgoing HTTP client, and manual spans for business logic are your responsibility. A typical setup pulls in 5-8 go.opentelemetry.io packages, and new endpoints and service calls need explicit instrumentation or they're invisible.

Coverage in practice: otelhttp and otelsql cover the edges, meaning incoming requests and database queries. The interior of your application (service calls, business logic, conditional branches) needs manual spans. Teams that don't invest in those get traces that show a request came in, hit the database, and returned, but not what happened in between.

Vendor SDKs (Datadog, New Relic)

Datadog's dd-trace-go and New Relic's Go agent take a different approach. You import a single package, and it provides middleware, database integrations, and a tracer that sends data directly to the vendor's backend.

    import (
        "net/http"

        "github.com/lib/pq"
        "gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
        httptrace "gopkg.in/DataDog/dd-trace-go.v1/contrib/net/http"
        sqltrace "gopkg.in/DataDog/dd-trace-go.v1/contrib/database/sql"
    )

    func main() {
        tracer.Start(tracer.WithService("order-service"))
        defer tracer.Stop()

        // Wrap the database driver
        sqltrace.Register("postgres", &pq.Driver{})
        db, _ := sqltrace.Open("postgres", connStr)
        defer db.Close()

        // Wrap the HTTP mux
        mux := httptrace.NewServeMux()
        mux.HandleFunc("/orders", handleCreateOrder)
        http.ListenAndServe(":8080", mux)
    }

Setup is simpler than assembling OTel packages yourself. Datadog maintains contrib packages for common Go libraries (gorilla/mux, gRPC, Redis, etc.), and they update those integrations on their own release cycle.

The initial setup is simpler than OTel and the library integrations are well-maintained, with tight correlation between Datadog's metrics, logs, and profiling. The tradeoff is vendor lock-in: your instrumentation code imports Datadog-specific packages, so switching to a different backend means rewriting instrumentation. You still need manual spans for business logic, and costs scale with data volume, which adds up in Go microservice architectures where each service generates spans independently.

Coverage in practice: Similar to OTel auto-instrumentation. HTTP handlers and database calls are covered, but the logic between them isn't. You get the same coverage gap for any code that doesn't go through a wrapped library.

Framework-level instrumentation (Encore)

Encore handles instrumentation at the framework level. In Go, you annotate functions with //encore:api and use typed infrastructure primitives for databases, Pub/Sub, caching, and cron jobs. Encore's compiler analyzes your code at build time and generates instrumentation for every API endpoint, every service-to-service call, and every infrastructure operation.

There's no tracing setup code:

    package orders

    import (
        "context"

        "encore.dev/storage/sqldb"
    )

    var db = sqldb.NewDatabase("orders", sqldb.DatabaseConfig{
        Migrations: "./migrations",
    })

    type Order struct {
        ID     string `json:"id"`
        UserID string `json:"userId"`
        Status string `json:"status"`
    }

    //encore:api public method=GET path=/orders/:id
    func GetOrder(ctx context.Context, id string) (*Order, error) {
        order := &Order{}
        err := db.QueryRow(ctx,
            "SELECT id, user_id, status FROM orders WHERE id = $1", id,
        ).Scan(&order.ID, &order.UserID, &order.Status)
        if err != nil {
            return nil, err
        }
        return order, nil
    }

That endpoint is fully traced. The API call, the database query, and any service-to-service calls made inside the handler are all captured without a single line of instrumentation code. Traces show up in the local development dashboard at localhost:9400 the moment you make a request.

Service-to-service calls are traced automatically too, because Encore owns the RPC layer:

    package orders

    import (
        "context"

        "encore.app/payments"
    )

    //encore:api public method=POST path=/orders
    func CreateOrder(ctx context.Context, req *CreateOrderRequest) (*Order, error) {
        // This call is traced automatically; context propagation is built in.
        _, err := payments.Charge(ctx, &payments.ChargeRequest{
            UserID: req.UserID,
            Amount: req.Amount,
        })
        if err != nil {
            return nil, err
        }
        return createOrderRecord(ctx, req)
    }

You get zero instrumentation code with 100% coverage of all framework primitives (APIs, databases, Pub/Sub, cron, caches, service calls). Traces work locally during development with nothing to install or configure, and in production through Encore Cloud's built-in Trace Explorer.

Coverage in practice: For anything built with Encore's primitives, coverage is 100% by definition. The compiler knows what your code does because you declared it using typed primitives. New endpoints and infrastructure calls are traced automatically the moment you write them.

This eliminates a common failure mode: instrumentation rot. In OTel-based Go projects, developers add spans when code is first written. As the codebase evolves, new endpoints and service calls appear without corresponding instrumentation. Nobody notices until a production incident hits an uninstrumented path. With compiler-driven instrumentation, that category of problem doesn't exist.

Why local dev tracing matters

Most tracing setups only work in deployed environments. You instrument your code, push to staging, and then check whether traces look correct. If a span is missing or context didn't propagate, you iterate through another deploy cycle.

This means most debugging during development still happens with fmt.Println. Traces become something you look at after the fact rather than a tool you use while building.

Encore changes this by making traces available locally. When you run encore run, the local development dashboard at localhost:9400 shows every trace in real time. You can see the exact SQL queries an endpoint executed, how long each service call took, and where errors occurred. It turns tracing from a production-only observability tool into part of the development workflow. For a practical example of using local traces to debug slow API requests, see our companion guide.

For OTel-based setups, you can approximate this by running Jaeger locally via Docker (docker run -p 16686:16686 -p 4317:4317 -p 4318:4318 jaegertracing/all-in-one, where 16686 serves the UI and 4317/4318 accept OTLP), configuring your exporter to point at it, and making sure your instrumentation runs in your dev environment. It works, but it's another thing to set up and maintain, and most teams don't bother.

Tracing backends

Once you have instrumentation generating spans, you need somewhere to send them. Here are the main options.

Jaeger

Jaeger is an open-source distributed tracing system originally built at Uber, now a CNCF graduated project. It accepts data via OTLP (and its older native format), stores it in Cassandra, Elasticsearch, or an in-memory store, and provides a web UI for searching and visualizing traces.

Good for: Teams that want full control over their tracing infrastructure. Self-hosted environments where data can't leave the network.

Limitations: You run the infrastructure yourself, including provisioning storage, managing upgrades, and handling retention. The UI is functional but basic compared to commercial alternatives. No built-in alerting or SLO tracking.

Cost: Free (open source). You pay for compute and storage to run it.

Grafana Tempo

Grafana Tempo is a high-scale tracing backend designed for the Grafana ecosystem. It stores trace data in object storage (S3, GCS), keeping costs low compared to indexed storage. Tempo integrates with Grafana for visualization, Loki for log correlation, and Prometheus/Mimir for metrics.

Good for: Teams already invested in Prometheus + Grafana. High-volume environments where storage cost matters.

Limitations: Requires the broader Grafana stack to get full value. Trace discovery depends on good metadata because Tempo doesn't index all span attributes by default. Grafana Cloud offers a managed version, but pricing scales with data volume.

Cost: Free (open source, self-hosted) or Grafana Cloud starting at $0.50/GB for traces.

Datadog APM

Datadog APM is a full-featured SaaS observability platform. Traces, metrics, logs, profiling, and error tracking all live in one place. The correlation between these signals is Datadog's strongest feature: moving from a slow trace to the relevant logs to the host metrics takes a single click.

Good for: Enterprise teams that want a single platform for all observability. Teams that value polished UX and don't want to run infrastructure.

Limitations: Expensive at scale. Datadog charges per host and per ingested span, which can lead to surprise bills. The dd-trace-go agent works best with Datadog's own backend, creating a soft lock-in.

Cost: Starting at $31/host/month for APM, plus $0.10 per million analyzed spans beyond the included allocation.

Honeycomb

Honeycomb takes a different approach. Instead of pre-aggregated dashboards, it stores high-cardinality event data and lets you slice it in arbitrary ways: group by any attribute, filter by any combination, drill down without predefined queries. This makes it strong for debugging novel problems where you don't know what you're looking for until you start exploring. The Go community has been an early adopter, and Honeycomb's Go SDK integrates well with OpenTelemetry.

Good for: Teams that debug complex distributed systems with high-cardinality data. SRE teams doing incident response.

Limitations: The query-first approach has a learning curve. Pricing is based on event volume, which can be expensive for high-throughput Go services. Narrower than Datadog, so you'll need other tools for infrastructure monitoring and log management.

Cost: Free tier with 20 million events/month. Pro starts at $130/month for 100 million events.

Encore Cloud

Encore Cloud includes a built-in tracing backend as part of its development platform. Traces from your Encore application are collected automatically and stored in a ClickHouse-backed system optimized for trace queries. The Trace Explorer provides filtering by service, endpoint, status code, and time range, with percentile distributions (p50, p95, p99) and deploy correlation.

Because Encore handles both instrumentation and storage, there's no exporter to configure, no collector to run, no sampling decisions to make. Traces from local development and production use the same viewer.

Good for: Teams using Encore that want tracing without additional setup. Local development debugging. Teams that want deploy-correlated observability out of the box.

Cost: Usage-based pricing. Free tier includes trace events. See pricing for details.

Comparison

| | OTel + Jaeger | OTel + Tempo | Datadog APM | Honeycomb | Encore |
|---|---|---|---|---|---|
| Setup complexity | High (OTel SDK + Jaeger infra) | High (OTel SDK + Grafana stack) | Medium (dd-trace-go + agent) | Medium (OTel SDK + SaaS) | None |
| Instrumentation effort | Manual spans for business logic | Manual spans for business logic | Manual spans for business logic | Manual spans for business logic | Zero |
| Local dev tracing | Requires running Jaeger locally | Requires running Tempo locally | Requires Datadog agent | Not available locally | Built-in |
| Coverage guarantee | Depends on discipline | Depends on discipline | Depends on discipline | Depends on discipline | 100% of framework primitives |
| Context propagation | Manual (thread ctx through all calls) | Manual (thread ctx through all calls) | Manual (thread ctx through all calls) | Manual (thread ctx through all calls) | Automatic |
| Cost model | Infrastructure costs | Infrastructure or Grafana Cloud | Per-host + per-span | Per-event | Usage-based |
| Vendor lock-in | None (OTLP standard) | Low (Grafana ecosystem) | High (dd-trace-go + Datadog) | Low (accepts OTLP) | Medium (framework-specific) |
| Best for | Full control, self-hosted | Grafana stack users | Enterprise, single-pane | High-cardinality debugging | Encore users, zero-config |

Sampling considerations

At scale, collecting every trace is expensive. Most tracing systems implement sampling, keeping a percentage of traces and dropping the rest.

Head-based sampling decides at the start of a request whether to trace it. Simple to implement, but you might miss the traces you care about most (errors, slow requests). In Go with OTel, you configure this on the tracer provider:

    tp := sdktrace.NewTracerProvider(
        sdktrace.WithSampler(sdktrace.TraceIDRatioBased(0.1)), // sample 10%
        sdktrace.WithBatcher(exporter),
    )

Tail-based sampling decides after the trace is complete, so you can keep all error traces and all slow traces. This requires buffering span data before making the decision, which adds complexity. The OpenTelemetry Collector supports tail-based sampling via its tail_sampling processor, but running the collector is additional infrastructure.
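As a sketch of what that looks like, a minimal collector pipeline using the tail_sampling processor might keep all error traces plus anything slower than a latency threshold. The policy names and thresholds here are illustrative, not from the original article:

```yaml
processors:
  tail_sampling:
    decision_wait: 10s          # buffer spans this long before deciding
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 500}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```

The decision_wait buffer is where the operational cost lives: the collector must hold every in-flight trace in memory until the window closes.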

With Encore Cloud, sampling is handled automatically. The platform retains all traces up to your plan's event limit, with priority given to errors and slow requests. You don't configure anything.

For most Go backends that aren't handling millions of requests per second, sampling isn't an immediate concern. But it's worth understanding that your backend choice affects how sampling works and what data you retain when you do hit scale.

What to consider when choosing

A few questions that narrow down the decision:

How many services are in the system? A single Go service can get away with simpler instrumentation. Once you have three or more services communicating over the network, distributed tracing becomes critical, and the instrumentation burden scales with every service boundary.

Do you have a platform team? Running Jaeger or Tempo requires ongoing maintenance. If nobody owns that infrastructure, a SaaS backend (or Encore's built-in tracing) removes the burden.

How important is local debugging? If your team debugs request flows during development, having traces locally changes the workflow. With OTel, running Jaeger in Docker is feasible but adds friction. With Encore, traces are there from encore run.

What's your budget? Datadog is capable but expensive. Honeycomb is strong for exploratory debugging but pricing scales with volume. Open-source backends are free but cost engineering time. Encore offers usage-based tracing pricing.

Are you starting fresh or instrumenting existing code? Retrofitting OTel into an existing Go codebase means wrapping every handler, every database call, and every outgoing HTTP client. Starting with Encore means tracing is there from the first endpoint.

The bottom line

For most Go backend teams, the instrumentation problem is harder than the backend choice. Tracing backends are well-understood infrastructure that you can evaluate on cost, query capabilities, and ecosystem fit. Instrumentation is where projects stall. Teams integrate OpenTelemetry, wrap their HTTP handlers and database drivers, and then never add the manual spans that would make traces useful for debugging the interesting problems.

Go's context.Context makes propagation more natural than in many languages, but someone still has to write and maintain the instrumentation code, and that someone is usually the same team building features.

Encore eliminates instrumentation entirely. The compiler understands your application's structure (every API, every database query, every service call) and generates complete traces without any code from you. Traces work locally from day one through the development dashboard, and in production through Encore Cloud's Trace Explorer with zero configuration.

For existing Go projects on Gin, Chi, or standard net/http, OpenTelemetry is the standard choice. Pair it with whichever backend fits your operational appetite and budget. If you're starting something new, it's worth considering whether maintaining instrumentation code is where you want to spend your engineering time.

Getting started

Install the Encore CLI and create a new Go project:

    # macOS
    brew install encoredev/tap/encore

    # Linux
    curl -L https://encore.dev/install.sh | bash

    # Windows
    iwr https://encore.dev/install.ps1 | iex

    encore app create my-app --example=go/hello-world
    cd my-app
    encore run

Open the local development dashboard at http://localhost:9400, make a request to your API, and click into the trace view. Every span is already there.

See the tracing documentation for more on Encore's built-in observability, or explore the Encore Cloud platform for production tracing with the Trace Explorer.


Building a Go backend? Join our Discord community to discuss tracing and observability with other developers.
