02/20/26

Best Tracing and Observability Tools for TypeScript in 2026

Comparing instrumentation approaches, tracing backends, and observability platforms

14 Min Read

Getting tracing to work in a TypeScript backend involves two separate decisions that often get conflated: how you generate traces (instrumentation) and where you store and view them (the backend). Most comparisons focus on the backend, but in practice it's the instrumentation side that determines whether your team actually gets useful traces in production.

This guide covers both. We'll look at how each instrumentation approach works in practice, compare the major tracing backends, and help you figure out which combination makes sense for your project. We'll start with the manual approach using OpenTelemetry, then look at how modern frameworks can eliminate most of this work entirely.

The two halves of tracing

A trace is a record of a request as it moves through your system. Each step in the request becomes a span: an API call, a database query, a Pub/Sub message, a service-to-service call. Spans are nested to show causality: this HTTP handler called that database query, which triggered that cache lookup.
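
Conceptually, a trace is just a tree of timed spans. As a rough illustration (these types are hypothetical, not any SDK's API; real span models carry IDs, attributes, and status codes), a span tree might look like:

```typescript
// Hypothetical span shape for illustration only.
interface Span {
  name: string;        // e.g. "GET /orders/:id"
  startMs: number;     // offset from trace start, in milliseconds
  durationMs: number;  // how long this step took
  children: Span[];    // steps this span caused
}

const exampleTrace: Span = {
  name: "GET /orders/:id",
  startMs: 0,
  durationMs: 42,
  children: [
    {
      name: "SELECT orders",
      startMs: 3,
      durationMs: 12,
      children: [{ name: "cache lookup", startMs: 4, durationMs: 1, children: [] }],
    },
    { name: "charge payment", startMs: 18, durationMs: 20, children: [] },
  ],
};

// Count every span in the tree; the example above has 4.
function spanCount(span: Span): number {
  return 1 + span.children.reduce((n, c) => n + spanCount(c), 0);
}
```

The nesting is the whole point: the tree shape tells you that the cache lookup happened inside the database query, which happened inside the HTTP handler.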

To get traces, you need two things:

  1. Instrumentation: code that creates spans at the right points in your application. This is where the data comes from.
  2. A tracing backend: a system that collects, stores, and lets you query those spans. This is where you look at the data.

Most teams spend more time fighting with instrumentation than choosing a backend. The backend choice matters, but it's a more straightforward evaluation. Instrumentation is where the real tradeoffs live.

Instrumentation approaches

OpenTelemetry SDK

OpenTelemetry (OTel) is the CNCF standard for telemetry data. It provides vendor-neutral APIs and SDKs for generating traces, metrics, and logs. If you want maximum flexibility and don't want to be locked into a specific vendor, OTel is the default choice.

Setting it up in a TypeScript project looks something like this:

```typescript
// tracing.ts (must be loaded before anything else)
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { Resource } from '@opentelemetry/resources';
import { ATTR_SERVICE_NAME } from '@opentelemetry/semantic-conventions';

const sdk = new NodeSDK({
  resource: new Resource({
    [ATTR_SERVICE_NAME]: 'my-service',
  }),
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces',
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```

That handles automatic instrumentation for common libraries (HTTP, Express, database drivers). But for your own business logic and service-to-service calls, you need manual spans:

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service');

async function processOrder(orderId: string) {
  return tracer.startActiveSpan('processOrder', async (span) => {
    try {
      span.setAttribute('order.id', orderId);

      const items = await tracer.startActiveSpan('fetchItems', async (childSpan) => {
        const result = await db.query('SELECT * FROM items WHERE order_id = $1', [orderId]);
        childSpan.setAttribute('item.count', result.rows.length);
        childSpan.end();
        return result.rows;
      });

      await tracer.startActiveSpan('chargePayment', async (childSpan) => {
        await payments.charge({ orderId, amount: calculateTotal(items) });
        childSpan.end();
      });

      span.setStatus({ code: SpanStatusCode.OK });
    } catch (error) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: (error as Error).message });
      throw error;
    } finally {
      span.end();
    }
  });
}
```

OTel is vendor-neutral, comprehensive, and backed by a large community. It works with any backend that supports OTLP. The tradeoff is that setup involves 6-8 packages minimum (@opentelemetry/sdk-node, @opentelemetry/api, @opentelemetry/auto-instrumentations-node, an exporter, resource definitions, semantic conventions, plus instrumentations for specific libraries). Auto-instrumentation covers HTTP and database calls but misses your application logic, and manual spans add boilerplate that requires discipline to maintain. Getting consistent trace context across async boundaries in Node.js can also be tricky.
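
The async-context problem is worth understanding. OTel's default Node context manager builds on AsyncLocalStorage from node:async_hooks; a minimal sketch of the same idea (names here are hypothetical, not the OTel API) looks like this:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical trace-context store. OTel's context manager relies on the
// same AsyncLocalStorage mechanism to carry the active span across awaits.
const traceContext = new AsyncLocalStorage<{ traceId: string }>();

function withTrace<T>(traceId: string, fn: () => T): T {
  // Everything called (directly or via async continuations) inside fn
  // sees this context.
  return traceContext.run({ traceId }, fn);
}

function currentTraceId(): string | undefined {
  return traceContext.getStore()?.traceId;
}
```

Because the storage follows Node's async resource tracking, the context survives awaits and callbacks. But if a library grabs its own references before the SDK installs its context manager, spans from that library silently lose the ambient context, which is why import order matters so much.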

Coverage in practice: Auto-instrumentation gets you maybe 60-70% of the picture. The remaining 30-40% (the business logic, the conditional branches, the service-to-service calls) requires manual work. Teams that don't invest in that manual work end up with traces that show "something called something" without enough context to debug actual issues.
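
One way to keep that manual work sustainable is a small wrapper that does the end/error bookkeeping once, so each business-logic span is one call instead of a try/catch/finally block. A sketch against a minimal, hypothetical span interface (the real @opentelemetry/api types are shaped similarly but not identical):

```typescript
// Minimal, hypothetical span interface for illustration; swap in
// @opentelemetry/api's Tracer and Span types in a real project.
interface MiniSpan {
  setAttribute(key: string, value: string | number): void;
  recordError(message: string): void;
  end(): void;
}

interface MiniTracer {
  startSpan(name: string): MiniSpan;
}

// Run a unit of business logic inside a span, guaranteeing the span is
// ended and errors are recorded, regardless of how fn exits.
async function withSpan<T>(
  tracer: MiniTracer,
  name: string,
  fn: (span: MiniSpan) => Promise<T>,
): Promise<T> {
  const span = tracer.startSpan(name);
  try {
    return await fn(span);
  } catch (err) {
    span.recordError((err as Error).message);
    throw err;
  } finally {
    span.end();
  }
}
```

With a helper like this, adding a span to a new code path is one line, which makes it far more likely the team actually does it.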

One common pain point: the OTel SDK must be initialized before any other imports. In TypeScript projects using ES modules, this ordering is fragile. A misplaced import can silently break context propagation, and you won't know until you notice traces are missing spans in production. The --require flag helps with CommonJS, but ESM projects need a --loader or a separate entry point.
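
In practice the SDK entry point is usually wired in via Node flags rather than an import statement, so nothing can accidentally load before it. The paths below are illustrative:

```shell
# CommonJS build: preload the tracing module before the app
node --require ./dist/tracing.js ./dist/server.js

# ESM build (recent Node versions): use --import instead
node --import ./dist/tracing.mjs ./dist/server.mjs
```

Putting this in your npm start script keeps the ordering out of application code entirely, which is the safest place for it.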

Vendor SDKs (Datadog, New Relic)

Datadog's dd-trace and New Relic's agent take a different approach. Instead of assembling OTel packages yourself, you install a single agent that hooks into Node.js at runtime.

```typescript
// Datadog (must be imported first)
import 'dd-trace/init';

// That's it for basic setup. The agent patches
// supported libraries automatically.
```

The agent monkey-patches common libraries (Express, HTTP, pg, Redis, etc.) to generate spans without code changes. Datadog and New Relic have dedicated teams maintaining these patches, so coverage of popular libraries is good.

The initial setup is simpler than OTel, with well-maintained library patches and tight integration with the vendor's full observability stack (logs, metrics, and traces correlated automatically). The tradeoff is vendor lock-in: monkey-patching can cause subtle issues with newer libraries or TypeScript-specific patterns, you still need manual instrumentation for business logic and custom spans, and migrating away means rewriting your instrumentation. Costs scale with data volume, which can get expensive.

Coverage in practice: Similar to OTel auto-instrumentation, in that library-level calls are covered but your application's internal logic isn't. Datadog's runtime metrics add some visibility, but traces through your own code still require dd-trace's manual span API.

One thing to watch out for: vendor agents add runtime overhead. They patch modules at startup, which increases cold start times and memory usage. In serverless or containerized environments where startup speed matters, this can be a meaningful tradeoff.

Framework-level instrumentation (Encore)

Encore takes a fundamentally different approach. Instead of patching libraries at runtime, Encore's Rust-based compiler analyzes your code at build time and understands every API endpoint, service call, database query, Pub/Sub message, and cron job. It generates instrumentation automatically as part of compilation.

There's no setup code:

```typescript
import { api, APIError } from "encore.dev/api";
import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("orders", { migrations: "./migrations" });

export const getOrder = api(
  { method: "GET", path: "/orders/:id", expose: true },
  async ({ id }: { id: string }): Promise<Order> => {
    const order = await db.queryRow<Order>`
      SELECT * FROM orders WHERE id = ${id}
    `;
    if (!order) {
      throw APIError.notFound("order not found");
    }
    return order;
  }
);
```

This endpoint is fully traced. The API call, the database query, any service-to-service calls made inside the handler, error states, all captured without a single line of instrumentation code. The trace appears in your local development dashboard at localhost:9400 the moment you make a request.

You get zero instrumentation code with 100% coverage of all framework primitives (APIs, databases, Pub/Sub, cron, caches, service calls). Traces work locally during development, not just in production, with nothing to install or configure. All traces are viewable in Encore's built-in tracing dashboard, both locally during development and in production via the Encore Cloud Trace Explorer.

Coverage in practice: For anything built with Encore's primitives, coverage is 100% by definition. The compiler knows exactly what your code does because you declared it using typed infrastructure primitives. There's nothing to forget, nothing to maintain, and no instrumentation code to drift out of sync with your application.

This approach also eliminates a common failure mode: instrumentation rot. In OTel-based projects, spans get added when code is first written, but as the codebase evolves, new code paths appear without corresponding spans. Nobody notices until a production incident hits an uninstrumented path. With compiler-driven instrumentation, new endpoints and infrastructure calls are traced automatically the moment you write them.

Why local dev tracing matters

Most tracing setups only work in deployed environments. You instrument your code, push to staging, and then check whether traces look right. If a span is missing or a context didn't propagate, you iterate through another deploy cycle.

This is slow and it means most debugging during development still happens with console.log. Traces become something you look at after the fact, not a tool you use while building.

Encore changes this by making traces available locally. When you run encore run, the local dev dashboard at localhost:9400 shows every trace in real time. You can see the exact database queries an endpoint executed, how long each service call took, and where errors occurred. This turns tracing from a production-only observability tool into an active development tool.

For OTel-based setups, you can approximate this by running Jaeger locally via Docker (docker run -p 16686:16686 jaegertracing/all-in-one), but you still need to configure your exporter to point at it and ensure your auto-instrumentation is running in your dev environment. Most teams don't bother.
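
Note that the quick one-liner above only exposes the UI. For your OTel exporter to reach Jaeger, the OTLP ingest ports need to be published too (ports shown are the defaults; recent all-in-one images enable OTLP ingest out of the box, while older ones need COLLECTOR_OTLP_ENABLED=true):

```shell
# 16686: Jaeger UI, 4317: OTLP gRPC ingest, 4318: OTLP HTTP ingest
docker run --rm \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one
```

With this running, the OTLPTraceExporter configuration shown earlier (pointing at http://localhost:4318/v1/traces) will deliver spans to the local Jaeger UI.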

Tracing backends

Once you have instrumentation generating spans, you need somewhere to send them. Here are the main options.

Jaeger

Jaeger is an open-source, end-to-end distributed tracing system originally built by Uber. It's now a CNCF graduated project, which means it's mature and well-maintained.

Jaeger accepts data via OTLP (and its older Jaeger-native format), stores it in Cassandra, Elasticsearch, or an in-memory store, and provides a web UI for searching and visualizing traces.

Good for: Teams that want full control over their tracing infrastructure. Self-hosted environments where data can't leave the network. Budgets that can't accommodate SaaS pricing.

Limitations: You run the infrastructure. That means provisioning storage, handling upgrades, managing retention policies, and keeping the collectors healthy. The UI is functional but basic compared to commercial alternatives. No built-in alerting or SLO tracking.

Cost: Free (open source). You pay for the compute and storage to run it.

A typical Jaeger setup for a small team involves running the Jaeger collector, a storage backend (Elasticsearch or Cassandra), and the Jaeger query service. For evaluation or small deployments, the all-in-one binary works, but production use requires dedicated infrastructure.

Grafana Tempo

Grafana Tempo is a high-scale, cost-efficient tracing backend designed to work with the Grafana ecosystem. It uses object storage (S3, GCS) for trace data, which keeps costs low compared to indexed storage.

Tempo integrates tightly with Grafana for visualization, Loki for log correlation, and Prometheus/Mimir for metrics. If you're already running the Grafana stack, Tempo is the natural tracing addition.

Good for: Teams already invested in Prometheus + Grafana. Organizations that want trace-to-log and trace-to-metric correlation. High-volume environments where storage cost matters.

Limitations: Requires the broader Grafana stack to get full value. Trace discovery depends on having good metadata (service names, error tags) because Tempo doesn't index all span attributes by default. Self-hosted Tempo still means you're running infrastructure. Grafana Cloud offers a managed version, but pricing scales with data volume.

Cost: Free (open source, self-hosted) or Grafana Cloud starting at $0.50/GB for traces.

Datadog APM

Datadog APM is a full-featured SaaS observability platform. Traces, metrics, logs, profiling, error tracking, and security monitoring all live in one place. The correlation between these signals is Datadog's strongest feature, letting you click from a slow trace to the relevant logs to the host metrics seamlessly.

Good for: Enterprise teams that want a single platform for all observability. Organizations with budget for SaaS tooling. Teams that value polished UX and don't want to run infrastructure.

Limitations: Expensive at scale. Datadog's pricing model charges per host and per ingested span, which can lead to surprise bills as your system grows. The dd-trace agent works best with Datadog's own backend, creating a soft lock-in. Some teams report spending more on observability than on their application infrastructure.

Cost: Starting at $31/host/month for APM, plus $0.10 per million analyzed spans beyond the included allocation. Costs add up quickly in microservice architectures.

Honeycomb

Honeycomb approaches observability differently. Instead of pre-aggregated dashboards, Honeycomb stores high-cardinality event data and lets you slice it in arbitrary ways. You can group by any attribute, filter by any combination of fields, and drill down without predefined queries.

This makes Honeycomb particularly strong for debugging novel problems, the ones where you don't know what you're looking for until you start exploring.

Good for: Teams that debug complex, distributed systems with high-cardinality data. Organizations that value exploratory analysis over predefined dashboards. SRE teams doing incident response.

Limitations: The query-first approach has a learning curve. Teams used to dashboards may find it disorienting at first. Pricing is based on event volume, which can be expensive for high-throughput systems. The platform is narrower than Datadog. It does tracing and events well, but you'll need other tools for infrastructure monitoring and log management.

Cost: Free tier with 20 million events/month. Pro starts at $130/month for 100 million events.

Encore Cloud

Encore Cloud includes a built-in tracing backend as part of its development platform. Traces from your Encore application are collected automatically and stored in a ClickHouse-backed system optimized for trace queries.

The Trace Explorer provides filtering by service, endpoint, status code, and time range. You can view percentile distributions (p50, p95, p99), correlate traces with deploys, and drill into individual traces to see the full span tree, including database queries, service-to-service calls, and Pub/Sub messages.

Because Encore handles both instrumentation and storage, the integration is seamless. There's no exporter to configure, no collector to run, no sampling decisions to make. Traces from local development and production use the same viewer.

Good for: Teams using Encore that want tracing without any additional setup. Local development debugging (traces appear instantly in the dev dashboard). Teams that want deploy-correlated observability out of the box.

Cost: Usage-based pricing with a free tier that includes an allowance of trace events. See pricing for details.

Comparison

| | OpenTelemetry + Jaeger | OpenTelemetry + Tempo | Datadog APM | Honeycomb | Encore |
|---|---|---|---|---|---|
| Setup complexity | High (OTel SDK + Jaeger infra) | High (OTel SDK + Grafana stack) | Medium (agent install) | Medium (OTel SDK + SaaS) | None |
| Instrumentation effort | Manual spans for business logic | Manual spans for business logic | Manual spans for business logic | Manual spans for business logic | Zero |
| Local dev tracing | Requires running Jaeger locally | Requires running Tempo locally | Requires Datadog agent | Not available locally | Built-in |
| Coverage guarantee | Depends on discipline | Depends on discipline | Depends on discipline | Depends on discipline | 100% of framework primitives |
| Cost model | Infrastructure costs | Infrastructure or Grafana Cloud | Per-host + per-span | Per-event | Usage-based |
| Vendor lock-in | None (OTLP standard) | Low (Grafana ecosystem) | High (dd-trace + Datadog) | Low (accepts OTLP) | Medium (framework + built-in dashboard) |
| Best for | Full control, self-hosted | Grafana stack users | Enterprise, single-pane | High-cardinality debugging | Encore users, zero-config |

A note on sampling

At scale, collecting every trace is expensive. Most tracing systems implement sampling, keeping a percentage of traces and dropping the rest.

Head-based sampling (deciding at the start of a request) is simple but means you might miss the interesting traces. Tail-based sampling (deciding after the trace is complete) lets you keep all error traces and slow traces, but requires buffering data before making the decision, which adds complexity and infrastructure.
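
Head-based sampling is typically made deterministic on the trace ID, so every service handling the same request reaches the same keep/drop decision independently and traces are either complete or absent, never partial. A minimal sketch (the hash choice here is illustrative, not OTel's exact algorithm):

```typescript
// Decide whether to keep a trace, deterministically, from its ID.
// sampleRate is in [0, 1]; e.g. 0.1 keeps roughly 10% of traces.
function shouldSample(traceId: string, sampleRate: number): boolean {
  // FNV-1a hash of the trace ID (an illustrative choice of hash).
  let hash = 0x811c9dc5;
  for (let i = 0; i < traceId.length; i++) {
    hash ^= traceId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  // Map the 32-bit hash to [0, 1) and compare against the rate.
  return hash / 0x100000000 < sampleRate;
}
```

Because the decision is a pure function of the trace ID, no coordination between services is needed, which is exactly why head-based sampling is cheap, and also why it can't know in advance whether a trace will turn out to be an error worth keeping.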

With OTel, you configure sampling in the SDK or collector. With Datadog, you configure it in the agent. With Encore Cloud, sampling is handled automatically. The platform retains all traces up to your plan's event limit, with priority given to errors and slow requests.

For most TypeScript backends that aren't handling millions of requests per minute, sampling isn't an immediate concern. But it's worth understanding that your choice of backend affects how sampling works and what data you retain.

What to consider when choosing

A few questions that can narrow down the decision:

How many services do you have? Single-service APIs can get away with simpler instrumentation. Once you have 3+ services communicating with each other, distributed tracing becomes critical, and the instrumentation burden grows with every service boundary.

Do you have a dedicated platform team? Running Jaeger or Tempo requires ongoing maintenance. If you don't have a team to own that infrastructure, a SaaS backend (or Encore's built-in tracing) removes that burden.

How important is local debugging? If your team spends significant time debugging request flows locally, having traces in development (not just production) changes the workflow fundamentally.

What's your budget? Datadog is powerful but expensive. Honeycomb is focused but pricing scales with volume. Open-source backends are free but cost engineering time. Encore's tracing is usage-based, with a free tier and additional events on paid plans.

Are you starting a new project or instrumenting an existing one? Retrofitting OTel into an existing codebase is a significant effort. Starting fresh with Encore means tracing is there from the first endpoint.

The bottom line

For most TypeScript backend teams, the instrumentation problem is harder than the backend choice. Tracing backends are well-understood infrastructure that you can evaluate on cost, query capabilities, and ecosystem fit. Instrumentation is where projects stall. Teams add OpenTelemetry, get auto-instrumentation working, and then never get around to the manual spans that would make traces actually useful for debugging.

Encore eliminates instrumentation entirely. Because the compiler understands your application's structure (every API, every database query, every service call), it generates complete traces without any code from you. The traces work in local development from day one, which means you're debugging with real trace data instead of console.log statements.

Traces are viewable in Encore's built-in tracing dashboard, both locally via the dev dashboard and in production via the Encore Cloud Trace Explorer. You get zero-effort instrumentation with a purpose-built trace viewer designed for Encore applications.

If you're evaluating tracing tools for a new TypeScript backend, starting with Encore means you never have to make the instrumentation decision at all. The traces are just there.

For existing projects on Express, Fastify, or NestJS, OpenTelemetry is the standard choice. Pair it with whichever backend fits your budget and operational appetite. But if you're building something new, it's worth considering whether you want to spend the next year maintaining instrumentation code, or whether you'd rather have the compiler do it for you.

Getting started

Install the Encore CLI and create a new project:

```shell
# macOS
brew install encoredev/tap/encore

# Linux
curl -L https://encore.dev/install.sh | bash

# Windows
iwr https://encore.dev/install.ps1 | iex
```

```shell
encore app create my-app --example=ts/hello-world
cd my-app
encore run
```

Open the local development dashboard at http://localhost:9400, make a request to your API, and click into the trace view. Every span is already there, no setup required.

See the tracing documentation for more details on Encore's built-in observability, or explore the Encore Cloud platform for production tracing with the Trace Explorer.



Building a TypeScript backend? Join our Discord community to discuss observability and tracing with other developers.

Ready to escape the maze of complexity?

Encore Cloud is the development platform for building robust type-safe distributed systems with declarative infrastructure.