Jaeger (CNCF graduated) and Zipkin are the two leading open-source distributed tracing systems. Both collect, store, and visualize traces across microservices. Jaeger is more feature-rich. Zipkin is simpler to run.
Architecture
| Component | Jaeger | Zipkin |
|---|---|---|
| Origin | Uber (2015), CNCF graduated | Twitter (2012) |
| Language | Go | Java |
| Deployment | All-in-one or distributed | All-in-one or distributed |
| Storage | Cassandra, Elasticsearch, Badger, ClickHouse (Kafka as an intermediate buffer) | Cassandra, Elasticsearch, MySQL, in-memory |
| Protocol | OpenTelemetry (OTLP), Jaeger, Zipkin | Zipkin (OpenTelemetry via the Collector's Zipkin exporter) |
| Sampling | Adaptive, rate-limiting, remote | Rate-limiting |
| UI | React (feature-rich) | React (simpler) |
| Service dependencies | DAG view | Dependency graph |
Installation
Jaeger (all-in-one)
# Docker (16686 = UI, 4317 = OTLP gRPC, 4318 = OTLP HTTP)
# Comments cannot follow a trailing backslash, so the port notes live up here.
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/jaeger:latest
# Kubernetes (Helm)
helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm install jaeger jaegertracing/jaeger \
  --set storage.type=elasticsearch \
  --set elasticsearch.host=elasticsearch:9200
Zipkin
# Docker
docker run -d --name zipkin \
  -p 9411:9411 \
  openzipkin/zipkin:latest
# Kubernetes
kubectl create deployment zipkin --image=openzipkin/zipkin
kubectl expose deployment zipkin --port=9411
Zipkin runs as a single container exposing a single port (9411). Jaeger has more components in a distributed deployment but also offers an all-in-one mode.
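To check that a fresh Zipkin install is accepting data, one option is to post a hand-built span to its v2 HTTP API. The following is a minimal sketch using only the Python standard library. It assumes Zipkin is reachable at `localhost:9411` as in the Docker command above; the service name, span name, and `smoke.test` tag are hypothetical.

```python
import json
import secrets
import time
import urllib.request

# Assumed local endpoint (port published by the Docker command); adjust as needed.
ZIPKIN_URL = "http://localhost:9411/api/v2/spans"


def build_span(service: str, name: str, duration_us: int = 1500) -> dict:
    """Build one span in Zipkin's v2 JSON format (IDs are lower-hex strings)."""
    return {
        "traceId": secrets.token_hex(16),  # 128-bit trace ID, 32 hex chars
        "id": secrets.token_hex(8),        # 64-bit span ID, 16 hex chars
        "name": name,
        "kind": "SERVER",
        "timestamp": int(time.time() * 1_000_000),  # epoch microseconds
        "duration": duration_us,                    # microseconds
        "localEndpoint": {"serviceName": service},
        "tags": {"smoke.test": "true"},             # hypothetical tag
    }


def post_spans(spans: list, url: str = ZIPKIN_URL) -> int:
    """POST a JSON array of spans and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(spans).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `post_spans([build_span("smoke-test", "get /health")])` against a running instance should return 202, after which the span should be searchable in the Zipkin UI under the `smoke-test` service.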
OpenTelemetry integration
Both work with OpenTelemetry, but Jaeger is the more natural fit:
# OpenTelemetry Collector config (exporters section)
exporters:
  # Export to Jaeger over OTLP gRPC
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  # Export to Zipkin
  zipkin:
    endpoint: http://zipkin:9411/api/v2/spans
Jaeger natively speaks OTLP (OpenTelemetry's protocol). Zipkin does not ingest OTLP out of the box, so the Collector translates spans through its Zipkin exporter. For new deployments using the OpenTelemetry SDK, Jaeger is the path of least resistance.
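To make "natively speaks OTLP" concrete, the sketch below builds a minimal OTLP/HTTP JSON export request and sends it straight to Jaeger with no Collector in between. It assumes the all-in-one container's OTLP HTTP port (4318) is reachable on `localhost`; the scope name is hypothetical. In real services the OpenTelemetry SDK would produce this payload for you, so treat this as an illustration of the wire format rather than a recommended integration.

```python
import json
import secrets
import time
import urllib.request

# Assumed endpoint: OTLP HTTP port of a local Jaeger, standard OTLP traces path.
OTLP_TRACES_URL = "http://localhost:4318/v1/traces"


def build_otlp_payload(service: str, span_name: str, duration_ms: int = 5) -> dict:
    """Build a minimal OTLP/HTTP JSON trace export request with one span."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-smoke-test"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # hex-encoded, 32 chars
                    "spanId": secrets.token_hex(8),    # hex-encoded, 16 chars
                    "name": span_name,
                    "kind": 2,  # SPAN_KIND_SERVER
                    # OTLP JSON carries 64-bit integers as decimal strings.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def export_traces(payload: dict, url: str = OTLP_TRACES_URL) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

A call such as `export_traces(build_otlp_payload("checkout", "GET /cart"))` should return a 2xx status from a running Jaeger. Sending the same payload to Zipkin's port would not work, which is the gap the Collector's Zipkin exporter fills.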
Sampling
Jaeger adaptive sampling
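# Jaeger supports remote sampling: SDKs fetch per-service strategies from the collector.
# With adaptive sampling enabled, the collector recalculates rates per service/endpoint.
# The strategies file below sets the same kind of rates statically.
{
  "service_strategies": [
    {
      "service": "payment-service",
      "type": "probabilistic",
      "param": 0.5
    },
    {
      "service": "health-check",
      "type": "ratelimiting",
      "param": 2
    }
  ],
  "default_strategy": {
    "type": "probabilistic",
    "param": 0.1
  }
}
Jaeger's adaptive sampling adjusts rates based on traffic volume: high-traffic endpoints get a lower sampling probability and low-traffic endpoints a higher one, so rarely called operations still produce traces. Because the decision is made when a trace starts, it cannot favor traces that later turn out to contain errors. Zipkin supports only static sampling, configured in the instrumentation.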
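The sketch below illustrates two ideas from this section: how a per-service strategy is resolved from a strategies file like the one above, and how an adaptive scheme might nudge a probability toward a target rate. The `adjust_probability` helper and its parameters are hypothetical and deliberately simplified; it is not Jaeger's actual algorithm.

```python
import json

# Same structure as the strategies file shown above.
STRATEGIES_JSON = """
{
  "service_strategies": [
    {"service": "payment-service", "type": "probabilistic", "param": 0.5},
    {"service": "health-check", "type": "ratelimiting", "param": 2}
  ],
  "default_strategy": {"type": "probabilistic", "param": 0.1}
}
"""


def strategy_for(service: str, config: dict) -> dict:
    """Return the service-specific strategy, falling back to the default."""
    for strategy in config.get("service_strategies", []):
        if strategy["service"] == service:
            return {"type": strategy["type"], "param": strategy["param"]}
    return dict(config["default_strategy"])


def adjust_probability(current: float, observed_tps: float, target_tps: float,
                       floor: float = 0.00001) -> float:
    """Scale a sampling probability toward a target rate of sampled traces/second.

    observed_tps is the rate of sampled traces seen at the current probability.
    Simplified illustration of the adaptive idea only.
    """
    if observed_tps <= 0:
        return 1.0  # nothing sampled yet: sample everything until traffic shows up
    scaled = current * target_tps / observed_tps
    return max(floor, min(1.0, scaled))


config = json.loads(STRATEGIES_JSON)
payment = strategy_for("payment-service", config)
fallback = strategy_for("search-service", config)  # not listed, uses the default
```

With a target of 1 sampled trace per second, a busy endpoint producing 100 sampled traces per second at probability 0.5 would be scaled down sharply, while a quiet endpoint producing 0.5 per second at probability 0.1 would be scaled up. That is the "busy endpoints sampled less, quiet endpoints sampled more" behavior described above.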
Feature comparison
| Feature | Jaeger | Zipkin |
|---|---|---|
| Adaptive sampling | Yes | No |
| Service performance monitoring | Yes (SPM) | Basic |
| Trace comparison | Yes (diff two traces) | No |
| Kafka buffering | Native (Jaeger Ingester) | Via collector |
| ClickHouse storage | Yes (popular for scale) | Community plugin |
| Multi-tenancy | Supported | Not built-in |
| Trace quality metrics | Yes | No |
| OpenTelemetry native | Yes (OTLP) | Via exporter |
Decision guide
Choose Jaeger when:
- OpenTelemetry is your instrumentation standard
- You need adaptive sampling for high-traffic services
- Trace comparison and SPM features matter
- You are deploying on Kubernetes (CNCF ecosystem)
- You need Kafka buffering for high-volume trace ingestion
- You want ClickHouse as cost-effective trace storage
Choose Zipkin when:
- You value simplicity: single container, minimal configuration
- Your team already uses Zipkin instrumentation (B3 propagation)
- You work in the Java ecosystem: Spring Cloud Sleuth (succeeded by Micrometer Tracing in Spring Boot 3) defaults to Zipkin
- You want the fastest path to traces (5-minute setup)
- You run at small to medium scale, where adaptive sampling is not needed
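For teams weighing the B3 point above, it may help to see what B3 propagation looks like on the wire. This is a minimal sketch of the multi-header and single-header forms. The function names are hypothetical, and the parser skips the bare sampling-only form (`b3: 0`) for brevity.

```python
from typing import Dict, Optional


def b3_multi_headers(trace_id: str, span_id: str, sampled: bool,
                     parent_span_id: Optional[str] = None) -> Dict[str, str]:
    """Build the multi-header B3 form (one HTTP header per field)."""
    headers = {
        "X-B3-TraceId": trace_id,
        "X-B3-SpanId": span_id,
        "X-B3-Sampled": "1" if sampled else "0",
    }
    if parent_span_id:
        headers["X-B3-ParentSpanId"] = parent_span_id
    return headers


def b3_single_header(trace_id: str, span_id: str, sampled: bool,
                     parent_span_id: Optional[str] = None) -> str:
    """Build the single `b3` header value: traceid-spanid-sampled[-parentid]."""
    parts = [trace_id, span_id, "1" if sampled else "0"]
    if parent_span_id:
        parts.append(parent_span_id)
    return "-".join(parts)


def parse_b3_single(value: str) -> dict:
    """Parse a `b3` header value that carries at least trace and span IDs."""
    parts = value.split("-")
    if not 2 <= len(parts) <= 4:
        raise ValueError(f"unsupported b3 value: {value!r}")
    state = parts[2] if len(parts) > 2 else None
    return {
        "trace_id": parts[0],
        "span_id": parts[1],
        # "1" accepts, "0" denies, "d" is debug (treated as sampled here);
        # an absent state defers the decision to the receiver.
        "sampled": None if state is None else state in ("1", "d"),
        "parent_span_id": parts[3] if len(parts) > 3 else None,
    }
```

If existing services already emit and read these headers, staying on Zipkin (or configuring B3 propagators in OpenTelemetry) avoids breaking trace continuity during a migration.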
Also consider
- Grafana Tempo: cloud-native, object storage backend (such as S3), integrates with Grafana
- Loki for logs + Tempo for traces = unified Grafana observability