I am trying to use Grafana Tempo for distributed tracing.
I launch the stack with docker-compose:
version: "3.9"
services:

  # MY MICROSERVICES
  ...

  prometheus:
    image: prom/prometheus
    ports:
      - ${PROMETHEUS_EXTERNAL_PORT}:9090
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:cached

  promtail:
    image: grafana/promtail
    volumes:
      - ./log:/var/log
      - ./promtail/:/mnt/config
    command: -config.file=/mnt/config/promtail-config.yaml

  loki:
    image: grafana/loki
    command: -config.file=/etc/loki/local-config.yaml

  tempo:
    image: grafana/tempo
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - ./tempo/tempo-local.yaml:/etc/tempo.yaml
      # - ./tempo-data/:/tmp/tempo
    ports:
      - "14268"  # jaeger ingest
      - "3200"   # tempo
      - "55680"  # otlp grpc
      - "55681"  # otlp http
      - "9411"   # zipkin

  tempo-query:
    image: grafana/tempo-query
    command: [ "--grpc-storage-plugin.configuration-file=/etc/tempo-query.yaml" ]
    volumes:
      - ./tempo/tempo-query.yaml:/etc/tempo-query.yaml
    ports:
      - "16686:16686"  # jaeger-ui
    depends_on:
      - tempo

  grafana:
    image: grafana/grafana
    volumes:
      - ./grafana/datasource-config/:/etc/grafana/provisioning/datasources:cached
      - ./grafana/dashboards/prometheus.json:/var/lib/grafana/dashboards/prometheus.json:cached
      - ./grafana/dashboards/loki.json:/var/lib/grafana/dashboards/loki.json:cached
      - ./grafana/dashboards-config/:/etc/grafana/provisioning/dashboards:cached
    ports:
      - ${GRAFANA_EXTERNAL_PORT}:3000
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
      - GF_AUTH_DISABLE_LOGIN_FORM=true
    depends_on:
      - prometheus
      - loki
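(The compose file expects PROMETHEUS_EXTERNAL_PORT and GRAFANA_EXTERNAL_PORT to come from the environment, for example from an .env file next to docker-compose.yml; the values below are only placeholders I am assuming:)

PROMETHEUS_EXTERNAL_PORT=9090
GRAFANA_EXTERNAL_PORT=3000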
Tempo itself is configured with tempo-local.yaml:
server:
  http_listen_port: 3200

distributor:
  receivers:            # this configuration will listen on all ports and protocols that tempo is capable of.
    jaeger:             # the receivers all come from the OpenTelemetry collector. more configuration information can
      protocols:        # be found there: https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver
        thrift_http:    #
        grpc:           # for a production deployment you should only enable the receivers you need!
        thrift_binary:
        thrift_compact:
    zipkin:
    otlp:
      protocols:
        http:
        grpc:
    opencensus:

ingester:
  trace_idle_period: 10s        # the length of time after a trace has not received spans to consider it complete and flush it
  max_block_bytes: 1_000_000    # cut the head block when it hits this size or ...
  max_block_duration: 5m        # this much time passes

compactor:
  compaction:
    compaction_window: 1h           # blocks in this time window will be compacted together
    max_block_bytes: 100_000_000    # maximum size of compacted blocks
    block_retention: 1h
    compacted_block_retention: 10m

storage:
  trace:
    backend: local                       # backend configuration to use
    block:
      bloom_filter_false_positive: .05   # bloom filter false positive rate. lower values create larger filters but fewer false positives
      index_downsample_bytes: 1000       # number of bytes per index record
      encoding: zstd                     # block encoding/compression. options: none, gzip, lz4-64k, lz4-256k, lz4-1M, lz4, snappy, zstd, s2
    wal:
      path: /tmp/tempo/wal               # where to store the wal locally
      encoding: snappy                   # wal encoding/compression. options: none, gzip, lz4-64k, lz4-256k, lz4-1M, lz4, snappy, zstd, s2
    local:
      path: /tmp/tempo/blocks
    pool:
      max_workers: 100                   # worker pool determines the number of parallel requests to the object store backend
      queue_depth: 10000
tempo-query.yaml:

backend: "tempo:3200"
and datasource.yml for provisioning the datasources in Grafana:
apiVersion: 1

deleteDatasources:
  - name: Prometheus
  - name: Tempo
  - name: Loki

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    orgId: 1
    url: http://prometheus:9090
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
  - name: Tempo
    type: tempo
    access: proxy
    orgId: 1
    url: http://tempo-query:16686
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
    apiVersion: 1
    uid: tempo
  - name: Tempo-Multitenant
    type: tempo
    access: proxy
    orgId: 1
    url: http://tempo-query:16686
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
    apiVersion: 1
    uid: tempo-authed
    jsonData:
      httpHeaderName1: 'Authorization'
    secureJsonData:
      httpHeaderValue1: 'Bearer foo-bar-baz'
  - name: Loki
    type: loki
    access: proxy
    orgId: 1
    url: http://loki:3100
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
    apiVersion: 1
    jsonData:
      derivedFields:
        - datasourceUid: tempo
          matcherRegex: \[.+,(.+),.+\]
          name: TraceID
          url: $${__value.raw}
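As a side note, the derivedFields matcherRegex above assumes each log line embeds the trace id as the middle field of a bracketed, comma-separated triple. The sample line below is made up purely for illustration:

2021-05-04 10:15:00.123  INFO [my-service,f0c31bfa2a2b1e40,bf2f5a3e5c6d7e8f] Handling GET /api/orders

From a line like that, Grafana captures f0c31bfa2a2b1e40 as TraceID and, through datasourceUid: tempo, turns it into a link to the Tempo datasource.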
But if I test the datasource in Grafana, I get this error:
In the Loki view I can find the Tempo button for viewing traces... but I can't see the trace in Tempo because I get an error:
However, if I take the trace ID and search for it in Jaeger, I can see it correctly.
What am I missing in the Tempo configuration? How do I configure it correctly?
Grafana 7.5 and later can talk to Tempo natively and no longer needs the tempo-query proxy. I think this explains what is happening: Grafana is attempting to use the Tempo-native API against tempo-query, which exposes the Jaeger API instead. Try changing the Grafana datasource URL in datasource.yml to http://tempo:3200.
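For example, the Tempo entry from the datasource.yml in the question would become something like this (only the url changes, the rest is kept as you already have it):

  - name: Tempo
    type: tempo
    access: proxy
    orgId: 1
    url: http://tempo:3200   # Tempo's native HTTP API (http_listen_port in tempo-local.yaml), not the tempo-query/Jaeger port
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
    apiVersion: 1
    uid: tempo

The tempo-query container is then only needed if you still want the standalone Jaeger UI on port 16686; Grafana itself no longer goes through it.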