Group By Trace
Overview
The Group By Trace processor holds incoming spans in memory until the trace is "complete" (no new spans arrive within wait_duration), then emits all spans of the trace together as one batch. This is the prerequisite for tail sampling and any downstream processor that needs to see a whole trace at once (latency analysis, error-only export, span-count rules).
Without this processor, spans flow through one-by-one and tail-sampling decisions can't be made because the processor never sees the full trace.
Supported types: Traces
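
The grouping behavior described above can be illustrated with a small model. This is a minimal sketch, not the processor's actual implementation: the `TraceGrouper` class, its method names, and the explicit `now` argument are hypothetical and exist only to make the timing visible. It assumes a trace is released once no span has arrived for it within `wait_duration`, as the overview states.

```python
from collections import defaultdict


class TraceGrouper:
    """Toy model: buffer spans per trace ID and release idle traces."""

    def __init__(self, wait_duration: float):
        if wait_duration <= 0:
            raise ValueError("wait_duration must be > 0")
        self.wait_duration = wait_duration
        self._spans = defaultdict(list)  # trace_id -> buffered spans
        self._last_seen = {}             # trace_id -> arrival time of newest span

    def add_span(self, trace_id: str, span: str, now: float) -> None:
        """Buffer a span and restart the idle timer for its trace."""
        self._spans[trace_id].append(span)
        self._last_seen[trace_id] = now

    def release_completed(self, now: float) -> dict:
        """Return {trace_id: spans} for traces idle for at least wait_duration."""
        done = [t for t, seen in self._last_seen.items()
                if now - seen >= self.wait_duration]
        batches = {}
        for trace_id in done:
            batches[trace_id] = self._spans.pop(trace_id)
            del self._last_seen[trace_id]
        return batches


grouper = TraceGrouper(wait_duration=10.0)
grouper.add_span("trace-a", "root", now=0.0)
grouper.add_span("trace-a", "child", now=4.0)
grouper.add_span("trace-b", "root", now=9.0)
early = grouper.release_completed(now=12.0)  # trace-a idle 8s, trace-b idle 3s
later = grouper.release_completed(now=14.0)  # trace-a idle 10s, trace-b idle 5s
```

In this model `early` is empty because neither trace has been idle long enough, while `later` contains both spans of `trace-a` as a single batch. A downstream sampler would therefore see the whole trace at once.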
Configuration
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| wait_duration | duration | none | Yes | How long to wait after the last span of a trace arrives before considering the trace complete and releasing it. Must be > 0. Typical: 5s–30s. |
| num_traces | int | upstream default | No | Maximum number of traces held in memory at once. When exceeded, the oldest trace is evicted. Higher = better completeness, more memory. |
| num_workers | int | upstream default | No | Goroutine count releasing completed traces downstream. Tune up under high trace throughput. |
| discard_orphans | bool | false | No | When true, spans that arrive after their trace has already been released are dropped instead of being forwarded as a "late" trace. |
| store_on_disk | bool | false | No | When true, traces are buffered on disk instead of in memory. Increases capacity at the cost of latency and I/O. Requires the file_storage extension. |
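
To make the interaction between num_traces and discard_orphans concrete, here is a rough sketch of a bounded buffer. The `BoundedTraceBuffer` class and its methods are hypothetical, and two things are assumptions rather than documented behavior: "oldest" is taken to mean the trace that was first seen earliest, and the sketch only records which trace was evicted without modeling what happens to its spans afterwards. It also remembers released trace IDs forever, which a real implementation would not do.

```python
from collections import OrderedDict


class BoundedTraceBuffer:
    """Toy model of num_traces eviction and discard_orphans handling."""

    def __init__(self, num_traces: int, discard_orphans: bool = False):
        self.num_traces = num_traces
        self.discard_orphans = discard_orphans
        self._traces = OrderedDict()  # trace_id -> spans, oldest trace first
        self._released = set()        # trace IDs already sent downstream
        self.evicted = []             # trace IDs pushed out by the size limit

    def add_span(self, trace_id: str, span: str) -> bool:
        """Buffer a span; return False if it was dropped as an orphan."""
        if self.discard_orphans and trace_id in self._released:
            return False
        if trace_id not in self._traces:
            if len(self._traces) >= self.num_traces:
                # Buffer is full: evict the trace that was first seen earliest.
                oldest_id, _ = self._traces.popitem(last=False)
                self.evicted.append(oldest_id)
            self._traces[trace_id] = []
        self._traces[trace_id].append(span)
        return True

    def release(self, trace_id: str) -> list:
        """Hand a trace's spans downstream and remember that it was released."""
        self._released.add(trace_id)
        return self._traces.pop(trace_id, [])


strict = BoundedTraceBuffer(num_traces=2, discard_orphans=True)
for tid in ("t1", "t2", "t3"):
    strict.add_span(tid, "root")             # adding t3 evicts t1
first_batch = strict.release("t2")
late_kept = strict.add_span("t2", "late")    # orphan, dropped

lenient = BoundedTraceBuffer(num_traces=2)
lenient.add_span("t1", "root")
lenient.release("t1")
lenient.add_span("t1", "late")               # orphan, kept as a new entry
second_batch = lenient.release("t1")
```

With discard_orphans enabled the late span is rejected. With the default it is buffered again under the same trace ID and later released as a second, separate batch.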
Example Configuration
{
  "wait_duration": "10s",
  "num_traces": 100000,
  "num_workers": 8,
  "discard_orphans": true,
  "store_on_disk": false
}
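
Before deploying a configuration like this, it can help to check it against the constraints in the parameter table. The following is a sketch only: `parse_duration` and `validate` are hypothetical helpers, not part of any collector API, and the duration parser is an assumption that accepts just a single number followed by `ms`, `s`, `m`, or `h`.

```python
import json
import re

_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Parse a single-unit duration such as "10s" or "2m" into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit]


def validate(config: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    errors = []
    if "wait_duration" not in config:
        errors.append("wait_duration is required")
    else:
        try:
            if parse_duration(config["wait_duration"]) <= 0:
                errors.append("wait_duration must be > 0")
        except (TypeError, ValueError) as exc:
            errors.append(str(exc))
    for key in ("num_traces", "num_workers"):
        value = config.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            errors.append(f"{key} must be an int")
    for key in ("discard_orphans", "store_on_disk"):
        if key in config and not isinstance(config[key], bool):
            errors.append(f"{key} must be a bool")
    return errors


example = json.loads("""
{
  "wait_duration": "10s",
  "num_traces": 100000,
  "num_workers": 8,
  "discard_orphans": true,
  "store_on_disk": false
}
""")
problems = validate(example)
```

For the example configuration `problems` is an empty list. Omitting wait_duration or setting it to "0s" produces a single error message each.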
Notes
- Pipeline order: groupbytrace → tail_sampling → exporters. The grouping is what makes tail sampling possible.
- Latency cost: every emitted span is delayed by at least wait_duration after the final span arrives. Don't use this on a real-time alerting pipeline.
- Memory ceiling: the in-memory holding buffer is the dominant memory consumer in trace-heavy pipelines. Set num_traces based on your peak trace rate × wait_duration. Switch to store_on_disk if memory is constrained.
- Late spans: spans arriving after wait_duration create a second "trace" with the same trace ID. Set discard_orphans: true if you'd rather drop them than have downstream see two trace batches with the same ID.
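
The sizing rule in the memory note (peak trace rate × wait_duration) can be written as a short calculation. This is a sketch under stated assumptions: `estimate_num_traces` is a hypothetical helper, and the `headroom` multiplier is not from the processor's documentation. It is included because a trace occupies the buffer for its own duration as well as the wait period, so the bare product is likely a lower bound.

```python
import math


def estimate_num_traces(peak_traces_per_second: float,
                        wait_duration_seconds: float,
                        headroom: float = 1.5) -> int:
    """Estimate a num_traces value from peak trace rate and wait_duration.

    headroom is an assumed safety multiplier, not a documented setting.
    """
    in_flight = peak_traces_per_second * wait_duration_seconds
    return math.ceil(in_flight * headroom)


# 2,000 new traces per second held for 10 seconds is about 20,000 traces
# in flight; the assumed 1.5x headroom raises the suggestion to 30,000.
suggested = estimate_num_traces(2000, 10)
```

If the resulting value implies more memory than the host can spare, that is the signal to shorten wait_duration or consider store_on_disk.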