Group By Trace

Overview

The Group By Trace processor holds incoming spans in memory until the trace is "complete" (no new spans arrive within wait_duration), then emits all spans of the trace together as one batch. This is the prerequisite for tail sampling and any downstream processor that needs to see a whole trace at once (latency analysis, error-only export, span-count rules).

Without this processor, spans flow through the pipeline one by one, and a tail-sampling processor cannot make a decision because it never sees the full trace.
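The grouping behavior described above can be pictured with a small model. This is a minimal sketch, not the processor's actual implementation: the `TraceBuffer` class, its method names, and the explicit `now` timestamps are all hypothetical and exist only to illustrate the idle-timeout idea.

```python
class TraceBuffer:
    """Toy model of idle-timeout trace grouping (illustrative only)."""

    def __init__(self, wait_duration):
        self.wait_duration = wait_duration  # seconds of silence before release
        self.traces = {}                    # trace_id -> [last_seen, spans]

    def add(self, trace_id, span, now):
        """Buffer a span and reset the idle timer of its trace."""
        entry = self.traces.setdefault(trace_id, [now, []])
        entry[0] = now
        entry[1].append(span)

    def release_complete(self, now):
        """Return one batch per trace that has been idle for wait_duration."""
        done = [tid for tid, (last_seen, _) in self.traces.items()
                if now - last_seen >= self.wait_duration]
        return [self.traces.pop(tid)[1] for tid in done]


buf = TraceBuffer(wait_duration=10)
buf.add("t1", "span-a", now=0)
buf.add("t1", "span-b", now=4)
early = buf.release_complete(now=12)     # only 8s of silence: nothing released
complete = buf.release_complete(now=14)  # 10s of silence: whole trace released
buf.add("t1", "span-c", now=20)          # late span for an already released trace
late = buf.release_complete(now=30)      # emitted as a second batch for "t1"
```

In this model a late span simply starts a new buffer entry under the same trace ID, which is why downstream can see two batches for one trace unless late spans are dropped.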

Supported types: Traces

Configuration

Parameter	Type	Default	Required	Description
`wait_duration`	duration	—	Yes	How long to wait after the last span of a trace arrives before considering the trace complete and releasing it. Must be > 0. Typical: `5s`–`30s`.
`num_traces`	int	upstream default	No	Maximum number of traces held in memory at once. When exceeded, the oldest trace is evicted. Higher values improve trace completeness at the cost of more memory.
`num_workers`	int	upstream default	No	Number of goroutines that release completed traces downstream. Increase under high trace throughput.
`discard_orphans`	bool	`false`	No	When true, spans that arrive after their trace has already been released are dropped instead of being forwarded as a "late" trace.
`store_on_disk`	bool	`false`	No	When true, traces are buffered on disk instead of in memory. Increases capacity at the cost of latency and I/O. Requires the `file_storage` extension.

Example Configuration

{
  "wait_duration": "10s",
  "num_traces": 100000,
  "num_workers": 8,
  "discard_orphans": true,
  "store_on_disk": false
}

Notes

Pipeline order: groupbytrace → tail_sampling → exporters. The grouping is what makes tail sampling possible.
Latency cost: a trace is released only after wait_duration has passed with no new spans, so every span is delayed by at least wait_duration. Don't use this in a real-time alerting pipeline.
Memory ceiling: the in-memory holding buffer is the dominant memory consumer in trace-heavy pipelines. Set num_traces based on your peak trace rate × wait_duration. Switch to store_on_disk if memory is constrained.
Late spans: spans that arrive after their trace has already been released form a second batch with the same trace ID. Set discard_orphans: true if you'd rather drop them than have downstream see two trace batches with the same ID.
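The sizing rule in the memory note (peak trace rate × wait_duration) can be worked through as a quick calculation. This is a rough sketch under stated assumptions: the function names, the 1.5 headroom factor, and the example figures (2,000 traces per second, 20 spans per trace, 1 KiB per span) are hypothetical placeholders, not measured values or recommended defaults.

```python
import math


def estimate_num_traces(peak_traces_per_second, wait_duration_seconds, headroom=1.5):
    """Traces in flight at peak: rate x wait_duration, padded by an assumed headroom."""
    return math.ceil(peak_traces_per_second * wait_duration_seconds * headroom)


def estimate_buffer_bytes(num_traces, spans_per_trace, bytes_per_span):
    """Very rough buffer size; per-span size is an assumption you should measure."""
    return num_traces * spans_per_trace * bytes_per_span


# Hypothetical workload: 2000 new traces per second, wait_duration of 10s.
num_traces = estimate_num_traces(2000, 10)  # 2000 * 10 * 1.5 = 30000
buffer_bytes = estimate_buffer_bytes(num_traces, spans_per_trace=20, bytes_per_span=1024)
print(num_traces, buffer_bytes)
```

Because a trace stays buffered from its first span until wait_duration after its last, long-running traces push the real number above this estimate, so treat the result as a starting point and adjust it against observed memory use.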
