Tail Sampling

Overview

The Tail Sampling processor decides whether to keep or drop a trace after all of its spans have been collected, allowing decisions based on the trace as a whole — total latency, error status, span count, attribute combinations. This is the right tool when you need policy-based sampling ("keep all errors", "keep slow traces", "rate-limit by service") rather than blind volume reduction.

The processor must be paired with the Group By Trace processor upstream — that's what holds spans in memory until the trace is complete.

Supported types: Traces

Required Pipeline Order

receivers → memory_limiter → groupbytrace → tail_sampling → batch → exporters

Configuration

| Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| decision_wait | duration | | Yes | How long to buffer a trace before deciding. Should match or exceed groupbytrace.wait_duration. |
| num_traces | uint64 | upstream default | No | Maximum number of traces held for decision. Eviction happens when exceeded. |
| expected_new_traces_per_sec | uint64 | upstream default | No | Hint to the processor for sizing internal data structures. |
| decision_cache_sampled_size | int | upstream default | No | Cache size for already-sampled trace IDs (so late-arriving spans of an already-decided trace inherit the decision). |
| decision_cache_not_sampled_size | int | upstream default | No | Cache size for already-rejected trace IDs. |
| policies | object[] | | Yes | At least one policy. The first policy that fires "sample" wins (OR semantics across policies). |

Policy Types

Each policy entry has name and type plus a type-specific config block.

| Policy type | Behavior | Config block |
| --- | --- | --- |
| always_sample | Keep every trace. | (none) |
| latency | Keep traces whose total latency is at least threshold_ms (and optionally below upper_threshold_ms). | latency |
| numeric_attribute | Keep traces with a numeric span attribute in [min_value, max_value]. | numeric_attribute |
| string_attribute | Keep traces with a string span attribute matching values (exact or regex). | string_attribute |
| boolean_attribute | Keep traces with a boolean span attribute matching value. | boolean_attribute |
| status_code | Keep traces whose root span (or any span, depending on the implementation) has one of status_codes (OK, ERROR, UNSET). | status_code |
| probabilistic | Keep sampling_percentage of traces. Like the probabilistic sampler, but applied at tail time. | probabilistic |
| rate_limiting | Keep at most spans_per_second total spans. | rate_limiting |
| span_count | Keep traces with span count in [min_spans, max_spans]. | span_count |
| trace_state | Keep traces whose tracestate has key matching one of values. | trace_state |
| ottl_condition | Keep traces whose spans match an OTTL boolean expression. | ottl_condition |
| and | Keep traces matching ALL and_sub_policy entries. | and |
| composite | Apply multiple sub-policies with rate allocation per policy. | composite |
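
To make the combination rules concrete, here is a minimal Python sketch of how top-level policies (OR) differ from an and policy (all sub-policies must match). It is an illustration only, not the processor's implementation: the Span type, has_error, slower_than, and_policy, and should_sample are hypothetical names, and trace latency is approximated by the longest span.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Span:
    duration_ms: float
    status: str  # "OK", "ERROR", or "UNSET"


Trace = List[Span]
Policy = Callable[[Trace], bool]


def has_error(trace: Trace) -> bool:
    # Hypothetical stand-in for a status_code policy with status_codes ["ERROR"].
    # Assumption: any span with ERROR status matches.
    return any(span.status == "ERROR" for span in trace)


def slower_than(threshold_ms: float) -> Policy:
    # Hypothetical stand-in for a latency policy.
    # Assumption: trace latency is approximated by the longest span.
    def check(trace: Trace) -> bool:
        longest = max((span.duration_ms for span in trace), default=0.0)
        return longest >= threshold_ms
    return check


def and_policy(*sub_policies: Policy) -> Policy:
    # Matches only when every sub-policy matches.
    return lambda trace: all(policy(trace) for policy in sub_policies)


def should_sample(trace: Trace, policies: List[Policy]) -> bool:
    # OR semantics: one matching policy is enough to keep the trace.
    return any(policy(trace) for policy in policies)


fast_error = [Span(20, "ERROR")]
slow_ok = [Span(1500, "OK")]
fast_ok = [Span(20, "OK")]

or_policies = [has_error, slower_than(1000)]
and_policies = [and_policy(has_error, slower_than(1000))]

print(should_sample(fast_error, or_policies))   # True: the error policy matches
print(should_sample(fast_ok, or_policies))      # False: no policy matches
print(should_sample(fast_error, and_policies))  # False: it is an error but not slow
```

Listing two policies separately keeps traces that are slow or failing; wrapping them in an and policy keeps only traces that are both.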

Common type-specific config

"latency": { "threshold_ms": 1000, "upper_threshold_ms": 0 },
"numeric_attribute": { "key": "http.status_code", "min_value": 500, "max_value": 599, "invert_match": false },
"string_attribute": { "key": "service.name", "values": ["billing"], "enabled_regex_matching": false, "cache_max_size": 0, "invert_match": false },
"boolean_attribute": { "key": "is_critical", "value": true },
"status_code": { "status_codes": ["ERROR"] },
"probabilistic": { "hash_salt": "salt", "sampling_percentage": 10.0 },
"rate_limiting": { "spans_per_second": 1000 },
"span_count": { "min_spans": 5, "max_spans": 100 },
"trace_state": { "key": "tenant", "values": ["paid"] },
"ottl_condition": { "error_mode": "ignore", "expression": ["attributes[\"http.status_code\"] >= 500"] },
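
The hash_salt field suggests that the probabilistic decision is derived from the trace ID rather than from a random draw, so the same trace always gets the same answer. The sketch below shows that idea in Python under loud assumptions: keep_trace is a hypothetical helper, and SHA-256 with a 10,000-bucket split is chosen only for illustration; the hash function the processor actually uses is not documented here.

```python
import hashlib


def keep_trace(trace_id: str, hash_salt: str, sampling_percentage: float) -> bool:
    # Assumption: the decision depends only on the salt and the trace ID,
    # so repeating the call for the same trace gives the same result.
    digest = hashlib.sha256((hash_salt + trace_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < sampling_percentage * 100


trace_ids = [f"{i:032x}" for i in range(10_000)]
kept = sum(keep_trace(tid, "salt", 10.0) for tid in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")  # roughly 10% are kept
```

Under this assumption, collectors that share a salt reach the same decision for a given trace ID, while changing the salt selects a different subset of traces.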

Example Configuration

{
  "decision_wait": "10s",
  "num_traces": 100000,
  "expected_new_traces_per_sec": 5000,

  "policies": [
    // Always keep error traces
    {
      "name": "errors",
      "type": "status_code",
      "status_code": { "status_codes": ["ERROR"] },
    },

    // Keep slow traces (latency >= 1s)
    {
      "name": "slow",
      "type": "latency",
      "latency": { "threshold_ms": 1000 },
    },

    // Keep 100% of traces from the billing service
    {
      "name": "billing",
      "type": "string_attribute",
      "string_attribute": {
        "key": "service.name",
        "values": ["billing", "billing-internal"],
      },
    },

    // Keep 5% of everything else (background sampling)
    {
      "name": "background",
      "type": "probabilistic",
      "probabilistic": { "sampling_percentage": 5.0 },
    },
  ],
}
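
As a rough sanity check on the numbers in this example, the Python sketch below estimates how many traces are buffered at once and how much memory the buffer could need. The function names are hypothetical, and the 20 KiB average trace size is an assumed figure for illustration; measure your own traffic before sizing.

```python
def in_flight_traces(new_traces_per_sec: int, decision_wait_s: float) -> int:
    # Each trace is held for decision_wait, so at a steady arrival rate
    # about rate * wait traces are buffered at any moment.
    return int(new_traces_per_sec * decision_wait_s)


def buffer_bytes(num_traces: int, avg_trace_bytes: int) -> int:
    # Worst case: the buffer is full of average-sized traces.
    return num_traces * avg_trace_bytes


expected = in_flight_traces(new_traces_per_sec=5000, decision_wait_s=10)
headroom = 100_000 / expected                    # num_traces from the example
worst_case = buffer_bytes(100_000, 20 * 1024)    # assumed 20 KiB per trace

print(expected)    # 50000
print(headroom)    # 2.0
print(worst_case)  # 2048000000 bytes, about 1.9 GiB
```

With these inputs, num_traces of 100000 leaves about twice the steady-state need, which gives room for bursts before eviction starts.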

Composite policy with rate allocation

{
  "decision_wait": "10s",
  "policies": [
    {
      "name": "composite-policy",
      "type": "composite",
      "composite": {
        "max_total_spans_per_second": 1000,
        "policy_order": ["errors", "slow", "background"],
        // Each sub-policy carries its own type-specific config block
        "composite_sub_policy": [
          { "name": "errors", "type": "status_code", "status_code": { "status_codes": ["ERROR"] } },
          { "name": "slow", "type": "latency", "latency": { "threshold_ms": 1000 } },
          { "name": "background", "type": "always_sample" },
        ],
        "rate_allocation": [
          { "policy": "errors", "percent": 50 },
          { "policy": "slow", "percent": 30 },
          { "policy": "background", "percent": 20 },
        ],
      },
    },
  ],
}
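
The rate allocation above can be read as a per-policy span budget. The Python sketch below works through that arithmetic, assuming each percent is a share of max_total_spans_per_second; span_budgets is a hypothetical helper, and how unused budget is handed to other sub-policies is not covered here.

```python
from typing import Dict, List


def span_budgets(max_total_spans_per_second: int,
                 rate_allocation: List[dict]) -> Dict[str, int]:
    # Assumption: each sub-policy may use up to its percentage
    # of the total spans-per-second budget.
    return {
        entry["policy"]: max_total_spans_per_second * entry["percent"] // 100
        for entry in rate_allocation
    }


budgets = span_budgets(1000, [
    {"policy": "errors", "percent": 50},
    {"policy": "slow", "percent": 30},
    {"policy": "background", "percent": 20},
])
print(budgets)  # {'errors': 500, 'slow': 300, 'background': 200}
```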

Notes

  • Pipeline order is mandatory. groupbytrace → tail_sampling. Without grouping, the processor receives one span at a time and can't make trace-level decisions.
  • Decision wait latency. Every trace is delayed by at least decision_wait. This is the cost of policy-based sampling; the probabilistic sampler avoids this delay but can't make trace-level decisions.
  • OR semantics. Policies are evaluated independently — if any one policy says "sample", the trace is kept. To require multiple conditions, use the and policy type.
  • Memory. The buffer holds roughly num_traces × average trace size, in bytes. Size num_traces for peak load, not steady state.
  • Late spans. Spans arriving after a trace's decision is finalized inherit the decision via the decision cache. Set decision_cache_sampled_size and decision_cache_not_sampled_size large enough to hold the trace IDs decided during the window in which late spans can still arrive.
  • Deduplication concern. If you run multiple tail-sampling collectors that see overlapping spans of the same trace (e.g. behind a load balancer), they may make different decisions. Pin trace IDs to specific collectors or run a single tail-sampler at the gateway.
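
One way to pin trace IDs to specific collectors, as the last note suggests, is to route every span by a hash of its trace ID. The Python sketch below illustrates that idea only: pick_collector and the collector addresses are hypothetical, and this is not a feature of the processor itself.

```python
import hashlib
from typing import List


def pick_collector(trace_id: str, collectors: List[str]) -> str:
    # Hash the trace ID so every span of one trace maps to the same collector.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(collectors)
    return collectors[index]


# Hypothetical collector addresses.
collectors = ["collector-0:4317", "collector-1:4317", "collector-2:4317"]
trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"

first = pick_collector(trace_id, collectors)
second = pick_collector(trace_id, collectors)
print(first == second)  # True: the same trace ID always picks the same collector
```

Plain modulo hashing remaps most trace IDs whenever the collector list changes; a consistent-hashing scheme reduces that churn if collectors are added or removed often.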