
Splunk Search API

Overview

Splunk Search API is a pull-based source that runs Splunk searches against a customer's Splunk deployment via the REST search API (/services/search/jobs) and emits matching events as logs into a Praxis pipeline.

Use it when you need Praxis to drain events out of an existing Splunk indexer — for example, to migrate a Google/Chronicle Forwarder kind: splunk collector to Praxis without losing data, or to send a copy of Splunk-indexed events to a second SIEM.

Supported types: Logs

Supported platforms: Linux, Windows, macOS

Minimum collector version: 0.3.0

Two operating modes

| Mode | When to use | Required configuration |
| --- | --- | --- |
| One-shot (backfill) | Migration / historical import. The receiver runs each search once between an explicit earliest_time and latest_time, paginates results, and then idles. | Set earliest_time and latest_time on each search; leave polling_interval at 0. |
| Continuous (polling) | Ongoing tail of a Splunk index. The receiver re-runs each search on a fixed cadence, advancing the time window each tick. | Set polling_interval >= 30 seconds; leave earliest_time / latest_time empty. |

The receiver enforces a minimum 30s polling interval to protect the customer search head from accidentally aggressive configurations.
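
The mode rules above can be sketched as a small validation function. This is an illustrative sketch only, not the receiver's actual code: the function name resolve_mode and the exact error messages are hypothetical, and it assumes the receiver checks the time bounds per search in the way the table describes.

```python
from typing import Optional

MIN_POLLING_INTERVAL = 30  # seconds; the minimum stated in this section


def resolve_mode(polling_interval: int,
                 earliest_time: Optional[str],
                 latest_time: Optional[str]) -> str:
    """Return "one-shot" or "continuous", or raise ValueError for an invalid mix.

    Hypothetical helper: it mirrors the documented rules, not the receiver's source.
    """
    if polling_interval < 0:
        raise ValueError("polling_interval must not be negative")
    if polling_interval == 0:
        # One-shot mode: both bounds of the window must be explicit.
        if not earliest_time or not latest_time:
            raise ValueError("one-shot mode needs earliest_time and latest_time")
        return "one-shot"
    if polling_interval < MIN_POLLING_INTERVAL:
        # Values strictly between 0 and 30 are rejected.
        raise ValueError("polling_interval must be 0 or >= 30 seconds")
    if earliest_time or latest_time:
        raise ValueError("continuous mode requires empty earliest_time / latest_time")
    return "continuous"
```

For example, resolve_mode(60, None, None) selects continuous mode, while resolve_mode(10, None, None) raises an error because 10 falls below the 30-second minimum.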

Authentication

Two credential types are supported, mutually exclusive:

| Credential type | When to use |
| --- | --- |
| Splunk Token (splunktoken) | Splunk session tokens, HEC tokens, and Splunk Cloud Stack tokens. Sent as Authorization: Splunk <token> by the praxis-collector splunktokenauth extension. |
| Basic Auth (basicauth) | On-prem Splunk deployments still using HTTP Basic with splunk_username + splunk_password. |

For Bearer-prefixed tokens (rare in Splunk land), use the existing bearertokenauth credential instead — splunktoken always emits the Splunk scheme.
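
To make the difference between the two credential types concrete, the following sketch builds the Authorization header each one would produce. The helper names are hypothetical and the sample credentials are placeholders; the Splunk scheme comes from the table above, and the Basic encoding is the standard HTTP Basic scheme.

```python
import base64


def splunk_token_header(token: str) -> dict:
    """Header for the splunktoken credential: always the "Splunk" scheme."""
    return {"Authorization": f"Splunk {token}"}


def basic_auth_header(username: str, password: str) -> dict:
    """Header for the basicauth credential: standard HTTP Basic encoding."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder values for illustration only.
print(splunk_token_header("example-token"))
print(basic_auth_header("admin", "changeme"))
```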

Basic configuration

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| endpoint | string (URI) | Yes | none | Base URL of the Splunk REST API (search head or load balancer), e.g. https://splunk.example.com:8089. |
| searches | array | Yes | none | One or more SPL searches to run. At least one entry is required. |
| polling_interval | integer (seconds) | No | 0 | 0 = one-shot mode. >= 30 = continuous polling cadence. The receiver rejects values in (0, 30). |
| lookback_window | integer (seconds) | No | 60 | Continuous mode only. Each tick computes its window as [now - polling_interval - lookback_window, now] so events whose _time arrives a few seconds late are still picked up. The receiver dedupes via Splunk _cd so a generous lookback is safe. |
| job_poll_interval | integer (seconds) | No | 5 | How often the receiver checks /services/search/v2/jobs/{sid} for completion. Independent of polling_interval. |
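
The window formula given for lookback_window can be worked through in a few lines. This is a minimal sketch of the arithmetic only; the function name tick_window is hypothetical and the receiver's internal representation of time is not described in this document.

```python
from datetime import datetime, timedelta, timezone


def tick_window(now: datetime, polling_interval: int, lookback_window: int = 60):
    """Return (earliest, latest) for one tick.

    Implements [now - polling_interval - lookback_window, now] as stated above.
    """
    earliest = now - timedelta(seconds=polling_interval + lookback_window)
    return earliest, now


now = datetime(2026, 4, 29, 12, 0, 0, tzinfo=timezone.utc)
earliest, latest = tick_window(now, polling_interval=60, lookback_window=120)
print(earliest.isoformat(), latest.isoformat())
```

With polling_interval = 60 and lookback_window = 120, each tick queries the last 180 seconds, so consecutive windows overlap by 120 seconds. That overlap is what the _cd dedupe has to absorb.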

searches[] entries

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | none | Standalone search command (no pipes; the receiver rejects ` |
| earliest_time | string | Conditional | none | One-shot mode only. Window start in YYYY-MM-DDTHH:MM (UTC). Must be empty when polling_interval is set. |
| latest_time | string | Conditional | none | One-shot mode only. Window end. Must be empty when polling_interval is set. |
| limit | integer | No | 0 | Max events emitted per execution. 0 = unlimited. |
| event_batch_size | integer | No | 100 | Splunk results page size (1–50 000). |
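
As an illustration of how limit and event_batch_size interact, the sketch below plans the (offset, count) pairs needed to page through a finished search. It is an assumption about how the paging could work, not a description of the receiver's implementation; the function name page_requests and the total_results parameter are hypothetical.

```python
from typing import Iterator, Tuple


def page_requests(total_results: int,
                  event_batch_size: int = 100,
                  limit: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (offset, count) pairs covering the results of one search execution.

    limit = 0 means unlimited, matching the table above.
    """
    if not 1 <= event_batch_size <= 50_000:
        raise ValueError("event_batch_size must be between 1 and 50000")
    wanted = total_results if limit == 0 else min(total_results, limit)
    offset = 0
    while offset < wanted:
        # The last page may be shorter than event_batch_size.
        count = min(event_batch_size, wanted - offset)
        yield offset, count
        offset += count


for offset, count in page_requests(total_results=1200, event_batch_size=500):
    print(offset, count)
```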

advanced

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| timeout | integer (seconds) | 30 | HTTP timeout for outbound requests to Splunk. |
| storage | string | none | Component ID of a storage extension (typically file_storage) used to checkpoint the rolling window across restarts. Without this, restarts re-ingest the configured window from scratch. |
| tls.insecure_skip_verify | bool | false | Disable TLS server certificate verification. Not recommended in production. |
| tls.ca_file | string | none | Custom CA bundle for verifying the Splunk server. |

Output mapping

Results from /services/search/v2/jobs/{sid}/results are converted to OTel log records as follows:

| Splunk field | OTel destination |
| --- | --- |
| _raw | LogRecord.Body (string) |
| _time | LogRecord.Timestamp |
| host | Resource.Attributes["host"] |
| source | Resource.Attributes["source"] |
| sourcetype | Resource.Attributes["sourcetype"] |
| index | Resource.Attributes["index"] |
| _cd | LogRecord.Attributes["splunk.cd"] (used internally for dedupe) |

The host, source, sourcetype, and index fields are lifted onto resource attributes so that downstream processors (Google SecOps standardization, transforms, etc.) can route by them.
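
The mapping can be pictured with a plain-dictionary sketch. This does not use the OTel SDK; the function name to_log_record and the shape of the returned dictionary are hypothetical stand-ins for the Resource and LogRecord structures, and the sample event is invented for illustration.

```python
RESOURCE_FIELDS = ("host", "source", "sourcetype", "index")


def to_log_record(result: dict) -> dict:
    """Map one Splunk result row to a simplified resource + log record layout."""
    return {
        # host/source/sourcetype/index go on the resource.
        "resource": {"attributes": {k: result[k] for k in RESOURCE_FIELDS if k in result}},
        "body": result.get("_raw", ""),        # LogRecord.Body
        "timestamp": result.get("_time"),      # LogRecord.Timestamp
        # _cd is kept as a log attribute for dedupe.
        "attributes": {"splunk.cd": result["_cd"]} if "_cd" in result else {},
    }


sample = {
    "_raw": "GET /index.html 200",
    "_time": "2026-04-29T12:00:00.000+00:00",
    "host": "web-01",
    "source": "/var/log/access.log",
    "sourcetype": "access_combined",
    "index": "main",
    "_cd": "12:3456",
}
print(to_log_record(sample))
```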

Example — continuous polling

{
  "endpoint": "https://splunk.example.com:8089", // required
  "searches": [
    {
      "query": "search index=main sourcetype=access_combined",
      "event_batch_size": 500,
      "limit": 0
    }
  ],
  "polling_interval": 60, // 1-minute tick
  "lookback_window": 120, // 2-minute lookback to catch late events
  "job_poll_interval": 5,
  "advanced": {
    "timeout": 30,
    "storage": "file_storage",
    "tls": { "insecure_skip_verify": false }
  }
}

Example — one-shot backfill

{
  "endpoint": "https://splunk.example.com:8089",
  "searches": [
    {
      "query": "search index=security",
      "earliest_time": "2026-04-01T00:00",
      "latest_time": "2026-04-29T00:00",
      "event_batch_size": 1000
    }
  ],
  "polling_interval": 0, // one-shot
  "advanced": { "storage": "file_storage" }
}

Migrating from Google/Chronicle Forwarder

When the forwarder migration tool encounters a kind: splunk collector, it emits a splunk_search source with continuous polling (default polling_interval: 60). The translation:

| Forwarder field | splunk_search field |
| --- | --- |
| url | endpoint |
| queries[].query (or query_string) | searches[].query (normalized to search ..., pipes stripped, index=<index_name> injected if missing) |
| index_name | merged into searches[].query as index=<name> |
| poll_interval_sec (or polling_interval) | polling_interval (clamped to >= 30) |
| username + password | basicauth credential |
| auth / token | splunktoken credential |
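
The query and interval rows of the translation can be sketched as follows. This is not the migration tool's code. In particular, it assumes that "pipes stripped" means keeping only the text before the first pipe and that the injected index= term is placed right after the search keyword; both are guesses, and the function names are hypothetical.

```python
import re


def normalize_query(query: str, index_name: str = "") -> str:
    """Normalize a forwarder query into the "search ..." form.

    Assumption: everything from the first pipe onward is discarded.
    """
    base = query.split("|", 1)[0].strip()
    # Drop a leading "search" keyword so it can be re-added uniformly.
    if base.lower().startswith("search "):
        base = base[len("search "):].strip()
    elif base.lower() == "search":
        base = ""
    # Inject index=<index_name> only when the query has no index term yet.
    if index_name and not re.search(r"\bindex\s*=", base):
        base = f"index={index_name} {base}".strip()
    return f"search {base}".strip()


def clamp_polling_interval(seconds: int) -> int:
    """Raise short forwarder intervals to the receiver's 30-second minimum."""
    return max(30, seconds)


print(normalize_query("sourcetype=syslog | stats count", index_name="main"))
print(clamp_polling_interval(10))
```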

Operational notes

  • Search-head load. The receiver creates a search job per tick per configured search. Keep searches short and the queries narrow (index=, sourcetype=) to limit search-head cost.
  • Cursor durability. Without advanced.storage, the rolling time window resets on each collector restart and the lookback window may double-emit events. Configure a file_storage extension for production deployments.
  • Late events. lookback_window plus _cd dedupe handles late arrivals up to the lookback bound. Events arriving later than that are never collected, because the receiver does not re-query a historical window once its tick has passed.
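
The interaction between overlapping windows and _cd dedupe mentioned in these notes can be illustrated with a small sketch. It assumes an in-memory set of already-seen _cd values; how the receiver actually stores, bounds, or checkpoints that state is not described here, and the function name dedupe_by_cd is hypothetical.

```python
from typing import Iterable, List, Set


def dedupe_by_cd(results: Iterable[dict], seen: Set[str]) -> List[dict]:
    """Return only results whose _cd has not been seen, updating `seen` in place."""
    fresh = []
    for result in results:
        cd = result.get("_cd")
        if cd is not None and cd in seen:
            continue  # already emitted by an earlier, overlapping tick
        if cd is not None:
            seen.add(cd)
        fresh.append(result)
    return fresh


seen: Set[str] = set()
tick_1 = [{"_cd": "7:100", "_raw": "a"}, {"_cd": "7:101", "_raw": "b"}]
tick_2 = [{"_cd": "7:101", "_raw": "b"}, {"_cd": "7:102", "_raw": "c"}]  # overlaps tick_1
print(len(dedupe_by_cd(tick_1, seen)), len(dedupe_by_cd(tick_2, seen)))
```

If the seen set lives only in memory, a restart empties it, which is consistent with the note above that events may be double-emitted when advanced.storage is not configured.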

Metrics

In addition to the standard collector_source_* metrics that all sources expose, the search-API receiver emits its own request-level telemetry through the OTel HTTP client; HTTP errors surface in the collector's standard otelcol_exporter_* and receiver-level error counters.