Splunk Search API
Overview
Splunk Search API is a pull-based source that runs Splunk searches against a customer's Splunk deployment via the REST search API (/services/search/jobs) and emits matching events as logs into a Praxis pipeline.
Use it when you need Praxis to drain events out of an existing Splunk indexer — for example, to migrate a Google/Chronicle Forwarder kind: splunk collector to Praxis without losing data, or to send a copy of Splunk-indexed events to a second SIEM.
Supported types: Logs
Supported platforms: Linux, Windows, macOS
Minimum collector version: 0.3.0
Two operating modes
| Mode | When to use | Required configuration |
|---|---|---|
| One-shot (backfill) | Migration / historical import. The receiver runs each search once between an explicit earliest_time and latest_time, paginates results, and then idles. | Set earliest_time and latest_time on each search; leave polling_interval at 0. |
| Continuous (polling) | Ongoing tail of a Splunk index. The receiver re-runs each search on a fixed cadence, advancing the time window each tick. | Set polling_interval >= 30 seconds; leave earliest_time / latest_time empty. |
The receiver enforces a minimum 30s polling interval to protect the customer search head from accidentally aggressive configurations.
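
The mode rules above can be sketched as a small validation function. This is a minimal illustration, not the receiver's actual validation code: `resolve_mode` and `MIN_POLLING_INTERVAL_S` are hypothetical names, and rejecting negative intervals is an assumption the text does not state.

```python
from typing import Optional

MIN_POLLING_INTERVAL_S = 30  # minimum polling interval stated in this document


def resolve_mode(polling_interval: int,
                 earliest_time: Optional[str] = None,
                 latest_time: Optional[str] = None) -> str:
    """Return "one-shot" or "continuous", or raise ValueError.

    Hypothetical helper that mirrors the mode table; the real receiver
    may word or order its checks differently.
    """
    has_both_bounds = bool(earliest_time) and bool(latest_time)
    has_any_bound = bool(earliest_time) or bool(latest_time)

    if polling_interval < 0:
        # Assumption: negative values are treated as invalid.
        raise ValueError("polling_interval must not be negative")
    if polling_interval == 0:
        # One-shot mode needs an explicit time window.
        if not has_both_bounds:
            raise ValueError("one-shot mode needs earliest_time and latest_time")
        return "one-shot"
    if polling_interval < MIN_POLLING_INTERVAL_S:
        # Values in (0, 30) are rejected to protect the search head.
        raise ValueError("polling_interval must be 0 or >= 30 seconds")
    if has_any_bound:
        raise ValueError("continuous mode requires empty earliest_time/latest_time")
    return "continuous"
```

Calling `resolve_mode(60)` selects continuous mode, while `resolve_mode(10)` raises because 10 falls inside the rejected range.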
Authentication
Two credential types are supported, mutually exclusive:
| Credential type | When to use |
|---|---|
| Splunk Token (splunktoken) | Splunk session tokens, HEC tokens, and Splunk Cloud Stack tokens. Sent as Authorization: Splunk <token> by the praxis-collector splunktokenauth extension. |
| Basic Auth (basicauth) | On-prem Splunk deployments still using HTTP Basic with splunk_username + splunk_password. |
For Bearer-prefixed tokens (rare in Splunk land), use the existing bearertokenauth credential instead — splunktoken always emits the Splunk scheme.
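
To make the two schemes concrete, the sketch below builds the Authorization header value for each credential type. It only illustrates the wire format: `authorization_header` is a hypothetical helper, not part of the splunktokenauth extension, and the Basic encoding follows the standard HTTP Basic scheme (base64 of `username:password`).

```python
import base64


def authorization_header(credential_type: str, *, token: str = "",
                         username: str = "", password: str = "") -> str:
    """Return the Authorization header value for a credential type.

    Hypothetical helper for illustration only.
    """
    if credential_type == "splunktoken":
        # The splunktoken credential always uses the "Splunk" scheme.
        return f"Splunk {token}"
    if credential_type == "basicauth":
        # Standard HTTP Basic: base64("username:password").
        raw = f"{username}:{password}".encode("utf-8")
        return "Basic " + base64.b64encode(raw).decode("ascii")
    raise ValueError(f"unsupported credential type: {credential_type}")
```

The two branches never combine, which matches the mutually exclusive credential types described above.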
Basic configuration
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| endpoint | string (URI) | Yes | none | Base URL of the Splunk REST API (search head or load balancer), e.g. https://splunk.example.com:8089. |
| searches | array | Yes | none | One or more SPL searches to run. At least one entry is required. |
| polling_interval | integer (seconds) | No | 0 | 0 = one-shot mode. >= 30 = continuous polling cadence. The receiver rejects values in (0, 30). |
| lookback_window | integer (seconds) | No | 60 | Continuous mode only. Each tick computes its window as [now - polling_interval - lookback_window, now], so events that Splunk indexes a few seconds late are still picked up. The receiver deduplicates on Splunk's _cd field, so a generous lookback is safe. |
| job_poll_interval | integer (seconds) | No | 5 | How often the receiver checks /services/search/v2/jobs/{sid} for completion. Independent of polling_interval. |
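
The window formula in the lookback_window row can be worked through in a few lines. This is a sketch of the arithmetic only; `tick_window` is a hypothetical name and the receiver's internal representation of time is not described in this document.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def tick_window(now: datetime, polling_interval: int,
                lookback_window: int = 60) -> Tuple[datetime, datetime]:
    """Return (start, end) for one continuous-mode tick.

    Implements [now - polling_interval - lookback_window, now].
    """
    start = now - timedelta(seconds=polling_interval + lookback_window)
    return start, now


# Two consecutive ticks with polling_interval=60 and lookback_window=120.
first_tick = datetime(2026, 4, 1, 12, 0, 0, tzinfo=timezone.utc)
second_tick = first_tick + timedelta(seconds=60)
first_start, first_end = tick_window(first_tick, 60, 120)
second_start, second_end = tick_window(second_tick, 60, 120)
# Consecutive windows overlap by exactly lookback_window seconds.
overlap_seconds = (first_end - second_start).total_seconds()
```

Because consecutive windows overlap by lookback_window seconds, the same event can be returned by more than one tick, which is why the deduplication on _cd matters.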
searches[] entries
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | none | Standalone search command (no pipes; the receiver rejects ` 
| earliest_time | string | Conditional | none | One-shot mode only. Window start in YYYY-MM-DDTHH:MM (UTC). Must be empty when polling_interval is set. |
| latest_time | string | Conditional | none | One-shot mode only. Window end. Must be empty when polling_interval is set. |
| limit | integer | No | 0 | Max events emitted per execution. 0 = unlimited. |
| event_batch_size | integer | No | 100 | Splunk results page size (1–50 000). |
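
The interaction between limit and event_batch_size can be sketched as a page planner. It assumes offset/count style paging against the results endpoint and that the last page is trimmed to honor limit; both are assumptions about the receiver, and `page_requests` is a hypothetical name.

```python
from typing import Iterator, Tuple


def page_requests(total_results: int, event_batch_size: int = 100,
                  limit: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (offset, count) pairs covering one search execution.

    Hypothetical helper: limit=0 means unlimited, per the table above.
    """
    if not 1 <= event_batch_size <= 50_000:
        raise ValueError("event_batch_size must be between 1 and 50000")
    # Cap the number of events at limit unless limit is 0 (unlimited).
    cap = total_results if limit == 0 else min(total_results, limit)
    offset = 0
    while offset < cap:
        count = min(event_batch_size, cap - offset)
        yield offset, count
        offset += count
```

With 250 results and a batch size of 100, the planner produces three pages; adding `limit=120` cuts that to two pages, the second holding only 20 events.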
advanced
| Parameter | Type | Default | Description |
|---|---|---|---|
| timeout | integer (seconds) | 30 | HTTP timeout for outbound requests to Splunk. |
| storage | string | none | Component ID of a storage extension (typically file_storage) used to checkpoint the rolling window across restarts. Without this, restarts re-ingest the configured window from scratch. |
| tls.insecure_skip_verify | bool | false | Disable TLS server certificate verification. Not recommended in production. |
| tls.ca_file | string | none | Custom CA bundle for verifying the Splunk server. |
Output mapping
Results from /services/search/v2/jobs/{sid}/results are converted to OTel log records as follows:
| Splunk field | OTel destination |
|---|---|
| _raw | LogRecord.Body (string) |
| _time | LogRecord.Timestamp |
| host | Resource.Attributes["host"] |
| source | Resource.Attributes["source"] |
| sourcetype | Resource.Attributes["sourcetype"] |
| index | Resource.Attributes["index"] |
| _cd | LogRecord.Attributes["splunk.cd"] (used internally for dedupe) |
The host, source, sourcetype, and index fields are lifted onto resource attributes so that downstream processors (Google SecOps standardization, transforms, etc.) can route by them.
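
The mapping table can be read as a small transformation over one Splunk result row. The sketch below uses plain dictionaries rather than the OTel SDK, and assumes _time arrives as an ISO 8601 string with a UTC offset; `to_log_record` and the output dictionary layout are hypothetical.

```python
from datetime import datetime

# Fields that move to resource attributes, per the mapping table.
RESOURCE_FIELDS = ("host", "source", "sourcetype", "index")


def to_log_record(result: dict) -> dict:
    """Convert one Splunk result row into a dict shaped like an OTel log.

    Hypothetical helper; assumes _time is ISO 8601, for example
    "2026-04-01T12:00:00.000+00:00".
    """
    resource_attrs = {k: result[k] for k in RESOURCE_FIELDS if k in result}
    attributes = {}
    if "_cd" in result:
        # Kept on the record so duplicates can be detected later.
        attributes["splunk.cd"] = result["_cd"]
    return {
        "resource": {"attributes": resource_attrs},
        "log_record": {
            "body": result.get("_raw", ""),
            "timestamp": datetime.fromisoformat(result["_time"]),
            "attributes": attributes,
        },
    }
```

Keeping the four routing fields on the resource rather than on each record means a downstream processor can group or filter by them without inspecting record bodies.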
Example — continuous polling
{
  "endpoint": "https://splunk.example.com:8089", // required
  "searches": [
    {
      "query": "search index=main sourcetype=access_combined",
      "event_batch_size": 500,
      "limit": 0
    }
  ],
  "polling_interval": 60, // 1-minute tick
  "lookback_window": 120, // 2-minute lookback to catch late events
  "job_poll_interval": 5,
  "advanced": {
    "timeout": 30,
    "storage": "file_storage",
    "tls": { "insecure_skip_verify": false }
  }
}
Example — one-shot backfill
{
  "endpoint": "https://splunk.example.com:8089",
  "searches": [
    {
      "query": "search index=security",
      "earliest_time": "2026-04-01T00:00",
      "latest_time": "2026-04-29T00:00",
      "event_batch_size": 1000
    }
  ],
  "polling_interval": 0, // one-shot
  "advanced": { "storage": "file_storage" }
}
Migrating from Google/Chronicle Forwarder
When the forwarder migration tool encounters a kind: splunk collector, it emits a splunk_search source with continuous polling (default polling_interval: 60). The translation:
| Forwarder field | splunk_search field |
|---|---|
| url | endpoint |
| queries[].query (or query_string) | searches[].query (normalized to search ..., pipes stripped, index=<index_name> injected if missing) |
| index_name | merged into searches[].query as index=<name> |
| poll_interval_sec (or polling_interval) | polling_interval (clamped to >= 30) |
| username + password | basicauth credential |
| auth / token | splunktoken credential |
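
The translation table can be sketched as follows. This is not the migration tool's code: the forwarder config is modeled as a flat dictionary, "pipes stripped" is interpreted as dropping everything from the first pipe onward, and credential handling is omitted. All three are assumptions, and `normalize_query` and `translate` are hypothetical names.

```python
import re


def normalize_query(query: str, index_name: str = "") -> str:
    """Normalize a forwarder query into the "search ..." form."""
    # Assumption: "pipes stripped" means keep only the part before the first pipe.
    q = query.split("|", 1)[0].strip()
    if q.lower() == "search":
        q = ""
    elif q.lower().startswith("search "):
        q = q[len("search "):].strip()
    # Inject index=<index_name> only when the query names no index itself.
    if index_name and not re.search(r"\bindex\s*=", q):
        q = f"index={index_name} {q}".strip()
    return f"search {q}".strip()


def translate(forwarder: dict) -> dict:
    """Map a forwarder 'kind: splunk' collector to a splunk_search config."""
    index_name = forwarder.get("index_name", "")
    if "queries" in forwarder:
        raw_queries = [entry["query"] for entry in forwarder["queries"]]
    else:
        raw_queries = [forwarder["query_string"]]
    interval = forwarder.get(
        "poll_interval_sec", forwarder.get("polling_interval", 60))
    return {
        "endpoint": forwarder["url"],
        "searches": [{"query": normalize_query(q, index_name)}
                     for q in raw_queries],
        # Clamp to the receiver's 30-second minimum.
        "polling_interval": max(30, interval),
    }
```

A forwarder entry with `poll_interval_sec: 10` therefore comes out with `polling_interval: 30`, and a query such as `sourcetype=access_combined | stats count` loses its `stats` stage, so it is worth reviewing translated queries before relying on them.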
Operational notes
- Search-head load. The receiver creates a search job per tick per configured search. Keep searches short and the queries narrow (index=, sourcetype=) to limit search-head cost.
- Cursor durability. Without advanced.storage, the rolling time window resets on each collector restart and the lookback window may double-emit events. Configure a file_storage extension for production deployments.
- Late events. lookback_window plus _cd dedupe handles late arrivals up to the lookback bound. Events arriving later than that are dropped by design (the receiver will not re-query historical windows once their tick has passed).
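
The dedupe behavior described in these notes can be illustrated with a bounded set of seen keys. The receiver's actual data structure is not documented here; `CdDeduper`, the (index, _cd) key, and the eviction policy are assumptions made for this sketch.

```python
from collections import OrderedDict


class CdDeduper:
    """Remember recently seen (index, _cd) pairs, oldest evicted first.

    Hypothetical helper; keying by index as well as _cd is an assumption.
    """

    def __init__(self, max_entries: int = 100_000) -> None:
        self._seen = OrderedDict()
        self._max_entries = max_entries

    def is_new(self, index: str, cd: str) -> bool:
        key = (index, cd)
        if key in self._seen:
            # Refresh recency so events still inside the lookback stay known.
            self._seen.move_to_end(key)
            return False
        self._seen[key] = None
        if len(self._seen) > self._max_entries:
            # Drop the least recently seen key to bound memory.
            self._seen.popitem(last=False)
        return True
```

A memory bound like this only works if it comfortably exceeds the number of events in one lookback_window; otherwise evicted keys would be emitted again on the next overlapping tick. State held only in memory is also lost on restart, which is the double-emit risk the cursor durability note describes.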
Metrics
In addition to the standard collector_source_* metrics that all sources expose, the search-API receiver emits its own request-level telemetry through the OTel HTTP client; HTTP errors surface in the collector's standard otelcol_exporter_* and receiver-level error counters.