Google Cloud Storage

Overview

The Google Cloud Storage source ingests and processes log files directly from GCS buckets.

Supported platforms

  • Linux: Logs
  • Windows: Logs
  • macOS: Logs

Authentication

Google Cloud Storage requires authentication to access objects from a GCS bucket.

The credentials must be provided using the Credential Type field.

| Parameter | Type | Description |
| --- | --- | --- |
| Credential Type | string | Specifies how Google Cloud credentials are provided to access the GCS bucket. |
| JSON | option | Paste the Google Cloud service account credentials as JSON. |
| File | option | Use a credentials file on disk (service account key file). |

Basic Configuration

These fields are required to establish a connection to your Google Cloud Storage bucket.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier of the Google Cloud Platform project that owns the storage bucket. |
| bucket_name | string | Yes | The name of the Cloud Storage bucket containing the log files. Do not include the gs:// prefix. |

Data Selection

The Data Selection configuration controls how far back the source should look when retrieving log objects from the Google Cloud Storage bucket.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| value | integer | Yes | 1 | Number of time units to look back when scanning for log objects. The maximum allowed lookback window is 4 weeks. |
| unit | string | Yes | week | Time unit used for the lookback window. Supported units are hour, day, and week. |
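
The lookback rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the source's implementation: the function name `lookback_cutoff` is hypothetical, and raising an error for an out-of-range window is an assumption, since the documentation does not say how an oversized value is handled.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Units listed in the table above, expressed in hours.
UNIT_HOURS = {"hour": 1, "day": 24, "week": 24 * 7}

# Documented maximum lookback window.
MAX_LOOKBACK = timedelta(weeks=4)


def lookback_cutoff(value: int = 1, unit: str = "week",
                    now: Optional[datetime] = None) -> datetime:
    """Return the earliest timestamp covered by the lookback window.

    Hypothetical helper: rejecting invalid input with ValueError is an
    assumption made for this sketch.
    """
    if unit not in UNIT_HOURS:
        raise ValueError(f"unsupported unit: {unit!r}")
    if value < 1:
        raise ValueError("value must be a positive integer")
    window = timedelta(hours=value * UNIT_HOURS[unit])
    if window > MAX_LOOKBACK:
        raise ValueError("lookback window exceeds 4 weeks")
    current = now or datetime.now(timezone.utc)
    return current - window
```

With the defaults (value 1, unit week), objects older than seven days before the current time would fall outside the window.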

Advanced Configuration

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prefix | string | No | | A directory path prefix used to filter objects. If provided, the source only looks for files inside this folder within the bucket. |
| interval | duration | No | 10s | Interval at which the source polls the GCS bucket for new log files. |
| format | string | No | json | Format of the logs stored in the GCS objects. Supported values are json and txt. |
| delimiter | string | No | \n | Character sequence used to separate individual log entries within a file. |
| object_patterns | list(string) | No | | A list of glob patterns that restrict which files are processed. Example: 2025/, files/*.json. |
| encoding_format | string | No | utf-8 | The encoding of the files being read. Valid values are nop, utf-8, utf-8-raw, utf-16le, utf-16be, ascii, and big5. |
| max_objects_per_scan | integer | No | 10 | The maximum number of objects the source processes during a single poll interval. Increasing this clears backlogs faster but consumes more CPU and network resources. |
| max_bytes_per_object | integer | No | 1048576 (1 MB) | The maximum amount of data read from a single file. This prevents the source from stalling on unexpectedly large files. |
| max_log_size | integer | No | 1048576 (1 MB) | The maximum size allowed for a single log entry. If a log line exceeds this size, it may be truncated or dropped depending on parsing rules. |
| initial_buffer | integer | No | 65536 (64 KB) | The size of the memory buffer initially allocated for scanning objects. Tuning this is generally only necessary for extremely high-throughput environments. |
| decompress | string | No | auto | Determines how compressed log files are handled. When set to auto, compression is detected from the file extension. Valid values are gzip, auto, none, and tar_gz. |
| attributes | map[string]string | No | - | A key-value map from field names in your logs (keys) to standard OpenTelemetry attribute names (values). |
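
As a rough illustration of how the auto setting of decompress could resolve to a concrete mode, the sketch below picks a mode from the object name. The function name `resolve_decompress` is hypothetical, and the extension mapping (.tar.gz and .tgz to tar_gz, .gz to gzip) is an assumption; the documentation only states that detection is based on the file extension.

```python
VALID_MODES = {"auto", "gzip", "tar_gz", "none"}


def resolve_decompress(object_name: str, decompress: str = "auto") -> str:
    """Return the decompression mode to apply to one object.

    Hypothetical helper. The extension-to-mode mapping below is an
    assumption made for this sketch, not documented behavior.
    """
    if decompress not in VALID_MODES:
        raise ValueError(f"invalid decompress value: {decompress!r}")
    if decompress != "auto":
        # An explicit setting always wins over detection.
        return decompress
    name = object_name.lower()
    # Check the longer suffix first so .tar.gz is not mistaken for plain gzip.
    if name.endswith((".tar.gz", ".tgz")):
        return "tar_gz"
    if name.endswith(".gz"):
        return "gzip"
    return "none"
```

For example, under these assumptions `resolve_decompress("logs/app.log.gz")` would select gzip, while an object named `logs/app.json` would be read as-is.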

Example Configuration

The // comments annotate each field; remove them if your configuration is parsed as strict JSON.

{
  "project_id": "", // required, default: none
  "bucket_name": "", // required, default: none

  "lookback": {
    "value": 1, // required inside lookback, default: 1
    "unit": "week" // required inside lookback, default: "week"
  },

  "prefix": "", // default: ""
  "interval": "10s", // default: "10s"
  "format": "json", // default: "json", allowed: "json" | "txt"
  "delimiter": "\n", // default: "\n"
  "object_patterns": [], // default: none
  "encoding_format": "utf-8", // default: "utf-8"
  "max_objects_per_scan": 10, // default: 10
  "max_bytes_per_object": 1048576, // default: 1048576
  "max_log_size": 1048576, // default: 1048576
  "initial_buffer": 65536, // default: 65536
  "decompress": "auto", // default: "auto"

  "attributes": { // default: none
    "source_field_name": "otel.attribute.name"
  }
}
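
To check a configuration before deploying it, the documented defaults and allowed values can be applied in a small script. The following is a minimal sketch, not part of the product: `normalize_config` is a hypothetical helper, the defaults are copied from the tables and example in this page, and rejecting a gs:// prefix with an error is an assumption based on the bucket_name description.

```python
import copy

# Defaults copied from the Advanced Configuration table and the example.
DEFAULTS = {
    "prefix": "",
    "interval": "10s",
    "format": "json",
    "delimiter": "\n",
    "object_patterns": [],
    "encoding_format": "utf-8",
    "max_objects_per_scan": 10,
    "max_bytes_per_object": 1048576,
    "max_log_size": 1048576,
    "initial_buffer": 65536,
    "decompress": "auto",
    "attributes": {},
}

# Allowed values as listed in the parameter descriptions.
ALLOWED = {
    "format": {"json", "txt"},
    "decompress": {"gzip", "auto", "none", "tar_gz"},
    "encoding_format": {"nop", "utf-8", "utf-8-raw", "utf-16le",
                        "utf-16be", "ascii", "big5"},
}


def normalize_config(user: dict) -> dict:
    """Validate required fields and fill in documented defaults.

    Hypothetical helper for illustration only.
    """
    for key in ("project_id", "bucket_name"):
        if not user.get(key):
            raise ValueError(f"{key} is required")
    if user["bucket_name"].startswith("gs://"):
        raise ValueError("bucket_name must not include the gs:// prefix")

    # Deep-copy so the shared default list and dict are never mutated.
    config = {**copy.deepcopy(DEFAULTS), **user}
    config["lookback"] = {"value": 1, "unit": "week",
                          **user.get("lookback", {})}

    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    return config
```

Calling `normalize_config({"project_id": "my-project", "bucket_name": "my-logs"})` (placeholder names) would return a full configuration with every optional field set to its documented default.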

Metrics Covered

| Metric Name | Description |
| --- | --- |
| collector_source_records_received_total | Total number of log records successfully read from Google Cloud Storage objects and forwarded to the processing pipeline (logs). |
| collector_source_bytes_received_total | Total number of bytes read from objects stored in the configured GCS bucket. |
| collector_source_records_dropped_total | Counts log records dropped during processing. Possible reasons include missing_file and downstream_error. |
| collector_source_parse_errors_total | Counts errors encountered while processing object contents. Possible reasons include corrupt_file. |
| collector_source_errors_total | Counts operational errors encountered by the source. Possible reasons include auth_failed, permission_denied, throttle, bucket_not_found, object_deleted, and io. |