Google Cloud Storage
Overview
The Google Cloud Storage source lets you ingest and process log files directly from GCS buckets.
Supported platforms
- Linux: Logs
- Windows: Logs
- macOS: Logs
Authentication
Google Cloud Storage requires authentication to access objects from a GCS bucket.
The credentials must be provided using the Credential Type field.
| Parameter | Type | Description |
|---|---|---|
| Credential Type | string | Specifies how Google Cloud credentials are provided to access the GCS bucket. |
| JSON | option | Use JSON to paste Google Cloud service account credentials. |
| File | option | Use a credentials file on disk (service account key file). |
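Before pasting credentials with the JSON option, it can help to confirm that the text really is a service account key. The following is a minimal sketch, not part of the product: the helper name `check_service_account_json` is hypothetical, and it only checks a few fields that standard Google Cloud service account key files contain.

```python
import json

# Fields found in a standard Google Cloud service account key file.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw: str) -> list[str]:
    """Return a list of problems found in a pasted service account key."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level value must be a JSON object"]
    problems = [
        f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not key.get(name)
    ]
    # Other credential kinds (for example user credentials) use a different type.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r}")
    return problems
```

An empty list means the basic shape looks right; it does not prove that the key is valid or that the account can read the bucket.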
Basic Configuration
These fields are required to establish a connection to your Google Cloud Storage bucket.
| Parameter | Type | Required | Description |
|---|---|---|---|
| project_id | string | Yes | The unique identifier of the Google Cloud Platform project that owns the storage bucket. |
| bucket_name | string | Yes | The name of the Cloud Storage bucket containing the log files. Do not include the gs:// prefix. |
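Because bucket_name must not carry the gs:// prefix, a small pre-processing step can clean up values copied from a console or a CLI. This is an illustrative sketch only; `normalize_bucket_name` is a hypothetical helper, and the source itself may simply reject such input.

```python
def normalize_bucket_name(value: str) -> str:
    """Strip a leading gs:// scheme and any object path after the bucket."""
    name = value.strip()
    if name.startswith("gs://"):
        name = name[len("gs://"):]
    # Keep only the bucket part if an object path was pasted along with it.
    name = name.split("/", 1)[0]
    if not name:
        raise ValueError("bucket_name must not be empty")
    return name
```

For example, `normalize_bucket_name("gs://my-logs/2025/app.json")` yields `my-logs`; any path that was stripped belongs in the prefix field instead.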
Data Selection
The Data Selection configuration controls how far back the source should look when retrieving log objects from the Google Cloud Storage bucket.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| value | integer | Yes | 1 | Number of time units to look back when scanning for log objects. The maximum allowed lookback window is 4 weeks. |
| unit | string | Yes | week | Time unit used for the lookback window. Supported units include hour, day, and week. |
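The lookback window can be pictured as a cutoff timestamp: objects older than the cutoff are ignored. The sketch below works under stated assumptions: hour, day, and week are treated as the only units, value must be at least 1, and `lookback_start` is a hypothetical name rather than a function of the source.

```python
from datetime import datetime, timedelta, timezone

# One entry per supported unit; assumed to be the complete set.
UNIT_TO_DELTA = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}
MAX_LOOKBACK = timedelta(weeks=4)  # documented maximum window


def lookback_start(value: int, unit: str, now: datetime) -> datetime:
    """Return the earliest timestamp covered by the lookback window."""
    if unit not in UNIT_TO_DELTA:
        raise ValueError(f"unsupported unit: {unit!r}")
    if value < 1:
        raise ValueError("value must be at least 1")
    window = value * UNIT_TO_DELTA[unit]
    if window > MAX_LOOKBACK:
        raise ValueError("lookback window exceeds the 4 week maximum")
    return now - window


if __name__ == "__main__":
    print(lookback_start(1, "week", datetime.now(timezone.utc)))
```

Note that the limit applies to the combined window, so 28 days and 672 hours are accepted while 29 days is not.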
Advanced Configuration
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prefix | string | No | — | A directory path prefix used to filter objects. If provided, the source will only look for files inside this specific folder within the bucket. |
| interval | duration | No | 10s | Interval at which the source polls the GCS bucket for new log files. |
| format | string | No | json | Specifies the format of the logs stored in the GCS objects. Supported values include json and txt. |
| delimiter | string | No | \n | Character sequence used to separate individual log entries within a file. |
| object_patterns | list(string) | No | — | A list of Glob patterns to strictly filter which files are processed. Example: 2025/, files/*.json. |
| encoding_format | string | No | utf-8 | The encoding of the files being read. Valid values are nop, utf-8, utf-8-raw, utf-16le, utf-16be, ascii, and big5. |
| max_objects_per_scan | integer | No | 10 | The maximum number of objects the source processes during a single poll interval. Raising this value clears backlogs faster but uses more CPU and network resources. |
| max_bytes_per_object | integer | No | 1048576 (1 MB) | The maximum amount of data read from a single file. This prevents the source from stalling on unexpectedly large files. |
| max_log_size | integer | No | 1048576 (1 MB) | The maximum size allowed for a single log entry. If a log line exceeds this size, it may be truncated or dropped depending on parsing rules. |
| initial_buffer | integer | No | 65536 (64 KB) | The size of the memory buffer allocated initially for scanning objects. Tuning this is generally only necessary for extremely high-throughput environments. |
| decompress | string | No | auto | Determines how compressed log files are handled. When set to auto, compression is detected automatically based on the file extension. Valid values are auto, gzip, tar_gz, and none. |
| attributes (Key-Value Map) | map[string]string | No | - | Maps fields from your Cloudflare/GCP logs (keys) to standard OpenTelemetry attribute names (values). |
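To make the interaction between prefix, object_patterns, max_objects_per_scan, delimiter, and max_log_size more concrete, here is a rough sketch of how one scan might select objects and split their contents. It is an approximation under assumptions, not the source's implementation: patterns are matched against the full object name with Python's `fnmatch` (whose `*` also matches `/`, unlike some glob dialects), oversized entries are truncated rather than dropped, and both function names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_objects(names, prefix="", object_patterns=(), max_objects_per_scan=10):
    """Pick the object names one scan would process, in listing order."""
    selected = []
    for name in names:
        if not name.startswith(prefix):
            continue
        # With no patterns configured, every object under the prefix qualifies.
        if object_patterns and not any(fnmatchcase(name, p) for p in object_patterns):
            continue
        selected.append(name)
        if len(selected) >= max_objects_per_scan:
            break  # remaining objects wait for the next poll interval
    return selected


def split_entries(data: bytes, delimiter: bytes = b"\n", max_log_size: int = 1048576):
    """Split object contents into log entries, truncating oversized ones."""
    entries = []
    for chunk in data.split(delimiter):
        if chunk:  # skip empty pieces, for example after a trailing delimiter
            entries.append(chunk[:max_log_size])
    return entries
```

With names `files/a.json`, `files/b.txt`, and `other/c.json` and the pattern `files/*.json`, only `files/a.json` is selected. Objects skipped because of max_objects_per_scan are not lost in this sketch; they are simply left for a later scan.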
Example Configuration
{
  "project_id": "", // required, default: none
  "bucket_name": "", // required, default: none
  "lookback": {
    "value": 1, // required inside lookback, default: 1
    "unit": "week" // required inside lookback, default: "week"
  },
  "prefix": "", // default: ""
  "interval": "10s", // default: "10s"
  "format": "json", // default: "json", allowed: "json" | "txt"
  "delimiter": "\n", // default: "\n"
  "object_patterns": [], // default: none
  "encoding_format": "utf-8", // default: "utf-8"
  "max_objects_per_scan": 10, // default: 10
  "max_bytes_per_object": 1048576, // default: 1048576
  "max_log_size": 1048576, // default: 1048576
  "initial_buffer": 65536, // default: 65536
  "decompress": "auto", // default: "auto"
  "attributes": { // default: none
    "source_field_name": "otel.attribute.name"
  }
}
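Since most fields have defaults, a minimal configuration only needs project_id and bucket_name. The sketch below shows one way such a partial configuration could be expanded with the documented defaults; `resolve_config` and the `DEFAULTS` table are illustrative names, and how the source actually merges defaults is an assumption here.

```python
import copy

# Defaults copied from the parameter tables in this document.
DEFAULTS = {
    "lookback": {"value": 1, "unit": "week"},
    "prefix": "",
    "interval": "10s",
    "format": "json",
    "delimiter": "\n",
    "object_patterns": [],
    "encoding_format": "utf-8",
    "max_objects_per_scan": 10,
    "max_bytes_per_object": 1048576,
    "max_log_size": 1048576,
    "initial_buffer": 65536,
    "decompress": "auto",
    "attributes": {},
}
REQUIRED = ("project_id", "bucket_name")


def resolve_config(user: dict) -> dict:
    """Validate required fields and fill every other field with its default."""
    missing = [key for key in REQUIRED if not user.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    unknown = sorted(set(user) - set(DEFAULTS) - set(REQUIRED))
    if unknown:
        raise ValueError("unknown fields: " + ", ".join(unknown))
    config = copy.deepcopy(DEFAULTS)  # never mutate the shared defaults
    for key, value in user.items():
        if key == "lookback":
            config["lookback"].update(value)  # merge so a lone value keeps unit
        else:
            config[key] = value
    return config
```

For instance, `resolve_config({"project_id": "p", "bucket_name": "b", "lookback": {"value": 2}})` keeps the default unit of week while overriding the value.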
Metrics Covered
| Metric Name | Description |
|---|---|
| collector_source_records_received_total | Total number of log records successfully read from Google Cloud Storage objects and forwarded to the processing pipeline (logs). |
| collector_source_bytes_received_total | Total number of bytes read from objects stored in the configured GCS bucket. |
| collector_source_records_dropped_total | Counts log records dropped during processing. Possible reasons include missing_file and downstream_error. |
| collector_source_parse_errors_total | Counts errors encountered while processing object contents. Possible reasons include corrupt_file. |
| collector_source_errors_total | Counts operational errors encountered by the source. Possible reasons include auth_failed, permission_denied, throttle, bucket_not_found, object_deleted, and io. |
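When troubleshooting, it is often useful to break an error counter down by reason. The sketch below assumes, without confirmation from this document, that the metrics are exposed in the Prometheus text exposition format and that the reason is carried in a label named `reason`; `totals_by_reason` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches a sample line such as: name{label="x"} 3
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
# Assumed label name; adjust if the collector uses a different one.
REASON_RE = re.compile(r'(?:^|,)\s*reason="([^"]*)"')


def totals_by_reason(text: str, metric: str) -> Counter:
    """Sum the samples of one counter, grouped by their reason label."""
    totals = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or match.group("name") != metric:
            continue
        reason = REASON_RE.search(match.group("labels") or "")
        totals[reason.group(1) if reason else ""] += float(match.group("value"))
    return totals
```

Samples without a reason label are grouped under the empty string, so unlabeled counters such as collector_source_bytes_received_total still produce a single total.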