Google Cloud Storage

Schema-driven source documentation.

GOOGLE_CLOUD_STORAGE: 38 fields, 1 example
Commonly Asked Questions
Assistant knowledge mapped to this source type from assistant_knowledge.json.

Required
Fields required for a valid configuration payload under `config.required`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| required | object | Yes | | | no extra properties |
| required.bucket | string | Yes | Google Cloud Storage bucket name | | |
Masked
Sensitive fields under `config.masked` (secrets/credentials).
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| masked | object | No | Optional inline service account credentials JSON. Leave empty to use ADC/workload identity. | | no extra properties |
| masked.gcp_credentials_json | string | No | Google service account credentials JSON as inline string | | |
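
As a rough illustration of the two modes described above, the sketch below builds the `masked` section either empty (so ADC/workload identity is used) or with the credentials file contents inlined as a string. The helper name `build_masked` and its `credentials_path` parameter are hypothetical and not part of the product; only the field name `gcp_credentials_json` comes from the table.

```python
import json
from pathlib import Path
from typing import Optional


def build_masked(credentials_path: Optional[str] = None) -> dict:
    """Return the `masked` section of a config payload (hypothetical helper).

    With no path, the section stays empty so the source can fall back to
    ADC/workload identity, as the table above describes.
    """
    if credentials_path is None:
        return {}
    raw = Path(credentials_path).read_text(encoding="utf-8")
    json.loads(raw)  # raises ValueError if the file is not valid JSON
    # The field expects the JSON as an inline string, not a nested object.
    return {"gcp_credentials_json": raw}
```

Parsing the file before inlining it is a design choice made for this sketch: it catches a truncated or mis-pasted key file before the payload is submitted.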
Optional
Optional configuration fields under `config.optional`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| optional | object | No | | | no extra properties |
| optional.connection | object | No | | | no extra properties |
| optional.connection.gcp_credentials_file | string | No | Path to Google service account JSON credentials file | | |
| optional.connection.max_keys_per_page | integer | No | Maximum objects requested per list page | 200 | min 1, max 1000 |
| optional.connection.max_object_bytes | integer | No | Maximum bytes downloaded per object for MIME detection and text extraction | 5242880 | min 1024, max 52428800 |
| optional.connection.project_id | string | No | Optional GCP project ID override for auth context and bucket listing | | |
| optional.connection.request_timeout_seconds | number | No | Network timeout in seconds for list/download operations | 30 | min 1, max 300 |
| optional.scope | object | No | Object scope and filtering controls. | | no extra properties |
| optional.scope.exclude_extensions | array | No | Optional extension denylist | | |
| optional.scope.exclude_extensions[] | string | No | | | |
| optional.scope.include_content_preview | boolean | No | Download object bytes to infer MIME and extract detector-ready text previews | true | |
| optional.scope.include_empty_objects | boolean | No | Include zero-byte objects in extraction results | false | |
| optional.scope.include_extensions | array | No | Optional extension allowlist (for example, .pdf, .csv, .parquet) | | |
| optional.scope.include_extensions[] | string | No | | | |
| optional.scope.include_object_metadata | boolean | No | Attach provider metadata (etag, size, content-type hints, timestamps) to asset checksums | true | |
| optional.scope.prefix | string | No | Object key prefix filter (for example, exports/2026/) | | |
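
The numeric limits in the Constraints column can be checked before a payload is submitted. The following is a minimal sketch, not the product's own validator: the function name `check_connection` and the `CONNECTION_LIMITS` table are hypothetical, and only the field names, types, and min/max values are taken from the table above. It does not enforce the "no extra properties" rule.

```python
from typing import Any, Dict, List

# (accepted Python types, minimum, maximum), copied from the Type and
# Constraints columns of the table above.
CONNECTION_LIMITS = {
    "max_keys_per_page": (int, 1, 1000),
    "max_object_bytes": (int, 1024, 52428800),
    "request_timeout_seconds": ((int, float), 1, 300),
}


def check_connection(connection: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the values fit."""
    problems = []
    for name, (kind, low, high) in CONNECTION_LIMITS.items():
        if name not in connection:
            continue  # omitted fields are assumed to take the documented default
        value = connection[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, kind):
            problems.append(f"{name}: unexpected type {type(value).__name__}")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low}..{high}")
    return problems


print(check_connection({"max_keys_per_page": 5000}))
```

Returning a list of messages instead of raising on the first failure lets a caller report every out-of-range field in one pass.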
Examples
Reference payloads generated from shared source examples JSON.
GCS full bucket sweep
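Scan every object under the `exports/` prefix of the `prod-data-lake` bucket using Application Default Credentials (ADC). The `masked` object is left empty, so no inline credentials are supplied.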

Schedule

{
  "enabled": true,
  "preset": "nightly",
  "cron": "26 1 * * *",
  "timezone": "UTC"
}
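
Assuming the scheduler follows the standard five-field cron layout (minute, hour, day of month, month, day of week), the `cron` value above would mean 01:26 every day in the given timezone. The sketch below only splits the expression into named fields; `split_cron` is a hypothetical helper, not part of the product.

```python
# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Map each whitespace-separated cron field to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


schedule = {"enabled": True, "preset": "nightly", "cron": "26 1 * * *", "timezone": "UTC"}
fields = split_cron(schedule["cron"])
print(f"{int(fields['hour']):02d}:{int(fields['minute']):02d} {schedule['timezone']}")
# prints "01:26 UTC"
```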

Config Payload

{
  "type": "GOOGLE_CLOUD_STORAGE",
  "required": {
    "bucket": "prod-data-lake"
  },
  "masked": {},
  "optional": {
    "connection": {
      "project_id": "acme-prod"
    },
    "scope": {
      "prefix": "exports/"
    }
  },
  "sampling": {
    "strategy": "ALL"
  }
}
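
To make the effect of the `scope` block more concrete, here is a sketch of how the documented scope fields might narrow a listing. The function `in_scope` is hypothetical, and two behaviours are assumptions not stated in the schema: extensions are compared case-insensitively, and the denylist is applied after the allowlist.

```python
from typing import Optional, Sequence


def in_scope(
    key: str,
    size: int,
    prefix: str = "",
    include_extensions: Optional[Sequence[str]] = None,
    exclude_extensions: Optional[Sequence[str]] = None,
    include_empty_objects: bool = False,
) -> bool:
    """Decide whether one object (key and size in bytes) passes the scope filters."""
    if not key.startswith(prefix):
        return False
    if size == 0 and not include_empty_objects:
        return False  # matches the documented default of false
    lowered = key.lower()  # assumption: extension matching ignores case
    if include_extensions and not lowered.endswith(
        tuple(ext.lower() for ext in include_extensions)
    ):
        return False
    if exclude_extensions and lowered.endswith(
        tuple(ext.lower() for ext in exclude_extensions)
    ):
        return False
    return True


# With the payload above, only keys under "exports/" would be considered.
print(in_scope("exports/2026/orders.csv", 2048, prefix="exports/"))
print(in_scope("raw/orders.csv", 2048, prefix="exports/"))
```

The first call prints `True` and the second prints `False`, since `raw/orders.csv` does not start with the configured prefix.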