deangalvin-cb
DRAFT WHILE IN PROGRESS

Summary

This adds a Google Cloud Storage (GCS) source. It is modeled primarily on the existing AWS S3 source, replacing SQS with Pub/Sub and S3 with GCS.
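
To illustrate the notification-driven flow described above, here is a minimal sketch of how one Pub/Sub notification could be mapped to an object to fetch. It is not code from this PR. The attribute keys `eventType`, `bucketId`, and `objectId`, and the `OBJECT_FINALIZE` event type, are the ones GCS attaches to its Pub/Sub notifications; `ObjectRef` and `object_from_notification` are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;

/// Hypothetical reference to an object that should be fetched from GCS.
#[derive(Debug, PartialEq, Eq)]
pub struct ObjectRef {
    pub bucket: String,
    pub name: String,
}

/// Maps the attributes of one Pub/Sub notification to an object to fetch.
///
/// Returns `None` for events other than `OBJECT_FINALIZE` (for example
/// deletions), or when an expected attribute is missing.
pub fn object_from_notification(attributes: &HashMap<String, String>) -> Option<ObjectRef> {
    // Only newly created (or overwritten) objects are of interest here.
    if attributes.get("eventType").map(String::as_str) != Some("OBJECT_FINALIZE") {
        return None;
    }

    Some(ObjectRef {
        bucket: attributes.get("bucketId")?.clone(),
        name: attributes.get("objectId")?.clone(),
    })
}
```

This plays the role that parsing an SQS message plays in the S3 source: the queue message only names the object, and the object body is fetched from the bucket in a separate step.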

Vector configuration

How did you test this PR?

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Closes: #7501

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

github-actions bot added the "domain: sources" label (Anything related to the Vector's sources) on Oct 2, 2025
Member

pront left a comment

counter!("gcp_cloud_storage_objects_filtered_total")
.increment(1);
}
}
Member

Nit: add newline

Comment on lines +202 to +203
// #[cfg(feature = "sources-gcp_cloud_storage")]
// pub(crate) use self::gcp_cloud_storage::*;
Member

Suggested change
- // #[cfg(feature = "sources-gcp_cloud_storage")]
- // pub(crate) use self::gcp_cloud_storage::*;

/// Automatically attempt to determine the compression scheme.
///
/// The compression scheme of the object is determined from its `Content-Encoding` and
/// `Content-Type` metadata, as well as the key suffix (for example, `.gz`).
Member

"as well as the key suffix"

Let's explain in a bit more detail which of these takes precedence, to avoid user confusion.
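
One way to make the precedence explicit is to encode it as a fallback chain and mirror that order in the doc comment. The sketch below assumes the order `Content-Encoding`, then `Content-Type`, then key suffix; that order is an assumption for illustration and is not confirmed by this PR. The `Compression` enum, the helper names, and the recognized values are likewise hypothetical and deliberately incomplete.

```rust
/// Hypothetical subset of compression schemes, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    Gzip,
    Zstd,
}

fn from_content_encoding(value: &str) -> Option<Compression> {
    match value {
        "gzip" => Some(Compression::Gzip),
        "zstd" => Some(Compression::Zstd),
        _ => None,
    }
}

fn from_content_type(value: &str) -> Option<Compression> {
    match value {
        "application/gzip" | "application/x-gzip" => Some(Compression::Gzip),
        "application/zstd" => Some(Compression::Zstd),
        _ => None,
    }
}

fn from_key_suffix(key: &str) -> Option<Compression> {
    if key.ends_with(".gz") {
        Some(Compression::Gzip)
    } else if key.ends_with(".zst") {
        Some(Compression::Zstd)
    } else {
        None
    }
}

/// Assumed precedence: `Content-Encoding` first, then `Content-Type`,
/// then the key suffix. Each later step is consulted only when the
/// earlier ones are absent or unrecognized.
pub fn detect_compression(
    content_encoding: Option<&str>,
    content_type: Option<&str>,
    key: &str,
) -> Option<Compression> {
    content_encoding
        .and_then(from_content_encoding)
        .or_else(|| content_type.and_then(from_content_type))
        .or_else(|| from_key_suffix(key))
}
```

With this shape, an object with `Content-Encoding: gzip` and a `.zst` key is treated as gzip, and an unrecognized `Content-Encoding` falls through to the next check rather than disabling detection. Whichever order the source ends up using, stating it in the option's doc comment in the same sequence would address the ambiguity.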

Comment on lines 113 to 121
#[configurable(derived)]
#[serde(default = "default_framing")]
#[derivative(Default(value = "default_framing()"))]
pub framing: FramingConfig,

#[configurable(derived)]
#[serde(default = "default_decoding")]
#[derivative(Default(value = "default_decoding()"))]
pub decoding: DeserializerConfig,
Member

Nit:

Suggested change
- #[configurable(derived)]
- #[serde(default = "default_framing")]
- #[derivative(Default(value = "default_framing()"))]
- pub framing: FramingConfig,
-
- #[configurable(derived)]
- #[serde(default = "default_decoding")]
- #[derivative(Default(value = "default_decoding()"))]
- pub decoding: DeserializerConfig,
+ #[serde(flatten)]
+ pub encoding: EncodingConfigWithFraming,

Author

Isn't this strictly for sinks?

@deangalvin-cb
Author

Just a note: I'll be on leave for a little while, but I will come back to this. I have a few of the suggestions implemented and am cleaning up the code!

Labels

domain: sources Anything related to the Vector's sources

Development

Successfully merging this pull request may close these issues.

Request: Add google cloud storage source