Conversation

@mrjoe7 mrjoe7 commented Jul 15, 2025

Motivation

We’re adding change.stream.lookup.full.document.before.change to ensure that we can access the full document prior to a delete operation in MongoDB change streams.

This is particularly important for our use case where:

  • Deleted documents must be preserved for downstream consumers.
  • By default, the delete change event does not include the document body; without pre-images enabled, we would lose all context about what was removed.
  • Having the pre-image ensures that we can reconstruct the full lifecycle of a document (create → update → delete) from the change stream alone.

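We’re making this a configurable option so the connector remains compatible with MongoDB deployments that do not have pre-images enabled (pre-images require MongoDB 6.0 or later with changeStreamPreAndPostImages enabled on the collection).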

@mrjoe7 mrjoe7 requested a review from a team as a code owner July 15, 2025 16:00
@mrjoe7 mrjoe7 requested review from stIncMale and removed request for a team July 15, 2025 16:00
@stIncMale stIncMale requested review from rozza and removed request for stIncMale August 27, 2025 20:36
Member

rozza commented Sep 1, 2025

Hi @mrjoe7,

Thanks for the PR - this looks like a great addition to the connector. I've added SPARK-449 to Jira so this can be triaged and the work planned.

I'm not sure we have any time planned this quarter for a release but will try to get this added asap.

Ross

@rozza rozza removed their request for review September 4, 2025 09:58
@rozza rozza requested review from Copilot and rozza December 1, 2025 09:39
Copilot finished reviewing on behalf of rozza December 1, 2025 09:41
Copilot AI left a comment

Pull request overview

This PR adds support for MongoDB change stream pre-images via the change.stream.lookup.full.document.before.change configuration option, enabling access to the full document state before delete, update, or replace operations. This is essential for preserving deleted document data for downstream consumers, and it only takes effect when the MongoDB deployment has pre-images enabled on the collection.

  • Added new configuration constant STREAM_LOOKUP_FULL_DOCUMENT_BEFORE_CHANGE_CONFIG with comprehensive documentation
  • Implemented getStreamFullDocumentBeforeChange() method to retrieve and validate the configuration value
  • Integrated fullDocumentBeforeChange() calls in both streaming partition readers
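
The validation step described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the connector's actual code: the class `PreImageConfigSketch`, the enum `BeforeChangeMode`, and the `parse` helper are hypothetical names, and exact (case-sensitive) matching is an assumption. Only the option key and the four accepted values (default, off, whenAvailable, required) come from this PR.

```java
import java.util.Map;

public class PreImageConfigSketch {

    /** Hypothetical stand-in for the pre-image modes listed in this PR's test description. */
    enum BeforeChangeMode {
        DEFAULT("default"),
        OFF("off"),
        WHEN_AVAILABLE("whenAvailable"),
        REQUIRED("required");

        final String value;

        BeforeChangeMode(String value) {
            this.value = value;
        }
    }

    /** Option key introduced by this PR. */
    static final String KEY = "change.stream.lookup.full.document.before.change";

    /**
     * Reads the option from a plain map and validates it.
     * Falls back to "default" when the key is absent (assumed behavior).
     */
    static BeforeChangeMode parse(Map<String, String> options) {
        String raw = options.getOrDefault(KEY, BeforeChangeMode.DEFAULT.value);
        for (BeforeChangeMode mode : BeforeChangeMode.values()) {
            if (mode.value.equals(raw)) {
                return mode;
            }
        }
        // Reject anything outside the four known values.
        throw new IllegalArgumentException("Invalid value for " + KEY + ": " + raw);
    }

    public static void main(String[] args) {
        System.out.println(parse(Map.of()));
        System.out.println(parse(Map.of(KEY, "whenAvailable")));
    }
}
```

In the real connector the parsed value would presumably be handed to the driver's fullDocumentBeforeChange() call on the change stream iterable, as the third bullet describes; that wiring is omitted here to keep the sketch free of external dependencies.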

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/main/java/com/mongodb/spark/sql/connector/config/ReadConfig.java | Adds configuration constant, default value, documentation, and getter method for full document before change support |
| src/main/java/com/mongodb/spark/sql/connector/read/MongoMicroBatchPartitionReader.java | Configures change stream iterable to use the full document before change setting |
| src/main/java/com/mongodb/spark/sql/connector/read/MongoContinuousPartitionReader.java | Configures change stream iterable to use the full document before change setting |
| src/test/java/com/mongodb/spark/sql/connector/config/MongoConfigTest.java | Adds comprehensive unit test covering all valid options (default, off, whenAvailable, required) and invalid input handling |

Fix javadoc issues
rozza added a commit to rozza/mongo-spark that referenced this pull request Dec 3, 2025
SPARK-449

Original PR: mongodb#140
---------
Co-authored-by: Ross Lawley <[email protected]>
Member

rozza commented Dec 3, 2025

Closed in favor of: #144 - which takes this PR as a base

@rozza rozza closed this Dec 3, 2025
3 participants