@badmonster0
Member

No description provided.

@badmonster0 badmonster0 merged commit eb7fb54 into main Apr 5, 2025
1 check passed
* Data sources provide a **change stream**.
* Configured with a [refresh interval](flow_def#refresh-interval), which is generally applicable to all data sources.
* Specific data sources also provide their own change capture mechanisms.
See the documentation for specific data sources for details.
Member Author

Can there be an example?

* Data sources configured with a [refresh interval](flow_def#refresh-interval).
* Data sources provide a **change stream**.
* Configured with a [refresh interval](flow_def#refresh-interval), which is generally applicable to all data sources.
* Specific data sources also provide their specific change capture mechanisms.
Member Author

Do push events belong here as well?

Member Author

Different sources have different channels for pushing events, so that belongs with the specific data sources.

We can add more wording to clarify this once we start supporting a real push-event-based change capture mechanism.

For data sources ineligible for live updates, or when the `live_mode` is `False`,
the `FlowLiveUpdater` only performs a one-time update, i.e. similar to the one-time update (`update()` method) above,
under a unified interface.
Note that `cocoindex.FlowLiveUpdater` provides a unified interface for both one-time update and live update.
Member Author

Does it make sense to link?

Member Author

Which part are you referring to? `FlowLiveUpdater`, or one-time update / live update?

So when a source is configured with a change stream, it's still recommended to set a `refresh_interval`, with a larger value.
So for most changes can be covered by the change stream (with low latency), and remaining changes (files no longer exist or accessible) will still be covered (with a higher latency).
So when a `GoogleDrive` source enables `recent_changes_poll_interval`, it's still recommended to set a `refresh_interval`, with a larger value.
That way most changes are covered by polling recent changes (with low latency), and the remaining changes (files that no longer exist or are no longer accessible) are still covered (with a higher latency).
Member Author

I wonder if it makes sense to give an example, e.g., every 2 hours, depending on your requirements.
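
To make the latency trade-off in this hunk concrete, here is a minimal, self-contained sketch. It does not call the cocoindex API: `next_tick` and `detection_time` are hypothetical helpers, and the model assumes that polling recent changes reports only modifications, while deletions or lost access surface only at the next full refresh. The 10-second poll interval is an arbitrary illustrative value; the 2-hour refresh interval is the figure suggested above.

```python
import math
from datetime import timedelta


def next_tick(event_time: timedelta, interval: timedelta) -> timedelta:
    """Return the first multiple of `interval` at or after `event_time`."""
    return math.ceil(event_time / interval) * interval


def detection_time(
    event_time: timedelta,
    kind: str,
    poll_interval: timedelta,
    refresh_interval: timedelta,
) -> timedelta:
    """Model when a change is noticed (hypothetical, not cocoindex code).

    Assumption: "modify" events are seen by the next recent-changes poll
    or the next full refresh, whichever comes first; "delete" events are
    seen only by the next full refresh.
    """
    full_refresh = next_tick(event_time, refresh_interval)
    if kind == "delete":
        return full_refresh
    return min(next_tick(event_time, poll_interval), full_refresh)


# Illustrative settings: fast poll, slow full refresh.
poll = timedelta(seconds=10)
refresh = timedelta(hours=2)
event = timedelta(seconds=65)

for kind in ("modify", "delete"):
    seen_at = detection_time(event, kind, poll, refresh)
    print(f"{kind}: detected at {seen_at}, latency {seen_at - event}")
```

Under these assumptions a modification is picked up within one poll interval, while a deletion waits for the next full refresh, which is why the larger `refresh_interval` is still worth setting.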

Member Author

@badmonster0 badmonster0 left a comment

ty!

2 participants