@nicktelford
Contributor

These are the new interfaces detailed in KIP-1035: "StateStore managed
changelog offsets".

This PR introduces the interface changes, but otherwise makes no
consequential behavioural changes.

Outside of `StateStore.java`, _all_ changes are essentially just a
rename of all invocations of `StateStore#flush` to instead call
`StateStore#commit`.

The `changelogOffsets` argument passed in these invocations is currently
unused: the behaviour of `StateStore#commit` remains identical to
`StateStore#flush` before these changes.

A new implementation of `StateStore#commit` that actually uses these
offsets, along with changes to the call sites (in `ProcessorStateManager`
and `GlobalStateManager`), will come in a later PR.
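
To illustrate the shape of the change, here is a minimal, self-contained
sketch of a `commit` method that accepts changelog offsets but, for now,
ignores them and behaves like `flush`. `SketchStateStore`,
`CountingStore` and `CommitSketch` are hypothetical names, and the
offsets map is keyed by a plain `String` rather than a real topic
partition type, so treat this as an approximation rather than the code
in this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a state store interface.
// It is NOT the real Kafka Streams StateStore.
interface SketchStateStore {
    // Legacy method: persist any buffered writes.
    void flush();

    // New method: accepts the changelog offsets, but in this sketch they are
    // unused and the call behaves exactly like flush().
    default void commit(Map<String, Long> changelogOffsets) {
        flush();
    }
}

// Toy store that only counts how often its writes are persisted.
class CountingStore implements SketchStateStore {
    int flushCount = 0;

    @Override
    public void flush() {
        flushCount++;
    }
}

class CommitSketch {
    public static void main(String[] args) {
        CountingStore store = new CountingStore();

        // Assumed partition name and offset, for illustration only.
        Map<String, Long> offsets = new HashMap<>();
        offsets.put("changelog-topic-0", 42L);

        // The caller now invokes commit(offsets) where it used to invoke flush().
        store.commit(offsets);
        System.out.println(store.flushCount); // prints 1
    }
}
```

Under this arrangement a call site can switch from `flush()` to
`commit(offsets)` without any observable difference, which is the
property described above.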

Many strings, including documentation, and some variable names, have
also been renamed (from "flush" to "commit"), to maintain consistency
with the method they relate to.

One exception is the `flush-rate` metric, which has not been renamed,
because it will instead be deprecated in favour of a new `commit-rate`
metric, which will be introduced in another PR.

---

The only change in behaviour is as follows: calling `StateStore#flush`
from within a `Processor` is now a guaranteed no-op.

In the future, this will throw an `UnsupportedOperationException`, but
to ensure no changes to end-user experience, we currently make it a
no-op.

Previously, any call to `StateStore#flush` from a `Processor` would have
made no difference to program semantics, but would likely have
introduced performance problems for RocksDB, because each invocation
forced a flush of RocksDB memtables to disk. Used naively, that could
happen on _every_ `Record`.

Consequently, making this a no-op should not make a difference for
end-users, except potentially improving performance if they were
incorrectly calling this method.
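
As a rough illustration of why the no-op is harmless, the sketch below
wraps a store so that `flush` calls made by a processor do nothing,
while the runtime still commits the underlying store directly.
`MiniStore`, `DiskBackedStore`, `ProcessorFacingStore` and
`NoOpFlushSketch` are hypothetical names, and the wrapper is only one
possible way to get this behaviour, not necessarily how this PR
implements it.

```java
import java.util.Map;

// Hypothetical, simplified store interface used only for this sketch.
interface MiniStore {
    void flush();
    void commit(Map<String, Long> changelogOffsets);
}

// Toy store where every flush or commit counts as one expensive disk write.
class DiskBackedStore implements MiniStore {
    int diskWrites = 0;

    @Override
    public void flush() {
        diskWrites++;
    }

    @Override
    public void commit(Map<String, Long> changelogOffsets) {
        diskWrites++;
    }
}

// View of the store handed to processors: flush() is a guaranteed no-op.
class ProcessorFacingStore implements MiniStore {
    private final MiniStore inner;

    ProcessorFacingStore(MiniStore inner) {
        this.inner = inner;
    }

    @Override
    public void flush() {
        // Intentionally empty: processors cannot force a disk write.
    }

    @Override
    public void commit(Map<String, Long> changelogOffsets) {
        inner.commit(changelogOffsets);
    }
}

class NoOpFlushSketch {
    public static void main(String[] args) {
        DiskBackedStore inner = new DiskBackedStore();
        MiniStore seenByProcessor = new ProcessorFacingStore(inner);

        // A processor that naively flushes once per record.
        for (int record = 0; record < 1000; record++) {
            seenByProcessor.flush();
        }
        System.out.println(inner.diskWrites); // prints 0

        // The runtime commits the underlying store once.
        inner.commit(Map.of("changelog-topic-0", 1000L));
        System.out.println(inner.diskWrites); // prints 1
    }
}
```

In this sketch the thousand per-record `flush` calls cost nothing, and
the single runtime commit is still what persists the data.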

@github-actions bot added the `triage` and `streams` labels on Nov 21, 2025