Skip to content

fix: missing KVS close method (#18)#20

Open
brainiac-five wants to merge 1 commit intomasterfrom
fix/missing-kvs-close
Open

fix: missing KVS close method (#18)#20
brainiac-five wants to merge 1 commit intomasterfrom
fix/missing-kvs-close

Conversation

@brainiac-five
Copy link
Collaborator

@brainiac-five brainiac-five commented Dec 31, 2025

Adding the missing Close() function to KVS.
fixes #18

Most changes are to the Index class that receives a flag and a check for wether the index instance -- i.e. its mutex loop -- is already closed. This avoids hanging requests to index that would occur whenever an index whose mutex loop does not run anymore (not) receives a lock request.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Member

@zelig zelig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Radical changes needed, see comments
you should always run test with the --race flag. That shall reveal all I am saying in the review commenst

write chan elements.Node // hands out current root for writes and locks
root chan elements.Node // channel for new roots
quit chan struct{} // closing this channel signals quit
closed bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so this is not needed, in fact checking this field creates a race condition if Close is called concurrently with any other function that checks it.

write: make(chan elements.Node),
root: make(chan elements.Node),
quit: make(chan struct{}),
closed: false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assignment is unnecessary since the zero value of boolean is false

for {
select {
case <-quit:
idx.closed = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a write write race with close

func (idx *Index) Update(ctx context.Context, k []byte, e *elements.Entry) error {
var root elements.Node

if idx.closed {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is read/write race on the boolean closed

if idx.closed {
return errors.New("trie closed")
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it is even worse cos if Close is called when we are here concurrently then the following select will hang on line 114 until the context cancels :))


// Iterate wraps the underlying pot's iterator
func (idx *Index) Iterate(ctx context.Context, p, k []byte, f func(elements.Entry) (stop bool, err error)) error {
if idx.closed {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same


// Close quits the process loop and closes the mode
func (idx *Index) Close() error {
if idx.closed {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you are protecting here from a concurrent call to Close then it is not gonna work and you can still get a close on closed channel error


// Size returns the size (number of entries) of the pot
func (idx *Index) Size() int {
func (idx *Index) Size() (int, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so here you dont need to return an error at all and this is where you should see that the pattern is wrong. So there is no reason one cannot call Size on a closed pot


// String pretty prints the current state of the pot
func (idx *Index) String() string {
func (idx *Index) String() (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as Size, Save : no reason why not


// Save calls the mode specific save method for the root node
func (idx *Index) Save(ctx context.Context) ([]byte, error) {
if idx.closed {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like Size, no reason not to allow Save on a closed

@brainiac-five
Copy link
Collaborator Author

I separate out the core intent, adding kvs.Close(), into a new PR, #21, which has nothing to do with the good points you raise. It only adds code rather than alter it.

The closed flag as introduced in this PR improves the current situation where most calls after Close() hang. These tests all fail with the current master:

	t.Run("add item to closed index", func(t *testing.T) {
		idx.Close()
		idx.Add(ctx, want)
	})
	t.Run("find item in closed index", func(t *testing.T) {
		idx.Add(ctx, want)
		idx.Close()
		idx.Find(ctx, want.Key())
	})
	t.Run("double close index", func(t *testing.T) {
		idx.Close()
		idx.Close()
	})

The first two hang, the third panics.

The full test is checked in on branch fail/closed-index, index_test.go.

The closed flag suffices to most of the time inform a developer rather than just hang (or panic). Worst case, in the rare case of a race, it behaves as bad as the current situation (hang or panic). So it is only improving but the point can be made that it can be unhelpful as it projects deceptive safety.

For the purposes of using this in WASM, there is no problem at all, because WASM is single-threaded. A WASM setup -- like POT JS -- also has a more dire need for this protection as a hanging WASM object effectively 'deletes methods' as well as taking out state. The WASM process going down should be avoided where possible. Yet worse, the garbage collection hooks of Javascript (FinalizationRegistry) are said to be unreliable; but kvs.Close() is called automatically triggered by the Javascript garbage collector --

https://github.com/brainiac-five/potjs/blob/f9b57daab12563093be5ea32f6788d768e93a379/potjs.go#L276

-- so that a developer might not even be at fault for a Close() comming too early, or otherwise unexpectedly, or doubly. All such special circumstances would be addressed by the closed flag (because WASM is single-threaded). On the other hand, POT JS does not expose the kvs.Close() method but only uses it to automate part of its garbage collection service. That takes out the developer as possible error source for wrong or doubly uses of Close().

Relevantly, while not perfect, this proposal was less intrusive; a proper solution seems to mean wrapping everything in mutexes. That degree of change looked inappropriate. But that should be implemented then, to better protect POT JS users.

The --race flag of go test reacts to the fact that closed is not an atomic.Boolean. Which is to counter optimization effects more remote than a race between the check of closed and the entering of the mux select. The flag does not, as you suggest, surface the race conditions we are discussing. Regardless, it should not find issues with the code.

Adding the check for closed indices also to Size() etc. was for symmetry: it is easier to learn, that all functions will return an error once the index is closed. And Close() should as a matter of principle always be the last method to invoke on any index. The index should simply not be useful for anything after that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

KVS missing Close() method to release Index mutex

2 participants