
Dropped error in etcd/v2 lets multiple clients think they've acquired a lock. #50

@jlhawn


There's some bad error checking around here:

libkv/store/etcd/v2/etcd.go

Lines 528 to 534 in 5e4bb28

if err != nil {
	if etcdError, ok := err.(etcd.Error); ok {
		if etcdError.Code != etcd.ErrorCodeNodeExist {
			return nil, err
		}
		setOpts.PrevIndex = ^uint64(0)
	}

If the error is not of type etcd.Error, it is ignored completely. A second request to Set() the key is then made with PrevIndex left unset, which simply overwrites whatever value is at that key as long as the key already exists. The result is two processes that both believe they have acquired the lock at the same time. This happens only when the first Set() call fails because of a transient connection issue and the second call succeeds. We have seen this occur several times.

Why is there even a second call to Set() the key? Why not immediately go into a wait loop if the first Set() attempt results in etcd.ErrorCodeNodeExist?
