Skip to content

nogil __len__ and __eq__ break atomicity on sets and dicts #129519

@luisggpina

Description

@luisggpina

Bug report

Bug description:

Hi,

We're a research group focused on testing concurrent runtimes. Our work-in-progress prototype found a violation of atomicity on the current nogil build when using set __len__ or __eq__ with other concurrent operations on the same set/dict. The program below shows the wrong behavior for __len__ and __ixor__:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def t1(b1,s,res):
    b1.wait()
    s.__ixor__({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 'a'})

def Test():
  s =  {17, 'a', 'b', 18, 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 7 , 32 }:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Operation __len__ should see the initial set and return 7 or the set after the __ixor__ completes and return 32. However, it can see internal changes made by __ixor__ and return different numbers. Here's an example execution:

test begin...
0
found bug: 9
found bug: 17
found bug: 22
found bug: 30
found bug: 9
found bug: 9
found bug: 9
found bug: 9
found bug: 9

We have observed the same problem when __len__ runs concurrently with other operations: symmetric_difference_update, __isub__, and update. We'll share these test below in a comment.

We observed the same problem when dict's __len__ runs concurrently with __ior__:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())
​
def t1(b1,s,res):
    b1.wait()
    s.__ior__({0: 0, 1: 1, 2: 2, 3: 3, 4: 4})
​
def Test():
  d =  {'k1': 'v1', 'k2': 'v2', 'k3': 3, 10: 10}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, d,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, d,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 4, 9 }:
      print("found bug: " + str(res[0]))
​
print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Again, the length of the dict should be either 4 or 9, depending on whether __ior__ took place. However, __len__ can see internal changes from __ior__ as shown by the output:

test begin...
0
found bug: 5
found bug: 6
found bug: 5
found bug: 5
found bug: 8
found bug: 7

Finally, we observed set's __eq__ to be able to access intermediate results of __isub__, as shown by the program below:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.__isub__({1, 'b', 'd', 17})

def t1(b1,s,res):
    b1.wait()
    res.append(s.__eq__({'a', 'b', 18, 'c', 'd', 'e'}))

def Test():
  s =  {'a', 17, 'b', 18, 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] != False:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

The original and resulting sets are never equal to the argument passed to __eq__, however the operation can see internal changes from __isub__ as shown by the ouput:

test begin...
0
found bug: True
found bug: True
found bug: True
found bug: True
found bug: True
found bug: True

@flypoodles and @overlorde are part of the team, adding them so they get notified about further discussion.

We note that we observed these outputs in several x86_64 machines and one ARM machine.

CPython versions tested on:

CPython main branch, 3.14

Operating systems tested on:

Linux

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions