nogil `__len__` and `__eq__` break atomicity on sets and dicts

# Bug report

### Bug description:

Hi,

We're a research group focused on testing concurrent runtimes. Our work-in-progress prototype found a violation of atomicity on the current nogil build when using set `__len__` or `__eq__` with other concurrent operations on the same set/dict. The program below shows the wrong behavior for `__len__` and `__ixor__`:

```python
import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def t1(b1,s,res):
    b1.wait()
    s.__ixor__({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 'a'})

def Test():
  s =  {17, 'a', 'b', 18, 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 7 , 32 }:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")
```

Operation `__len__` should see the initial set and return `7` or the set after the `__ixor__` completes and return `32`.  However, it can see internal changes made by `__ixor__` and return different numbers.  Here's an example execution:

```
test begin...
0
found bug: 9
found bug: 17
found bug: 22
found bug: 30
found bug: 9
found bug: 9
found bug: 9
found bug: 9
found bug: 9
```

We have observed the same problem when `__len__` runs concurrently with other operations: `symmetric_difference_update`, `__isub__`, and `update`.  We'll share these test below in a comment.

We observed the same problem when dict's `__len__` runs concurrently with `__ior__`:

```python
import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())
​
def t1(b1,s,res):
    b1.wait()
    s.__ior__({0: 0, 1: 1, 2: 2, 3: 3, 4: 4})
​
def Test():
  d =  {'k1': 'v1', 'k2': 'v2', 'k3': 3, 10: 10}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, d,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, d,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 4, 9 }:
      print("found bug: " + str(res[0]))
​
print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")
```

Again, the length of the dict should be either 4 or 9, depending on whether `__ior__` took place.  However, `__len__` can see internal changes from `__ior__` as shown by the output:

```
test begin...
0
found bug: 5
found bug: 6
found bug: 5
found bug: 5
found bug: 8
found bug: 7
```

Finally, we observed set's `__eq__` to be able to access intermediate results of `__isub__`, as shown by the program below:
```python
import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.__isub__({1, 'b', 'd', 17})

def t1(b1,s,res):
    b1.wait()
    res.append(s.__eq__({'a', 'b', 18, 'c', 'd', 'e'}))

def Test():
  s =  {'a', 17, 'b', 18, 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] != False:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")
```

The original and resulting sets are never equal to the argument passed to `__eq__`, however the operation can see internal changes from `__isub__` as shown by the ouput:

```
test begin...
0
found bug: True
found bug: True
found bug: True
found bug: True
found bug: True
found bug: True
```

@flypoodles and @overlorde are part of the team, adding them so they get notified about further discussion.

We note that we observed these outputs in several x86_64 machines and one ARM machine.

### CPython versions tested on:

CPython main branch, 3.14

### Operating systems tested on:

Linux

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

nogil `len` and `eq` break atomicity on sets and dicts #129519

Bug report

Bug description:

CPython versions tested on:

Operating systems tested on:

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

nogil __len__ and __eq__ break atomicity on sets and dicts #129519

Description

Bug report

Bug description:

CPython versions tested on:

Operating systems tested on:

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

nogil `len` and `eq` break atomicity on sets and dicts #129519