Undocumented differences between defaultdict and dict behavior

# Documentation

`defaultdict` seems to call `__getitem__` whenever `__setitem__` is called (regardless of whether the item was already present), whereas a regular `dict` does not call `__getitem__` when `__setitem__` is called. The documentation for [`defaultdict`](https://docs.python.org/3.14/library/collections.html#collections.defaultdict) says that `defaultdict` and `dict` are essentially identical, apart from a few narrow cases:

> [defaultdict](https://docs.python.org/3.14/library/collections.html#collections.defaultdict) is a subclass of the built-in [dict](https://docs.python.org/3.14/library/stdtypes.html#dict) class. It overrides one method and adds one writable instance variable. The remaining functionality is the same as for the [dict](https://docs.python.org/3.14/library/stdtypes.html#dict) class and is not documented here.

But the docs say nothing about this difference in how `__getitem__` / `__setitem__` are called.

This comes up when subclassing either of them to add a preprocessing step that operates on keys before they are used to index into the dictionary, e.g.:

```python
from collections import defaultdict

class Item: ...

def preprocess_item(item: Item) -> str:
    return f'::{item.__class__.__name__}@{hex(id(item))}'

class PreprocessingDefaultDict(defaultdict):
    def __getitem__(self, item: Item):
        key = preprocess_item(item)
        return super().__getitem__(key)

    # def __setitem__(self, item: Item, value):
    #     key = preprocess_item(item)
    #     super().__setitem__(key, value)


class PreprocessingDict(dict):
    def __getitem__(self, item: Item):
        key = preprocess_item(item)
        return super().__getitem__(key)

    def __setitem__(self, key: Item, value):
        key = preprocess_item(key)
        super().__setitem__(key, value)


if __name__ == '__main__':
    item = Item()
    
    d1 = PreprocessingDefaultDict(dict)
    d1[item]['a'] = 10  # initial creation of dict at d1[item]
    d1[item]['a'] = 20  # updating already existing dict at d1[item] 
    print(dict(d1))  # wrap in dict so it prints the same way as d2

    d2 = PreprocessingDict()
    d2[item] = {}       # a plain dict has no default factory, so create the nested dict explicitly
    d2[item]['a'] = 10  # first write into the nested dict at d2[item]
    d2[item]['a'] = 20  # updating already existing dict at d2[item]
    print(d2)
```

Which prints out something like:

```
{'::Item@0x7fd29b035280': {'a': 20}}
{'::Item@0x7fd29b035280': {'a': 20}}
```

In this example, I have a preprocessor function that I'd like to run on all keys, converting them from objects into strings that can be used in the dictionary. It is not clear from the docs that you must not override `__setitem__` (the method I have commented out), because `defaultdict` seems to always go through `__getitem__` and therefore always runs the preprocessor. If you do override `__setitem__` as in the commented-out code, the item is preprocessed twice, and you end up with results like this:

```
{'::str@0x7fc55b78a930': {'a': 10}, '::str@0x7fc55b78a970': {'a': 20}}
{'::Item@0x7fc55b754890': {'a': 20}}
```

or this:

```
{'::str@0x7f3715686930': {'a': 20}}
{'::Item@0x7f3715650860': {'a': 20}}
```

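(I believe the extra element appears because the string returned by `preprocess_item` may or may not be allocated at a new memory address for an identical input, so the `id` used in the second round of preprocessing can differ between calls.)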
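To illustrate the point about memory addresses, here is a minimal sketch. It assumes CPython, where `id` is based on the object's address, and it reuses `Item` and `preprocess_item` from the example above; the variable names are just for illustration. Two calls with the same item return equal text but two separate `str` objects, so preprocessing those strings a second time gives different keys as long as both strings are still alive.

```python
class Item: ...

def preprocess_item(item) -> str:
    return f'::{item.__class__.__name__}@{hex(id(item))}'

item = Item()

# Two calls on the same item build two separate str objects with equal text.
first = preprocess_item(item)
second = preprocess_item(item)
same_text = first == second
same_object = first is second

# Preprocessing the strings again embeds the id of each str object,
# so the results differ while both strings are alive.
twice_first = preprocess_item(first)
twice_second = preprocess_item(second)

print(same_text, same_object, twice_first == twice_second)
```

In the dictionary example the intermediate string is discarded between lookups, so its address may or may not be reused, which would be consistent with sometimes seeing one entry and sometimes two.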

I'm not exactly sure what the underlying cause of this difference is. It doesn't seem to be related to the `__missing__` method mentioned in the docs, because the behavior I mentioned happens for keys that are not present in the `defaultdict` as well as for those that are already present (and presumably wouldn't be calling `__missing__`).
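One way to narrow this down is to log every call to the special methods involved. The sketch below is only illustrative: `TracingDefaultDict` and the `calls` list are hypothetical names that are not part of my original example. As far as I can tell, CPython's `defaultdict.__missing__` stores the new default through ordinary item assignment, so an overridden `__setitem__` would run once for each missing key, and that may be where the second round of preprocessing comes from.

```python
from collections import defaultdict

calls = []  # hypothetical log of (method name, key) pairs

class TracingDefaultDict(defaultdict):
    def __getitem__(self, key):
        calls.append(('__getitem__', key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        calls.append(('__setitem__', key))
        super().__setitem__(key, value)

    def __missing__(self, key):
        calls.append(('__missing__', key))
        return super().__missing__(key)

d = TracingDefaultDict(dict)

d['k']['a'] = 10           # 'k' is absent at this point
first_access = list(calls)
calls.clear()

d['k']['a'] = 20           # 'k' is now present
second_access = list(calls)

print(first_access)
print(second_access)
```

If that reading is right, `d[item]['a'] = ...` only ever calls `__getitem__` on the outer dictionary directly, and `__setitem__` is reached through `__missing__` when the key is not found. With a preprocessing `__setitem__`, the value would then be stored under a key that later lookups never match, which might explain why keys that should already be present look missing.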


### python version
I ran my example in Python **3.6 through 3.12** and observed the same behavior in all of them.

Undocumented differences between defaultdict and dict behavior #124875
