Performance regression of pathlib.Path hashing #138407

@csernazs

Bug report

Bug description:

Converting a list of pathlib.Path objects to a set() takes significantly more time in Python 3.12 and later than in 3.11.

The code below is a minimal reproduction:

from pathlib import Path
import time
import sys

print(sys.version)

# build 6000 distinct path strings (the f prefix is needed so {i} is interpolated)
paths_str = [f"/tmp/foo/bar/{i}" for i in range(1000, 7000)]

t = time.perf_counter()
# convert it to a list of Path objects
paths_path = [Path(path) for path in paths_str]

elapsed = time.perf_counter() - t
print(f"List construct: {int(elapsed * 1000)} msec")

# convert list to set
t = time.perf_counter()
set(paths_path)
elapsed = time.perf_counter() - t
print(f"Set conversion: {int(elapsed * 1000)} msec")


# extra step: convert list to set (demonstrate caching makes it faster)
t = time.perf_counter()
set(paths_path)
elapsed = time.perf_counter() - t
print(f"Set conversion (cached): {int(elapsed * 1000)} msec")

Output with 3.12:

3.12.11 (main, Jun  3 2025, 15:41:47) [GCC 14.3.0]
List construct: 5 msec
Set conversion: 28 msec
Set conversion (cached): 1 msec

With 3.14rc1 (similar):

3.14.0rc1 (main, Jul 22 2025, 16:42:44) [GCC 14.3.0]
List construct: 4 msec
Set conversion: 26 msec
Set conversion (cached): 1 msec

But with Python 3.11 (faster):

3.11.13 (main, Jun  3 2025, 18:38:25) [GCC 14.3.0]
List construct: 12 msec
Set conversion: 8 msec
Set conversion (cached): 1 msec
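
Single-shot timings like the ones above can be noisy, so a repeated measurement may make the comparison between versions easier to trust. The following is a minimal sketch using the standard timeit module; the helper name best_of and the repeat count are illustrative choices, not part of the original report. Because number=1, the setup code reruns before every timing, so each measurement sees freshly built Path objects with no cached hash.

```python
import sys
import timeit

# Setup string: rebuilds the 6000 Path objects before every timed run,
# so no per-object cached value survives between measurements.
SETUP = """
from pathlib import Path
paths = [Path(f"/tmp/foo/bar/{i}") for i in range(1000, 7000)]
"""


def best_of(stmt, repeat=5):
    """Return the fastest of `repeat` single runs of `stmt`, in seconds.

    `best_of` is a hypothetical helper name used only for this sketch.
    """
    return min(timeit.repeat(stmt, setup=SETUP, repeat=repeat, number=1))


if __name__ == "__main__":
    print(sys.version)
    print(f"Set conversion (best of 5): {best_of('set(paths)') * 1000:.2f} msec")
```

Running the same script under each interpreter (3.11, 3.12, 3.14rc1) should give numbers that are directly comparable.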

I also found that str() got slower as well, not only hash(): the same slowdown appears if you replace the set() call with [str(p) for p in paths_path].
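
The str() variant mentioned here can be sketched in the same style as the reproduction. The function name time_str_conversion and its parameter n are assumptions made for this illustration. It times two passes over the same objects, so the second pass reflects whatever per-object caching the running Python version applies.

```python
from pathlib import Path
import time


def time_str_conversion(n=6000):
    """Time str() over n fresh Path objects, twice.

    Hypothetical helper for illustration; returns
    (first_pass_seconds, second_pass_seconds, list_of_strings).
    """
    paths = [Path(f"/tmp/foo/bar/{i}") for i in range(1000, 1000 + n)]

    # first pass: str() on objects that have not been converted before
    t = time.perf_counter()
    strings = [str(p) for p in paths]
    first = time.perf_counter() - t

    # second pass: same objects again
    t = time.perf_counter()
    [str(p) for p in paths]
    second = time.perf_counter() - t

    return first, second, strings


if __name__ == "__main__":
    first, second, _ = time_str_conversion()
    print(f"str conversion: {first * 1000:.2f} msec")
    print(f"str conversion (second pass): {second * 1000:.2f} msec")
```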

I found a possible candidate commit for this regression, but this is only a theory that I could not test:
a68e585

Could you please check what causes this?
I think newer Python versions should not make existing code slower. pathlib.Path is a great object (far superior to str), but storing many paths in a set or dict (which are also powerful data structures) now kills performance.

If you need any help, let me know.

Thank you,
Zsolt

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Labels:

performance (Performance or resource usage), stdlib (Standard Library Python modules in the Lib/ directory), topic-pathlib, type-bug (An unexpected behavior, bug, or error)