jbosboom
Contributor

@jbosboom jbosboom commented Sep 5, 2025

While file timestamps can be anything the file system can store, most lie between the recent past and the near future. Optimize fill_time for typical timestamps in three ways:

  • When possible, convert to nanoseconds with C arithmetic.
  • When using C arithmetic and the seconds member is not required (for st_birthtime), avoid creating a long object.
  • When using C arithmetic, reorder the code to avoid the null checks implied in Py_XDECREF.
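
As a rough illustration of the first bullet, here is a small Python model of the range check that decides whether the conversion can stay in C arithmetic. The bounds and the names `fits_fast_path` and `to_ns` are assumptions made for this sketch, not the code in the patch: it only assumes that `sec * 10**9 + nsec` must fit in a signed 64-bit integer, with one second of headroom for `nsec`.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NS_PER_SEC = 1_000_000_000


def fits_fast_path(sec: int) -> bool:
    """Return True if sec * 10**9 + nsec cannot overflow a signed 64-bit int.

    Hypothetical bounds: one whole second of headroom is left on each side
    so that any nsec in [0, 999_999_999] is safe to add.
    """
    return INT64_MIN // NS_PER_SEC + 1 <= sec <= INT64_MAX // NS_PER_SEC - 1


def to_ns(sec: int, nsec: int) -> int:
    """Combine seconds and nanoseconds into one nanosecond count."""
    total = sec * NS_PER_SEC + nsec
    if fits_fast_path(sec):
        # In C this branch would be plain 64-bit arithmetic.
        assert INT64_MIN <= total <= INT64_MAX
    # Outside the range, C code would need arbitrary-precision objects;
    # Python integers handle both cases transparently.
    return total
```

Timestamps near the present fall well inside this range, so under these assumptions only extreme values would take the slower path.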

This improves python -m pyperf timeit -s 'import os' 'os.stat(".")' from 1.26 us +- 0.02 us to 1.15 us +- 0.01 us on Linux 6.16.2.arch1-1 btrfs and --enable-optimizations --with-lto.

I found this while implementing os.stat_result.st_birthtime on Linux and trying not to regress performance. This is a small change that should be an improvement on all platforms, so I've submitted it separately.

This needs tests, presumably in test_os.py UtimeTests.test_large_time, though it's actually testing os.stat. But that test currently only runs on NTFS on Windows. I'll look at Linux filesystem support tomorrow.

@bedevere-app

bedevere-app bot commented Sep 5, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@eendebakpt eendebakpt added the performance Performance or resource usage label Sep 5, 2025

@jbosboom
Contributor Author

jbosboom commented Sep 6, 2025

I want to add tests for timestamps just inside and outside the fast path to UtimeTests.test_large_time. Most but not all Linux filesystems support large timestamps. The direct way to check the filesystem type is calling statfs and comparing the f_type member to the magic numbers in man statfs. Unfortunately both sizeof(struct statfs) and sizeof(f_type) vary across different architectures, which makes it hard to call with ctypes. Is there precedent for adding a non-public Python interface to a Linux-specific C function just for test purposes? I didn't see anything about that in the devguide docs.

After some reflection, we can deal with varying struct statfs by simply overallocating by a lot. f_type is always the first 4 or 8 bytes of the structure (u32 or s64 [sic]). The magic numbers are all 2 or 4 bytes, and none of them are 0 or -1. Look at the first pair of u32s. If one of them is 0 or -1, the other is the magic (depending on architecture endianness). Otherwise the first (lowest address) u32 is the magic.
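
A minimal sketch of that heuristic, operating on the raw bytes of an overallocated buffer that `statfs` has filled in. The function name `guess_f_type` and the `byteorder` parameter are made up for this illustration, and the actual `statfs` call through ctypes is left out.

```python
import sys

# Per the reasoning above, no filesystem magic number is 0 or -1.
_NOT_MAGIC = (0x00000000, 0xFFFFFFFF)


def guess_f_type(buf: bytes, byteorder: str = sys.byteorder) -> int:
    """Guess f_type from the first 8 bytes of a filled struct statfs buffer."""
    if len(buf) < 8:
        raise ValueError("need at least the first 8 bytes of struct statfs")
    first = int.from_bytes(buf[0:4], byteorder)
    second = int.from_bytes(buf[4:8], byteorder)
    if first in _NOT_MAGIC:
        # 64-bit f_type on a big-endian machine: the magic is in the low
        # half, which is the second u32.
        return second
    # Either the second u32 is the padding half of a 64-bit f_type, or
    # f_type is 32 bits wide and the second u32 is the next member.
    return first
```

For example, `guess_f_type((0xEF53).to_bytes(8, "little"), "little")` gives `0xEF53`, the magic number shared by ext2, ext3, and ext4.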

Another way is to parse /proc/self/mountinfo looking for the major:minor of the test file, but parsing that file is nontrivial because it is space-separated with fields that may contain spaces, has optional fields in the middle of lines, and has no documented escaping rules (see man proc_pid_mountinfo).

@jbosboom
Contributor Author

jbosboom commented Sep 6, 2025

I don't think that checking large timestamp support using the filesystem magic number or name is actually that useful. I think tmpfs has supported 64-bit timestamps as long as the kernel VFS layer has. btrfs has always had 64-bit timestamps on-disk but I'm not sure they were plumbed in right away. The in-kernel NTFS driver should also have 64-bit timestamps, but I think most Linux NTFS users use the FUSE implementation ntfs-3g, and that will have the same magic number as other FUSE filesystems.

For the other common Linux filesystems we can't assume large timestamps. It's still possible to format ext4 with 32-bit timestamps, and anyway ext2 and ext3 use the same f_type magic number as ext4. xfs has only supported 64-bit timestamps since Linux 5.10.

I imagine most programs that care test for large timestamps by seeing if utimes and friends fail or if stat returns a clamped timestamp ("Updated file timestamps are set to the greatest value supported by the filesystem that is not greater than the specified time." -- man utimensat). But obviously that can't be part of a test that os.stat properly handles large timestamps!
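
For illustration, a probe of that kind might look like the sketch below. The helper name `supports_timestamp` is hypothetical, and the probe relies on `os.stat` itself, which is exactly why it cannot serve as the check inside a test of `os.stat`.

```python
import os
import tempfile


def supports_timestamp(directory: str, seconds: int) -> bool:
    """Probe whether the filesystem holding `directory` stores `seconds` exactly.

    Writes the timestamp to a scratch file and reads it back. A failure or
    a clamped value both count as "not supported".
    """
    target_ns = seconds * 1_000_000_000
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.utime(path, ns=(target_ns, target_ns))
        return os.stat(path).st_mtime_ns == target_ns
    except (OSError, OverflowError):
        # The call was rejected outright instead of clamping.
        return False
    finally:
        os.unlink(path)
```

A call such as `supports_timestamp(tempfile.gettempdir(), 5_000_000_000)` reports whether a timestamp beyond the 32-bit range survives a round trip on that filesystem.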

So, reviewers, would you be satisfied just testing this on Windows NTFS by adding a few timestamps to UtimeTests.test_large_time? fill_time only depends on time_t, and that should be a signed 64-bit type everywhere now.

@eendebakpt
Contributor

eendebakpt commented Sep 6, 2025

For the tests I would prefer to keep it simple: just add some more cases above and below the fast path threshold. If needed add tests for both 32- and 64-bit.

There is a slight behavior change with this PR (I do not think it matters, but that is for someone else to judge): with current main, if an error occurs anywhere, no changes are made to the struct sequence. With this PR, some changes may already have been made (e.g. s_index is set) when an error occurs.

@jbosboom Could you add a news entry?

@AA-Turner AA-Turner requested a review from vstinner September 6, 2025 23:00
@vstinner
Member

vstinner commented Sep 9, 2025

> This improves python -m pyperf timeit -s 'import os' 'os.stat(".")' from 1.26 us +- 0.02 us to 1.15 us +- 0.01 us on Linux 6.16.2.arch1-1 btrfs and --enable-optimizations --with-lto.

I re-ran the benchmark on Python built with ./configure (gcc -O3):

Mean +- std dev: [ref] 1.07 us +- 0.01 us -> [change] 867 ns +- 13 ns: 1.23x faster

Nice performance improvement!

@vstinner vstinner merged commit d4825ac into python:main Sep 9, 2025
45 checks passed
@vstinner
Member

vstinner commented Sep 9, 2025

Merged, thanks for this nice optimization.

lkollar pushed a commit to lkollar/cpython that referenced this pull request Sep 9, 2025
…8537)

While file timestamps can be anything the file system can store, most
lie between the recent past and the near future.  Optimize fill_time()
for typical timestamps in three ways:

- When possible, convert to nanoseconds with C arithmetic.
- When using C arithmetic and the seconds member is not required (for
  st_birthtime), avoid creating a long object.
- When using C arithmetic, reorder the code to avoid the null checks
  implied in Py_XDECREF().

Co-authored-by: Victor Stinner <[email protected]>