Microbenchmarks improvements and bug fixes #799
Mahalaxmibejugam wants to merge 7 commits into fsspec:main
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main     #799      +/-   ##
==========================================
+ Coverage   75.98%   76.32%   +0.33%
==========================================
  Files          14       14
  Lines        2665     2665
==========================================
+ Hits         2025     2034       +9
+ Misses        640      631       -9
```
```yaml
    - 131072
folders:
  - 256
sample_size:
```
Previously, we created only 100 files and 100 folders in the bucket and called info on all 200 paths (files and folders). Now that I've modified the benchmark to include 65k and 130k files, calling info on all 65k paths would not yield significantly more data points than calling it on only 100 paths, and would unnecessarily increase the benchmark's runtime.
Segregated the scenarios for files and folders, so there is no need for sampling now. For file scenarios, 10k files are created and info is called on each of them. For folder scenarios, we create 65k files and 256 folders and call info on all 256 folders.
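A minimal sketch of what the segregated scenario config might look like. The field names (`files`, `folders`, `name`) are assumptions for illustration, not the actual benchmark schema:

```yaml
# Hypothetical sketch -- field names assumed, not the real config schema.
scenarios:
  - name: "info_files"
    files: 10000       # info called on each file
  - name: "info_folders"
    files: 65536       # files created to populate the folders
    folders: 256       # info called on each of the 256 folders
```

Splitting the scenarios this way removes the need for sampling, since each scenario's path count is sized to its own operation.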
```diff
 scenarios:
   - name: "delete_flat"
-    folders: [256]
+    folders: [1024, 2048, 4096]
```
Instead of updating the folder counts, I'd suggest creating a new scenario with these options. This will impact the daily runs, since creating these folders during setup will take a long time. So if you really want a daily trigger that compares a large number of folders, it's better to create separate scenarios and a trigger pointing to those scenarios.
Just increasing the number of folders won't actually increase the setup time: we make no explicit mkdir calls to create folders; they are implicitly created during file creation.
But since we are increasing the scenarios from one (256) to three (1024, 2048, 4096), more scenarios will run, so the delete benchmarks will take more time. However, I suggest keeping them, because the number of folders contributes significantly to the delete benchmark's latency, and we only observe latency differences between HNS and standard buckets at 2k and 4k folders.
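The point about implicit folder creation can be illustrated with a toy model of an object store. This is a sketch of the general idea, not the benchmark's setup code: in a flat key/value bucket, "folders" are just key prefixes, so creating files under nested paths materializes the folders without any mkdir calls.

```python
# Toy model of a flat object store (like a GCS bucket): folders are
# derived from key prefixes, so creating files implicitly creates them.
objects = {}

def create_file(path, data=b""):
    objects[path] = data  # just a key/value write, no mkdir anywhere

def list_folders(prefix=""):
    # Derive "folders" from key prefixes, the way bucket listing does.
    folders = set()
    for key in objects:
        if key.startswith(prefix):
            rest = key[len(prefix):]
            if "/" in rest:
                folders.add(prefix + rest.split("/", 1)[0])
    return sorted(folders)

# Creating 4 files under 2 nested paths implicitly creates both folders.
for i in range(2):
    for j in range(2):
        create_file(f"bucket/folder{i}/file{j}.txt")

print(list_folders("bucket/"))  # ['bucket/folder0', 'bucket/folder1']
```

This is why raising the folder counts adds no explicit setup calls; the extra cost shows up only in the number of scenarios that run.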
This PR includes the following changes to the microbenchmarks suite:
Fix chunking in test_info_multi_threaded: Corrected the handling of paths in the multi-threaded info benchmark so that paths are properly distributed across threads instead of being passed as a single tuple.
Add more files in info benchmarks: Increased the number of files in info benchmarks to provide a more rigorous performance test.
Remove sleep from rename benchmarks: Removed a sleep call that was added to work around a Long Running Operation (LRO) issue that has since been fixed.
Add more folders in rm benchmarks: Increased the number of folders in rm benchmarks to better measure performance under scale.
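The chunking fix in the first item can be sketched as follows. This is an illustrative reconstruction, not the actual `test_info_multi_threaded` code; the helper names (`chunk`, `info_worker`, `multi_threaded_info`) are hypothetical:

```python
# Sketch of distributing paths across threads instead of handing every
# worker one big tuple (helper names are hypothetical, for illustration).
from concurrent.futures import ThreadPoolExecutor

def chunk(paths, n_threads):
    # Round-robin split so every thread gets a near-equal share of paths.
    return [paths[i::n_threads] for i in range(n_threads)]

def info_worker(fs_info, paths):
    # Each worker calls info() on its own chunk of paths.
    return [fs_info(p) for p in paths]

def multi_threaded_info(fs_info, paths, n_threads=4):
    chunks = chunk(paths, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = pool.map(lambda c: info_worker(fs_info, c), chunks)
    return [r for part in results for r in part]

paths = [f"bucket/file{i}.txt" for i in range(10)]
out = multi_threaded_info(lambda p: {"name": p}, paths)
print(len(out))  # 10 -- every path was visited exactly once
```

The bug being fixed is the degenerate case where the whole path collection reaches one worker as a single tuple, so the benchmark measures one thread's serial work rather than genuinely parallel info calls.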