
Conversation

@nascheme (Member) commented Feb 28, 2025

As described in the issue, using the unit tests as the PGO task has some problems. This PR creates a new PGO task by taking benchmarks from the "pyperformance" suite and adjusting them slightly to work better as a PGO task. I had a couple of objectives when moving the benchmark code over:

  • use only benchmarks that don't have external dependencies
  • prefer benchmarks that exercise the most heavily used parts of Python
  • adjust the code so that execution time is mostly spent doing "real work" rather than on timing-related overhead
  • adjust the loops or iteration counts so that each task takes roughly 0.1 to 1 seconds (see the sketch after this list)
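To make the adaptation concrete, here is a minimal sketch of what an adapted benchmark might look like: the pyperf timing harness is dropped and the workload runs a fixed number of iterations, so nearly all execution time is "real work". The workload function, its name, and the iteration count below are illustrative assumptions, not code taken from the actual PR.

```python
import json


def bench_json_roundtrip(loops):
    # A small, dependency-free workload exercising heavily used code
    # paths (dicts, lists, string building, the json module).
    data = {"key": [list(range(100)), {"nested": "value"}] * 10}
    for _ in range(loops):
        encoded = json.dumps(data)
        decoded = json.loads(encoded)
        assert decoded == data


if __name__ == "__main__":
    # Iteration count tuned (hypothetically) so the whole task takes
    # roughly 0.1 to 1 seconds on a typical machine.
    bench_json_roundtrip(2000)
```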

There are some potential issues with adding this new PGO task:

  • adding the additional code to the cpython repo is not ideal: it adds approximately 500 kB of data to the Lib/test/pgo_task folder. We could put the code in a separate repo or PyPI package, but I think we prefer that Python can be compiled without additional external dependencies.
  • the code duplication between pyperformance and Lib/test/pgo_task is not ideal. OTOH, the pyperformance benchmarks don't see many code changes (benchmark results need to stay stable), so keeping the copies in sync should not be too much of a problem.
  • the current task set doesn't cover as much code as the unit-test-based PGO task did. I think this can be resolved over time by adding more modules to the pgo_task folder to cover those code paths (a runner sketch follows this list).
  • it's possible that using the new PGO task will result in worse optimization of the Python executable for real workloads. However, if that happens, I think we should adjust both the pyperformance benchmarks and the PGO tasks so those execution patterns are better represented. We are focusing a lot of optimization effort based on what pyperformance results say, so those benchmarks should also be a good representation of real-world programs; if they are not, we should improve them.
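On the coverage point above, a runner along the following lines would let coverage grow simply by dropping new modules into the folder. The package path, the bench_* naming convention, and the run() entry point are assumptions for illustration; they are not the PR's actual layout.

```python
import importlib
import pkgutil

import test.pgo_task  # assumed package location


def run_all():
    # Discover every bench_* module in the (hypothetical) pgo_task
    # package and execute its workload in sequence.
    for info in pkgutil.iter_modules(test.pgo_task.__path__):
        if not info.name.startswith("bench_"):
            continue
        module = importlib.import_module(f"test.pgo_task.{info.name}")
        # Each task module is assumed to expose a run() callable that
        # performs its workload for a fixed number of iterations.
        module.run()


if __name__ == "__main__":
    run_all()
```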

Benchmark results from pyperformance.

@brandtbucher (Member) commented:
I agree with the root issue, but I'm skeptical of using our benchmarks as a profiling task (it feels too much like "gaming" them).

As a data point, I tried using pyperformance as a PGO task a couple of years ago, and it made things about 4% faster (according to pyperformance, of course): faster-cpython/ideas#99 (comment)
