Throughput Updates, main branch (2025.08.18.) #1130

krasznaa · 2025-08-18T15:38:21Z

Following #1112, here I try to restore the overall throughput of the GPU applications. To cut to the chase, this is the throughput I see with the code as it was just before #1112 was merged in:

./build-old/bin/traccc_throughput_mt_cuda --input-directory /data/Acts/odd-simulations-20240509/geant4_ttbar_mu200/ --input-events=100 --deterministic --cpu-threads=4
...
Warm-up processing [==================================================] 100% [00m:00s]
Event processing   [==================================================] 100% [00m:00s]
05:20:04 PM ThroughputExample             INFO      Reconstructed track parameters: 2727261
05:20:04 PM ThroughputExample             INFO      Time totals:                   File reading  4601 ms
05:20:04 PM ThroughputExample             INFO                  Warm-up processing  987 ms
05:20:04 PM ThroughputExample             INFO                    Event processing  9778 ms
05:20:04 PM ThroughputExample             INFO      Throughput:            Warm-up processing  98.7184 ms/event, 10.1298 events/s
05:20:04 PM ThroughputExample             INFO                    Event processing  97.7832 ms/event, 10.2267 events/s

And with this PR's code I see:

./build-new/bin/traccc_throughput_mt_cuda --input-directory /data/Acts/odd-simulations-20240509/geant4_ttbar_mu200/ --input-events=100 --deterministic --cpu-threads=4
...
Warm-up processing [==================================================] 100% [00m:00s]
Event processing   [==================================================] 100% [00m:00s]
05:20:33 PM ThroughputExample             INFO      Reconstructed track parameters: 2727308
05:20:33 PM ThroughputExample             INFO      Time totals:                   File reading  4242 ms
05:20:33 PM ThroughputExample             INFO                  Warm-up processing  1013 ms
05:20:33 PM ThroughputExample             INFO                    Event processing  9917 ms
05:20:33 PM ThroughputExample             INFO      Throughput:            Warm-up processing  101.331 ms/event, 9.8686 events/s
05:20:33 PM ThroughputExample             INFO                    Event processing  99.1775 ms/event, 10.0829 events/s

Unfortunately there is still a slight drop, which I intend to look into further. However, in this new version the host code gets a more representative description of the reconstructed tracks than was available before #1112. (The track-to-state jagged indices are copied back to the host in the new version, while in the old version all of that information was left on the device.)

Finally, about the PR:

Simplified the common code of the throughput applications so that it only uses vecmem::host_memory_resource, leaving anything more specific to the full chain algorithm classes.
Modified the full chain algorithms to:
- create pinned host memory resources, with their own caching, internally;
- pass the cached host and device memory resources to all of their sub-algorithms for the intermediate object creation;
- copy the final objects first into a buffer in cached and pinned host memory, and then copy them with host-to-host transfers into "host containers" that use regular host memory. (This is the part that should be responsible for the remaining performance difference.)
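The layering described in the list above can be illustrated with a minimal standard-library sketch. This is not the traccc or vecmem code: `counting_resource` is a hypothetical stand-in for a pinned host memory resource, `std::pmr::unsynchronized_pool_resource` plays the role of the cache on top of it, and `process_event` and `upstream_allocations` are made-up names. The point is only to show why a cache in front of an expensive upstream resource helps, and where the extra host-to-host copy comes from.

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical stand-in for a pinned host memory resource. It forwards to an
// upstream resource and counts how often that upstream is actually used.
class counting_resource : public std::pmr::memory_resource {
public:
    explicit counting_resource(std::pmr::memory_resource* upstream)
        : m_upstream(upstream) {}
    std::size_t allocations() const { return m_allocations; }

private:
    void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        ++m_allocations;
        return m_upstream->allocate(bytes, alignment);
    }
    void do_deallocate(void* p, std::size_t bytes,
                       std::size_t alignment) override {
        m_upstream->deallocate(p, bytes, alignment);
    }
    bool do_is_equal(
        const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::pmr::memory_resource* m_upstream;
    std::size_t m_allocations = 0;
};

// Hypothetical per-event step: stage the result in the given (possibly
// cached) memory, then copy it into a container using regular host memory.
std::vector<int> process_event(std::pmr::memory_resource& staging_mr,
                               std::size_t n) {
    std::pmr::vector<int> staging(n, 1, &staging_mr);  // intermediate buffer
    return std::vector<int>(staging.begin(), staging.end());  // host-to-host
}

// Number of upstream ("pinned") allocations needed to process `events` events.
std::size_t upstream_allocations(std::size_t events, bool use_cache) {
    counting_resource pinned(std::pmr::new_delete_resource());
    if (!use_cache) {
        for (std::size_t i = 0; i < events; ++i) {
            process_event(pinned, 64);
        }
        return pinned.allocations();
    }
    std::pmr::pool_options opts;
    opts.largest_required_pool_block = 4096;  // pool blocks up to 4 KiB
    std::pmr::unsynchronized_pool_resource cached(opts, &pinned);
    for (std::size_t i = 0; i < events; ++i) {
        process_event(cached, 64);
    }
    return pinned.allocations();
}
```

Calling `upstream_allocations(100, false)` hits the upstream resource once per event, while `upstream_allocations(100, true)` only needs a handful of upstream allocations because the pool reuses the staging block between events. The copy at the end of `process_event` is the analogue of the host-to-host transfer mentioned in the last bullet.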

But as I said at the start, I'll keep looking at this, to see if it can be made a little faster / more efficient still.


krasznaa · 2025-08-18T15:47:27Z

To add, the current main branch (or rather the version that this PR's branch is currently based on, since it has gone out of date in the meantime) produces the following:

./build-current/bin/traccc_throughput_mt_cuda --input-directory /data/Acts/odd-simulations-20240509/geant4_ttbar_mu200/ --input-events=100 --deterministic --cpu-threads=4
...
Warm-up processing [==================================================] 100% [00m:00s]
Event processing   [==================================================] 100% [00m:00s]
05:45:13 PM ThroughputExample             INFO      Reconstructed track parameters: 2727265
05:45:13 PM ThroughputExample             INFO      Time totals:                   File reading  4411 ms
05:45:13 PM ThroughputExample             INFO                  Warm-up processing  1866 ms
05:45:13 PM ThroughputExample             INFO                    Event processing  16169 ms
05:45:13 PM ThroughputExample             INFO      Throughput:            Warm-up processing  186.646 ms/event, 5.35775 events/s
05:45:13 PM ThroughputExample             INFO                    Event processing  161.693 ms/event, 6.18455 events/s

As discussed in #1112 earlier. 🤔
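To put the three "Event processing" figures quoted in this thread side by side, here is a small sketch of the arithmetic. The helper names are made up for illustration and the inputs are the ms/event values copied from the logs above (97.7832 for the pre-#1112 build, 99.1775 for this PR, 161.693 for the current main branch).

```cpp
// Convert a per-event processing time (in ms) into a throughput (events/s).
constexpr double events_per_second(double ms_per_event) {
    return 1000.0 / ms_per_event;
}

// Relative change of `value` with respect to `reference`, in percent.
constexpr double relative_change_percent(double value, double reference) {
    return 100.0 * (value - reference) / reference;
}

// Per-event times from the logs in this thread, in ms/event.
constexpr double old_ms = 97.7832;   // code just before #1112
constexpr double new_ms = 99.1775;   // this PR
constexpr double main_ms = 161.693;  // main branch this PR is based on

// Throughput of this PR relative to the two reference builds, in percent.
constexpr double change_vs_old = relative_change_percent(
    events_per_second(new_ms), events_per_second(old_ms));
constexpr double change_vs_main = relative_change_percent(
    events_per_second(new_ms), events_per_second(main_ms));
```

With these inputs, `change_vs_old` comes out at about -1.4 % and `change_vs_main` at about +63 %. These are single runs, so the small difference with respect to the pre-#1112 build should be read with some caution.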

stephenswat

Good!

sonarqubecloud · 2025-08-18T17:19:14Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
11.1% Duplication on New Code

See analysis details on SonarQube Cloud

krasznaa added 2 commits August 18, 2025 16:11

Removed central host memory caching from the throughput code.

b69f51b

So that it would be left up to the individual full-chain algorithms to do with their host memory handling as they wished.

Modified the use of host memory caching in the algorithms.

abf584d

krasznaa requested review from beomki-yeo and stephenswat August 18, 2025 15:38

krasznaa added the improvement (Improve an existing feature) and examples (Changes to the examples) labels Aug 18, 2025

stephenswat approved these changes Aug 18, 2025


stephenswat enabled auto-merge (squash) August 18, 2025 17:18

Merge branch 'main' into ThroughputUpdates-main-20250818

0dfc444

stephenswat merged commit 2490295 into acts-project:main Aug 18, 2025
25 of 29 checks passed

krasznaa deleted the ThroughputUpdates-main-20250818 branch August 19, 2025 07:04
