Optimize contact constraint generation #699

Merged
Jondolf merged 56 commits into main from optimize-generate-constraints on Apr 17, 2025
@Jondolf Jondolf commented Apr 14, 2025

Objective

Currently, contact constraints are generated serially after the narrow phase. This requires iterating over all contact pairs and redoing various queries and computations, which can add meaningful overhead.

(Note: The commit history also includes most commits from #683, sorry about that 😅)

Solution

Optimize contact constraint generation by generating constraints directly in the parallel contact update loop of the narrow phase. This way, we get "free" parallelism while getting rid of the extra iteration, and we don't need to query for the rigid bodies or colliders a second time. We can even reuse some values computed for the contact update, like `effective_speculative_margin`.

Constraint generation is multi-threaded by storing constraints in thread-local `Vec`s and draining them serially into `ContactConstraints`. This preserves determinism.
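The buffering scheme can be sketched with the standard library alone. This is an illustration only, not Avian's implementation: `Constraint`, `ContactPair`, and `generate_constraints` are hypothetical names, and scoped threads stand in for Bevy's parallel iteration. Each worker fills its own buffer without locking, and the buffers are then appended serially in chunk order.

```rust
use std::thread;

/// Hypothetical stand-in for a contact constraint; not Avian's real type.
#[derive(Debug, Clone, PartialEq)]
struct Constraint {
    pair_index: usize,
    penetration: f32,
}

/// Hypothetical stand-in for a contact pair.
struct ContactPair {
    penetration: f32,
}

/// Generates constraints in parallel, one output buffer per chunk of pairs,
/// then drains the buffers serially in chunk order.
fn generate_constraints(pairs: &[ContactPair], workers: usize) -> Vec<Constraint> {
    let chunk_size = pairs.len().div_ceil(workers.max(1)).max(1);

    // Each worker fills its own buffer, so no locking is needed.
    let mut buffers: Vec<Vec<Constraint>> = thread::scope(|scope| {
        let handles: Vec<_> = pairs
            .chunks(chunk_size)
            .enumerate()
            .map(|(chunk_index, chunk)| {
                scope.spawn(move || {
                    let base = chunk_index * chunk_size;
                    chunk
                        .iter()
                        .enumerate()
                        // Only penetrating pairs produce a constraint here.
                        .filter(|(_, pair)| pair.penetration > 0.0)
                        .map(|(i, pair)| Constraint {
                            pair_index: base + i,
                            penetration: pair.penetration,
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order fixes the order of the buffers.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Serial drain into a single output list.
    let mut constraints = Vec::new();
    for buffer in &mut buffers {
        constraints.append(buffer);
    }
    constraints
}
```

In this sketch the output order is reproducible because each buffer belongs to a fixed chunk and the buffers are drained in chunk order. Buffers keyed by whichever thread happens to pick up the work would not give that guarantee on their own.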

Performance

For physics diagnostics, the "Generate Constraints" step has been removed, and its cost is now included in the "Narrow Phase" step. The narrow phase therefore appears more expensive in the diagnostics, but the total cost is considerably lower. Both single-threaded and multi-threaded performance improve, with the larger gain in the multi-threaded case.

Note that the "Store Impulses" step is now slightly more expensive. This is because the contact pair lookup can no longer be performed with the edge index directly, as constraints are generated before contact pair removal, and pair removal can invalidate indices.
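To illustrate why a cached index can go stale, here is a minimal sketch. It assumes a dense storage that fills holes by moving the last element into the freed slot; `ContactPairs`, `PairKey`, and the methods below are hypothetical stand-ins, not Avian's contact graph API. Looking a pair up by key costs one extra hash lookup compared to direct indexing, but stays correct after removals.

```rust
use std::collections::HashMap;

/// Hypothetical key identifying a contact pair by its two collider IDs.
type PairKey = (u32, u32);

/// Hypothetical pair storage: a dense list plus a key-to-index map.
#[derive(Default)]
struct ContactPairs {
    pairs: Vec<(PairKey, f32)>, // (key, stored normal impulse)
    index_of: HashMap<PairKey, usize>,
}

impl ContactPairs {
    /// Adds a pair and returns the index it currently occupies.
    fn insert(&mut self, key: PairKey) -> usize {
        let index = self.pairs.len();
        self.pairs.push((key, 0.0));
        self.index_of.insert(key, index);
        index
    }

    /// Removes a pair with `swap_remove`, which moves the last pair into
    /// the freed slot and therefore changes that pair's index.
    fn remove(&mut self, key: PairKey) {
        if let Some(index) = self.index_of.remove(&key) {
            self.pairs.swap_remove(index);
            // Re-register the pair that was moved into the hole, if any.
            if let Some((moved_key, _)) = self.pairs.get(index) {
                self.index_of.insert(*moved_key, index);
            }
        }
    }

    /// Stores an impulse by key. Returns `false` if the pair no longer exists.
    fn store_impulse(&mut self, key: PairKey, impulse: f32) -> bool {
        match self.index_of.get(&key) {
            Some(&index) => {
                self.pairs[index].1 = impulse;
                true
            }
            None => false,
        }
    }
}
```

An index handed out by `insert` before a call to `remove` may afterwards point at a different pair or past the end of the list, which is why the sketch writes impulses through the key instead.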

Single-threaded

Before:

[Image: Before]

After:

[Image: After]

Multi-threaded

Before:

[Image: Before]

After:

[Image: After]

Migration Guide

`NarrowPhaseSet::GenerateConstraints` has been removed. Contact constraints are now generated as part of `NarrowPhaseSet::Update`.

- The broad phase now emits new collision pairs and stores all pairs in a HashSet

- Contact status and kind are now tracked with `ContactPairFlags` instead of booleans

- The narrow phase adds new collision pairs, updates existing pairs, and responds to state changes separately instead of overwriting and doing extra work for persistent contact

- State changes are tracked with bit vectors (bit sets), which are fast to iterate serially

- The narrow phase is responsible for collision events instead of the `ContactReportingPlugin`
- Renamed `BroadCollisionPairs` to `BroadPhasePairSet`
- Added `BroadPhasePairSet` for fast pair lookup with new `PairKey`
- Improve broad phase docs
…pt-in

- Removed `BroadPhaseAddedPairs`
- Renamed `BroadPhasePairSet` to `BroadPhasePairs`
- Moved contact creation to broad phase to improve persistence
- Removed some graph querying overhead from contact pair removal by using the `EdgeIndex` directly
- Made collision events opt-in with `CollisionEventsEnabled` component
- Improved a lot of docs
@Jondolf Jondolf added the C-Performance (Improvements or questions related to performance) and A-Collision (Relates to the broad phase, narrow phase, colliders, or other collision functionality) labels Apr 14, 2025
@Jondolf Jondolf added this to the 0.3 milestone Apr 14, 2025
@Jondolf Jondolf enabled auto-merge (squash) April 17, 2025 00:23
@Jondolf Jondolf merged commit bac2021 into main Apr 17, 2025
5 checks passed
@Jondolf Jondolf deleted the optimize-generate-constraints branch April 17, 2025 01:15
Jondolf added a commit that referenced this pull request Apr 27, 2025
…nabled (#712)

# Objective

Multithreaded determinism is currently broken! This is because #699 accidentally resulted in contact constraint order being non-deterministic. CI didn't catch this, because it currently doesn't use the `parallel` feature.

## Solution

When the `parallel` feature is enabled, store the pair index for each `ContactConstraint`, and just sort the constraints based on these indices.

I suspect that a better approach that doesn't require sorting exists, but for now, this works as a simple hot-fix. I also tried the simple approach of returning `Vec`s from Bevy's `par_splat_map_mut`, but that would mean that we can't reuse the constraint buffers, and have to allocate from scratch every time. Either way, there's more to try and experiment with here.

Because the constraints within chunks are still largely sorted, the sorting ends up being quite fast. In the `pyramid_2d` example, it increased the time of "Update Contacts" from about 0.42 ms to about 0.44 ms with ~3025 constraints.
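The sorting step can be sketched as follows. The struct and function are hypothetical stand-ins rather than the actual Avian code, and the choice of a stable sort is an assumption made for this example: it keeps constraints that share a pair index in their original relative order, and the standard library documents its stable sort as fast on slices that are nearly sorted or made of a few sorted runs.

```rust
/// Hypothetical stand-in for a contact constraint that remembers the
/// index of the contact pair it was generated from.
#[derive(Debug, Clone, PartialEq)]
struct ContactConstraint {
    pair_index: usize,
    normal_impulse: f32,
}

/// Restores a reproducible order after per-thread buffers were
/// concatenated in a thread-dependent order.
fn sort_constraints(constraints: &mut [ContactConstraint]) {
    // Stable sort: constraints with equal `pair_index` keep their relative order.
    constraints.sort_by_key(|constraint| constraint.pair_index);
}
```

Two runs that concatenate the same buffers in different orders end up with identical constraint lists after `sort_constraints`, provided the constraints for each pair were generated in the same relative order.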

I also enabled the `parallel` feature for CI to hopefully catch multithreading problems like this automatically in the future!