Skip to content

Conversation

pcwalton
Copy link
Contributor

@pcwalton pcwalton commented Oct 6, 2025

PR #20993 attempted to fix a crash that would occur with the following sequence of events:

  1. An entity's material changes from type A to type B. (The material type must change, not just the material asset.)

  2. The extract_entities_needs_specialization<B> system runs and adds a new specialization change tick to the entity.

  3. The extract_entities_needs_specialization<A> system runs and removes the specialization change tick.

  4. We crash in rendering because no specialization change tick was present for the entity.

Unfortunately, that PR used the presence of the entity in RenderMaterialInstances to detect whether the entity is safe to delete from the specialization change tick table. This is incorrect for meshes that change material types while not visible, because only visible meshes are present in RenderMaterialInstances. So the above race can still occur if the mesh changes materials while off-screen, which will lead to a crash when the mesh becomes visible.

This PR fixes the issue by dividing the process of adding new specialization ticks and the process of removing old specialization ticks into two systems. First, all specialization ticks for all materials are updated; alongside that, we store the material instance tick, which is a tick that's updated once per frame. After that, we run sweep_entities_needing_specialization, which traverses the RemovedComponents list for each material and prunes dead entities from the table if and only if their material instance ticks haven't changed since the last frame. This ensures that the above race can't happen, regardless of whether the meshes are present in
RenderMaterialInstances.

Having to have two separate specialization ticks, one being the standard Bevy system tick and one being a special material instance tick that's updated once per frame, is unfortunate, but it seemed to me like the least bad option. We should be able to get rid of all of this when we have untyped materials.

changes and the material then becomes visible.

PR bevyengine#20993 attempted to fix a crash that would occur with the following
sequence of events:

1. An entity's material changes from type A to type B. (The material
   *type* must change, not just the material asset.)

2. The `extract_entities_needs_specialization<B>` system runs and adds a
   new specialization change tick to the entity.

3. The `extract_entities_needs_specialization<A>` system runs and
   removes the specialization change tick.

4. We crash in rendering because no specialization change tick was
   present for the entity.

Unfortunately, that PR used the presence of the entity in
`RenderMaterialInstances` to detect whether the entity is safe to delete
from the specialization change tick table. This is incorrect for meshes
that change material types while not visible, because only visible
meshes are present in `RenderMaterialInstances`. So the above race can
still occur if the mesh changes materials while off-screen, which will
lead to a crash when the mesh becomes visible.

This PR fixes the issue by dividing the process of adding new
specialization ticks and the process of removing old specialization
ticks into two systems. First, all specialization ticks for all
materials are updated; alongside that, we store the *material instance
tick*, which is a tick that's updated once per frame. After that, we run
`sweep_entities_needing_specialization`, which processes the
`RemovedComponents` list for each material and prunes dead entities from
the table if and only if their material instance ticks haven't changed
since the last frame. This ensures that the above race can't happen,
regardless of whether the meshes are present in
`RenderMaterialInstances`.

Having to have two separate specialization ticks, one being the standard
Bevy system tick and one being a special material instance tick that's
updated once per frame, is unfortunate, but it seemed to me like the
least bad option.
@pcwalton pcwalton added the A-Rendering Drawing game state to the screen label Oct 6, 2025
@pcwalton pcwalton added P-Crash A sudden unexpected crash C-Bug An unexpected or incorrect behavior P-Regression Functionality that used to work but no longer does. Add a test for this! labels Oct 6, 2025
@alice-i-cecile alice-i-cecile added this to the 0.17.3 milestone Oct 6, 2025
a point release.

This also fixes the `manual_material` example (which was actually broken
before, as it never had PR bevyengine#20993 applied).
Copy link
Member

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, thanks. Going forward, I'd like to consider the way we can make this bookkeeping untyped, as these these kinds of races and other weirdness in the presence of multiple materials is very difficult to reason about. I think the primary blocker here is the lack of an easy way to observe untyped change events across any possible material handle. But will think about this some.

@tychedelia tychedelia added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Oct 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior P-Crash A sudden unexpected crash P-Regression Functionality that used to work but no longer does. Add a test for this! S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants