Skip to content

fix: Reduce redundant cell checks for large radii in partition circle fill logic #1426

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Stubbjax
Copy link

@Stubbjax Stubbjax commented Aug 6, 2025

This change brings a small performance boost and logic optimisation to the doCircleFill method in the PartitionManager, which is the process used to determine which cells of the grid a medium to large cylindrical or spherical object occupies.

The original implementation is incorrect in that it uses the fixed cell radius in the loop condition instead of the decreasing y value. This means that the circle fill is not only performing redundant iterations on repeated rows, but also erroneously filling trapezoidal shapes along the top and bottom of the circle (as the x is not being curbed along each scan line). Cells occupy 40×40 world units.

In reality (retail), the trapezoidal shapes do not cover any additional cells as no object geometry has a large enough radius to be affected, with identical cell coverage at a radius of 240 and below. (The largest circular geometry in the game is the Anthrax Bomb poison field with a radius of 150.) Regardless, the fixed logic is behind a !RETAIL_COMPATIBLE_CRC flag to guarantee no mods or maps that introduce gigantic geometries cause any mismatches due to different cell coverage. See below for comparisons between the original and fixed implementations.

Comparison 1

h_anim
Demonstration of the trapezoidal issue at large scale

Comparison 2

s_anim
At radii <= 240, cell coverage is identical

Comparison 3

s_anim
At radii > 240, cell coverage diverges

Tabulated differences

image

Loops: Total number of for-loop iterations, including nested scan line iterations
Checks: Total number of cell checks performed
Cells: Total number of cells covered

Green cells: Reduced calculations
Yellow cells: Divergent cell coverage

TODO

  • Replicate in Generals

@Mauller
Copy link

Mauller commented Aug 6, 2025

Have you tested this for retail compatibility?

@xezon xezon added Bug Something is not working right, typically is user facing Performance Is a performance concern Gen Relates to Generals ZH Relates to Zero Hour labels Aug 6, 2025
@xezon
Copy link

xezon commented Aug 6, 2025

Nice illustration. I understand the problem and implication, but do not quite follow the algorithm. How did you notice that?

Also, how big of an impact does this have in a busy frame? Is it a minor or major performance improvement?

@Stubbjax
Copy link
Author

Stubbjax commented Aug 7, 2025

Have you tested this for retail compatibility?

It is fully compatible with retail unless a modified object with a radius greater than 240 is introduced via a custom map or mod. And even then, a mismatch would only be possible if a collision was detected within the original coverage that didn't end up being detected by any of the shared coverage (e.g. if a projectile were to land right on the outer radius). Such a scenario is unlikely as such a large object would take up half the map, but as the possibility exists, it is better to be safe.

Nice illustration. I understand the problem and implication, but do not quite follow the algorithm. How did you notice that?

The algorithm is an implementation of the midpoint circle algorithm. I noticed it while investigating the issue where units sometimes deal double damage with missile weaponry (TheSuperHackers/GeneralsGamePatch#954) - the cell partition algorithms are optimised for larger objects, and so accuracy is low for smaller to medium objects. (Objects often don't occupy the correct cells, leading to duplicate missile detonations.) For example, see this Quad Cannon (18×7) covering one cell with minimal extents overlap. Or this Stinger Site (radius 36) only covering one cell. It might be worth opening a dedicated issue for this.

Also, how big of an impact does this have in a busy frame? Is it a minor or major performance improvement?

It is minor and primarily makes a difference for the large radiation and poison fields (radius 100 - 150), which will perform fewer cell coverage checks upon instantiation. This change is really more of a 'correctness' fix.

image

@xezon xezon added Minor Severity: Minor < Major < Critical < Blocker Fix Is fixing something, but is not user facing and removed Bug Something is not working right, typically is user facing labels Aug 7, 2025
@xezon
Copy link

xezon commented Aug 7, 2025

Very nice. I am impressed by the detail of investigation and presentation.

If this change provably does not affect circles <= 240, then I do not mind applying the fix under that condition for RETAIL_COMPATIBLE_CRC as well.

Also, if the algorithm refers to https://en.wikipedia.org/wiki/Midpoint_circle_algorithm then we can add a comment refering to that for the next person.

@Stubbjax
Copy link
Author

Stubbjax commented Aug 7, 2025

If this change provably does not affect circles <= 240, then I do not mind applying the fix under that condition for RETAIL_COMPATIBLE_CRC as well.

The problem with any kind of additional conditional logic is that it would be checked in performance-critical code, which would somewhat undo the performance gain (in most cases it would actually worsen it).

Every cylindrical / spherical object on the map runs through this logic whenever it is instantiated or moves, which could result in hundreds to thousands of float comparison checks per frame if we were to check radius <= 240. (This is likely why an IsSmallGeometry bool is used to branch small radius checks rather than determining the branch based on the radius size.)

Also, if the algorithm refers to https://en.wikipedia.org/wiki/Midpoint_circle_algorithm then we can add a comment refering to that for the next person.

Done!

@xezon
Copy link

xezon commented Aug 7, 2025

I do not expect this to be a performance concern. If 100% of circles yield the same result, the branch predictor has a very easy job and will guess 100% of times correctly and there will be no pipeline flush.

You can measure it and check if it is ok.

@Stubbjax
Copy link
Author

Stubbjax commented Aug 7, 2025

I do not expect this to be a performance concern. If 100% of circles yield the same result, the branch predictor has a very easy job and will guess 100% of times correctly and there will be no pipeline flush.

You can measure it and check if it is ok.

Alright, if that is indeed the case, how would you prefer the coverage divergence to be proven? I took some screenshots with occupied cells highlighted using some debug logic. Below are comparisons between a modified Ranger (left, 239 radius) and Missile Defender (right, 240 radius) before and after the fix. Cell coverage is identical for radii < 240.

Before

image Ranger: 97 cells, Missile Defender: 145 cells

After

image Ranger: 97 cells, Missile Defender: 137 cells

Note

It should be noted that the boundary is really < 240 instead of <= 240 due to floating-point errors, which I did not previously consider.

image

(One would expect 240 × 0.025 to equal 6, but it instead equals 240 × 0.0250000004 = 6.000000096 which fast_float_ceil rounds up to 7. If we make the fixed logic conditional, we can simply check radius < 240 accordingly.)

@xezon
Copy link

xezon commented Aug 7, 2025

I suspect as long as every radius from 1 to 239 produces the same result before and after, then this should be good to go?

Is there a way to get a result of the function as number(s), that can be compared between the 2 implementation?

@Stubbjax
Copy link
Author

Stubbjax commented Aug 7, 2025

I suspect as long as every radius from 1 to 239 produces the same result before and after, then this should be good to go?

After splitting the change into conditional blocks as suggested, yes. The below animation might help to confirm the cell coverage matches up until 240 (alternates between original (yellow) and fixed (blue) every second).

RADIUS.mp4

Is there a way to get a result of the function as number(s), that can be compared between the 2 implementation?

What information are you looking for that is not covered by the tabulated data in the initial description?

@Mauller
Copy link

Mauller commented Aug 7, 2025

If the game does not use any circles with radii at or over 240, then i don't see a problem with just implementing this fix as it is.
There should be a reasonable performance gain without the overscanning that is occuring in the original algorithm.

@xezon
Copy link

xezon commented Aug 8, 2025

What information are you looking for that is not covered by the tabulated data in the initial description?

I would have tried to test this by putting the old implementation next to the new one and then call both functions with the same inputs from 1 to 239 and campare their results. If they are always identical, then it is proven to be identical.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Fix Is fixing something, but is not user facing Gen Relates to Generals Minor Severity: Minor < Major < Critical < Blocker Performance Is a performance concern ZH Relates to Zero Hour
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants