-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Improve bloom and add option to use faster but lower quality bloom #21340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Welcome, new contributor! Please make sure you've read our contributing guide and we look forward to reviewing your pull request shortly ✨ |
Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it. |
What's perf like with the low quality version? Afaik, the main perf issues with bloom is CPU usage, not GPU usage. Having so many separate render passes (1 per upsample/downsample step) is quite expensive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks well written!
It looks like your PR has been selected for a highlight in the next release blog post, but you didn't provide a release note. Please review the instructions for writing release notes, then expand or revise the content in the release notes directory to showcase your changes. |
I quite like the idea of this, but I want to see benchmarks on low-end machines :) |
Tested the bloom 3d example on Mali G76 android with log diagnostics (debug build, opt-level 3): CPU time
GPU time (high quality bloom)
GPU time (low quality bloom)
TLDR: Bloom CPU takes ~2.3 ms (can be reduced if building in release mode). |
Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it. |
I also add a |
Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it. |
bbd330a
to
ea0c52c
Compare
Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it. |
1 similar comment
Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it. |
.min(bloom.max_mip_count) | ||
.max(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this going to be always capped at 2? It seems a bit surprising to default it to u32::MAX if that's the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is clamped to [2, max_mip_count]
I added the Deliberate-Rendering-Change tag but I'm not sure all of it is correct. There's new text in the example which should cause it to fail but there's also a ton of other changes in the images which is surprising since I would have expected the default values to still look the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, happy to enable more perf configurability here.
|
||
// Whether to use a high quality bloom implementation (default: true). | ||
// If false, bloom will use an implementation that significantly reduces the number of texture samples and improves performance, but at the cost of lower quality. | ||
pub high_quality: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this an enum to allow adding more possible quality settings / configuration in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we should add an enum. The two bloom implementations seem sufficient, and there are other quality configuration. The performance difference of using moderate samples may not be significant.
downsampling_bind_groups: Box<[BindGroup]>, | ||
upsampling_bind_groups: Box<[BindGroup]>, | ||
sampler: Sampler, | ||
downsample_sampler: Sampler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: would prefer to avoid these kinds of noise-y changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be removed actually.
I modified the calculation for the max mip dimensions, and used different samplers (ClampToBorder and ClampToEdge) for downsampling and upsampling , so the image changes are as expected. |
ea0c52c
to
ab01abd
Compare
Objective
Related to #13109. Bloom is slow, especially on mobile.
Solution
high_quality
option to Bloom, if false, use blur kernels that only read 4 samples for downsampling and upsampling (from https://www.shadertoy.com/view/mdsyDf), instead of 13 samples and 9 samples.ClampToBorder
for downsampling andClampToEdge
for upsampling. Only usingClampToEdge
causes the bloom too bright and may flicker in the edges in low quality bloom.Testing
Added the option to
bloom_2d
andbloom_3d
examples to test it.TODO: benchmark the performance.
Showcase
Click to view showcase