Make sure to read the contributing guidelines before submitting a PR

infinitalo force-pushed the italo/adreno_inference branch from 0f376cd to 0f6d6da on October 16, 2025 20:15
infinitalo changed the base branch from temp-latest to temp-latest-finetuning on October 16, 2025 20:39

infinitalo commented Oct 16, 2025

@olyasir I've addressed your comments and pushed some cleanups, please feel free to give it another look.

I've also edited the PR to merge into temp-latest-finetuning instead of temp-latest.

infinitalo force-pushed the italo/adreno_inference branch 2 times, most recently from e40fed4 to 3f57c82, on October 21, 2025 18:20

gianni-cor commented Oct 24, 2025

@infinitalo can you please also check the failed pipelines?

Italo Nicola added 11 commits October 24, 2025 18:12
Shouldn't change any behavior since currently nb00 is always 1.
Robustness is usually disabled for Q8/Q4 shaders since having it enabled
impacts performance more significantly for those types than F16/F32.
Introduce a CMake option for disabling Adreno-specific shaders if
needed. This improves build time, but should not be used when targeting
Adreno devices.
Extend device detection to classify Qualcomm Adreno GPUs, enabling
targeted workarounds and shader selection when those devices are
present.
Avoid subgroup operations on Adreno by selecting safer paths to sidestep
compiler/driver bugs while preserving behavior.
Similar to what we do for other vendors such as Intel.
This optimization broke inference on Adreno.
Add build-time generation of Adreno-targeted shader variants under a
guard, so Adreno devices use safer code paths while other GPUs remain
unaffected.
Increase OUT_PROD Q8 performance through improving memory locality.
This makes finetuning work without crashing on Adreno 830.
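
For readers following the commit notes above, the vendor classification and the subgroup workaround can be pictured with a minimal sketch. This is not the code from the PR: the names `GpuVendor`, `classify_vendor`, and `allow_subgroup_ops` are hypothetical, and the sketch assumes the vendor is identified by the PCI vendor ID that Vulkan reports in `VkPhysicalDeviceProperties::vendorID` (0x5143 for Qualcomm). The actual change may also inspect driver properties or device names.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not taken from the PR.
enum class GpuVendor { Amd, Intel, Nvidia, Qualcomm, Other };

// PCI vendor IDs as reported in VkPhysicalDeviceProperties::vendorID.
constexpr uint32_t kVendorIdAmd      = 0x1002;
constexpr uint32_t kVendorIdIntel    = 0x8086;
constexpr uint32_t kVendorIdNvidia   = 0x10DE;
constexpr uint32_t kVendorIdQualcomm = 0x5143;  // Adreno GPUs

// Map a raw vendor ID to a coarse vendor class.
GpuVendor classify_vendor(uint32_t vendor_id) {
    switch (vendor_id) {
        case kVendorIdAmd:      return GpuVendor::Amd;
        case kVendorIdIntel:    return GpuVendor::Intel;
        case kVendorIdNvidia:   return GpuVendor::Nvidia;
        case kVendorIdQualcomm: return GpuVendor::Qualcomm;
        default:                return GpuVendor::Other;
    }
}

// Assumed policy, mirroring the commit note: skip subgroup operations on
// Adreno and leave every other vendor on its existing path.
bool allow_subgroup_ops(GpuVendor vendor) {
    return vendor != GpuVendor::Qualcomm;
}
```

A backend would typically run such a check once at device initialization and use the result to pick shader variants, so the per-dispatch path stays unchanged for non-Adreno GPUs.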
infinitalo force-pushed the italo/adreno_inference branch from 3f57c82 to 2c9a0e8 on October 24, 2025 21:14
gianni-cor self-requested a review on October 27, 2025 16:36
olyasir merged commit f9e7293 into tetherto:temp-latest-finetuning on Oct 27, 2025
39 of 47 checks passed