[python][compiler] Memoize device max shared memory per device (triton-lang#6503)
Similar to triton-lang#6000, this upstreams an internal Meta patch,
with the goal of reducing the number of patches we carry internally.
cc @jamesjwu, the original author.
When running various benchmarks with small kernels, we saw a non-trivial
amount of time spent fetching this property, and memoizing it helped.
It might be worth looking into memoizing `get_device_properties`, but I
think that'd need a more careful treatment in the driver package in
order to properly handle arbitrary backends.
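
As a rough illustration of the per-device memoization described above (a
sketch, not the code from this patch): `query_device_properties` is a
hypothetical stand-in for the slow driver lookup, and the values it returns
are placeholders. The cache is keyed on the device index, so each device is
queried at most once.

```python
import functools


def query_device_properties(device: int) -> dict:
    """Hypothetical stand-in for a slow driver query (not Triton's API)."""
    # Placeholder values for illustration only.
    return {"max_shared_mem": 65536 + 1024 * device}


@functools.lru_cache(maxsize=None)
def max_shared_mem(device: int) -> int:
    """Return max shared memory for `device`, querying once per device."""
    return query_device_properties(device)["max_shared_mem"]


if __name__ == "__main__":
    max_shared_mem(0)  # first call for device 0: runs the query
    max_shared_mem(0)  # second call: served from the cache
    max_shared_mem(1)  # different device: runs the query again
    print(max_shared_mem.cache_info())
```

This assumes the property does not change for a given device over the
lifetime of the process, which is what makes caching it safe.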