Commit b015d80

Update on "[Executorch] Add quantized kv cache to oss ci"
Fixes to make sure quantized kv cache works in oss

Differential Revision: [D66269487](https://our.internmc.facebook.com/intern/diff/D66269487/)

[ghstack-poisoned]
2 parents e49b3ad + bfb6bcd commit b015d80

1 file changed: +2 -0 lines changed
examples/models/llama/source_transformation/quantized_kv_cache.py

Lines changed: 2 additions & 0 deletions
@@ -249,6 +249,8 @@ def from_float(cls, kv_cache, cache_type: QuantizedCacheType):
 def replace_kv_cache_with_quantized_kv_cache(module):
     # This is needed to ensure that custom ops are registered
     from executorch.extension.pybindings import portable_lib  # noqa # usort: skip
+    from executorch.extension.llm.custom_ops import custom_ops  # noqa: F401
+
     logging.warning(
         "Replacing KVCache with QuantizedKVCache. This modifies the model in place."
     )
