Commit 643086c

Update on "[Executorch] Add quantized kv cache to oss ci"
Fixes to make sure quantized kv cache works in oss

Differential Revision: [D66269487](https://our.internmc.facebook.com/intern/diff/D66269487/)

[ghstack-poisoned]
2 parents 753f87f + b06a09f commit 643086c

2 files changed: +3 -4 lines changed

examples/models/llama/source_transformation/quantized_kv_cache.py

Lines changed: 0 additions & 4 deletions

@@ -22,8 +22,6 @@
 
 import executorch
 
-from executorch.extension.pybindings import portable_lib  # noqa # usort: skip
-
 # Ideally package is installed in only one location but usage of
 # PYATHONPATH can result in multiple locations.
 # ATM this is mainly used in CI for qnn runner. Will need to revisit this
@@ -247,8 +245,6 @@ def from_float(cls, kv_cache, cache_type: QuantizedCacheType):
 
 
 def replace_kv_cache_with_quantized_kv_cache(module):
-    # This is needed to ensure that custom ops are registered
-    from executorch.extension.pybindings import portable_lib  # noqa # usort: skip
     from executorch.extension.llm.custom_ops import custom_ops  # noqa: F401
 
     logging.warning(

extension/llm/custom_ops/custom_ops.py

Lines changed: 3 additions & 0 deletions

@@ -26,6 +26,9 @@
 
 import executorch
 
+# This is needed to ensure that custom ops are registered
+from executorch.extension.pybindings import portable_lib  # noqa # usort: skip
+
 # Ideally package is installed in only one location but usage of
 # PYATHONPATH can result in multiple locations.
 # ATM this is mainly used in CI for qnn runner. Will need to revisit this
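
The change moves a side-effect import (one made only so that custom ops get registered) out of the caller and into `custom_ops.py`, so any code that imports `custom_ops` gets the registration automatically. The sketch below illustrates that general pattern with throwaway modules. All module and op names here (`demo_registry`, `demo_runtime`, `demo_custom_ops`, `demo::my_op`) are hypothetical stand-ins and are not ExecuTorch APIs. It assumes nothing about how `portable_lib` registers ops internally.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def write_module(directory, name, source):
    """Write a small throwaway module to `directory` (illustration only)."""
    (Path(directory) / f"{name}.py").write_text(textwrap.dedent(source))


tmp = tempfile.mkdtemp()

# Hypothetical op table, standing in for a runtime's operator registry.
write_module(tmp, "demo_registry", "OPS = set()\n")

# Hypothetical runtime module: importing it registers an op as a side effect.
write_module(tmp, "demo_runtime", """
    import demo_registry
    demo_registry.OPS.add("demo::my_op")
""")

# Hypothetical custom-ops module: it owns the side-effect import, so callers
# only need to import this one module.
write_module(tmp, "demo_custom_ops", """
    import demo_runtime  # noqa: F401  (imported only for its side effect)
    import demo_registry

    def my_op_available():
        return "demo::my_op" in demo_registry.OPS
""")

sys.path.insert(0, tmp)
importlib.invalidate_caches()

# The caller imports only the custom-ops module; registration comes with it.
import demo_custom_ops

print(demo_custom_ops.my_op_available())
```

Keeping the side-effect import next to the code that depends on it means each call site no longer has to remember the extra import, which is what the removed lines in `quantized_kv_cache.py` were doing.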
