1 parent 52ecd62 commit 06fca85
examples/models/llama/export_llama_lib.py
@@ -606,7 +606,9 @@ def _prepare_for_llama_export(args) -> LLMEdgeManager:
     )


-    # We want to do compute the actual ops in the precision of the dtype_override.
+    # We want to do compute the actual ops in the precision of the dtype_override,
+    # since the precision of the quantized linear will initially be the dtype of the
+    # checkpoint, not the dtype_override.
     def _set_precision_to_fp32(module):
         """
         Recursively iterate through the module and set the precision attribute
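
The diff is cut off before the body of `_set_precision_to_fp32`, so the actual implementation is not shown here. As a rough illustration of the idea described in the comment and docstring (walk the module tree and overwrite a precision attribute that would otherwise still reflect the checkpoint dtype), here is a minimal pure-Python sketch. `FakeModule`, `set_precision`, and the string values `"fp16"` and `"fp32"` are all hypothetical stand-ins, not ExecuTorch or PyTorch APIs, and the real function may well differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeModule:
    """Hypothetical stand-in for a model module (not a real library class)."""

    name: str
    # None means this module carries no precision attribute at all.
    precision: Optional[str] = None
    children: List["FakeModule"] = field(default_factory=list)


def set_precision(module: FakeModule, precision: str) -> None:
    """Recursively overwrite `precision` on every module that already has one."""
    for child in module.children:
        set_precision(child, precision)
    if module.precision is not None:
        module.precision = precision


# Assumption for illustration: the quantized linear starts out with the
# checkpoint dtype ("fp16"), while the requested override is "fp32".
model = FakeModule(
    "model",
    children=[
        FakeModule("block", children=[FakeModule("q_linear", precision="fp16")]),
        FakeModule("norm"),  # no precision attribute, left untouched
    ],
)
set_precision(model, "fp32")
```

In this sketch, modules without a precision attribute are skipped rather than given one, so only layers that track their own precision are affected.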