OpenVINO Export Llama Support #14022
Merged: meta-codesync merged 107 commits into pytorch:main from cavusmustafa:openvino_llama_support on Oct 14, 2025
Commits
| Commit | Message | Author |
|---|---|---|
| e489d6c | Runtime support for openvino quantized models | cavusmustafa |
| f0d901f | openvino export_llama_lib support | cavusmustafa |
| 24f2d93 | nncf pattern checker in openvino partitioner | cavusmustafa |
| 7dd8d0f | nncf compression init | anzr299 |
| ce0e809 | Merge pull request #5 from anzr299/origin/nncf_compression | cavusmustafa |
| 1716834 | openvino backend llama nncf support | cavusmustafa |
| 198190e | openvino quantizer init | anzr299 |
| 9a1dff2 | Merge pull request #7 from anzr299/origin/nncf_compression | cavusmustafa |
| 3d88a4e | Moved all openvino llama example changes into export_llama_lib | cavusmustafa |
| e81f60d | Removed openvino utils.py since it is not needed anymore | cavusmustafa |
| 457a868 | Update nncf_observers.py | anzr299 |
| 050448e | Merge pull request #8 from anzr299/patch-1 | cavusmustafa |
| d1e9330 | Add export llama runner build option into openvino build script | cavusmustafa |
| cedab9d | Update README.md | cavusmustafa |
| 1010323 | Merge branch 'main' into openvino_llama_support | cavusmustafa |
| cf0d3b7 | Merge branch 'main' into openvino_llama_support | suryasidd |
| e54f4c7 | Added CMAKE EXPORT Changes | suryasidd |
| c12a4ba | code formating updates | cavusmustafa |
| bf65943 | code formating changes | cavusmustafa |
| 30a1a25 | openvino quantizer refactored | anzr299 |
| 4cc7694 | fixes | anzr299 |
| 5da40a5 | support all_layers, backup mode in OVQuantizer | anzr299 |
| 9e65a7e | clean up and use new nncf method for obtaining compression parameters | anzr299 |
| 53e0f4c | review changes & update method names according to wc algo | anzr299 |
| bf95930 | review changes | anzr299 |
| 2d4bec7 | review changes | anzr299 |
| 0a2e361 | Update export_llama_lib.py | anzr299 |
| 4c86a9c | enable group_size parameter for nncf compression | cavusmustafa |
| 46ed3f6 | Update README.md | cavusmustafa |
| 0a1256e | Update README.md | cavusmustafa |
| f2151e3 | Update README.md | cavusmustafa |
| dfc8eab | openvino backend build script updates | cavusmustafa |
| 2ac8a8c | Update README.md | cavusmustafa |
| 35444ae | Update README.md | cavusmustafa |
| 1cfbf0b | Merge branch 'main' into openvino_llama_support | cavusmustafa |
| 5b8b633 | formatting fix | cavusmustafa |
| f4a1423 | formatting fix | cavusmustafa |
| 44f0883 | formatting fix | cavusmustafa |
| 5f657d3 | formatting fix | cavusmustafa |
| eafcc33 | formatting fix | cavusmustafa |
| 1763b99 | formatting fix | cavusmustafa |
| 4863826 | formatting fix | cavusmustafa |
| e24072f | formatting fix | cavusmustafa |
| b9bb5f0 | formatting fix | cavusmustafa |
| 291dcd9 | formatting fix | cavusmustafa |
| c8ea777 | use new transformations | anzr299 |
| a6b605f | add comment for manual MP allocation | anzr299 |
| 9614fc4 | remove nncf_compression from export llama lib | anzr299 |
| 45007cf | change pt2e quantize flag to use openvino_4wo instead of openvino_8da… | anzr299 |
| 9d49414 | follow up to last commit | anzr299 |
| d6727cf | update quantizer lib with openvino_4wo | anzr299 |
| 4a0a781 | split qspec function into 2 parts; 1 for WC and other for PTQ qspecs | anzr299 |
| f6a1ee3 | micro fix | anzr299 |
| d285fcc | udpate mixed precision layers for higher accuracy. Change INT4 mode t… | anzr299 |
| 4e66df1 | Apply suggestions from code review | anzr299 |
| e850e41 | Review changes | anzr299 |
| 204043f | review changes in quantizer | anzr299 |
| ae6b089 | revert extra args changes | anzr299 |
| a6f036c | Merge branch 'openvino_llama_support' of https://github.com/anzr299/e… | anzr299 |
| 2de5693 | precommit fixes | anzr299 |
| 0e10f28 | revert _calculate_qparams back to calculate_qparams | anzr299 |
| 05f5a92 | remove manual ignored nodes | anzr299 |
| fbe0e21 | add ratio to quantizer initialization | anzr299 |
| 6bff1cd | Update export_llama_lib.py | anzr299 |
| d744ae9 | Update quantizer_lib.py | anzr299 |
| 21c43fe | Merge pull request #9 from anzr299/an/ovquantizer | suryasidd |
| b874204 | Updated NNCF commit id | suryasidd |
| 08280ed | Merge branch 'main' into openvino_llama_support | suryasidd |
| 35f1d84 | Update README.md | cavusmustafa |
| 41ac36a | openvino llama export configuration - initial | cavusmustafa |
| 4426541 | Update README.md | cavusmustafa |
| 6b936c5 | Update README.md | cavusmustafa |
| 08461ec | updated ov llama config file | cavusmustafa |
| be85af8 | Update README.md | cavusmustafa |
| bba4a01 | Update README.md | cavusmustafa |
| 1421921 | Update README.md with quantization paragraph | anzr299 |
| cf0e71c | Merge pull request #10 from anzr299/patch-3 | cavusmustafa |
| f050eea | formatting fix | cavusmustafa |
| 4bfdca9 | Update README.md | cavusmustafa |
| 16aba1b | Update non_cpu_backends.md for OpenVINO instructions | cavusmustafa |
| 155529f | Update llama instructions link for OpenVINO backend | cavusmustafa |
| 5875aa8 | Remove OpenVINO from non_cpu_backends.md | cavusmustafa |
| 2630fd6 | Update llama instructions for OpenVINO backend | cavusmustafa |
| 6d0cbc5 | Removed the comma which was added by mistake | cavusmustafa |
| 3fbefec | Added NPU in choices | suryasidd |
| c97bd09 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 12e51c7 | Fixed ref links | suryasidd |
| d3d3ae0 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 72331f5 | Added Remove clone ops transformation to OpenVINO backend | suryasidd |
| 8016165 | Fixed variable names | suryasidd |
| f0d9fc7 | Added extended support list for openvino backend | cavusmustafa |
| 9b41c28 | formating fix | cavusmustafa |
| e751726 | formatting fix | cavusmustafa |
| 1736571 | Merge pull request #11 from cavusmustafa/remove_clone_ops | cavusmustafa |
| 8106204 | Added DimorderOpsRevertPass to Openvino backend | suryasidd |
| 04ca3f3 | Merge remote-tracking branch 'cavus/main' into openvino_llama_support | suryasidd |
| 62f74a8 | Merge branch 'main' into openvino_llama_support | suryasidd |
| eaf0e17 | Fixed linter issues | suryasidd |
| 15f5e23 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 3b358d5 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 8efba17 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 229bbd2 | Use defualt runner for OpenVINO backend as well | suryasidd |
| 0525d9c | Merge pull request #12 from suryasidd/runner_changes | cavusmustafa |
| 24f67b6 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 82bc4c5 | Merge branch 'main' into openvino_llama_support | suryasidd |
| 1428d81 | Changed quantization scheme | suryasidd |
| caba225 | Merge branch 'main' into openvino_llama_support | suryasidd |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,3 +1,3 @@` |
| 1 | | `-from .quantizer import OpenVINOQuantizer, quantize_model` |
| | 1 | `+from .quantizer import OpenVINOQuantizer, QuantizationMode, quantize_model` |
| 2 | 2 | |
| 3 | | `-__all__ = ["OpenVINOQuantizer", "quantize_model"]` |
| | 3 | `+__all__ = ["OpenVINOQuantizer", "quantize_model", "QuantizationMode"]` |
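The hunk above re-exports `QuantizationMode` from the package's `__init__.py` and adds it to `__all__`. As a rough illustration of what that changes for callers, the sketch below builds a throwaway package with the same shape and checks which names a star import exposes. The package name `demo_quantizer_pkg`, the enum member, and the stub class and function bodies are all placeholders invented for this example; the real definitions in the OpenVINO backend are not shown in the diff.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway package (hypothetical name) that mirrors the shape of the
# changed __init__.py. The stand-in definitions are placeholders, not the real
# OpenVINO backend code.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_quantizer_pkg"
pkg.mkdir()

(pkg / "quantizer.py").write_text(textwrap.dedent('''
    from enum import Enum

    class QuantizationMode(Enum):
        # Placeholder member; the real enum members are not shown in the diff.
        PLACEHOLDER = "placeholder"

    class OpenVINOQuantizer:
        pass

    def quantize_model(model):
        return model
'''))

# Same two statements as the "+" lines of the hunk.
(pkg / "__init__.py").write_text(textwrap.dedent('''
    from .quantizer import OpenVINOQuantizer, QuantizationMode, quantize_model

    __all__ = ["OpenVINOQuantizer", "quantize_model", "QuantizationMode"]
'''))

sys.path.insert(0, str(root))
demo = importlib.import_module("demo_quantizer_pkg")

# A star import copies exactly the names listed in __all__.
namespace = {}
exec("from demo_quantizer_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

With the pre-change `__all__`, the explicit import `from demo_quantizer_pkg import QuantizationMode` would fail because `__init__.py` never bound the name, and a star import would omit it as well.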