5454 < div class ="local-toc "> < ul >
5555< li > < a class ="reference internal " href ="# "> Intel® Extension for PyTorch* Large Language Model (LLM) Feature Get Started For Llama 3 models</ a > </ li >
5656< li > < a class ="reference internal " href ="#environment-setup "> 1. Environment Setup</ a > < ul >
57- < li > < a class ="reference internal " href ="#conda-based-environment-setup-with-pre-built-wheels-on-windows-11-home "> 1.1 Conda-based environment setup with pre-built wheels on Windows 11 Home </ a > </ li >
57+ < li > < a class ="reference internal " href ="#conda-based-environment-setup-with-pre-built-wheels-on-windows-11 "> 1.1 Conda-based environment setup with pre-built wheels on Windows 11</ a > </ li >
5858</ ul >
5959</ li >
6060< li > < a class ="reference internal " href ="#how-to-run-llama-3 "> 2. How To Run Llama 3</ a > < ul >
6161< li > < a class ="reference internal " href ="#usage-of-running-llama-3-models "> 2.1 Usage of running Llama 3 models</ a > < ul >
6262< li > < a class ="reference internal " href ="#int4-woq-model "> 2.1.1 INT4 WOQ Model</ a > </ li >
63- < li > < a class ="reference internal " href ="#measure-llama-3-woq-int4-performance-on-windows-11-home "> 2.1.2 Measure Llama 3 WOQ INT4 Performance on Windows 11 Home </ a > </ li >
64- < li > < a class ="reference internal " href ="#validate-llama-3-woq-int4-accuracy-on-windows-11-home "> 2.1.3 Validate Llama 3 WOQ INT4 Accuracy on Windows 11 Home </ a > </ li >
63+ < li > < a class ="reference internal " href ="#measure-llama-3-woq-int4-performance-on-windows-11 "> 2.1.2 Measure Llama 3 WOQ INT4 Performance on Windows 11</ a > </ li >
64+ < li > < a class ="reference internal " href ="#validate-llama-3-woq-int4-accuracy-on-windows-11 "> 2.1.3 Validate Llama 3 WOQ INT4 Accuracy on Windows 11</ a > </ li >
6565</ ul >
6666</ li >
6767< li > < a class ="reference internal " href ="#miscellaneous-tips "> Miscellaneous Tips</ a > </ li >
@@ -99,8 +99,8 @@ <h1>Intel® Extension for PyTorch* Large Language Model (LLM) Feature Get Starte
9999</ section >
100100< section id ="environment-setup ">
101101< h1 > 1. Environment Setup< a class ="headerlink " href ="#environment-setup " title ="Link to this heading "> </ a > </ h1 >
102- < section id ="conda-based-environment-setup-with-pre-built-wheels-on-windows-11-home ">
103- < h2 > 1.1 Conda-based environment setup with pre-built wheels on Windows 11 Home < a class ="headerlink " href ="#conda-based-environment-setup-with-pre-built-wheels-on-windows-11-home " title ="Link to this heading "> </ a > </ h2 >
102+ < section id ="conda-based-environment-setup-with-pre-built-wheels-on-windows-11 ">
103+ < h2 > 1.1 Conda-based environment setup with pre-built wheels on Windows 11< a class ="headerlink " href ="#conda-based-environment-setup-with-pre-built-wheels-on-windows-11 " title ="Link to this heading "> </ a > </ h2 >
104104< div class ="highlight-bash notranslate "> < div class ="highlight "> < pre > < span > </ span > < span class ="c1 "> # Install Visual Studio 2022</ span >
105105https://visualstudio.microsoft.com/zh-hans/thank-you-downloading-visual-studio/?sku< span class ="o "> =</ span > Community< span class ="p "> &</ span > < span class ="nv "> channel</ span > < span class ="o "> =</ span > Release< span class ="p "> &</ span > < span class ="nv "> version</ span > < span class ="o "> =</ span > VS2022< span class ="p "> &</ span > < span class ="nv "> source</ span > < span class ="o "> =</ span > VSLandingPage< span class ="p "> &</ span > < span class ="nv "> cid</ span > < span class ="o "> =</ span > < span class ="m "> 2030</ span > < span class ="p "> &</ span > < span class ="nv "> passive</ span > < span class ="o "> =</ span > < span class ="nb "> false</ span >
106106
@@ -122,7 +122,7 @@ <h2>1.1 Conda-based environment setup with pre-built wheels on Windows 11 Home<a
122122pip< span class ="w "> </ span > install< span class ="w "> </ span > https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_dev/xpu/torch-2.1.0a0%2Bgit04048c2-cp39-cp39-win_amd64.whl
123123
124124< span class ="c1 "> # Install Intel® Extension for PyTorch*</ span >
125- pip< span class ="w "> </ span > install< span class ="w "> </ span > https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_dev/xpu/intel_extension_for_pytorch-2.1.30%2Bgit03c5535 -cp39-cp39-win_amd64.whl
125+ pip< span class ="w "> </ span > install< span class ="w "> </ span > https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_dev/xpu/intel_extension_for_pytorch-2.1.30%2Bgit6661060 -cp39-cp39-win_amd64.whl
126126
127127< span class ="c1 "> # Install Intel® Extension for Transformers*</ span >
128128git< span class ="w "> </ span > clone< span class ="w "> </ span > https://github.com/intel/intel-extension-for-transformers.git< span class ="w "> </ span > intel-extension-for-transformers< span class ="w "> </ span > -b< span class ="w "> </ span > xpu_lm_head< span class ="w "> </ span >
@@ -221,8 +221,8 @@ <h3>2.1.1 INT4 WOQ Model<a class="headerlink" href="#int4-woq-model" title="Link
221221</ div >
222222< p > The int4 model is saved in folder ~/llama3_all_int4.</ p >
223223</ section >
224- < section id ="measure-llama-3-woq-int4-performance-on-windows-11-home ">
225- < h3 > 2.1.2 Measure Llama 3 WOQ INT4 Performance on Windows 11 Home < a class ="headerlink " href ="#measure-llama-3-woq-int4-performance-on-windows-11-home " title ="Link to this heading "> </ a > </ h3 >
224+ < section id ="measure-llama-3-woq-int4-performance-on-windows-11 ">
225+ < h3 > 2.1.2 Measure Llama 3 WOQ INT4 Performance on Windows 11< a class ="headerlink " href ="#measure-llama-3-woq-int4-performance-on-windows-11 " title ="Link to this heading "> </ a > </ h3 >
226226< ul class ="simple ">
227227< li > < p > Command:</ p > </ li >
228228</ ul >
@@ -232,14 +232,20 @@ <h3>2.1.2 Measure Llama 3 WOQ INT4 Performance on Windows 11 Home<a class="heade
232232</ pre > </ div >
233233</ div >
234234</ section >
235- < section id ="validate-llama-3-woq-int4-accuracy-on-windows-11-home ">
236- < h3 > 2.1.3 Validate Llama 3 WOQ INT4 Accuracy on Windows 11 Home < a class ="headerlink " href ="#validate-llama-3-woq-int4-accuracy-on-windows-11-home " title ="Link to this heading "> </ a > </ h3 >
235+ < section id ="validate-llama-3-woq-int4-accuracy-on-windows-11 ">
236+ < h3 > 2.1.3 Validate Llama 3 WOQ INT4 Accuracy on Windows 11< a class ="headerlink " href ="#validate-llama-3-woq-int4-accuracy-on-windows-11 " title ="Link to this heading "> </ a > </ h3 >
237237< ul class ="simple ">
238238< li > < p > Command:</ p > </ li >
239239</ ul >
240240< div class ="highlight-bash notranslate "> < div class ="highlight "> < pre > < span > </ span > < span class ="nb "> set</ span > < span class ="w "> </ span > < span class ="nv "> LLM_ACC_TEST</ span > < span class ="o "> =</ span > < span class ="m "> 1</ span > < span class ="w "> </ span >
241+ python< span class ="w "> </ span > run_generation_gpu_woq_for_llama.py< span class ="w "> </ span > --model< span class ="w "> </ span > < span class ="si "> ${</ span > < span class ="nv "> PATH</ span > < span class ="p "> /TO/MODEL</ span > < span class ="si "> }</ span > < span class ="w "> </ span > --accuracy< span class ="w "> </ span > --task< span class ="w "> </ span > < span class ="s2 "> "openbookqa"</ span >
241242python< span class ="w "> </ span > run_generation_gpu_woq_for_llama.py< span class ="w "> </ span > --model< span class ="w "> </ span > < span class ="si "> ${</ span > < span class ="nv "> PATH</ span > < span class ="p "> /TO/MODEL</ span > < span class ="si "> }</ span > < span class ="w "> </ span > --accuracy< span class ="w "> </ span > --task< span class ="w "> </ span > < span class ="s2 "> "piqa"</ span >
243+ python< span class ="w "> </ span > run_generation_gpu_woq_for_llama.py< span class ="w "> </ span > --model< span class ="w "> </ span > < span class ="si "> ${</ span > < span class ="nv "> PATH</ span > < span class ="p "> /TO/MODEL</ span > < span class ="si "> }</ span > < span class ="w "> </ span > --accuracy< span class ="w "> </ span > --task< span class ="w "> </ span > < span class ="s2 "> "rte"</ span >
244+ python< span class ="w "> </ span > run_generation_gpu_woq_for_llama.py< span class ="w "> </ span > --model< span class ="w "> </ span > < span class ="si "> ${</ span > < span class ="nv "> PATH</ span > < span class ="p "> /TO/MODEL</ span > < span class ="si "> }</ span > < span class ="w "> </ span > --accuracy< span class ="w "> </ span > --task< span class ="w "> </ span > < span class ="s2 "> "truthfulqa_mc1"</ span >
245+
242246*Note:*< span class ="w "> </ span > replace< span class ="w "> </ span > < span class ="si "> ${</ span > < span class ="nv "> PATH</ span > < span class ="p "> /TO/MODEL</ span > < span class ="si "> }</ span > < span class ="w "> </ span > with< span class ="w "> </ span > actual< span class ="w "> </ span > Llama< span class ="w "> </ span > < span class ="m "> 3</ span > < span class ="w "> </ span > INT4< span class ="w "> </ span > model< span class ="w "> </ span > < span class ="nb "> local</ span > < span class ="w "> </ span > path
247+ *Note:*< span class ="w "> </ span > you< span class ="w "> </ span > may< span class ="w "> </ span > validate< span class ="w "> </ span > the< span class ="w "> </ span > Llama< span class ="w "> </ span > < span class ="m "> 3</ span > < span class ="w "> </ span > WOQ< span class ="w "> </ span > INT4< span class ="w "> </ span > accuracy< span class ="w "> </ span > using< span class ="w "> </ span > any< span class ="w "> </ span > task< span class ="w "> </ span > listed< span class ="w "> </ span > above,< span class ="w "> </ span > such< span class ="w "> </ span > as< span class ="w "> </ span > the< span class ="w "> </ span > first< span class ="w "> </ span > < span class ="nb "> command</ span > < span class ="w "> </ span > with< span class ="w "> </ span > < span class ="s2 "> "openbookqa"</ span > < span class ="w "> </ span > only,
248+ or< span class ="w "> </ span > validate< span class ="w "> </ span > all< span class ="w "> </ span > of< span class ="w "> </ span > them,< span class ="w "> </ span > depending< span class ="w "> </ span > on< span class ="w "> </ span > your< span class ="w "> </ span > needs.< span class ="w "> </ span > Please< span class ="w "> </ span > expect< span class ="w "> </ span > more< span class ="w "> </ span > < span class ="nb "> time</ span > < span class ="w "> </ span > needed< span class ="w "> </ span > < span class ="k "> for</ span > < span class ="w "> </ span > executing< span class ="w "> </ span > more< span class ="w "> </ span > than< span class ="w "> </ span > one< span class ="w "> </ span > task.
243249</ pre > </ div >
244250</ div >
245251</ section >
@@ -264,7 +270,7 @@ <h2>Miscellaneous Tips<a class="headerlink" href="#miscellaneous-tips" title="Li
264270 Built with < a href ="https://www.sphinx-doc.org/ "> Sphinx</ a > using a
265271 < a href ="https://github.com/readthedocs/sphinx_rtd_theme "> theme</ a >
266272 provided by < a href ="https://readthedocs.org "> Read the Docs</ a > .
267- < jinja2 .runtime.BlockReference object at 0x7f1deffaf130 >
273+ < jinja2 .runtime.BlockReference object at 0x7fcd27dcb7c0 >
268274< p > </ p > < div > < a href ='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html ' data-cookie-notice ='true '> Cookies</ a > < a href ='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html '> | Privacy</ a > < a href ="/# " data-wap_ref ="dns " id ="wap_dns "> < small > | Your Privacy Choices</ small > </ a > < a href =https://www.intel.com/content/www/us/en/privacy/privacy-residents-certain-states.html data-wap_ref ="nac " id ="wap_nac "> < small > | Notice at Collection</ small > </ a > </ div > < p > </ p > < div > © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document, with the sole exception that code included in this document is licensed subject to the Zero-Clause BSD open source license (OBSD), < a href ='http://opensource.org/licenses/0BSD '> http://opensource.org/licenses/0BSD</ a > . </ div >
269275
270276
@@ -280,4 +286,4 @@ <h2>Miscellaneous Tips<a class="headerlink" href="#miscellaneous-tips" title="Li
280286 </ script >
281287
282288</ body >
283- </ html >
289+ </ html >
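The four accuracy commands added in section 2.1.3 differ only in the value passed to `--task`, so they can be generated from a list rather than typed one by one. The following is a minimal sketch, not part of the documented workflow: the script name, the `--model`, `--accuracy`, and `--task` flags, the task names, and the `LLM_ACC_TEST=1` variable are taken from the diff above, while the helper names `build_accuracy_commands` and `run_accuracy` are hypothetical and introduced only for illustration.

```python
import os
import subprocess
import sys

# Task names copied from the four commands in section 2.1.3.
TASKS = ["openbookqa", "piqa", "rte", "truthfulqa_mc1"]


def build_accuracy_commands(model_path, tasks=TASKS):
    """Return one argument list per task (hypothetical helper, not from the docs)."""
    return [
        [sys.executable, "run_generation_gpu_woq_for_llama.py",
         "--model", model_path, "--accuracy", "--task", task]
        for task in tasks
    ]


def run_accuracy(model_path, tasks=TASKS):
    """Run the tasks one after another with LLM_ACC_TEST=1 in the environment."""
    # Equivalent to `set LLM_ACC_TEST=1` in the documented command block.
    env = dict(os.environ, LLM_ACC_TEST="1")
    for cmd in build_accuracy_commands(model_path, tasks):
        # check=True stops at the first failing task instead of continuing silently.
        subprocess.run(cmd, env=env, check=True)
```

Calling `run_accuracy` with the local path of the Llama 3 INT4 model, from the directory that contains `run_generation_gpu_woq_for_llama.py`, would run all four tasks in sequence; passing a shorter `tasks` list runs a subset, which matches the note that running more tasks takes more time.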