Nixl optimization for llama4 local attention #87

Closed
mgoin wants to merge 59 commits into pd-launch-branch from nixl-l4-opt

Conversation

@mgoin (Member) commented May 15, 2025

No description provided.

ekagra-ranjan and others added 30 commits May 14, 2025 12:31
…aft model to free ~1GB for llama 3 model (vllm-project#17326)

Co-authored-by: root <root@ekagra-8xh100.us-east5-a.c.serving-efficiency-poc.internal>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Aaron Pham <Aaronpham0103@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
)

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…llm-project#18013)

Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Signed-off-by: David Xia <david@davidxia.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: omahs <73983677+omahs@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Alexei-V-Ivanov-AMD and others added 21 commits May 15, 2025 11:01
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
…-project#18229)

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
…attention on ROCm (vllm-project#18093)

Signed-off-by: kf <kuanfu.liu@embeddedllm.com>
Signed-off-by: lisiqi23 <lisiqi23@xiaomi.com>
Signed-off-by: skylee-01 <497627264@qq.com>
Co-authored-by: lisiqi23 <lisiqi23@xiaomi.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: David Xia <david@davidxia.com>
vllm-project#17973)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@centml.ai>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Felix Marty <felmarty@amd.com>
Signed-off-by: learner0810 <zhongjun.li@daocloud.io>
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
@github-actions

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions bot added the stale label Aug 15, 2025
@github-actions

This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!

@github-actions bot closed this Sep 15, 2025