Conversation

@IanWood1 (Contributor) commented Jun 10, 2025

Export command:

python3 -m sharktank.examples.export_paged_llm_v1 \
	--irpa-file=/shark-dev/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16.irpa \
	--output-mlir=model.mlir --output-config=/dev/null --bs-prefill=4 \
	--bs-decode=4  --attention-dtype=float16 --activation-dtype=float16 \
	--use-attention-mask --use-hf --kv-cache-dtype=float16
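
(For reference, this exports the fp16 Llama 3.1 8B weights to model.mlir with prefill and decode batch sizes of 4, float16 activations, attention, and KV cache, an explicit attention mask, and the HF layout.)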

@IanWood1 marked this pull request as ready for review June 10, 2025 21:56
@kuhar (Member) left a comment

Do we know what changed?

(Maybe should be fine to update the IR regardless?)

@IanWood1 (Contributor, Author) commented Jun 11, 2025

> Do we know what changed?
>
> (Maybe should be fine to update the IR regardless?)

Looks like there are some reshapes that aren't getting folded, causing iree_linalg_ext.gather to fail to fuse. llvm/llvm-project#142827 fixes this for llama and mistral. With that change, the before vs. after decode times are about equal.
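
For illustration, here is a minimal sketch (hypothetical names and shapes, not the actual sharktank code) of the kind of pattern involved: a gather whose indices and result pass through reshapes. If those reshapes are not folded away before dispatch formation, the gather they wrap can fail to fuse with its neighbors.

```python
# Hypothetical sketch of a gather-through-reshape pattern; names and
# shapes are illustrative, not taken from sharktank.
import torch

def paged_kv_lookup(cache: torch.Tensor, page_ids: torch.Tensor) -> torch.Tensor:
    # cache:    [num_pages, row_width]  flattened paged KV storage
    # page_ids: [batch, pages_per_seq]  page-table entries per sequence
    flat_ids = page_ids.reshape(-1)                # reshape feeding the gather
    rows = torch.index_select(cache, 0, flat_ids)  # lowers to a gather
    # reshape consuming the gather's result
    return rows.reshape(*page_ids.shape, cache.shape[-1])

# Tiny usage example: 8 pages, 2 sequences of 3 pages each.
cache = torch.randn(8, 16, dtype=torch.float16)
page_ids = torch.tensor([[0, 2, 4], [1, 3, 5]])
assert paged_kv_lookup(cache, page_ids).shape == (2, 3, 16)
```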

@kuhar (Member) commented Jun 12, 2025

> Looks like there are some reshapes that aren't getting folded, causing iree_linalg_ext.gather to fail to fuse.

@Groverkss

@IanWood1 (Contributor, Author) commented Jun 20, 2025

Updating both instead in #16

@IanWood1 closed this Jun 20, 2025