Commit 1e15c1a
Added a check so that the Custom_IO YAML is not passed when the model weights and pkv are both in bfloat16.
Added a patch in cloud infer to map the bfloat16 (or 11) key type to np.float16 for AI200 inference.
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
1 parent 6af261e commit 1e15c1a
2 files changed, +9 -1 lines changed

First file (diff content not preserved):

| Original file lines | Diff lines | Change |
|---|---|---|
| 545-547 | 545-547 | unchanged |
| | 548-549 | + |
| 548-552 | 550-554 | unchanged |
| 553 | | - |
| | 555-560 | + |
| 554-556 | 561-563 | unchanged |

Second file (diff content not preserved):

| Original file lines | Diff lines | Change |
|---|---|---|
| 65-67 | 65-67 | unchanged |
| | 68 | + |
| 68-70 | 69-71 | unchanged |
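
The two changes named in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names `should_pass_custom_io` and `patch_dtype_map`, the string dtype labels, and the shape of the mapping are all hypothetical. Only the rule "skip Custom_IO when weights and pkv are both bfloat16" and the mapping of `bfloat16` / `11` to `np.float16` come from the message. NumPy has no built-in bfloat16 dtype, which is presumably why a 16-bit stand-in is used.

```python
import numpy as np


def should_pass_custom_io(weight_dtype: str, pkv_dtype: str) -> bool:
    """Hypothetical helper: decide whether to pass the Custom_IO YAML.

    Returns False only when both the model weights and the past key
    values (pkv) are bfloat16, as described in the commit message.
    """
    return not (weight_dtype == "bfloat16" and pkv_dtype == "bfloat16")


def patch_dtype_map(dtype_map: dict) -> dict:
    """Hypothetical helper: return a copy of a type-key -> NumPy dtype map
    with the bfloat16 keys added.

    Both the string key "bfloat16" and the integer key 11 are mapped to
    np.float16; the input mapping is left unmodified.
    """
    patched = dict(dtype_map)
    patched["bfloat16"] = np.dtype(np.float16)
    patched[11] = np.dtype(np.float16)
    return patched


if __name__ == "__main__":
    # Custom_IO is skipped only for the all-bfloat16 case.
    print(should_pass_custom_io("bfloat16", "bfloat16"))
    print(should_pass_custom_io("bfloat16", "float16"))

    # The base map here is an assumed example, not the project's real table.
    base = {"float32": np.dtype(np.float32)}
    print(patch_dtype_map(base)[11])
```

Returning a patched copy rather than mutating the input keeps the sketch free of side effects; the actual patch may well modify an existing table in place.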