Commit 8680d1a
Added check to not pass Custom_IO yaml when model weight and pkv are both in bfloat16.
Added a patch incloud infer to map bfloat16 or 11 key type to np.float16 for AI200 inference.
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>1 parent 2915e49 commit 8680d1a
2 files changed
+9
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
538 | 538 | | |
539 | 539 | | |
540 | 540 | | |
| 541 | + | |
| 542 | + | |
541 | 543 | | |
542 | 544 | | |
543 | 545 | | |
544 | 546 | | |
545 | 547 | | |
546 | | - | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
| 552 | + | |
| 553 | + | |
547 | 554 | | |
548 | 555 | | |
549 | 556 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
65 | 65 | | |
66 | 66 | | |
67 | 67 | | |
| 68 | + | |
68 | 69 | | |
69 | 70 | | |
70 | 71 | | |
| |||
0 commit comments