[CI][TorchModels] Update llama 8b fp16 golden time (iree-org#22426)

jtuyls · pstarkcdpr · commit c39572af6bb7 · 2025-11-28T13:55:09.000-08:00
The prefill latency of llama 8b f16 at sequence length 128 should have improved with iree-org#22393, so this PR lowers the golden time to match (42.0 ms to 29.5 ms).

Signed-off-by: Jorn Tuyls <jorn.tuyls@gmail.com>
diff --git a/tests/external/iree-test-suites/torch_models/llama_8b_fp16/prefill_benchmark_seq128_mi325.json b/tests/external/iree-test-suites/torch_models/llama_8b_fp16/prefill_benchmark_seq128_mi325.json
@@ -36,5 +36,5 @@
         "value": "34x2097152xf16"
       }
     ],
-    "golden_time_ms": 42.0
+    "golden_time_ms": 29.5
 }
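
For readers unfamiliar with golden-time checks, the sketch below shows one way a benchmark harness could compare a measured latency against `golden_time_ms`. It is a minimal illustration under stated assumptions, not the actual iree-test-suites logic: the helper names `within_golden` and `relative_reduction`, the 10% tolerance, and the inline config excerpt are all hypothetical. Only the key `golden_time_ms` and the values 42.0 and 29.5 come from the diff.

```python
import json

# Hypothetical excerpt of the benchmark config; only "golden_time_ms"
# and its value are taken from the diff above.
CONFIG_TEXT = '{"golden_time_ms": 29.5}'


def within_golden(measured_ms, golden_ms, tolerance=0.10):
    """Return True if the measured latency does not exceed the golden time
    by more than the given relative tolerance (assumed value, not IREE's)."""
    return measured_ms <= golden_ms * (1.0 + tolerance)


def relative_reduction(old_ms, new_ms):
    """Fraction by which the golden time dropped from old_ms to new_ms."""
    return (old_ms - new_ms) / old_ms


golden_ms = json.loads(CONFIG_TEXT)["golden_time_ms"]

# A run at the old golden time would now be flagged as a regression,
# while a run near the new golden time passes.
print(within_golden(42.0, golden_ms))
print(within_golden(30.0, golden_ms))
print(f"{relative_reduction(42.0, golden_ms):.1%}")
```

Tightening the golden time after a known speedup keeps the check meaningful: with the old value of 42.0 ms left in place, a later regression back toward that latency could go unnoticed.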
