Commit 10f72fc
Fix flaky testPhi4 and testVoxtral by setting temperature=0 (#16517)
Summary:
Both tests were flaky because LLM outputs are non-deterministic with the
default temperature of 0.8, which uses RNG-based sampling with a
time-based seed. Setting temperature=0 enables greedy argmax decoding,
eliminating randomness and making assertions on generated text reliable.
This is consistent with how other LLM tests and production runners in
the codebase handle determinism (e.g., test_text_decoder_runner.cpp,
test_sampler.cpp, and QNN/QAI Hub runners).
This fixes 5 flaky tests.
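The actual sampler lives in the C++ runner, but the determinism argument can be sketched in a few lines. The `sample_token` helper below is hypothetical (not the codebase's API): with temperature=0 it takes the argmax of the logits, which is the same token on every run; with temperature>0 it draws from a softmax distribution, so the output depends on the RNG seed.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw logits.

    temperature == 0 -> greedy argmax decoding (fully deterministic).
    temperature > 0  -> softmax sampling over temperature-scaled logits,
                        which depends on the rng's seed.
    """
    if temperature == 0:
        # Greedy decoding: always the highest-logit token, no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature sampling: softmax over scaled logits, then one draw.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [1.0, 3.0, 2.0]
# Greedy always picks token 1 (the argmax), whatever the seed is.
assert all(sample_token(logits, 0, random.Random(s)) == 1 for s in range(10))
```

With a time-based seed, the `temperature > 0` branch can pick a different token on each test run, which is why assertions on the generated text were flaky; pinning temperature to 0 collapses the distribution to a single deterministic path.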
Reviewed By: shoumikhin
Differential Revision: D903611871
Parent: f0edae2
File tree: extension/llm/apple/ExecuTorchLLM/__tests__

2 files changed: +3 −0

Lines changed: 1 addition & 0 deletions
(diff table: one line added at new line 241; diff content not captured)
Lines changed: 2 additions & 0 deletions
(diff table: lines added at new lines 90 and 104; diff content not captured)