Commit 395d3f5

enable parallel prefill again
Differential Revision: D61751873
Pull Request resolved: #4893
1 parent f92139f commit 395d3f5

2 files changed: +1 -2 lines changed

examples/models/llama2/runner/runner.cpp

Lines changed: 1 addition & 1 deletion
@@ -126,7 +126,7 @@ Error Runner::load() {
       tokenizer_.get(),
       text_decoder_runner_.get(),
       metadata_.at(kUseKVCache),
-      enable_parallel_prefill_);
+      metadata_.at(kEnableDynamicShape));
 
   text_token_generator_ = std::make_unique<TextTokenGenerator>(
       tokenizer_.get(),

examples/models/llama2/runner/runner.h

Lines changed: 0 additions & 1 deletion
@@ -45,7 +45,6 @@ class Runner {
 
  private:
   float temperature_;
-  bool enable_parallel_prefill_;
   bool shouldStop_{false};
 
   // model
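
The runner.cpp hunk replaces the `enable_parallel_prefill_` member with a value read from `metadata_.at(kEnableDynamicShape)`, so the flag now follows the loaded model's metadata instead of a separate field. The sketch below illustrates that gating in isolation. It is not the runner's code: `count_prefill_calls`, `kEnableDynamicShapeKey`, the `"enable_dynamic_shape"` string, and the map type are hypothetical names chosen for illustration. It also assumes that parallel prefill means feeding the whole prompt in one forward pass, while the fallback feeds one token per pass.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical metadata key; stands in for kEnableDynamicShape from the diff.
static const std::string kEnableDynamicShapeKey = "enable_dynamic_shape";

// Returns how many forward passes prefill would need for the given prompt.
// Assumption: with parallel prefill the whole prompt is processed in one pass;
// without it, each prompt token gets its own pass.
int64_t count_prefill_calls(
    const std::vector<uint64_t>& prompt_tokens,
    const std::unordered_map<std::string, int64_t>& metadata) {
  assert(!prompt_tokens.empty());
  // at() throws std::out_of_range if the key is missing, so a model that
  // lacks the metadata entry fails loudly instead of silently defaulting.
  const bool enable_parallel_prefill =
      metadata.at(kEnableDynamicShapeKey) != 0;
  if (enable_parallel_prefill) {
    return 1;
  }
  return static_cast<int64_t>(prompt_tokens.size());
}
```

Under these assumptions, a four-token prompt needs one pass when the metadata value is nonzero and four passes when it is zero. Deriving the flag from metadata removes the chance of the member being left uninitialized or set inconsistently with the loaded model, which is presumably why the member is deleted in runner.h.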
