Commit b38ccd5

streamingLLM fix2
1 parent 1ac70dc commit b38ccd5

1 file changed: +1 -1 lines changed

src/models/jiuge/jiuge.cpp

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ void inferDeviceBatch(const JiugeMeta &meta, DeviceResource &rsrc,
     // sparse attention
     auto ratio = 0.2;
     int attentionSinkWindow = 4;
-    bool sparseOn = true
+    bool sparseOn = true;
 
     // Allocate buffers
     auto logits_in = Tensor::buffer(dt_logits, {ntok, d}, rsrc.memory_pool);
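
The hunk only restores a missing semicolon, and it does not show how `ratio`, `attentionSinkWindow`, and `sparseOn` are used later in `inferDeviceBatch`. As background, here is a minimal sketch of a StreamingLLM-style selection rule. It rests on two assumptions: the first `sinkCount` positions are always kept as attention sinks, and `ratio` is the fraction of the most recent positions retained. The helper `keptPositions` is hypothetical and is not code from jiuge.cpp, and the repository may use these values differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from jiuge.cpp): returns the token positions a
// StreamingLLM-style sparse attention pass would keep in the KV cache.
// Assumption: `ratio` is the fraction of the sequence kept as a recent window.
std::vector<std::size_t> keptPositions(std::size_t seqLen,
                                       std::size_t sinkCount,
                                       double ratio) {
    // Size of the recent window, clamped to the sequence length.
    std::size_t recent = static_cast<std::size_t>(
        std::ceil(ratio * static_cast<double>(seqLen)));
    recent = std::min(recent, seqLen);

    // Number of leading "attention sink" positions, also clamped.
    std::size_t sinks = std::min(sinkCount, seqLen);
    std::size_t recentStart = seqLen - recent;

    std::vector<std::size_t> kept;
    kept.reserve(sinks + recent);

    // Always keep the first `sinks` positions.
    for (std::size_t i = 0; i < sinks; ++i) {
        kept.push_back(i);
    }
    // Keep the recent window, skipping any overlap with the sinks.
    for (std::size_t i = std::max(sinks, recentStart); i < seqLen; ++i) {
        kept.push_back(i);
    }
    return kept;
}
```

Under these assumptions, a short sequence that fits entirely within the sink count is kept whole, and longer sequences keep the sink positions plus a tail whose length scales with `ratio`.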
