Commit 9f8d27b

fix: Qwen3 memory estimation
1 parent 2171284 commit 9f8d27b

2 files changed: +3 -1 lines changed

docs/blog/v3.12-gpt-oss.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 title: gpt-oss is here!
-date: 2025-08-09T15:00:00Z
+date: 2025-08-09T18:00:00Z
 lastUpdated: false
 author:
     name: Gilad S.

src/gguf/insights/GgufInsights.ts

Lines changed: 2 additions & 0 deletions
@@ -310,6 +310,8 @@ export class GgufInsights {
 // )
 // );
 // }
+} else if (this._ggufFileInfo.metadata.general?.architecture === GgufArchitectureType.qwen3) {
+    return int32TBytes * batchSize * (embeddingLength + (kvSize * headCount));
 } else if (expertCount > 0) {
     const expertsUsedCount = this._ggufFileInfo.architectureMetadata.expert_used_count ?? 2;
 
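
For review purposes, the expression returned by the new `qwen3` branch can be checked in isolation. The following is a minimal sketch, not code from the repository: the helper `estimateQwen3Bytes` and its input type are hypothetical, `int32TBytes` is assumed to be 4 (the byte size of a 32-bit integer), and the sample values are made up rather than read from any real GGUF file.

```typescript
// Hypothetical standalone helper; only the arithmetic is taken from the diff.
const int32TBytes = 4; // assumption: bytes in a 32-bit integer

interface Qwen3EstimateInput {
    batchSize: number;
    embeddingLength: number;
    kvSize: number;
    headCount: number;
}

// Same expression as the `return` statement added in the qwen3 branch.
function estimateQwen3Bytes(input: Qwen3EstimateInput): number {
    const {batchSize, embeddingLength, kvSize, headCount} = input;
    return int32TBytes * batchSize * (embeddingLength + (kvSize * headCount));
}

// Made-up values, chosen only to exercise the formula.
const sampleBytes = estimateQwen3Bytes({
    batchSize: 512,
    embeddingLength: 4096,
    kvSize: 8192,
    headCount: 32
});

console.log(`${sampleBytes} bytes (${(sampleBytes / 1024 ** 2).toFixed(1)} MiB)`);
```

With these sample values the `kvSize * headCount` term (262144) dominates `embeddingLength` (4096), so the result scales almost linearly with both the KV size and the head count.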
