Commit eb9084a

⬆️ Use Correct Encoding Type for GPT-4o (#441)
Upgraded jtokkit from 0.6.1 to 1.1.0, which supports the O200K_BASE encoding type that GPT-4o token counting needs.
1 parent 9170c87 commit eb9084a

2 files changed: 2 additions, 6 deletions

gradle/libs.versions.toml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ javaparser-symbolsolver = "3.15.15"
 java-security-toolkit = "1.2.0"
 java-security-toolkit-xstream = "1.0.2"
 javax-inject = "1"
-jtokkit = "0.6.1"
+jtokkit = "1.1.0"
 commons-jexl = "3.2.1"
 logback = "1.4.5"
 maven = "3.8.7"

plugins/codemodder-plugin-llm/src/main/java/io/codemodder/plugins/llm/StandardModel.java

Lines changed: 1 addition & 5 deletions
@@ -24,13 +24,9 @@ public int tokens(final List<String> messages) {
       }
     },
     GPT_4O_2024_05_13("gpt-4o-2024-05-13", 128_000) {
-      /**
-       * This is wrong - we copy / pasted from GPT 3.5 while we await GPT-4o token counting support <a
-       * href="https://github.com/knuddelsgmbh/jtokkit/issues/96">from upstream utility</a>.
-       */
       @Override
       public int tokens(final List<String> messages) {
-        return Tokens.countTokens(messages, 3, EncodingType.CL100K_BASE);
+        return Tokens.countTokens(messages, 3, EncodingType.O200K_BASE);
       }
     };
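
For reviewers unfamiliar with why the encoding type matters, here is a minimal, dependency-free sketch of per-message token counting. It is not the codemodder `Tokens.countTokens` implementation: the class `TokenCountSketch`, the `whitespaceTokens` stand-in tokenizer, and the reading of the `3` argument as a fixed per-message overhead are all assumptions made for illustration. In real code the counter would presumably be backed by a jtokkit encoding such as `O200K_BASE`, and a different encoding yields a different count for the same text.

```java
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical sketch of per-message token counting; not the codemodder implementation. */
final class TokenCountSketch {

  /**
   * Sums a fixed per-message overhead plus the tokens in each message.
   * The counter stands in for an encoding-specific tokenizer (assumption).
   */
  static int countTokens(
      final List<String> messages,
      final int tokensPerMessage,
      final ToIntFunction<String> counter) {
    int total = 0;
    for (final String message : messages) {
      total += tokensPerMessage + counter.applyAsInt(message);
    }
    return total;
  }

  /** Placeholder tokenizer: one token per whitespace-separated word. */
  static int whitespaceTokens(final String text) {
    final String trimmed = text.trim();
    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
  }

  /** Checks a token count against a model's context window, e.g. 128_000. */
  static boolean fits(final int tokens, final int contextWindow) {
    return tokens <= contextWindow;
  }

  public static void main(final String[] args) {
    final List<String> messages =
        List.of("You are a code reviewer.", "Fix the bug in Foo.java");
    final int tokens = countTokens(messages, 3, TokenCountSketch::whitespaceTokens);
    System.out.println(tokens + " tokens, fits: " + fits(tokens, 128_000));
  }
}
```

Passing the tokenizer in as a function keeps the counting logic independent of the encoding, which mirrors how this commit changes only the `EncodingType` argument and leaves the rest of the call untouched.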
