OpenAI Harmony
Park Woorak edited this page Jan 26, 2026 · 2 revisions
Once the "final" channel message based on it has been generated, the content of the "analysis" channel should be excluded from the rendered conversation (unless auto_drop_analysis = false is specified in the configuration). For this, the index of the last "final" message (last_final_idx) is needed rather than the first (first_final_idx), so .position() is replaced with .rposition() and the variable names are updated to match.
diff --git a/src/encoding.rs b/src/encoding.rs
index 60257e7..d4ef185 100644
--- a/src/encoding.rs
+++ b/src/encoding.rs
@@ -204,16 +204,16 @@ impl HarmonyEncoding {
let should_drop_analysis =
config.is_some_and(|c| c.auto_drop_analysis && last_assistant_is_final);
- let first_final_idx = messages
+ let last_final_idx = messages
.iter()
- .position(|msg| msg.channel.as_deref() == Some("final"));
+ .rposition(|msg| msg.channel.as_deref() == Some("final"));
let result = messages
.iter()
.enumerate()
.filter(|(idx, msg)| {
!(should_drop_analysis
- && first_final_idx.is_some_and(|first| *idx < first)
+ && last_final_idx.is_some_and(|last| *idx < last)
&& msg.channel.as_deref() == Some("analysis"))
})
.try_for_each(|(_, msg)| self.render_into(msg, into, Some(&render_options)));

The boolean variable last_assistant_is_final, used in the should_drop_analysis condition, is removed since it is no longer needed: with last_final_idx, any analysis message after the last "final" message is always kept, so the extra guard is redundant.
diff --git a/src/encoding.rs b/src/encoding.rs
index 60257e7..f1ce32b 100644
--- a/src/encoding.rs
+++ b/src/encoding.rs
@@ -192,17 +192,9 @@ impl HarmonyEncoding {
let render_options = RenderOptions {
conversation_has_function_tools: has_function_tools,
};
- let last_assistant_is_final = messages
- .iter()
- .rev()
- .find_map(|msg| {
- (msg.author.role == Role::Assistant)
- .then(|| msg.channel.as_deref() == Some("final"))
- })
- .unwrap_or(false);
let should_drop_analysis =
- config.is_some_and(|c| c.auto_drop_analysis && last_assistant_is_final);
+ config.is_some_and(|c| c.auto_drop_analysis);
let first_final_idx = messages
.iter()