Yes, inference-time techniques are still useful for reasoning/thinking models. We have several in the repo that demonstrate this, for instance autothink and deepthink. Even with reasoning models, both sequential inference-time scaling (generating more tokens) and parallel inference-time scaling (combining multiple parallel responses) improve accuracy. This is similar to how Grok-Heavy or Gemini-DeepThink work. In addition, there is work on making the reasoning itself more efficient, as we did in AutoThink: https://huggingface.co/blog/codelion/autothink
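As a rough illustration of the parallel case, here is a minimal self-consistency sketch: sample n independent responses and return the majority final answer. The `sample_answer` stub is hypothetical (a stand-in for a real reasoning-model call); it is not how autothink or deepthink are implemented.

```python
import random
from collections import Counter

def sample_answer(prompt: str, rng: random.Random) -> str:
    # Hypothetical stand-in for a reasoning-model call. A real setup would
    # sample a full chain of thought and extract the final answer; here we
    # just simulate a noisy model that is right ~60% of the time.
    return rng.choice(["42", "42", "42", "41", "43"])

def parallel_scale(prompt: str, n: int = 16, seed: int = 0) -> str:
    """Parallel inference-time scaling via majority voting:
    draw n independent samples and keep the most common answer."""
    rng = random.Random(seed)
    answers = [sample_answer(prompt, rng) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(parallel_scale("What is 6 * 7?"))
```

Larger n trades more compute for accuracy: individual samples are wrong ~40% of the time here, but the majority vote converges on the most frequent answer.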

Answer selected by itsmeknt