You can also use alternate decoding techniques such as `cot_decoding` and `entropy_decoding` directly with the local inference server.

```python
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    messages=messages,
    temperature=0.2,
    extra_body={
        "decoding": "cot_decoding",  # or "entropy_decoding"
        # CoT-specific params
        "k": 10,
        "aggregate_paths": True,
        # OR entropy-specific params
        "top_k": 27,
        "min_p": 0.03,
    }
)
```
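
The example shows both parameter sets side by side for reference; only the parameters that match the selected `decoding` technique apply.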

### Starting the optillm proxy with an external server (e.g. llama.cpp or ollama)

- Set the `OPENAI_API_KEY` env variable to a placeholder value (see the sketch below)
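
For illustration, a minimal sketch of this step, assuming the proxy is running locally on its default port (8000) and using `"no-key"` as an arbitrary placeholder value; adjust `base_url` if your proxy listens elsewhere:

```python
import os
from openai import OpenAI

# Any placeholder string works here: external servers such as llama.cpp
# and ollama typically do not validate the OpenAI API key.
os.environ["OPENAI_API_KEY"] = "no-key"

# Point the client at the optillm proxy (assumed default: http://localhost:8000/v1).
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="http://localhost:8000/v1",
)
```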