[INFERENCE PROVIDERS] function calling tutorial #1832
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Really cool guide, learnt a few things along the way ^^
```python
# Initialize client
client = OpenAI(
    base_url="https://router.huggingface.co/nebius/v1",
```
This is slightly misleading since it won't work the same for all providers. Using the URL "https://router.huggingface.co/v1" is more flexible and should work for all providers (provider selection when using this route is on its way server-side).
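For illustration, a minimal sketch of that suggestion; it assumes a Hugging Face token is available in the `HF_TOKEN` environment variable:

```python
import os
from openai import OpenAI

# Point the OpenAI client at the provider-agnostic router instead of a
# provider-specific endpoint, so the same code works across providers.
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],  # assumption: HF token stored in HF_TOKEN
)
```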
Got it, thanks. Agreed that the router makes sense here.
```diff
- base_url="https://router.huggingface.co/together/v1",
+ base_url="https://router.huggingface.co/nebius/v1",
```
Here is typically where using "https://router.huggingface.co/v1" would be better for server-side model resolution.
If you look at
- https://huggingface.co/deepseek-ai/DeepSeek-R1-0528?inference_api=true&language=python&inference_provider=together
- vs https://huggingface.co/deepseek-ai/DeepSeek-R1-0528?inference_api=true&language=python&inference_provider=nebius
you'll see the model id must be "deepseek-ai/DeepSeek-R1" for Together vs "deepseek-ai/DeepSeek-R1-0528" for Nebius. Once we have a proper "auto route with provider selection", this should hopefully be much simpler to use with the openai client.
(Using `InferenceClient`, it's much simpler since all the resolution is done client-side for you. We are currently closing the gap between the two clients.)
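A hedged sketch of that `InferenceClient` alternative, assuming a recent `huggingface_hub` version with the `provider` argument; the model id and prompt are illustrative:

```python
from huggingface_hub import InferenceClient

# Client-side resolution: the Hub model id works regardless of provider.
client = InferenceClient(provider="auto")  # let the client pick a provider

response = client.chat_completion(
    model="deepseek-ai/DeepSeek-R1-0528",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```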
> Here is typically where using "https://router.huggingface.co/v1" would be better for server-side model resolution.

Totally agree on this, but my main effort is to show the power of provider selection. I think this commit solves it: I show a working example with `provider="auto"` and the router, plus a diff showing the changes for other providers.
Very nice!
cc @sergiopaniego, we worked together on this notebook for tool calling with Llama (but most info should be general).
I also enjoyed @Rocketknight1's post, in particular the `get_json_schema` helper. Could we maybe leverage it here?
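For reference, a sketch of how that helper could be used: `get_json_schema` (in `transformers.utils`) builds an OpenAI-style tool schema from a typed, docstring-annotated Python function. The weather function here is a hypothetical example, not from the tutorial:

```python
from transformers.utils import get_json_schema

def get_current_weather(location: str, unit: str = "celsius"):
    """
    Get the current weather in a given location.

    Args:
        location: The city to get the weather for.
        unit: Temperature unit, "celsius" or "fahrenheit".
    """
    return {"location": location, "temperature": 22, "unit": unit}  # stub

# Produces a {"type": "function", "function": {...}} dict usable as a tool entry
tools = [get_json_schema(get_current_weather)]
```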
```python
# Get final response with function results
final_response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
```
Why do we use a different model here? (Not saying it's wrong, just pointing out it could be potentially puzzling)
Yeah. This is wrong.
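To make the fix concrete, a sketch of the corrected two-step flow with a single model throughout. It assumes the `client`, `tools`, and `get_current_weather` from the snippets above; the model id and user message are illustrative:

```python
import json

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # same model for both calls

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# Step 1: the model decides whether to call a function
first = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
tool_call = first.choices[0].message.tool_calls[0]

# Step 2: run the function locally and append its result to the conversation
result = get_current_weather(**json.loads(tool_call.function.arguments))
messages.append(first.choices[0].message)
messages.append(
    {"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result)}
)

# Get final response with function results, from the same model
final_response = client.chat.completions.create(model=MODEL, messages=messages)
print(final_response.choices[0].message.content)
```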
Loved it! 😄
Co-authored-by: Sergio Paniego Blanco <[email protected]> Co-authored-by: Pedro Cuenca <[email protected]>
…ttps://github.com/huggingface/hub-docs into function-calling-tutorial-for-inference-providers
@Wauplin @pcuenca @sergiopaniego Thanks for the reviews! I've responded to everything. Could I please get an approval?
Nice, some small assorted tweaks in #1837
This PR introduces a new tutorial on function calling.