Example Code

Failed to resolve model_id: Could not find model id for inference server: meta-llama/Meta-Llama-3.1-8B-Instruct. Make sure that your Hugging Face token has access to the endpoint.

Description

I'm trying to use LangGraph to build out a demo product. To keep costs as low as possible, I'm looking for a way to wrap HuggingFace's "serverless inference" option around an agent (not Inference Endpoints just yet!). I have tried the code suggested in the LangChain docs, but I think the API call is still looking for an inference endpoint server, as it returns an error with the model id (meta-llama/Meta-Llama-3.1-8B) as the endpoint_url. Thank you in advance to anyone who has any ideas or solutions! Hope you are having a great day!

System Info

n/a
Replies: 1 comment
-
Hello @kw1018! I'm here to help you with any bugs, questions, or contributions. Let's work together to solve this!

To integrate HuggingFace's "serverless inference" option with an agent in LangGraph while keeping costs low, you can use the free serverless API provided by HuggingFace. Here is a step-by-step guide:

1. Install the necessary packages.
2. Get your HuggingFace API token.
3. Set the API token as an environment variable.
4. Prepare the HuggingFaceEndpoint and LLMChain.

Additionally, you can explore DeepInfra, another serverless inference service that provides access to a variety of LLMs. DeepInfra offers 1 hour of free serverless GPU compute for testing different models, which can also help keep costs low for your demo product.

This setup uses the free serverless API from HuggingFace and the free compute time from DeepInfra, both of which are suitable for demo products. Be aware, however, that these free options may be rate-limited for heavy use cases. For more demanding applications, consider using dedicated endpoints [1][2][3][4].
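The steps above can be sketched roughly as follows, assuming the `langchain-huggingface` integration package. The model id, the `hf_xxx` token placeholder, and the generation parameters are illustrative assumptions, not values confirmed by the thread:

```python
# A minimal sketch, assuming langchain-huggingface is available:
#   pip install langchain-huggingface langgraph
import os

# Steps 2-3: create a token at https://huggingface.co/settings/tokens
# (it needs access to the gated Llama repo) and expose it as an env var.
os.environ.setdefault("HUGGINGFACEHUB_API_TOKEN", "hf_xxx")  # placeholder token

try:
    from langchain_huggingface import HuggingFaceEndpoint

    # Step 4: pass repo_id, NOT endpoint_url. repo_id routes the request to
    # the serverless Inference API, while endpoint_url makes the client look
    # for a dedicated Inference Endpoint.
    llm = HuggingFaceEndpoint(
        repo_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
        task="text-generation",
        max_new_tokens=256,
    )
    # llm.invoke("What is LangGraph?")  # uncomment once a real token is set
except Exception:
    llm = None  # package not installed or token not valid in this environment
```

The `repo_id` vs. `endpoint_url` distinction appears to match the error in the question: supplying the model id where an endpoint URL is expected is what produces "Could not find model id for inference server".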
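The DeepInfra alternative mentioned above can be wired in much the same way, assuming the `langchain-community` DeepInfra wrapper; the model id, token placeholder, and model_kwargs here are illustrative assumptions:

```python
# A sketch of the DeepInfra route, assuming langchain-community is available:
#   pip install langchain-community
import os

# DeepInfra tokens come from the DeepInfra dashboard.
os.environ.setdefault("DEEPINFRA_API_TOKEN", "di_xxx")  # placeholder token

try:
    from langchain_community.llms import DeepInfra

    di_llm = DeepInfra(model_id="meta-llama/Meta-Llama-3.1-8B-Instruct")
    di_llm.model_kwargs = {"temperature": 0.7, "max_new_tokens": 256}
    # di_llm.invoke("What is LangGraph?")  # uncomment once a real token is set
except Exception:
    di_llm = None  # wrapper unavailable or token not valid in this environment
```

Since both wrappers expose the standard LangChain LLM interface, either one can be dropped into the same LangGraph agent without changing the surrounding graph code.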