-
Hi, I have used the HuggingFacePipeline with different models such as flan-t5 and stablelm-7b, and it works with local inference. I tried using the HuggingFaceHub as well, but it constantly gives a timeout error for basically every model. Now I have created an Inference Endpoint on HF, but how do I use that with LangChain? The HuggingFaceHub class only accepts a text parameter, which is the repo_id or model name, but the Inference Endpoint only gives me a URL. I can get individual text samples with a simple API request, but how do I integrate this with LangChain?
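For context, the local setup that works for me looks roughly like this (the model name and kwargs here are just examples):

```python
from langchain.llms import HuggingFacePipeline

# Local inference: the model is downloaded and run on this machine
llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-large",  # example model; stablelm works the same way
    task="text2text-generation",      # flan-t5 is a seq2seq model
    model_kwargs={"max_length": 128},
)

print(llm("Translate English to German: How old are you?"))
```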
-
It seems this functionality was added here. Here is the class that allows the integration. I have not tried it yet, but you can give it a try. Unfortunately, this integration is not in the official documentation.
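If the class works the way that change suggests, usage might look something like the sketch below. I'm assuming it is exported as `HuggingFaceEndpoint` from `langchain.llms`, and the URL and token are placeholders:

```python
from langchain.llms import HuggingFaceEndpoint

# Point LangChain at the dedicated Inference Endpoint URL instead of a repo_id
llm = HuggingFaceEndpoint(
    endpoint_url="https://your-endpoint.endpoints.huggingface.cloud",  # from the HF dashboard
    huggingfacehub_api_token="hf_...",  # your HF access token
    task="text-generation",
    model_kwargs={"temperature": 0.7, "max_new_tokens": 100},
)

print(llm("What is the capital of France?"))
```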
-
Looks like it's very new functionality! Some docs are needed :)
-
How to get the URL: https://huggingface.co/docs/inference-endpoints/guides/test_endpoint. You have to deploy the model you want as an Inference Endpoint; basically, you have to pay HF for inference compute time.
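Once the endpoint is running, you can sanity-check it with a plain request before wiring it into LangChain (URL and token below are placeholders):

```python
import requests

API_URL = "https://your-endpoint.endpoints.huggingface.cloud"  # copy from the endpoint's page
headers = {
    "Authorization": "Bearer hf_...",  # your HF access token
    "Content-Type": "application/json",
}

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello, my name is"})
response.raise_for_status()
print(response.json())
```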
-
In the provided class, the `if self.task == "text-generation":` branch (not the current class) will likely just return a 0.
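If the response parsing in that branch really does misbehave for your endpoint, a minimal workaround is a custom LLM wrapper that makes the request and parses the JSON itself. This is only a sketch: the class name is made up, and the `[0]["generated_text"]` response shape is an assumption that holds for standard text-generation endpoints but should be checked against what yours actually returns:

```python
from typing import List, Optional

import requests
from langchain.llms.base import LLM


class EndpointLLM(LLM):
    """Hypothetical minimal wrapper around a dedicated HF Inference Endpoint."""

    endpoint_url: str
    hf_token: str

    @property
    def _llm_type(self) -> str:
        return "hf_inference_endpoint"

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        headers = {
            "Authorization": f"Bearer {self.hf_token}",
            "Content-Type": "application/json",
        }
        response = requests.post(self.endpoint_url, headers=headers, json={"inputs": prompt})
        response.raise_for_status()
        # Assumed response shape for text-generation: [{"generated_text": "..."}]
        return response.json()[0]["generated_text"]
```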
-
@syeminpark did you test it? It seems to work fine.