
Commit 4e51353

burtenshaw and mdrxy authored

docs: update huggingface inference to latest usage (#31906)

This PR updates the docs on Hugging Face's inference offering from "Inference API" to "Inference Providers".

Co-authored-by: Mason Daugherty <[email protected]>

1 parent b8e2420 commit 4e51353

File tree

4 files changed (+71, -26 lines)


docs/docs/integrations/chat/huggingface.ipynb

Lines changed: 27 additions & 2 deletions

@@ -120,7 +120,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": null,
    "metadata": {},
    "outputs": [
     {
@@ -138,11 +138,36 @@
     "from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint\n",
     "\n",
     "llm = HuggingFaceEndpoint(\n",
-    "    repo_id=\"HuggingFaceH4/zephyr-7b-beta\",\n",
+    "    repo_id=\"deepseek-ai/DeepSeek-R1-0528\",\n",
     "    task=\"text-generation\",\n",
     "    max_new_tokens=512,\n",
     "    do_sample=False,\n",
     "    repetition_penalty=1.03,\n",
+    "    provider=\"auto\",  # let Hugging Face choose the best provider for you\n",
+    ")\n",
+    "\n",
+    "chat_model = ChatHuggingFace(llm=llm)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now let's take advantage of [Inference Providers](https://huggingface.co/docs/inference-providers) to run the model on specific third-party providers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "llm = HuggingFaceEndpoint(\n",
+    "    repo_id=\"deepseek-ai/DeepSeek-R1-0528\",\n",
+    "    task=\"text-generation\",\n",
+    "    provider=\"hyperbolic\",  # set your provider here\n",
+    "    # provider=\"nebius\",\n",
+    "    # provider=\"together\",\n",
     ")\n",
     "\n",
     "chat_model = ChatHuggingFace(llm=llm)"
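Putting the two updated cells together, the post-commit chat usage boils down to the sketch below. The `build_chat_model` helper and the token guard are our illustration, not notebook code; it assumes `langchain_huggingface` is installed and touches the network only when `HUGGINGFACEHUB_API_TOKEN` is set.

```python
# Sketch of the updated ChatHuggingFace usage from this commit.
import os

REPO_ID = "deepseek-ai/DeepSeek-R1-0528"


def build_chat_model(provider: str = "auto"):
    # Lazy import so the sketch can be inspected without langchain installed.
    from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

    llm = HuggingFaceEndpoint(
        repo_id=REPO_ID,
        task="text-generation",
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
        # "auto" lets Hugging Face choose; or pin a provider such as
        # "hyperbolic", "nebius", or "together".
        provider=provider,
    )
    return ChatHuggingFace(llm=llm)


if os.environ.get("HUGGINGFACEHUB_API_TOKEN"):
    chat_model = build_chat_model()  # provider="auto"
```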

docs/docs/integrations/llms/huggingface_endpoint.ipynb

Lines changed: 6 additions & 2 deletions

@@ -117,7 +117,7 @@
   "source": [
    "## Examples\n",
    "\n",
-   "Here is an example of how you can access `HuggingFaceEndpoint` integration of the free [Serverless Endpoints](https://huggingface.co/inference-endpoints/serverless) API."
+   "Here is an example of how you can access `HuggingFaceEndpoint` integration of the serverless [Inference Providers](https://huggingface.co/docs/inference-providers) API.\n"
   ]
  },
  {
@@ -128,13 +128,17 @@
  },
  "outputs": [],
  "source": [
-  "repo_id = \"mistralai/Mistral-7B-Instruct-v0.2\"\n",
+  "repo_id = \"deepseek-ai/DeepSeek-R1-0528\"\n",
   "\n",
   "llm = HuggingFaceEndpoint(\n",
   "    repo_id=repo_id,\n",
   "    max_length=128,\n",
   "    temperature=0.5,\n",
   "    huggingfacehub_api_token=HUGGINGFACEHUB_API_TOKEN,\n",
+  "    provider=\"auto\",  # set your provider here hf.co/settings/inference-providers\n",
+  "    # provider=\"hyperbolic\",\n",
+  "    # provider=\"nebius\",\n",
+  "    # provider=\"together\",\n",
   ")\n",
   "llm_chain = prompt | llm\n",
   "print(llm_chain.invoke({\"question\": question}))"
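Consolidated, the endpoint notebook's updated cell reads roughly as follows. The `build_llm` wrapper is ours, not committed code; it assumes `langchain_huggingface` is installed and defers any network use until a token is supplied.

```python
import os
from typing import Optional

REPO_ID = "deepseek-ai/DeepSeek-R1-0528"


def build_llm(token: Optional[str] = None, provider: str = "auto"):
    # Illustrative wrapper around the notebook cell; not part of the commit.
    from langchain_huggingface import HuggingFaceEndpoint

    return HuggingFaceEndpoint(
        repo_id=REPO_ID,
        max_length=128,
        temperature=0.5,
        huggingfacehub_api_token=token,
        provider=provider,  # provider list: hf.co/settings/inference-providers
    )


token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
if token:
    llm = build_llm(token=token)
```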

docs/docs/integrations/providers/huggingface.mdx

Lines changed: 16 additions & 2 deletions

@@ -1,6 +1,11 @@
 # Hugging Face
 
-All functionality related to the [Hugging Face Platform](https://huggingface.co/).
+All functionality related to [Hugging Face Hub](https://huggingface.co/) and libraries like [transformers](https://huggingface.co/docs/transformers/index), [sentence transformers](https://sbert.net/), and [datasets](https://huggingface.co/docs/datasets/index).
+
+> [Hugging Face](https://huggingface.co/) is an AI platform with all major open source models, datasets, MCPs, and demos.
+> It supplies model inference locally and via serverless [Inference Providers](https://huggingface.co/docs/inference-providers).
+>
+> You can use [Inference Providers](https://huggingface.co/docs/inference-providers) to run open source models like DeepSeek R1 on scalable serverless infrastructure.
 
 ## Installation
 
@@ -26,6 +31,7 @@ from langchain_huggingface import ChatHuggingFace
 
 ### HuggingFaceEndpoint
 
+We can use the `HuggingFaceEndpoint` class to run open source models via serverless [Inference Providers](https://huggingface.co/docs/inference-providers) or via dedicated [Inference Endpoints](https://huggingface.co/inference-endpoints/dedicated).
 
 See a [usage example](/docs/integrations/llms/huggingface_endpoint).
 
@@ -35,7 +41,7 @@ from langchain_huggingface import HuggingFaceEndpoint
 
 ### HuggingFacePipeline
 
-Hugging Face models can be run locally through the `HuggingFacePipeline` class.
+We can use the `HuggingFacePipeline` class to run open source models locally.
 
 See a [usage example](/docs/integrations/llms/huggingface_pipelines).
 
@@ -47,6 +53,8 @@ from langchain_huggingface import HuggingFacePipeline
 
 ### HuggingFaceEmbeddings
 
+We can use the `HuggingFaceEmbeddings` class to run open source embedding models locally.
+
 See a [usage example](/docs/integrations/text_embedding/huggingfacehub).
 
 ```python
@@ -55,6 +63,8 @@ from langchain_huggingface import HuggingFaceEmbeddings
 
 ### HuggingFaceEndpointEmbeddings
 
+We can use the `HuggingFaceEndpointEmbeddings` class to run open source embedding models via a dedicated [Inference Endpoint](https://huggingface.co/inference-endpoints/dedicated).
+
 See a [usage example](/docs/integrations/text_embedding/huggingfacehub).
 
 ```python
@@ -63,6 +73,8 @@ from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
 
 ### HuggingFaceInferenceAPIEmbeddings
 
+We can use the `HuggingFaceInferenceAPIEmbeddings` class to run open source embedding models via [Inference Providers](https://huggingface.co/docs/inference-providers).
+
 See a [usage example](/docs/integrations/text_embedding/huggingfacehub).
 
 ```python
@@ -71,6 +83,8 @@ from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
 
 ### HuggingFaceInstructEmbeddings
 
+We can use the `HuggingFaceInstructEmbeddings` class to run open source embedding models locally.
+
 See a [usage example](/docs/integrations/text_embedding/instruct_embeddings).
 
 ```python
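The sentences this commit adds to the providers page amount to a local-versus-serverless map of the `langchain_huggingface` classes. Distilled into a quick-reference table (our summary of the page text, not code from the commit):

```python
# Where each class runs, per the updated providers page.
RUNS_ON = {
    "HuggingFaceEndpoint": "serverless Inference Providers / dedicated Inference Endpoints",
    "HuggingFacePipeline": "local",
    "HuggingFaceEmbeddings": "local",
    "HuggingFaceEndpointEmbeddings": "dedicated Inference Endpoint",
    "HuggingFaceInferenceAPIEmbeddings": "serverless Inference Providers",
    "HuggingFaceInstructEmbeddings": "local",
}

# Classes that never leave the machine.
local_only = sorted(k for k, v in RUNS_ON.items() if v == "local")
```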

docs/docs/integrations/text_embedding/huggingfacehub.ipynb

Lines changed: 22 additions & 20 deletions

@@ -95,35 +95,36 @@
   "id": "92019ef1-5d30-4985-b4e6-c0d98bdfe265",
   "metadata": {},
   "source": [
-   "## Hugging Face Inference API\n",
-   "We can also access embedding models via the Hugging Face Inference API, which does not require us to install ``sentence_transformers`` and download models locally."
+   "## Hugging Face Inference Providers\n",
+   "\n",
+   "We can also access embedding models via the [Inference Providers](https://huggingface.co/docs/inference-providers), which let's us use open source models on scalable serverless infrastructure.\n",
+   "\n",
+   "First, we need to get a read-only API key from [Hugging Face](https://huggingface.co/settings/tokens).\n"
   ]
  },
  {
   "cell_type": "code",
-  "execution_count": 1,
-  "id": "66f5c6ba-1446-43e1-b012-800d17cef300",
+  "execution_count": null,
+  "id": "c5576a6c",
   "metadata": {},
-  "outputs": [
-   {
-    "name": "stdout",
-    "output_type": "stream",
-    "text": [
-     "Enter your HF Inference API Key:\n",
-     "\n",
-     " ········\n"
-    ]
-   }
-  ],
+  "outputs": [],
   "source": [
-   "import getpass\n",
+   "from getpass import getpass\n",
    "\n",
-   "inference_api_key = getpass.getpass(\"Enter your HF Inference API Key:\\n\\n\")"
+   "huggingfacehub_api_token = getpass()"
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "id": "3ad10337",
+  "metadata": {},
+  "source": [
+   "Now we can use the `HuggingFaceInferenceAPIEmbeddings` class to run open source embedding models via [Inference Providers](https://huggingface.co/docs/inference-providers)."
   ]
  },
  {
   "cell_type": "code",
-  "execution_count": 4,
+  "execution_count": null,
   "id": "d0623c1f-cd82-4862-9bce-3655cb9b66ac",
   "metadata": {},
   "outputs": [
@@ -139,10 +140,11 @@
    }
   ],
   "source": [
-   "from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings\n",
+   "from langchain_huggingface import HuggingFaceInferenceAPIEmbeddings\n",
    "\n",
    "embeddings = HuggingFaceInferenceAPIEmbeddings(\n",
-   "    api_key=inference_api_key, model_name=\"sentence-transformers/all-MiniLM-l6-v2\"\n",
+   "    api_key=huggingfacehub_api_token,\n",
+   "    model_name=\"sentence-transformers/all-MiniLM-l6-v2\",\n",
    ")\n",
    "\n",
    "query_result = embeddings.embed_query(text)\n",
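Consolidated, the embeddings notebook after this commit does roughly the following. The `build_embeddings` helper is ours; note the import now comes from `langchain_huggingface` rather than `langchain_community`, and the network call is guarded behind a token check.

```python
import os

MODEL_NAME = "sentence-transformers/all-MiniLM-l6-v2"


def build_embeddings(api_key: str):
    # Import moved from langchain_community.embeddings in this commit.
    from langchain_huggingface import HuggingFaceInferenceAPIEmbeddings

    return HuggingFaceInferenceAPIEmbeddings(
        api_key=api_key,
        model_name=MODEL_NAME,
    )


api_key = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
if api_key:
    embeddings = build_embeddings(api_key)
    query_result = embeddings.embed_query("This is a test document.")
```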
