Commit fdeeb89

Merge branch 'main' into add-support-for-image-to-image-models-on-Replicate
2 parents cb4f7f0 + 369d105 commit fdeeb89

File tree

41 files changed: +973, -148 lines


README.md

Lines changed: 27 additions & 22 deletions

@@ -57,7 +57,7 @@ This is a collection of JS libraries to interact with the Hugging Face API, with
 - [@huggingface/inference](packages/inference/README.md): Use all supported (serverless) Inference Providers or switch to Inference Endpoints (dedicated) to make calls to 100,000+ Machine Learning models
 - [@huggingface/hub](packages/hub/README.md): Interact with huggingface.co to create or delete repos and commit / download files
-- [@huggingface/agents](packages/agents/README.md): Interact with HF models through a natural language interface
+- [@huggingface/mcp-client](packages/mcp-client/README.md): A Model Context Protocol (MCP) client, and a tiny Agent library, built on top of InferenceClient.
 - [@huggingface/gguf](packages/gguf/README.md): A GGUF parser that works on remotely hosted files.
 - [@huggingface/dduf](packages/dduf/README.md): Similar package for DDUF (DDUF Diffusers Unified Format)
 - [@huggingface/tasks](packages/tasks/README.md): The definition files and source-of-truth for the Hub's main primitives like pipeline tasks, model libraries, etc.
@@ -79,15 +79,15 @@ To install via NPM, you can download the libraries as needed:
 ```bash
 npm install @huggingface/inference
 npm install @huggingface/hub
-npm install @huggingface/agents
+npm install @huggingface/mcp-client
 ```
 
 Then import the libraries in your code:
 
 ```ts
 import { InferenceClient } from "@huggingface/inference";
-import { HfAgent } from "@huggingface/agents";
 import { createRepo, commit, deleteRepo, listFiles } from "@huggingface/hub";
+import { McpClient } from "@huggingface/mcp-client";
 import type { RepoId } from "@huggingface/hub";
 ```
@@ -97,7 +97,7 @@ You can run our packages with vanilla JS, without any bundler, by using a CDN or
 ```html
 <script type="module">
-import { InferenceClient } from 'https://cdn.jsdelivr.net/npm/@huggingface/inference@3.12.1/+esm';
+import { InferenceClient } from 'https://cdn.jsdelivr.net/npm/@huggingface/inference@3.13.0/+esm';
 import { createRepo, commit, deleteRepo, listFiles } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]/+esm";
 </script>
 ```
@@ -107,12 +107,10 @@ You can run our packages with vanilla JS, without any bundler, by using a CDN or
 ```ts
 // esm.sh
 import { InferenceClient } from "https://esm.sh/@huggingface/inference"
-import { HfAgent } from "https://esm.sh/@huggingface/agents";
 
 import { createRepo, commit, deleteRepo, listFiles } from "https://esm.sh/@huggingface/hub"
 // or npm:
 import { InferenceClient } from "npm:@huggingface/inference"
-import { HfAgent } from "npm:@huggingface/agents";
 
 import { createRepo, commit, deleteRepo, listFiles } from "npm:@huggingface/hub"
 ```
@@ -223,29 +221,36 @@ await deleteFiles({
 });
 ```
 
-### @huggingface/agents example
+### @huggingface/mcp-client example
 
 ```ts
-import { HfAgent, LLMFromHub, defaultTools } from '@huggingface/agents';
+import { Agent } from '@huggingface/mcp-client';
 
 const HF_TOKEN = "hf_...";
 
-const agent = new HfAgent(
-  HF_TOKEN,
-  LLMFromHub(HF_TOKEN),
-  [...defaultTools]
-);
+const agent = new Agent({
+  provider: "auto",
+  model: "Qwen/Qwen2.5-72B-Instruct",
+  apiKey: HF_TOKEN,
+  servers: [
+    {
+      // Playwright MCP
+      command: "npx",
+      args: ["@playwright/mcp@latest"],
+    },
+  ],
+});
 
-// you can generate the code, inspect it and then run it
-const code = await agent.generateCode("Draw a picture of a cat wearing a top hat. Then caption the picture and read it out loud.");
-console.log(code);
-const messages = await agent.evaluateCode(code)
-console.log(messages); // contains the data
 
-// or you can run the code directly, however you can't check that the code is safe to execute this way, use at your own risk.
-const messages = await agent.run("Draw a picture of a cat wearing a top hat. Then caption the picture and read it out loud.")
-console.log(messages);
+await agent.loadTools();
+for await (const chunk of agent.run("What are the top 5 trending models on Hugging Face?")) {
+  if ("choices" in chunk) {
+    const delta = chunk.choices[0]?.delta;
+    if (delta.content) {
+      console.log(delta.content);
+    }
+  }
+}
 ```
 
 There are more features of course, check each library's README!

packages/agents/README.md

Lines changed: 4 additions & 0 deletions

@@ -2,6 +2,10 @@
 A way to call Hugging Face models and Inference Endpoints from natural language, using an LLM.
 
+> [!WARNING]
+> `@huggingface/agents` is now deprecated, and a modern version, built on top of MCP, is [Tiny Agents](https://github.com/huggingface/huggingface.js/tree/main/packages/mcp-client).
+> Go checkout the `Tiny Agents` introduction blog [here](https://huggingface.co/blog/tiny-agents).
+
 ## Install
 
 ```console

packages/gguf/package.json

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 {
   "name": "@huggingface/gguf",
   "packageManager": "[email protected]",
-  "version": "0.1.14",
+  "version": "0.1.17",
   "description": "a GGUF parser that works on remotely hosted files",
   "repository": "https://github.com/huggingface/huggingface.js.git",
   "publishConfig": {

packages/inference/package.json

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 {
   "name": "@huggingface/inference",
-  "version": "3.12.1",
+  "version": "3.13.0",
   "packageManager": "[email protected]",
   "license": "MIT",
   "author": "Hugging Face and Tim Mikeladze <[email protected]>",

packages/inference/src/lib/getProviderHelper.ts

Lines changed: 1 addition & 0 deletions

@@ -115,6 +115,7 @@ export const PROVIDERS: Record<InferenceProvider, Partial<Record<InferenceTask,
   "text-to-image": new Nebius.NebiusTextToImageTask(),
   conversational: new Nebius.NebiusConversationalTask(),
   "text-generation": new Nebius.NebiusTextGenerationTask(),
+  "feature-extraction": new Nebius.NebiusFeatureExtractionTask(),
 },
 novita: {
   conversational: new Novita.NovitaConversationalTask(),

packages/inference/src/providers/fal-ai.ts

Lines changed: 26 additions & 1 deletion

@@ -14,10 +14,12 @@
  *
  * Thanks!
  */
+import { base64FromBytes } from "../utils/base64FromBytes";
+
 import type { AutomaticSpeechRecognitionOutput } from "@huggingface/tasks";
 import { InferenceOutputError } from "../lib/InferenceOutputError";
 import { isUrl } from "../lib/isUrl";
-import type { BodyParams, HeaderParams, ModelId, UrlParams } from "../types";
+import type { BodyParams, HeaderParams, ModelId, RequestArgs, UrlParams } from "../types";
 import { delay } from "../utils/delay";
 import { omit } from "../utils/omit";
 import {
@@ -27,6 +29,7 @@ import {
   type TextToVideoTaskHelper,
 } from "./providerHelper";
 import { HF_HUB_URL } from "../config";
+import type { AutomaticSpeechRecognitionArgs } from "../tasks/audio/automaticSpeechRecognition";
 
 export interface FalAiQueueOutput {
   request_id: string;
@@ -224,6 +227,28 @@ export class FalAIAutomaticSpeechRecognitionTask extends FalAITask implements Au
     }
     return { text: res.text };
   }
+
+  async preparePayloadAsync(args: AutomaticSpeechRecognitionArgs): Promise<RequestArgs> {
+    const blob = "data" in args && args.data instanceof Blob ? args.data : "inputs" in args ? args.inputs : undefined;
+    const contentType = blob?.type;
+    if (!contentType) {
+      throw new Error(
+        `Unable to determine the input's content-type. Make sure your are passing a Blob when using provider fal-ai.`
+      );
+    }
+    if (!FAL_AI_SUPPORTED_BLOB_TYPES.includes(contentType)) {
+      throw new Error(
+        `Provider fal-ai does not support blob type ${contentType} - supported content types are: ${FAL_AI_SUPPORTED_BLOB_TYPES.join(
+          ", "
+        )}`
+      );
+    }
+    const base64audio = base64FromBytes(new Uint8Array(await blob.arrayBuffer()));
+    return {
+      ...("data" in args ? omit(args, "data") : omit(args, "inputs")),
+      audio_url: `data:${contentType};base64,${base64audio}`,
+    };
+  }
 }
 
 export class FalAITextToSpeechTask extends FalAITask {
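
The fal-ai `preparePayloadAsync` above inlines the audio Blob into the JSON payload as a base64 `data:` URI. A minimal standalone sketch of that conversion, assuming Node's `Buffer` (the function names here are illustrative stand-ins, not the package's internals):

```typescript
// Encode raw bytes as base64 (Node's Buffer; a browser would use btoa).
function encodeBase64(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64");
}

// Build the `audio_url`-style data URI the payload above carries.
function toAudioDataUri(bytes: Uint8Array, contentType: string): string {
  return `data:${contentType};base64,${encodeBase64(bytes)}`;
}

const uri = toAudioDataUri(new Uint8Array([1, 2, 3]), "audio/mpeg");
// → "data:audio/mpeg;base64,AQID"
```

Embedding the audio as a data URI lets a single JSON request stand in for a file upload, at the cost of a ~33% size increase from base64.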

packages/inference/src/providers/hf-inference.ts

Lines changed: 37 additions & 3 deletions

@@ -36,7 +36,7 @@ import type {
 import { HF_ROUTER_URL } from "../config";
 import { InferenceOutputError } from "../lib/InferenceOutputError";
 import type { TabularClassificationOutput } from "../tasks/tabular/tabularClassification";
-import type { BodyParams, UrlParams } from "../types";
+import type { BodyParams, RequestArgs, UrlParams } from "../types";
 import { toArray } from "../utils/toArray";
 import type {
   AudioClassificationTaskHelper,
@@ -70,7 +70,10 @@ import type {
 } from "./providerHelper";
 
 import { TaskProviderHelper } from "./providerHelper";
-
+import { base64FromBytes } from "../utils/base64FromBytes";
+import type { ImageToImageArgs } from "../tasks/cv/imageToImage";
+import type { AutomaticSpeechRecognitionArgs } from "../tasks/audio/automaticSpeechRecognition";
+import { omit } from "../utils/omit";
 interface Base64ImageGeneration {
   data: Array<{
     b64_json: string;
@@ -221,6 +224,15 @@ export class HFInferenceAutomaticSpeechRecognitionTask
   override async getResponse(response: AutomaticSpeechRecognitionOutput): Promise<AutomaticSpeechRecognitionOutput> {
     return response;
   }
+
+  async preparePayloadAsync(args: AutomaticSpeechRecognitionArgs): Promise<RequestArgs> {
+    return "data" in args
+      ? args
+      : {
+          ...omit(args, "inputs"),
+          data: args.inputs,
+        };
+  }
 }
 
 export class HFInferenceAudioToAudioTask extends HFInferenceTask implements AudioToAudioTaskHelper {
@@ -303,7 +315,12 @@ export class HFInferenceImageSegmentationTask extends HFInferenceTask implements
   override async getResponse(response: ImageSegmentationOutput): Promise<ImageSegmentationOutput> {
     if (
       Array.isArray(response) &&
-      response.every((x) => typeof x.label === "string" && typeof x.mask === "string" && typeof x.score === "number")
+      response.every(
+        (x) =>
+          typeof x.label === "string" &&
+          typeof x.mask === "string" &&
+          (x.score === undefined || typeof x.score === "number")
+      )
     ) {
       return response;
     }
@@ -321,6 +338,23 @@ export class HFInferenceImageToTextTask extends HFInferenceTask implements Image
 }
 
 export class HFInferenceImageToImageTask extends HFInferenceTask implements ImageToImageTaskHelper {
+  async preparePayloadAsync(args: ImageToImageArgs): Promise<RequestArgs> {
+    if (!args.parameters) {
+      return {
+        ...args,
+        model: args.model,
+        data: args.inputs,
+      };
+    } else {
+      return {
+        ...args,
+        inputs: base64FromBytes(
+          new Uint8Array(args.inputs instanceof ArrayBuffer ? args.inputs : await (args.inputs as Blob).arrayBuffer())
+        ),
+      };
+    }
+  }
+
   override async getResponse(response: Blob): Promise<Blob> {
     if (response instanceof Blob) {
       return response;
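
The `HFInferenceImageToImageTask.preparePayloadAsync` above picks between two wire formats: with no `parameters` the raw image bytes can travel as the binary request body, while any extra parameters force a JSON body, so the image is base64-encoded into `inputs`. A minimal sketch of that branching, with local stand-in types and Node's `Buffer` (assumptions, not the package's actual helper):

```typescript
interface SketchArgs {
  inputs: Uint8Array;                   // image bytes
  parameters?: Record<string, unknown>; // optional generation params
}

// Mirror of the branching above: binary body vs. JSON body with base64.
function shapeImageToImagePayload(args: SketchArgs): Record<string, unknown> {
  if (!args.parameters) {
    return { data: args.inputs }; // bytes sent as-is
  }
  return {
    inputs: Buffer.from(args.inputs).toString("base64"), // JSON-safe
    parameters: args.parameters,
  };
}

const plain = shapeImageToImagePayload({ inputs: new Uint8Array([1, 2, 3]) });
const withParams = shapeImageToImagePayload({
  inputs: new Uint8Array([1, 2, 3]),
  parameters: { strength: 0.7 },
});
// plain carries raw `data`; withParams carries base64 `inputs` ("AQID")
```

The design point: parameters and bytes cannot share a raw binary body, so the presence of parameters is what triggers the base64 conversion.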

packages/inference/src/providers/nebius.ts

Lines changed: 31 additions & 3 deletions

@@ -14,13 +14,15 @@
  *
  * Thanks!
  */
+import type { FeatureExtractionOutput } from "@huggingface/tasks";
 import { InferenceOutputError } from "../lib/InferenceOutputError";
-import type { BodyParams, UrlParams } from "../types";
+import type { BodyParams } from "../types";
 import { omit } from "../utils/omit";
 import {
   BaseConversationalTask,
   BaseTextGenerationTask,
   TaskProviderHelper,
+  type FeatureExtractionTaskHelper,
   type TextToImageTaskHelper,
 } from "./providerHelper";
 
@@ -32,6 +34,12 @@ interface NebiusBase64ImageGeneration {
   }>;
 }
 
+interface NebiusEmbeddingsResponse {
+  data: Array<{
+    embedding: number[];
+  }>;
+}
+
 export class NebiusConversationalTask extends BaseConversationalTask {
   constructor() {
     super("nebius", NEBIUS_API_BASE_URL);
@@ -59,8 +67,7 @@ export class NebiusTextToImageTask extends TaskProviderHelper implements TextToI
     };
   }
 
-  makeRoute(params: UrlParams): string {
-    void params;
+  makeRoute(): string {
     return "v1/images/generations";
   }
 
@@ -88,3 +95,24 @@ export class NebiusTextToImageTask extends TaskProviderHelper implements TextToI
     throw new InferenceOutputError("Expected Nebius text-to-image response format");
   }
 }
+
+export class NebiusFeatureExtractionTask extends TaskProviderHelper implements FeatureExtractionTaskHelper {
+  constructor() {
+    super("nebius", NEBIUS_API_BASE_URL);
+  }
+
+  preparePayload(params: BodyParams): Record<string, unknown> {
+    return {
+      input: params.args.inputs,
+      model: params.model,
+    };
+  }
+
+  makeRoute(): string {
+    return "v1/embeddings";
+  }
+
+  async getResponse(response: NebiusEmbeddingsResponse): Promise<FeatureExtractionOutput> {
+    return response.data.map((item) => item.embedding);
+  }
+}
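
The new `NebiusFeatureExtractionTask` targets an OpenAI-compatible `v1/embeddings` route and flattens `data[].embedding` into the array-of-vectors shape feature extraction expects. A minimal sketch of that response mapping (the interface below is a local stand-in for `NebiusEmbeddingsResponse`, not an import from the package):

```typescript
// Local stand-in for the OpenAI-style embeddings response shape.
interface EmbeddingsResponse {
  data: Array<{ embedding: number[] }>;
}

// Same mapping as getResponse above: one vector per input.
function flattenEmbeddings(response: EmbeddingsResponse): number[][] {
  return response.data.map((item) => item.embedding);
}

const vectors = flattenEmbeddings({
  data: [{ embedding: [0.1, 0.2] }, { embedding: [0.3, 0.4] }],
});
// → [[0.1, 0.2], [0.3, 0.4]]
```

Discarding the wrapper object keeps the provider-specific response format out of the caller's way: downstream code only ever sees plain vectors.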

packages/inference/src/providers/providerHelper.ts

Lines changed: 5 additions & 1 deletion

@@ -48,8 +48,10 @@ import type {
 import { HF_ROUTER_URL } from "../config";
 import { InferenceOutputError } from "../lib/InferenceOutputError";
 import type { AudioToAudioOutput } from "../tasks/audio/audioToAudio";
-import type { BaseArgs, BodyParams, HeaderParams, InferenceProvider, UrlParams } from "../types";
+import type { BaseArgs, BodyParams, HeaderParams, InferenceProvider, RequestArgs, UrlParams } from "../types";
 import { toArray } from "../utils/toArray";
+import type { ImageToImageArgs } from "../tasks/cv/imageToImage";
+import type { AutomaticSpeechRecognitionArgs } from "../tasks/audio/automaticSpeechRecognition";
 
 /**
  * Base class for task-specific provider helpers
@@ -142,6 +144,7 @@ export interface TextToVideoTaskHelper {
 export interface ImageToImageTaskHelper {
   getResponse(response: unknown, url?: string, headers?: HeadersInit): Promise<Blob>;
   preparePayload(params: BodyParams<ImageToImageInput & BaseArgs>): Record<string, unknown>;
+  preparePayloadAsync(args: ImageToImageArgs): Promise<RequestArgs>;
 }
 
 export interface ImageSegmentationTaskHelper {
@@ -245,6 +248,7 @@ export interface AudioToAudioTaskHelper {
 export interface AutomaticSpeechRecognitionTaskHelper {
   getResponse(response: unknown, url?: string, headers?: HeadersInit): Promise<AutomaticSpeechRecognitionOutput>;
   preparePayload(params: BodyParams<AutomaticSpeechRecognitionInput & BaseArgs>): Record<string, unknown> | BodyInit;
+  preparePayloadAsync(args: AutomaticSpeechRecognitionArgs): Promise<RequestArgs>;
 }
 
 export interface AudioClassificationTaskHelper {
