Conversation

@Wauplin (Contributor) commented Sep 26, 2024

This PR adds inference snippets for image-text-to-text models, e.g. meta-llama/Llama-3.2-11B-Vision-Instruct 😄

I've tested all three examples locally and they work as expected :)
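For reference, the Python variant has roughly the following shape (a sketch reconstructed from the diff excerpts quoted in this thread; the prompt text and token placeholder are illustrative, and the exact committed template may differ):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    "meta-llama/Llama-3.2-11B-Vision-Instruct",
    token="hf_***",  # illustrative placeholder
)

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"

# Ask the vision-language model about the image and stream the answer token by token.
for message in client.chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Describe this image in one sentence."},
            ],
        }
    ],
    max_tokens=500,
    stream=True,
):
    print(message.choices[0].delta.content or "", end="")
```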

@Wauplin changed the title from "Add code snippets for image-text-to-text" to "Add inference snippets for image-text-to-text" on Sep 26, 2024
On the diff (the JavaScript streaming example):

```js
	],
	max_tokens: 500,
})) {
	process.stdout.write(chunk.choices[0]?.delta?.content || "");
```
(Member) commented on this line:

Suggested change:

```diff
- process.stdout.write(chunk.choices[0]?.delta?.content || "");
+ process.stdout.write(chunk.choices[0]?.delta?.content);
```

"${model.id}",
token="${accessToken || "{API_TOKEN}"}",
)
image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
@mishig25 (Collaborator) commented Sep 26, 2024:

Maybe it would be good to show an example with `PIL.Image.open` and the image's base64 string representation, so users get an example where they can load local images:

```python
from PIL import Image
import requests
from io import BytesIO
import base64

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
response = requests.get(image_url)

image = Image.open(BytesIO(response.content))

# Convert the image to a byte array in PNG format
buffered = BytesIO()
image.save(buffered, format="PNG")

# Encode this byte array to base64
img_base64 = base64.b64encode(buffered.getvalue())

# Print the base64 string
print(img_base64.decode())
```

maybe the snippet would become too long. I will let you decide

(Member) replied:

Hmm, I'd say this would complicate things a bit too much (no strong opinion though).

Note that this is for remote inference, not local usage.

@mishig25 (Collaborator) commented Sep 26, 2024:

> Note that this is for remote inference, not local usage.

Yes, I meant more like: remote inference using a local image file (otherwise, to use the snippet, the user needs to upload their image somewhere and get its URL).
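Concretely, remote inference with a local image could look like the sketch below (assuming the chat endpoint accepts OpenAI-style base64 data URLs; the file name and prompt are placeholders):

```python
import base64
from huggingface_hub import InferenceClient

# Read a local image and embed it as a base64 data URL,
# so no publicly hosted image URL is needed.
with open("cat.png", "rb") as f:  # hypothetical local file
    img_b64 = base64.b64encode(f.read()).decode()

client = InferenceClient(api_key="YOUR_HF_TOKEN")

output = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
                {"type": "text", "text": "Describe this image in one sentence."},
            ],
        }
    ],
    max_tokens=500,
)
print(output.choices[0].message.content)
```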

@Wauplin (Contributor, Author) replied:

Do you know if it's possible to have several snippets by returning a list, the same as for code snippets?

@mishig25 (Collaborator) replied:

> Do you know if it's possible to have several snippets by returning a list, the same as for code snippets?

For inference snippets, that's not possible right now. So I suggest that:

  1. we merge this PR as it is, with only the image URL example
  2. we maybe unify on moon-side so that an inference snippet can also return a list, like code snippets. If so, we can iterate and add an example with a local image

@Wauplin (Contributor, Author) commented Sep 30, 2024

@coyotte508 @mishig25 thanks for the feedback. I've addressed the comment above about using the "conversational" tag. Otherwise, let's merge this and come back to it when multiple inference snippets can be provided. Can I have a final review before merging?

@Wauplin requested a review from @mishig25 on September 30, 2024, 13:47
On the diff:

```ts
export const snippetConversationalWithImage = (model: ModelDataMinimal, accessToken: string): string =>
	`from huggingface_hub import InferenceClient
client = InferenceClient(
```
@mishig25 (Collaborator) commented Sep 30, 2024:

Nit: in the playground, we use a snippet like the following to match the OAI format/spec as closely as possible:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(api_key="YOUR_HF_TOKEN")

messages = [
    {"role": "user", "content": "Tell me a story"}
]

output = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    messages=messages,
    stream=True,
    temperature=0.5,
    max_tokens=1024,
    top_p=0.7,
)
```

The specific changes are:

  1. use `api_key` rather than `token`
  2. declare `model` inside `completions.create(...)` rather than in `InferenceClient(...)`

I will let you decide

(Collaborator) added:

This comment maybe applies to the text conversational snippet as well.

(Collaborator) added:

If we decide to change it, let's handle it in a subsequent PR.

@Wauplin (Contributor, Author) replied:

I'll use the same convention then. I've addressed it in e4c6cba for both the text-generation and image-text-to-text snippets.
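With that convention, the conversational image snippet takes roughly this shape (an illustrative sketch of the agreed format, not the exact code committed in e4c6cba):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(api_key="YOUR_HF_TOKEN")

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

# `model` is declared inside `create(...)` and auth is passed as `api_key`,
# matching the playground/OAI convention discussed above.
for chunk in client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```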

@mishig25 (Collaborator) commented:

GitHub is showing that `model` is indented wrongly?

[screenshot: diff view showing the misindented `model` argument]

[commit] …gingface/huggingface.js into code-snippets-for-image-text-to-text
@Wauplin (Contributor, Author) commented Sep 30, 2024

> GitHub is showing that `model` is indented wrongly?

Well well well, looks like it, yes. Addressed in 0f8452c.

@mishig25 (Collaborator) reviewed:

lgtm! again

@Wauplin (Contributor, Author) commented Sep 30, 2024

Thanks! Sorry about the back and forth 😬

@Wauplin merged commit 4b211b0 into main on Sep 30, 2024 (5 checks passed).

@Wauplin deleted the code-snippets-for-image-text-to-text branch on September 30, 2024, 14:10.
@mishig25 (Collaborator) commented:

Triggered https://github.com/huggingface/huggingface.js/actions/runs/11107930206 so that we can get it in moon.
