Commit 44c862e

Merge pull request #195 from zucchini-nlp/main
Fix: videos in LLaVa-OV
2 parents ba6b5d2 + 5e18143 commit 44c862e

1 file changed (+2 -1 lines changed)

docs/LLaVA_OneVision_Tutorials.ipynb

Lines changed: 2 additions & 1 deletion
@@ -345,6 +345,7 @@
  "\n",
  "input_ids = tokenizer_image_token(prompt_question, tokenizer, IMAGE_TOKEN_INDEX, return_tensors=\"pt\").unsqueeze(0).to(device)\n",
  "image_sizes = [frame.size for frame in video_frames]\n",
+ "modalities = [\"video\"] * len(video_frames)\n",
  "\n",
  "# Generate response\n",
  "cont = model.generate(\n",
@@ -354,7 +355,7 @@
  " do_sample=False,\n",
  " temperature=0,\n",
  " max_new_tokens=4096,\n",
- " modalities=[\"video\"],\n",
+ " modalities=modalities,\n",
  ")\n",
  "text_outputs = tokenizer.batch_decode(cont, skip_special_tokens=True)\n",
  "print(text_outputs[0])"
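
The change replaces a single-element `modalities` list with one `"video"` entry per frame, so that it has the same length as `image_sizes`. A minimal sketch of what the two list-building lines compute is given here, using stand-in frame objects instead of real decoded video. The `Frame` class and the `build_video_metadata` helper are hypothetical names introduced only for illustration, and the `model.generate` call is omitted because it needs the loaded model.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a decoded video frame with a `size` attribute."""
    size: tuple


def build_video_metadata(video_frames):
    """Hypothetical helper mirroring the two notebook lines touched by the diff."""
    # One size entry per frame, as in the notebook.
    image_sizes = [frame.size for frame in video_frames]
    # One "video" tag per frame (the line added by this commit),
    # rather than the single-element list ["video"] used before.
    modalities = ["video"] * len(video_frames)
    return image_sizes, modalities


# Four fake frames stand in for the sampled video frames.
video_frames = [Frame(size=(640, 480)) for _ in range(4)]
image_sizes, modalities = build_video_metadata(video_frames)
print(len(image_sizes), len(modalities))
```

In this sketch the two lists always have equal length, which is the property the new line introduces. Whether the model requires that alignment is an assumption here, inferred from the commit title and not confirmed by the diff itself.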
