Commit 3e21e95

Update smolvlm2.md (#2694)
- Add missing articles before "500M variant" and "SmolVLM family"
- Correct "see" → "seeing" in "looking forward to seeing"
- Fix awkward phrasing in fine-tuning section ("meanwhile" → new sentence)
- Change "like follows" → "as follows" for standard usage
1 parent 592a145 commit 3e21e95

1 file changed (+5, -5)

smolvlm2.md

Lines changed: 5 additions & 5 deletions
@@ -298,7 +298,7 @@ python -m mlx_vlm.generate \
   --prompt "Can you describe this image?"
 ```
 
-We also created a simple script for video understanding. You can use it like follows:
+We also created a simple script for video understanding. You can use it as follows:
 
 ```bash
 python -m mlx_vlm.smolvlm_video_generate \
@@ -315,7 +315,7 @@ Note that the system prompt is important to bend the model to the desired behavi
 
 The Swift language is also supported through the [mlx-swift-examples repo](https://github.com/ml-explore/mlx-swift-examples), which is what we used to build our iPhone app.
 
-Until [our in-progress PR](https://github.com/ml-explore/mlx-swift-examples/pull/206) is finalized and merged, you have to compile the project [from this fork](https://github.com/cyrilzakka/mlx-swift-examples), and then you can use the `llm-tool` CLI on your Mac like follows.
+Until [our in-progress PR](https://github.com/ml-explore/mlx-swift-examples/pull/206) is finalized and merged, you have to compile the project [from this fork](https://github.com/cyrilzakka/mlx-swift-examples), and then you can use the `llm-tool` CLI on your Mac as follows.
 
 For image inference:
 
@@ -343,14 +343,14 @@ If you integrate SmolVLM2 in your apps using MLX and Swift, we'd love to know ab
 ### Fine-tuning SmolVLM2
 
 You can fine-tune SmolVLM2 on videos using transformers 🤗
-We have fine-tuned 500M variant in Colab on video-caption pairs in [VideoFeedback](https://huggingface.co/datasets/TIGER-Lab/VideoFeedback) dataset for demonstration purposes. Since 500M variant is small, it's better to apply full fine-tuning instead of QLoRA or LoRA, meanwhile you can try to apply QLoRA on 2.2B variant. You can find the fine-tuning notebook [here](https://github.com/huggingface/smollm/blob/main/vision/finetuning/SmolVLM2_Video_FT.ipynb).
+We have fine-tuned the 500M variant in Colab on video-caption pairs in the [VideoFeedback](https://huggingface.co/datasets/TIGER-Lab/VideoFeedback) dataset for demonstration purposes. Since the 500M variant is small, it's better to apply full fine-tuning instead of QLoRA or LoRA. Meanwhile, you can try to apply QLoRA on the 2.2B variant. You can find the fine-tuning notebook [here](https://github.com/huggingface/smollm/blob/main/vision/finetuning/SmolVLM2_Video_FT.ipynb).
 
 
 ## Read More
 
 We would like to thank Raushan Turganbay, Arthur Zucker and Pablo Montalvo Leroux for their contribution of the model to transformers.
 
-We are looking forward to see all the things you'll build with SmolVLM2!
-If you'd like to learn more about SmolVLM family of models, feel free to read the following:
+We are looking forward to seeing all the things you'll build with SmolVLM2!
+If you'd like to learn more about the SmolVLM family of models, feel free to read the following:
 
 [SmolVLM2 - Collection with Models and Demos](https://huggingface.co/collections/HuggingFaceTB/smolvlm2-smallest-video-lm-ever-67ab6b5e84bf8aaa60cb17c7)
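
For context on the fine-tuning paragraph edited above: below is a minimal sketch of the two setups it contrasts, full fine-tuning of the small 500M checkpoint versus QLoRA on the larger 2.2B checkpoint. The checkpoint names, quantization settings, and LoRA target modules are assumptions for illustration only; the notebook linked in the paragraph contains the actual recipe.

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Full fine-tuning: the 500M checkpoint is small enough to train all weights directly.
small_id = "HuggingFaceTB/SmolVLM2-500M-Video-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(small_id)
model = AutoModelForImageTextToText.from_pretrained(small_id, torch_dtype=torch.bfloat16)
model.train()  # every parameter stays trainable

# QLoRA: quantize the larger 2.2B checkpoint to 4-bit and train only low-rank adapters.
large_id = "HuggingFaceTB/SmolVLM2-2.2B-Instruct"  # assumed checkpoint name
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
qlora_model = AutoModelForImageTextToText.from_pretrained(large_id, quantization_config=bnb_config)
lora_config = LoraConfig(  # illustrative hyperparameters, not the notebook's exact values
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
)
qlora_model = get_peft_model(qlora_model, lora_config)
qlora_model.print_trainable_parameters()  # only the adapter weights are trainable
```

Either model can then go through a standard training loop over the video-caption pairs; the notebook linked in the paragraph walks through the full setup.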
