Closed
Labels: feature request, pending on roadmap
Description
When loading a fine-tuned VLM (specifically tested on SmolVLM and Qwen2-VL) using FastVisionModel.from_pretrained, the loader returns only a Tokenizer object (e.g., GPT2TokenizerFast) instead of the full Vision-Language Processor (including Image and Video Processor).
This happens because the saving process writes "processor_class": "_Unsloth_Patched_SmolVLMProcessor" into processor_config.json. Upon reloading, even within Unsloth, the library fails to map this patched name back to the correct processing logic, causing a fallback to text-only components.
Reproduction Code:

```python
from unsloth import FastVisionModel

# 1. After training and saving:
#    processor_config.json contains "_Unsloth_Patched_SmolVLMProcessor"

# 2. Attempting to load for inference:
model, processor = FastVisionModel.from_pretrained(
    model_name = "path/to/finetuned_folder",
    load_in_4bit = True,
)
print(processor)
```

Output (truncated):

```
- tokenizer: GPT2TokenizerFast(name_or_path='/home/hamzas/github/VLM/outputs/smolvlm2-256m/2026-02-06/15-33-03/checkpoint-237', vocab_size=49152, model_max_length=8192, is_fast=True, padding_side='left', truncation_side='left', special_tokens={'bos_token': '<|im_start|>', 'eos_token': '<end_of_utterance>', 'unk_token': '<|endoftext|>', 'pad_token': '<|im_end|>', 'additional_special_tokens'...
```
The workaround I found is to manually remove the `_Unsloth_Patched_` prefix from the `processor_class` entry in every processor config JSON; only then does the processor load with the image and video processor components.
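The manual workaround can be scripted. Below is a minimal sketch that strips the `_Unsloth_Patched_` prefix from every `*processor_config.json` in a checkpoint folder; the function name and directory path are my own, not part of Unsloth's API, and this assumes the patched name is simply the real processor class with that prefix prepended:

```python
import json
from pathlib import Path

PREFIX = "_Unsloth_Patched_"

def unpatch_processor_configs(checkpoint_dir: str) -> None:
    """Strip the Unsloth patch prefix from `processor_class` entries so
    transformers can map the name back to the real processor class."""
    for cfg_file in Path(checkpoint_dir).glob("*processor_config.json"):
        cfg = json.loads(cfg_file.read_text())
        cls = cfg.get("processor_class", "")
        if cls.startswith(PREFIX):
            cfg["processor_class"] = cls[len(PREFIX):]
            cfg_file.write_text(json.dumps(cfg, indent=2))

# Usage (hypothetical path):
# unpatch_processor_configs("path/to/finetuned_folder")
```

After running this over the checkpoint, `FastVisionModel.from_pretrained` should pick up the full processor rather than falling back to the tokenizer.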