[Bug] FastVisionModel.from_pretrained fails to load full Processor if processor_class is patched (SmolVLM/Qwen2-VL) #4085

@HamzaSulaiman1996

Description

When loading a fine-tuned VLM (specifically tested on SmolVLM and Qwen2-VL) using FastVisionModel.from_pretrained, the loader returns only a Tokenizer object (e.g., GPT2TokenizerFast) instead of the full Vision-Language Processor (including Image and Video Processor).

This happens because the saving process writes "processor_class": "_Unsloth_Patched_SmolVLMProcessor" into processor_config.json. Upon reloading, even within Unsloth, the library fails to map this patched name back to the correct processing logic, causing a fallback to text-only components.

Reproduction Code:

from unsloth import FastVisionModel

# 1. After training and saving:
# processor_config.json contains "_Unsloth_Patched_SmolVLMProcessor"

# 2. Attempting to load for inference:
model, processor = FastVisionModel.from_pretrained(
    model_name = "path/to/finetuned_folder",
    load_in_4bit = True
)

print(processor)

- tokenizer: GPT2TokenizerFast(name_or_path='/home/hamzas/github/VLM/outputs/smolvlm2-256m/2026-02-06/15-33-03/checkpoint-237', vocab_size=49152, model_max_length=8192, is_fast=True, padding_side='left', truncation_side='left', special_tokens={'bos_token': '<|im_start|>', 'eos_token': '<end_of_utterance>', 'unk_token': '<|endoftext|>', 'pad_token': '<|im_end|>', 'additional_special_tokens'.................
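A quick way to confirm whether the loader fell back to a text-only tokenizer (as described above) is to check for an image-processing component. This is a sketch, assuming the object follows the usual Hugging Face `ProcessorMixin` convention of exposing an `image_processor` attribute:

```python
def is_full_processor(processor) -> bool:
    """Return True if the object carries an image processor, i.e. it is a
    full vision-language processor rather than a bare tokenizer.

    Assumes the Hugging Face convention that ProcessorMixin subclasses
    expose their vision component as `image_processor`.
    """
    return getattr(processor, "image_processor", None) is not None


# Usage sketch after loading:
# assert is_full_processor(processor), "Loader returned a text-only tokenizer"
```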

The workaround I found is to manually remove the `_Unsloth_Patched_` prefix from the `processor_class` field in every processor config JSON; only then does the processor load with the Image and Video Processor configs.
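The manual edit can be automated. The sketch below walks a checkpoint folder, strips the `_Unsloth_Patched_` prefix from any `processor_class` field it finds, and rewrites the file; the glob pattern and prefix string are assumptions based on the filenames and class name shown in this report:

```python
import json
from pathlib import Path

PREFIX = "_Unsloth_Patched_"

def unpatch_processor_configs(checkpoint_dir: str) -> list:
    """Strip the Unsloth patch prefix from every `processor_class` field
    found in the checkpoint's *config*.json files.

    Returns the list of filenames that were modified.
    """
    fixed = []
    for cfg_path in Path(checkpoint_dir).glob("*config*.json"):
        data = json.loads(cfg_path.read_text())
        cls = data.get("processor_class", "")
        if cls.startswith(PREFIX):
            # e.g. "_Unsloth_Patched_SmolVLMProcessor" -> "SmolVLMProcessor"
            data["processor_class"] = cls[len(PREFIX):]
            cfg_path.write_text(json.dumps(data, indent=2))
            fixed.append(cfg_path.name)
    return fixed
```

Running this on the fine-tuned folder before calling `FastVisionModel.from_pretrained` should reproduce the manual workaround in one step.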

Metadata

Assignees

No one assigned

    Labels

    feature request (Feature request pending on roadmap)
