Possible to extend to VLMs? #17

@sterzhang

Hi Authors,

Thanks for your great work! I really enjoyed reading your paper, and it's pretty inspiring. I have run some experiments on the reasoning tasks mentioned in your paper, and the results are pretty good!

I am wondering whether it is possible to extend LatentMAS to visual reasoning tasks as well. For example, could you directly use a vision-language model such as Qwen3-VL-8B as the backbone for image perception and understanding tasks like VLM2-Bench (https://vlm2-bench.github.io/) … that require model reasoning and collaboration as well?
