Replies: 1 comment 8 replies
-
Interesting findings. Which code are you running? I'm wondering if setting the format_options for both PDF and Image could end up loading the models twice. If so, we should investigate and fix it. cc @cau-git
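One way to probe this is to pass a single shared pipeline-options object for both input formats instead of constructing one per format. A minimal configuration sketch, assuming docling's `PdfFormatOption` / `ImageFormatOption` API with a VLM pipeline (whether the loaded model weights are actually shared across formats depends on docling's internals, which is exactly what's in question here):

```python
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.document_converter import (
    DocumentConverter,
    ImageFormatOption,
    PdfFormatOption,
)
from docling.pipeline.vlm_pipeline import VlmPipeline

# One options instance, referenced by both format entries.
pipeline_options = VlmPipelineOptions()

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_cls=VlmPipeline, pipeline_options=pipeline_options
        ),
        InputFormat.IMAGE: ImageFormatOption(
            pipeline_cls=VlmPipeline, pipeline_options=pipeline_options
        ),
    }
)
```

If GPU memory use drops when only one format is registered, that would point at the per-format model instantiation as the culprit.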
-
I am trying to execute https://docling-project.github.io/docling/examples/compare_vlm_models/ for https://huggingface.co/mistralai/Pixtral-12B-2409.
But loading this model fails with:
Given that the GPU is an L40 with about 46 GB of RAM, I was hoping to be able to load this 12B model -- I can usually load much larger (but quantized) models (30 billion parameters easily, sometimes even larger).
So far I have not set `PYTORCH_CUDA_ALLOC_CONF`. It looks like this model might be loaded twice -- could this be true?
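The double-load suspicion is consistent with a back-of-envelope estimate. A rough sketch, assuming half-precision (bf16/fp16) weights at 2 bytes per parameter and ignoring activation, KV-cache, and CUDA-context overhead (all numbers here are my assumptions, not from the thread):

```python
# Weights-only VRAM estimate for a 12B-parameter model.
PARAMS = 12e9
BYTES_PER_PARAM = 2  # bf16 / fp16 assumption

one_copy_gib = PARAMS * BYTES_PER_PARAM / 1024**3
two_copies_gib = 2 * one_copy_gib

print(f"one copy:   {one_copy_gib:.1f} GiB")    # ~22.4 GiB
print(f"two copies: {two_copies_gib:.1f} GiB")  # ~44.7 GiB

# On a ~46 GiB L40, one copy fits easily; two copies leave under 2 GiB
# for activations and the CUDA context, so an OOM during load would be
# consistent with the model being held in memory twice.
assert one_copy_gib < 23
assert 44 < two_copies_gib < 45
```

So a single bf16 copy should load comfortably, while two copies would leave almost no headroom.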