1 parent 12027be commit 6cf9d67
torchtitan/experiments/vlm/README.md
@@ -16,4 +16,4 @@ Distributed training usually does not play nice with input of varying shapes. To
Then we scatter the patch embeddings to their actual positions in the LLM input tokens.
This results in a very simple and general interface to train modern VLMs with interleaved data and native resolution & aspect ratio.
By setting the appropriate dataloader hyperparameters, we can easily reduce the number of padding tokens.
-We leverage Flex Attention to efficiently handle varying number of patches per image.
+We leverage FlexAttention to efficiently handle varying number of patches per image.
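
The sketch below is a rough, self-contained illustration (not torchtitan's actual code) of the two ideas in the README excerpt: scattering vision-encoder patch embeddings into placeholder positions of the LLM input, and using FlexAttention's document-style masking so that patches packed from several images, each with a different number of patches, only attend within their own image. All names (`IMG_TOKEN_ID`, the shapes, the `same_image` mask) are illustrative assumptions, and it requires a recent PyTorch build that ships FlexAttention.

```python
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

# --- Scatter patch embeddings into the LLM input sequence -------------------
# IMG_TOKEN_ID, the shapes, and the tensors below are made up for illustration.
IMG_TOKEN_ID = 32_000                     # hypothetical placeholder id for image-patch slots
B, S, H, D = 2, 128, 4, 16                # batch, sequence length, heads, head dim

tokens = torch.full((B, S), 1)            # dummy text token id everywhere
tokens[:, 8:72] = IMG_TOKEN_ID            # pretend 64 slots per sample hold an image

token_embeds = torch.randn(B, S, H * D)   # text embeddings from the LLM embedding table
is_patch = tokens == IMG_TOKEN_ID         # positions reserved for image patches
patch_embeds = torch.randn(int(is_patch.sum()), H * D)  # stand-in for vision-encoder output
token_embeds[is_patch] = patch_embeds     # scatter patches into their actual token positions

# --- FlexAttention with a varying number of patches per image ---------------
# Patches from several images are packed into one sequence; the mask keeps
# attention within each image, so no per-image padding is needed.
device = "cuda" if torch.cuda.is_available() else "cpu"
image_ids = torch.tensor([0] * 40 + [1] * 24 + [2] * 64, device=device)  # 3 images, 128 patches total
L = image_ids.numel()

def same_image(b, h, q_idx, kv_idx):
    # Allow attention only between patches that belong to the same image.
    return image_ids[q_idx] == image_ids[kv_idx]

block_mask = create_block_mask(same_image, B=None, H=None, Q_LEN=L, KV_LEN=L, device=device)
q = k = v = torch.randn(1, H, L, D, device=device)
out = flex_attention(q, k, v, block_mask=block_mask)  # (1, H, L, D)
```

Because the mask is a function of per-patch image ids rather than a fixed shape, adding or removing images only changes `image_ids`; the same kernel handles any mix of image sizes without padding each image to a common patch count.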