Commit a7c99ec

Add Qwen 3 VL Support (Dense) (#414)
* Add Qwen3-VL model support
* Clean up model implementation
* Removing debug logging
* Revert ContentView and VLMEvaluator
* Refactor Qwen3VLLanguage and Qwen3VLVision models
* Replace custom splitFeatures() with native MLX split(indices:)

  Use MLX's built-in split() function instead of manual Swift slicing loop. Converts size array to cumulative indices for MLX's split(indices:) method. Removes 12 lines of unnecessary Swift code.
* Vectorize rotaryPositionEmbedding() using MLX broadcasting instead of nested loops

  Replace 5-nested-loop coordinate generation with MLX vectorized operations:
  - Changed from Swift Array loops (~113K iterations) to MLX broadcasting
  - Use expandedDimensions() for Cartesian product generation
  - Apply arithmetic on broadcasted tensors instead of element-wise appends
  - Use stacked() and tiled() for final coordinate organization

  This is a direct port of Python's meshgrid pattern already used in positionalEmbeddings(). Expected speedup: 50-75% for this function.
* Refine rotaryPositionEmbedding() vectorization implementation
  - Add guard statement to skip empty grids (mergedH or mergedW == 0)
  - Use .reshaped() instead of multiple expandedDimensions() calls
  - Move mergeScalar computation outside the loop for efficiency
  - Explicit broadcast to target shape before flattening
  - Cleaner variable naming and comments

  This approach is more efficient and matches the Python implementation pattern.
* Refactor Qwen3VL and Qwen3VLVision for improved cumulative index calculations
* Refactor ContentView and VLMEvaluator
* Update model configuration in VLMEvaluator
* Update model configuration in VLMEvaluator to use `smolvlm`
* Remove Qwen3VL_PositionIDs, Qwen3VLConfiguration, Qwen3VLLanguage, and Qwen3VLVision files
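The size-to-cumulative-index conversion mentioned for `split(indices:)` can be illustrated in plain Swift. This is only a sketch, not the code from this commit: `splitIndices(fromSizes:)` and `splitArray(_:atIndices:)` are hypothetical helpers, and it assumes the split boundaries follow the NumPy-style convention, where each index marks the start of the next chunk and the final total is omitted.

```swift
// Hypothetical helper, not taken from the commit.
// Converts per-chunk sizes into split boundaries: the running total after
// each chunk, excluding the final total (assumed NumPy-style convention).
func splitIndices(fromSizes sizes: [Int]) -> [Int] {
    var running = 0
    var indices: [Int] = []
    for size in sizes.dropLast() {
        running += size
        indices.append(running)
    }
    return indices
}

// Plain-Swift stand-in for an index-based split, used only to check
// that the computed boundaries reproduce the requested chunk sizes.
func splitArray<T>(_ values: [T], atIndices indices: [Int]) -> [[T]] {
    var chunks: [[T]] = []
    var start = 0
    for end in indices + [values.count] {
        chunks.append(Array(values[start..<end]))
        start = end
    }
    return chunks
}

let sizes = [4, 6, 2]
let indices = splitIndices(fromSizes: sizes)
let chunks = splitArray(Array(0..<12), atIndices: indices)
print(indices)                  // prints "[4, 10]"
print(chunks.map { $0.count })  // prints "[4, 6, 2]"
```

With boundaries computed this way, a single index-based split call can stand in for a manual slicing loop, which is the kind of simplification the commit message describes.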
1 parent b12ef41 commit a7c99ec

4 files changed: +1802 -14 lines changed


Applications/VLMEval/ContentView.swift

Lines changed: 0 additions & 1 deletion
@@ -416,7 +416,6 @@ class VLMEvaluator {
 
 let stream = try MLXLMCommon.generate(
 input: lmInput, parameters: generateParameters, context: context)
-
 // generate and output in batches
 for await batch in stream._throttle(
 for: updateInterval, reducing: Generation.collect)
