Replies: 3 comments 8 replies
-
Oh, also, given that MLX is optimized for unified memory, does it make sense to provide IOSurface-backed images - is there any advantage from a client perspective? I don't see any IOSurface backing options in the media pipeline in MLX, so my presumption is no?
-
Largely the API that exists is what I had specific use cases for -- it can and should be extended. I think we need to consider the costs to current implementations and how we would consume these inputs. For example, the `Image` case has a sort of canonical representation in […]. The image side does handle an array of […]. So having […]. We already have something close -- the […].
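To make the "extend it" idea concrete, here is a purely hypothetical sketch of what an in-memory frames case could look like next to the existing asset-based case. None of these names are part of the current MLX-Swift API; this is just a shape for discussion.

```swift
import Foundation
import CoreImage

// HYPOTHETICAL -- not the real MLX-Swift API, just a proposed shape.
enum VideoInput {
    /// Existing style of input: a fixed asset on disk,
    /// sampled by the media-processing pipeline.
    case url(URL)

    /// Proposed addition: frames already in memory (live camera,
    /// rendered content, etc.), with the capture rate included so
    /// temporal sampling could still be applied consistently.
    case frames([CIImage], fps: Double)
}
```

The `fps` parameter is there because the asset path can derive timing from the `AVAsset`, while in-memory frames would need it supplied by the caller.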

-
Hi
I'm the author of Fabric, which aims to be a modern replacement for Quartz Composer - a content creation tool that made nodes and patching cool way before ComfyUI was a thing :)
I've integrated standard LLM calling by porting some of the code from the MLX-Swift examples, which works great, but I'm trying to wrap my head around video understanding with VLMs in a context where I don't have movie files to process (think live camera input, rendered content, processed content, etc.).
In reviewing the code, I see that in `MediaProcessing` every video path assumes a fixed asset (`AVAsset`), which is also passed through to the prompting subsystem via `UserInput`, which has a `Video` struct - but I don't see a way to send a set of frames I have in memory through the standard `UserInput` or `Prompt` methods into a VLM?
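For concreteness, here is roughly what the asset-based path looks like versus what I'd like to express. The case names (`.avAsset`, and the imagined `.frames`) are from my reading of the examples and may not match the real API exactly.

```swift
import AVFoundation
import CoreImage

// What exists today (roughly): video input is tied to a fixed asset.
let asset = AVURLAsset(url: URL(fileURLWithPath: "/tmp/clip.mov"))
var input = UserInput(prompt: "Describe this video")
input.videos = [.avAsset(asset)]

// What I'd like to express, but can't find a path for: frames I
// already hold in memory (live camera, rendered content, etc.)
// let myFrames: [CIImage] = ...
// input.videos = [.frames(myFrames)]   // no such case exists today
```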
I see that `Image` can take in a `CIImage` or an existing `MLXArray`, which is awesome, but I'd love to know if it makes sense to expose an `MLXArray`, or an array of `CIImage`s as a fixed-length sequence, to be processed by a VLM as part of a `UserInput`?
I do see `ProcessedVideo` as a field on `LMInput` - am I right to assume the correct path would be something like injecting my own `ProcessedVideo` frames into an `LMInput` I get from the standard `UserInput` prompt, which would have video options enabled (since I have no asset to reference)?
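If that injection route is viable, I imagine something along these lines. To be clear, the `asMLXArray` helper and the `ProcessedVideo` initializer and field names here are guesses on my part, not the documented API:

```swift
import CoreImage
import MLX

// HYPOTHETICAL sketch of the injection idea: preprocess in-memory
// frames myself, then splice the result into the LMInput produced
// by the normal UserInput/prompt path.
func inject(frames: [CIImage], into input: LMInput) -> LMInput {
    // Convert each CIImage to an MLXArray and stack into a
    // (frames, height, width, channels) tensor -- the per-model
    // processor would normally handle resizing/normalization here.
    let arrays = frames.map { MediaProcessing.asMLXArray($0) } // assumed helper
    let pixels = stacked(arrays)

    var result = input
    result.video = LMInput.ProcessedVideo(pixels: pixels)      // assumed initializer
    return result
}
```

The open question is whether the per-model processors expect `ProcessedVideo` to already carry model-specific preprocessing (resize, normalization, frame sampling), in which case bypassing them like this would need care.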