Espresso: direct ANE inference at 4.76x CoreML — potential complementary approach? #372
christopherkarani started this conversation in General
Hi MLX team,
I wanted to share Espresso (https://github.com/christopherkarani/Espresso) and get your technical perspective.
Espresso is an open-source (MIT), pure-Swift framework for Apple Silicon that accesses the ANE directly via the private MIL text dialect — bypassing CoreML's abstraction layer. On M3 Max we achieve:
I see Espresso as potentially complementary to MLX rather than competitive: MLX is an excellent general-purpose ML framework, while Espresso demonstrates the performance ceiling for ANE inference on specific transformer architectures.
A few questions for the team:
Happy to discuss further in this thread or via a GitHub issue.
— Chris
https://github.com/christopherkarani/Espresso