[Ecosystem] Mooncake

### Contact emails


james0zan@gmail.com, me@zhyncs.com


### Project summary


A KVCache-centric Disaggregated Architecture for LLM Serving


### Project description


* Transfer Engine is a high-performance data transfer framework. It provides a unified interface for transferring data to and from DRAM, VRAM, or NVMe while hiding hardware-specific details. Transfer Engine supports TCP, RDMA (InfiniBand/RoCEv2/eRDMA/NVIDIA GPUDirect), and NVMe over Fabrics (NVMe-oF) protocols.

* P2P Store is built on the Transfer Engine and supports sharing temporary objects between peer nodes in a cluster. P2P Store is ideal for scenarios like checkpoint transfer, where data needs to be rapidly and efficiently shared across a cluster.

* Mooncake Store is a distributed KVCache storage engine specialized for LLM inference based on Transfer Engine. It is the central component of the KVCache-centric disaggregated architecture. The goal of Mooncake Store is to store the reusable KV caches across various locations in an inference cluster.

* Mooncake Backend serves as a fault-tolerant PyTorch distributed backend. It provides robust collective communication primitives capable of continuing operation seamlessly in the presence of rank failures. Mooncake EP extends these capabilities to support elastic and fault-tolerant MoE model inference with dynamic token routing.



### Are there any other projects in the PyTorch Ecosystem similar to yours? If yes, what are they?


No.

While SGLang, vLLM, and LMCache also target LLM serving, they complement Mooncake rather than overlap with it. Mooncake is deeply integrated into these projects as a foundational infrastructure layer, providing high-performance data transfer for prefill-decode (PD) and encode-prefill-decode (EPD) disaggregated architectures, distributed and shared KV cache storage, and collective communication for expert parallelism (EP), among other core functionality.


### Project repo URL


https://github.com/kvcache-ai/Mooncake


### Additional repos in scope of the application


No response


### Project license


Apache License


### GitHub handles of the project maintainer(s)


@james0zan, @stmatengss, @UNIDY2002, @ShangmingCai, @alogfans, @chestnut-Q, @ykwd



### Is there a corporate or academic entity backing this project? If so, please provide the name and URL of the entity.


No response


### Website URL


https://kvcache-ai.github.io/Mooncake/


### Documentation


https://kvcache-ai.github.io/Mooncake/getting_started/quick-start.html

https://kvcache-ai.github.io/Mooncake/index.html


### How do you build and test the project today (continuous integration)? Please describe.


We rely on GitHub Actions as our CI system to automatically build and test the project. An example CI run can be found here: https://github.com/kvcache-ai/Mooncake/actions/runs/20810476279



<img width="1146" height="677" alt="Image" src="https://github.com/user-attachments/assets/f562aced-10d5-4631-85cd-9fa0b5d6e736" />



### Version of PyTorch


2.8.0, 2.9.0, 2.9.1


### Components of PyTorch


As a communication library for LLM serving, this project primarily leverages PyTorch Distributed. See:

https://docs.pytorch.org/docs/stable/distributed.html

https://kvcache-ai.github.io/Mooncake/python-api-reference/ep-backend.html

### How long do you expect to maintain the project?


This project is a critical dependency for inference engines such as vLLM and SGLang. It powers key components, including Transfer Engine for data transfer in PD disaggregation and Mooncake Store for distributed CPU memory caching. As a result, the project will be actively and continuously maintained.


### Additional information


Mooncake’s architecture is based on the [FAST 2025 Best Paper](https://www.usenix.org/conference/fast25/presentation/qin). The open-source project has 130 contributors, including developers from NVIDIA, AMD, Intel, Google, Moonshot AI, Tsinghua University, Stanford, Alibaba, Approaching.ai, Ant Group, Tencent, and other organizations, as well as individual contributors.


Mooncake has deep collaboration and integration with PyTorch Foundation projects such as SGLang, vLLM, and LMCache, and has been widely adopted across many companies, operating at scale on thousands of GPUs.
