[Call for Contribution] Enhance Model Weight Storage for Mooncake Store

### Describe your feature request

## Background

The Mooncake Store, built on top of the Mooncake Transfer Engine, has demonstrated excellent performance and stability. As a result, it is increasingly being adopted for **model weight storage and update workflows** in LLM systems.

Several real-world scenarios benefit from this capability:

* **Reinforcement Learning (RL):** Mooncake Store can be used to efficiently publish and fetch updated model weights during training loops.
* **Model Management:** It can serve as a high-performance backend for storing and retrieving model weights across different training or inference services.

Many existing system designs, APIs, and optimizations in Mooncake Store were originally built for **KVCache storage**. A large portion of these components can be reused for model weight storage. However, certain design assumptions made for KVCache workloads are not ideal for model weight scenarios.

### Current Limitations

1. **Lack of a Hard Pin Mechanism**

   In most cases, model weights should **not be evicted unexpectedly**.

   Currently, the system provides a *soft pin* mechanism. In theory, soft pinning can emulate hard pin behavior using configurations such as:

   ```bash
   --default_kv_soft_pin_ttl=N
   --allow_evict_soft_pinned_objects=false
   ```

   where `N` is a sufficiently large number.

   However, this approach is **not user-friendly**, and the semantics are indirect. A dedicated **hard pin mechanism** would provide clearer guarantees and improve usability.

2. **Missing Upsert Interface**

   Mooncake Store was originally designed on the assumption that KVCache entries are **immutable**, so the system does not provide update semantics.

   In reinforcement learning workflows, however, **model weights are updated frequently**. The current workaround is a `remove → put` sequence: the old weight object is removed, then the new one is put under the same key.

   While this approach works, providing a native **`upsert` interface** would offer several advantages:

   * Simplifies user workflows
   * Provides a clearer API for weight updates
   * Creates opportunities for future optimization
   * Enables improved fault tolerance for update operations
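To make the difference concrete, here is a minimal sketch of the `remove → put` workaround next to a single-step `upsert`. `ToyStore` is a hypothetical in-memory stand-in written for this illustration, not the Mooncake Store client API, and the `between` hook is an assumption that simulates a reader (or a crash) arriving between the two calls.

```python
import threading


class ToyStore:
    """In-memory stand-in for an object store; NOT the Mooncake Store API."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        # Mirrors immutable-object semantics: an existing key cannot be overwritten.
        with self._lock:
            if key in self._objects:
                raise KeyError(f"{key!r} already exists")
            self._objects[key] = value

    def remove(self, key):
        with self._lock:
            self._objects.pop(key, None)

    def get(self, key):
        with self._lock:
            return self._objects.get(key)

    def upsert(self, key, value):
        # Hypothetical operation: insert or replace in one step under one lock.
        with self._lock:
            self._objects[key] = value


def update_with_remove_put(store, key, value, between=None):
    """Current workaround: two separate calls with a gap in between."""
    store.remove(key)
    if between is not None:
        between()  # stands in for a concurrent reader or a crash point
    store.put(key, value)


store = ToyStore()
store.put("policy/weights", b"v1")

# A reader that arrives between remove and put finds no weights at all.
seen = []
update_with_remove_put(
    store,
    "policy/weights",
    b"v2",
    between=lambda: seen.append(store.get("policy/weights")),
)

# With upsert, the key always maps to either the old or the new value.
store.upsert("policy/weights", b"v3")
```

In this toy model the gap is the whole problem: a failure after `remove` leaves the key empty, whereas a single `upsert` call gives the store one place to add atomicity or recovery logic later.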

## Discussion

During early discussions, we considered introducing an **“RL mode”** for the master component. In this mode, the system would:

* Enable **hard pinning by default**
* Support **upsert operations**
* Apply other configurations optimized for reinforcement learning workloads

However, we later realized that **model weight storage is not limited to reinforcement learning**.

For example, in **model management systems**, different models may have different storage requirements:

* **Important models** may require **hard pinning**
* **Less important models** may use **soft pinning**
* Remaining storage space may still be used to **cache KVCache objects**

Because of these mixed workloads, introducing a **fixed mode with predefined behaviors** would reduce flexibility.

Instead, we believe a better approach is to **introduce these mechanisms as independent features**, such as:

* A native **hard pin mechanism**
* A dedicated **upsert API**

This design allows users to **compose behaviors based on their needs**, while keeping the system extensible for future requirements such as engram table storage.
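The mixed workload described above can be sketched as a per-object pin level consulted by the eviction policy. This is an illustrative toy, not Mooncake Store's implementation: the `Pin` enum, the `ToyCache` class, and the rule that soft-pinned objects are evicted only after all unpinned ones are assumptions made for the example, and soft pin TTLs are left out.

```python
from collections import OrderedDict
from enum import Enum


class Pin(Enum):
    NONE = 0  # e.g. KVCache objects, evicted first
    SOFT = 1  # evicted only when no unpinned object is left (assumed rule)
    HARD = 2  # never evicted


class ToyCache:
    """Capacity-bounded store with a per-object pin level (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = OrderedDict()  # key -> (size, pin), oldest first

    def _pick_victim(self):
        # Prefer the oldest unpinned object, then the oldest soft-pinned one.
        for level in (Pin.NONE, Pin.SOFT):
            for key, (_, pin) in self.entries.items():
                if pin is level:
                    return key
        return None  # only hard-pinned objects remain

    def put(self, key, size, pin=Pin.NONE):
        if key in self.entries:
            raise KeyError(f"{key!r} already exists")
        # Refuse up front if hard-pinned data alone leaves too little room,
        # so a failed put never evicts anything.
        hard_bytes = sum(s for s, p in self.entries.values() if p is Pin.HARD)
        if hard_bytes + size > self.capacity:
            raise MemoryError("not enough evictable space")
        while self.used + size > self.capacity:
            victim = self._pick_victim()
            victim_size, _ = self.entries.pop(victim)
            self.used -= victim_size
        self.entries[key] = (size, pin)
        self.used += size


cache = ToyCache(capacity=10)
cache.put("model/prod", 4, Pin.HARD)          # important model
cache.put("model/experimental", 3, Pin.SOFT)  # less important model
cache.put("kv/a", 3)                          # KVCache fills the rest
cache.put("kv/b", 3)                          # evicts kv/a
cache.put("kv/c", 4)                          # evicts kv/b, then the soft-pinned model
```

Because the pin level is a property of each object and not of the whole deployment, the same store can hold hard-pinned weights, soft-pinned weights, and ordinary KVCache entries side by side, which a fixed "RL mode" would not allow.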

## Proposed Contributions

We welcome contributions in the following areas:

1. **Hard Pin Support**

   * Introduce a hard pin mechanism that guarantees pinned objects are never evicted by the eviction policy

2. **Upsert API**

   * Implement an `upsert` interface for updating objects

3. **Documentation and Examples**

   * Provide usage examples for model weight storage
   * Document recommended patterns for RL and model management scenarios

## Expected Outcome

By introducing these mechanisms, Mooncake Store will better support **model weight storage workloads**, including:

* Reinforcement learning training pipelines
* Model management systems
* Hybrid environments combining **KVCache and model weights**

At the same time, this approach preserves the flexibility and extensibility of the system architecture.

## How to Contribute

If you are interested in contributing:

1. Join the discussion in this issue.
2. Share your design proposals or implementation ideas.
3. Submit a pull request with your implementation.

We welcome feedback, design discussions, and implementation contributions from the community.


### Before submitting a new issue...

- [ ] Make sure you already searched for relevant issues and read the [documentation](https://kvcache-ai.github.io/Mooncake/)
