[RFC]: Data Type Support in Mooncake Store #1673

@stmatengss

Changes proposed

Summary

Extend Mooncake Store to support explicit data type classification for stored objects (KVCache, Tensor, Weight, Sample, etc.), enabling type-aware optimizations for storage, eviction, replication, and retrieval policies.

Motivation

Currently, Mooncake Store treats all objects as opaque byte blobs without semantic understanding of their content. This limits optimization opportunities:

  1. Eviction Policy: KVCache blocks have different access patterns than model weights
  2. Replication Strategy: Weights benefit from a higher replica count, while samples can often tolerate a lower one
  3. Compression: Different data types have different compression characteristics
  4. Prefetching: Type-aware prefetching can improve cache hit rates
  5. Monitoring: Type-specific metrics enable better observability
  6. Storage Tiering: Critical types (weights) stay in faster storage tiers

Design Goals

  • Backward Compatibility: Existing clients continue working without modification
  • Minimal Overhead: Type metadata adds <1% storage overhead
  • Extensibility: Easy to add new data types without breaking changes
  • Type-Aware Policies: Enable per-type eviction, replication, and allocation strategies

Proposed Design

1. Data Type Enumeration

Add a new enum in mooncake-store/include/types.h:

#include <cstdint>

enum class ObjectDataType : uint8_t {
    UNKNOWN = 0,      // Default for backward compatibility
    KVCACHE = 1,      // KV cache blocks for LLM inference
    TENSOR = 2,       // General tensor data
    WEIGHT = 3,       // Model weights/parameters
    SAMPLE = 4,       // Training samples or prompts
    ACTIVATION = 5,   // Intermediate activations
    GRADIENT = 6,     // Gradient tensors
    OPTIMIZER_STATE = 7,  // Optimizer state (momentum, etc.)
    METADATA = 8,     // Model metadata, configs
    // Reserved 9-255 for future types
};
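The enum will likely need a printable name (for logs and metric labels) and a safe way to decode a raw byte. The following is a minimal sketch of both. The helper names `to_string` and `data_type_from_byte` are hypothetical and not part of the proposal, and mapping reserved values (9-255) to UNKNOWN is an assumption about how older binaries should treat types added later.

```cpp
#include <cstdint>
#include <string_view>

enum class ObjectDataType : std::uint8_t {
    UNKNOWN = 0, KVCACHE = 1, TENSOR = 2, WEIGHT = 3, SAMPLE = 4,
    ACTIVATION = 5, GRADIENT = 6, OPTIMIZER_STATE = 7, METADATA = 8,
};

// Hypothetical helper: printable name for logs and metric labels.
constexpr std::string_view to_string(ObjectDataType t) {
    switch (t) {
        case ObjectDataType::UNKNOWN:         return "UNKNOWN";
        case ObjectDataType::KVCACHE:         return "KVCACHE";
        case ObjectDataType::TENSOR:          return "TENSOR";
        case ObjectDataType::WEIGHT:          return "WEIGHT";
        case ObjectDataType::SAMPLE:          return "SAMPLE";
        case ObjectDataType::ACTIVATION:      return "ACTIVATION";
        case ObjectDataType::GRADIENT:        return "GRADIENT";
        case ObjectDataType::OPTIMIZER_STATE: return "OPTIMIZER_STATE";
        case ObjectDataType::METADATA:        return "METADATA";
    }
    return "UNKNOWN";  // value outside the named range
}

// Hypothetical helper: decode a raw byte. Reserved values (9-255) fall back
// to UNKNOWN (assumption), so an old binary never sees an unnamed enumerator.
constexpr ObjectDataType data_type_from_byte(std::uint8_t raw) {
    return raw <= static_cast<std::uint8_t>(ObjectDataType::METADATA)
               ? static_cast<ObjectDataType>(raw)
               : ObjectDataType::UNKNOWN;
}
```

Falling back to UNKNOWN keeps old readers working but discards the original type, so a component that rewrites metadata may prefer to preserve the raw byte instead.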

2. Extended ReplicateConfig

Extend ReplicateConfig in mooncake-store/include/replica.h:

struct ReplicateConfig {
    size_t replica_num{1};
    bool with_soft_pin{false};
    std::vector<std::string> preferred_segments{};
    std::string preferred_segment{};
    bool prefer_alloc_in_same_node{false};

    // NEW: Data type specification
    ObjectDataType data_type{ObjectDataType::UNKNOWN};

    // NEW: Type-specific hints (optional)
    std::unordered_map<std::string, std::string> type_hints{};
};
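As an illustration of how a caller might fill in the new fields, here is a small sketch. It redeclares trimmed versions of the enum and struct so that it compiles on its own. The factory `make_weight_config` and the `"model"` hint key are hypothetical; the RFC does not define any hint keys.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

enum class ObjectDataType : std::uint8_t { UNKNOWN = 0, KVCACHE = 1, WEIGHT = 3 };

// Mirrors the proposed struct so the example is self-contained.
struct ReplicateConfig {
    std::size_t replica_num{1};
    bool with_soft_pin{false};
    std::vector<std::string> preferred_segments{};
    std::string preferred_segment{};
    bool prefer_alloc_in_same_node{false};
    ObjectDataType data_type{ObjectDataType::UNKNOWN};
    std::unordered_map<std::string, std::string> type_hints{};
};

// Hypothetical factory: a config for model weights with three replicas.
inline ReplicateConfig make_weight_config(std::string model_name) {
    ReplicateConfig cfg;
    cfg.replica_num = 3;
    cfg.data_type = ObjectDataType::WEIGHT;
    cfg.type_hints["model"] = std::move(model_name);  // illustrative hint key
    return cfg;
}
```

A default-constructed `ReplicateConfig` still carries `ObjectDataType::UNKNOWN`, which is what keeps existing call sites unchanged.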

3. Object Metadata Extension

Extend object metadata to store type information:

struct ObjectMetadata {
    ObjectKey key;
    Version version;
    uint64_t size;
    ObjectDataType data_type;  // NEW
    uint64_t created_at_ms;
    uint64_t last_accessed_at_ms;
    // ... existing fields
};

4. Type-Aware Eviction Strategy

Create new eviction strategy in mooncake-store/include/eviction_strategy.h:

class TypeAwareEvictionStrategy : public EvictionStrategy {
public:
    struct TypePolicy {
        double eviction_priority;  // Lower = harder to evict
        uint64_t min_ttl_ms;       // Minimum time before eviction
        bool allow_eviction;
    };

    void set_type_policy(ObjectDataType type, TypePolicy policy);
    std::vector<ObjectKey> select_victims(size_t target_bytes) override;
};

Default policies:

  • WEIGHT: eviction_priority=0.1 (rarely evicted), min_ttl_ms=3,600,000 (1 hour)
  • KVCACHE: eviction_priority=0.5 (moderate), min_ttl_ms=60,000 (60 s)
  • SAMPLE: eviction_priority=0.9 (evicted first), min_ttl_ms=10,000 (10 s)
  • UNKNOWN: eviction_priority=0.5 (default behavior)
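One possible reading of these policies is sketched below as a free function rather than the proposed class. Several details are assumptions: `Candidate` is a hypothetical view of object metadata, the TTL is measured from creation time, `allow_eviction` is true for every default, UNKNOWN has no minimum TTL, and ties in priority are broken by least-recent access.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

enum class ObjectDataType : std::uint8_t { UNKNOWN = 0, KVCACHE = 1, WEIGHT = 3, SAMPLE = 4 };

struct TypePolicy {
    double eviction_priority;  // Lower = harder to evict
    std::uint64_t min_ttl_ms;  // Minimum age before eviction
    bool allow_eviction;
};

struct Candidate {  // hypothetical view of one stored object
    std::string key;
    std::uint64_t size;
    ObjectDataType data_type;
    std::uint64_t created_at_ms;
    std::uint64_t last_accessed_at_ms;
};

using PolicyMap = std::unordered_map<ObjectDataType, TypePolicy>;

inline PolicyMap default_policies() {
    return {
        {ObjectDataType::WEIGHT,  {0.1, 3'600'000, true}},
        {ObjectDataType::KVCACHE, {0.5, 60'000, true}},
        {ObjectDataType::SAMPLE,  {0.9, 10'000, true}},
        {ObjectDataType::UNKNOWN, {0.5, 0, true}},  // min_ttl of 0 is an assumption
    };
}

// Returns keys to evict until at least target_bytes would be freed.
inline std::vector<std::string> select_victims(std::vector<Candidate> objects,
                                               const PolicyMap& policies,
                                               std::uint64_t now_ms,
                                               std::size_t target_bytes) {
    const TypePolicy fallback{0.5, 0, true};
    auto policy_for = [&](ObjectDataType t) -> const TypePolicy& {
        auto it = policies.find(t);
        return it != policies.end() ? it->second : fallback;
    };
    // Drop objects that are pinned by policy or still inside their minimum TTL.
    objects.erase(std::remove_if(objects.begin(), objects.end(),
                                 [&](const Candidate& c) {
                                     const TypePolicy& p = policy_for(c.data_type);
                                     return !p.allow_eviction ||
                                            now_ms < c.created_at_ms + p.min_ttl_ms;
                                 }),
                  objects.end());
    // Highest eviction priority first; within a priority, least recently used first.
    std::sort(objects.begin(), objects.end(),
              [&](const Candidate& a, const Candidate& b) {
                  const double pa = policy_for(a.data_type).eviction_priority;
                  const double pb = policy_for(b.data_type).eviction_priority;
                  if (pa != pb) return pa > pb;
                  return a.last_accessed_at_ms < b.last_accessed_at_ms;
              });
    std::vector<std::string> victims;
    std::size_t freed = 0;
    for (const Candidate& c : objects) {
        if (freed >= target_bytes) break;
        victims.push_back(c.key);
        freed += c.size;
    }
    return victims;
}
```

Because objects inside their minimum TTL are filtered out before sorting, the function can return fewer bytes than requested. A real strategy would need to decide whether to relax the TTL under memory pressure.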

5. API Changes

C++ API (backward compatible)

// Existing API continues to work (data_type defaults to UNKNOWN)
int put(const std::string& key, std::span<const char> value,
        const ReplicateConfig& config = ReplicateConfig{});

// New convenience method with an explicit type
int put_typed(const std::string& key, std::span<const char> value,
              ObjectDataType data_type,
              const ReplicateConfig& config = ReplicateConfig{});
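Since `ReplicateConfig` also carries a `data_type` field, `put_typed()` could be a thin wrapper over `put()`. The sketch below uses a toy in-memory client, not the real Mooncake client, to show that shape. It assumes the explicit argument overrides `config.data_type`, uses `std::string_view` in place of `std::span<const char>` so it builds as C++17, and adds a hypothetical `type_of()` accessor for inspection.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>

enum class ObjectDataType : std::uint8_t { UNKNOWN = 0, KVCACHE = 1, WEIGHT = 3 };

struct ReplicateConfig {  // trimmed to the fields this example needs
    std::size_t replica_num{1};
    ObjectDataType data_type{ObjectDataType::UNKNOWN};
};

// Toy in-memory stand-in for the store client.
class ToyClient {
public:
    // Existing-style call: the type comes from the config, UNKNOWN by default.
    int put(const std::string& key, std::string_view value,
            const ReplicateConfig& config = ReplicateConfig{}) {
        objects_[key] = Entry{std::string(value), config.data_type};
        return 0;
    }

    // Typed call: copy the config and let the explicit argument win (assumption).
    int put_typed(const std::string& key, std::string_view value,
                  ObjectDataType data_type,
                  const ReplicateConfig& config = ReplicateConfig{}) {
        ReplicateConfig typed = config;
        typed.data_type = data_type;
        return put(key, value, typed);
    }

    // Hypothetical accessor used only to inspect the stored type.
    ObjectDataType type_of(const std::string& key) const {
        auto it = objects_.find(key);
        return it == objects_.end() ? ObjectDataType::UNKNOWN : it->second.data_type;
    }

private:
    struct Entry {
        std::string bytes;
        ObjectDataType data_type;
    };
    std::unordered_map<std::string, Entry> objects_;
};
```

Whichever precedence rule is chosen, it is worth stating in the API documentation, because a caller can otherwise pass two conflicting types in one call.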

Python API

# Existing API (backward compatible)
client.put(key, value)

# New typed API
from mooncake import ObjectDataType, ReplicateConfig
client.put(key, value, data_type=ObjectDataType.KVCACHE)
client.put(key, value, data_type=ObjectDataType.WEIGHT,
           config=ReplicateConfig(replica_num=3))

6. Type-Aware Allocation

Extend AllocationStrategy to consider data types:

class TypeAwareAllocationStrategy : public AllocationStrategy {
public:
    // Allocate WEIGHT to high-performance segments
    // Allocate SAMPLE to cost-effective segments
    SegmentId select_segment(size_t size, ObjectDataType type,
                             const std::vector<std::string>& preferred) override;
};
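The selection logic could look roughly like the following sketch, written as a free function over a hypothetical `Segment` description. The `high_performance` flag, the three-step order (caller preference, then tier match, then any segment with room), and returning a segment name rather than a `SegmentId` are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

enum class ObjectDataType : std::uint8_t { UNKNOWN = 0, KVCACHE = 1, WEIGHT = 3, SAMPLE = 4 };

struct Segment {  // hypothetical description of one storage segment
    std::string name;
    bool high_performance;
    std::size_t free_bytes;
};

inline std::optional<std::string> select_segment(
    const std::vector<Segment>& segments, std::size_t size, ObjectDataType type,
    const std::vector<std::string>& preferred) {
    auto fits = [&](const Segment& s) { return s.free_bytes >= size; };

    // 1. Honor caller preferences first, in the order given.
    for (const std::string& name : preferred)
        for (const Segment& s : segments)
            if (s.name == name && fits(s)) return s.name;

    // 2. Match the tier to the type: WEIGHT wants fast, SAMPLE wants cheap.
    if (type == ObjectDataType::WEIGHT || type == ObjectDataType::SAMPLE) {
        const bool want_fast = (type == ObjectDataType::WEIGHT);
        for (const Segment& s : segments)
            if (s.high_performance == want_fast && fits(s)) return s.name;
    }

    // 3. Fall back to any segment with enough room.
    for (const Segment& s : segments)
        if (fits(s)) return s.name;

    return std::nullopt;  // nothing can hold the object
}
```

Falling back to the other tier rather than failing keeps a put from being rejected when the preferred tier is full, at the cost of weights occasionally landing in slower segments.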

7. Monitoring & Metrics

Add per-type metrics:

  • store.objects.count{type=KVCACHE}
  • store.objects.bytes{type=WEIGHT}
  • store.evictions.count{type=SAMPLE}
  • store.cache_hit_rate{type=TENSOR}
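A minimal sketch of the bookkeeping behind such metrics follows. The class `TypeMetrics` and its method names are hypothetical, the counters are plain integers (a real implementation would need atomics or sharding on hot paths), and the table has 256 slots so that any `uint8_t` value indexes safely.

```cpp
#include <array>
#include <cstdint>

enum class ObjectDataType : std::uint8_t { UNKNOWN = 0, KVCACHE = 1, TENSOR = 2, WEIGHT = 3, SAMPLE = 4 };

struct TypeCounters {
    std::uint64_t objects{0};
    std::uint64_t bytes{0};
    std::uint64_t evictions{0};
    std::uint64_t lookups{0};
    std::uint64_t hits{0};
};

// Hypothetical per-type metrics table. Not thread-safe.
class TypeMetrics {
public:
    void on_put(ObjectDataType t, std::uint64_t size) {
        TypeCounters& c = at(t);
        ++c.objects;
        c.bytes += size;
    }
    void on_evict(ObjectDataType t, std::uint64_t size) {
        TypeCounters& c = at(t);
        --c.objects;
        c.bytes -= size;
        ++c.evictions;
    }
    void on_lookup(ObjectDataType t, bool hit) {
        TypeCounters& c = at(t);
        ++c.lookups;
        if (hit) ++c.hits;
    }
    const TypeCounters& get(ObjectDataType t) const {
        return counters_[static_cast<std::uint8_t>(t)];
    }
    // Fraction of lookups that hit; 0.0 when there have been no lookups.
    double hit_rate(ObjectDataType t) const {
        const TypeCounters& c = get(t);
        return c.lookups ? static_cast<double>(c.hits) / static_cast<double>(c.lookups) : 0.0;
    }

private:
    TypeCounters& at(ObjectDataType t) { return counters_[static_cast<std::uint8_t>(t)]; }
    std::array<TypeCounters, 256> counters_{};
};
```

One design point this exposes: on a miss the store holds no metadata for the key, so a per-type hit rate only works if the caller supplies the expected type on `get()` or misses are attributed to UNKNOWN.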

Implementation Plan

Phase 1: Core Infrastructure (Weeks 1-2)

  • Add ObjectDataType enum to types.h
  • Extend ReplicateConfig with data_type field
  • Update object metadata structures
  • Add serialization/deserialization for new fields
  • Ensure backward compatibility with existing metadata

Phase 2: Storage & Retrieval (Weeks 3-4)

  • Update put() family to accept and store data type
  • Update get() family to return data type (optional)
  • Add put_typed() convenience methods
  • Update master service to track type metadata
  • Add migration path for existing objects (default to UNKNOWN)

Phase 3: Type-Aware Policies (Weeks 5-6)

  • Implement TypeAwareEvictionStrategy
  • Implement TypeAwareAllocationStrategy
  • Add configuration for per-type policies
  • Add type-based replication policies

Phase 4: Python Bindings & Testing (Weeks 7-8)

  • Expose ObjectDataType to Python
  • Update pybind11 bindings
  • Add unit tests for all type-aware features
  • Add integration tests with vLLM/SGLang
  • Performance benchmarks

Phase 5: Monitoring & Documentation (Weeks 9-10)

  • Add per-type metrics collection
  • Update Prometheus exporters
  • Write user documentation
  • Create migration guide
  • Add examples for each data type

Backward Compatibility

  1. Default Behavior: Objects without explicit type use UNKNOWN, treated with default policies
  2. Metadata Migration: Existing objects automatically tagged as UNKNOWN on first access
  3. API Compatibility: All existing put()/get() calls work unchanged
  4. Wire Protocol: Type field optional in RPC, defaults to UNKNOWN if missing
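The optional-field behavior can be illustrated with a toy encoding. The layout here is purely hypothetical (an 8-byte little-endian size followed by an optional type byte) and is not Mooncake's actual serialization. It only shows how a decoder can accept both legacy and typed messages, and it treats reserved type values as UNKNOWN by assumption.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

enum class ObjectDataType : std::uint8_t {
    UNKNOWN = 0, KVCACHE = 1, TENSOR = 2, WEIGHT = 3, SAMPLE = 4,
    ACTIVATION = 5, GRADIENT = 6, OPTIMIZER_STATE = 7, METADATA = 8,
};

struct WireMeta {  // hypothetical wire-level view of object metadata
    std::uint64_t size;
    ObjectDataType data_type;
};

// Legacy writers omit the trailing type byte (include_type = false).
inline std::vector<std::uint8_t> encode(const WireMeta& m, bool include_type) {
    std::vector<std::uint8_t> out;
    for (int i = 0; i < 8; ++i)
        out.push_back(static_cast<std::uint8_t>((m.size >> (8 * i)) & 0xFF));
    if (include_type) out.push_back(static_cast<std::uint8_t>(m.data_type));
    return out;
}

// Accepts both layouts: a missing or unrecognized type byte decodes as UNKNOWN.
inline std::optional<WireMeta> decode(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < 8) return std::nullopt;  // truncated message
    WireMeta m{0, ObjectDataType::UNKNOWN};
    for (int i = 0; i < 8; ++i)
        m.size |= static_cast<std::uint64_t>(buf[i]) << (8 * i);
    if (buf.size() > 8 &&
        buf[8] <= static_cast<std::uint8_t>(ObjectDataType::METADATA))
        m.data_type = static_cast<ObjectDataType>(buf[8]);
    return m;
}
```

Appending the new field at the end is what lets an old reader ignore it; inserting it in the middle of an existing layout would break that property.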

Performance Considerations

  1. Storage Overhead: 1 byte per object for type enum (~0.0001% for 1MB objects)
  2. CPU Overhead: Single enum comparison in hot paths (<1ns)
  3. Memory Overhead: Type-to-policy map cached in memory (~1KB)
  4. Network Overhead: 1 byte added to RPC messages (negligible)
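The storage figure follows from simple arithmetic, sketched here with a hypothetical helper `type_overhead_percent`. It assumes exactly one extra byte per object and ignores padding or alignment in the real metadata layout.

```cpp
#include <cstddef>

// Percentage of extra storage from one type byte on an object of the given size.
constexpr double type_overhead_percent(std::size_t object_bytes) {
    return 100.0 * 1.0 / static_cast<double>(object_bytes);
}

// The <1% design goal holds for any object larger than 100 bytes.
static_assert(type_overhead_percent(101) < 1.0, "objects over 100 bytes stay under 1%");
static_assert(type_overhead_percent(1024 * 1024) < 1.0, "1 MiB objects are far below 1%");
```

Only very small objects approach the 1% budget, so the goal mainly constrains workloads that store many tiny values.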

Alternatives Considered

Alternative 1: Key Prefix Convention

Use key prefixes like kvcache:, weight: to infer types.

Rejected: Fragile, requires parsing, breaks existing keys, no enforcement.

Alternative 2: Separate Stores per Type

Create separate store instances for each data type.

Rejected: Resource fragmentation, complex management, no unified view.

Alternative 3: External Type Registry

Store type mappings in external service (etcd).

Rejected: Extra network hop, consistency challenges, single point of failure.

Open Questions

  1. Type Inference: Should we support automatic type detection from tensor metadata?
  2. Type Conversion: Allow changing object type after creation?
  3. Composite Types: Support objects containing multiple types (e.g., checkpoint = weights + optimizer state)?
  4. Type Hierarchies: Should WEIGHT be a subtype of TENSOR?

Success Metrics

  1. Adoption: >50% of objects tagged with explicit types within 3 months
  2. Performance: 10-20% improvement in cache hit rate for typed workloads
  3. Eviction Quality: 30% reduction in premature weight evictions
  4. Compatibility: Zero breaking changes for existing deployments
