
🚀 Simplicial Transformer with KV Cache Optimization

Python 3.8+ | PyTorch | License: MIT | Code style: black

Efficient 2-Simplicial Transformer with Low-Rank KV Cache Compression – A research implementation for memory-efficient autoregressive generation.

📖 Overview

This repository implements a 2-Simplicial Transformer with optimized KV cache management, following the architecture from the Fast & Simplex paper. The project focuses on memory-efficient autoregressive generation via low-rank compression of the (K₁, K₂, V₁, V₂) cache.

🔬 Research Contributions

  1. 2-Simplicial Attention: Implements the novel attention mechanism with (K₁, K₂, V₁, V₂) cache structure
  2. Low-Rank Compression: SVD-based compression of KV cache matrices for significant memory reduction
  3. Hybrid Selection: Combines L2-norm selection with low-rank compression for optimal quality-memory tradeoff
  4. Incremental Optimization: vanilla PyTorch → Triton kernels → compression techniques
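The low-rank compression in (2) can be sketched as a truncated SVD of each per-head cache matrix. The function names below are illustrative, not the repository's actual API; the real implementation lives under simplicial/cache/:

```python
import torch

def compress_kv(cache: torch.Tensor, rank: int):
    """Truncated-SVD compression of a (seq_len, head_dim) cache matrix.

    Illustrative sketch only; the repository's compression code may differ.
    Returns two low-rank factors whose product approximates the cache,
    shrinking storage from seq_len*head_dim to rank*(seq_len + head_dim).
    """
    # Thin SVD, then keep only the top-`rank` singular triplets.
    U, S, Vh = torch.linalg.svd(cache, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # (seq_len, rank), singular values folded in
    Vh_r = Vh[:rank, :]            # (rank, head_dim)
    return U_r, Vh_r

def decompress_kv(U_r: torch.Tensor, Vh_r: torch.Tensor) -> torch.Tensor:
    """Reconstruct the (approximate) cache from its low-rank factors."""
    return U_r @ Vh_r
```

The hybrid selection in (3) would keep the highest-L2-norm cache rows exactly and apply this compression only to the remainder, trading a small amount of extra storage for quality on the most salient tokens.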

🏗️ Architecture

Core Components

# 2-Simplicial Attention with KV Cache
Attention(K₁, K₂, V₁, V₂) = σ(Q·K₁) ⊙ σ(Q·K₂) · (V₁ ⊙ V₂)
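One literal, single-head reading of the formula above, with σ taken as a sigmoid (the paper's choice of nonlinearity and value combination may differ; the repository's kernels are the authoritative version):

```python
import torch

def two_simplicial_attention(q, k1, k2, v1, v2):
    """Single-head sketch of the 2-simplicial attention formula.

    q:       (seq, d) queries
    k1, k2:  (seq, d) the two key caches
    v1, v2:  (seq, d) the two value caches

    Illustrative only; not the repository's optimized implementation.
    """
    # σ(Q·K₁) ⊙ σ(Q·K₂): element-wise product of two score maps
    scores = torch.sigmoid(q @ k1.T) * torch.sigmoid(q @ k2.T)  # (seq, seq)
    # Apply the combined weights to the element-wise product of values.
    return scores @ (v1 * v2)                                   # (seq, d)
```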

Project Structure

simplicial-transformer/
├── simplicial/
│   ├── attention/              # 2-simplicial attention mechanisms
│   ├── cache/                  # KV cache with compression (K₁, K₂, V₁, V₂)
│   ├── layers/                 # Feedforward and simplicial blocks
│   ├── models/                 # Transformer implementations
│   ├── utils/                  # Utility functions (RoPE, sliding window)
│   └── validation/             # Correctness validation tools
├── training/                   # Training scripts and configs
├── scripts/                    # Inference and data preparation scripts
├── tests/                      # Comprehensive test suite
└── debug_tools/                # Debug and validation scripts

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/and-per-i/too-simplex.git
cd too-simplex

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .

Training

# Train with default config
python training/train.py --config training/configs/logic_finetuning_4090.yaml

# Or use the launcher script
./start.sh train

# With custom config and paths
./start.sh train --config training/configs/logic_finetuning_4090.yaml --log-dir logs --checkpoint-dir checkpoints

Inference

# Generate text
python scripts/generate_text.py

# Or use the CLI entry point
simplicial-generate

📝 License

MIT License
