An optimized implementation of TinyRecursiveModels using the MLX framework for Apple Silicon.
This project implements a recursive transformer architecture that iteratively refines a latent reasoning state over multiple improvement cycles, achieving strong performance with a significantly smaller parameter count.
- MLX Optimized: Designed specifically for Apple Silicon's unified memory architecture.
- Recursive Reasoning: Implements the core $n$ latent recursions and $T$ improvement cycles from the TRM paper.
- Deep Supervision: Training with supervision at each improvement step for better stability.
- EMA (Exponential Moving Average): Integrated weight averaging for better generalization.
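As a rough sketch, the recursion scheme above can be illustrated in plain Python. The function names (`f_latent`, `f_answer`) and the toy update rules are illustrative assumptions, not the repo's actual API:

```python
def trm_forward(x, f_latent, f_answer, n=6, T=3, z0=0.0, y0=0.0):
    """Illustrative TRM-style forward pass: T improvement cycles,
    each running n latent recursions before updating the answer y."""
    z, y = z0, y0
    for _ in range(T):            # T improvement cycles
        for _ in range(n):        # n latent recursions refine the latent state z
            z = f_latent(x, y, z)
        y = f_answer(y, z)        # the answer is improved once per cycle
    return y

# Toy example: the latent step halves the gap between z and x, and the
# answer step copies z, so y converges toward x as cycles accumulate.
result = trm_forward(
    x=1.0,
    f_latent=lambda x, y, z: 0.5 * (x + z),
    f_answer=lambda y, z: z,
    n=4, T=2,
)
```

The key design point is that only the small latent state `z` is updated inside the inner loop, which is what lets a tiny network reuse its weights across many reasoning steps.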
- v1.0: https://huggingface.co/Kamisori-daijin/textrm-28M-bizmail
- v1.5: https://huggingface.co/Kamisori-daijin/textrm1.5-25M-bizmail
- Set up the environment:
  ```bash
  python -m venv .venv
  source .venv/bin/activate
  ```
- Install requirements:
  ```bash
  pip install -r requirements.txt
  ```
- Configure the model: adjust hyperparameters in `models/config.py`.
- Train the model. The training script uses MLX's efficient gradient computation and automatic hardware acceleration.
  ```bash
  python train.py
  ```
  The best weights are saved as `best_model_mlx.safetensors`.
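The deep supervision mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the repo's training loop; the function names and toy update rules are assumptions:

```python
def deep_supervised_loss(x, target, f_latent, f_answer, loss_fn, n=4, T=3):
    """Illustrative deep supervision: the loss is accumulated from the
    answer produced after *every* improvement cycle, not only the last."""
    z, y, total = 0.0, 0.0, 0.0
    for _ in range(T):
        for _ in range(n):
            z = f_latent(x, y, z)
        y = f_answer(y, z)
        total += loss_fn(y, target)   # supervise each intermediate answer
    return total / T                  # average per-cycle loss

# Toy run with illustrative update rules and a squared-error loss.
avg_loss = deep_supervised_loss(
    x=1.0, target=1.0,
    f_latent=lambda x, y, z: 0.5 * (x + z),
    f_answer=lambda y, z: z,
    loss_fn=lambda y, t: (y - t) ** 2,
    n=4, T=2,
)
```

Supervising every cycle gives each intermediate answer a gradient signal, which is what stabilizes training compared with backpropagating only through the final answer.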
- Run inference. Generate text with the trained model:
  ```bash
  python inference.py --prompt "Write a polite refusal email"
  ```
- `train.py`: Main entry point for training.
- `inference.py`: Interactive text generation.
- `models/`: MLX model definitions (`trm_model.py`, `trm_build.py`).
- `training/`: MLX-specific training loop and logic.
- `ema/`: Exponential Moving Average implementation for MLX.
- `dataset/`: Dataset loading and tokenization logic.
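The weight averaging provided by the `ema/` module follows the standard EMA update. The sketch below uses plain Python floats for clarity; the repo's implementation presumably operates on MLX parameter trees instead, so treat the class and its interface as illustrative assumptions:

```python
class EMA:
    """Minimal exponential moving average over a dict of scalar parameters
    (illustrative only; not the repo's MLX-based implementation)."""

    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = dict(params)   # shadow weights start at current values

    def update(self, params):
        d = self.decay
        for k, v in params.items():
            # shadow <- decay * shadow + (1 - decay) * current
            self.shadow[k] = d * self.shadow[k] + (1 - d) * v

ema = EMA({"w": 0.0}, decay=0.9)
ema.update({"w": 1.0})   # shadow moves 10% of the way toward the new value
```

After training, evaluating with the shadow weights instead of the raw weights smooths out noise from the final optimizer steps, which is the generalization benefit the feature list refers to.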