A minimal GPT (transformer language model) in pure Ruby, ported from Andrej Karpathy's microGPT Python implementation (~243 lines).
This is the full algorithmic content of a GPT — autograd engine, transformer architecture, Adam optimizer, training loop, and autoregressive inference — implemented from scratch with zero external dependencies (only Ruby stdlib).
Educational / demonstrative purposes only — extremely inefficient by design.
It reads a text file of names, learns the character patterns in those names (which letters tend to follow which), and then generates new, made-up names one character at a time based on what it learned. Same architecture as ChatGPT, just with ~4,000 parameters instead of hundreds of billions.
| Component | Description |
|---|---|
| `Value` | Scalar-valued autograd engine with reverse-mode differentiation |
| `NN` | Neural network primitives: linear, softmax, rmsnorm |
| `Tokenizer` | Character-level tokenizer (unique chars → token ids) |
| `Model` | GPT-2-style transformer (RMSNorm, multi-head attention, ReLU MLP) |
| `AdamOptimizer` | Adam with bias correction and linear LR decay |
| `Trainer` | Training loop with cross-entropy loss |
| `Sampler` | Temperature-controlled autoregressive text generation |
| `Config` | All hyperparameters in one immutable struct |
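To give a feel for how small these pieces are, a character-level tokenizer boils down to two lookup tables. Here is a minimal sketch of the idea; the class and method names are illustrative, not the repo's exact API:

```ruby
# Minimal character-level tokenizer sketch (illustrative, not the repo's code).
class CharTokenizer
  attr_reader :vocab_size

  def initialize(text)
    chars = text.chars.uniq.sort
    @stoi = chars.each_with_index.to_h  # char -> token id
    @itos = @stoi.invert                # token id -> char
    @vocab_size = chars.size
  end

  def encode(s)   = s.chars.map { |c| @stoi.fetch(c) }
  def decode(ids) = ids.map { |i| @itos.fetch(i) }.join
end

tok = CharTokenizer.new("emma\nolivia\nava\n")
ids = tok.encode("ava")   # => [1, 7, 1] with this toy corpus
puts tok.decode(ids)      # round-trips back to "ava"
```

Sorting the unique characters makes token ids deterministic for a given corpus, which is what makes encode/decode roundtrip tests trivial to write.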
The default configuration produces a model with 4,192 parameters — compare that to GPT-4's hundreds of billions. Same architecture, just much smaller.
- Ruby 3.2+
- Bundler (for running tests)
Install test dependencies:

```
bundle install
```

Train the model and generate samples:

```
ruby bin/train
```

This will:
- Load the names dataset from `input.txt` (32k names)
- Train for 1,000 steps (a few minutes on a modern machine)
- Generate 20 hallucinated names
Use `--steps 50` when running the model for testing to keep it fast. Example:

```
ruby bin/train train --steps 50
```
You can also pass a custom dataset file:
```
ruby bin/train train path/to/your/data.txt
```

The dataset should be a text file with one document (e.g. a name, word, or short sentence) per line.
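For instance, the first few lines of such a file might look like this (names here are illustrative):

```
emma
olivia
ava
noah
liam
```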
Run the tests with:

```
bundle exec rspec
```

The suite's 97 examples cover every class: autograd correctness, gradient propagation, softmax numerical stability, encode/decode roundtrips, optimizer updates, training loss decrease, and sampling determinism.
```
├── bin/train              # Runner script
├── input.txt              # Names dataset (one name per line)
├── lib/
│   ├── micro_gpt.rb       # Top-level module
│   └── micro_gpt/
│       ├── value.rb       # Autograd engine
│       ├── nn.rb          # linear, softmax, rmsnorm
│       ├── random.rb      # Gaussian RNG, weighted sampling
│       ├── config.rb      # Hyperparameters
│       ├── tokenizer.rb   # Character-level tokenizer
│       ├── dataset.rb     # Local file dataset loader
│       ├── model.rb       # GPT model + KV cache
│       ├── optimizer.rb   # Adam optimizer
│       ├── trainer.rb     # Training loop
│       └── sampler.rb     # Text generation
└── spec/                  # RSpec tests for everything
```
The model learns character-level patterns from the dataset. During training, each name is wrapped in BOS (Beginning of Sequence) tokens, fed through the transformer one character at a time, and the model learns to predict the next character. At inference time, it generates new text by sampling from the predicted distribution.
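Concretely, BOS-wrapping and next-character prediction amount to shifting the token sequence by one. A sketch of the training pairs for a single name (the BOS value and token ids here are made up for illustration, not the repo's actual constants):

```ruby
# Illustrative next-character training pairs (BOS value and token ids
# are assumptions for this sketch).
BOS = 0
name_ids = [2, 7, 1]            # say, the token ids for "eva"
seq = [BOS] + name_ids + [BOS]  # BOS marks both the start and end of the name

# Shift by one: at each position the model sees the prefix so far
# and is trained to predict the next token.
pairs = seq[0...-1].zip(seq[1..])
pairs.each { |x, y| puts "context ends in #{x} -> target #{y}" }
# pairs == [[0, 2], [2, 7], [7, 1], [1, 0]]
```

The trailing BOS target is what teaches the model when to stop: at inference time, sampling a BOS ends the generated name.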
The entire forward and backward pass operates on Value objects — scalar floats that track their computation graph. Calling loss.backward walks the graph in reverse topological order, applying the chain rule to compute gradients for every parameter. This is the same algorithm (backpropagation) used by PyTorch and TensorFlow, just on individual scalars instead of tensors.
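The idea can be sketched in a few dozen lines. The following is a deliberately tiny stand-in for illustration, not the repo's actual `Value` class: each node records its parents and a closure that pushes its gradient back to them, and `backward` replays those closures in reverse topological order.

```ruby
# Tiny scalar autograd sketch (illustrative stand-in, not the repo's Value).
class Val
  attr_accessor :data, :grad, :backward_fn
  attr_reader :parents

  def initialize(data, parents = [])
    @data = data
    @grad = 0.0
    @parents = parents
    @backward_fn = proc {}  # leaf nodes have nothing to propagate
  end

  def +(other)
    out = Val.new(@data + other.data, [self, other])
    out.backward_fn = proc do
      self.grad  += out.grad   # d(a+b)/da = 1
      other.grad += out.grad   # d(a+b)/db = 1
    end
    out
  end

  def *(other)
    out = Val.new(@data * other.data, [self, other])
    out.backward_fn = proc do
      self.grad  += other.data * out.grad  # d(a*b)/da = b
      other.grad += @data * out.grad       # d(a*b)/db = a
    end
    out
  end

  def backward
    # Build reverse topological order, then apply the chain rule node by node.
    topo, seen = [], {}
    visit = lambda do |v|
      next if seen[v]
      seen[v] = true
      v.parents.each { |p| visit.call(p) }
      topo << v
    end
    visit.call(self)
    @grad = 1.0
    topo.reverse_each { |v| v.backward_fn.call }
  end
end

a = Val.new(2.0)
b = Val.new(3.0)
c = a * b + a   # c = ab + a, so dc/da = b + 1 = 4, dc/db = a = 2
c.backward
puts a.grad     # 4.0
puts b.grad     # 2.0
```

Gradients accumulate with `+=` rather than `=` because a node (like `a` above, used in both the product and the sum) can feed into multiple downstream operations.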
Original Python implementation by Andrej Karpathy — part of a six-year compression arc from micrograd (2020) to microGPT (2026), stripping away every layer of abstraction to reveal the core algorithm.