
Transformer from scratch

A from-scratch implementation of the original Transformer architecture described in Attention Is All You Need (Vaswani et al., 2017), built in PyTorch without relying on high-level abstractions.

Motivation

The goal of this project is to gain a thorough understanding of the Transformer architecture by studying and replicating each of its building blocks, then assembling them into a working model and training it on real-world data (English-French translation).

Repository Structure

transformer-from-scratch.ipynb contains the block-wise implementation of the Transformer, alongside the personal notes I took to break down and digest the architecture.

transformer contains a cleaned-up, streamlined version of the architecture from the Jupyter notebook, which is more verbose for clarity.

transformer/
├── model/
│   ├── attention.py       # MultiHeadAttention (self, masked, cross)
│   ├── encoder.py         # EncoderLayer, Encoder
│   ├── decoder.py         # DecoderLayer, Decoder
│   ├── feedforward.py     # Position-wise FeedForwardBlock
│   ├── embedding.py       # PositionalEncoding
│   └── transformer.py     # Seq2Seq (composes encoder + decoder)
├── data/
│   └── tokenizer.py       # Word-level tokenizer with regex splitting
├── train.py               # Training loop
├── main.py                # Entry point
└── config.py              # Hyperparameters and device configuration
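At the heart of attention.py is scaled dot-product attention, which every head of MultiHeadAttention computes. A minimal sketch of that core operation (illustrative only; the repository's actual function names and signatures may differ):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq_len, d_k). Optional mask broadcasts over scores."""
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep softmax gradients stable.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 are excluded from attention.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights
```

The same function serves all three attention variants in the model: self-attention (q, k, v from the same sequence), masked self-attention in the decoder (with a causal mask), and cross-attention (q from the decoder, k and v from the encoder).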

Hyperparameters

| Parameter     | Value  |
|---------------|--------|
| d_model       | 512    |
| d_k / d_v     | 64     |
| Heads         | 8      |
| Layers        | 4      |
| d_ff          | 2048   |
| Optimizer     | Adam   |
| Learning rate | 0.0001 |
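These values live in config.py. A sketch of what such a file might contain (constant names are illustrative, not necessarily the ones used in the repository):

```python
# Hypothetical config.py layout; names are illustrative.
D_MODEL = 512          # embedding / residual-stream width
N_HEADS = 8
D_K = D_MODEL // N_HEADS   # per-head dimension: 512 / 8 = 64 (d_k = d_v)
N_LAYERS = 4           # encoder and decoder depth
D_FF = 2048            # inner width of the position-wise feed-forward block
LEARNING_RATE = 1e-4   # Adam
```

Note that d_k = d_model / heads, so the concatenated head outputs recombine to exactly d_model.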

Training

Trained on the opus_books English-French dataset. The tokenizer splits words and punctuation into separate tokens using a regex, building a single shared vocabulary from both source and target languages.
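A word-level tokenizer of this kind can be sketched as follows (the regex pattern and function names are illustrative, not necessarily tokenizer.py's exact code):

```python
import re

def word_tokenize(text):
    # \w+ matches word runs; [^\w\s] matches each punctuation mark as its own token.
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(sentences, specials=("<pad>", "<sos>", "<eos>", "<unk>")):
    # One vocabulary shared by source and target languages.
    vocab = {tok: i for i, tok in enumerate(specials)}
    for s in sentences:
        for tok in word_tokenize(s):
            vocab.setdefault(tok, len(vocab))
    return vocab
```

For example, word_tokenize("Hello, world!") yields ["hello", ",", "world", "!"], so punctuation gets its own vocabulary entries rather than being glued to adjacent words.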

Requirements

torch
numpy
datasets

Install with:

pip install -r requirements.txt

Usage

python main.py
