🔤 English–Tamil Transformer from Scratch (PyTorch)

This project demonstrates a Transformer-based neural machine translation model built completely from scratch using PyTorch. The goal was to translate English to Tamil using a small dataset and limited compute resources.

📌 Overview

✅ Implemented Transformer architecture from scratch (no high-level libs like torch.nn.Transformer)
🗣️ Trained on 200,000 English–Tamil sentence pairs
🏋️‍♂️ Trained for 20 epochs due to compute limitations
🔄 Translation supported via:
- Greedy decoding
- Beam search decoding
📉 Translation accuracy is still low because of the limited training, but the outputs show that the model is learning
🚀 Performance improves significantly when the model is trained for more epochs
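The difference between the two decoding strategies can be sketched with a toy example. This is a minimal illustration in plain Python, not the code from `inference.py`: the `TOY_LOGPROBS` table and the function names are hypothetical stand-ins for a real decoder, which would score the next token from the encoder output and the full prefix.

```python
import math

# Hypothetical toy "model": log-probabilities of the next token given the last one.
TOY_LOGPROBS = {
    "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
    "a": {"x": math.log(0.5), "y": math.log(0.5)},
    "b": {"z": math.log(0.9), "x": math.log(0.1)},
    "x": {"</s>": 0.0},
    "y": {"</s>": 0.0},
    "z": {"</s>": 0.0},
}


def next_logprobs(prefix):
    """Stand-in for one decoder step; only looks at the last token."""
    return TOY_LOGPROBS[prefix[-1]]


def greedy_decode(max_len=10):
    """Pick the single most likely next token at every step."""
    seq, score = ["<s>"], 0.0
    while seq[-1] != "</s>" and len(seq) < max_len:
        token, logprob = max(next_logprobs(seq).items(), key=lambda kv: kv[1])
        seq.append(token)
        score += logprob
    return seq, score


def beam_search(beam_width=3, max_len=10):
    """Keep the beam_width best partial hypotheses at every step."""
    beams = [(["<s>"], 0.0)]
    for _ in range(max_len - 1):
        candidates = []
        for seq, score in beams:
            if seq[-1] == "</s>":
                candidates.append((seq, score))  # finished hypotheses carry over
                continue
            for token, logprob in next_logprobs(seq).items():
                candidates.append((seq + [token], score + logprob))
        # Sort by total log-probability (no length normalization in this sketch).
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == "</s>" for seq, _ in beams):
            break
    return beams[0]


greedy_seq, greedy_score = greedy_decode()
beam_seq, beam_score = beam_search(beam_width=3)
print("greedy:", greedy_seq, round(math.exp(greedy_score), 2))
print("beam:  ", beam_seq, round(math.exp(beam_score), 2))
```

In this toy table, greedy decoding commits to `a` because it is the most likely first token, while beam search keeps `b` alive and ends with a sequence of higher total probability. With `beam_width=1` the two strategies coincide.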

🧠 Model Architecture

Encoder–Decoder Transformer
Positional Encoding
Masked Multi-Head Attention
Layer Normalization
Cross-Attention between Encoder and Decoder
Beam Search and Greedy Decoding
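Two of the components listed above are easy to illustrate in isolation. The sketch below assumes the sinusoidal positional encoding from the original Transformer paper (the repo may use a different variant) and a standard lower-triangular causal mask for the decoder's masked attention. It is written in plain Python for readability; the function names are hypothetical and are not taken from `transformer.py`.

```python
import math


def sinusoidal_positional_encoding(max_len, d_model):
    """Return a max_len x d_model table: sine on even dims, cosine on odd dims."""
    table = []
    for pos in range(max_len):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one wavelength.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def causal_mask(size):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(size)] for i in range(size)]


def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores, ignoring masked-out positions.

    Assumes at least one position is kept, which always holds for a causal mask.
    """
    peak = max(s for s, keep in zip(scores, mask_row) if keep)
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


pe = sinusoidal_positional_encoding(max_len=4, d_model=8)
mask = causal_mask(3)
# The second target position can only see positions 0 and 1.
weights = masked_softmax([1.0, 2.0, 3.0], mask[1])
print(pe[0])
print(weights)
```

Masked positions receive exactly zero attention weight, so a decoder position never looks at tokens that come after it during training.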

⚙️ Training Details

Item	Description
Dataset	200,000 English–Tamil pairs
Epochs	20
Batch Size	32
Optimizer	Adam with warmup scheduler
Loss	Cross-Entropy with masking
Beam Width	3
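As a rough illustration of the optimizer and loss rows, the sketch below shows one common warmup schedule and a padding-aware cross-entropy. Both are assumptions: the table does not say which warmup schedule `train.py` uses, so the inverse-square-root schedule from the original Transformer paper is used here, and `d_model=512`, `warmup_steps=4000`, and `pad_id=0` are illustrative defaults rather than the repo's settings.

```python
import math


def warmup_lr(step, d_model=512, warmup_steps=4000):
    """Linear warmup followed by inverse-square-root decay (step counts from 1)."""
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


def masked_cross_entropy(logits, targets, pad_id=0):
    """Mean cross-entropy over the target positions that are not padding.

    logits:  one list of unnormalized scores per target position
    targets: one token id per target position
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, targets):
        if target == pad_id:
            continue  # padding positions contribute nothing to the loss
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
        count += 1
    return total / max(count, 1)


# The learning rate rises during warmup, peaks at warmup_steps, then decays.
for step in (1, 1000, 4000, 16000):
    print(step, warmup_lr(step))

# The second position is padding (target id 0), so only the first one counts.
print(masked_cross_entropy([[0.0, 0.0], [5.0, 0.0]], [1, 0]))
```

In PyTorch, the same masking effect is usually obtained by passing `ignore_index=pad_id` to `torch.nn.functional.cross_entropy`, and a schedule like `warmup_lr` can be attached to Adam through `torch.optim.lr_scheduler.LambdaLR`.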

📥 Example Translations

English	Tamil (Greedy)	Tamil (Beam Search)
It's not your fault	உன் முழுவதும் இல்லை	உன் முழுவதும் இல்லை
How are you brother?	உங்கள் காதலிக்கு என்ன செய்து கொண்டார்கள்?	உங்களுக்கு நீங்கள் என்ன?

⚠️ These translations are not accurate yet due to limited training. With more epochs, the model's performance improves.

🛠️ Run Locally

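# Clone the repo
git clone https://github.com/Coolcoder009/Transformers-Scratch.git
cd Transformers-Scratch

# Create a virtual environment
python -m venv venv

# Activate it (Windows; on Linux/macOS use: source venv/bin/activate)
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Train the model
python train.py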
