This document records all development operations, workflows, and progress logs for ReplicateAI.
It serves as both a quick-start guide and an internal research diary.
Before running any commands, make sure your Python environment is ready.
uv sync
source .venv/bin/activate
pip install -r requirements.txt # optional, if you have dependencies later
All Makefile commands assume Python is in .venv/bin/python.
Run this once after cloning the repository:
make init
This will create the following directory layout:
ReplicateAI/
├── stage1_foundation/
├── stage2_representation/
├── stage3_deep_renaissance/
├── stage4_statistical/
├── stage5_neural_origins/
├── scripts/
├── paper_template/
└── PAPER_INDEX.json
Each stage corresponds to a historical era of AI research:
| Stage | Era | Description |
|---|---|---|
| stage1_foundation | 🪐 Modern Foundation | LLMs & Multimodal Models (2023–2025) |
| stage2_representation | 🔍 Representation | Transformers, BERT, Embeddings |
| stage3_deep_renaissance | 🧩 Deep Renaissance | CNNs, Autoencoders |
| stage4_statistical | 📊 Statistical | SVMs, Random Forests, EM |
| stage5_neural_origins | 🧬 Neural Origins | Perceptron, Backprop, Hopfield |
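As a rough illustration of the layout above, the following sketch creates the same folders from Python. It is not the actual `make init` implementation: the function name `init_layout` is hypothetical, and starting `PAPER_INDEX.json` as an empty JSON list is an assumption about the index format.

```python
import json
from pathlib import Path

# Directory names taken from the layout shown above.
STAGE_DIRS = [
    "stage1_foundation",
    "stage2_representation",
    "stage3_deep_renaissance",
    "stage4_statistical",
    "stage5_neural_origins",
]
EXTRA_DIRS = ["scripts", "paper_template"]


def init_layout(root: Path) -> None:
    """Create the stage folders and, if missing, an empty paper index."""
    for name in STAGE_DIRS + EXTRA_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    index = root / "PAPER_INDEX.json"
    if not index.exists():  # never overwrite an existing index
        # Assumption: the index is a JSON list of paper records.
        index.write_text(json.dumps([]) + "\n", encoding="utf-8")
```

Calling `init_layout(Path("ReplicateAI"))` twice is harmless, since existing folders and an existing index are left untouched.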
Use the make add command to create a new paper entry.
Example:
make add name="Qwen2.5" year=2025 org="Alibaba" stage=foundation
make add name="BERT" year=2018 org="Google" stage=representation
make add name="Perceptron" year=1958 org="Cornell" stage=neural
Each command will:
- Create a folder like `stage1_foundation/2025_Qwen2.5/`
- Copy template files from `paper_template/`
- Add metadata to `PAPER_INDEX.json`
After running, check that new directories and index entries were created.
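The three steps performed by `make add` can be sketched in Python as follows. This is a minimal sketch, not the project's actual script: `add_paper` is a hypothetical name, the record fields (`name`, `year`, `org`, `stage`, `status`) are assumptions about the index schema, and `PAPER_INDEX.json` is assumed to hold a JSON list. The initial status `"planned"` mirrors the status shown in the example listing in this document.

```python
import json
import shutil
from pathlib import Path

# Stage codes and directories as listed in this document.
STAGE_DIRS = {
    "foundation": "stage1_foundation",
    "representation": "stage2_representation",
    "deep": "stage3_deep_renaissance",
    "statistical": "stage4_statistical",
    "neural": "stage5_neural_origins",
}


def add_paper(root: Path, name: str, year: int, org: str, stage: str) -> Path:
    """Create the paper folder, copy the template, and register the paper."""
    if stage not in STAGE_DIRS:
        raise ValueError(f"unknown stage code: {stage!r}")
    target = root / STAGE_DIRS[stage] / f"{year}_{name}"
    # Copies paper_template/ into the new folder; raises if the folder exists.
    shutil.copytree(root / "paper_template", target)

    index_path = root / "PAPER_INDEX.json"
    papers = json.loads(index_path.read_text(encoding="utf-8"))
    # Field names are assumptions, not the confirmed index schema.
    papers.append(
        {"name": name, "year": year, "org": org, "stage": stage, "status": "planned"}
    )
    index_path.write_text(
        json.dumps(papers, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return target
```

Under these assumptions, `add_paper(Path("."), "BERT", 2018, "Google", "representation")` would create `stage2_representation/2018_BERT/` and append one record to the index.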
Check current reproduction progress by stage:
make status
Example output:
📊 Current Progress by Stage:
🪐 Modern Foundation : 1 papers
🔍 Representation : 1 papers
🧩 Deep Renaissance : 0 papers
📊 Statistical : 0 papers
🧬 Neural Origins : 1 papers
To see all registered papers from PAPER_INDEX.json:
make list
Example output:
2025 | Qwen2.5 | Alibaba | planned
2018 | BERT | Google | planned
1958 | Perceptron | Cornell | planned
To quickly summarize total paper count and stage distribution:
make report
Example output:
📅 Report generated: 2025-10-18
🧩 Total Papers: 3
Foundation -> 1 papers
Representation -> 1 papers
Neural Origins -> 1 papers
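The per-stage counts in the report above amount to a simple tally over the index records. The sketch below shows one way to compute them; it assumes each record carries a `stage` field holding a stage code, which is a guess at the index schema, and `format_report` is a hypothetical helper rather than the project's script. The date line and emoji of the real output are left out.

```python
from collections import Counter

# Labels as they appear in the example report; keys are the stage codes.
LABELS = {
    "foundation": "Foundation",
    "representation": "Representation",
    "deep": "Deep Renaissance",
    "statistical": "Statistical",
    "neural": "Neural Origins",
}


def format_report(papers: list[dict]) -> str:
    """Summarize the total paper count and the non-empty stages."""
    counts = Counter(p["stage"] for p in papers)
    lines = [f"Total Papers: {len(papers)}"]
    for code, label in LABELS.items():
        if counts[code]:  # stages with no papers are skipped
            lines.append(f"{label} -> {counts[code]} papers")
    return "\n".join(lines)


# The three papers registered in the examples of this document.
papers = [
    {"name": "Qwen2.5", "year": 2025, "org": "Alibaba", "stage": "foundation"},
    {"name": "BERT", "year": 2018, "org": "Google", "stage": "representation"},
    {"name": "Perceptron", "year": 1958, "org": "Cornell", "stage": "neural"},
]
print(format_report(papers))
```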
When you want to clean Python caches and temp files:
make clean
This removes:
- `__pycache__/`
- `.pyc` files
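For reference, an equivalent cleanup can be written in Python. This is a sketch of the same idea and not necessarily how the Makefile target is implemented; the function name `clean` and its return value are illustrative.

```python
import shutil
from pathlib import Path


def clean(root: Path) -> int:
    """Delete __pycache__ directories and stray .pyc files under root.

    Returns the number of directories and files removed.
    """
    removed = 0
    # Collect matches first so deletions do not disturb the directory walk.
    for cache_dir in list(root.rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed += 1
    # Any .pyc files left over live outside a __pycache__ directory.
    for pyc in list(root.rglob("*.pyc")):
        if pyc.is_file():
            pyc.unlink()
            removed += 1
    return removed
```

Note that this walks everything under `root`, including `.venv/` if it lives there, so a narrower root may be preferable in practice.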
Each paper module created by make add contains a standard structure:
2025_Qwen2.5/
├── README.md ← Paper summary & key ideas
├── report.md ← Experiment results / analysis
├── notebook/ ← Interactive notebooks
├── src/ ← Implementation code
└── references.bib ← Original citation
- Open the new folder (e.g. `stage1_foundation/2025_Qwen2.5/`)
- Edit `README.md` following the paper_template
- Implement your model under `src/`
- Add experiment results in `report.md`
- Update status to `"in progress"` or `"completed"` in `PAPER_INDEX.json`
git add .
git commit -m "Add Qwen2.5 reproduction module"
git push
| Task | Command |
|---|---|
| Initialize repo | make init |
| Add paper | make add name="..." year=YYYY org="..." stage=<stage> |
| Check status | make status |
| List all papers | make list |
| Generate summary report | make report |
| Clean temp files | make clean |
| Date | Update | Notes |
|---|---|---|
| 2025-10-18 | ✅ Initialized ReplicateAI structure | Added Makefile, scripts, and template |
| 2025-10-18 | ➕ Added BERT | Verified indexing and make status output |
| 2025-10-18 | ✅ Attention Is All You Need | Implemented Scaled Dot-Product Attention, MultiHeadAttention, and PositionwiseFeedforward |
| 2025-10-18 | ✅ Attention Is All You Need | Implemented Encoder, Decoder, and Transformer; added a toy dataset for training and testing |
| 2025-10-21 | ✅ Attention Is All You Need | Implemented training on the Multi30k dataset |
| 2025-10-22 | ✅ Attention Is All You Need | Debugged and fixed the NaN loss issue; refactored the code |
| 2025-10-23 | ✅ Attention Is All You Need | Implemented multiple tokenizers for ablation testing |
You can continue appending to this table as the project evolves.
| Stage Code | Directory | Display Name | Era |
|---|---|---|---|
| foundation | stage1_foundation | 🪐 Modern Foundation | 2023–2025 |
| representation | stage2_representation | 🔍 Representation | 2013–2020 |
| deep | stage3_deep_renaissance | 🧩 Deep Renaissance | 2006–2014 |
| statistical | stage4_statistical | 📊 Statistical | 1990s–2000s |
| neural | stage5_neural_origins | 🧬 Neural Origins | 1950s–1980s |
- Always verify that `PAPER_INDEX.json` stays valid JSON.
- Use `make help` to recall all supported commands.
- Each reproduction module should be independent and documented.
- When in doubt: check `paper_template/README.md`.
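The first tip above can be checked mechanically. The sketch below parses the index and reports problems; it assumes the index is a JSON list of objects with a `status` field, which is not confirmed by this document, and `check_index` is a hypothetical helper. The allowed status values are the three that appear in this guide.

```python
import json
from pathlib import Path

# Status values mentioned in this guide.
KNOWN_STATUSES = {"planned", "in progress", "completed"}


def check_index(path: Path) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    # Assumption: the index is a list of paper records.
    if not isinstance(data, list):
        return ["expected a JSON list of paper records"]
    problems = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
        elif entry.get("status") not in KNOWN_STATUSES:
            problems.append(f"entry {i}: unexpected status {entry.get('status')!r}")
    return problems
```

Running `check_index(Path("PAPER_INDEX.json"))` before committing catches a trailing comma or a mistyped status early.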
🧩 ReplicateAI — Rebuilding AI, one paper at a time.