A collection of Transformer-based implementations for various NLP tasks, following the original paper *Attention Is All You Need* (Vaswani et al., 2017).

## Tasks
| Task | Description | Directory |
|---|---|---|
| Machine Translation | English-to-Chinese NMT using WIT3 dataset | en-zh_NMT/ |
| Chinese Word Segmentation | Sequence labeling with B/E/S/M tags | transformer_jieba/ |
| Text Classification | Chinese text classification (THU-CTC) | transformer_text_Classfication/ |
| Natural Language Inference | Sentence entailment with Stanford SNLI | transformer_infersent/ |
| Reading Comprehension | Extractive QA with Pointer Network | transformer_RC/ |
## Installation

```bash
# Clone the repository
git clone https://github.com/fooSynaptic/transfromer_NN_Block.git
cd transfromer_NN_Block

# Install dependencies
pip install -r requirements.txt
```

## Project Structure

```
transfromer_NN_Block/
├── Models/ # Shared model definitions
│ └── models.py # Vanilla Transformer implementation
├── en-zh_NMT/ # English-Chinese Neural Machine Translation
├── transformer_jieba/ # Chinese Word Segmentation
├── transformer_text_Classfication/ # Text Classification
├── transformer_infersent/ # Natural Language Inference
├── transformer_RC/ # Reading Comprehension
├── images/ # Training visualizations
├── results/ # Model checkpoints
├── hyperparams.py # Hyperparameter configurations
└── requirements.txt # Python dependencies
```
## Chinese Word Segmentation

Train a sequence-labeling model using the B/E/S/M tagging scheme (a worked example follows the list):
- B (Begin): Start of a word
- E (End): End of a word
- S (Single): Single-character word
- M (Middle): Middle of a word
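For instance, the segmented sentence 今天 / 天气 / 很 / 好 ("the weather is very nice today") yields one tag per character. A minimal illustration of the conversion (a sketch, not the repo's prepro.py):

```python
# Convert a word-segmented sentence into per-character B/E/S/M tags
# (illustrative sketch; not the repo's actual preprocessing code).
def words_to_tags(words):
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append('S')                    # single-character word
        else:
            tags.append('B')                    # first character of a word
            tags.extend('M' * (len(word) - 2))  # interior characters
            tags.append('E')                    # last character of a word
    return tags

words = ['今天', '天气', '很', '好']
print(list(''.join(words)))  # ['今', '天', '天', '气', '很', '好']
print(words_to_tags(words))  # ['B', 'E', 'B', 'E', 'S', 'S']
```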
```bash
cd transformer_jieba
# Preprocess data
python prepro.py
# Train the model
python train.py
# Evaluate (BLEU score ~80)
python eval.py
```

## Machine Translation (English→Chinese)

Dataset: WIT3 (Web Inventory of Transcribed and Translated Talks)
```bash
cd en-zh_NMT
# Preprocess data
python prepro.py
# Train the model
python train.py
# Evaluate
python eval.py
```
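Machine translation is typically evaluated with BLEU. For reference, corpus-level BLEU can be computed with NLTK as below; this is an illustration of the metric, not the repo's evaluation code:

```python
# Corpus-level BLEU with NLTK (illustrative; not the repo's eval.py).
from nltk.translate.bleu_score import corpus_bleu

# Each hypothesis is a token list; each entry in `references` is the
# list of acceptable reference translations for that hypothesis.
references = [[['the', 'weather', 'is', 'very', 'nice', 'today']]]
hypotheses = [['the', 'weather', 'is', 'very', 'nice']]

# All n-gram precisions are 1.0 here, so the score is just the
# brevity penalty exp(1 - 6/5) ≈ 0.82.
print(corpus_bleu(references, hypotheses))
```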
## Text Classification

Dataset: THU Chinese Text Classification (THUCTC)
Categories:

```python
labels = {
    '时尚': 0, '教育': 1, '时政': 2, '体育': 3, '游戏': 4,  # fashion, education, politics, sports, gaming
    '家居': 5, '科技': 6, '房产': 7, '财经': 8, '娱乐': 9   # home & living, technology, real estate, finance, entertainment
}
```

```bash
cd transformer_text_Classfication
# Preprocess and train
python prepro.py
python train.py
# Evaluate
python eval.py
```

Results (10-class classification):
| Metric | Score |
|---|---|
| Accuracy | 0.85 |
| Macro Avg F1 | 0.85 |
| Weighted Avg F1 | 0.85 |
Detailed classification report:

```
              precision    recall  f1-score   support

           0       0.91      0.95      0.93      1000
           1       0.96      0.77      0.85      1000
           2       0.92      0.93      0.92      1000
           3       0.95      0.93      0.94      1000
           4       0.86      0.91      0.88      1000
           5       0.83      0.47      0.60      1000
           6       0.86      0.85      0.86      1000
           7       0.64      0.87      0.74      1000
           8       0.79      0.91      0.85      1000
           9       0.88      0.91      0.89      1000

    accuracy                           0.85     10000
   macro avg       0.86      0.85      0.85     10000
weighted avg       0.86      0.85      0.85     10000
```
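The table above follows scikit-learn's `classification_report` format. For reference, such a report is produced like this (illustrative toy data; not necessarily how the repo's eval.py does it):

```python
# Per-class precision/recall/F1 report with scikit-learn
# (illustrative toy data; not the repo's eval.py).
from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 1, 0]  # gold label ids
y_pred = [0, 1, 2, 1, 1, 0]  # model predictions
print(classification_report(y_true, y_pred))
```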
## Natural Language Inference

Dataset: Stanford SNLI
```bash
cd transformer_infersent
# Download and prepare data
wget https://nlp.stanford.edu/projects/snli/snli_1.0.zip
unzip snli_1.0.zip
# Preprocess
python data_prepare.py
python prepro.py
# Train
python train.py
# Evaluate
python eval.py --task infersent
```

Training progress: the training accuracy and training loss curves are available in `images/`.
Evaluation Results (3-class: entailment, contradiction, neutral):
| Metric | Score |
|---|---|
| Accuracy | 0.76 |
| Macro Avg F1 | 0.76 |
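For reference, SNLI is distributed as JSONL with `sentence1`, `sentence2`, and `gold_label` fields. A minimal loader sketch (field names follow the official snli_1.0 release; the loader itself is an assumption, not the repo's data_prepare.py):

```python
# Minimal SNLI loader sketch (field names per the official snli_1.0
# release; this is not the repo's data_prepare.py).
import json

def load_snli(path):
    pairs = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            ex = json.loads(line)
            if ex['gold_label'] == '-':  # no annotator consensus; skip
                continue
            pairs.append((ex['sentence1'], ex['sentence2'], ex['gold_label']))
    return pairs

# e.g. load_snli('snli_1.0/snli_1.0_train.jsonl')
```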
## Reading Comprehension

Architecture: Transformer Encoder + BiDAF Attention + Pointer Network
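The pointer network selects an answer span by scoring every passage position as a candidate start and end of the answer. A minimal sketch of that span-selection step (shapes and weight names here are illustrative assumptions, not the repo's transformer_RC code):

```python
# Pointer-network span selection (illustrative sketch; not the repo's
# transformer_RC code). Two linear heads score each token position as
# the answer start / answer end over the encoder outputs.
import numpy as np

rng = np.random.default_rng(0)
T, d = 50, 512                 # passage length, hidden size
H = rng.normal(size=(T, d))    # encoder output, one row per token
w_start = rng.normal(size=d)   # start-pointer weights (assumed names)
w_end = rng.normal(size=d)     # end-pointer weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

p_start = softmax(H @ w_start)  # distribution over start positions
p_end = softmax(H @ w_end)      # distribution over end positions

# Most probable valid span (start <= end).
scores = np.triu(np.outer(p_start, p_end))  # zero spans with end < start
start, end = np.unravel_index(scores.argmax(), scores.shape)
print(start, end)
```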
```bash
cd transformer_RC
# Preprocess and train
python prepro.py
python train.py
# Evaluate
python eval.py
```

Results:
| Metric | Score |
|---|---|
| ROUGE-L | 0.2651 |
| BLEU-1 | 0.36 |
## Architecture

The core Transformer implementation follows the original paper, with the following components (a positional-encoding sketch follows the list):
- Multi-Head Self-Attention: Parallelized attention mechanism
- Positional Encoding: Sinusoidal or learned embeddings
- Layer Normalization: Pre-norm or post-norm variants
- Feed-Forward Networks: Position-wise fully connected layers
- Residual Connections: Skip connections for gradient flow
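For example, the sinusoidal positional encodings are defined as `PE(pos, 2i) = sin(pos / 10000^(2i/d_model))` and `PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))`. A minimal sketch (not the repo's Models/models.py):

```python
# Sinusoidal positional encoding as defined in "Attention Is All You
# Need" (illustrative sketch; not the repo's Models/models.py).
import numpy as np

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]  # (max_len, 1)
    i = np.arange(d_model)[None, :]    # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions: cosine
    return pe                              # added to token embeddings

pe = positional_encoding(max_len=100, d_model=512)
print(pe.shape)  # (100, 512)
```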
## Default Hyperparameters

| Parameter | Default Value |
|---|---|
| Hidden Units | 512 |
| Attention Heads | 8 |
| Encoder/Decoder Blocks | 5 |
| Dropout Rate | 0.1 |
| Learning Rate | 0.0001 |
| Batch Size | 32-64 |
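These settings live in hyperparams.py; a minimal sketch of such a module (the attribute names here are assumptions, check the file for the actual definitions):

```python
# Sketch of a hyperparameter module matching the table above
# (attribute names are assumptions; see hyperparams.py for the
# actual definitions).
class Hyperparams:
    hidden_units = 512   # model dimension d_model
    num_heads = 8        # attention heads
    num_blocks = 5       # encoder/decoder blocks
    dropout_rate = 0.1
    lr = 1e-4            # learning rate
    batch_size = 32      # 32-64 depending on the task
```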
## References

- Attention Is All You Need (Vaswani et al., 2017)
- Kyubyong's Transformer Implementation
- Stanford SNLI Dataset
- THU Chinese Text Classification
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
⭐ If you find this project helpful, please consider giving it a star!



