
Transformer Neural Network Building Blocks

Python 3.6+ · TensorFlow 1.12 · MIT License

A collection of Transformer-based implementations for various NLP tasks, following the original paper Attention Is All You Need.

🎯 Implemented Tasks

| Task | Description | Directory |
|------|-------------|-----------|
| Machine Translation | English-to-Chinese NMT on the WIT3 dataset | en-zh_NMT/ |
| Chinese Word Segmentation | Sequence labeling with B/E/S/M tags | transformer_jieba/ |
| Text Classification | Chinese text classification (THUCTC) | transformer_text_Classfication/ |
| Natural Language Inference | Sentence entailment on Stanford SNLI | transformer_infersent/ |
| Reading Comprehension | Extractive QA with a Pointer Network | transformer_RC/ |

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/fooSynaptic/transfromer_NN_Block.git
cd transfromer_NN_Block

# Install dependencies
pip install -r requirements.txt

Project Structure

transfromer_NN_Block/
├── Models/                          # Shared model definitions
│   └── models.py                    # Vanilla Transformer implementation
├── en-zh_NMT/                       # English-Chinese Neural Machine Translation
├── transformer_jieba/               # Chinese Word Segmentation
├── transformer_text_Classfication/  # Text Classification
├── transformer_infersent/           # Natural Language Inference
├── transformer_RC/                  # Reading Comprehension
├── images/                          # Training visualizations
├── results/                         # Model checkpoints
├── hyperparams.py                   # Hyperparameter configurations
└── requirements.txt                 # Python dependencies

📋 Task Details

1. Chinese Word Segmentation (Encoder-Decoder)

Train a sequence labeling model using the B/E/S/M tagging scheme (a small tagging example follows the list):

  • B (Begin): Start of a word
  • E (End): End of a word
  • S (Single): Single-character word
  • M (Middle): Middle of a word
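
A minimal sketch (not from this repo) of how a word-segmented sentence maps to per-character B/M/E/S labels:

```python
# Illustrative only: convert a word-segmented sentence into
# per-character B/M/E/S labels, one tag per character.
def bmes_tags(words):
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")                  # single-character word
        else:
            tags.append("B")                  # first character of the word
            tags.extend("M" * (len(w) - 2))   # middle characters, if any
            tags.append("E")                  # last character of the word
    return tags

print(bmes_tags(["我", "喜欢", "自然语言"]))
# ['S', 'B', 'E', 'B', 'M', 'M', 'E']
```
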
cd transformer_jieba

# Preprocess data
python prepro.py

# Train the model
python train.py

# Evaluate (BLEU score ~80)
python eval.py

2. English-Chinese Machine Translation

Dataset: WIT3 (Web Inventory of Transcribed and Translated Talks)

cd en-zh_NMT

# Preprocess data
python prepro.py

# Train the model
python train.py

# Evaluate
python eval.py
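
For reference, the BLEU metric behind the result below can be computed with NLTK's corpus_bleu. This is only a sketch of the metric on a toy, pre-tokenized pair, not the repo's eval.py:

```python
from nltk.translate.bleu_score import corpus_bleu

# One list of acceptable reference translations per source sentence,
# and one hypothesis per source sentence, all pre-tokenized.
references = [[["今天", "的", "天气", "非常", "好"]]]
hypotheses = [["今天", "天气", "非常", "好"]]

# Bigram BLEU for this toy pair; full evaluation normally uses 4-gram BLEU.
print(corpus_bleu(references, hypotheses, weights=(0.5, 0.5)))
```
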

Results:

(Figure: NMT BLEU score over training)


3. Chinese Text Classification

Dataset: THU Chinese Text Classification (THUCTC)

Categories:

# Category names (English): Fashion, Education, Politics, Sports, Games,
# Home, Technology, Real Estate, Finance, Entertainment
labels = {
    '时尚': 0, '教育': 1, '时政': 2, '体育': 3, '游戏': 4,
    '家居': 5, '科技': 6, '房产': 7, '财经': 8, '娱乐': 9
}
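
At inference time the integer predictions map back to category names by inverting this dictionary; a trivial sketch, continuing from the labels dict above (variable names are illustrative):

```python
# Illustrative: invert the label map to decode model predictions.
id2label = {v: k for k, v in labels.items()}
predictions = [3, 6, 0]                      # example class ids from the model
print([id2label[i] for i in predictions])    # ['体育', '科技', '时尚']
```
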

cd transformer_text_Classfication

# Preprocess and train
python prepro.py
python train.py

# Evaluate
python eval.py

Results (10-class classification):

| Metric | Score |
|--------|-------|
| Accuracy | 0.85 |
| Macro Avg F1 | 0.85 |
| Weighted Avg F1 | 0.85 |
Detailed Classification Report
              precision    recall  f1-score   support

           0       0.91      0.95      0.93      1000
           1       0.96      0.77      0.85      1000
           2       0.92      0.93      0.92      1000
           3       0.95      0.93      0.94      1000
           4       0.86      0.91      0.88      1000
           5       0.83      0.47      0.60      1000
           6       0.86      0.85      0.86      1000
           7       0.64      0.87      0.74      1000
           8       0.79      0.91      0.85      1000
           9       0.88      0.91      0.89      1000

    accuracy                           0.85     10000
   macro avg       0.86      0.85      0.85     10000
weighted avg       0.86      0.85      0.85     10000

4. Natural Language Inference (SNLI)

Dataset: Stanford SNLI

cd transformer_infersent

# Download and prepare data
wget https://nlp.stanford.edu/projects/snli/snli_1.0.zip
unzip snli_1.0.zip

# Preprocess
python data_prepare.py
python prepro.py

# Train
python train.py

# Evaluate
python eval.py --task infersent
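
For orientation, SNLI is distributed as JSONL with sentence1, sentence2, and gold_label fields (pairs without annotator consensus carry the label "-"). A minimal loading sketch, independent of the repo's data_prepare.py:

```python
import json

# Read premise/hypothesis pairs and gold labels from the SNLI dev split,
# skipping examples without annotator agreement (gold_label == "-").
pairs = []
with open("snli_1.0/snli_1.0_dev.jsonl", encoding="utf-8") as f:
    for line in f:
        ex = json.loads(line)
        if ex["gold_label"] == "-":
            continue
        pairs.append((ex["sentence1"], ex["sentence2"], ex["gold_label"]))

print(len(pairs), pairs[0])
```
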

Training Progress:

(Figures: training accuracy and training loss curves)

Evaluation Results (3-class: entailment, contradiction, neutral):

| Metric | Score |
|--------|-------|
| Accuracy | 0.76 |
| Macro Avg F1 | 0.76 |

5. Reading Comprehension

Architecture: Transformer Encoder + BiDAF Attention + Pointer Network
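
The pointer head produces a start distribution and an end distribution over passage positions, and the predicted answer is the highest-scoring span with end ≥ start. A framework-agnostic NumPy sketch of that span selection (not the repo's code):

```python
import numpy as np

def best_span(start_logits, end_logits, max_len=30):
    """Pick the (start, end) pair maximizing start_logits[s] + end_logits[e]
    subject to s <= e < s + max_len."""
    best, best_score = (0, 0), -np.inf
    for s in range(len(start_logits)):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

start = np.array([0.1, 2.0, 0.3, 0.1])
end   = np.array([0.0, 0.5, 1.8, 0.2])
print(best_span(start, end))  # (1, 2) -> answer spans tokens 1..2
```
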

cd transformer_RC

# Preprocess and train
python prepro.py
python train.py

# Evaluate
python eval.py

Results:

| Metric | Score |
|--------|-------|
| ROUGE-L | 0.2651 |
| BLEU-1 | 0.36 |

(Figure: reading comprehension training loss)


🏗️ Model Architecture

The core Transformer implementation follows the original paper with:

  • Multi-Head Self-Attention: Parallelized attention mechanism
  • Positional Encoding: Sinusoidal or learned embeddings (the sinusoidal variant is sketched after this list)
  • Layer Normalization: Pre-norm or post-norm variants
  • Feed-Forward Networks: Position-wise fully connected layers
  • Residual Connections: Skip connections for gradient flow
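
As a reference for the positional-encoding bullet above, the sinusoidal variant from the paper can be generated in a few lines of NumPy; this is a sketch independent of Models/models.py:

```python
import numpy as np

def sinusoidal_positional_encoding(max_len, d_model):
    """PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
       PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))"""
    positions = np.arange(max_len)[:, None]                      # (max_len, 1)
    div = np.power(10000.0, np.arange(0, d_model, 2) / d_model)  # (d_model/2,)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)   # even dimensions
    pe[:, 1::2] = np.cos(positions / div)   # odd dimensions
    return pe

pe = sinusoidal_positional_encoding(max_len=50, d_model=512)
print(pe.shape)  # (50, 512)
```
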

Key Hyperparameters

| Parameter | Default Value |
|-----------|---------------|
| Hidden Units | 512 |
| Attention Heads | 8 |
| Encoder/Decoder Blocks | 5 |
| Dropout Rate | 0.1 |
| Learning Rate | 0.0001 |
| Batch Size | 32-64 |
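
The project keeps its configuration in hyperparams.py (see the project structure above); the snippet below only sketches that pattern, with attribute names that are illustrative rather than the file's actual names:

```python
# Illustrative hyperparameter container in the style of a hyperparams.py
# module; actual attribute names in this repo may differ.
class Hyperparams:
    hidden_units = 512    # model dimension
    num_heads = 8         # attention heads
    num_blocks = 5        # encoder/decoder blocks
    dropout_rate = 0.1
    lr = 1e-4             # learning rate
    batch_size = 32
    maxlen = 100          # maximum sequence length (illustrative value)
```
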

📚 References

  • Vaswani, A., et al. (2017). Attention Is All You Need. https://arxiv.org/abs/1706.03762

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


⭐ If you find this project helpful, please consider giving it a star!
