Token Scope - Output Length Prediction Model

A PyTorch Lightning-based training framework for output length prediction: given an input text, it trains a model to predict the number of tokens a Large Language Model (LLM) will generate in response.

Features

  • 🚀 Efficient training framework based on PyTorch Lightning
  • 📊 Combined regression and classification loss functions
  • 🔧 Qwen series models as base encoders
  • 📈 Integrated TensorBoard logging
  • ⚡ Multi-GPU training support
  • 🎯 Fully configurable hyperparameters from the command line

Project Structure

token_scope/
├── olp/                     # Main module
│   ├── dataset/             # Data processing module
│   │   └── data_module.py   # PyTorch Lightning data module
│   └── train/               # Training-related modules
│       ├── plmodel.py       # PyTorch Lightning model
│       ├── model.py         # Core model definition
│       ├── loss.py          # Loss functions
│       ├── callbacks.py     # Training callbacks
│       └── lr_scheduler.py  # Learning rate scheduler
├── train.py                 # Main training script
├── .gitignore               # Git ignore file
├── .black                   # Black code formatting config
└── .isort.cfg               # Import sorting config

Installation

pip install pytorch-lightning torch transformers typer tensorboard scikit-learn

Usage

Basic Training

python train.py

Custom Parameter Training

python train.py \
    --model-name "Qwen/Qwen3-0.6B" \
    --batch-size 4 \
    --max-epochs 10 \
    --learning-rate 2e-4 \
    --reg-weight 0.6 \
    --cls-weight 0.4 \
    --gpus 4 \
    --max-length 8192 \
    --file-list data1.json data2.json

Parameters

  • --model-name: Base model name (default: "Qwen/Qwen3-0.6B")
  • --batch-size: Batch size (default: 1)
  • --max-epochs: Maximum training epochs (default: 5)
  • --learning-rate: Learning rate (default: 1e-4)
  • --reg-weight: Regression loss weight (default: 0.5)
  • --cls-weight: Classification loss weight (default: 0.5)
  • --num-workers: Number of data loader workers (default: 4)
  • --gpus: Number of GPUs (default: 8)
  • --max-length: Maximum input length (default: 10240)
  • --file-list: List of data files (default: ["./output.json"])

Model Architecture

The project uses a combined model architecture, sketched after this list:

  • Encoder: a Qwen-series pre-trained model
  • Prediction Heads: parallel regression and classification outputs
  • Loss Function: a weighted combination of the regression and classification losses
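
A minimal sketch of how these pieces could fit together (the actual definition lives in olp/train/model.py; the mean-pooling step and the num_classes bucket count below are assumptions, not details taken from the project):

import torch
import torch.nn as nn
from transformers import AutoModel

class OutputLengthPredictor(nn.Module):
    def __init__(self, model_name="Qwen/Qwen3-0.6B", num_classes=10):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.reg_head = nn.Linear(hidden, 1)            # direct token-count estimate
        self.cls_head = nn.Linear(hidden, num_classes)  # length-bucket logits (assumed)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Mean-pool token states over non-padding positions (assumed pooling strategy).
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-6)
        return self.reg_head(pooled).squeeze(-1), self.cls_head(pooled)

Both heads share the encoder's pooled representation, so the weighted loss described in the next section balances their gradients against each other.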

Training Features

  • Mixed Loss: Combines MSE regression loss and cross-entropy classification loss (see the sketch after this list)
  • Learning Rate Scheduling: Uses cosine annealing learning rate scheduler
  • Model Checkpointing: Automatically saves best models
  • Progress Monitoring: Custom TQDM progress bar and TensorBoard logging
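
A minimal sketch of that weighted combination, mirroring the --reg-weight and --cls-weight options (the project's actual implementation lives in olp/train/loss.py; treating the classification target as a pre-computed length bucket is an assumption):

import torch.nn.functional as F

def combined_loss(reg_pred, cls_logits, target_len, target_bucket,
                  reg_weight=0.5, cls_weight=0.5):
    # MSE on the raw token count, cross-entropy over length buckets,
    # weighted as the --reg-weight / --cls-weight options suggest.
    reg_loss = F.mse_loss(reg_pred, target_len.float())
    cls_loss = F.cross_entropy(cls_logits, target_bucket)
    return reg_weight * reg_loss + cls_weight * cls_loss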

Data Format

Training data should be in JSON format containing input text and corresponding output length labels.
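
The exact schema is consumed by olp/dataset/data_module.py and is not spelled out here; below is a hypothetical illustration in which the field names "input" and "output_length" are assumptions:

import json

# Write two hypothetical training records to the default data file.
records = [
    {"input": "Explain the difference between TCP and UDP.", "output_length": 186},
    {"input": "Write a haiku about autumn.", "output_length": 23},
]
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)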

Development Tools

The project is configured with the following code quality tools:

  • Black: Python code formatting
  • isort: Import statement sorting

Run formatting:

black .
isort .

Training Monitoring

Use TensorBoard to view training logs:

tensorboard --logdir logs

License

[Please add license information as appropriate]

Contributing

Contributions are welcome: please open an issue or submit a pull request to improve the project.


Chinese version: README_CN.md
