
Efficient Real Time Toxicity Rephrasing

A reproduction of the ParaGeDi and CondBERT detoxification frameworks via knowledge distillation.

Overview

This project provides a lightweight, robust system that identifies toxic language and paraphrases it into a non-toxic version in real time. By combining efficient sequence generation with a distilled classifier, the system maintains high accuracy while significantly reducing computational overhead, making toxicity mitigation more feasible for large-scale, latency-sensitive applications.
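The classify-then-rephrase flow described above can be sketched as a two-stage pipeline. Note that `classifier`, `rephraser`, and `threshold` here are illustrative placeholders, not the repository's actual API:

```python
def detoxify(text, classifier, rephraser, threshold=0.5):
    """Route text through a toxicity classifier and rephrase only if flagged.

    `classifier` and `rephraser` stand in for the distilled models; the
    names and the 0.5 threshold are assumptions for illustration.
    """
    score = classifier(text)   # toxicity probability in [0, 1]
    if score < threshold:
        return text            # already non-toxic: pass through unchanged
    return rephraser(text)     # generate a non-toxic paraphrase
```

Gating generation behind the cheap classifier is what makes real-time use feasible: most inputs skip the expensive rephrasing step entirely.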

Installation

Requirements

  • Conda (Miniconda or Anaconda)

Setup

All dependencies and environment configuration are handled automatically.

bash environment/setup.bash

This script will:

  • Create the Conda environment
  • Install all required Python packages
  • Set up any environment variables or paths needed for training and evaluation

Once complete, activate the environment:

conda activate nlp_env_test

Training

ParaGeDi

This project trains the distilled model on the ParaDetox detoxification dataset — a large-scale English dataset of toxic sentences and their non-toxic paraphrases.

Training the distilled model can be done with this script:

python emnlp2021/style_transfer/paraGeDi/paragedi_kd_train.py

Add flags to control hyperparameters such as --num_epochs, --train_batch_size, --learning_rate, and --kl_alpha.

Models, checkpoints, and validation loss plots will be saved to the emnlp2021/style_transfer/paraGeDi/paragedi_kd_output folder.
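The `--kl_alpha` flag suggests a standard distillation objective that blends a soft KL term against the teacher with hard-label cross-entropy. The sketch below is an illustrative reconstruction under that assumption, not the repository's exact implementation:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      kl_alpha=0.5, temperature=2.0):
    """Blend soft teacher targets with hard gold labels.

    `kl_alpha` weights the KL term (matching the --kl_alpha flag);
    the temperature value and function name are assumptions.
    """
    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 to keep gradient magnitudes comparable
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy against the gold labels
    ce = F.cross_entropy(student_logits, labels)
    return kl_alpha * kl + (1.0 - kl_alpha) * ce
```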

CondBERT

Training can be done locally or on Google Colab.

Local Training:

python emnlp2021/style_transfer/condBERT/knowledge_distillation/train_kd.py \
    --data_dir emnlp2021/data/train \
    --train_file train_toxic \
    --vocab_path emnlp2021/style_transfer/condBERT/vocab \
    --batch_size 16 \
    --num_epochs 16 \
    --learning_rate 6e-5 \
    --output_dir emnlp2021/style_transfer/condBERT/knowledge_distillation/condbert_student \
    --teacher_logits_path teacher_logits.pt

Google Colab: Use emnlp2021/style_transfer/condBERT/knowledge_distillation/train_kd_colab.ipynb for training with automatic GPU detection and memory optimizations.

Extract Teacher Logits (Optional): For large datasets, extract and save teacher logits first:

python emnlp2021/style_transfer/condBERT/knowledge_distillation/extract_logits_compressed.py \
    --teacher_model bert-base-uncased \
    --texts_file emnlp2021/data/train/train_toxic \
    --output_path teacher_logits.pt \
    --top_k 2000
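The `--top_k` flag indicates that only the top-k teacher logits per position are cached, since storing the full vocabulary distribution for every token is memory-heavy. A minimal sketch of that compression scheme follows; the function names and fill value are assumptions, not the script's actual code:

```python
import torch

def compress_logits(logits, top_k=2000):
    """Keep only the top-k logits per position to shrink the teacher cache."""
    values, indices = torch.topk(logits, k=top_k, dim=-1)
    return {"values": values, "indices": indices}

def decompress_logits(compressed, vocab_size, fill_value=-1e9):
    """Rebuild a dense tensor, filling pruned entries with a large negative
    logit so they vanish under softmax."""
    values, indices = compressed["values"], compressed["indices"]
    dense = torch.full((*indices.shape[:-1], vocab_size), fill_value)
    return dense.scatter_(-1, indices, values)
```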

Models and training metrics are saved to the output directory.

Evaluation

We use the scripts in the metric folder to evaluate our models in the same manner as the original paper.

ParaGeDi

Run the metric.py script as follows:

python emnlp2021/metric/metric.py --inputs emnlp2021/data/test/test_10k_toxic --preds emnlp2021/data/test/test_10k_paragedi_kd_results.txt

Additionally, throughput can be measured with this script:

python emnlp2021/style_transfer/paraGeDi/paragedi_kd_infer.py

CondBERT

Evaluate the trained student model:

python emnlp2021/style_transfer/condBERT/knowledge_distillation/evaluate_student_model.py \
    --model_path emnlp2021/style_transfer/condBERT/knowledge_distillation/condbert_student \
    --vocab_path emnlp2021/style_transfer/condBERT/vocab \
    --test_file emnlp2021/data/test/test_10k_toxic \
    --output_dir emnlp2021/style_transfer/condBERT/knowledge_distillation/results

Measure generation latency:

python emnlp2021/style_transfer/condBERT/knowledge_distillation/measure_latency.py \
    --model_path emnlp2021/style_transfer/condBERT/knowledge_distillation/condbert_student \
    --test_file emnlp2021/data/test/test_10k_toxic
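Conceptually, a latency harness like this times repeated generation calls after a warm-up phase. The helper below is an illustrative sketch, not the script's actual code; `generate_fn` stands in for the student model's generation call:

```python
import time
import statistics

def measure_latency(generate_fn, inputs, warmup=3, runs=20):
    """Time a generation callable over sample inputs and report summary stats."""
    for text in inputs[:warmup]:   # warm-up iterations (caches, lazy init)
        generate_fn(text)
    timings = []
    for _ in range(runs):
        for text in inputs:
            start = time.perf_counter()
            generate_fn(text)
            timings.append(time.perf_counter() - start)
    return {
        "mean_ms": 1000 * statistics.mean(timings),
        # last of 19 cut points with n=20 approximates the 95th percentile
        "p95_ms": 1000 * statistics.quantiles(timings, n=20)[-1],
    }
```

Reporting a tail percentile alongside the mean matters for latency-sensitive deployments, where worst-case generations dominate user experience.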

Plot training loss:

python emnlp2021/style_transfer/condBERT/knowledge_distillation/plot_training_loss.py \
    --metrics_file emnlp2021/style_transfer/condBERT/knowledge_distillation/condbert_student/training_metrics.json

Sample Data

Sample toxic text inputs and detoxified outputs (from the baseline and from knowledge distillation) can be found in emnlp2021/data/test.

The inference scripts accept flags to control generation parameters such as --temperature, --top_p, and --batch_size.
