Introduces a new dataset for Turkish Stance Detection and provides code for fine-tuning transformer-based models on this dataset.

ByUnal/polistance-tr

PoliStance-TR: A Dataset for Turkish Stance Detection in Political Domain

Developed by M.Cihat Unal

This repository introduces a new dataset for Turkish Stance Detection and provides code for fine-tuning transformer-based models on this dataset. The dataset includes three stance labels: Favor, Against, and Neutral.

Dataset Overview

The dataset was specifically collected for stance detection in the Turkish language. It contains the following labels:

  • Favor: The text supports the target.
  • Against: The text opposes the target.
  • Neutral: The text does not express a clear stance on the target.

Data Splits and Distribution

The dataset is split into three parts as follows:

  • Train data: 6060 samples
  • Validation data: 674 samples
  • Test data: 1189 samples

Each split preserves the label proportions of the full dataset (7,923 samples in total). The overall label distribution is:

  • Favor (Positive): 2898 samples
  • Against (Negative): 2858 samples
  • Neutral: 2167 samples
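
The counts above can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers listed in this section; the helper name `shares` is illustrative and not part of the repository. It confirms that the split sizes and the label counts add up to the same total and shows the resulting proportions (roughly 76.5% / 8.5% / 15.0% for train / validation / test).

```python
# Counts copied from the lists above.
split_counts = {"train": 6060, "validation": 674, "test": 1189}
label_counts = {"Favor": 2898, "Against": 2858, "Neutral": 2167}

total = sum(split_counts.values())
# Both breakdowns describe the same samples, so the totals must agree.
assert total == sum(label_counts.values())


def shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    whole = sum(counts.values())
    return {name: round(100 * n / whole, 1) for name, n in counts.items()}


split_shares = shares(split_counts)
label_shares = shares(label_counts)
print(total, split_shares, label_shares)
```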

The data files are located in the data/ folder:

  • stance_train.csv
  • stance_val.csv
  • stance_test.csv
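
To inspect the label balance of a single split, something like the following sketch could be used. It relies only on the Python standard library. The function name `label_distribution` and the column name `label` are assumptions made for illustration; check the header row of the CSV files for the actual column names.

```python
import csv
from collections import Counter


def label_distribution(csv_path, label_column="label"):
    """Count how often each label value occurs in one CSV split.

    `label_column` is an assumed column name; adjust it to the real header.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return Counter(row[label_column] for row in reader)
```

For example, `label_distribution("data/stance_train.csv")` would return a `Counter` mapping each label value to its number of rows, provided the file has a column with the assumed name.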

Model Fine-Tuning

main.py is the primary script for fine-tuning pre-trained transformer-based models on this dataset. The models are trained to classify stance into the three categories (Favor, Against, Neutral).

Key Files:

  • main.py: Main script for model fine-tuning on Turkish stance detection.
  • preprocess.py: Preprocessing utilities.
  • notebook/: Jupyter Notebook containing all the code needed for both training and evaluation.

How to Use

  1. Clone the repository:
    git clone https://github.com/ByUnal/polistance-tr.git
    cd polistance-tr
    
  2. Install dependencies:
    pip install -r requirements.txt
    
  3. Fine-tune the model:
    python main.py --learning_rate 4e-5 --epoch 10 --save_dir trained-models
    

To push the model to HuggingFace after fine-tuning, pass the repository name with the --hf_repo_name argument.
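
As a rough illustration of how the options in the command above fit together, here is a minimal `argparse` sketch. It is not the actual contents of main.py: the function name `build_parser`, the defaults, and the types are assumptions, and `--hf_repo_name` is treated as an optional command-line flag based on its `--` prefix.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the fine-tuning command (illustrative)."""
    parser = argparse.ArgumentParser(
        description="Fine-tune a stance detection model (illustrative sketch)."
    )
    # Defaults mirror the example command; the real script may use other values.
    parser.add_argument("--learning_rate", type=float, default=4e-5)
    parser.add_argument("--epoch", type=int, default=10)
    parser.add_argument("--save_dir", type=str, default="trained-models")
    # Optional: the model is pushed to HuggingFace only when a name is given.
    parser.add_argument("--hf_repo_name", type=str, default=None)
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["--learning_rate", "4e-5", "--epoch", "10", "--save_dir", "trained-models"]
)
should_push = args.hf_repo_name is not None
```

Leaving `--hf_repo_name` unset keeps the run local in this sketch, so the upload becomes an explicit opt-in.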

Pre-trained Models

The fine-tuned transformer-based models are available on my HuggingFace profile.
