PoliStance-TR: A Dataset for Turkish Stance Detection in Political Domain

This repository introduces a new dataset for Turkish Stance Detection and provides code for fine-tuning transformer-based models on this dataset. The dataset includes three stance labels: Favor, Against, and Neutral.

Dataset Overview

The dataset was specifically collected for stance detection in the Turkish language. It contains the following labels:

Favor: The text supports the target.
Against: The text opposes the target.
Neutral: The text does not express a clear stance on the target.
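
Classification models usually work with integer class ids rather than label strings. The following is a minimal sketch of a label mapping for the three stance labels; the integer order and the names `LABELS`, `label2id`, `id2label`, and `encode` are illustrative assumptions, not necessarily what the repository's code uses.

```python
# Hypothetical label order; the ids actually used by the repository may differ.
LABELS = ["Favor", "Against", "Neutral"]

# Map label strings to integer ids and back.
label2id = {label: idx for idx, label in enumerate(LABELS)}
id2label = {idx: label for label, idx in label2id.items()}


def encode(labels):
    """Convert a list of label strings into integer class ids."""
    return [label2id[label] for label in labels]


print(encode(["Neutral", "Favor", "Against"]))  # → [2, 0, 1]
```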

Data Splits and Distribution

The dataset is split into three parts as follows:

Train data: 6060 samples
Validation data: 674 samples
Test data: 1189 samples

Each set retains the same percentage of labels as the original dataset. The overall label distribution is:

Favor (Positive): 2898 samples
Against (Negative): 2858 samples
Neutral: 2167 samples
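
As a quick sanity check, the counts above can be turned into label shares and approximate per-split label counts. This is only an arithmetic sketch built from the numbers listed in this section; the per-split counts it prints are estimates under the stated "same percentage" property, not figures read from the data files.

```python
# Counts as listed in this README.
split_sizes = {"train": 6060, "validation": 674, "test": 1189}
label_counts = {"Favor": 2898, "Against": 2858, "Neutral": 2167}

total = sum(label_counts.values())
# The split sizes and the label counts describe the same set of samples.
assert total == sum(split_sizes.values())

# Share of each label in the whole dataset.
label_share = {label: count / total for label, count in label_counts.items()}
for label, share in label_share.items():
    print(f"{label}: {share:.1%} of all samples")

# If every split keeps the overall label shares, each split should hold
# roughly size * share samples per label (rounded estimates only).
expected = {
    split: {label: round(size * share) for label, share in label_share.items()}
    for split, size in split_sizes.items()
}
for split, counts in expected.items():
    print(split, counts)
```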

The data files are located in the data/ folder:

stance_train.csv
stance_val.csv
stance_test.csv
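
To inspect the label distribution of each file, something like the sketch below could be used. It relies only on the standard library. The column name `label` is an assumption (check the CSV header and adjust `label_column` if it differs), and `count_labels` is a hypothetical helper, not part of the repository.

```python
import csv
from collections import Counter
from pathlib import Path

DATA_DIR = Path("data")
SPLIT_FILES = ["stance_train.csv", "stance_val.csv", "stance_test.csv"]


def count_labels(csv_path, label_column="label"):
    """Count how often each label occurs in one CSV file.

    `label_column` is an assumed column name; change it to match the header.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return Counter(row[label_column] for row in reader)


for name in SPLIT_FILES:
    path = DATA_DIR / name
    if path.exists():
        print(name, dict(count_labels(path)))
    else:
        print(f"{name}: not found under {DATA_DIR}/")
```

Run it from the repository root so that the relative data/ path resolves.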

Model Fine-Tuning

We provide main.py as the primary script for fine-tuning pre-trained transformer-based models on this dataset. The models have been trained to classify stance into the three categories (Favor, Against, Neutral) using this unique Turkish stance detection dataset.

Key Files:

main.py: Main script for model fine-tuning on Turkish stance detection.
preprocess.py: Includes necessary scripts for preprocessing.
A Jupyter Notebook containing all the code needed for both training and evaluation can be found in the notebook/ folder.

How to Use

Clone the repository:

git clone https://github.com/ByUnal/polistance-tr.git
cd polistance-tr

Install dependencies:
```
pip install -r requirements.txt
```

Fine-tune the model:

python main.py --learning_rate 4e-5 --epoch 10 --save_dir trained-models

If you want to push the fine-tuned model to Hugging Face, pass the repository name with the --hf_repo_name argument.
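
For readers unfamiliar with how such flags are typically parsed, here is a minimal `argparse` sketch covering the four options mentioned above. The flag names come from this README; the types, defaults, and help texts are assumptions, and the repository's own arg_parser.py may define them differently or include more options.

```python
import argparse


def build_parser():
    """Build a parser for the flags named in this README (types/defaults assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of the fine-tuning CLI flags")
    parser.add_argument("--learning_rate", type=float, default=4e-5,
                        help="Optimizer learning rate")
    parser.add_argument("--epoch", type=int, default=10,
                        help="Number of training epochs")
    parser.add_argument("--save_dir", type=str, default="trained-models",
                        help="Directory where the fine-tuned model is saved")
    parser.add_argument("--hf_repo_name", type=str, default=None,
                        help="Optional Hugging Face repository to push the model to")
    return parser


# Parse the same arguments as the example command, without --hf_repo_name.
args = build_parser().parse_args(
    ["--learning_rate", "4e-5", "--epoch", "10", "--save_dir", "trained-models"]
)
print(args.learning_rate, args.epoch, args.save_dir, args.hf_repo_name)
```

Leaving --hf_repo_name unset keeps it as `None` in this sketch, which a training script could treat as "do not push".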

Pre-trained Models

The fine-tuned transformer-based models are available on my Hugging Face profile.
