Contact Info.


mojtabasadjadi@gmail.com
https://www.linkedin.com/in/smsajjadi/
https://github.com/mojtabasajjadi/FarSSiM

Introduction


FarSSiM is the first semantic textual similarity (STS) dataset for informal Persian. It consists of 1123 pairs of short informal Farsi texts. Each pair is annotated for semantic relatedness (similarity in meaning) and for the entailment relation between its two texts. The dataset was collected by identifying paraphrases among Persian tweets.

File Structure: xlsx file

Fields


tweet 1: the first text
tweet 2: the second text
1st: the first annotator's score
2st: the second annotator's score
3st: the third annotator's score
4st: the fourth annotator's score
average: the mean of the four annotators' scores
standard deviation: the standard deviation of the four annotators' scores
variance: the variance of the four annotators' scores
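
As a rough illustration of how the three aggregate fields relate to the four annotator scores, the sketch below recomputes them for a single row using only the Python standard library. It makes several assumptions: the column names are copied as written in the field list above, a row is represented as a plain dict (reading the xlsx file itself would need a third-party spreadsheet reader), and the population formulas (`pstdev`, `pvariance`) are used, although the dataset may instead use the sample formulas. The helper `summarize_scores` and the example row are hypothetical, and the scores in it are made up, not taken from FarSSiM.

```python
from statistics import mean, pstdev, pvariance

# Column names assumed from the field list above (kept exactly as listed).
SCORE_FIELDS = ("1st", "2st", "3st", "4st")


def summarize_scores(row):
    """Recompute the aggregate fields from the four annotator scores of one row.

    `row` is a dict keyed by field name. Population statistics are an
    assumption; swap in statistics.stdev / statistics.variance if the
    dataset turns out to use the sample formulas.
    """
    scores = [float(row[name]) for name in SCORE_FIELDS]
    return {
        "average": mean(scores),
        "standard deviation": pstdev(scores),
        "variance": pvariance(scores),
    }


# Toy row with made-up values, for illustration only.
example_row = {
    "tweet 1": "...",
    "tweet 2": "...",
    "1st": 4,
    "2st": 5,
    "3st": 3,
    "4st": 4,
}
summary = summarize_scores(example_row)
```

Comparing the recomputed values with the stored `average`, `standard deviation`, and `variance` columns is one way to check which variance convention the file actually uses.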

Statistics


Total pairs: 1123