Collate Datasets for Marathi and Aggregate them #4

Collate the available Marathi hate-speech datasets and create a single aggregated dataset.

Original datasets being used:

- https://github.com/TharinduDR/DeepOffense/tree/master/examples/marathi/data
- https://github.com/l3cube-pune/MarathiNLP/tree/main/L3Cube-MahaHate
- https://hasocfire.github.io/hasoc/2023/dataset.html

### Target Dataset format 
| Column | Description | Format |
|--|--|--|
| UID | Unique identifier that traces each row back to its source dataset and serves as the index of the aggregated dataset. | `<dataset>_<language_code>_<train/test/val>_<index_number>` |
| text | The text content used as input to the classifier. | UTF-8 encoded text |
| label_yn | Binary label indicating whether the text is classified as hate or non-hate in the source dataset. | 1 - hate <br> 0 - non-hate |
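
The target format above can be illustrated with a minimal sketch in Python, using only the standard library. The dataset name `mahahate`, the language code `mr`, the source label strings `HATE` and `NOT`, the zero-based index, and the helper names `make_uid`, `to_target_rows`, and `rows_to_csv` are all assumptions made for illustration; the actual label names and split layout of each source dataset need to be checked before use.

```python
import csv
import io

TARGET_COLUMNS = ["UID", "text", "label_yn"]
VALID_SPLITS = {"train", "test", "val"}


def make_uid(dataset, language_code, split, index):
    """Build a UID shaped like <dataset>_<language_code>_<train/test/val>_<index_number>."""
    if split not in VALID_SPLITS:
        raise ValueError(f"unexpected split: {split!r}")
    return f"{dataset}_{language_code}_{split}_{index}"


def to_target_rows(records, dataset, language_code, split, hate_labels):
    """Convert (text, source_label) pairs into rows of the target format.

    `hate_labels` is the set of source labels mapped to 1 (hate);
    every other source label is mapped to 0 (non-hate).
    """
    rows = []
    for index, (text, source_label) in enumerate(records):  # zero-based index (assumption)
        rows.append({
            "UID": make_uid(dataset, language_code, split, index),
            "text": text,
            "label_yn": 1 if source_label in hate_labels else 0,
        })
    return rows


def rows_to_csv(rows):
    """Serialize target-format rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=TARGET_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample records; real rows would be read from the source files.
sample = [("उदाहरण वाक्य एक", "HATE"), ("उदाहरण वाक्य दोन", "NOT")]
rows = to_target_rows(sample, "mahahate", "mr", "train", hate_labels={"HATE"})
csv_text = rows_to_csv(rows)
```

Aggregating several sources then amounts to concatenating the row lists produced for each dataset and split. Because the dataset name and split are part of the UID, identifiers stay unique across sources as long as each dataset and split pair is converted only once.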
