_pages/mm-argfallacy2025.md
60 additions & 3 deletions
@@ -49,25 +49,82 @@ For each sub-task, participants can leverage the debate context of a given input
49
49
# Data
50
50
51
51
52
-
We use **MM-USED-fallacy** and release a version of the dataset specifically designed for argumentative fallacy detection. This dataset includes 1,891 sentences from [Haddadan et al.'s (2019)](https://aclanthology.org/P19-1463.pdf) dataset on US presidential elections. Each sentence is labeled with one of six argumentative fallacy categories, as introduced by [Goffredo et al. (2022)](https://www.ijcai.org/proceedings/2022/575).
52
+
We use **MM-USED-fallacy** and release a version of the dataset specifically designed for argumentative fallacy detection. This dataset includes 1,278 sentences from [Haddadan et al.'s (2019)](https://aclanthology.org/P19-1463.pdf) dataset on US presidential elections. Each sentence is labeled with one of six argumentative fallacy categories, as introduced by [Goffredo et al. (2022)](https://www.ijcai.org/proceedings/2022/575).
53
53
54
54
Inspired by observations from [Goffredo et al. (2022)](https://www.ijcai.org/proceedings/2022/575) on the benefits of leveraging multiple argument mining tasks for fallacy detection and classification, we also provide additional datasets to encourage multi-task learning. A summary is provided in the table below:
|**MM-USED-fallacy**| A multimodal extension of the USElecDeb60to20 dataset, covering US presidential debates (1960–2020). Includes labels for argumentative fallacy detection and argumentative fallacy classification. | 1,278 samples (updated version)|
61
+
|**MM-USED**| A multimodal extension of the USElecDeb60to16 dataset, covering US presidential debates (1960–2016). Includes labels for argumentative sentence detection and component classification. | 23,505 sentences (updated version)|
60
62
|**UKDebates**| 386 sentences and audio samples from the 2015 UK Prime Ministerial elections. Sentences are labeled for argumentative sentence detection: containing or not containing a claim. | 386 sentences |
61
63
|**M-Arg**| A multimodal dataset for argumentative relation classification from the 2020 US Presidential elections. Sentences are labeled as attacking, supporting, or unrelated to another sentence. | 4,104 pairs |
62
-
|**MM-USED**| A multimodal extension of the USElecDeb60to16 dataset, covering US presidential debates (1960–2016). Includes labels for argumentative sentence detection and component classification. | 26,781 sentences |
64
+
63
65
64
66
---
65
67
66
68
All datasets will be available through [MAMKit](https://nlp-unibo.github.io/mamkit/).
67
69
68
70
Since many multimodal datasets cannot release audio samples due to copyright restrictions, MAMKit provides an interface to dynamically build datasets and promote reproducible research.
69
71
70
-
Datasets are formatted as `torch.Dataset` objects, containing input values (text, audio, or both) and corresponding task-specific labels. More details about data formats and dataset building are available in MAMKit's documentation.
72
+
Datasets are formatted as `torch.utils.data.Dataset` objects, containing input values (text, audio, or both) and corresponding task-specific labels. More details about data formats and dataset building are available in MAMKit's documentation.

## Retrieving the Data through MAMKit
73
+
74
+
To retrieve the datasets through MAMKit, you can use the following code interface:
75
+
76
+
```python
77
+
from mamkit.data.datasets import MMUSEDFallacy, USEDFallacy, MMUSED, UKDebates, MArg, InputMode
    input_mode=InputMode.TEXT_AUDIO,  # Choose between TEXT_ONLY, AUDIO_ONLY, or TEXT_AUDIO
88
+
    base_data_path=base_data_path
89
+
)
90
+
91
+
# MM-USED dataset
92
+
mm_used_loader = MMUSED(
93
+
    task_name='asd',  # Choose between 'asd' or 'acc'
94
+
    input_mode=InputMode.TEXT_AUDIO,  # Choose between TEXT_ONLY, AUDIO_ONLY, or TEXT_AUDIO
95
+
    base_data_path=base_data_path
96
+
)
97
+
98
+
# UKDebates dataset
99
+
uk_debates_loader = UKDebates(
100
+
    task_name='asd',
101
+
    input_mode=InputMode.TEXT_AUDIO,  # Choose between TEXT_ONLY, AUDIO_ONLY, or TEXT_AUDIO
102
+
    base_data_path=base_data_path
103
+
)
104
+
105
+
# M-Arg dataset
106
+
m_arg_loader = MArg(
107
+
    task_name='arc',
108
+
    input_mode=InputMode.TEXT_AUDIO,  # Choose between TEXT_ONLY, AUDIO_ONLY, or TEXT_AUDIO
109
+
    base_data_path=base_data_path
110
+
)
111
+
```
112
+
113
+
Each loader is initialized with a task name (`afc` for argumentative fallacy classification, `asd` for argumentative sentence detection, `acc` for argumentative component classification, and `arc` for argumentative relation classification), an input mode (`InputMode.TEXT_ONLY`, `InputMode.AUDIO_ONLY`, or `InputMode.TEXT_AUDIO`), and the base data path.
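
The pairing of datasets, task names, and input modes described above can be checked before any loader is built. The snippet below is a minimal sketch in plain Python and is not part of MAMKit: the `SUPPORTED_TASKS` table, the `INPUT_MODES` set, and the `check_loader_config` helper are hypothetical names, and the table lists only the task names mentioned on this page, so it may need extending if MAMKit's documentation names further tasks.

```python
# Hypothetical lookup assembled from the task names mentioned on this page.
# It is NOT a MAMKit API; extend it if the MAMKit documentation lists more tasks.
SUPPORTED_TASKS = {
    "MMUSEDFallacy": {"afc"},
    "MMUSED": {"asd", "acc"},
    "UKDebates": {"asd"},
    "MArg": {"arc"},
}

# Names of the three input modes mentioned on this page, kept as plain strings here.
INPUT_MODES = {"TEXT_ONLY", "AUDIO_ONLY", "TEXT_AUDIO"}


def check_loader_config(dataset: str, task_name: str, input_mode: str) -> None:
    """Raise ValueError if the combination is not one described on this page."""
    if dataset not in SUPPORTED_TASKS:
        raise ValueError(f"Unknown dataset: {dataset!r}")
    if task_name not in SUPPORTED_TASKS[dataset]:
        allowed = sorted(SUPPORTED_TASKS[dataset])
        raise ValueError(f"{dataset} does not list task {task_name!r}; expected one of {allowed}")
    if input_mode not in INPUT_MODES:
        raise ValueError(f"Unknown input mode: {input_mode!r}")


# A combination taken from the loaders on this page passes silently.
check_loader_config("UKDebates", "asd", "TEXT_AUDIO")

# A mismatched task name is rejected with a readable message.
try:
    check_loader_config("UKDebates", "arc", "TEXT_ONLY")
except ValueError as err:
    print(err)
```

A check of this kind is only a convenience for catching typos in task names early; the loaders themselves remain the authority on what is supported.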
114
+
115
+
Ensure that you have MAMKit installed and properly configured in your environment to use these loaders.
116
+
117
+
For more details, refer to the MAMKit [GitHub repository](https://github.com/nlp-unibo/mamkit) and [website](https://nlp-unibo.github.io/mamkit/).
118
+
119
+
120
+
### References
121
+
122
+
- **MM-USED-fallacy**: [Mancini et al. (2024)](https://aclanthology.org/2024.eacl-short.16.pdf). The version provided through MAMKit includes updated samples, with refinements in the alignment process. This results in a different number of samples compared to the original dataset.
123
+
- **MM-USED**: [Mancini et al. (2022)](https://aclanthology.org/2022.argmining-1.15.pdf). The version provided through MAMKit includes updated samples, with refinements in the alignment process. This results in a different number of samples compared to the original dataset.
124
+
- **UKDebates**: [Lippi and Torroni (2016)](https://ojs.aaai.org/index.php/AAAI/article/view/10384).
125
+
- **M-Arg**: [Mestre et al. (2021)](https://aclanthology.org/2021.argmining-1.8.pdf).
126
+
127
+
**Note**: By "updated version," we mean that the datasets have undergone a refinement in the alignment process, which has resulted in adjustments to the number of samples included compared to the original versions published in the referenced papers.
71
128
72
129
# Evaluation
73
130
For argumentative fallacy detection, we will compute the binary F1-score on predicted sentence-level labels.
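
To make the metric concrete, the sketch below computes a binary F1-score from sentence-level labels in plain Python. It is an illustration and not the official scorer: the `binary_f1` helper is a hypothetical name, the encoding (1 for a sentence containing a fallacy, 0 otherwise) is an assumption, and the toy labels are invented for the example.

```python
from typing import Sequence


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1-score of the positive class; 0.0 when there are no true positives."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumed encoding): 1 = sentence contains a fallacy, 0 = it does not.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]

score = binary_f1(gold, pred)
print(f"binary F1 = {score:.3f}")
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1-score is 2/3 as well.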