Toxicity Begets Toxicity: Unraveling Conversational Chains in Political Podcasts

Accepted at ACM Multimedia 2025

Naquee Rizwan, Nayandeep Deb, Sarthak Roy, Vishwajeet Singh Solanki, Kiran Garimella, Animesh Mukherjee: [arXiv] (main content and appendix in one PDF), [Hugging Face]

Abstract

Tackling toxic behavior in digital communication continues to be a pressing concern for both academics and industry professionals. While significant research has explored toxicity on platforms like social networks and discussion boards, podcasts—despite their rapid rise in popularity—remain relatively understudied in this context. This work seeks to fill that gap by curating a dataset of political podcast transcripts and analyzing them with a focus on conversational structure. Specifically, we investigate how toxicity surfaces and intensifies through sequences of replies within these dialogues, shedding light on the organic patterns by which harmful language can escalate across conversational turns.

Dataset

The top 100 toxic conversation chains for each of the conservative and liberal podcast channels, together with their ground-truth change point detection (CPD) annotations, are included in this GitHub repository under [cpd/dataset]. This folder contains:

Two annotation CSV files (one each for conservatives and liberals) containing the annotations of individual annotators (e.g., Annotator_ND) as well as the labels obtained by majority voting (see 'Inter_Annotator'). These files also contain the CPD results predicted by traditional CPD algorithms (see the [ruptures] library).
Two JSON files (one each for conservatives and liberals) containing the details of the top 100 toxic conversation chains.
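The majority-voted labels mentioned above can be illustrated with a small sketch. It assumes each annotator marks a set of segment indices as change points and that a point is kept when a strict majority of annotators agree on it. The data layout and the annotator names other than Annotator_ND are hypothetical; the CSV files in the repository may be organised differently, and the authors' exact voting rule is not stated here.

```python
from collections import Counter


def majority_vote(annotations):
    """Return sorted change points marked by a strict majority of annotators.

    annotations: dict mapping annotator name -> iterable of segment indices
    (hypothetical layout; the repository CSVs may be organised differently).
    """
    n_annotators = len(annotations)
    votes = Counter()
    for points in annotations.values():
        votes.update(set(points))  # one vote per annotator per index
    return sorted(p for p, count in votes.items() if count > n_annotators / 2)


example = {
    "Annotator_ND": [3, 7],
    "Annotator_A": [3, 8],  # hypothetical annotator name
    "Annotator_B": [3, 7],  # hypothetical annotator name
}
consensus = majority_vote(example)
print(consensus)
```

With three annotators, an index needs at least two votes to survive, so a point marked by a single annotator is dropped.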

Hugging Face

We also provide a [Hugging Face] dataset with:

Audio clips (.wav files) of the top 100 toxic conversation chains (for both conservatives and liberals). These files are required to run the audio prompts in [cpd/dataset/audio_prompt_cpd.py]. Note: update the folder paths in the code to match your setup before running it.
All toxic conversation chains from both conservative and liberal podcast channels. As stated in the paper, a toxic conversation chain is one whose anchor segment has a toxicity value greater than 0.7.
The complete diarized dataset, with toxicity scores calculated using the Perspective API, for both conservative and liberal podcast channels.
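The anchor rule above (a segment whose toxicity value is greater than 0.7 anchors a chain) could be applied to scored segments roughly as follows. This is a minimal sketch, not the authors' pipeline: the field names `text` and `toxicity`, the function name, and the `context` parameter (how many neighbouring segments to keep on each side) are assumptions made for illustration.

```python
TOXICITY_THRESHOLD = 0.7  # anchor threshold stated in the paper


def extract_toxic_chains(segments, context=2):
    """Build one chain per segment whose toxicity exceeds the threshold.

    segments: list of dicts with "text" and "toxicity" keys, in episode order
    (hypothetical field names). context: number of neighbouring segments kept
    on each side of the anchor (an assumed value, not taken from the paper).
    """
    chains = []
    for i, seg in enumerate(segments):
        if seg["toxicity"] > TOXICITY_THRESHOLD:
            start = max(0, i - context)
            chains.append({
                "anchor_index": i,
                "previous": segments[start:i],
                "anchor": seg,
                "next": segments[i + 1:i + 1 + context],
            })
    return chains


episode = [
    {"text": "s0", "toxicity": 0.10},
    {"text": "s1", "toxicity": 0.35},
    {"text": "s2", "toxicity": 0.82},  # anchor: above 0.7
    {"text": "s3", "toxicity": 0.70},  # not an anchor: equal to, not above, 0.7
    {"text": "s4", "toxicity": 0.20},
]
chains = extract_toxic_chains(episode, context=1)
print([c["anchor_index"] for c in chains])
```

The comparison is strict, so a segment scoring exactly 0.7 does not become an anchor in this sketch.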

Appendix

ACM MM 2025 did not provide for supplementary material, so we make the appendix available [here].

Snippets from paper

Schema of conversation chain

Top: Computation of segments from chunks and their contents. Bottom: Schema for toxic conversation chains. The segment marked in red is the anchor segment, whose toxicity is above the threshold of 0.7.

Examples of chains

Since it is not feasible to illustrate all segments, only one of the previous and next segments is shown along with the anchor segment of each toxic conversation chain. Toxic text in the anchor segment is marked in red. Note: start and end times are in seconds.

Toxicity begets toxicity

We use conversation chains to identify and predict toxicity trends with change point detection algorithms. The plots show two samples comparing human annotations with the change points detected by GPT-4o-Audio in a zero-shot setup. Correctly predicted points are marked with a red hexagon and incorrect predictions with a red cross.
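The comparison between predicted and human-annotated change points can be sketched as a simple matching step. The snippet below is an illustration under stated assumptions, not the evaluation protocol of the paper: the function name, the greedy one-to-one matching, and the `tolerance` parameter (how many segments apart a prediction may be and still count as correct) are all hypothetical.

```python
def match_change_points(predicted, annotated, tolerance=0):
    """Greedy one-to-one matching of predicted to annotated change points.

    A prediction counts as correct when an unused annotated point lies within
    `tolerance` segments of it. The tolerance is an assumption of this sketch.
    """
    unused = sorted(annotated)
    correct, incorrect = [], []
    for p in sorted(predicted):
        hit = next((a for a in unused if abs(a - p) <= tolerance), None)
        if hit is None:
            incorrect.append(p)
        else:
            unused.remove(hit)  # each annotated point can be matched once
            correct.append(p)
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(annotated) if annotated else 0.0
    return correct, incorrect, precision, recall


correct, incorrect, precision, recall = match_change_points(
    predicted=[2, 5, 9], annotated=[2, 6], tolerance=1
)
print(correct, incorrect, precision, recall)
```

In plot terms, the `correct` list corresponds to the points that would be drawn as hexagons and the `incorrect` list to those drawn as crosses.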

Podcast Statistics

Left: Top 10 podcast shows with the most toxic content for each leaning (right and left). The bars show the percentage of episodes containing at least one toxic conversation. Right: Distribution of toxic conversation chains across podcast channels for each leaning (right and left). The percentage contributions of the top 10 podcast channels are shown.
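The per-show statistic in the left panel (percentage of episodes containing at least one toxic conversation) can be computed along these lines. This is a hedged sketch: the input layout of `(show, n_toxic_chains)` pairs and the show names are hypothetical and do not reflect the actual dataset files.

```python
from collections import defaultdict


def toxic_episode_share(episodes):
    """Percentage of episodes per show with at least one toxic chain.

    episodes: iterable of (show, n_toxic_chains) pairs (hypothetical layout).
    """
    total = defaultdict(int)
    toxic = defaultdict(int)
    for show, n_chains in episodes:
        total[show] += 1
        if n_chains > 0:
            toxic[show] += 1
    return {show: 100.0 * toxic[show] / total[show] for show in total}


sample = [("show_a", 0), ("show_a", 3), ("show_b", 1), ("show_b", 2)]
shares = toxic_episode_share(sample)
# Keep the ten shows with the highest share, as in the bar chart.
top = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)[:10]
print(top)
```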

Quantitative results

Please cite our paper

@misc{rizwan2025dynamicstoxicitypoliticalpodcasts,
      title={Dynamics of Toxicity in Political Podcasts}, 
      author={Naquee Rizwan and Nayandeep Deb and Sarthak Roy and Vishwajeet Singh Solanki and Kiran Garimella and Animesh Mukherjee},
      year={2025},
      eprint={2501.12640},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.12640}, 
}

Contact

For any questions or issues, please contact: [email protected]

hate-alert/ToxicityBegetsToxicity-Audio

Folders and files

Latest commit

History

Repository files navigation

Toxicity Begets Toxicity: Unraveling Conversational Chains in Political Podcasts

Abstract

Dataset

Hugging Face

Appendix

Snippets from paper

Schema of conversation chain

Examples of chains

Toxicity begets toxicity

Podcast Statistics

Quantitative results

Please cite our paper

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages