
Effects of content removal timeliness on illegal content persistence


osome-iu/simsom_removal

 
 


Explore the effects of takedown delay on the persistence of illegal content

This repository contains code to reproduce the results in the paper "Delayed takedown of illegal content on social media makes moderation ineffective".

The model is an extension of SimSoM: A Simulator of Social Media

Overview of the repo

  1. data: contains raw & derived datasets
  2. example: contains a minimal example to start using the SimSoM model. The model was written and tested with Python >= 3.6
  3. experiments: experiment configurations, results, supplementary data, and .ipynb notebooks to produce the figures reported in the paper
  4. libs: contains the extended SimSoM model package that can be imported into scripts
  5. workflow:
    • rules contains scripts to run the experiments
    • scripts contains helper scripts used by rules scripts. These include functions such as network initialization, data parsing, etc.

1. Install SimSoM

We include two ways to set up the environment and install the model.

1.1. Using Make (the simplest way, recommended)

Run make from the project root directory (simsom_removal)

1.2. Using Conda

We use conda, a package manager, to manage the development environment. Please make sure you have conda or mamba installed on your machine

1.2.1. Create the environment with required packages: run conda env create -n simsom -f environment.yml

1.2.2. Install the SimSoM module:

  • activate virtualenv: conda activate simsom
  • run pip install -e ./libs/

2. Plot results from the paper

Run the notebooks in experiments/figures to visualize the experiment results in the paper

The results in the paper are based on averages across 10+ simulation runs. For step 3 below, the shell script is configured to run one simulation for each experiment setting.

To run more simulations, change the NO_RUNS variable in workflow/rules/run_experiment.sh. However, since running multiple simulations takes a long time, we suggest running many of them in parallel. See workflow/rules/run_exps.smk for inspiration on how to do this with the workflow manager Snakemake. Also note that saving all message information as a gzip-compressed file takes about 20-500 megabytes per run.
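Averaging metrics across runs can be sketched as follows. This is a minimal illustration, not the repo's actual aggregation code: the per-run dicts and the metric names ("illegal_prevalence", "reach") are synthetic stand-ins for values that would really be parsed from the files under experiments/results.

```python
import statistics

# Synthetic per-run results; real values come from parsed experiment output.
runs = [
    {"illegal_prevalence": 0.12, "reach": 340},
    {"illegal_prevalence": 0.15, "reach": 310},
    {"illegal_prevalence": 0.10, "reach": 355},
]

def average_metric(runs, key):
    """Mean of one metric across simulation runs."""
    return statistics.mean(run[key] for run in runs)

print(average_metric(runs, "reach"))  # mean reach over the three runs
```

With 10+ runs per setting, the same pattern applies; only the list of per-run records grows.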

3. Reproduce results from scratch

The steps to reproduce the results from scratch, rather than using the provided results in experiments/results, are outlined below. Warning: following these steps will overwrite the contents of experiments/results. All scripts are run from the project root directory, simsom_removal.

3.1. Run experiments

3.1.1. Unzip the data file: unzip data/data.zip -d .

3.1.2. Automatically run all experiments to reproduce the results in the paper by running two commands:

  • make the file executable: chmod +x workflow/rules/run_experiment.sh
  • run the shell script: workflow/rules/run_experiment.sh

This script does two things:

  • Creates configuration folders for all experiments (see experiments/config for the results of this step)
  • Runs the run_exps.py script with an argument specifying the experiment to run:
    • vary_tau: main results
    • vary_group_size: robustness check for varying group sizes
    • vary_illegal_probability: robustness check for varying illegal probabilities
    • vary_network_type: robustness check for varying network structures

3.2. Parse experiment data

We are interested in the prevalence of illegal content and in engagement metrics such as reach and impressions. To aggregate these metrics, parse the experiments' verbose tracking files by running:

  • For reach and impressions: python workflow/scripts/read_data_engagement.py --result_path experiments/<experiment_name> --out_path experiments/results/<experiment_name>
  • For prevalence of illegal content: python workflow/scripts/read_data_illegal_count.py --result_path experiments/<experiment_name> --out_path experiments/results/<experiment_name>
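The kind of work these parsing scripts do can be sketched as below: read a gzip-compressed file of per-message records and tally illegal-content prevalence and impressions. The record schema ("is_illegal", "impressions") is an assumption for illustration and may differ from the actual verbose tracking format.

```python
import gzip
import json
import os
import tempfile

def summarize_messages(path):
    """Tally prevalence of illegal content and total impressions
    from a gzip-compressed file of JSON records, one per line.
    NOTE: the field names are hypothetical, not the repo's real schema."""
    total, illegal, impressions = 0, 0, 0
    with gzip.open(path, "rt") as fh:
        for line in fh:
            record = json.loads(line)
            total += 1
            illegal += record["is_illegal"]
            impressions += record["impressions"]
    prevalence = illegal / total if total else 0.0
    return {"prevalence": prevalence, "impressions": impressions}

# Build a tiny synthetic file to demonstrate.
records = [
    {"is_illegal": 1, "impressions": 5},
    {"is_illegal": 0, "impressions": 12},
]
with tempfile.NamedTemporaryFile(suffix=".jsonl.gz", delete=False) as tmp:
    path = tmp.name
with gzip.open(path, "wt") as fh:
    for r in records:
        fh.write(json.dumps(r) + "\n")

print(summarize_messages(path))  # {'prevalence': 0.5, 'impressions': 17}
os.remove(path)
```

Streaming the file line by line keeps memory flat even for the larger (20-500 MB) runs mentioned above.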

See step 2 above to visualize the newly created results.

Other notes

Data description

The empirical network is created from the Replication Data for: Right and left, partisanship predicts vulnerability to misinformation, where:

  • measures.tab contains user information, i.e., each user's partisanship and misinformation score.
  • anonymized-friends.json is the adjacency list.

We reconstruct the empirical network from the above two files, resulting in data/follower_network.gml. The steps are specified in the script that creates the empirical network.
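The core of such a reconstruction can be sketched as follows. The JSON layout (user mapped to a list of friend IDs) and the edge direction are assumptions for illustration; the real script additionally attaches the partisanship and misinformation attributes from measures.tab and writes data/follower_network.gml.

```python
import json

# Hypothetical adjacency list in the style of anonymized-friends.json.
adjacency_json = '{"u1": ["u2", "u3"], "u2": ["u3"], "u3": []}'
adjacency = json.loads(adjacency_json)

# Flatten into a directed edge list (source user -> listed friend).
edges = [(user, friend)
         for user, friends in adjacency.items()
         for friend in friends]

# Node set: every key plus every friend that appears only as a target.
nodes = sorted(set(adjacency) | {friend for _, friend in edges})

print(len(nodes), len(edges))  # 3 nodes, 3 edges
```

From here, a graph library such as networkx could load the edge list and export GML, matching the data/follower_network.gml output named above.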

Step-by-step instructions and an example of running SimSoM

Check out example to get started.

  • Example of the simulation and results: example/run_simulation.ipynb
