This repository contains replication code for a research project evaluating large language models' ability to fact-check PolitiFact claims using various approaches.
[TODO: Add citation information when paper is up on arxiv]
- `code/` - all code for the project
- `data/` - all data for the project
- `figures/` - generated publication figures
- `reports/` - generated reports/text files
- `tables/` - generated publication tables
These experiments were conducted using:
- Python 3.12.7
- GNU bash, version 3.2.57(1)-release (arm64-apple-darwin24)
Set up virtual environment and install dependencies:
```shell
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install local packages
cd code/package
pip install -e ./
cd ../data_collection/politifact_scraper
pip install -e ./
cd ../../..
```

Download the required data from Zenodo via the link below.
- Zenodo DOI/link: https://doi.org/10.5281/zenodo.17693220
Then, extract the data.
This will create a directory called `data/`, which must be saved in the root directory of this repository.
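As a quick sanity check (a sketch, assuming you run it from the repository root), you can confirm the archive was extracted to the expected location:

```shell
# Returns 0 if data/ exists in the current (repo root) directory
check_data_dir() {
  [ -d data ]
}

if check_data_dir; then
  echo "data/ found in repo root"
else
  echo "data/ missing: extract the Zenodo archive into the repo root"
fi
```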
All steps, from data collection and cleaning through analysis and figure generation, can be replicated by running the bash scripts below, after setting up the virtual environment and downloading the data as described above.
```shell
# From the root directory of this project, run the below

# Change to code directory
cd code/

# Run pipelines
bash 00-db-collection-and-generation-pipeline.sh  # Scrape data, build DB
bash 01-run-factchecking-tests.sh                 # Run LLM tests
bash 02-data-analysis-pipeline.sh                 # Clean & analyze data
bash 03-generate-results-and-figures.sh           # Generate outputs
```

Certain scripts within the above pipeline are commented out by default because they take extremely long to run or would incur thousands of dollars in costs for the user. Moreover, scraped data and data generated by LLMs are unlikely to be exactly the same if collected at a later date (see this article for more details on LLM nondeterminism). We therefore prioritize transparency about our pipeline and replication, given the data we have. Nonetheless, we include all steps to show our work and allow users to replicate the entire pipeline, should they choose to do so. The bash pipeline scripts print notes about what is excluded as they execute.
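If you want a record of each stage, you can wrap the pipeline scripts with `tee` to keep timestamped logs. This is a convenience sketch, not part of the repository's scripts; the `logs/` directory is an assumption, and the loop runs from `code/`:

```shell
# Convenience sketch: run each pipeline stage and keep a timestamped
# log of its output in a logs/ directory (created here if absent)
mkdir -p logs
for script in 00-db-collection-and-generation-pipeline.sh \
              01-run-factchecking-tests.sh \
              02-data-analysis-pipeline.sh \
              03-generate-results-and-figures.sh; do
  bash "$script" 2>&1 | tee "logs/${script%.sh}-$(date +%Y%m%d-%H%M%S).log"
done
```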
Unfortunately, we cannot share the proprietary NewsGuard data that we purchased for this study.
The pipeline is set up to replicate with the Lin et al. (2023) domain-quality list, which we included in the domain-quality sensitivity analysis in the Appendix.
Should you have your own version of the NewsGuard data, you can save it in the proper location and rerun the pipeline; it will be included automatically.
The `code/data_analysis/enrich_web_url_data.py` script has more information about including the NewsGuard data.
You will also need to uncomment one script call at the bottom of the `03-generate-results-and-figures.sh` pipeline script to generate the main text figure.
See the notes in that script for details.
Various scripts require API keys. By default, these are commented out in the pipeline scripts above, as they will incur costs for the user.
Should you want to run these scripts, you will need to set the following environment variables in your system:
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `TOGETHER_API_KEY`
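For example, you can export them in your shell session (or add the lines to your shell profile). The values below are hypothetical placeholders; substitute your real keys:

```shell
# Replace the placeholder values with your actual API keys
export OPENAI_API_KEY="your-openai-key"
export GEMINI_API_KEY="your-gemini-key"
export TOGETHER_API_KEY="your-together-key"
```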
For questions, please reach out to Matt DeVerna by visiting his personal website for his latest contact email.