SahPet/ai-citation-auditor

AI Citation Auditor 📚🔍

An automated tool that uses Google Gemini (Flash & Pro) to verify academic citations in your manuscripts. It checks whether each citation exists, whether its metadata is correct, and whether the cited paper actually supports the claim made in your text.

Features

  • Dual Mode Support:
    • docx_citation_checker.py: Scans Microsoft Word (.docx) files directly.
    • tex_citation_checker.py: Scans LaTeX (.tex) files and parses standard .bib files.
  • AI-Powered Verification:
    • Uses Gemini Flash for rapid, low-cost checking.
    • Automatically escalates "Suspicious" or "Partial" matches to Gemini Pro for deep reasoning.
  • Smart Context: Understands "split support", where several citations, e.g. (1, 2), collectively support a single sentence.
  • Audit Reports: Generates a detailed CSV audit log and a "Deviant Entries" text report highlighting only the problematic citations.
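
As a rough illustration of the "split support" idea, the sketch below groups numeric citations that share one pair of parentheses and pairs each group with its sentence. The function name `extract_citation_groups` and both regular expressions are assumptions made for this example, not the tool's actual parser, and the naive sentence splitter will stumble on abbreviations such as "et al.".

```python
import re

# Hypothetical pattern: parenthesised numeric citations such as "(1)" or "(1, 2)".
CITATION_GROUP = re.compile(r"\((\d+(?:\s*,\s*\d+)*)\)")
# Naive sentence splitter: break after ., ! or ? when followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_citation_groups(text):
    """Return a list of (sentence, [citation numbers]) for each citation group."""
    results = []
    for sentence in SENTENCE_END.split(text.strip()):
        for match in CITATION_GROUP.finditer(sentence):
            # Citations inside one pair of parentheses are kept together,
            # so they can be judged as jointly supporting the sentence.
            numbers = [int(n) for n in match.group(1).split(",")]
            results.append((sentence, numbers))
    return results


groups = extract_citation_groups(
    "Aspirin reduces fever (1, 2). It is widely used (3)."
)
for sentence, numbers in groups:
    print(numbers, "->", sentence)
```

Keeping the group together lets a later check ask whether the cited papers support the sentence collectively, rather than failing each one for covering only part of the claim.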

Setup

  1. Clone the repository:

    git clone https://github.com/SahPet/ai-citation-auditor.git
    cd ai-citation-auditor
  2. Install requirements:

    pip install -r requirements.txt
  3. Get a Gemini API Key:

    • Get a free key from Google AI Studio.
    • Create a file named .env in the project folder:
    GEMINI_API_KEY=your_actual_api_key_here
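
For readers curious how a script might pick up this key, here is a minimal standard-library sketch. The helper names `load_env_file` and `get_gemini_api_key` are hypothetical, and the real scripts may well use a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict.

    Blank lines, comment lines, and lines without "=" are skipped.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with a clear error is usually friendlier than letting the first API call fail with an authentication message.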

Usage

For Word Documents (.docx)

  1. Run the script:

    python docx_citation_checker.py
  2. Select your .docx file in the popup dialog.

  3. The script will generate:

    • filename_audit.csv: Full report of all citations.
    • filename_deviant_entries.txt: Detailed reasoning for citations marked "Incorrect" or "No".
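
The relationship between the two output files can be sketched as follows. This is only an illustration: the function `write_reports`, the column names (`citation`, `verdict`, `reasoning`), and the choice to write next to the input document are all assumptions, not the script's confirmed behaviour.

```python
import csv
from pathlib import Path

# Verdicts that, per the description above, end up in the deviant report.
DEVIANT_VERDICTS = {"Incorrect", "No"}


def write_reports(document_path, rows):
    """Write <stem>_audit.csv and <stem>_deviant_entries.txt beside the document.

    `rows` is a list of dicts with the hypothetical keys
    "citation", "verdict", and "reasoning".
    """
    document_path = Path(document_path)
    audit_path = document_path.with_name(f"{document_path.stem}_audit.csv")
    deviant_path = document_path.with_name(f"{document_path.stem}_deviant_entries.txt")

    # Full log: one CSV row per citation, regardless of verdict.
    with audit_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["citation", "verdict", "reasoning"])
        writer.writeheader()
        writer.writerows(rows)

    # Short report: only the problematic citations, with their reasoning.
    deviant = [row for row in rows if row["verdict"] in DEVIANT_VERDICTS]
    with deviant_path.open("w", encoding="utf-8") as handle:
        for row in deviant:
            handle.write(f"{row['citation']} [{row['verdict']}]\n{row['reasoning']}\n\n")

    return audit_path, deviant_path
```

With an input named `paper.docx`, this sketch would produce `paper_audit.csv` and `paper_deviant_entries.txt`.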

For LaTeX Projects (.tex + .bib)

  1. Run the script:

    python tex_citation_checker.py
  2. Select your main .tex file and your .bib bibliography file when prompted.

  3. Review the generated console output and CSV reports.
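
To give a feel for why both files are needed, the sketch below collects the keys used in `\cite`-style commands and looks them up among the entry keys of a `.bib` file. The regular expressions and function names are assumptions for illustration; the actual script may rely on a proper BibTeX parser and handle more citation commands.

```python
import re

# \cite, \citep, \citet, ... with optional [..] arguments, then {key1,key2}.
CITE_COMMAND = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Start of a BibTeX entry, e.g. "@article{smith2020,".
BIB_ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# Entry types that do not define a citable reference.
NON_REFERENCE_TYPES = {"comment", "string", "preamble"}


def cited_keys(tex_source):
    """Return citation keys in order of first appearance, without duplicates."""
    keys = []
    for match in CITE_COMMAND.finditer(tex_source):
        for key in match.group(1).split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def bib_keys(bib_source):
    """Return the set of entry keys defined in a .bib file's text."""
    return {
        m.group(2)
        for m in BIB_ENTRY.finditer(bib_source)
        if m.group(1).lower() not in NON_REFERENCE_TYPES
    }


def missing_from_bib(tex_source, bib_source):
    """List cited keys that have no matching entry in the bibliography."""
    available = bib_keys(bib_source)
    return [key for key in cited_keys(tex_source) if key not in available]
```

A key that appears in the text but not in the bibliography has no metadata to verify, so it is worth surfacing before any model call is made.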

How It Works

The tool first extracts citations, then verifies them in two stages (Flash, then Pro):

  1. Extraction: It parses your document to find every citation and the sentence it belongs to (the "Context").
  2. Flash Check: It sends the Citation + Context to Gemini Flash to ask: "Does this paper support this claim?"
  3. Pro Audit: If Flash returns "No", "Partially", or "Suspicious Integrity", the tool calls Gemini Pro to double-check the result. If Pro suggests a replacement citation, it also provides a detailed "Why is this better?" explanation.
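
The escalation logic in steps 2 and 3 can be sketched as below. The model calls are deliberately left as hypothetical callables (`flash_check`, `pro_audit`) that return a dict with a `"verdict"` key, because the exact prompts and Gemini client code are not shown in this README.

```python
# Flash verdicts that, per the steps above, trigger a second opinion from Pro.
ESCALATE_ON = {"No", "Partially", "Suspicious Integrity"}


def audit_citation(citation, context, flash_check, pro_audit):
    """Run the cheap check first; escalate only when its verdict is doubtful.

    `flash_check` and `pro_audit` are hypothetical callables taking
    (citation, context) and returning a dict with at least a "verdict" key.
    """
    flash_result = flash_check(citation, context)
    if flash_result["verdict"] not in ESCALATE_ON:
        # Flash is satisfied, so the more expensive model is never called.
        return {"stage": "flash", **flash_result}
    pro_result = pro_audit(citation, context)
    # Keep the Flash verdict alongside the Pro result for the audit log.
    return {"stage": "pro", "flash_verdict": flash_result["verdict"], **pro_result}


# Usage with stand-in functions instead of real API calls:
def fake_flash(citation, context):
    return {"verdict": "Partially"}


def fake_pro(citation, context):
    return {"verdict": "No", "reasoning": "The paper studies a different population."}


print(audit_citation("(1)", "Aspirin reduces fever (1).", fake_flash, fake_pro))
```

Passing the checkers in as arguments keeps the decision rule testable without network access or an API key.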

License

MIT License
