halfmorepiece/PhantomCircuit

If you like our project, please give us a star ⭐ on GitHub to get the latest updates.

📣 News

  • [2025/09/06] 🚀 Source code released!
  • [2025/08/21] 🎉🎉🎉 PhantomCircuit has been accepted by EMNLP 2025!

🎯 Overview

We propose PhantomCircuit, a novel framework for analyzing LLM hallucination that leverages knowledge circuits. PhantomCircuit dissects knowledge overshadowing, a variant of hallucination, and provides a potential approach to reducing it. Our key contributions:

  • We propose PhantomCircuit, a knowledge-circuit-based framework for analyzing how overshadowing evolves during the training phase.
  • We pioneer the analysis of information pathways within the circuit to reveal the internal mechanism of overshadowing.
  • We demonstrate a promising strategy that leverages the circuit-based method for overshadowing recovery.

Comprehensive experiments with PhantomCircuit reveal the source of overshadowing during the training stage and its internal mechanism. The proposed overshadowing recovery method reduces overshadowing in many cases. See the paper for more details.

🕹️ Usage

Installation

The environment setup is similar to that of Knowledge Circuits in Pretrained Transformers, which uses the Edge Attribution Patching (EAP) method to construct the circuit. Run:

git clone https://github.com/halfmorepiece/Hello-world.git PhantomCircuit
cd PhantomCircuit
conda env create -f environment.yml

Circuit Analysis and Recovery

  1. Modify the following parameters in config.py:
  • TASK_CONFIGS, SYNTHE_TASK_CONFIGS: prompts for the finetuning and synthetic datasets
  • SELECTED_MODEL: model name on Hugging Face
  • SELECTED_TASK_INDICES, EPOCHS: the tasks and the epoch checkpoints to process
  • MODEL_CONFIGS: the path and trust-code setting for the model trained on the synthetic or finetuning dataset
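
As a rough illustration, the shared settings listed above might look like the sketch below. Only the top-level parameter names come from this README. Every value, every nested key (such as `name`, `prompt`, `path`, and `trust_remote_code`), the placeholder model name `gpt2`, and the helper `selected_prompts` are assumptions made for the example, so check config.py for the actual structure.

```python
from typing import List

# Hypothetical sketch of the shared settings in config.py.
# Only the top-level names come from the README; every value and nested
# key below is an illustrative assumption, not the repository's schema.

# Prompts for finetuning-dataset tasks (entry structure is assumed).
TASK_CONFIGS = [
    {"name": "example_finetune_task", "prompt": "placeholder finetuning prompt"},
]

# Prompts for synthetic-dataset tasks (entry structure is assumed).
SYNTHE_TASK_CONFIGS = [
    {"name": "example_synthetic_task", "prompt": "placeholder synthetic prompt"},
]

# Model name as listed on Hugging Face ("gpt2" is only a placeholder).
SELECTED_MODEL = "gpt2"

# Which tasks (by index) and which epoch checkpoints to process.
SELECTED_TASK_INDICES = [0]
EPOCHS = [1, 5]

# Checkpoint location and trust-code flag per model (keys are assumed).
MODEL_CONFIGS = {
    SELECTED_MODEL: {
        "path": "./checkpoints/example_model",  # placeholder path
        "trust_remote_code": False,
    },
}


def selected_prompts(use_synthetic: bool) -> List[str]:
    """Return the prompts of the selected tasks (hypothetical helper)."""
    tasks = SYNTHE_TASK_CONFIGS if use_synthetic else TASK_CONFIGS
    return [tasks[i]["prompt"] for i in SELECTED_TASK_INDICES]
```

Keeping the task indices separate from the task definitions, as assumed here, lets one file hold many tasks while each run processes only a few.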

  2. For analysis only, set the following parameters in config.py:
  • USE_SYNTHETIC_DATASET: True to analyze a model pretrained on the synthetic dataset, False for the finetuning dataset
  • EAP_CONFIG: the EAP method and the target number of edges
  • ANALYSIS_CONFIG: the analysis details
  • EDGE_OPTIMIZATION_CONFIG.enable: False
  • Run the command below and check the results in the /output_info folder

python main.py
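
The analysis-only settings could be sketched as follows. This is a minimal illustration, not the repository's code: the README only gives the names `USE_SYNTHETIC_DATASET`, `EAP_CONFIG`, `ANALYSIS_CONFIG`, and `EDGE_OPTIMIZATION_CONFIG.enable`. Treating the configs as dictionaries, the keys `method` and `target_edge_num`, the values, and the helper `describe_run` are all assumptions.

```python
# Hypothetical analysis-only settings. Dictionary layout, nested keys,
# and values are assumptions; only the top-level names are from the README.
USE_SYNTHETIC_DATASET = True  # False would select the finetuning dataset

EAP_CONFIG = {
    "method": "EAP",         # placeholder method name
    "target_edge_num": 200,  # placeholder circuit size
}

ANALYSIS_CONFIG = {
    "nodes_to_remove_from_circuit": [],  # no ablation in this sketch
}

# Recovery (edge optimization) stays off for an analysis-only run.
EDGE_OPTIMIZATION_CONFIG = {"enable": False}


def describe_run() -> str:
    """Summarize what the settings above would request (hypothetical helper)."""
    dataset = "synthetic" if USE_SYNTHETIC_DATASET else "finetuning"
    mode = "analysis + recovery" if EDGE_OPTIMIZATION_CONFIG["enable"] else "analysis only"
    return (
        f"{mode} on the {dataset} dataset, "
        f"{EAP_CONFIG['method']} with {EAP_CONFIG['target_edge_num']} edges"
    )


print(describe_run())
```

Printing a one-line summary before launching `python main.py` is an inexpensive way to confirm that the intended mode is selected.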

  3. For both analysis and recovery, set the following parameters in config.py:
  • CO_XSUB_MODE.enable: True to launch automatic component location
  • EDGE_OPTIMIZATION_CONFIG.enable: True
  • (optional) ANALYSIS_CONFIG.nodes_to_remove_from_circuit: the list of nodes to ablate
  • Run the command below and check the results in the /output_info folder

python main.py
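
For the recovery run, a small sanity check on the flags can catch a forgotten switch before a long job starts. The sketch below assumes the configs are plain dictionaries with an `enable` key, mirroring the dotted names in this README. The node labels `a0.h1` and `m2` and the function `check_recovery_settings` are hypothetical and not part of the repository.

```python
from typing import Dict, List

# Hypothetical recovery settings; dictionary layout and node labels are assumed.
CO_XSUB_MODE = {"enable": True}
EDGE_OPTIMIZATION_CONFIG = {"enable": True}
ANALYSIS_CONFIG = {"nodes_to_remove_from_circuit": ["a0.h1", "m2"]}


def check_recovery_settings(co_xsub: Dict, edge_opt: Dict, analysis: Dict) -> List[str]:
    """Return a list of problems; an empty list means the flags match the
    recovery step described in the README (hypothetical helper)."""
    problems = []
    if not edge_opt.get("enable", False):
        problems.append("EDGE_OPTIMIZATION_CONFIG.enable should be True for recovery")
    if not co_xsub.get("enable", False):
        problems.append("CO_XSUB_MODE.enable should be True for automatic component location")
    nodes = analysis.get("nodes_to_remove_from_circuit", [])
    if not isinstance(nodes, list):
        problems.append("nodes_to_remove_from_circuit should be a list")
    return problems


issues = check_recovery_settings(CO_XSUB_MODE, EDGE_OPTIMIZATION_CONFIG, ANALYSIS_CONFIG)
print(issues or "recovery settings look consistent")
```

Returning a list of messages rather than raising on the first problem reports every misconfigured flag in one pass.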

✏️ Citation

If you find this paper useful, please consider starring 🌟 this repo and citing 📑 our paper:

@misc{huang2025piercemistsgreetsky,
      title={Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis}, 
      author={Haoming Huang and Yibo Yan and Jiahao Huo and Xin Zou and Xinfeng Li and Kun Wang and Xuming Hu},
      year={2025},
      eprint={2505.14406},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.14406}, 
}

📝 Related Projects

About

Official implementation of our work 'Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis'
