mala-lab/Awesome-LLM-LVLM-Hallucination-Detection-and-Mitigation


Awesome-Hallucination-Detection-and-Mitigation


A collection of papers on LLM/LVLM hallucination evaluation benchmark, detection, and mitigation.

We will continue to update this list with the latest resources. If you find any missing resources (papers/code) or errors, please feel free to open an issue or submit a pull request.

Hallucination Evaluation Benchmarks

  • [Li2023] HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models in EMNLP, 2023. [paper]

  • [Chen2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection in IJCAI, 2024. [paper][code]

  • [Su2024] Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models in Arxiv, 2024. [paper][code]

  • [Kossen2024] Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs in Arxiv, 2024. [paper][code]

  • [Ji2024] ANAH: Analytical Annotation of Hallucinations in Large Language Models in ACL, 2024. [paper]

  • [Simhi2024] Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs in Arxiv, 2024. [paper][code]

Causes of Hallucination

  • [Li2024] The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models in Arxiv, 2024. [paper][code]

  • [Liu2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models in Arxiv, 2025. [paper][code]

Hallucination Detection

Fact-checking

  • [Niu2024] RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models in ACL, 2024. [paper][code]

  • [Chen2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection in IJCAI, 2024. [paper][code]

  • [Zhang2024] KnowHalu: Hallucination Detection via Multi-Form Knowledge-Based Factual Checking in Arxiv, 2024. [paper][code]

  • [Rawte2024] FACTOID: FACtual enTailment fOr hallucInation Detection in Arxiv, 2024. [paper][code]

  • [Es2024] RAGAs: Automated Evaluation of Retrieval Augmented Generation in EACL, 2024. [paper][code]

  • [Hu2024] RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models in Arxiv, 2024. [paper][code]

  • [Zhang2025] CORRECT: Context- and Reference-Augmented Reasoning and Prompting for Fact-Checking in NAACL, 2025. [paper][code]

  • [Lee2025] Enhancing Hallucination Detection via Future Context in Arxiv, 2025. [paper][code]
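
Most of the fact-checking work above follows the same skeleton: decompose a response into atomic claims, then verify each claim against retrieved evidence. Below is a minimal, self-contained sketch of that loop; naive sentence splitting and token overlap stand in for the claim extractor and NLI model (or LLM judge) a real checker would use, and all names and thresholds are illustrative.

```python
# Sketch of reference-based fact-checking (in the spirit of RefChecker /
# KnowHalu): split a response into claims, score each against evidence.

def split_claims(response: str) -> list[str]:
    # Naive claim decomposition: one claim per sentence.
    return [s.strip() for s in response.split(".") if s.strip()]

def support_score(claim: str, evidence: str) -> float:
    # Toy entailment proxy: fraction of claim tokens found in the evidence.
    claim_tokens = set(claim.lower().split())
    evidence_tokens = set(evidence.lower().split())
    return len(claim_tokens & evidence_tokens) / max(len(claim_tokens), 1)

def flag_hallucinations(response: str, evidence: str, threshold: float = 0.7):
    # Return (claim, supported?) pairs; unsupported claims are the
    # candidate hallucinations to surface for review.
    return [(c, support_score(c, evidence) >= threshold)
            for c in split_claims(response)]

evidence = "The Eiffel Tower is in Paris and was completed in 1889"
response = "The Eiffel Tower is in Paris. It was completed in 1925"
results = flag_hallucinations(response, evidence)
```

The second claim contradicts the evidence's completion date, so its overlap score falls below the threshold and it is flagged; real systems replace the overlap proxy with an entailment model to catch such substitutions reliably.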

Uncertainty Analysis

  • [Zhang2023] Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus in EMNLP, 2023. [paper][code]

  • [Snyder2024] On Early Detection of Hallucinations in Factual Question Answering in KDD, 2024. [paper][code]

  • [Chuang2024] Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps in EMNLP, 2024. [paper][code]

  • [Ji2024] LLM Internal States Reveal Hallucination Risk Faced With a Query in Arxiv, 2024. [paper][code]

  • [Bouchard2025] Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers in Arxiv, 2025. [paper][code]

  • [Ma2025] Semantic Energy: Detecting LLM Hallucination Beyond Entropy in Arxiv, 2025. [paper][code]
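
A common white-box signal in the uncertainty papers above is token-level predictive entropy: flat next-token distributions suggest the model is guessing. A toy sketch with hand-made distributions (a real detector would read these from the model's logits at each generation step):

```python
import math

def token_entropy(probs: list[float]) -> float:
    # Shannon entropy (nats) of one next-token distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_entropy(step_distributions: list[list[float]]) -> float:
    # Sequence-level uncertainty: mean entropy across generation steps.
    # High values flag generations produced while the model was unsure.
    return sum(token_entropy(d) for d in step_distributions) / len(step_distributions)

confident = [[0.97, 0.01, 0.01, 0.01]] * 4   # peaked distributions
uncertain = [[0.25, 0.25, 0.25, 0.25]] * 4   # flat distributions
```

Methods like semantic entropy refine this by clustering sampled answers by meaning before computing entropy, so paraphrases do not inflate the score.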

Consistency Measure

  • [Cohen2023] LM vs LM: Detecting Factual Errors via Cross Examination in Arxiv, 2023. [paper][code]

  • [Manakul2023] SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models in EMNLP, 2023. [paper][code]

  • [Chen2023] Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models in CIKM, 2023. [paper][code]

  • [Su2024] Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models in Arxiv, 2024. [paper][code]

  • [Mündler2024] Self-Contradictory Hallucinations of LLMs: Evaluation, Detection and Mitigation in ICLR, 2024. [paper][code]

  • [Kossen2024] Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs in ICML, 2024. [paper][code]

  • [Xu2024] Hallucination is Inevitable: An Innate Limitation of Large Language Models in Arxiv, 2024. [paper][code]

  • [Niu2025] Robust Hallucination Detection in LLMs via Adaptive Token Selection in NeurIPS, 2025. [paper][code]

  • [Sun2025] Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations in Arxiv, 2025. [paper][code]

  • [Islam2025] How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild in Arxiv, 2025. [paper][code]

  • [Muhammed2025] SelfCheckAgent: Zero-Resource Hallucination Detection in Generative Large Language Models in Arxiv, 2025. [paper][code]

  • [Yang2025] Hallucination Detection in Large Language Models with Metamorphic Relations in FSE, 2025. [paper][code]
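
The sampling-based methods above (e.g., SelfCheckGPT) share one idea: facts the model cannot reproduce consistently across stochastic samples are likely hallucinated. A toy sketch in which Jaccard token overlap stands in for the NLI, QA, or n-gram scorers used in the actual papers:

```python
# SelfCheckGPT-style consistency check: score each sentence of the main
# answer by its disagreement with extra sampled responses to the prompt.

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def inconsistency_score(sentence: str, samples: list[str]) -> float:
    # 1 - mean agreement with the sampled responses; higher = more suspect.
    return 1.0 - sum(jaccard(sentence, s) for s in samples) / len(samples)

samples = ["Marie Curie won two Nobel Prizes",
           "Marie Curie won two Nobel Prizes",
           "Curie received two Nobel Prizes"]
stable = inconsistency_score("Marie Curie won two Nobel Prizes", samples)
suspect = inconsistency_score("Marie Curie was born in 1901", samples)
```

The appeal of this family is that it is zero-resource and black-box: it needs only the ability to sample the model several times, not its logits or hidden states.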

Hidden States Analysis

  • [Azaria2023] The Internal State of an LLM Knows When It's Lying in EMNLP findings, 2023. [paper][code]

  • [Chen2024] INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection in ICLR, 2024. [paper][code]

  • [Kuhn2023] Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation in ICLR, 2023. [paper][code]

  • [Farquhar2024] Detecting Hallucinations in Large Language Models Using Semantic Entropy in Nature, 2024. [paper][code]

  • [Sriramanan2024] LLM-Check: Investigating Detection of Hallucinations in Large Language Models in NeurIPS, 2024. [paper][code]

  • [Wang2025] Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation in ICLR, 2025. [paper][code]

  • [Zhang2025] ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs in ACL, 2025. [paper][code]

  • [Cheang2025] Large Language Models Do NOT Really Know What They Don't Know in Arxiv, 2025. [paper][code]
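
The probing papers above train a small classifier on intermediate activations to predict whether a generation is truthful. A self-contained sketch with synthetic 2-D "activations" in place of real hidden states (an actual probe would read a chosen transformer layer's residual stream, and typically use a library optimizer rather than hand-rolled SGD):

```python
import math

def train_probe(xs, ys, lr=0.5, epochs=200):
    # Logistic-regression probe trained by plain stochastic gradient descent.
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))
            g = p - y  # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def probe_predict(w, b, x):
    # Probability that the hidden state looks "truthful".
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

# Synthetic activations: truthful states cluster near (1, 1),
# hallucinated states near (-1, -1).
truthful = [[1.0, 1.2], [0.9, 1.1], [1.1, 0.8]]
hallucinated = [[-1.0, -0.9], [-1.2, -1.1], [-0.8, -1.0]]
w, b = train_probe(truthful + hallucinated, [1, 1, 1, 0, 0, 0])
```

The empirical finding these papers exploit is that such a linear direction separating truthful from hallucinated generations often exists in real models' activations, so even this simple probe family can be surprisingly effective.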

RL Reasoning

  • [Su2025] Learning to Reason for Hallucination Span Detection in Arxiv, 2025. [paper][code]

Hallucination Mitigation

Model Calibration

  • [Li2023] Inference-Time Intervention: Eliciting Truthful Answers from a Language Model in NeurIPS, 2023. [paper][code]

  • [Liu2023] LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses in ICLR, 2024. [paper][code]

  • [Ji2023] Towards Mitigating Hallucination in Large Language Models via Self-Reflection in EMNLP findings, 2023. [paper]

  • [Chen2023] PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions in Arxiv, 2023. [paper]

  • [Campbell2023] Localizing Lying in Llama: Understanding Instructed Dishonesty on True-False Questions Through Prompting, Probing, and Patching in Arxiv, 2023. [paper]

  • [Wan2023] Faithfulness-Aware Decoding Strategies for Abstractive Summarization in EACL, 2023. [paper][code]

  • [Shi2023] Trusting Your Evidence: Hallucinate Less with Context-aware Decoding in Arxiv, 2023. [paper]

  • [Chen2024] Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning in AAAI, 2024. [paper][code]

  • [Zhang2024] R-Tuning: Instructing Large Language Models to Say "I Don't Know" in NAACL, 2024. [paper][code]

  • [Chuang2024] DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models in ICLR, 2024. [paper][code]

  • [Kapoor2024] Calibration-Tuning: Teaching Large Language Models to Know What They Don’t Know in UncertaiNLP, 2024. [paper]

  • [Zhang2024] TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space in ACL, 2024. [paper][code]

  • [Zhou2025] HaDeMiF: Hallucination Detection and Mitigation in Large Language Models in ICLR, 2025. [paper][code]

  • [Zhang2025] The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination in ACL, 2025. [paper][code]

  • [Wu2025] Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization in Arxiv, 2025. [paper][code]

  • [Cheng2025] Integrative Decoding: Improving Factuality via Implicit Self-consistency in ICLR, 2025. [paper][code]

  • [Yang2025] Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection in CVPR, 2025. [paper][code]

  • [Wan2025] ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Model in ICCV, 2025. [paper][code]

  • [Chang2025] Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation in Arxiv, 2025. [paper][code]

  • [Wang2025] Image Tokens Matter: Mitigating Hallucination in Discrete Tokenizer-based Large Vision-Language Models via Latent Editing in Arxiv, 2025. [paper][code]
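
Several of the decoding-time methods above (e.g., DoLa) contrast a mature layer's token distribution with a premature one, amplifying tokens whose probability grows as factual knowledge accumulates through the layers. A minimal sketch with hand-made logits; a real implementation would read these from two transformer layers and add the premature-layer selection and plausibility constraints described in the paper:

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax.
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return [l - lse for l in logits]

def contrastive_scores(final_logits, early_logits):
    # Score each token by the difference of log-probabilities between the
    # mature (final) and premature (early) layer.
    lf, le = log_softmax(final_logits), log_softmax(early_logits)
    return [f - e for f, e in zip(lf, le)]

# Token 0: a generic filler the model likes at every layer (greedy decoding
# on the final layer alone would pick it). Token 1: a factual token whose
# probability grows between the early and final layer, so the contrastive
# score prefers it instead.
final = [2.5, 2.4, 0.0]
early = [2.5, 0.5, 0.0]
scores = contrastive_scores(final, early)
```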

External Knowledge

  • [Ji2022] RHO (ρ): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding in ACL findings, 2022. [paper]

  • [Kang2024] Unfamiliar Finetuning Examples Control How Language Models Hallucinate in Arxiv, 2024. [paper]

  • [Gekhman2024] Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? in EMNLP, 2024. [paper]

  • [Sun2025] ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability in ICLR, 2025. [paper][code]

  • [Dey2025] Uncertainty-Aware Fusion: An Ensemble Framework for Mitigating Hallucinations in Large Language Models in WebConf, 2025. [paper][code]

  • [Lavrinovics2025] MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations in Arxiv, 2025. [paper][code]

  • [Sui2025] Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge in Arxiv, 2025. [paper][code]

  • [Ferrando2025] Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models in ICLR, 2025. [paper][code]

  • [Xue2025] UALIGN: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models in ACL, 2025. [paper][code]

  • [Cheng2024] Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector in EMNLP, 2024. [paper][code]

Alignment and Fine-tuning

  • [Lee2022] Factuality Enhanced Language Models for Open-Ended Text Generation in NeurIPS, 2022. [paper][code]

  • [Tian2023] Fine-tuning Language Models for Factuality in ICLR, 2024. [paper][code]

  • [Lin2024] FLAME: Factuality-Aware Alignment for Large Language Models in NeurIPS, 2024. [paper]

  • [Yang2024] V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization in EMNLP Findings, 2024. [paper][code]

  • [Kang2024] Unfamiliar Finetuning Examples Control How Language Models Hallucinate in NAACL, 2024. [paper][code]

  • [Yang2025] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key in CVPR, 2025. [paper][code]

  • [Gu2025] Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs in ICLR, 2025. [paper][code]
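
Most of the preference-based papers above build on the DPO objective: increase the policy's margin on the preferred (factual) response over the dispreferred (hallucinated) one, relative to a frozen reference model. A one-function sketch with toy log-probabilities (real training sums per-token log-probs from the policy and reference models over whole responses):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # DPO: -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)])
    # where w = preferred (factual) and l = dispreferred (hallucinated).
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1 / (1 + math.exp(-margin)))

# Policy prefers the factual answer more than the reference does -> low loss:
good = dpo_loss(logp_w=-5.0, logp_l=-9.0, ref_logp_w=-6.0, ref_logp_l=-8.0)
# Policy drifted toward the hallucinated answer -> high loss:
bad = dpo_loss(logp_w=-7.0, logp_l=-6.0, ref_logp_w=-6.0, ref_logp_l=-8.0)
```

Variants in this list mainly change where the preference pairs come from (on-policy samples, vision-grounded pairs) or apply the loss at a finer granularity (e.g., Mask-DPO's sentence-level masking).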

Related Survey

  • [Wang2023] Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity in Arxiv, 2023. [paper]

  • [Ye2023] Cognitive Mirage: A Review of Hallucinations in Large Language Models in Arxiv, 2023. [paper]

  • [Zhang2023] Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models in Arxiv, 2023. [paper]

  • [Gao2023] Retrieval-Augmented Generation for Large Language Models: A Survey in Arxiv, 2023. [paper]

  • [Huang2024] A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions in TOIS, 2024. [paper]

  • [Ji2024] Survey of Hallucination in Natural Language Generation in CSUR, 2024. [paper]

  • [Bai2024] Hallucination of Multimodal Large Language Models: A Survey in Arxiv, 2024. [paper]

  • [Chen2025] A Survey of Multimodal Hallucination Evaluation and Detection in Arxiv, 2025. [paper]

  • [Lin2025] LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions in Arxiv, 2025. [paper]

Datasets
