mohanad-hafez/heal-sum-lite

HEAL-Summ-lite: Edge-Deployable Health Summarization Pipeline

Inspiration & Reference: This project is a scaled-down, edge-optimized implementation inspired by the architectural and ethical principles outlined in Fisher et al., "HEAL-Summ: a lightweight and ethical framework for accessible summarization of health information."

Overview

HEAL-Summ-lite is a decoupled, multi-stage NLP pipeline that summarizes complex public health advisories (such as CDC notices) in accessible, plain language. It builds on the core concepts of the original HEAL-Summ framework, but this "lite" version is engineered for edge deployment: it uses small-parameter models and deterministic heuristics rather than computationally expensive LLM-as-a-judge systems.

Architecture

The pipeline consists of a 3.8B parameter generator followed by two deterministic safety gates:

  • Generator (microsoft/Phi-4-mini-instruct, 3.8B): Selected for its strong scores among similarly sized models on the Open LLM Leaderboard (IFEval: 73.78, BBH: 38.74). In addition, the HEAL-Summ literature reports that the Phi family consistently achieves the most accessible Flesch-Kincaid Grade Levels (FKGL).
  • Gate 1 | Readability (FKGL): A deterministic check that uses the Flesch-Kincaid Grade Level formula to ensure summaries score below a strict 8.0 threshold. A failure does not abort the run; it triggers an agentic retry loop that instructs the LLM to simplify the text.
  • Gate 2 | Hallucination Check (NEHR): A CPU-bound Extended Named Entity Hallucination Risk (NEHR) check using spaCy. It extracts numbers (NUM) and entities (ORG, GPE) from the summary. To counter brittle statistical NER tagging, it applies a deterministic substring fallback: any extracted entity that appears literally anywhere in the source text (case-insensitive) is cleared. Only strings with no literal match in the source trigger a human review flag.
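
The two gates can be sketched in plain Python as follows. This is a minimal illustration, not the repository's code: the function names, the vowel-group syllable heuristic, and the naive sentence splitter are all assumptions, and the entity list is assumed to have been extracted upstream (for example with spaCy).

```python
import re

VOWEL_GROUPS = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): count vowel groups, minus a silent final 'e'."""
    word = word.lower()
    n = len(VOWEL_GROUPS.findall(word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level using the standard coefficients."""
    # Naive splitting: decimals such as "3.5" would be split as well.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def passes_readability(text: str, threshold: float = 8.0) -> bool:
    """Gate 1: the summary must score strictly below the threshold."""
    return fkgl(text) < threshold


def unsupported_entities(summary_entities: set[str], source: str) -> set[str]:
    """Gate 2: keep only entities with no case-insensitive literal match in the source."""
    source_lower = source.lower()
    return {e for e in summary_entities if e.lower() not in source_lower}


if __name__ == "__main__":
    print(passes_readability("The cat sat. The dog ran."))
    print(unsupported_entities({"CDC", "14", "WHO"}, "The cdc reported 14 cases."))
```

Note that a pure substring match is permissive by design: "14" would also be cleared by a source that only contains "2014".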

Engineering Highlights

  • Draft-Embedded Agentic Retries: For reproducibility, the generator uses deterministic greedy decoding (do_sample=False). Because greedy decoding returns the same output for the same prompt, a naive retry would simply reproduce the failed draft. The pipeline therefore embeds the failed draft in the retry prompt, forcing the model to explicitly "edit" its mistakes rather than regenerate from scratch.
  • Zero-Tolerance Risk Mitigation: In public health communications, missing or fabricated numbers are the highest-risk failure mode. The NEHR gate enforces a strict zero-tolerance set-difference rule for entity hallucination: any extracted entity that has no match in the source is flagged.
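
A draft-embedded retry loop of this kind might look like the sketch below. The prompt wording, the `max_retries` default, and the `generate` and `score_fn` callables are hypothetical stand-ins; in the real pipeline `generate` would wrap the greedy-decoded model call and `score_fn` the FKGL gate.

```python
from typing import Callable


def build_retry_prompt(source: str, failed_draft: str, score: float,
                       threshold: float = 8.0) -> str:
    """Embed the failed draft so the retry prompt differs from the first one."""
    return (
        "Your previous draft was too hard to read "
        f"(grade level {score:.1f}, target below {threshold:.1f}).\n\n"
        f"Previous draft:\n{failed_draft}\n\n"
        "Edit the draft: use shorter sentences and simpler words. "
        "Do not add facts that are missing from the source.\n\n"
        f"Source:\n{source}"
    )


def summarize_with_retries(source: str,
                           generate: Callable[[str], str],
                           score_fn: Callable[[str], float],
                           threshold: float = 8.0,
                           max_retries: int = 3) -> tuple[str, bool]:
    """Return (final_draft, passed). `generate` is assumed to be deterministic."""
    prompt = f"Summarize this health advisory in plain language:\n\n{source}"
    draft = generate(prompt)
    for _ in range(max_retries):
        score = score_fn(draft)
        if score < threshold:
            return draft, True
        # A changed prompt is what lets greedy decoding produce new text.
        draft = generate(build_retry_prompt(source, draft, score, threshold))
    return draft, score_fn(draft) < threshold


if __name__ == "__main__":
    # Stub model: only simplifies once it sees its own failed draft.
    stub = lambda p: "easy text" if "Previous draft" in p else "hard text"
    grade = lambda d: 5.0 if d == "easy text" else 12.0
    print(summarize_with_retries("some advisory", stub, grade))
```

Returning the last draft together with a pass flag, rather than raising, matches the description above that a readability failure does not abort the process.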

Known Limitations

  • The FKGL Domain Clash: Even though the retry loop enforces fewer than 8 words per sentence and simpler synonyms, the pipeline struggles to consistently score below the 8.0 threshold on dense epidemiological texts (e.g., Ebola and Marburg notices). FKGL heavily penalizes syllables, and stripping out all polysyllabic medical terminology (e.g., "hemorrhagic", "incubation") risks degrading clinical accuracy.
  • Word-to-Digit Conversion: The NEHR check suffers from formatting false positives. If the LLM correctly summarizes "three days" as "3 days", the system flags it because the character "3" does not appear anywhere in the source text.
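
The syllable penalty can be made concrete with the standard FKGL formula. The worked numbers below are illustrative only: they take 8 words per sentence as the limiting case of the retry constraint and use a hypothetical 8-word sentence containing two four-syllable terms and six one-syllable words.

```latex
\begin{align*}
\mathrm{FKGL} &= 0.39\,\frac{\text{words}}{\text{sentences}}
               + 11.8\,\frac{\text{syllables}}{\text{words}} - 15.59 \\[4pt]
% Limiting case: 8 words per sentence, s = average syllables per word
0.39 \cdot 8 + 11.8\,s - 15.59 < 8
  &\;\Longrightarrow\; s < \frac{20.47}{11.8} \approx 1.73 \\[4pt]
% Example sentence: 2 four-syllable terms + 6 one-syllable words
s = \frac{2 \cdot 4 + 6 \cdot 1}{8} = 1.75
  &\;\Longrightarrow\; \mathrm{FKGL} = 3.12 + 20.65 - 15.59 = 8.18
\end{align*}
```

Under these assumptions, a short sentence fails the gate as soon as two of its eight words are terms like "hemorrhagic" or "incubation", which is why shortening sentences alone does not resolve the problem.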

Future Improvements

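  • Domain-Specific Readability: Explore alternative readability metrics (such as health-specific formulas) that do not inflate scores because of necessary medical terminology. The SMOG index is another candidate, although it also counts polysyllabic words and would need to be evaluated on the same texts.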
  • Pipeline Upgrades: Add a word-to-digit normalization pre-processing step to fix the remaining false positives in the NEHR check.
  • Semantic Verification: To catch intrinsic semantic hallucinations (where numbers are correct but relationships are distorted), upgrade the safety heuristic to a claim-level factual consistency encoder like MiniCheck (Tang et al., 2024).
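
As a rough sketch of the proposed word-to-digit normalization, the snippet below rewrites spelled-out numbers from zero to ninety-nine as digits. The function name and the supported range are assumptions; the idea is to normalize the source text before the substring check so that "3" can match "three".

```python
import re

_SMALL = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
_TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}

_TENS_ALT = "|".join(_TENS)
_ONES_ALT = "|".join(w for w, v in _SMALL.items() if 1 <= v <= 9)
_SMALL_ALT = "|".join(sorted(_SMALL, key=len, reverse=True))

# Matches "twenty", "twenty-one", "twenty one", or any word from zero to nineteen.
_NUMBER_WORDS = re.compile(
    rf"\b(?:(?P<tens>{_TENS_ALT})(?:[- ](?P<ones>{_ONES_ALT}))?"
    rf"|(?P<small>{_SMALL_ALT}))\b",
    re.IGNORECASE,
)


def _to_digits(match: re.Match) -> str:
    if match.group("tens"):
        value = _TENS[match.group("tens").lower()]
        if match.group("ones"):
            value += _SMALL[match.group("ones").lower()]
    else:
        value = _SMALL[match.group("small").lower()]
    return str(value)


def normalize_number_words(text: str) -> str:
    """Replace spelled-out numbers (0 to 99) with digits; other text is untouched."""
    return _NUMBER_WORDS.sub(_to_digits, text)


if __name__ == "__main__":
    source = "Symptoms appear within three days. Twenty-one cases were confirmed."
    print(normalize_number_words(source))
```

This sketch also converts non-numeric uses such as "no one", which is tolerable if the normalized text is used only for the comparison and never shown to readers. Larger numbers ("one hundred", "two thousand") would need additional handling.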
