Description
Context
Following implementation of ConText algorithm for multilingual negation detection (PR #82, fixes #79), this tracks potential enhancements for more sophisticated clinical negation handling.
Current State: ConText-based direction-aware negation detection with TERMINATE/PSEUDO handling (F1 ~85-90% typical).
Proposed Improvements
1. Additional ConText Categories (Medium Priority)
What: Implement remaining ConText assertion categories beyond NEGATED_EXISTENCE
- POSSIBLE_EXISTENCE - "possible", "might have"
- HISTORICAL - "history of", "previous"
- HYPOTHETICAL - "if patient develops", "risk of"
- FAMILY - "mother has", "family history of"
Why: Distinguishes uncertainty/temporality from simple negation
Effort: Low (rules exist in ConText standard)
When: When users report misclassification of uncertain/historical findings
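A minimal sketch of how the extra categories could be looked up from trigger phrases. The enum, the `TRIGGERS` table, and `detect_categories` are hypothetical names for illustration, not the API in `assertion_detection.py`. The lexicon is English only and limited to the example phrases listed above. Scope and direction handling are left out.

```python
import re
from enum import Enum


class AssertionCategory(Enum):
    POSSIBLE_EXISTENCE = "possible_existence"
    HISTORICAL = "historical"
    HYPOTHETICAL = "hypothetical"
    FAMILY = "family"


# Hypothetical lexicon built only from the example phrases in this issue.
TRIGGERS = {
    AssertionCategory.POSSIBLE_EXISTENCE: ["possible", "might have"],
    AssertionCategory.HISTORICAL: ["history of", "previous"],
    AssertionCategory.HYPOTHETICAL: ["if patient develops", "risk of"],
    AssertionCategory.FAMILY: ["mother has", "family history of"],
}

# Longest phrases first, so "family history of" wins over "history of".
_ORDERED = sorted(
    ((phrase, cat) for cat, phrases in TRIGGERS.items() for phrase in phrases),
    key=lambda item: len(item[0]),
    reverse=True,
)


def detect_categories(text: str) -> set:
    """Return the assertion categories whose triggers occur in the text."""
    remaining = text.lower()
    found = set()
    for phrase, category in _ORDERED:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, remaining):
            found.add(category)
            # Blank out the match so shorter triggers inside it do not fire.
            remaining = re.sub(pattern, lambda m: " " * len(m.group()), remaining)
    return found
```

Matching longest triggers first avoids tagging "family history of seizures" as both FAMILY and HISTORICAL. That overlap is the kind of misclassification this item is meant to prevent.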
2. Cross-Sentence Scope Detection (Low Priority)
What: Extend negation scope beyond sentence boundaries
Example: "No neurological findings. Reflexes are normal." (currently each sentence is scoped separately)
Why: Clinical text often uses implicit continuation
Effort: Medium (requires discourse analysis)
When: If users report cross-sentence negation errors
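One possible heuristic, sketched under strong assumptions: carry negation into the next sentence only when that sentence opens with an elliptical continuation cue such as "nor". The trigger and cue lists, the regex sentence splitter, and all function names are illustrative and not part of the current implementation. Real discourse analysis would need considerably more than this.

```python
import re

# Illustrative, English-only word lists (assumptions, not the project lexicon).
NEGATION_TRIGGERS = {"no", "not", "without", "denies"}
CONTINUATION_CUES = {"nor", "or", "neither"}


def split_sentences(text: str) -> list:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def _words(sentence: str) -> list:
    return re.findall(r"[a-z]+", sentence.lower())


def carried_negation_flags(text: str) -> list:
    """Pair each sentence with True if negation is carried over from the previous one."""
    flags = []
    previous_negated = False
    for sentence in split_sentences(text):
        words = _words(sentence)
        starts_with_cue = bool(words) and words[0] in CONTINUATION_CUES
        carried = previous_negated and starts_with_cue
        flags.append((sentence, carried))
        # A sentence counts as negated if it has its own trigger or inherited one.
        previous_negated = carried or any(w in NEGATION_TRIGGERS for w in words)
    return flags
```

With this rule, "No fever. Nor any cough. Reflexes are normal." carries negation into the second sentence only. The third sentence has no continuation cue and stays unaffected.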
3. Nested Negation Handling (Research)
What: Support complex nested structures such as "not ruled out" (a double negative, so the finding is not negated)
Why: Occurs in specialist clinical language
Effort: Medium-High
When: After measuring prevalence in real data
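A rough sketch of one way to handle this with rules: check known double-negative phrases before plain negation triggers, so that "not ruled out" never reaches the negation branch. The phrase lists, labels, and `classify_negation` are assumptions for illustration and do not reflect the existing PSEUDO handling.

```python
import re

# Illustrative phrase lists (assumptions); checked in this order.
DOUBLE_NEGATIVES = ("not ruled out", "cannot be ruled out", "cannot exclude")
NEGATION_TRIGGERS = ("ruled out", "no", "not", "without")


def _contains(phrase: str, text: str) -> bool:
    """Whole-phrase match on word boundaries."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def classify_negation(sentence: str) -> str:
    """Return 'double_negative', 'negated', or 'affirmed' for a sentence."""
    lowered = sentence.lower()
    # Double negatives take precedence over the triggers they contain.
    if any(_contains(p, lowered) for p in DOUBLE_NEGATIVES):
        return "double_negative"
    if any(_contains(t, lowered) for t in NEGATION_TRIGGERS):
        return "negated"
    return "affirmed"
```

Keeping "double_negative" as its own label leaves open whether such findings should later map to an affirmed or a possible assertion. That choice depends on the prevalence measurement mentioned above.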
4. ML-Enhanced Scope Boundaries (Future Research)
What: Hybrid approach using NegBERT/BioBERT for scope detection
Reference: NegBERT (Khandelwal & Sawant, 2020) - F1 92% on NegEx corpus
Why: State-of-the-art performance for complex syntax
Effort: High (requires ML infrastructure, training corpus, inference pipeline)
When: Only if rule-based approach shows systematic failures
YAGNI: Rule-based ConText sufficient for current use cases
Decision Criteria
Implement when:
✅ Multiple user reports of specific negation error pattern
✅ Error impacts clinical accuracy (false positives/negatives)
✅ Benefit outweighs computational/maintenance cost
Defer if:
❌ No user complaints about current implementation
❌ Edge case with <1% occurrence rate
❌ Would add significant complexity
References
- ConText Algorithm: Chapman et al. (2013) - https://doi.org/10.1016/j.jbi.2013.05.002
- NegBERT: Khandelwal & Sawant (2020) - https://arxiv.org/abs/2010.16125
- Current implementation: phentrieve/text_processing/assertion_detection.py
- Documentation: docs/advanced-topics/negation-detection.md
Related Issues
- #79 (fix: Negations) - Missing German negation terms (resolved by PR #82, feat: Implement ConText algorithm for multilingual negation detection)