EricFu1120/icd10cm-codes-skill

ICD-10-CM Auto-Coding Skill (FY2025)

An agentic AI skill that assigns ICD-10-CM diagnosis codes from clinical documentation using the complete FY2025 code set, alphabetic index, tabular list, and official coding guidelines — following the same 10-step workflow that certified medical coders use.


Why This Exists

Problem | How This Skill Solves It
LLMs hallucinate ICD codes from training data | Every code is searched in the official index/tabular XML and validated against the 74K-code FY2025 flat file
Black-box code assignment with no audit trail | Each code comes with a rationale, guideline reference, and sequencing justification
Missing specificity, laterality, 7th characters | Built-in validation catches incomplete codes, missing placeholders, and Excludes1 conflicts
Outdated code sets | Pinned to FY2025 (effective October 1, 2024), with addenda and conversion table references
Inconsistent coding across encounters | Deterministic workflow enforces Index → Tabular → Guidelines for every code

Features

  • 10-step coding workflow mirroring the official coding process
  • Alphabetic Index search — XML-parsed, supports main terms, subterms, cross-references, and the External Cause Index
  • Tabular List lookup — full hierarchy with Excludes1/2, Code First, Use Additional Code, 7th character definitions
  • Code description search — keyword-based search across all 74K+ billable codes
  • Code validation — structural checks, billability verification, placeholder X detection, Excludes1 conflict checking
  • Chapter-specific coding rules — sepsis, diabetes, hypertension, neoplasms, injuries, burns, poisoning/adverse effects, pregnancy, perinatal, pain, pressure ulcers, SDOH, and more
  • Poisoning vs. Adverse Effect vs. Underdosing decision tree
  • External cause code support (mechanism, place, activity, status)
  • SDOH Z-code guidance (Z55–Z65)
  • 12 worked examples covering complex real-world scenarios
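The structural part of code validation can be illustrated with a short sketch. It is not the logic of validate_code.py, which this README does not show; the function names are hypothetical, and the pattern only encodes the general ICD-10-CM shape (a letter, a digit, an alphanumeric character, then an optional decimal point and up to four more characters). Passing this check does not mean a code exists or is billable; that still requires a lookup in the flat file.

```python
import re

# General ICD-10-CM shape: letter, digit, alphanumeric, then an optional
# decimal point followed by one to four alphanumeric characters.
CODE_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")


def normalize(code):
    """Uppercase the code and insert the decimal point after character 3."""
    raw = code.strip().upper().replace(".", "")
    return raw if len(raw) <= 3 else raw[:3] + "." + raw[3:]


def has_valid_shape(code):
    """Return True when the code matches the general ICD-10-CM pattern."""
    return CODE_SHAPE.match(code.strip().upper()) is not None


def uses_placeholder(code):
    """Return True when an X appears after the category (characters 4 to 7)."""
    return "X" in normalize(code)[4:]


print(normalize("e1122"))            # E11.22
print(has_valid_shape("T36.0X1A"))   # True
print(has_valid_shape("11.22"))      # False
print(uses_placeholder("T36.0X1A"))  # True
```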

Repository Structure

icd10cm-codes/
├── SKILL.md                          # Skill definition — workflow, rules, output format
├── README.md                         # This file
├── scripts/
│   ├── search_index.py               # Search the Alphabetic Index (XML)
│   ├── search_codes.py               # Keyword search across all billable codes
│   ├── lookup_tabular.py             # Look up a code in the Tabular List (XML)
│   └── validate_code.py              # Validate code structure, billability, Excludes1 conflicts
└── references/
    ├── data/
    │   ├── icd10cm-codes-2025.txt        # All valid billable codes + descriptions (~74K)
    │   ├── icd-10-cm-index-2025.xml      # Alphabetic Index to Diseases & Injuries
    │   ├── icd-10-cm-eindex-2025.xml     # External Cause of Injuries Index
    │   └── icd-10-cm-tabular-2025.xml    # Tabular List (full hierarchy + notes)
    ├── guidelines-key-rules.md       # Condensed coding guidelines & chapter-specific rules
    ├── coding-workflow.md            # Detailed 10-step workflow + 12 worked examples
    └── chapter-overview.md           # All 21 chapters with section-level code ranges
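As a rough illustration of how the flat file in references/data/ could be loaded, the sketch below assumes each line holds an undotted code, some whitespace, and then the description. That layout is an assumption about the file rather than something this README states, and `load_codes` is a hypothetical helper, not a function from the repository's scripts.

```python
import io


def load_codes(lines):
    """Map dotted codes to descriptions from an iterable of text lines."""
    codes = {}
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        raw, description = parts
        # Assumed layout: codes are stored without the decimal point.
        dotted = raw if len(raw) <= 3 else raw[:3] + "." + raw[3:]
        codes[dotted] = description
    return codes


# Two illustrative lines in the assumed layout.
SAMPLE = (
    "E1122   Type 2 diabetes mellitus with diabetic chronic kidney disease\n"
    "N184    Chronic kidney disease, stage 4 (severe)\n"
)
codes = load_codes(io.StringIO(SAMPLE))
print(codes["N18.4"])  # Chronic kidney disease, stage 4 (severe)

# With the real file, the same function would be called as:
# with open("references/data/icd10cm-codes-2025.txt", encoding="utf-8") as fh:
#     codes = load_codes(fh)
```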

Quick Start

Prerequisites

  • Python 3.8+ (standard library only — no external dependencies)
  • An LLM agent runtime that supports tool/skill invocation (e.g., Windsurf Cascade, Claude Code)

Manual Usage (CLI)

Each script can be run standalone:

# Search the Alphabetic Index
python scripts/search_index.py "Pneumonia"
python scripts/search_index.py "Diabetes" "type 2" "kidney"

# Search code descriptions by keyword
python scripts/search_codes.py "chronic kidney" "stage 4"

# Look up a code in the Tabular List
python scripts/lookup_tabular.py "E11.22"

# Validate one or more codes
python scripts/validate_code.py "E11.22" "N18.4" "Z79.4"

# Validate + check Excludes1 conflicts between codes
python scripts/validate_code.py "E11.22" "N18.4" --check-excludes

# Search the External Cause Index
python scripts/search_index.py "Fall" --external

Agent Usage

When integrated as a skill, the LLM agent automatically:

  1. Reads clinical notes and extracts conditions
  2. Searches the Alphabetic Index for candidate codes
  3. Verifies each code in the Tabular List
  4. Applies official coding guidelines (sequencing, specificity, combination codes)
  5. Handles special scenarios (poisoning/adverse effects, external causes, SDOH)
  6. Validates every code for completeness and billability
  7. Outputs structured results with rationale and guideline references
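One way to picture the structured output from the last step is a small record per code. The field names below are hypothetical and chosen for illustration; the actual output format is defined in SKILL.md, which is not reproduced here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CodeAssignment:
    """One assigned code with its audit trail (hypothetical field names)."""
    sequence: int       # 1 = first-listed diagnosis
    code: str
    description: str
    rationale: str
    guideline_ref: str


def to_json(assignments):
    """Serialize assignments in sequence order."""
    ordered = sorted(assignments, key=lambda a: a.sequence)
    return json.dumps([asdict(a) for a in ordered], indent=2)


results = [
    CodeAssignment(
        sequence=2,
        code="N18.4",
        description="Chronic kidney disease, stage 4 (severe)",
        rationale="Stage documented; E11.22 instructs to add the CKD stage.",
        guideline_ref="Tabular note at E11.22 (use additional code)",
    ),
    CodeAssignment(
        sequence=1,
        code="E11.22",
        description="Type 2 diabetes mellitus with diabetic chronic kidney disease",
        rationale="Combination code captures diabetes with CKD.",
        guideline_ref="Section I.C.4.a",
    ),
]
print(to_json(results))
```

Keeping the sequence number in the record lets a reviewer see the ordering decision alongside the rationale for each code.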

Coding Workflow Overview

Clinical Note
     │
     ▼
┌─────────────────────┐
│ 1. Extract Conditions│  ← Identify diagnoses, symptoms, acuity, laterality, SDOH
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 2. Search Index      │  ← search_index.py — find candidate codes via main terms
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 3. Search Codes      │  ← search_codes.py — keyword fallback when index path unclear
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 4. Verify Tabular    │  ← lookup_tabular.py — check specificity, Excludes, 7th char
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 5. Apply Guidelines  │  ← Sequencing, combination codes, etiology/manifestation
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 6. External Causes   │  ← Injuries: mechanism, place, activity, status (V/W/X/Y)
│ 6b. Drug Scenarios   │  ← Poisoning vs adverse effect vs underdosing
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 7. Validate Codes    │  ← validate_code.py — structure, billability, conflicts
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 8. Output Results    │  ← Structured table with codes, descriptions, rationale
└─────────────────────┘
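The drug-scenario branch in step 6b can be sketched as a small decision function. This is a simplified reading of the official guidelines for T36-T50, not the decision tree shipped in this repository; the function name, its parameters, and the hint wording are assumptions made for illustration.

```python
from enum import Enum


class DrugScenario(Enum):
    ADVERSE_EFFECT = "adverse effect"
    POISONING = "poisoning"
    UNDERDOSING = "underdosing"


# Sequencing reminders, paraphrased from the official guidelines.
SEQUENCING_HINT = {
    DrugScenario.ADVERSE_EFFECT:
        "Code the nature of the adverse effect first, then the T36-T50 "
        "drug code with 5th or 6th character 5.",
    DrugScenario.POISONING:
        "Sequence the T36-T50 poisoning code first, then the manifestations.",
    DrugScenario.UNDERDOSING:
        "Assign the T36-T50 code with 5th or 6th character 6; "
        "never sequence it first.",
}


def classify_drug_event(correct_drug, took_less_than_prescribed,
                        taken_as_prescribed):
    """Pick a scenario from three yes/no facts taken from the note."""
    if not correct_drug:
        return DrugScenario.POISONING        # wrong substance given or taken
    if took_less_than_prescribed:
        return DrugScenario.UNDERDOSING      # dose reduced or skipped
    if taken_as_prescribed:
        return DrugScenario.ADVERSE_EFFECT   # correct drug, correct use
    return DrugScenario.POISONING            # overdose, wrong route, and so on


scenario = classify_drug_event(
    correct_drug=True,
    took_less_than_prescribed=False,
    taken_as_prescribed=True,
)
print(scenario.value, "->", SEQUENCING_HINT[scenario])
```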

Data Sources

All data files are sourced from the official CMS ICD-10-CM FY2025 release (effective October 1, 2024):

  • CMS ICD-10-CM Resources
  • The Alphabetic Index and Tabular List are the XML files, and the code descriptions are the text file, as published by CMS/NCHS
  • Official Guidelines: ICD-10-CM Official Guidelines for Coding and Reporting, FY2025 (October 2024)

Key Design Decisions

Decision | Rationale
XML parsing, not regex | Preserves hierarchical structure (chapter → section → category → code) and inherited notes
No ML model for code prediction | Eliminates hallucination; every code is looked up from source data
Validation as a separate step | Catches errors before output; can be used standalone for QA
Worked examples in docs | Teaches the agent by demonstration, reducing guideline misinterpretation
Excludes1 conflict detection | Prevents common coding errors that cause claim denials
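To show why parsing the XML as a tree helps with inherited notes, here is a minimal sketch using the standard library. The element names (`diag`, `name`, `desc`, `excludes1`, `note`) are assumed to resemble the Tabular List schema, the sample document and its codes are placeholders, and `find_with_inherited_notes` is a hypothetical helper rather than the code in lookup_tabular.py.

```python
import xml.etree.ElementTree as ET

# Placeholder document: a category, a subcategory, and a code.
SAMPLE_XML = """
<tabular>
  <diag>
    <name>CAT</name>
    <desc>Example category</desc>
    <excludes1><note>example condition A</note></excludes1>
    <diag>
      <name>CAT.1</name>
      <desc>Example subcategory</desc>
      <diag>
        <name>CAT.11</name>
        <desc>Example code</desc>
        <excludes1><note>example condition B</note></excludes1>
      </diag>
    </diag>
  </diag>
</tabular>
"""


def find_with_inherited_notes(parent, target, note_tag="excludes1",
                              inherited=()):
    """Return (description, notes) for target, including ancestor notes."""
    for diag in parent.findall("diag"):
        # Notes attached directly to this level, appended to inherited ones.
        own = [n.text for n in diag.findall(note_tag + "/note")]
        notes = list(inherited) + own
        if diag.findtext("name") == target:
            return diag.findtext("desc"), notes
        found = find_with_inherited_notes(diag, target, note_tag, notes)
        if found is not None:
            return found
    return None


root = ET.fromstring(SAMPLE_XML)
print(find_with_inherited_notes(root, "CAT.11"))
# ('Example code', ['example condition A', 'example condition B'])
```

A line-oriented regex would see only the note next to the code itself; walking the tree carries the category-level note down to the code that inherits it.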

Limitations

  • Not a standalone auto-coder — requires an LLM agent runtime for natural language understanding
  • English only — clinical note parsing assumes English documentation
  • FY2025 only — code set is pinned; update data files annually for new fiscal years
  • No NLP preprocessing — relies on the LLM's ability to extract clinical terms from notes
  • No claim/billing integration — outputs codes only; does not generate claims or interface with EHR systems
  • Guidelines condensed — the reference files summarize the official guidelines; edge cases may require consulting the full PDF

Updating for a New Fiscal Year

  1. Download the new CMS ICD-10-CM release files
  2. Replace the four files in references/data/ with the new year's versions
  3. Update the addenda and conversion table in the workspace root
  4. Review references/guidelines-key-rules.md for any guideline changes
  5. Test with validate_code.py to confirm new codes are recognized
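Before running the final test step, it may help to list which codes changed between two releases. The sketch below is a hypothetical helper, not part of the repository; it assumes the first whitespace-delimited token on each flat-file line is the code, and the sample lines use placeholder codes.

```python
def read_codes(lines):
    """Collect the first token of every non-blank line as a code."""
    return {line.split()[0] for line in lines if line.strip()}


def diff_code_sets(old_codes, new_codes):
    """Report codes added to and deleted from the newer release."""
    old, new = set(old_codes), set(new_codes)
    return {"added": sorted(new - old), "deleted": sorted(old - new)}


# Placeholder lines standing in for two fiscal years of the flat file.
old_release = ["AAA1  Placeholder one", "AAA2  Placeholder two"]
new_release = ["AAA2  Placeholder two", "AAA3  Placeholder three"]

changes = diff_code_sets(read_codes(old_release), read_codes(new_release))
print(changes)  # {'added': ['AAA3'], 'deleted': ['AAA1']}
```

The added codes make a natural input list for validate_code.py when confirming that the new data files were picked up.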

License

The ICD-10-CM code set is a public domain work product of the U.S. Department of Health and Human Services (CMS/NCHS). The skill code and documentation in this repository are provided as-is for educational and research purposes.

Disclaimer: This tool is intended to assist — not replace — certified medical coders. All code assignments should be reviewed by qualified coding professionals before use in billing or clinical systems.

About

An Agentic ICD-10-CM Coding Skill: Teaching AI to Code Like a Medical Coder
