ICD-10-CM Auto-Coding Skill (FY2025)

An agentic AI skill that assigns ICD-10-CM diagnosis codes from clinical documentation using the complete FY2025 code set, Alphabetic Index, Tabular List, and official coding guidelines, following a 10-step workflow modeled on the process certified medical coders use.

Why This Exists

Problem	How This Skill Solves It
LLMs hallucinate ICD codes from training data	Every code is searched in the official index/tabular XML and validated against the 74K-code FY2025 flat file
Black-box code assignment with no audit trail	Each code comes with a rationale, guideline reference, and sequencing justification
Missing specificity, laterality, 7th characters	Built-in validation catches incomplete codes, missing placeholders, and Excludes1 conflicts
Outdated code sets	Pinned to FY2025 (effective Oct 1 2024), with addenda and conversion table references
Inconsistent coding across encounters	Deterministic workflow enforces Index → Tabular → Guidelines for every code

Features

10-step coding workflow mirroring the official coding process
Alphabetic Index search — XML-parsed, supports main terms, subterms, cross-references, and the External Cause Index
Tabular List lookup — full hierarchy with Excludes1/2, Code First, Use Additional Code, 7th character definitions
Code description search — keyword-based search across all 74K+ billable codes
Code validation — structural checks, billability verification, placeholder X detection, Excludes1 conflict checking
Chapter-specific coding rules — sepsis, diabetes, hypertension, neoplasms, injuries, burns, poisoning/adverse effects, pregnancy, perinatal, pain, pressure ulcers, SDOH, and more
Poisoning vs. Adverse Effect vs. Underdosing decision tree
External cause code support (mechanism, place, activity, status)
SDOH Z-code guidance (Z55–Z65)
12 worked examples covering complex real-world scenarios
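
The structural checks and placeholder X handling listed above can be illustrated with a minimal sketch. This is not the code in validate_code.py; the function names are hypothetical, and the sketch checks only the shape of a code, not whether it exists in the FY2025 code set or is billable.

```python
import re

# Shape of an ICD-10-CM code: letter, digit, alphanumeric, then an optional
# dot followed by one to four more alphanumeric characters.
CODE_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")


def has_valid_shape(code: str) -> bool:
    """Return True if the code is structurally plausible (hypothetical helper)."""
    return CODE_SHAPE.match(code.strip().upper()) is not None


def add_seventh_character(code: str, seventh: str) -> str:
    """Append a 7th character, padding with placeholder X up to six characters."""
    compact = code.strip().upper().replace(".", "")
    if not 3 <= len(compact) <= 6:
        raise ValueError(f"cannot extend {code!r}")
    padded = compact.ljust(6, "X") + seventh.upper()
    return padded[:3] + "." + padded[3:]


print(has_valid_shape("E11.22"))
print(add_seventh_character("T88.7", "A"))
```

A shape check like this is cheap to run before any lookup, so malformed strings can be rejected without touching the data files.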

Repository Structure

icd10cm-codes/
├── SKILL.md                          # Skill definition — workflow, rules, output format
├── README.md                         # This file
├── scripts/
│   ├── search_index.py               # Search the Alphabetic Index (XML)
│   ├── search_codes.py               # Keyword search across all billable codes
│   ├── lookup_tabular.py             # Look up a code in the Tabular List (XML)
│   └── validate_code.py              # Validate code structure, billability, Excludes1 conflicts
└── references/
    ├── data/
    │   ├── icd10cm-codes-2025.txt        # All valid billable codes + descriptions (~74K)
    │   ├── icd-10-cm-index-2025.xml      # Alphabetic Index to Diseases & Injuries
    │   ├── icd-10-cm-eindex-2025.xml     # External Cause of Injuries Index
    │   └── icd-10-cm-tabular-2025.xml    # Tabular List (full hierarchy + notes)
    ├── guidelines-key-rules.md       # Condensed coding guidelines & chapter-specific rules
    ├── coding-workflow.md            # Detailed 10-step workflow + 12 worked examples
    └── chapter-overview.md           # All 21 chapters with section-level code ranges

Quick Start

Prerequisites

Python 3.8+ (standard library only — no external dependencies)
An LLM agent runtime that supports tool/skill invocation (e.g., Windsurf Cascade, Claude Code)

Manual Usage (CLI)

Each script can be run standalone:

# Search the Alphabetic Index
python scripts/search_index.py "Pneumonia"
python scripts/search_index.py "Diabetes" "type 2" "kidney"

# Search code descriptions by keyword
python scripts/search_codes.py "chronic kidney" "stage 4"

# Look up a code in the Tabular List
python scripts/lookup_tabular.py "E11.22"

# Validate one or more codes
python scripts/validate_code.py "E11.22" "N18.4" "Z79.4"

# Validate + check Excludes1 conflicts between codes
python scripts/validate_code.py "E11.22" "N18.4" --check-excludes

# Search the External Cause Index
python scripts/search_index.py "Fall" --external
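
As a rough illustration of what a keyword search over code descriptions involves, here is a minimal sketch. It is not the implementation of search_codes.py. It assumes each line of the flat file holds a code without its dot, then whitespace, then the description; the helper names and the sample lines are for illustration only.

```python
from typing import Iterable, List, Tuple


def parse_code_line(line: str) -> Tuple[str, str]:
    """Split 'N184    Chronic kidney ...' into ('N18.4', description)."""
    raw_code, _, description = line.strip().partition(" ")
    # Re-insert the dot after the three-character category.
    code = raw_code if len(raw_code) <= 3 else raw_code[:3] + "." + raw_code[3:]
    return code, description.strip()


def search_descriptions(lines: Iterable[str], *keywords: str) -> List[Tuple[str, str]]:
    """Return every (code, description) whose description contains all keywords."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        code, description = parse_code_line(line)
        if all(k in description.lower() for k in wanted):
            hits.append((code, description))
    return hits


# Illustrative sample lines in the assumed layout.
sample = [
    "E1122   Type 2 diabetes mellitus with diabetic chronic kidney disease",
    "N1830   Chronic kidney disease, stage 3 unspecified",
    "N184    Chronic kidney disease, stage 4 (severe)",
]
print(search_descriptions(sample, "chronic kidney", "stage 4"))
```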

Agent Usage

When integrated as a skill, the LLM agent automatically:

Reads clinical notes and extracts conditions
Searches the Alphabetic Index for candidate codes
Verifies each code in the Tabular List
Applies official coding guidelines (sequencing, specificity, combination codes)
Handles special scenarios (poisoning/adverse effects, external causes, SDOH)
Validates every code for completeness and billability
Outputs structured results with rationale and guideline references
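
The exact output format is defined in SKILL.md and is not reproduced here. As a hedged sketch of what a structured result with rationale and guideline references might look like, the record below uses hypothetical field names, and the rationale text is illustrative rather than actual skill output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodedDiagnosis:
    """One row of output (hypothetical field names)."""
    sequence: int      # 1 = first-listed diagnosis
    code: str
    description: str
    rationale: str
    guideline: str     # pointer into the official guidelines or Tabular notes


def render_table(rows: List[CodedDiagnosis]) -> str:
    """Render rows as a tab-separated table ordered by sequence."""
    header = "Seq\tCode\tDescription\tRationale\tGuideline"
    body = [
        f"{r.sequence}\t{r.code}\t{r.description}\t{r.rationale}\t{r.guideline}"
        for r in sorted(rows, key=lambda r: r.sequence)
    ]
    return "\n".join([header] + body)


rows = [
    CodedDiagnosis(2, "N18.4", "Chronic kidney disease, stage 4 (severe)",
                   "CKD stage documented", "Use Additional Code note at E11.22"),
    CodedDiagnosis(1, "E11.22", "Type 2 diabetes mellitus with diabetic chronic kidney disease",
                   "Combination code for diabetes with CKD", "Section I.C.4.a"),
]
print(render_table(rows))
```

Keeping the sequence number in the record, rather than relying on list order, makes the sequencing decision explicit and auditable.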

Coding Workflow Overview

Clinical Note
     │
     ▼
┌─────────────────────┐
│ 1. Extract Conditions│  ← Identify diagnoses, symptoms, acuity, laterality, SDOH
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 2. Search Index      │  ← search_index.py — find candidate codes via main terms
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 3. Search Codes      │  ← search_codes.py — keyword fallback when index path unclear
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 4. Verify Tabular    │  ← lookup_tabular.py — check specificity, Excludes, 7th char
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 5. Apply Guidelines  │  ← Sequencing, combination codes, etiology/manifestation
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 6. External Causes   │  ← Injuries: mechanism, place, activity, status (V/W/X/Y)
│ 6b. Drug Scenarios   │  ← Poisoning vs adverse effect vs underdosing
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 7. Validate Codes    │  ← validate_code.py — structure, billability, conflicts
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ 8. Output Results    │  ← Structured table with codes, descriptions, rationale
└─────────────────────┘
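
Step 6b can be illustrated with a small decision function. This is a simplified sketch of the poisoning / adverse effect / underdosing distinction as described in the official guidelines, not the decision tree shipped with the skill; the function name, parameter names, and dose labels are hypothetical.

```python
from enum import Enum


class DrugScenario(Enum):
    POISONING = "poisoning"
    ADVERSE_EFFECT = "adverse effect"
    UNDERDOSING = "underdosing"


# Sequencing reminders per scenario, summarized from the official guidelines.
SEQUENCING = {
    DrugScenario.POISONING: "T36-T50 poisoning code first, then manifestations",
    DrugScenario.ADVERSE_EFFECT: "nature of the adverse effect first, then the T36-T50 code",
    DrugScenario.UNDERDOSING: "condition first; the underdosing code is never first-listed",
}


def classify_drug_scenario(properly_administered: bool, dose: str) -> DrugScenario:
    """Classify a documented drug-related event.

    properly_administered: right drug, right patient, right route.
    dose: 'as_prescribed', 'more', or 'less' relative to the prescription.
    """
    if dose not in {"as_prescribed", "more", "less"}:
        raise ValueError(f"unknown dose label: {dose!r}")
    if not properly_administered or dose == "more":
        return DrugScenario.POISONING      # improper use, including overdose
    if dose == "less":
        return DrugScenario.UNDERDOSING    # took less than prescribed
    return DrugScenario.ADVERSE_EFFECT     # correct drug, correctly taken


scenario = classify_drug_scenario(properly_administered=True, dose="less")
print(scenario.value, "->", SEQUENCING[scenario])
```

Real documentation is rarely this clean, so the sketch assumes the agent has already extracted the two inputs from the note.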

Data Sources

All data files are sourced from the official CMS ICD-10-CM FY2025 release (effective October 1, 2024):

CMS ICD-10-CM Resources
The Alphabetic Index and Tabular List are used in the XML format, and the code descriptions in the text format, as published by CMS/NCHS
Official Guidelines: ICD-10-CM Official Guidelines for Coding and Reporting, FY2025 (October 2024)

Key Design Decisions

Decision	Rationale
XML parsing, not regex	Preserves hierarchical structure (chapter → section → category → code) and inherited notes
No ML model for code prediction	Eliminates hallucination; every code is looked up from source data
Validation as a separate step	Catches errors before output; can be used standalone for QA
Worked examples in docs	Teaches the agent by demonstration, reducing guideline misinterpretation
Excludes1 conflict detection	Prevents common coding errors that cause claim denials
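
To show why a tree parse helps with inherited notes, the sketch below walks from a code up through its ancestors and collects Excludes1 notes along the way. The element names (diag, name, excludes1, note) and the DEMO codes are assumptions made for this example; check them against the actual Tabular XML before relying on them.

```python
import xml.etree.ElementTree as ET
from typing import List

# Tiny stand-in for a Tabular List fragment (hypothetical structure and codes).
SAMPLE = """
<chapter>
  <section>
    <diag>
      <name>DEMO</name>
      <excludes1><note>category-level note</note></excludes1>
      <diag>
        <name>DEMO.1</name>
        <diag>
          <name>DEMO.11</name>
          <excludes1><note>code-level note</note></excludes1>
        </diag>
      </diag>
    </diag>
  </section>
</chapter>
"""


def inherited_excludes1(root: ET.Element, code: str) -> List[str]:
    """Collect Excludes1 notes on a code and all of its ancestors, outermost first."""
    # ElementTree has no parent pointers, so build a child -> parent map once.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = next((d for d in root.iter("diag") if d.findtext("name") == code), None)
    notes: List[str] = []
    while node is not None:
        if node.tag == "diag":
            own = [n.text for n in node.findall("excludes1/note")]
            notes = own + notes  # prepend so ancestor notes come first
        node = parents.get(node)
    return notes


root = ET.fromstring(SAMPLE.strip())
print(inherited_excludes1(root, "DEMO.11"))
```

A regex over the raw text would find the code-level note but has no reliable way to know that the category-level note also applies, which is the point of the design decision in the table above.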

Limitations

Not a standalone auto-coder — requires an LLM agent runtime for natural language understanding
English only — clinical note parsing assumes English documentation
FY2025 only — code set is pinned; update data files annually for new fiscal years
No NLP preprocessing — relies on the LLM's ability to extract clinical terms from notes
No claim/billing integration — outputs codes only; does not generate claims or interface with EHR systems
Guidelines condensed — the reference files summarize the official guidelines; edge cases may require consulting the full PDF

Updating for a New Fiscal Year

Download the new CMS ICD-10-CM release files
Replace the four files in references/data/ with the new year's versions
Update the addenda and conversion table in the workspace root
Review references/guidelines-key-rules.md for any guideline changes
Test with validate_code.py to confirm new codes are recognized
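
Alongside validate_code.py, a quick diff of the old and new flat files can show which codes were added or deleted. This is a minimal sketch that assumes each line starts with the code followed by whitespace; the function names and the sample lines are made up for illustration.

```python
from typing import Iterable, List, Set, Tuple


def load_codes(lines: Iterable[str]) -> Set[str]:
    """Collect the first whitespace-delimited token of every non-blank line."""
    return {line.split(maxsplit=1)[0] for line in lines if line.strip()}


def diff_code_sets(old_lines: Iterable[str],
                   new_lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) codes between two releases, each sorted."""
    old, new = load_codes(old_lines), load_codes(new_lines)
    return sorted(new - old), sorted(old - new)


# Made-up lines standing in for two fiscal years of the flat file.
old_release = ["AAA1    first description", "AAA2    second description"]
new_release = ["AAA2    second description", "AAA3    third description"]
added, removed = diff_code_sets(old_release, new_release)
print("added:", added, "removed:", removed)
```

With real files, the two arguments could be open file handles, since iterating a text file yields its lines.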

License

The ICD-10-CM code set is a public domain work product of the U.S. Department of Health and Human Services (CMS/NCHS). The skill code and documentation in this repository are provided as-is for educational and research purposes.

Disclaimer: This tool is intended to assist — not replace — certified medical coders. All code assignments should be reviewed by qualified coding professionals before use in billing or clinical systems.
