virgil382/llm-d-architectural-docs

LLM-D Architectural Docs

Last Update: 2026-03-09

⚠️ IMPORTANT DISCLAIMER

Not official. These documents are not official project documentation. They were generated by reverse-engineering the source code and may be incomplete, inaccurate, or out of date. Use them for reference only.

This repository contains architectural views and related artifacts for the LLM‑D project, including:

  • LLM-D_Inference_Scheduler.md
    • A concise, developer-focused reference containing behavioral and structural models of the Inference Scheduler. It describes the scheduler's runtime behavior and the structural relationships between its key classifiers, including those with configurable attributes (e.g., attributes set via YAML). Use it to locate where to implement new scheduling logic and to infer which attributes affect which plugins.
  • LLM-D_Inference_Scheduler (Short).md
    • A compact, slide-style overview of the above suitable for quick presentations.

About

Architectural reference and system flow documentation for distributed LLM inference. It maps the control-plane integration between the Kubernetes Gateway API Inference Extension and the LLM-D Inference Scheduler data plane.
