@BenHamm BenHamm commented Nov 6, 2025

🎯 Objective

Clean up the recipes/ directory to focus exclusively on production-ready Kubernetes deployments for release 0.6.1. Remove incomplete configurations and clarify that benchmark recipes are tools for users, not published performance results.


📊 Summary of Changes

34 files changed: +210 insertions, -1,542 deletions

Deleted Content

  • 5 incomplete model directories (no K8s deploy.yaml manifests):

    • deepseek-r1-distill-llama-8b/
    • gemma3/
    • llama4/
    • qwen2-vl-7b-instruct/
    • qwen3/ (kept qwen3-32b-fp8/ which is complete)
  • run.sh script (228 lines) - non-K8s automation tool

  • 12 standalone engine config YAMLs from deepseek-r1/trtllm/:

    • agg/simple/, agg/mtp/, agg/wide_ep/ (all subdirs)
    • disagg/simple/, disagg/mtp/
    • disagg/wide_ep/*.yaml (standalone configs only)
    • Kept: disagg/wide_ep/gb200/ (has complete K8s manifests)

Added/Modified Content

  • Comprehensive new recipes/README.md with:

    • Complete recipe table showing Deployment and Benchmark Recipe availability
    • Prerequisites section linking to correct K8s installation guides
    • Step-by-step deployment instructions
    • Troubleshooting section
    • Removed extraneous links (docs.nvidia.com, license section)
  • gpt-oss-120b/trtllm/disagg/README.md - Documents incomplete recipe status


📦 What Remains

4 Models | 10 Complete Deployments | 7 with Benchmark Recipes

| Model | Framework | Recipes | Status |
|---|---|---|---|
| llama-3-70b | vLLM | agg, disagg-single-node, disagg-multi-node | ✅ All with deploy.yaml + perf.yaml |
| qwen3-32b-fp8 | TensorRT-LLM | agg, disagg | ✅ All with deploy.yaml + perf.yaml |
| gpt-oss-120b | TensorRT-LLM | agg, disagg* | ✅ agg complete; disagg documented as incomplete |
| deepseek-r1 | SGLang + TensorRT-LLM | sglang/disagg-8gpu, sglang/disagg-16gpu, trtllm/disagg/wide_ep/gb200 | ✅ 2 functional, 1 complete |

🔑 Key Changes Explained

1. Removed Incomplete Model Directories

Issue: 5 model directories contained only standalone engine configs (.yaml files) without Kubernetes deploy.yaml manifests. These were added in PR #3772 but don't follow the documented recipe structure.

Resolution: Remove them from 0.6.1. Can be re-added properly in 0.7.0 with complete K8s manifests.
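The completeness rule above (a model directory counts only if it ships a Kubernetes deploy.yaml) could be checked with a small script. This is a minimal sketch, not part of this PR: the function name `find_incomplete_recipes` and the `skip` default are hypothetical, and it assumes that any deploy.yaml anywhere under a model directory is enough to call it complete.

```python
from pathlib import Path


def find_incomplete_recipes(recipes_root, skip=("hf_hub_secret",)):
    """List top-level directories under recipes_root with no deploy.yaml inside.

    `skip` names helper directories that are not model recipes
    (an assumption for this sketch, based on the tree in this PR).
    """
    root = Path(recipes_root)
    incomplete = []
    for model_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if model_dir.name in skip:
            continue
        # rglob searches every nesting level, e.g. trtllm/disagg/wide_ep/gb200/.
        if not any(model_dir.rglob("deploy.yaml")):
            incomplete.append(model_dir.name)
    return incomplete


if __name__ == "__main__":
    if Path("recipes").is_dir():
        for name in find_incomplete_recipes("recipes"):
            print(f"no deploy.yaml found under recipes/{name}/")
```

Run from the repository root, it prints one line per directory that would fail the rule; a stricter variant could require a deploy.yaml in every leaf recipe directory instead.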

2. Removed run.sh Script

Issue: run.sh is an automated deployment script for non-K8s environments. It adds complexity and isn't aligned with the K8s-only focus.

Resolution: Remove for 0.6.1. Recipes directory should contain declarative K8s manifests only.

3. Cleaned Up Standalone Engine Configs

Issue: deepseek-r1/trtllm/ contained many standalone engine config YAMLs (agg.yaml, decode.yaml, prefill.yaml) that aren't wrapped in K8s manifests. These configs are meant to be embedded as ConfigMaps within deploy.yaml files.

Resolution: Remove standalone configs. The one complete recipe (disagg/wide_ep/gb200/deploy.yaml) already embeds its configs as ConfigMaps.
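To illustrate what "embedding a config as a ConfigMap" means, here is a minimal sketch that wraps the text of a standalone engine config in a Kubernetes ConfigMap object. The helper name `engine_config_to_configmap`, the ConfigMap name, and the sample config body are all hypothetical; the actual ConfigMaps live in the gb200 deploy.yaml and may be structured differently.

```python
import json


def engine_config_to_configmap(name, filename, config_text, namespace="default"):
    """Wrap the raw text of an engine config file in a v1 ConfigMap manifest."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        # ConfigMap data maps a file name to the file's contents as a string.
        "data": {filename: config_text},
    }


if __name__ == "__main__":
    # Placeholder body; a real engine config would carry actual engine settings.
    sample = "placeholder_setting: 1\n"
    manifest = engine_config_to_configmap("engine-config", "decode.yaml", sample)
    # JSON is valid YAML, so this output can be applied or pasted into a manifest.
    print(json.dumps(manifest, indent=2))
```

The sketch emits JSON only to stay within the standard library; in a deploy.yaml the same object would normally be written as YAML.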

4. Clarified "Benchmark Recipe" vs "Benchmarked"

Issue: Previous README implied we have published benchmark results. We actually provide benchmark tools (perf.yaml with AIPerf).

Resolution:

  • Changed table columns from "Benchmarked | Status" to "Deployment | Benchmark Recipe"
  • Legend now clarifies: "Benchmark Recipe: ✅ = Includes perf.yaml for running AIPerf benchmarks"
  • Removed claims of having performance numbers
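The new column semantics can be stated as a rule: each column reports whether a file exists, not whether results were published. A minimal sketch of that rule, assuming deploy.yaml and perf.yaml sit directly inside each recipe directory (the helper `recipe_row` is hypothetical and is not how the README table was produced):

```python
from pathlib import Path


def recipe_row(recipe_dir):
    """Render one table row: recipe name | Deployment | Benchmark Recipe."""
    d = Path(recipe_dir)

    def mark(present):
        return "✅" if present else "❌"

    has_deploy = (d / "deploy.yaml").is_file()  # Deployment column
    has_perf = (d / "perf.yaml").is_file()      # Benchmark Recipe column
    return f"| {d.name} | {mark(has_deploy)} | {mark(has_perf)} |"
```

A ✅ in the Benchmark Recipe column then means only that a perf.yaml is present for users to run AIPerf themselves.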

📁 Directory Structure (After)

recipes/
├── README.md (comprehensive guide)
├── CONTRIBUTING.md
├── hf_hub_secret/
├── llama-3-70b/
│   ├── model-cache/
│   └── vllm/ (3 complete recipes)
├── qwen3-32b-fp8/
│   ├── model-cache/
│   └── trtllm/ (2 complete recipes)
├── gpt-oss-120b/
│   ├── model-cache/
│   └── trtllm/
│       ├── agg/ (complete)
│       └── disagg/ (documented as incomplete)
└── deepseek-r1/
    ├── model-cache/
    ├── sglang/ (2 functional recipes)
    └── trtllm/disagg/wide_ep/gb200/ (1 complete recipe)

🤔 Review Focus Areas

  1. Scope of deletions - Are we okay removing these 5 models from 0.6.1?
  2. README accuracy - Does the new table and documentation clearly convey what's available?
  3. Incomplete gpt-oss disagg - Should we keep it documented, or delete it entirely?
  4. DeepSeek-R1 structure - Only the GB200 recipe remains under trtllm/. Should we flatten the directory structure?

Remove incomplete model directories and non-Kubernetes configurations to
streamline the recipes directory for production Kubernetes deployments.

Changes:
- Remove 5 incomplete model directories (deepseek-r1-distill-llama-8b,
  gemma3, llama4, qwen2-vl-7b-instruct, qwen3) that lack proper
  Kubernetes deployment manifests
- Delete run.sh script (non-Kubernetes automation tool)
- Remove standalone engine config YAMLs from deepseek-r1/trtllm that
  were not wrapped in Kubernetes manifests
- Document incomplete gpt-oss-120b disagg recipe with README explaining
  missing components

README improvements:
- Restructure Available Recipes table with 'Deployment' and 'Benchmark
  Recipe' columns to clarify that perf.yaml files are tools for users
  to run benchmarks, not published performance results
- Add comprehensive quick start guide with prerequisites
- Link to correct Kubernetes deployment guides
- Add troubleshooting section
- Remove extraneous links (docs.nvidia.com, license section)

Result: 4 models with 10 complete deployment recipes (7 with benchmark
scripts), focused exclusively on Kubernetes deployments.
@BenHamm BenHamm requested review from a team as code owners November 6, 2025 02:12
@biswapanda

LGTM. Went over the shorter README and verified the links are fine.

Minor nit: the "recipes:" prefix in the PR title needs an update. Valid prefixes include fix: and docs:.

@ishandhanani ishandhanani changed the title from "recipes: Clean up incomplete recipes and clarify Kubernetes-only focus" to "chore(recipies): Clean up incomplete recipes and clarify Kubernetes-only focus" Nov 6, 2025
@github-actions github-actions bot added the chore label Nov 6, 2025
Address feedback to make the README less AI-generated looking by removing
decorative emojis from section headings while keeping status indicators
(✅ ❌) in tables and content.
@BenHamm BenHamm changed the title from "chore(recipies): Clean up incomplete recipes and clarify Kubernetes-only focus" to "docs: Clean up incomplete recipes and clarify Kubernetes-only focus" Nov 6, 2025
@github-actions github-actions bot added docs and removed chore labels Nov 6, 2025