Conversation

atebites-hub

No description provided.

- Upgraded TVM submodule to FFI bump commit (f68651f035)
- Fixed script printer namespace mismatch (node->script)
- Added conditional script printer imports with dummy fallbacks
- Resolved CMake compatibility issues in tokenizers-cpp submodules
- Added mlc_llm console script entry point to pyproject.toml
- Established virtual environment isolation for clean builds
- TVM v0.22 now imports successfully without errors (see the smoke test after this list)
- MLC-LLM CLI functional with TVM v0.22 backend
- Ready for Phase 2: DLPack Type System Migration
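
A quick smoke test run inside the clean virtual environment confirms the import; it only assumes the installed package reports its version via the standard tvm.__version__ attribute.

```python
# Run inside the isolated virtual environment after installing the TVM
# Python package separately (see "Technical fixes" below).
import numpy as np
import tvm

print(tvm.__version__)                            # expect a 0.22-series version string
x = tvm.nd.array(np.arange(4, dtype="float32"))   # basic runtime sanity check
print(x)
```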

Technical fixes:
- C++: script_printer.cc namespace registration fix
- Python: Optional Scriptable import with comprehensive fallback (see the sketch after this list)
- Build: TVM Python package separate installation requirement
- Environment: Virtual environment isolation for reproducibility
- Updated tokenizers-cpp to commit 405aa4fa
- Updated TVM to commit 52a49c82
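
The guarded Scriptable import mentioned above looks roughly like the sketch below; it assumes the upstream symbol is exposed as tvm.runtime.Scriptable, and the fallback body is an illustrative stand-in rather than the exact code in the patch.

```python
# Conditional script printer import with a dummy fallback.
# Assumption: upstream exposes `tvm.runtime.Scriptable`; the fallback class
# is only a sketch of the idea, not the exact patched code.
try:
    from tvm.runtime import Scriptable
except ImportError:

    class Scriptable:
        """Minimal stand-in used when TVM was built without the script printer."""

        def script(self, *args, **kwargs):
            raise RuntimeError(
                "The TVM script printer is unavailable in this build; "
                ".script() cannot be used in this environment."
            )
```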

Hotfix for the TVM C++ dependency.
Phase 1 completed basic TVM v0.22 integration, but model compilation
fails with a segmentation fault during the convert_weight operation.

Root Cause: DLPack type system incompatibility
- TVM v0.22 changed DLTensor → DLNDArray
- TVM v0.22 changed DLManagedTensor → DLManagedNDArray
- MLC-LLM still uses old DLPack types for tensor operations

Impact: Cannot compile Gemma-3-270M or any models
Solution: Phase 2 DLPack migration required immediately
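
One way to probe this from Python is a DLPack round trip between NumPy and the installed TVM. The sketch below assumes the pre-bump API names NDArray.to_dlpack() and tvm.nd.from_dlpack() are still exposed, which is exactly what a DLPack type-system change would break.

```python
# DLPack round-trip probe (assumes the pre-FFI-bump Python API names
# `NDArray.to_dlpack()` / `tvm.nd.from_dlpack()` are still present).
import numpy as np
import tvm

x = tvm.nd.array(np.arange(8, dtype="float16"))
capsule = x.to_dlpack()           # export as a DLPack capsule
y = tvm.nd.from_dlpack(capsule)   # re-import through DLPack
assert np.array_equal(x.numpy(), y.numpy())
```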

Validation: the refactor.md complexity assessment was accurate;
Phase 1 alone is insufficient for full functionality.
… not DLPack

Root cause identified:
- Segfault occurs during TIR static initialization for Gemma3 sliding window attention
- NOT a DLPack type incompatibility, as initially assumed
- Issue is in TIR code generation for sliding window attention mechanisms
- Confirmed: Even q0f16 (no quantization) still segfaults
- Hypothesis: Missing TIR bitwise operations using powers of 2 for sliding window masks

Phase 1: ✅ TVM basic integration successful
Phase 2: 🔴 TIR sliding window operations required (not DLPack migration)

User insight: 'bitwise stuff happens in quantization' is correct, but the issue is
broader: TIR generation for sliding-window attention patterns fails. The mask
pattern in question is sketched below.
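
To make the hypothesis concrete, the sketch below shows a causal sliding-window mask in plain NumPy together with the power-of-two bitwise identity such a lowering can rely on. This is an illustration of the pattern only, not the generated TIR, and the window size 512 is just an example value.

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal sliding-window mask: query i attends to key j iff 0 <= i - j < window."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (i - j >= 0) & (i - j < window)

print(sliding_window_mask(6, 3).astype(int))

# The power-of-two trick: when `window` is a power of two, the modulo used for
# ring-buffer (sliding-window) KV-cache indexing lowers to a single bitwise AND.
window = 512                                    # example value, a power of two
assert (window & (window - 1)) == 0             # power-of-two check
pos = 1234
assert (pos % window) == (pos & (window - 1))   # equivalent index computations
```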