## Bug Description

A segmentation fault (exit code 139) occurs when attempting to convert the InkubaLM-0.4B model using `mlc_llm convert_weight`.
## Model Information

- Model: `lelapa/InkubaLM-0.4B` from HuggingFace
- Architecture: custom LlamaForCausalLM variant (`VulavulaLlamaForCausalLM`)
- Size: 0.4B parameters, 8 layers
- Special characteristics:
  - Shared tensors between layers (see the inspection sketch below)
  - Custom `auto_map` configuration
  - Custom Python files: `vulavulaslm.py`
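For reference, here is a minimal sketch of how the shared tensors can be confirmed. It assumes the checkpoint ships as `pytorch_model.bin` (the filename is an assumption; the hub layout may differ) and groups parameter names by underlying storage:

```python
# Sketch: detect aliased (shared) tensors in the checkpoint by grouping
# parameter names by their storage pointer. "pytorch_model.bin" is an
# assumed filename for the HF checkpoint.
import torch

state_dict = torch.load(
    "InkubaLM-0.4B/pytorch_model.bin", map_location="cpu", weights_only=True
)

by_storage = {}
for name, tensor in state_dict.items():
    ptr = tensor.untyped_storage().data_ptr()
    by_storage.setdefault(ptr, []).append(name)

# More than one name per storage means the tensors are shared.
for names in by_storage.values():
    if len(names) > 1:
        print("shared:", names)
```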
## Steps to Reproduce

- Download the model from HuggingFace: `lelapa/InkubaLM-0.4B`
- Run:

```bash
python -m mlc_llm convert_weight ./InkubaLM-0.4B --quantization q4f16_1 -o output
```
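If it helps with triage, the same command can be re-run with CPython's built-in faulthandler (a standard Python option, not MLC-specific), which dumps the Python-level stack at the point of the fault. For a crash purely inside native code it may add little, but it can narrow down where conversion dies:

```bash
# Same repro with faulthandler enabled; on SIGSEGV, CPython prints the
# Python-level stack before the process exits.
python -X faulthandler -m mlc_llm convert_weight ./InkubaLM-0.4B \
    --quantization q4f16_1 -o output
```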
## Expected Behavior

Successful weight conversion, or a graceful error message if the architecture is unsupported.
## Actual Behavior

Segmentation fault with exit code 139.
## Environment

- OS: macOS 15.0 (Apple Silicon ARM64)
- MLC LLM version: 0.1.dev0
- Python version: 3.10.18 (conda-forge)
- Devices tested: both `cpu:0` and `metal:0`
Attempted Solutions
- ✅ Tried multiple quantization formats:
q4f16_1
,q0f32
- ✅ Tried CPU device instead of Metal
- ✅ Converted to safetensors format first
- ✅ Removed custom
auto_map
from config - ❌ All attempts result in segfault
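The safetensors attempt was roughly the standard transformers round-trip shown below (a minimal sketch, assuming `save_pretrained` with `safe_serialization=True`; the exact steps used may have differed). Note that safetensors does not store aliased tensors, so transformers de-duplicates shared weights on save:

```python
# Sketch of the "convert to safetensors first" attempt: load the original
# checkpoint with the custom code, then re-save as model.safetensors.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "./InkubaLM-0.4B", trust_remote_code=True
)
model.save_pretrained("./InkubaLM-0.4B-safetensors", safe_serialization=True)
```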
## Additional Context

The model loads successfully with `transformers.AutoModelForCausalLM.from_pretrained()` using `trust_remote_code=True`, suggesting the issue is specific to MLC's conversion process.
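For comparison, this is the load that succeeds:

```python
# The same checkpoint loads through transformers, including the custom
# VulavulaLlamaForCausalLM class pulled in via auto_map.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "lelapa/InkubaLM-0.4B", trust_remote_code=True
)
print(type(model).__name__)  # VulavulaLlamaForCausalLM
```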
## Log Output

```
[2025-10-02 17:10:06] INFO llama_model.py:63: context_window_size not found in config.json. Falling back to max_position_embeddings (2048)
[2025-10-02 17:10:06] INFO llama_model.py:89: prefill_chunk_size defaults to 2048
!!!!!!! Segfault encountered !!!!!!!
zsh: segmentation fault  python -m mlc_llm convert_weight InkubaLM-0.4B --quantization
```
## Impact

This prevents users from converting LelapaAI's InkubaLM-0.4B African language model to MLC format.