Add Ministral/Mistral model implementation by dfalbel · Pull Request #5 · mlverse/minhub

dfalbel · 2026-02-09T18:39:29Z

Summary

Implements Ministral-style models with YaRN RoPE and GQA (Grouped Query Attention)
Supports both standard Mistral models (e.g., mistralai/Mistral-7B-v0.1) and multimodal Ministral models
Verified against Hugging Face transformers (MistralForCausalLM) with a maximum absolute difference of ~6e-7, consistent with floating-point precision

Features

YaRN RoPE: Extended context support with factor, beta_fast, beta_slow, mscale parameters
GQA: Configurable n_head vs n_kv_head for grouped query attention
SwiGLU MLP: Gate/up/down projections with SiLU activation
RMSNorm: Pre-normalization
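
For readers unfamiliar with how the YaRN parameters interact, here is a minimal Python sketch of the published YaRN frequency blending. It is not the R code added in this PR. The function names, the `original_max_pos` argument, and the example values are assumptions for illustration only. High-frequency dimension pairs keep their original RoPE frequency, low-frequency pairs are stretched by `factor`, and `beta_fast` / `beta_slow` set where the linear ramp between the two regimes starts and ends.

```python
import math


def yarn_inv_freq(head_dim, base, factor, beta_fast, beta_slow, original_max_pos):
    """Return one inverse frequency per dimension pair, blended as in YaRN."""

    def correction_dim(num_rotations):
        # Dimension index whose wavelength completes `num_rotations` turns
        # over the original context window.
        return (head_dim * math.log(original_max_pos / (num_rotations * 2 * math.pi))) / (
            2 * math.log(base)
        )

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), head_dim - 1)
    if low == high:
        high += 0.001  # avoid a zero-width ramp

    inv_freq = []
    for i in range(head_dim // 2):
        pos_freq = base ** (2 * i / head_dim)
        extrapolated = 1.0 / pos_freq              # unchanged RoPE frequency
        interpolated = 1.0 / (factor * pos_freq)   # frequency stretched by `factor`
        # ramp is 0 below `low` (keep original) and 1 above `high` (fully interpolate)
        ramp = min(max((i - low) / (high - low), 0.0), 1.0)
        inv_freq.append(extrapolated * (1.0 - ramp) + interpolated * ramp)
    return inv_freq


def yarn_attention_scale(factor, mscale=1.0):
    """Attention scaling applied to cos/sin in the simplest YaRN form."""
    if factor <= 1.0:
        return 1.0
    return 0.1 * mscale * math.log(factor) + 1.0


# Hypothetical values, not taken from any checkpoint config.
freqs = yarn_inv_freq(head_dim=128, base=10000.0, factor=4.0,
                      beta_fast=32, beta_slow=1, original_max_pos=4096)
```

With these example values the first pair keeps its original frequency and the last pair is slowed by exactly `factor`. Everything in between is a linear mix.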

API

# Load pretrained model
model <- ministral_from_pretrained("mistralai/Mistral-7B-v0.1")

# Or create with custom config
model <- ministral(vocab_size = 32000, n_embd = 4096, ...)
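
As a rough guide to what a custom config implies for GQA, the Python sketch below derives the per-head dimension and the query-to-KV head mapping from `n_embd`, `n_head`, and `n_kv_head`. The helper `gqa_layout` is hypothetical and not part of the package. The values `n_head = 32` and `n_kv_head = 8` are the published Mistral-7B settings, used here as an assumption. The elided arguments in the call above are deliberately not filled in.

```python
def gqa_layout(n_embd, n_head, n_kv_head):
    """Derive head sizes and the query-head -> KV-head mapping for GQA."""
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    if n_head % n_kv_head != 0:
        raise ValueError("n_head must be divisible by n_kv_head")

    head_dim = n_embd // n_head
    group_size = n_head // n_kv_head  # query heads sharing one KV head
    return {
        "head_dim": head_dim,
        "group_size": group_size,
        # Output width of the key and value projections.
        "kv_dim": n_kv_head * head_dim,
        # Consecutive query heads share a KV head (repeat-interleave layout).
        "kv_head_for_query": [q // group_size for q in range(n_head)],
    }


# Assumed example values; n_embd matches the snippet above.
layout = gqa_layout(n_embd=4096, n_head=32, n_kv_head=8)
```

Setting `n_kv_head` equal to `n_head` reduces this to standard multi-head attention, and `n_kv_head = 1` gives multi-query attention.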

Test plan

Verified logits match Python transformers output (position 0: exact, position 2: max diff 0.005)
Test loading pretrained Mistral-7B
Test custom config model creation
Test text generation with streaming
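
The logit comparison in the first item can be sketched as a per-position maximum absolute difference checked against a tolerance. This is an illustrative Python helper, not the test code in this PR. The function names and the sample numbers are made up for the example.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def logits_match(ours, reference, atol):
    """Compare per-position logit vectors; return (all within atol, per-position diffs)."""
    if len(ours) != len(reference):
        raise ValueError("sequences must have the same number of positions")
    diffs = [max_abs_diff(o, r) for o, r in zip(ours, reference)]
    return all(d <= atol for d in diffs), diffs


# Made-up logits for two positions; real tests would use model outputs.
ours = [[0.1, 0.2], [1.0, 2.0]]
reference = [[0.1, 0.2], [1.0, 2.004]]
ok, diffs = logits_match(ours, reference, atol=0.005)
```

Reporting the per-position differences, rather than a single pass/fail flag, makes it easier to see whether error grows with sequence position.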

🤖 Generated with Claude Code

Implements Ministral-style models with:
- YaRN RoPE (Yet another RoPE extension) for extended context
- GQA (Grouped Query Attention) with configurable num_key_value_heads
- SwiGLU MLP with SiLU activation
- RMSNorm

Verified against HuggingFace transformers MistralForCausalLM with max diff ~6e-7 (floating point precision).

Includes tests for:
- Loading pretrained Mistral-7B and comparing logits
- Creating models with custom config
- Text generation with streaming output

Co-Authored-By: Claude <noreply@anthropic.com>

dfalbel merged commit 655d881 into main Feb 9, 2026
1 check failed

dfalbel merged 1 commit into main from add-ministral
