
GranitePi-4-Nano

GranitePi-4-Nano Banner

Details
    Date: 11-01-2025 | Version: 1.2

    Designed by: Julian A. Gonzalez
    Co-Contributor: Thomas Mertens


Run a full-featured large language model entirely on your Raspberry Pi 5 with zero cloud dependency.

This repository contains a complete, beginner-friendly guide to setting up IBM Granite 4.0 (350M) on a Raspberry Pi 5 using Ollama for 100% local, private AI inference. It utilizes a highly optimized GGUF quantization (unsloth_granite-4.0-h-350m-GGUF) to ensure smooth performance on edge hardware.

✨ Highlights

  • 🔒 100% Private: All data stays on your device. No cloud, no tracking.
  • 🚀 One-Click Setup: Automated script handles dependencies, safety checks, and installation.
  • ⚡ Optimized Performance: Uses specific quantization and thread settings for the Pi 5 CPU.
  • 💰 Cost-Effective: Turn a standard Raspberry Pi into an AI workstation.
  • 🌐 Fully Offline: Works without internet after initial setup.

📊 Model Specs

| Aspect | Details |
|---|---|
| Model | jewelzufo/unsloth_granite-4.0-h-350m-GGUF |
| Parameters | 350 million |
| Architecture | Hybrid Mamba-2 (SSM) |
| Download size | ~366 MB |
| Loaded size | ~1.2 GB RAM |
| Inference memory | ~800 MB - 1.2 GB |
| License | Apache 2.0 (open source) |
| Languages | 12+ (English, Spanish, French, German, Japanese, etc.) |

System Architecture

system diagram

🎯 Quick Start (TL;DR)

We provide an automated setup script that verifies your hardware (RAM, Disk, Thermals) and installs the necessary components.

```bash
# 1. Clone the repository
git clone https://github.com/Jewelzufo/granitepi-4-nano.git
cd granitepi-4-nano

# 2. Make the setup script executable
chmod +x setup.sh

# 3. Run the automated installer
./setup.sh

# 4. Start chatting
ollama run jewelzufo/unsloth_granite-4.0-h-350m-GGUF
```

Done! 🎉

🛠️ Requirements

Hardware

  • Raspberry Pi 5

  • RAM: 8GB recommended (4GB minimum supported with warnings).

  • Storage: At least 5GB free space required (SSD preferred for speed).

  • Cooling: Active cooling (heatsink + fan) is strongly recommended to prevent thermal throttling during inference.

  • Power: Official USB-C power supply (5V 5A).

Software

  • Raspberry Pi OS 64-bit (Bookworm or later).
  • Basic terminal familiarity.

🚀 Detailed Installation

1. Automated Setup (setup.sh)

The included setup.sh script is the safest way to deploy. It performs the following actions:

  • Validates Architecture: Ensures you are running on a 64-bit OS.
  • Checks Hardware: Verifies sufficient RAM, disk space, and safe CPU temperatures.
  • Optimizes Swap: Configures a 2GB swap file to prevent out-of-memory crashes.
  • Installs Ollama: Sets up the inference engine and applies performance overrides (2 threads, 24h keep-alive).
  • Downloads Model: Pulls the optimized GGUF version of Granite 4.0.

2. Manual Verification

If you prefer to check your system manually before running the script:

```bash
# Verify 64-bit architecture
uname -m  # Expected output: aarch64

# Check available RAM
free -h   # Look at the 'available' column

# Check CPU temperature
vcgencmd measure_temp  # Should be below 75°C
```
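If you'd rather script these checks, here is a minimal Python sketch of the same preflight logic. It assumes the standard Linux interfaces (`/proc/meminfo` and `/sys/class/thermal/thermal_zone0/temp`); the thresholds mirror the requirements listed above.

```python
import os

def available_ram_gb(meminfo_text):
    # Parse the 'MemAvailable' field (reported in kB) from /proc/meminfo.
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) / 1024 / 1024
    return 0.0

def free_disk_gb(path="."):
    # Free space visible to unprivileged users on the filesystem holding `path`.
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize / 1024 ** 3

def cpu_temp_c(thermal_text):
    # The kernel thermal zone reports millidegrees Celsius.
    return int(thermal_text.strip()) / 1000.0

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            ram = available_ram_gb(f.read())
        print(f"Available RAM: {ram:.1f} GB (4 GB minimum, 8 GB recommended)")
    except OSError:
        print("Could not read /proc/meminfo (not Linux?)")
    print(f"Free disk:     {free_disk_gb():.1f} GB (5 GB required)")
    try:
        with open("/sys/class/thermal/thermal_zone0/temp") as f:
            print(f"CPU temp:      {cpu_temp_c(f.read()):.1f} °C (keep below 75 °C)")
    except OSError:
        print("CPU temp:      thermal zone not available on this machine")
```

This is a convenience sketch, not a replacement for setup.sh, which also configures swap and installs Ollama.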

💻 Usage Examples

Command Line

Once installed, interact with the model directly using the model ID configured in the setup:

```bash
# Ask a question
ollama run jewelzufo/unsloth_granite-4.0-h-350m-GGUF "How do neural networks work?"

# Multi-line prompt
ollama run jewelzufo/unsloth_granite-4.0-h-350m-GGUF "
Write a Python function that:
1. Takes a list of numbers
2. Returns the average
3. Handles empty lists
"
```

Python Integration

You can integrate Granite 4.0 into your Python scripts using the Ollama API. See examples/basic_query.py for a starter script.

```python
import requests

def query_ai(prompt):
    response = requests.post(
        'http://localhost:11434/api/generate',
        json={
            'model': 'jewelzufo/unsloth_granite-4.0-h-350m-GGUF',
            'prompt': prompt,
            'stream': False
        }
    )
    response.raise_for_status()  # Fail loudly if the server returned an error
    return response.json()['response']

print(query_ai("What is quantum entanglement?"))
```
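For interactive use you may prefer streaming: with 'stream': True, Ollama's /api/generate endpoint returns one JSON object per line, each carrying a 'response' fragment and a final 'done' flag. A sketch of both sides, with the chunk-parsing helper separated out so it works without a live server:

```python
import json

def parse_stream(lines):
    # Reassemble the full answer from newline-delimited JSON chunks.
    parts = []
    for raw in lines:
        if not raw:
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

def stream_ai(prompt):
    # Requires a running Ollama server on localhost:11434.
    import requests  # imported here so parse_stream works without it
    with requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "jewelzufo/unsloth_granite-4.0-h-350m-GGUF",
            "prompt": prompt,
            "stream": True,
        },
        stream=True,
    ) as r:
        for line in r.iter_lines():
            if line:
                chunk = json.loads(line)
                print(chunk.get("response", ""), end="", flush=True)
                if chunk.get("done"):
                    break

# Usage (with the server running):
#   stream_ai("Summarize the water cycle in two sentences.")
```

Streaming makes the small model feel much more responsive, since tokens appear as they are generated instead of after the full answer is complete.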

📊 Performance Benchmarks

On Raspberry Pi 5 (8GB, active cooling):

| Task | Speed | Notes |
|---|---|---|
| Model load | ~8-12 seconds | Cached after first run |
| Question answer | ~2-5 seconds | For a typical 100-token response |
| Throughput | ~30-50 tokens/sec | Excellent for an ARM edge device |
| Temperature | 55-65°C | With proper cooling |
| Memory usage | ~1.2 GB peak | Model + buffers |
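The throughput row is just tokens generated divided by wall time. Ollama reports both in the final object of each /api/generate response, as eval_count (generated tokens) and eval_duration (nanoseconds), so you can compute your own figure:

```python
def tokens_per_second(eval_count, eval_duration_ns):
    # eval_count and eval_duration come from the final JSON object
    # returned by Ollama's /api/generate endpoint.
    return eval_count / (eval_duration_ns / 1e9)

# e.g. 100 tokens generated in 2.5 seconds:
print(tokens_per_second(100, 2_500_000_000))  # 40.0
```

Run a few prompts and average the results; single-run numbers on a Pi vary with temperature and background load.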

🔒 Privacy & Security

This setup is 100% private by design:

  • ✅ No cloud uploads: everything runs locally.
  • ✅ No internet required: works offline after the initial download.
  • ✅ No account needed: no tracking, no sign-ups.

Your data (medical records, proprietary documents, code) never leaves your device.

πŸ› Troubleshooting

Setup script fails with "Insufficient disk space"
Free up space or expand your partition; the script requires a 5 GB safety buffer.

Model is slow or system freezes

  • Check your temperature: vcgencmd measure_temp.
  • Ensure the setup script successfully configured the 2GB swap file.
  • Try reducing threads manually: OLLAMA_NUM_THREADS=1 ollama run ...

"Server not responding"
The setup script includes a wait loop, but if the server still fails to respond, restart the service:

```bash
sudo systemctl restart ollama
```
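That wait loop boils down to polling the server until it answers. A minimal, testable version takes the probe as a parameter; with the requests library installed you might pass something like lambda: requests.get("http://localhost:11434").ok (a hypothetical probe, shown only as an example):

```python
import time

def wait_for_server(probe, retries=10, delay=1.0):
    # probe is any zero-argument callable that returns True once the
    # server answers; retry up to `retries` times, `delay` seconds apart.
    for _ in range(retries):
        if probe():
            return True
        time.sleep(delay)
    return False
```

Injecting the probe keeps the retry logic free of network details, so you can reuse it for any local service, not just Ollama.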

🤝 Contributing

Contributions welcome!

  • Found a bug? Open an issue.
  • Have a better approach? Submit a PR.
  • Benchmarked different hardware? Share your results.

📝 License

This tutorial and code examples are Apache 2.0 licensed. The IBM Granite model is also Apache 2.0 licensed.


Made with ❤️ for privacy advocates, AI learners, and Raspberry Pi enthusiasts.
