isdzulqor/oalla

Oalla Logo

Run Ollama and open language models directly on Android devices

What This Is · Architecture · Technical Implementation · Models · Why This Approach


What This Is

Oalla demonstrates running a complete Go web server inside an Android app process. The result is a mobile app that can run any Ollama-compatible model locally without internet connectivity.

This is completely open source, just like Ollama itself. You can use any models from Ollama's library or Hugging Face that work with the GGUF format.

Oalla Demo

Oalla running locally on Android with offline AI models

Download

Download APK

Get the latest APK from the Releases page

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Android App Process                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐    HTTP     ┌─────────────────────────┐    │
│  │   JavaScript    │ ←────────→  │     Go Server           │    │
│  │   Chat UI       │  localhost  │     (Ollama)            │    │
│  │                 │ :8000-8500  │                         │    │
│  └─────────────────┘  (dynamic)  └─────────────────────────┘    │
│           │                                    │                │
│           │                                    │                │
│  ┌─────────────────┐             ┌─────────────────────────┐    │
│  │  Android        │             │    JNI Bridge           │    │
│  │  WebView        │             │    (libbridgeollama.so) │    │
│  │                 │             │                         │    │
│  └─────────────────┘             └─────────────────────────┘    │
│           │                                    │                │
│           └────────────────────────────────────┘                │
│                    Native Integration                           │
└─────────────────────────────────────────────────────────────────┘

Key Components:

  • JavaScript UI: Rich web-based chat interface running in WebView
  • HTTP API: Standard REST endpoints (/api/chat, /api/models, etc.)
  • Go Server: Full Ollama server compiled as Android native library
  • JNI Bridge: Connects Kotlin/Java Android code with Go server
  • Single Process: Everything runs in one Android app process for efficiency
  • Dynamic Port: Randomly allocated port (8000-8500) for security

The app loads Ollama's web interface in a WebView while running the actual Ollama server natively in the same process. JavaScript communicates with the Go backend via standard HTTP requests to localhost.

Technical Implementation

A step-by-step guide shows how to modify the official Ollama repository for Android compatibility, covering JNI bridge creation, in-process execution, cross-compilation, and the web API endpoints that make this possible.

It also explains how the Android app manages the Go server lifecycle, handles JavaScript-native communication, implements security through dynamic ports and authentication, and manages encrypted assets.

Models

Works with any Ollama model or GGUF-format models from Hugging Face:

Ollama library models:

| Model            | Size  | Context | Type |
|------------------|-------|---------|------|
| tinyllama:latest | 638MB | 2K      | Text |
| qwen3:0.6b       | 523MB | 40K     | Text |
| smollm2:135m     | 135MB | 4K      | Text |
| gemma3:270m      | 292MB | 32K     | Text |

Hugging Face GGUF models:

| Model                              | Size   | Context | Type |
|------------------------------------|--------|---------|------|
| hf.co/unsloth/Qwen3-4B-GGUF:Q4_K_M | 1.03GB | 128K    | Text |

Why This Approach

This architecture proves that mobile devices can run sophisticated AI workloads locally. It maintains full compatibility with Ollama's ecosystem while providing a rich web-based interface that would be difficult to implement natively.

The approach is entirely offline-first and privacy-focused: no data leaves your device, no accounts are required, and there is no tracking.

Benefits:

  • Easy model installation - just download GGUF files and load them
  • Full Ollama API compatibility for seamless integration
  • Web-based UI that's simple to customize and extend

Current Limitations:

  • Text-only models supported at this time
  • Embedding and image models not yet integrated
  • No Android GPU acceleration (CPU inference only)
  • Performance depends on device capabilities

License

MIT License, same as Ollama. This project builds upon Ollama's work to bring it to mobile platforms.
