F0R3V3R50F7/openOrchestrate

❤️ openOrchestrate 🗪

░░░▒▒▒▓▓▓ LOCAL AI THAT WORKS. ▓▓▓▒▒▒░░░



Est. 2025 | Powered by llama.cpp | Built on Caffeine & Principles


📸 SCREENSHOTS

First Time Setup Wizard · Chat Interface

📰 LATEST HEADLINES

[BREAKING] Local AI Finally Gets
           Proper Orchestration!
           
[NEW!] Context Windows That Don't
       Silently Fail You
       
[HOT!] VRAM Constraints? We Actually
       Respect Those Here

🎯 QUICK STATS

Codebase......... 354,000+ chars
Models Supported. Works best with instruct-tuned models*
Cloud Deps....... 0
Telemetry........ 0
Working Paths.... SACRED

*if llama.cpp supports it


🌟 ▂▃▅▇█ WHAT IS THIS █▇▅▃▂ 🌟

openOrchestrate is a complete local-first, multi-model (MoE-style) AI front-end built with phpDesktop-Chrome and llama.cpp.

Not just a chat UI. An orchestration layer that:

  • Routes requests intelligently
  • Manages multiple GGUF models
  • Preserves long-term context
  • Degrades gracefully on constrained hardware

Built for people who want local AI that respects limited VRAM, limited context, and reality itself.


🎪 FEATURES THAT ACTUALLY WORK

🧠 INTELLIGENT MODEL ROUTING

╔════════════════════════════════════════╗
║  Query   → Router → Right Model        ║
║  Code    → CodeLlama                   ║
║  Medical → Meditron                    ║
║  General → Llama                       ║
╚════════════════════════════════════════╝

💾 VELOCITY INDEX (Long-Term Memory)

┌─────────────────────────────────────┐
│ Context full → ARCHIVE & INDEX      │
│              → Recall when needed   │
│              → Keep continuity      │
└─────────────────────────────────────┘
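The archive-and-recall flow above can be illustrated with a minimal term-overlap index. This is a sketch under the assumption of a flat in-memory store; the real Velocity Index is not documented here and the `MemoryIndex` class is invented for illustration:

```javascript
// Minimal archive-and-recall memory index: store evicted messages with
// their term sets, then recall the closest matches by term overlap.
class MemoryIndex {
  constructor() {
    this.archive = [];
  }

  // Move a message out of the live context into the archive.
  store(message) {
    this.archive.push({
      text: message,
      terms: new Set(message.toLowerCase().split(/\W+/)),
    });
  }

  // Return the archived messages sharing the most terms with the query.
  recall(query, limit = 3) {
    const qTerms = query.toLowerCase().split(/\W+/);
    return this.archive
      .map((e) => ({ text: e.text, score: qTerms.filter((t) => e.terms.has(t)).length }))
      .filter((e) => e.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit)
      .map((e) => e.text);
  }
}
```

Term overlap is the crudest possible relevance signal; the point is the lifecycle: when context fills, nothing is thrown away, it is archived and stays recallable.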

βœ‚οΈ CONTEXT PRUNING

Context limits managed deliberately, not silently dropped.
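One way to make that deliberate is to trim oldest-first while protecting the system prompt. This sketch assumes a rough chars/4 token estimate rather than llama.cpp's actual tokenizer, and `pruneContext` is a name invented here:

```javascript
// Rough token estimate: ~4 characters per token (a heuristic, not the
// llama.cpp tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Drop the oldest non-system messages until the context fits the window.
function pruneContext(messages, maxTokens) {
  const pruned = [...messages];
  const total = () =>
    pruned.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (total() > maxTokens) {
    const i = pruned.findIndex((m) => m.role !== "system");
    if (i === -1) break;   // only system messages left; nothing prunable
    pruned.splice(i, 1);   // remove the oldest prunable message
  }
  return pruned;
}
```

The key property is that pruning is an explicit decision with known rules, rather than the runtime quietly truncating whatever no longer fits.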

πŸŽ›οΈ MULTI-MODEL EXECUTION

╔══════════════════════════════════════╗
β•‘  GPU: [β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘] Managed           β•‘
β•‘  CPU: [β–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] Auxiliary         β•‘
β•‘  VRAM: Predictable βœ“                 β•‘
β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
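Predictable VRAM means checking the budget before loading, not after the OOM. A hedged sketch of that placement decision, with an invented `placeModel` helper and an illustrative 1.2× overhead factor (real footprints depend on quantization and context size):

```javascript
// Before loading a model, verify its estimated footprint fits the free
// GPU budget; otherwise fall back to CPU instead of crashing.
function placeModel(model, gpuFreeMB) {
  // Rough footprint: file size plus a KV-cache/overhead margin.
  const neededMB = model.fileSizeMB * 1.2;
  return neededMB <= gpuFreeMB
    ? { device: "gpu", neededMB }
    : { device: "cpu", neededMB }; // graceful degradation, not an OOM
}
```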

🔒 100% LOCAL

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃  ✓ No API calls                  ┃
┃  ✓ No telemetry                  ┃
┃  ✓ No cloud                      ┃
┃  ✓ Stays on YOUR machine         ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

📎 FILE ATTACHMENTS

Attach text files for analysis.


☕ KEEP THE PROJECT BREWING

Fuel the development • Support more features • Keep it 100% local & independent


πŸ—οΈ ARCHITECTURE

        USER INTERFACE
        (HTML/CSS/JS)
              β”‚
              β–Ό
       PIPELINE ENGINE
              β”‚
     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”
     β–Ό        β–Ό        β–Ό
  LLAMA   VELOCITY  CONTEXT
GOVERNOR   INDEX   PRUNING
     β”‚
     β–Ό
  llama.cpp
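How the pipeline engine might wire those stages together, end to end. Every function below is a simplified stand-in written for this sketch (not openOrchestrate's actual API), so the flow can be read in one place:

```javascript
// Stand-in stages, named after the components in the diagram above.
function pickModel(text) {                      // Llama Governor stand-in
  return /code|function|bug/i.test(text) ? "codellama" : "llama";
}
function recallMemory(text) {                   // Velocity Index stand-in
  return []; // a real index would return relevant archived messages
}
function pruneToFit(msgs) {                     // Context Pruning stand-in
  return msgs.slice(-8); // keep only the newest turns
}
function runLlamaCpp(model, msgs) {             // inference stand-in
  return `[${model}] replied to ${msgs.length} message(s)`;
}

// Pipeline engine: route, enrich with memory, prune, then infer.
function pipeline(text, history) {
  const model = pickModel(text);
  const context = pruneToFit([
    ...history,
    ...recallMemory(text),
    { role: "user", content: text },
  ]);
  return runLlamaCpp(model, context);
}
```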

🎯 TECH STACK

Component   Tech
─────────   ────────────────────────
Frontend    HTML/CSS/JS (~66k chars)
Backend     PHP (~40k chars)
Runtime     phpDesktop-Chrome
Inference   llama.cpp

🎭 SUPPORTED MODELS

╔═══════════════════════════════════════════════════╗
║  IF llama.cpp CAN RUN IT, WE CAN ORCHESTRATE IT   ║
╚═══════════════════════════════════════════════════╝

πŸ—£οΈ GENERAL LLMs

  • Gemma
  • Phi
  • Qwen
  • Falcon
  • Your custom GGUF

💻 CODE MODELS

  • Qwen Code
  • DeepSeek-Coder
  • StarCoder
  • Mistral Code

πŸ₯ MEDICAL/RESEARCH

  • Meditron
  • BioMistral
  • MedGemma

🎯 ANY GGUF

  • If llama.cpp runs it
  • We orchestrate it
  • No vendor lock-in

🎲 DESIGN PHILOSOPHY

╔═══════════════════════════════════════════════════╗
║  ⚡ Constraints are real                           ║
║  ⚡ Regression is failure                          ║
║  ⚡ Working paths are sacred                       ║
║  ⚡ Graceful degradation > Silent failure          ║
╚═══════════════════════════════════════════════════╝

📊 STATUS

🚧 UNDER ACTIVE DEVELOPMENT - CURRENTLY VULKAN & WINDOWS ONLY!! 🚧

Focus: Stability > Features | Approach: Conservative releases


🎯 WHY THIS EXISTS

💾 Limited VRAM
📏 Limited Context
🤝 User Trust
🌍 Reality

Local AI deserves tooling that respects constraints.


🏴 MADE IN WALES


Crafted with love, discipline, and entirely too much tea.


┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                 ┃
┃  "Finally, a local AI tool that doesn't         ┃
┃   treat me like I have a datacenter"            ┃
┃                                  — Hopefully You┃
┃                                                 ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

💬 Got Questions? Found Bugs? Have Opinions?

Issues are welcome. PRs are reviewed. Respect is expected.





Last Updated: 2026 | Page Views: ∞ | Caffeine Consumed: Yes

About

Intelligent, user-friendly llama.cpp front-end with context-saving features for systems with low VRAM.
