Unified management and routing for llama.cpp, MLX, and vLLM models, with a web dashboard.
Configs, launchers, benchmarks, and tooling for running Qwen3.5 GGUF models locally with llama.cpp on a 16GB NVIDIA GPU
A lightweight terminal chat interface for the llama.cpp server, written in C++, with many features and Windows/Linux support.
Local LLM proxy, DevOps friendly
A robust, production-ready Python toolkit to automate the synchronization between a directory of .gguf model files and a llama-swap config.yaml
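A minimal sketch of the core idea behind such a sync tool: scan a directory for .gguf files and regenerate the models section of a llama-swap config.yaml. The directory path, output path, and simplified YAML schema here are illustrative assumptions, not the toolkit's actual output.

```python
"""Sketch: sync a directory of .gguf files into a llama-swap style config.
The schema below is a simplified assumption, not llama-swap's full format."""
from pathlib import Path
import yaml  # pip install pyyaml

MODELS_DIR = Path("/models")       # hypothetical model directory
CONFIG_PATH = Path("config.yaml")  # hypothetical output path

def build_config(models_dir: Path) -> dict:
    models = {}
    for gguf in sorted(models_dir.glob("*.gguf")):
        # Use the filename (minus extension) as the model alias;
        # ${PORT} is the placeholder llama-swap substitutes at launch.
        models[gguf.stem] = {
            "cmd": f"llama-server --port ${{PORT}} -m {gguf}"
        }
    return {"models": models}

if __name__ == "__main__":
    CONFIG_PATH.write_text(yaml.safe_dump(build_config(MODELS_DIR)))
```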
FastAPI proxy that strips volatile fields from OpenClaw requests to dramatically improve llama-server KV cache hit rates (~22× faster prompt eval)
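The principle is that llama-server reuses its KV cache only while the token prefix of a request is byte-identical to a previous one, so any per-request text (timestamps, request IDs) busts the cache. A rough sketch of that idea, assuming a hypothetical "Current time:" line as the volatile field; the real proxy targets OpenClaw-specific fields:

```python
"""Sketch: normalize volatile text so repeated requests share a byte-identical
prompt prefix and llama-server's KV cache can be reused. The upstream URL and
the volatile pattern are assumptions for illustration."""
import re
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
UPSTREAM = "http://localhost:8080/v1/chat/completions"  # assumed llama-server URL
TIMESTAMP = re.compile(r"Current time: .*")             # hypothetical volatile line

@app.post("/v1/chat/completions")
async def proxy(request: Request):
    body = await request.json()
    for msg in body.get("messages", []):
        if msg.get("role") == "system" and isinstance(msg.get("content"), str):
            # Strip per-request text that would change the token prefix
            msg["content"] = TIMESTAMP.sub("", msg["content"])
    async with httpx.AsyncClient(timeout=120.0) as client:
        upstream = await client.post(UPSTREAM, json=body)
    return JSONResponse(upstream.json(), status_code=upstream.status_code)
```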
CLI wrapper for llama.cpp providing an ollama-like experience
A simple web application for real-time AI vision analysis using SmolVLM-500M-Instruct with live camera feed processing and text-to-speech.
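For context, sending a frame to a vision model served by llama-server uses the standard OpenAI-compatible multimodal message format. A sketch under assumed defaults (local URL, a captured frame.jpg, server started with the model's multimodal projector):

```python
"""Sketch: send one camera frame to a llama-server running SmolVLM via the
OpenAI-compatible chat endpoint. URL and file name are assumptions."""
import base64, json, urllib.request

with open("frame.jpg", "rb") as f:  # hypothetical captured frame
    b64 = base64.b64encode(f.read()).decode()

payload = {
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what you see."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```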
A Bash script that automatically launches llama-server, detects available .gguf models, and selects the number of GPU layers based on your free VRAM.
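The layer-selection heuristic such a script might use, sketched in Python rather than Bash for consistency with the other examples; the bytes-per-layer estimate, headroom, and model path are assumptions:

```python
"""Sketch: query free VRAM via nvidia-smi, estimate how many layers fit,
and pass that as -ngl to llama-server."""
import subprocess

def free_vram_mib() -> int:
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.free",
         "--format=csv,noheader,nounits"], text=True)
    return int(out.splitlines()[0])

def pick_gpu_layers(mib_per_layer: int = 300) -> int:
    # 300 MiB/layer is a placeholder estimate; it varies by model and
    # quantization, and a real script would also cap at the model's layer count.
    budget = free_vram_mib() - 1024  # keep ~1 GiB headroom for the KV cache
    return max(budget // mib_per_layer, 0)

if __name__ == "__main__":
    model = "/models/example.gguf"   # hypothetical model path
    ngl = pick_gpu_layers()
    subprocess.run(["llama-server", "-m", model, "-ngl", str(ngl)])
```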
A production-grade Python SDK for llama-server that streamlines authentication, token rotation, observability, and PII masking—helping AI architects ship secure, traceable LLM systems with enterprise-ready guardrails.
OpenServ is a simple Bash-based CLI tool for managing LLMs in llama.cpp server.
Create a code completion model & tool for IDEs that can run locally on consumer hardware and rival the performance of commercial products like Cursor.
Lightweight nginx+Lua reverse proxy that routes OpenAI-compatible API requests to multiple llama-server instances by model name. No request body mangling.
Hikma is a minimal GTK4 chat client in Vala for OpenAI‑compatible APIs. It renders messages as plain text, stores settings securely via libsecret, and builds with Meson/Ninja plus a simple Debian packaging flow.
A Python-based CLI utility for managing llama-server on Windows 10/11. It launches and switches between LLMs using only a configuration file (config.ini), with the Strategy pattern implemented for model selection. Designed primarily for AMD GPUs (RX 6600 - 6700) that lack ROCm HIP SDK support.
Provide tested tools and configs to run Qwen 3.5 GGUF models efficiently on a single 16GB NVIDIA GPU using llama.cpp locally.
FIMpad is a fill-in-the-middle (FIM) focused local LLM interface in the form of a tabbed GUI text editor.
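A FIM request against llama-server's /infill endpoint, which an editor like this could build on. The endpoint and input_prefix/input_suffix fields follow the llama.cpp server documentation; the URL, snippet, and n_predict value are assumptions:

```python
"""Sketch: fill-in-the-middle completion via llama-server's /infill endpoint,
assuming a FIM-capable model is loaded at the default local address."""
import json
import urllib.request

payload = {
    "input_prefix": "def fib(n):\n    ",  # text before the cursor
    "input_suffix": "\n    return a",     # text after the cursor
    "n_predict": 64,
}
req = urllib.request.Request(
    "http://localhost:8080/infill",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["content"])     # the model's middle completion
```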
Local-first review terminal for deterministic, inspectable, and constrained repository analysis with local LLM backends.