Releases: ggml-org/LlamaBarn

0.25.0

18 Feb 14:23

  • Add custom models folder setting
  • Update llama.cpp to b8088

0.24.0

28 Jan 15:46

  • Move settings to a separate window
  • Add context length variants for installed models
  • Add delete and copy model ID buttons on hover
  • Remove click-to-load, as models load on demand since 0.22
  • Add GLM 4.7 Flash to catalog
  • Update llama.cpp to b7843

0.23.0

19 Jan 08:52

  • Allow binding server to a specific IP address
  • Set default sleep idle time to 5 minutes
  • Update llama.cpp to b7772

0.22.0

15 Jan 13:30

LlamaBarn now uses llama-server in Router Mode. The server stays running in the background and loads models automatically when they are needed. You no longer have to manually select a model before using it. This version also adds an optional "Unload when idle" setting that automatically removes models from memory after a period of inactivity.
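In Router Mode, clients target a single OpenAI-compatible endpoint and name the model in the request body; the server loads that model on first use. The sketch below builds such a request. The base URL, port, and model ID are placeholders, not values LlamaBarn guarantees:

```python
import json

# Placeholder base URL; LlamaBarn displays the actual one in its header.
BASE_URL = "http://127.0.0.1:8080/v1"

def chat_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build an OpenAI-compatible chat completion request.

    In Router Mode the "model" field tells llama-server which model to
    serve; if it is not already in memory, it is loaded automatically.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode()

# The model ID here is a placeholder; pass the URL and body to any HTTP
# client (e.g. urllib.request.urlopen) to send the request.
url, body = chat_request("gemma-3-4b", "Hello!")
print(url)
print(body.decode())
```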

  • Migrate to llama-server Router Mode
  • Introduce "Unload when idle" setting for memory management
  • Add automatic download retry logic with exponential backoff
  • Add error reporting for failed model downloads
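The download retry item above describes a standard exponential-backoff scheme. A minimal sketch of that pattern, with jittered delays (the function name, parameters, and jitter range are illustrative, not LlamaBarn's actual implementation):

```python
import random
import time

def download_with_retry(fetch, retries=5, base=1.0, cap=60.0):
    """Call fetch(), retrying on network errors with exponential backoff.

    Delay before attempt n is min(cap, base * 2**n), scaled by random
    jitter so that many clients retrying at once do not synchronize.
    """
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            if attempt == retries - 1:
                raise  # out of retries; surface the error to the caller
            delay = min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)
            time.sleep(delay)
```

For example, with `base=1.0` the un-jittered delays grow 1 s, 2 s, 4 s, 8 s, capped at `cap`.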

0.21.0

07 Jan 08:01

  • Update header to show base URL as a separate element
  • Include memory budget details in context length info in Settings
  • Add Hugging Face link to model context menu
  • Remove --no-mmap to enable memory mapping
  • Update llama.cpp to b7652
  • Fix Nemotron KV cache footprint calculation

0.20.0

02 Jan 13:14

  • Refine catalog menu layout and navigation
  • Show granular download progress with decimal percentages
  • Optimize settings menu toggle to avoid full menu rebuild

0.19.0

30 Dec 14:15

  • Family items now open detailed views instead of expanding in place
  • Add descriptions to model families
  • Show incompatible models in the catalog with clear memory requirements
  • Group Qwen3, Qwen3 VL, and Ministral 3 models with their reasoning variants

0.18.0

29 Dec 13:05

  • Add Q4_K_M quantizations for Devstral 2 models
  • Refactor catalog and model structures for better organization
  • Update llama.cpp to b7569

0.17.0

28 Dec 12:27

  • Add Devstral 2 model family
  • Add Nemotron Nano 3 model family
  • Use xcassets for icon to reduce bundle size
  • Update llama.cpp to b7561

0.16.0

19 Dec 07:56

  • Add Show in Finder button to installed model context menu
  • Always show context length and estimated memory usage in model items
  • Remove memory limit option in favor of automatic safety budget
  • Add expandable info descriptions to settings
  • Fix UI flicker when downloading or cancelling models
  • Update llama.cpp to b7475