Releases: wsmlby/homl

v0.3.5

04 Dec 07:54

Bump vLLM to pre-v0.13, with HunyuanOCR support.

A new OCR model worth trying!

v0.3.1

21 Aug 06:11

color

v0.3.0

17 Aug 00:42

HoML v0.3.0 Release Notes

This release introduces model-specific configuration, allowing users to customize launch parameters for each model.

New Features

  • Model Configuration: A new command homl config model has been added to manage model-specific launch parameters.
    • Set custom launch parameters for a model: homl config model <model_name> --params "<params>"
    • Get the current configuration for a model: homl config model <model_name> --get

v0.2.2

15 Aug 07:43

Added option to install OpenWebUI together with HoML server to provide a chat interface.

Install with:

    homl server install --webui

v0.2.0

13 Aug 08:31

We are thrilled to announce the release of HoML v0.2.0, a landmark update focused on dramatically improving model startup times through significant architectural changes and a powerful new feature: Eager Mode.

🚀 Architectural Overhaul for Faster Model Loading

This architectural overhaul provides a massive boost to startup speeds right out of the box.

For example, the startup time for qwen3:0.6b has been cut from 40 seconds to 22 seconds, making it nearly 1.8x faster even without any special flags.

🔥 Introducing Eager Mode: An Extra Gear for Instantaneous Startup

On top of the new architectural baseline, we're introducing Eager Mode, a loading mechanism that prioritizes getting you to your first token even faster.

With Eager Mode, the results are staggering:

  • qwen3:0.6b: Startup time plummets from 22 seconds to a mere 8 seconds.
  • gpt-oss:20b: We've clocked a drop from 38 seconds to just 18 seconds.

CLI Enhancements

To put this power in your hands, we've updated the HoML CLI:

  • New --eager flag for homl run: Manually start any model in Eager Mode for the fastest possible launch.
    homl run qwen3:0.6b --eager
  • Smarter Defaults for a Seamless Experience:
    • The homl chat command now uses Eager Mode by default, letting you start conversations almost instantly.
    • The server also defaults to Eager Mode when automatically switching models, ensuring a smooth and rapid transition between different API requests.

Our Commitment to Speed

We believe that performance is a core feature. This update, with its two-pronged approach of deep architectural improvements and the user-facing Eager Mode, reaffirms our commitment to providing a high-performance, easy-to-use local AI experience.

Upgrade to HoML v0.2.0 today to experience this new era of speed. We're excited for you to try it and welcome your feedback.

    curl -sSL https://homl.dev/install.sh | sh
    homl server install --upgrade

v0.1.4

13 Aug 08:00

Minor CLI fix.

v0.1.3

13 Aug 07:41

The CLI now cleans up module_info_cache_path on reinstall.

v0.1.2

11 Aug 08:47

Patched vLLM.

v0.1.1

11 Aug 05:46

Model loading is 10 seconds faster.

v0.1.0

11 Aug 04:44

HoML is now OpenAI API compatible.

The following APIs are implemented:

    @app.post("/v1/chat/completions") 
    @app.post("/v1/completions")
    @app.post("/v1/responses")
    @app.get("/v1/models")

If there are other endpoints you would like to use, please create an issue.
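Since these endpoints follow the OpenAI API shape, any OpenAI-style client should work against them. As a minimal sketch using only the standard library (the base URL and port below are assumptions, not documented defaults; point them at wherever your HoML server listens):

```python
import json
from urllib import request

BASE_URL = "http://localhost:7456/v1"  # hypothetical local HoML endpoint

def chat_payload(model: str, user_message: str) -> dict:
    """Build a request body for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = chat_payload("qwen3:0.6b", "Hello!")
body = json.dumps(payload).encode()

# Uncomment to actually send the request to a running HoML server:
# req = request.Request(
#     f"{BASE_URL}/chat/completions",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload shape works with the official `openai` Python client by setting its `base_url` to the local server.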