
Commit fb56a92

Merge pull request #1907 from ramalama-labs/feat/rlcr-model-store
Reorganize transports and add new rlcr transport option
2 parents: fdc5e69 + 4bdcd4e


48 files changed: +1914 −289 lines

.packit-copr-rpm.sh

Lines changed: 2 additions & 2 deletions

@@ -6,8 +6,8 @@
 
 set -exo pipefail
 
-# Extract version from pyproject.toml instead of setup.py
-VERSION=$(awk -F'[""]' ' /^\s*version\s*/ {print $(NF-1)}' pyproject.toml )
+# Extract version from Python module since pyproject.toml uses dynamic versioning
+VERSION=$(python3 -c "import ramalama.version; print(ramalama.version.version())")
 
 SPEC_FILE=rpm/ramalama.spec
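The hunk above replaces an awk scrape of pyproject.toml with a direct Python call. A minimal sketch of the motivation, using a stub pyproject.toml and a stubbed version call (the real script imports `ramalama.version`, which only exists inside the repo):

```shell
# Hypothetical pyproject.toml using dynamic versioning: there is no
# literal `version = "..."` line for the old awk pattern to match.
cat > /tmp/pyproject.toml <<'EOF'
[project]
name = "ramalama"
dynamic = ["version"]
EOF

# Old approach: yields an empty string on dynamically-versioned projects.
OLD=$(awk -F'[""]' ' /^\s*version\s*/ {print $(NF-1)}' /tmp/pyproject.toml )
echo "awk: '${OLD}'"

# New approach (stubbed here with a literal): ask Python for the version
# directly; the real script runs `import ramalama.version; ...version()`.
VERSION=$(python3 -c 'print("0.12.1")')
echo "python: '${VERSION}'"
```

With dynamic versioning the version only exists at runtime, so asking the Python module is the reliable source.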

docs/ramalama-bench.1.md

Lines changed: 1 addition & 0 deletions

@@ -14,6 +14,7 @@ ramalama\-bench - benchmark specified AI Model
 | HuggingFace | huggingface://, hf://, hf.co/ | [`huggingface.co`](https://www.huggingface.co)|
 | ModelScope | modelscope://, ms:// | [`modelscope.cn`](https://modelscope.cn/)|
 | Ollama | ollama:// | [`ollama.com`](https://www.ollama.com)|
+| rlcr | rlcr:// | [`ramalama.com`](https://registry.ramalama.com/projects/ramalama) |
 | OCI Container Registries | oci:// | [`opencontainers.org`](https://opencontainers.org)|
 |||Examples: [`quay.io`](https://quay.io), [`Docker Hub`](https://docker.io),[`Artifactory`](https://artifactory.com)|
 

docs/ramalama-perplexity.1.md

Lines changed: 1 addition & 0 deletions

@@ -14,6 +14,7 @@ ramalama\-perplexity - calculate the perplexity value of an AI Model
 | HuggingFace | huggingface://, hf://, hf.co/ | [`huggingface.co`](https://www.huggingface.co)|
 | ModelScope | modelscope://, ms:// | [`modelscope.cn`](https://modelscope.cn/)|
 | Ollama | ollama:// | [`ollama.com`](https://www.ollama.com)|
+| rlcr | rlcr:// | [`ramalama.com`](https://registry.ramalama.com/projects/ramalama) |
 | OCI Container Registries | oci:// | [`opencontainers.org`](https://opencontainers.org)|
 |||Examples: [`quay.io`](https://quay.io), [`Docker Hub`](https://docker.io),[`Artifactory`](https://artifactory.com)|
 

docs/ramalama-run.1.md

Lines changed: 1 addition & 0 deletions

@@ -14,6 +14,7 @@ ramalama\-run - run specified AI Model as a chatbot
 | HuggingFace | huggingface://, hf://, hf.co/ | [`huggingface.co`](https://www.huggingface.co)|
 | ModelScope | modelscope://, ms:// | [`modelscope.cn`](https://modelscope.cn/)|
 | Ollama | ollama:// | [`ollama.com`](https://www.ollama.com)|
+| rlcr | rlcr:// | [`ramalama.com`](https://registry.ramalama.com/projects/ramalama) |
 | OCI Container Registries | oci:// | [`opencontainers.org`](https://opencontainers.org)|
 |||Examples: [`quay.io`](https://quay.io), [`Docker Hub`](https://docker.io),[`Artifactory`](https://artifactory.com)|
 

docs/ramalama-serve.1.md

Lines changed: 1 addition & 0 deletions

@@ -19,6 +19,7 @@ registry if it does not exist in local storage.
 | ModelScope | modelscope://, ms:// | [`modelscope.cn`](https://modelscope.cn/)|
 | Ollama | ollama:// | [`ollama.com`](https://www.ollama.com)|
 | OCI Container Registries | oci:// | [`opencontainers.org`](https://opencontainers.org)|
+| rlcr | rlcr:// | [`ramalama.com`](https://registry.ramalama.com/projects/ramalama) |
 |||Examples: [`quay.io`](https://quay.io), [`Docker Hub`](https://docker.io),[`Artifactory`](https://artifactory.com)|
 
 RamaLama defaults to the Ollama registry transport. This default can be overridden in the `ramalama.conf` file or via the RAMALAMA_TRANSPORTS
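The prefix-to-registry mapping in the table can be sketched as a tiny dispatcher. `transport_of` is a hypothetical helper for illustration only; RamaLama's actual resolution happens in its Python model store:

```shell
# transport_of mirrors the prefix table above; hypothetical shell helper,
# not part of RamaLama's CLI.
transport_of() {
  case "$1" in
    huggingface://*|hf://*|hf.co/*) echo huggingface ;;
    modelscope://*|ms://*)          echo modelscope ;;
    ollama://*)                     echo ollama ;;
    rlcr://*)                       echo rlcr ;;
    oci://*)                        echo oci ;;
    # Bare names fall back to the default transport (Ollama), which the
    # docs say can be overridden via RAMALAMA_TRANSPORTS.
    *)  echo "${RAMALAMA_TRANSPORTS:-ollama}" ;;
  esac
}

transport_of rlcr://smollm:135m   # rlcr
transport_of smollm:135m          # ollama (unless RAMALAMA_TRANSPORTS is set)
```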

docs/ramalama.1.md

Lines changed: 20 additions & 19 deletions

@@ -58,6 +58,7 @@ RamaLama supports multiple AI model registries types called transports. Supporte
 | HuggingFace | huggingface://, hf://, hf.co/ | [`huggingface.co`](https://www.huggingface.co)|
 | ModelScope | modelscope://, ms:// | [`modelscope.cn`](https://modelscope.cn/)|
 | Ollama | ollama:// | [`ollama.com`](https://www.ollama.com)|
+| rlcr | rlcr:// | [`ramalama.com`](https://registry.ramalama.com/projects/ramalama) |
 | OCI Container Registries | oci:// | [`opencontainers.org`](https://opencontainers.org)|
 |||Examples: [`quay.io`](https://quay.io), [`Docker Hub`](https://docker.io),[`Artifactory`](https://artifactory.com)|

@@ -135,25 +136,25 @@ The default can be overridden in the ramalama.conf file.
 
 | Command | Description |
 | ------------------------------------------------- | ---------------------------------------------------------- |
-| [ramalama-bench(1)](ramalama-bench.1.md) | benchmark specified AI Model |
-| [ramalama-chat(1)](ramalama-chat.1.md) | OpenAI chat with the specified REST API URL |
-| [ramalama-containers(1)](ramalama-containers.1.md)| list all RamaLama containers |
-| [ramalama-convert(1)](ramalama-convert.1.md) | convert AI Models from local storage to OCI Image |
-| [ramalama-daemon(1)](ramalama-daemon.1.md) | run a RamaLama REST server |
-| [ramalama-info(1)](ramalama-info.1.md) | display RamaLama configuration information |
-| [ramalama-inspect(1)](ramalama-inspect.1.md) | inspect the specified AI Model |
-| [ramalama-list(1)](ramalama-list.1.md) | list all downloaded AI Models |
-| [ramalama-login(1)](ramalama-login.1.md) | login to remote registry |
-| [ramalama-logout(1)](ramalama-logout.1.md) | logout from remote registry |
-| [ramalama-perplexity(1)](ramalama-perplexity.1.md)| calculate the perplexity value of an AI Model |
-| [ramalama-pull(1)](ramalama-pull.1.md) | pull AI Models from Model registries to local storage |
-| [ramalama-push(1)](ramalama-push.1.md) | push AI Models from local storage to remote registries |
-| [ramalama-rag(1)](ramalama-rag.1.md) | generate and convert Retrieval Augmented Generation (RAG) data from provided documents into an OCI Image |
-| [ramalama-rm(1)](ramalama-rm.1.md) | remove AI Models from local storage |
-| [ramalama-run(1)](ramalama-run.1.md) | run specified AI Model as a chatbot |
-| [ramalama-serve(1)](ramalama-serve.1.md) | serve REST API on specified AI Model |
-| [ramalama-stop(1)](ramalama-stop.1.md) | stop named container that is running AI Model |
-| [ramalama-version(1)](ramalama-version.1.md) | display version of RamaLama |
+| [ramalama-bench(1)](ramalama-bench.1.md) |benchmark specified AI Model|
+| [ramalama-chat(1)](ramalama-chat.1.md) |OpenAI chat with the specified REST API URL|
+| [ramalama-containers(1)](ramalama-containers.1.md)|list all RamaLama containers|
+| [ramalama-convert(1)](ramalama-convert.1.md) |convert AI Models from local storage to OCI Image|
+| [ramalama-daemon(1)](ramalama-daemon.1.md) |run a RamaLama REST server|
+| [ramalama-info(1)](ramalama-info.1.md) |display RamaLama configuration information|
+| [ramalama-inspect(1)](ramalama-inspect.1.md) |inspect the specified AI Model|
+| [ramalama-list(1)](ramalama-list.1.md) |list all downloaded AI Models|
+| [ramalama-login(1)](ramalama-login.1.md) |login to remote registry|
+| [ramalama-logout(1)](ramalama-logout.1.md) |logout from remote registry|
+| [ramalama-perplexity(1)](ramalama-perplexity.1.md)|calculate the perplexity value of an AI Model|
+| [ramalama-pull(1)](ramalama-pull.1.md) |pull AI Models from Model registries to local storage|
+| [ramalama-push(1)](ramalama-push.1.md) |push AI Models from local storage to remote registries|
+| [ramalama-rag(1)](ramalama-rag.1.md) |generate and convert Retrieval Augmented Generation (RAG) data from provided documents into an OCI Image|
+| [ramalama-rm(1)](ramalama-rm.1.md) |remove AI Models from local storage|
+| [ramalama-run(1)](ramalama-run.1.md) |run specified AI Model as a chatbot|
+| [ramalama-serve(1)](ramalama-serve.1.md) |serve REST API on specified AI Model|
+| [ramalama-stop(1)](ramalama-stop.1.md) |stop named container that is running AI Model|
+| [ramalama-version(1)](ramalama-version.1.md) |display version of RamaLama|
 
 ## CONFIGURATION FILES
 

docsite/docs/commands/ramalama/bench.mdx

Lines changed: 10 additions & 1 deletion

@@ -18,6 +18,7 @@ description: benchmark specified AI Model
 | HuggingFace | huggingface://, hf://, hf.co/ | [`huggingface.co`](https://www.huggingface.co)|
 | ModelScope | modelscope://, ms:// | [`modelscope.cn`](https://modelscope.cn/)|
 | Ollama | ollama:// | [`ollama.com`](https://www.ollama.com)|
+| rlcr | rlcr:// | [`ramalama.com`](https://registry.ramalama.com/projects/ramalama) |
 | OCI Container Registries | oci:// | [`opencontainers.org`](https://opencontainers.org)|
 |||Examples: [`quay.io`](https://quay.io), [`Docker Hub`](https://docker.io),[`Artifactory`](https://artifactory.com)|

@@ -40,6 +41,11 @@ write, and m for mknod(2).
 
 Example: --device=/dev/dri/renderD128:/dev/xvdc:rwm
 
+The device specification is passed directly to the underlying container engine. See the documentation of the supported container engine for more information.
+
+Pass '--device=none' to explicitly add no device to the container, e.g. for
+running a CPU-only performance comparison.
+
 #### **--env**=
 
 Set environment variables inside of the container.

@@ -57,7 +63,7 @@ OCI container image to run with specified AI model. RamaLama defaults to using
 images based on the accelerator it discovers. For example:
 `quay.io/ramalama/ramalama`. See the table below for all default images.
 The default image tag is based on the minor version of the RamaLama package.
-Version 0.11.1 of RamaLama pulls an image with a `:0.11` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
+Version 0.12.1 of RamaLama pulls an image with a `:0.12` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
 
 The default can be overridden in the ramalama.conf file or via the
 RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells

@@ -139,6 +145,9 @@ llama.cpp explains this as:
 
 Usage: Lower numbers are good for virtual assistants where we need deterministic responses. Higher numbers are good for roleplay or creative tasks like editing stories
 
+#### **--thinking**=*true*
+Enable or disable thinking mode in reasoning models
+
 #### **--threads**, **-t**
 Maximum number of cpu threads to use.
 The default is to use half the cores available on this system for the number of threads.
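The image-tag rule in the `--image` hunk (version 0.12.1 pulls a `:0.12` tag) is major.minor truncation of the package version; a minimal sketch:

```shell
# Derive the default image tag from the RamaLama version, as described
# in the docs: the tag is the version with the patch component stripped.
version=0.12.1
tag=${version%.*}                      # 0.12.1 -> 0.12
image="quay.io/ramalama/ramalama:${tag}"
echo "$image"                          # quay.io/ramalama/ramalama:0.12
```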

docsite/docs/commands/ramalama/chat.mdx

Lines changed: 6 additions & 0 deletions

@@ -56,6 +56,12 @@ $ ramalama chat
 Communicate with an alternative OpenAI REST API URL. With Docker containers.
 $ ramalama chat --url http://localhost:1234
 🐋 >
+
+Send multiple lines at once
+$ ramalama chat
+🦭 > Hi \
+🦭 > tell me a funny story \
+🦭 > please
 ```
 
 ## See Also
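The multi-line chat example above relies on a trailing backslash to continue the prompt onto the next line. A rough sketch of that accumulation logic, as a hypothetical shell reader (RamaLama's chat client implements this in Python, not shell):

```shell
# read_multiline joins input lines while each ends in a backslash,
# mirroring the `🦭 > Hi \` continuation in the chat example above.
read_multiline() {
  local buf="" line
  while IFS= read -r line; do
    case "$line" in
      *\\) buf="${buf}${line%\\}" ;;   # strip the backslash, keep reading
      *)   buf="${buf}${line}"; break ;;
    esac
  done
  printf '%s\n' "$buf"
}

printf 'Hi \\\ntell me a funny story \\\nplease\n' | read_multiline
# -> Hi tell me a funny story please
```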
docsite/docs/commands/ramalama/daemon.mdx

Lines changed: 83 additions & 0 deletions

@@ -0,0 +1,83 @@
+---
+title: ramalama daemon.1
+description: run a RamaLama REST server
+# This file is auto-generated from manpages. Do not edit manually.
+# Source: ramalama-daemon.1.md
+---
+
+# ramalama daemon.1
+
+## Synopsis
+**ramalama daemon** [*options*] [start|run]
+
+## Description
+Run a RamaLama REST server, either inside a RamaLama
+container or on the host.
+
+## Options
+
+#### **--help**, **-h**
+Print usage message
+
+## COMMANDS
+
+#### **start**
+prepares to run a new RamaLama REST server so it will be run either inside a RamaLama container or on the host
+
+#### **run**
+start a new RamaLama REST server
+
+## Examples
+
+Inspect the smollm:135m model for basic information
+```bash
+$ ramalama inspect smollm:135m
+smollm:135m
+   Path: /var/lib/ramalama/models/ollama/smollm:135m
+   Registry: ollama
+   Format: GGUF
+   Version: 3
+   Endianness: little
+   Metadata: 39 entries
+   Tensors: 272 entries
+```
+
+Inspect the smollm:135m model for all information in json format
+```bash
+$ ramalama inspect smollm:135m --all --json
+{
+    "Name": "smollm:135m",
+    "Path": "/home/mengel/.local/share/ramalama/models/ollama/smollm:135m",
+    "Registry": "ollama",
+    "Format": "GGUF",
+    "Version": 3,
+    "LittleEndian": true,
+    "Metadata": {
+        "general.architecture": "llama",
+        "general.base_model.0.name": "SmolLM 135M",
+        "general.base_model.0.organization": "HuggingFaceTB",
+        "general.base_model.0.repo_url": "https://huggingface.co/HuggingFaceTB/SmolLM-135M",
+        ...
+    },
+    "Tensors": [
+        {
+            "dimensions": [
+                576,
+                49152
+            ],
+            "n_dimensions": 2,
+            "name": "token_embd.weight",
+            "offset": 0,
+            "type": 8
+        },
+        ...
+    ]
+}
+```
+
+## See Also
+[ramalama(1)](/docs/commands/ramalama/)
+
+---
+
+*Feb 2025, Originally compiled by Michael Engel <mengel@redhat.com>*

docsite/docs/commands/ramalama/inspect.mdx

Lines changed: 42 additions & 1 deletion

@@ -18,7 +18,14 @@ like the repository, its metadata and tensor information.
 
 #### **--all**
 Print all available information about the AI Model.
-By default, only a basic subset is printed.
+By default, only a basic subset is printed.
+
+#### **--get**=*field*
+Print the value of a specific metadata field of the AI Model.
+This option supports autocomplete with the available metadata
+fields of the given model.
+The special value `all` will print all available metadata
+fields and values.
 
 #### **--help**, **-h**
 Print usage message

@@ -74,6 +81,40 @@ $ ramalama inspect smollm:135m --all --json
 }
 ```
 
+Use the autocomplete function of `--get` to view a list of fields:
+```bash
+$ ramalama inspect smollm:135m --get general.
+general.architecture               general.languages
+general.base_model.0.name          general.license
+general.base_model.0.organization  general.name
+general.base_model.0.repo_url      general.organization
+general.base_model.count           general.quantization_version
+general.basename                   general.size_label
+general.datasets                   general.tags
+general.file_type                  general.type
+general.finetune
+```
+
+Print the value of a specific field of the smollm:135m model:
+```bash
+$ ramalama inspect smollm:135m --get tokenizer.chat_template
+{% for message in messages %}{{'<|im_start|>' + message['role'] + '
+' + message['content'] + '<|im_end|>' + '
+'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
+' }}{% endif %}
+```
+
+Print all key-value pairs of the metadata of the smollm:135m model:
+```bash
+$ ramalama inspect smollm:135m --get all
+general.architecture: llama
+general.base_model.0.name: SmolLM 135M
+general.base_model.0.organization: HuggingFaceTB
+general.base_model.0.repo_url: https://huggingface.co/HuggingFaceTB/SmolLM-135M
+general.base_model.count: 1
+...
+```
+
 ## See Also
 [ramalama(1)](/docs/commands/ramalama/)
 
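The `--get all` output above is essentially the `Metadata` object from the `--all --json` view flattened into `key: value` lines. A sketch of that flattening, using a stub JSON file abbreviated from the example (not RamaLama's actual implementation):

```shell
# Stub of the --all --json output, abbreviated to two metadata entries.
cat > /tmp/inspect.json <<'EOF'
{
  "Name": "smollm:135m",
  "Metadata": {
    "general.architecture": "llama",
    "general.base_model.0.name": "SmolLM 135M"
  }
}
EOF

# Flatten the Metadata object into `key: value` lines, the same shape
# that `ramalama inspect --get all` prints.
python3 - <<'EOF'
import json

with open("/tmp/inspect.json") as f:
    meta = json.load(f)["Metadata"]
for key, value in meta.items():
    print(f"{key}: {value}")
EOF
```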
