Commit 5a6bc6b

model-conversion : add model card template for embeddings [no ci] (ggml-org#15557)
* model-conversion: add model card template for embeddings [no ci]

  This commit adds a separate model card template (model repository README.md template) for embedding models.

  The motivation for this is that the server command for an embedding model is a little different, and some additional information can be useful in the model card for embedding models which might not be directly relevant for causal models.

* squash! model-conversion: add model card template for embeddings [no ci]

  Fix pyright lint error.

* remove --pooling override and clarify embd_normalize usage
1 parent 6b64f74 commit 5a6bc6b

5 files changed: 97 additions and 17 deletions

examples/model-conversion/Makefile

Lines changed: 9 additions & 0 deletions
```diff
@@ -144,6 +144,15 @@ perplexity-run:
 hf-create-model:
 	@./scripts/utils/hf-create-model.py -m "${MODEL_NAME}" -ns "${NAMESPACE}" -b "${ORIGINAL_BASE_MODEL}"

+hf-create-model-dry-run:
+	@./scripts/utils/hf-create-model.py -m "${MODEL_NAME}" -ns "${NAMESPACE}" -b "${ORIGINAL_BASE_MODEL}" -d
+
+hf-create-model-embedding:
+	@./scripts/utils/hf-create-model.py -m "${MODEL_NAME}" -ns "${NAMESPACE}" -b "${ORIGINAL_BASE_MODEL}" -e
+
+hf-create-model-embedding-dry-run:
+	@./scripts/utils/hf-create-model.py -m "${MODEL_NAME}" -ns "${NAMESPACE}" -b "${ORIGINAL_BASE_MODEL}" -e -d
+
 hf-create-model-private:
 	@./scripts/utils/hf-create-model.py -m "${MODEL_NAME}" -ns "${NAMESPACE}" -b "${ORIGINAL_BASE_MODEL}" -p

```

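As a usage sketch (the model name, namespace, and base model below are placeholders mirroring the README example), the new targets take the same variables as the existing `hf-create-model` target; the `*-dry-run` variants only print the repository info and rendered template without creating anything on Hugging Face:

```console
(venv) $ make hf-create-model-embedding-dry-run MODEL_NAME='TestEmbeddingModel' NAMESPACE="danbev" ORIGINAL_BASE_MODEL="some-base-model"
(venv) $ make hf-create-model-embedding MODEL_NAME='TestEmbeddingModel' NAMESPACE="danbev" ORIGINAL_BASE_MODEL="some-base-model"
```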
examples/model-conversion/README.md

Lines changed: 9 additions & 1 deletion
````diff
@@ -285,13 +285,21 @@ For the following targets a `HF_TOKEN` environment variable is required.
 This will create a new model repsository on Hugging Face with the specified
 model name.
 ```console
-(venv) $ make hf-create-model MODEL_NAME='TestModel' NAMESPACE="danbev"
+(venv) $ make hf-create-model MODEL_NAME='TestModel' NAMESPACE="danbev" ORIGINAL_BASE_MODEL="some-base-model"
 Repository ID: danbev/TestModel-GGUF
 Repository created: https://huggingface.co/danbev/TestModel-GGUF
 ```
 Note that we append a `-GGUF` suffix to the model name to ensure a consistent
 naming convention for GGUF models.

+An embedding model can be created using the following command:
+```console
+(venv) $ make hf-create-model-embedding MODEL_NAME='TestEmbeddingModel' NAMESPACE="danbev" ORIGINAL_BASE_MODEL="some-base-model"
+```
+The only difference is that the model card for an embedding model will be different
+with regards to the llama-server command and also how to access/call the embedding
+endpoint.
+
 ### Upload a GGUF model to model repository
 The following target uploads a model to an existing Hugging Face model repository.
 ```console
````
examples/model-conversion/scripts/causal/modelcard.template

File renamed without changes (renamed from examples/model-conversion/scripts/readme.md.template).

examples/model-conversion/scripts/embedding/modelcard.template

Lines changed: 48 additions & 0 deletions
````diff
@@ -0,0 +1,48 @@
+---
+base_model:
+- {base_model}
+---
+# {model_name} GGUF
+
+Recommended way to run this model:
+
+```sh
+llama-server -hf {namespace}/{model_name}-GGUF
+```
+
+Then the endpoint can be accessed at http://localhost:8080/embedding, for
+example using `curl`:
+```console
+curl --request POST \
+    --url http://localhost:8080/embedding \
+    --header "Content-Type: application/json" \
+    --data '{{"input": "Hello embeddings"}}' \
+    --silent
+```
+
+Alternatively, the `llama-embedding` command line tool can be used:
+```sh
+llama-embedding -hf {namespace}/{model_name}-GGUF --verbose-prompt -p "Hello embeddings"
+```
+
+#### embd_normalize
+When a model uses pooling, or the pooling method is specified using `--pooling`,
+the normalization can be controlled by the `embd_normalize` parameter.
+
+The default value is `2` which means that the embeddings are normalized using
+the Euclidean norm (L2). Other options are:
+* -1 No normalization
+* 0 Max absolute
+* 1 Taxicab
+* 2 Euclidean/L2
+* \>2 P-Norm
+
+This can be passed in the request body to `llama-server`, for example:
+```sh
+--data '{{"input": "Hello embeddings", "embd_normalize": -1}}' \
+```
+
+And for `llama-embedding`, by passing `--embd-normalize <value>`, for example:
+```sh
+llama-embedding -hf {namespace}/{model_name}-GGUF --embd-normalize -1 -p "Hello embeddings"
+```
````

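The double braces in the `--data` lines are presumably escapes for Python string formatting, so they render as single braces in the generated README.md. Putting the two template snippets together, a complete request that disables normalization, assuming `llama-server` is running locally on the default port, would look roughly like this:

```console
curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello embeddings", "embd_normalize": -1}' \
    --silent
```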
examples/model-conversion/scripts/utils/hf-create-model.py

Lines changed: 31 additions & 16 deletions
```diff
@@ -26,38 +26,53 @@ def load_template_and_substitute(template_path, **kwargs):
 parser.add_argument('--org-base-model', '-b', help='Original Base model name', default="")
 parser.add_argument('--no-card', action='store_true', help='Skip creating model card')
 parser.add_argument('--private', '-p', action='store_true', help='Create private model')
+parser.add_argument('--embedding', '-e', action='store_true', help='Use embedding model card template')
+parser.add_argument('--dry-run', '-d', action='store_true', help='Print repository info and template without creating repository')

 args = parser.parse_args()

 repo_id = f"{args.namespace}/{args.model_name}-GGUF"
 print("Repository ID: ", repo_id)

-repo_url = api.create_repo(
-    repo_id=repo_id,
-    repo_type="model",
-    private=args.private,
-    exist_ok=False
-)
+repo_url = None
+if not args.dry_run:
+    repo_url = api.create_repo(
+        repo_id=repo_id,
+        repo_type="model",
+        private=args.private,
+        exist_ok=False
+    )

 if not args.no_card:
-    template_path = "scripts/readme.md.template"
+    if args.embedding:
+        template_path = "scripts/embedding/modelcard.template"
+    else:
+        template_path = "scripts/causal/modelcard.template"
+
+    print("Template path: ", template_path)
+
     model_card_content = load_template_and_substitute(
         template_path,
         model_name=args.model_name,
         namespace=args.namespace,
         base_model=args.org_base_model,
     )

-    if model_card_content:
-        api.upload_file(
-            path_or_fileobj=model_card_content.encode('utf-8'),
-            path_in_repo="README.md",
-            repo_id=repo_id
-        )
-        print("Model card created successfully.")
+    if args.dry_run:
+        print("\nTemplate Content:\n")
+        print(model_card_content)
     else:
-        print("Failed to create model card.")
+        if model_card_content:
+            api.upload_file(
+                path_or_fileobj=model_card_content.encode('utf-8'),
+                path_in_repo="README.md",
+                repo_id=repo_id
+            )
+            print("Model card created successfully.")
+        else:
+            print("Failed to create model card.")

-print(f"Repository created: {repo_url}")
+if not args.dry_run and repo_url:
+    print(f"Repository created: {repo_url}")

```

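As a rough usage sketch of the new flags (run from the `examples/model-conversion` directory, since the template paths are relative; the model name, namespace, and base model are placeholders), combining `-e` and `-d` selects the embedding template and prints it without creating a repository, with output approximately as follows:

```console
(venv) $ ./scripts/utils/hf-create-model.py -m "TestEmbeddingModel" -ns "danbev" -b "some-base-model" -e -d
Repository ID:  danbev/TestEmbeddingModel-GGUF
Template path:  scripts/embedding/modelcard.template

Template Content:

...
```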