Commit a349327

remove --pooling override and clarify embd_normalize usage
1 parent 88e257c commit a349327

1 file changed, 19 additions and 7 deletions

examples/model-conversion/scripts/embedding/modelcard.template

Lines changed: 19 additions & 7 deletions
@@ -7,7 +7,7 @@ base_model:
 Recommended way to run this model:
 
 ```sh
-llama-server -hf {namespace}/{model_name}-GGUF --embedding --pooling none
+llama-server -hf {namespace}/{model_name}-GGUF
 ```
 
 Then the endpoint can be accessed at http://localhost:8080/embedding, for
@@ -16,21 +16,33 @@ example using `curl`:
 curl --request POST \
 --url http://localhost:8080/embedding \
 --header "Content-Type: application/json" \
---data '{{"input": "Hello embeddings", "embd_normalize": -1}}' \
+--data '{{"input": "Hello embeddings"}}' \
 --silent
 ```
 
-Alternatively, the `llama-embedding`command line tool can be used:
+Alternatively, the `llama-embedding` command line tool can be used:
 ```sh
-llama-embedding -hf {namespace}/{model_name}-GGUF --pooling none --embd-normalize 2 --verbose-prompt -p "Hello embeddings"
+llama-embedding -hf {namespace}/{model_name}-GGUF --verbose-prompt -p "Hello embeddings"
 ```
 
 #### embd_normalize
-When a pooling method is specified the normalization can be controlled by the
-`embd_normalize` parameter. The default value is `2` which means that the
-embeddings are normalized using the Euclidean norm (L2). Other options are:
+When a model uses pooling, or the pooling method is specified using `--pooling`,
+the normalization can be controlled by the `embd_normalize` parameter.
+
+The default value is `2` which means that the embeddings are normalized using
+the Euclidean norm (L2). Other options are:
 * -1 No normalization
 * 0 Max absolute
 * 1 Taxicab
 * 2 Euclidean/L2
 * \>2 P-Norm
+
+This can be passed in the request body to `llama-server`, for example:
+```sh
+--data '{{"input": "Hello embeddings", "embd_normalize": -1}}' \
+```
+
+And for `llama-embedding`, by passing `--embd-normalize <value>`, for example:
+```sh
+llama-embedding -hf {namespace}/{model_name}-GGUF --embd-normalize -1 -p "Hello embeddings"
+```
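
To make the `embd_normalize` options listed in the diff concrete, here is a minimal Python sketch of the textbook definition of each mode. It is an illustration only: the function name `normalize` is hypothetical, and llama.cpp's own implementation may scale values differently (for example in the max-absolute mode), so the results may not match the tool's exact output.

```python
def normalize(vec, mode=2):
    """Textbook version of the listed modes (assumed, not llama.cpp's code).

    mode -1: no normalization
    mode  0: divide by the largest absolute value
    mode  1: divide by the taxicab (L1) norm
    mode  2: divide by the Euclidean (L2) norm
    mode >2: divide by the general p-norm with p = mode
    """
    if mode < -1:
        raise ValueError("mode must be -1 or greater")
    if mode == -1:
        return list(vec)
    if mode == 0:
        scale = max((abs(x) for x in vec), default=0.0)
    else:
        # Covers taxicab (p=1), Euclidean (p=2) and higher p-norms.
        scale = sum(abs(x) ** mode for x in vec) ** (1.0 / mode)
    if scale == 0:
        # An all-zero (or empty) vector has no direction to preserve.
        return [0.0 for _ in vec]
    return [x / scale for x in vec]


if __name__ == "__main__":
    sample = [3.0, 4.0]
    for mode in (-1, 0, 1, 2, 3):
        print(mode, normalize(sample, mode))
```

With the default mode `2`, the sample vector `[3.0, 4.0]` has Euclidean length 5, so the result is approximately `[0.6, 0.8]`; with `-1` the vector comes back unchanged, which is what the removed `"embd_normalize": -1` request field asked for.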
