Commit 719a895

Update logic of response generation in case of dataset usage (#257)
* - change logic of selection of response from the dataset, description will be added to readme
  - move all sqlite related code in custom dataset to helper class
  - update dataset tests accordingly
  - remove duplicated definitions
  - open sqlite db in read-only mode, which allows supporting datasets in DP mode
  - do not allow dataset definition in echo mode

  Signed-off-by: Maya Barnea <[email protected]>

* - Support DP and dataset
  - Separate loading a custom dataset from a local file from dataset download
  - If the custom dataset's url is defined in the command line, download the dataset before initializing simulator(s)

  Signed-off-by: Maya Barnea <[email protected]>

* Add explanation about response generation to readme

  Signed-off-by: Maya Barnea <[email protected]>

* BaseDataset renamed to DefaultDataset, readme updated, more changes by PR comments

  Signed-off-by: Maya Barnea <[email protected]>

* add explanation for getTokensInEchoMode

  Signed-off-by: Maya Barnea <[email protected]>

* Fixed typos in readme, fixes in GetTokens of the custom dataset, tests added for GetTokens to improve coverage

  Signed-off-by: Maya Barnea <[email protected]>

* fix lint

  Signed-off-by: Maya Barnea <[email protected]>

* test fixes according to PR comments

  Signed-off-by: Maya Barnea <[email protected]>

* fix lint issues

  Signed-off-by: Maya Barnea <[email protected]>

---------

Signed-off-by: Maya Barnea <[email protected]>
1 parent ff4c9ea commit 719a895

18 files changed: +1253 -846 lines changed

README.md

Lines changed: 54 additions & 0 deletions
@@ -362,3 +362,57 @@ curl -X POST http://localhost:8000/v1/chat/completions \
]
}'
```

## Response generation

The `/v1/completions` and `/v1/chat/completions` endpoints produce responses based on the simulator configuration and the parameters of the specific request.

### Echo mode
In `echo` mode, responses always mirror the request content.
For `/v1/completions`, the `prompt` field is returned.
For `/v1/chat/completions`, the last message is returned.

The parameters `max_tokens`, `max_completions_tokens` and `ignore_eos` are ignored in this mode.

### Random mode
In `random` mode, the fields `max_tokens`, `max_completions_tokens` and `ignore_eos` from the request are used during response generation.

#### Use predefined texts for response generation
The simulator can generate responses from a predefined list of sentences.
If `max_tokens` or `max_completions_tokens` is specified, the response length is calculated using a histogram with six buckets and the following probabilities: 20%, 30%, 20%, 5%, 10%, 15%.
For a maximum length ≤ 120, bucket sizes are equal.
For a maximum length > 120, all buckets except the fourth are of size 20;
the fourth bucket covers the remaining range.
After the buckets are set, the response length is sampled according to these probabilities.


Examples: <br>
max-len = 120: the buckets are 1-20, 21-40, 41-60, 61-80, 81-100, 101-120. <br>
max-len = 200: the buckets are 1-20, 21-40, 41-60, 61-160, 161-180, 181-200. <br>

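The bucket layout can be sketched in Go as follows. This is an illustration only, not the simulator's code: the names `bucket`, `buildBuckets`, `sampleLength` and `newRand` are hypothetical, the sketch assumes a maximum length of at least 6, and it assumes the length is drawn uniformly inside the chosen bucket, which the text above does not specify.

```go
package main

import (
	"fmt"
	"math/rand"
)

// bucketProbs holds the six bucket probabilities listed in the text.
var bucketProbs = [6]float64{0.20, 0.30, 0.20, 0.05, 0.10, 0.15}

// bucket is an inclusive range of response lengths, in tokens.
type bucket struct{ lo, hi int }

// buildBuckets splits 1..maxLen into six buckets.
// Assumption: maxLen >= 6; for maxLen <= 120 any remainder of maxLen/6
// is absorbed by the last bucket.
func buildBuckets(maxLen int) [6]bucket {
	var b [6]bucket
	if maxLen <= 120 {
		size := maxLen / 6
		for i := range b {
			b[i] = bucket{lo: i*size + 1, hi: (i + 1) * size}
		}
		b[5].hi = maxLen
		return b
	}
	// Above 120: every bucket has size 20 except the fourth,
	// which covers whatever range is left in the middle.
	b[0] = bucket{1, 20}
	b[1] = bucket{21, 40}
	b[2] = bucket{41, 60}
	b[3] = bucket{61, maxLen - 40}
	b[4] = bucket{maxLen - 39, maxLen - 20}
	b[5] = bucket{maxLen - 19, maxLen}
	return b
}

// newRand returns a seeded generator so that runs are reproducible.
func newRand(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// sampleLength picks a bucket according to bucketProbs and then
// (assumption) a length uniformly inside that bucket.
func sampleLength(rng *rand.Rand, maxLen int) int {
	buckets := buildBuckets(maxLen)
	r := rng.Float64()
	idx := len(buckets) - 1 // fallback guards against rounding in the sum
	acc := 0.0
	for i, p := range bucketProbs {
		acc += p
		if r < acc {
			idx = i
			break
		}
	}
	b := buckets[idx]
	return b.lo + rng.Intn(b.hi-b.lo+1)
}

func main() {
	fmt.Println(buildBuckets(120))
	fmt.Println(buildBuckets(200))
	fmt.Println(sampleLength(newRand(1), 200))
}
```

For 120 and 200 the computed ranges match the two examples above.
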
If the maximum response length is not specified, it defaults to `<model length>-<input-length>`.
In this case, the response length is sampled from a Gaussian distribution with mean 40 and standard deviation 20.


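A minimal Go sketch of the Gaussian case is shown below. The function names are hypothetical, and clamping the sample to the range 1..maximum is an assumption of this sketch; the text above does not say how out-of-range samples are handled.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

const (
	meanLen   = 40.0 // mean of the Gaussian, from the text
	stddevLen = 20.0 // standard deviation, from the text
)

// defaultMaxLen is the fallback maximum: model length minus input length.
func defaultMaxLen(modelLen, inputLen int) int { return modelLen - inputLen }

// newRand returns a seeded generator so that runs are reproducible.
func newRand(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// gaussianLength draws a length from N(40, 20^2).
// Assumption: the result is clamped to [1, maxLen].
func gaussianLength(rng *rand.Rand, maxLen int) int {
	n := int(math.Round(rng.NormFloat64()*stddevLen + meanLen))
	if n < 1 {
		n = 1
	}
	if n > maxLen {
		n = maxLen
	}
	return n
}

func main() {
	maxLen := defaultMaxLen(1024, 100)
	fmt.Println(maxLen, gaussianLength(newRand(1), maxLen))
}
```
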
After determining the response length:

A random sentence from the predefined list is chosen and trimmed if it exceeds the required length.
If the sentence is shorter, additional random sentences are concatenated until the required token count is met.

If `ignore_eos` is true, the response always reaches the maximum allowed length.

The `finish_reason` is set to `LENGTH` if the response length equals the maximum; otherwise, it is set to `STOP`.


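The steps above can be sketched in Go as follows. This is a simplified illustration under stated assumptions: tokens are approximated by whitespace-separated words, the `sentences` list is a placeholder rather than the simulator's actual list, and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// sentences is a placeholder list; the simulator's real list is not shown here.
var sentences = []string{
	"Today is a nice sunny day.",
	"How can I help you today?",
	"Testing, testing one two three.",
}

// newRand returns a seeded generator so that runs are reproducible.
func newRand(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// targetLength applies ignore_eos: when set, the response always
// reaches the maximum allowed length.
func targetLength(sampled, maxLen int, ignoreEOS bool) int {
	if ignoreEOS {
		return maxLen
	}
	return sampled
}

// buildResponse concatenates random sentences until at least `length`
// tokens are collected, then trims to exactly `length`.
// Assumption: a token is a whitespace-separated word.
func buildResponse(rng *rand.Rand, length int) []string {
	if length <= 0 {
		return nil
	}
	tokens := make([]string, 0, length)
	for len(tokens) < length {
		s := sentences[rng.Intn(len(sentences))]
		tokens = append(tokens, strings.Fields(s)...)
	}
	return tokens[:length]
}

// finishReason mirrors the rule in the text, using its wording for the values.
func finishReason(respLen, maxLen int) string {
	if respLen == maxLen {
		return "LENGTH"
	}
	return "STOP"
}

func main() {
	rng := newRand(1)
	maxLen := 12
	n := targetLength(7, maxLen, false)
	resp := buildResponse(rng, n)
	fmt.Println(strings.Join(resp, " "), finishReason(len(resp), maxLen))
}
```
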
#### Use responses dataset for response generation
If `dataset-url` is set on the command line, the dataset is downloaded to the location specified by `dataset-path`.

If a valid dataset exists at `dataset-path`, it is used for response selection.
The request prompt is hashed, and this value is matched against dataset entries.
If all matches are longer than the maximum response length, a random match is selected and then trimmed.

If `ignore_eos` is true and no match meets the required length, the response is completed with random tokens from the predefined list.

If the prompt hash is not present in the dataset, a random response of length ≤ maximum is selected;
if all responses are longer, a random response is chosen and trimmed.

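The selection rules can be sketched in Go as below. This is not the simulator's implementation: the dataset is modelled as an in-memory map instead of the SQLite database mentioned in the commit message, SHA-256 is an assumed hash function, and picking a random match among those that fit is an assumption for the matched case. The `ignore_eos` completion step is left out. All type and function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
)

// response is a tokenized dataset response.
type response []string

// dataset is an in-memory stand-in for the real storage.
type dataset struct {
	byHash map[string][]response // prompt hash -> candidate responses
	all    []response            // every response, used when no hash matches
}

// hashPrompt hashes the prompt; SHA-256 is an assumption of this sketch.
func hashPrompt(prompt string) string {
	sum := sha256.Sum256([]byte(prompt))
	return hex.EncodeToString(sum[:])
}

// newRand returns a seeded generator so that runs are reproducible.
func newRand(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// pick returns a random candidate of length <= maxLen; if every
// candidate is longer, a random one is chosen and trimmed to maxLen.
func pick(rng *rand.Rand, candidates []response, maxLen int) response {
	if len(candidates) == 0 || maxLen <= 0 {
		return nil
	}
	var fitting []response
	for _, c := range candidates {
		if len(c) <= maxLen {
			fitting = append(fitting, c)
		}
	}
	if len(fitting) > 0 {
		return fitting[rng.Intn(len(fitting))]
	}
	c := candidates[rng.Intn(len(candidates))]
	return c[:maxLen]
}

// selectResponse uses the matches for the prompt hash when there are any,
// and falls back to the whole dataset otherwise.
func (d *dataset) selectResponse(rng *rand.Rand, prompt string, maxLen int) response {
	if matches := d.byHash[hashPrompt(prompt)]; len(matches) > 0 {
		return pick(rng, matches, maxLen)
	}
	return pick(rng, d.all, maxLen)
}

func main() {
	long := response{"a", "b", "c", "d", "e"}
	short := response{"x", "y"}
	d := &dataset{
		byHash: map[string][]response{hashPrompt("hello"): {long}},
		all:    []response{long, short},
	}
	rng := newRand(1)
	fmt.Println(d.selectResponse(rng, "hello", 3)) // only match is too long, so it is trimmed
	fmt.Println(d.selectResponse(rng, "other", 3)) // no match, a fitting response is used
}
```
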

pkg/common/config.go

Lines changed: 4 additions & 0 deletions
@@ -660,6 +660,10 @@ func (c *Configuration) validate() error {
 		return errors.New("dataset-path is required when dataset-url is set")
 	}

+	if c.Mode == ModeEcho && (c.DatasetPath != "" || c.DatasetURL != "") {
+		return errors.New("dataset cannot be defined in echo mode")
+	}
+
 	return nil
 }

pkg/common/config_test.go

Lines changed: 6 additions & 0 deletions
@@ -532,6 +532,12 @@ var _ = Describe("Simulator configuration", func() {
 				"--config", "../../manifests/config.yaml"},
 			expectedError: "fake metrics request-max-generation-tokens cannot contain negative values",
 		},
+		{
+			name: "invalid echo mode with dataset",
+			args: []string{"random", "--model", "test", "--dataset-path", "my/path",
+				"--mode", "echo"},
+			expectedError: "dataset cannot be defined in echo mode",
+		},
 	}

 	for _, test := range invalidTests {
