
Commit 1adeeb3

Use custom dataset as response source (#200)
* Show config in yaml
* Load or download response dataset
* Init dataset when sim starts and show downloading speed of url
* Fix tests and init dataset when loading sim
* Move dataset init to startSim
* Change db structure and add test cases
* Fix test
* Remove duplicates in request.go
* Move token generation to simulator
* Generate tokens instead of strings
* Move dataset.go to common
* Refactor: abstract dataset and move response generation from common to dataset
* Fix dataset tests
* Add tests for custom dataset
* Fix custom dataset test case
* Remove unnecessary config
* Add cli arg of dataset path and url, also update readme
* Return random from dataset if prompt hash does not hit
* Respect maxTokens
* Resolve conflicts and fix test case
* Update readme
* Remove unnecessary log
* Ignore test temp folder
* Update README
* Flat config
* Use ctx in main
* Update readme and dataset downloading logic
* Pass logger when init dataset
* Improve progress logging, show it every 5 seconds or 10%
* Use in memory database
* Use backup api to load dataset from disk to memory
* Remove duplicated log of Server starting
* Use klog
* Update readme

Signed-off-by: Qifan Deng <[email protected]>
1 parent b8eb7a4 commit 1adeeb3

26 files changed: +1415 additions, −475 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -7,3 +7,7 @@ vendor
 .DS_Store
 *.test
 manifests/dev-config.yaml
+pkg/dataset/.llm-d
+pkg/llm-d-inference-sim/tests-tmp/
+pkg/llm-d-inference-sim/.llm-d/
+.llm-d/

Makefile

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ format: ## Format Go source files
 test: $(GINKGO) download-tokenizer download-zmq ## Run tests
 	@printf "\033[33;1m==== Running tests ====\033[0m\n"
 ifdef GINKGO_FOCUS
-	CGO_ENABLED=1 $(GINKGO) -ldflags="$(GO_LDFLAGS)" -v -r --focus="$(GINKGO_FOCUS)"
+	CGO_ENABLED=1 ginkgo -ldflags="$(GO_LDFLAGS)" -v -r -- -ginkgo.v -ginkgo.focus="$(GINKGO_FOCUS)"
 else
 	CGO_ENABLED=1 $(GINKGO) -ldflags="$(GO_LDFLAGS)" -v -r
 endif

README.md

Lines changed: 14 additions & 1 deletion
@@ -149,7 +149,20 @@ For more details see the <a href="https://docs.vllm.ai/en/stable/getting_started
 {"running-requests":10,"waiting-requests":30,"kv-cache-usage":0.4,"loras":[{"running":"lora4,lora2","waiting":"lora3","timestamp":1257894567},{"running":"lora4,lora3","waiting":"","timestamp":1257894569}]}
 ---
 - `data-parallel-size`: number of ranks to run in Data Parallel deployment, from 1 to 8, default is 1. The ports will be assigned as follows: rank 0 will run on the configured `port`, rank 1 on `port`+1, etc.
-
+---
+- `dataset-path`: Optional local file path to the SQLite database file used for generating responses from a dataset.
+  - If not set, hardcoded preset responses will be used.
+  - If set but the file does not exist, the `dataset-url` will be used to download the database to the path specified by `dataset-path`.
+  - If the file exists but is currently occupied by another process, responses will be randomly generated from preset text (the same behavior as if the path were not set).
+  - Responses are retrieved from the dataset by the hash of the conversation history, with a fallback to a random dataset response, constrained by the maximum output tokens and EoS token handling, if no matching history is found.
+  - Refer to [llm-d converted ShareGPT](https://huggingface.co/datasets/hf07397/inference-sim-datasets/blob/0b60737c2dd2c570f486cef2efa7971b02e3efde/README.md) for detailed information on the expected format of the SQLite database file.
+- `dataset-url`: Optional URL for downloading the SQLite database file used for response generation.
+  - This parameter is only used if `dataset-path` is also set and the file does not exist at that path.
+  - If the file needs to be downloaded, it will be saved to the location specified by `dataset-path`.
+  - If the file already exists at the `dataset-path`, it will not be downloaded again.
+  - Example URL: `https://huggingface.co/datasets/hf07397/inference-sim-datasets/resolve/91ffa7aafdfd6b3b1af228a517edc1e8f22cd274/huggingface/ShareGPT_Vicuna_unfiltered/conversations.sqlite3`
+- `dataset-in-memory`: If true, the entire dataset will be loaded into memory for faster access. This may require significant memory depending on the size of the dataset. Default is false.
+---
 In addition, as we are using klog, the following parameters are available:
 - `add_dir_header`: if true, adds the file directory to the header of the log messages
 - `alsologtostderr`: log to standard error as well as files (no effect when -logtostderr=true)
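The README describes the lookup behavior declaratively: responses are keyed by a hash of the conversation history, with a random dataset response as the fallback. As an illustration only, here is a minimal Go sketch of that hash-then-fallback pattern; the function names, the in-memory map standing in for the SQLite table, and the hashing scheme are all assumptions of this sketch, not the simulator's actual API or schema (see the linked dataset README for the real format).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
)

// responsesByHash is a hypothetical in-memory stand-in for the
// SQLite table that maps a conversation-history hash to responses.
var responsesByHash = map[string][]string{}

// hashHistory hashes a serialized conversation history. The real
// serialization and hash are defined by the dataset's schema.
func hashHistory(history []string) string {
	h := sha256.New()
	for _, m := range history {
		h.Write([]byte(m))
		h.Write([]byte{0}) // separator so ["ab"] != ["a","b"]
	}
	return hex.EncodeToString(h.Sum(nil))
}

// lookupResponse mimics the documented behavior: return a response
// matching the history's hash, or fall back to a random dataset response.
func lookupResponse(history []string, all []string) string {
	if rs, ok := responsesByHash[hashHistory(history)]; ok && len(rs) > 0 {
		return rs[rand.Intn(len(rs))]
	}
	return all[rand.Intn(len(all))]
}

func main() {
	all := []string{"fallback A", "fallback B"}
	hist := []string{"user: hi"}
	responsesByHash[hashHistory(hist)] = []string{"hello!"}
	fmt.Println(lookupResponse(hist, all)) // hash hit: prints "hello!"
	fmt.Println(lookupResponse([]string{"unseen"}, all)) // miss: random fallback
}
```

Note that this sketch omits the max-output-tokens and EoS-token constraints the README mentions; those are applied on top of the retrieved text.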

go.mod

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ require (
 	github.com/json-iterator/go v1.1.12 // indirect
 	github.com/klauspost/compress v1.18.0 // indirect
 	github.com/mailru/easyjson v0.7.7 // indirect
+	github.com/mattn/go-sqlite3 v1.14.32 // direct
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
 	github.com/modern-go/reflect2 v1.0.2 // indirect
 	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect

go.sum

Lines changed: 2 additions & 0 deletions
@@ -72,6 +72,8 @@ github.com/llm-d/llm-d-kv-cache-manager v0.3.0-rc1 h1:SDLiNrcreDcA9m9wfXAumFARDH
 github.com/llm-d/llm-d-kv-cache-manager v0.3.0-rc1/go.mod h1:tN80/D0Faf6pE2ocwFgTNoCxKPsqdsa2XnjQUqOaZ8Q=
 github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0=
 github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
+github.com/mattn/go-sqlite3 v1.14.32 h1:JD12Ag3oLy1zQA+BNn74xRgaBbdhbNIDYvQUEuuErjs=
+github.com/mattn/go-sqlite3 v1.14.32/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
 github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
 github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg=
 github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=

pkg/common/config.go

Lines changed: 24 additions & 0 deletions
@@ -181,6 +181,22 @@ type Configuration struct {
 	SSLKeyFile string `yaml:"ssl-keyfile" json:"ssl-keyfile"`
 	// SelfSignedCerts enables automatic generation of self-signed certificates for HTTPS
 	SelfSignedCerts bool `yaml:"self-signed-certs" json:"self-signed-certs"`
+
+	// DatasetPath is an optional local file path to the SQLite database file used for generating responses from a dataset.
+	// - If not set, hardcoded preset responses will be used.
+	// - If set but the file does not exist, the `dataset-url` will be used to download the database to the path specified by `dataset-path`.
+	// - If the file exists but is currently occupied by another process, responses will be randomly generated from preset text (the same behavior as if the path were not set).
+	// - Responses are retrieved from the dataset by the hash of the conversation history, with a fallback to a random dataset response, constrained by the maximum output tokens and EoS token handling, if no matching history is found.
+	// - Refer to [llm-d converted ShareGPT](https://huggingface.co/datasets/hf07397/inference-sim-datasets/blob/0b7ac1a4daf0aace1556326964bd75633372299e/README.md) for detailed information on the expected format of the SQLite database file.
+	DatasetPath string `yaml:"dataset-path" json:"dataset-path"`
+	// DatasetURL is an optional URL for downloading the SQLite database file used for response generation.
+	// - This parameter is only used if `dataset-path` is also set and the file does not exist at that path.
+	// - If the file needs to be downloaded, it will be saved to the location specified by `dataset-path`.
+	// - If the file already exists at the `dataset-path`, it will not be downloaded again.
+	// - Example URL: `https://huggingface.co/datasets/hf07397/inference-sim-datasets/resolve/91ffa7aafdfd6b3b1af228a517edc1e8f22cd274/huggingface/ShareGPT_Vicuna_unfiltered/conversations.sqlite3`
+	DatasetURL string `yaml:"dataset-url" json:"dataset-url"`
+	// DatasetInMemory defines whether to load the entire dataset into memory for faster access.
+	DatasetInMemory bool `yaml:"dataset-in-memory" json:"dataset-in-memory"`
 }

@@ -485,6 +501,10 @@ func (c *Configuration) validate() error {
 		return errors.New("cannot use both self-signed-certs and explicit ssl-certfile/ssl-keyfile")
 	}

+	if c.DatasetPath == "" && c.DatasetURL != "" {
+		return errors.New("dataset-path is required when dataset-url is set")
+	}
+
 	return nil
 }

@@ -564,6 +584,10 @@ func ParseCommandParamsAndLoadConfig() (*Configuration, error) {
 	f.IntVar(&config.EventBatchSize, "event-batch-size", config.EventBatchSize, "Maximum number of kv-cache events to be sent together")
 	f.IntVar(&config.DPSize, "data-parallel-size", config.DPSize, "Number of ranks to run")

+	f.StringVar(&config.DatasetPath, "dataset-path", config.DatasetPath, "Local path to the sqlite db file for response generation from a dataset")
+	f.StringVar(&config.DatasetURL, "dataset-url", config.DatasetURL, "URL to download the sqlite db file for response generation from a dataset")
+	f.BoolVar(&config.DatasetInMemory, "dataset-in-memory", config.DatasetInMemory, "Load the entire dataset into memory for faster access")
+
 	f.IntVar(&config.FailureInjectionRate, "failure-injection-rate", config.FailureInjectionRate, "Probability (0-100) of injecting failures")
 	failureTypes := getParamValueFromArgs("failure-types")
 	var dummyFailureTypes multiString
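The new validation rule in this hunk says `dataset-url` is only meaningful when `dataset-path` is also set. A self-contained sketch of just that rule (the `config` struct and `validateDataset` function here are stand-ins for the repository's unexported `Configuration.validate`, not its real API):

```go
package main

import (
	"errors"
	"fmt"
)

// config is a minimal stand-in for the relevant fields of the
// simulator's Configuration struct.
type config struct {
	DatasetPath string
	DatasetURL  string
}

// validateDataset reproduces the check added in this commit:
// a URL without a local path to download to is a configuration error.
func validateDataset(c config) error {
	if c.DatasetPath == "" && c.DatasetURL != "" {
		return errors.New("dataset-path is required when dataset-url is set")
	}
	return nil
}

func main() {
	// URL alone: rejected.
	fmt.Println(validateDataset(config{DatasetURL: "https://example.com/db.sqlite3"}))
	// Path plus URL: accepted (the URL is only used if the file is absent).
	fmt.Println(validateDataset(config{DatasetPath: "/tmp/db.sqlite3", DatasetURL: "https://example.com/db.sqlite3"}))
}
```

Note the asymmetry: a path without a URL is fine (the file is simply expected to exist), while a URL without a path is rejected because there is nowhere to save the download.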

pkg/common/utils.go

Lines changed: 0 additions & 263 deletions
@@ -17,82 +17,13 @@ limitations under the License.
 package common
 
 import (
-	"fmt"
-	"math"
 	"math/rand"
 	"regexp"
-	"strings"
 	"sync"
 
 	"github.com/google/uuid"
 )
 
-const (
-	ResponseLenMax              = 128
-	responseLenMean             = 40
-	responseLenStddev           = 20
-	stopFinishReasonProbability = 0.8
-
-	StopFinishReason         = "stop"
-	LengthFinishReason       = "length"
-	ToolsFinishReason        = "tool_calls"
-	RemoteDecodeFinishReason = "remote_decode"
-)
-
-// this array defines the probabilities for the buckets to be used for the generation of number of tokens in response
-var respLenBucketsProbabilities = [...]float64{0.2, 0.3, 0.2, 0.05, 0.1, 0.15}
-var cumulativeBucketsProbabilities []float64
-
-const (
-	flexBucketIndex    = 3
-	maxFixedBucketSize = 20
-)
-
-// list of responses to use in random mode for comepltion requests
-var chatCompletionFakeResponses = []string{
-	`Testing@, #testing 1$ ,2%,3^, [4&*5], 6~, 7-_ + (8 : 9) / \ < > .`,
-	`Testing, testing 1,2,3.`,
-	`I am fine, how are you today?`,
-	`I am your AI assistant, how can I help you today?`,
-	`Today is a nice sunny day.`,
-	`The temperature here is twenty-five degrees centigrade.`,
-	`Today it is partially cloudy and raining.`,
-	`To be or not to be that is the question.`,
-	`Alas, poor Yorick! I knew him, Horatio: A fellow of infinite jest`,
-	`The rest is silence. `,
-	`Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime`,
-}
-
-func init() {
-	cumulativeBucketsProbabilities = make([]float64, len(respLenBucketsProbabilities))
-	sum := 0.0
-
-	for i, val := range respLenBucketsProbabilities {
-		sum += val
-		cumulativeBucketsProbabilities[i] = sum
-	}
-}
-
-// returns the max tokens or error if incorrect
-func GetMaxTokens(maxCompletionTokens *int64, maxTokens *int64) (*int64, error) {
-	var typeToken string
-	var tokens *int64
-	// if both arguments are passed,
-	// use maxCompletionTokens
-	// as in the real vllm
-	if maxCompletionTokens != nil {
-		tokens = maxCompletionTokens
-		typeToken = "max_completion_tokens"
-	} else if maxTokens != nil {
-		tokens = maxTokens
-		typeToken = "max_tokens"
-	}
-	if tokens != nil && *tokens < 1 {
-		return nil, fmt.Errorf("%s must be at least 1, got %d", typeToken, *tokens)
-	}
-	return tokens, nil
-}
-
 // ValidateContextWindow checks if the request fits within the model's context window
 // Returns validation result, actual completion tokens, and total tokens
 func ValidateContextWindow(promptTokens int, maxCompletionTokens *int64, maxModelLen int) (bool, int64, int64) {
@@ -107,200 +38,6 @@ func ValidateContextWindow(promptTokens int, maxCompletionTokens *int64, maxMode
 	return isValid, completionTokens, totalTokens
 }
 
-// GetRandomResponseLen returns int in range [1, responseLenMax]
-// numbers are chosen according a gaussian distribution with mean responseLenMean, and standard deviation responseLenStddev
-func GetRandomResponseLen() int {
-	for {
-		val := rand.NormFloat64()*responseLenStddev + responseLenMean
-		if val >= 1 && val <= ResponseLenMax {
-			return int(math.Round(val))
-		}
-		// else reject and resample
-	}
-}
-
-// GetRandomFinishReason returns finish reason with the probability for 'stop' as defined by stopFinishReasonProbability
-func GetRandomFinishReason() string {
-	if rand.Float64() < stopFinishReasonProbability {
-		return StopFinishReason
-	}
-	return LengthFinishReason
-}
-
-// GetRandomText generates random text for the required number of tokens,
-// select randomly a sentence from chatCompletionFakeResponses,
-// if number of tokens is lower than required - select another sentence,
-// continue until the required number of tokens is achieved
-func GetRandomText(numOfTokens int) string {
-	allTokens := make([]string, 0)
-
-	for len(allTokens) < numOfTokens {
-		index := RandomInt(0, len(chatCompletionFakeResponses)-1)
-		// create tokens from text, splitting by spaces and special characters
-		tokens := Tokenize(chatCompletionFakeResponses[index])
-		remaining := numOfTokens - len(allTokens)
-
-		if len(tokens) > remaining {
-			// there is too many tokens, append only the relevant part
-			tokens = tokens[:remaining]
-		}
-
-		if len(allTokens) > 0 {
-			// for not first sentences add space to the first token to separate between sentences without adding an additional token
-			tokens[0] = " " + tokens[0]
-		}
-
-		allTokens = append(allTokens, tokens...)
-	}
-
-	// return all tokens as text
-	return strings.Join(allTokens, "")
-}
-
-// GetRandomResponseText generates text to be returned in a response, and the finish reason (stop or length)
-// if maxCompletionTokens is defined
-// - currently, the generated number of words in the text will be equal to it value
-// - in future - need to find statistics about generated tokens distribution and return less tokens in part os requests
-// - finish reason will be chosen randomly from the collection (stop, length) with 80% for stop and 20% for length
-// if maxCompletionTokens is nil
-// - the response text's length is randomly chosen from the range [1, responseLenMax] according additional parameters
-// - finish reason is stop
-// if ignore_eos is true - the response will be generated with exactly maxCompletionTokens tokens
-// - request was validated so that when ignore_eos is true, maxCompletionTokens must be defined
-func GetRandomResponseText(maxCompletionTokens *int64, ignore_eos bool) (string, string) {
-	numOfTokens := 0
-	finishReason := StopFinishReason
-
-	// no max completion tokens, return text with random length
-	if maxCompletionTokens == nil {
-		numOfTokens = GetRandomResponseLen()
-	} else {
-		maxTokens := int(*maxCompletionTokens)
-		if ignore_eos {
-			numOfTokens = maxTokens
-			finishReason = LengthFinishReason
-		} else {
-			// max tokens is defined - generate real length of the response based on it
-			numOfTokens = getResponseLengthByHistogram(maxTokens)
-			if numOfTokens == maxTokens {
-				// if response should be create with maximum number of tokens - finish reason will be 'length'
-				finishReason = LengthFinishReason
-			}
-		}
-	}
-
-	text := GetRandomText(numOfTokens)
-	return text, finishReason
-}
-
-// getResponseLengthByHistogram calculates the number of tokens to be returned in a response based on the max tokens value and the pre-defined buckets.
-// The response length is distributed according to the probabilities, defined in respLenBucketsProbabilities.
-// The histogram contains equally sized buckets and the last special bucket, which contains only the maxTokens value.
-// The last element of respLenBucketsProbabilities defines the probability of a reposnse with maxToken tokens.
-// Other values define probabilities for the equally sized buckets.
-// If maxToken is small (smaller than number of buckets) - the response length is randomly selected from the range [1, maxTokens]
-func getResponseLengthByHistogram(maxTokens int) int {
-	if maxTokens <= 1 {
-		return maxTokens
-	}
-	// maxTokens is small - no need to use the histogram of probabilities, just select a random value in the range [1, maxTokens]
-	if maxTokens <= len(cumulativeBucketsProbabilities) {
-		res := RandomInt(1, maxTokens)
-		return res
-	}
-
-	r := RandomFloat(0, 1)
-
-	// check if r is in the last bucket, then maxTokens should be returned
-	if r > cumulativeBucketsProbabilities[len(cumulativeBucketsProbabilities)-2] {
-		return maxTokens
-	}
-
-	// determine which bucket to use, the bucket with a cumulative probability larger than r is the bucket to use
-	// initialize bucketIndex with the last bucket to handle the case (which should not happen) when the probabilities sum is less than 1
-	bucketIndex := len(cumulativeBucketsProbabilities) - 1
-	for i, c := range cumulativeBucketsProbabilities {
-		if r <= c {
-			bucketIndex = i
-			break
-		}
-	}
-
-	// calculate the size of all of the buckets (except the special last bucket)
-	start, end := calcBucketBoundaries(maxTokens, bucketIndex)
-
-	// pick uniformly within the bucket’s range
-	return RandomInt(start, end)
-}
-
-// calcBucketBoundaries calculates boundaries of a bucket with the given index.
-// Maximum size for equally sized buckets is defined by maxFixedBucketSize.
-// [maxFixedBucketSize*(number-of-buckets-1)+1] is the value of maxTokens for which
-// division to equally size buckets will give buckets with size maxFixedBucketSize.
-// If maxTokens is [maxFixedBucketSize*(number-of-buckets-1)+1] or less,
-// all buckets will be of equal size, except the last bucket, which contains only one value.
-// If maxTokens is higher than [maxFixedBucketSize*(number-of-buckets-1)+1],
-// and flexBucketIndex is valid (between 0 and number of buckets - 1) the buckets sizes will not be equal.
-// In this case, all buckets except the one at flexBucketIndex index will have size 20 (and the last is with size 1),
-// and the bucket at flexBucketIndex index will 'stretch' to cover the remaining range.
-func calcBucketBoundaries(maxTokens int, bucketIndex int) (start int, end int) {
-	maxEquallyBucketsSz := maxFixedBucketSize*(len(cumulativeBucketsProbabilities)-1) + 1
-
-	if maxTokens <= maxEquallyBucketsSz || flexBucketIndex < 0 || flexBucketIndex >= len(cumulativeBucketsProbabilities)-1 {
-		// create equally size buckets
-		// calculate the size of all of the buckets (except the special last bucket)
-		bucketSize := float64(maxTokens-1) / float64(len(cumulativeBucketsProbabilities)-1)
-		start = int(bucketSize*float64(bucketIndex)) + 1
-		end = int(bucketSize * float64(bucketIndex+1))
-	} else {
-		// create non-equally sized buckets and find boundaries of the required bucket
-		if bucketIndex < flexBucketIndex {
-			// the relevant bucket is before the flex bucket, all buckets are of the same size (maxFixedBucketSize)
-			// start is the minimum number in the required bucket
-			start = maxFixedBucketSize*bucketIndex + 1
-			end = maxFixedBucketSize * (bucketIndex + 1)
-		} else {
-			flexBucketSize := maxTokens - (maxFixedBucketSize * (len(cumulativeBucketsProbabilities) - 2))
-
-			if bucketIndex == flexBucketIndex {
-				// the relevant bucket is the flex bucket
-				start = int(maxFixedBucketSize*float64(bucketIndex)) + 1
-				end = maxFixedBucketSize*bucketIndex + flexBucketSize
-			} else {
-				// the relevant bucket is one of buckets after the flex bucket
-				start = int(maxFixedBucketSize*float64(bucketIndex-1)) + flexBucketSize + 1
-				end = maxFixedBucketSize*bucketIndex + flexBucketSize
-			}
-		}
-	}
-
-	// sometimes end could be maxTokens because of rounding, change the value to maxToken-1
-	if end >= maxTokens {
-		end = maxTokens - 1
-	}
-
-	return start, end
-}
-
-// GetResponseText returns response text, from a given text
-// considering max completion tokens if it is not nil, and a finish reason (stop or length)
-func GetResponseText(maxCompletionTokens *int64, text string) (string, string) {
-	// no max completion tokens, return entire text
-	if maxCompletionTokens == nil {
-		return text, StopFinishReason
-	}
-
-	// create tokens from text, splitting by spaces
-	tokens := Tokenize(text)
-
-	// return entire text
-	if *maxCompletionTokens >= int64(len(tokens)) {
-		return text, StopFinishReason
-	}
-	// return truncated text
-	return strings.Join(tokens[0:*maxCompletionTokens], " "), LengthFinishReason
-}
-
 func RandomNumericString(length int) string {
 	digits := "0123456789"
 	result := make([]byte, length)
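Per the commit message, the response-generation logic deleted above was moved from `common` into the `dataset` package rather than dropped. As a standalone illustration of the technique it implements, here is a compact sketch of histogram-based response-length sampling: a cumulative distribution over equally sized buckets plus a final bucket that maps to exactly `maxTokens`. It uses the same bucket probabilities as the removed code, but the simplified `sampleLen` below is this sketch's own (it omits the removed code's flex-bucket refinement).

```go
package main

import (
	"fmt"
	"math/rand"
)

// Bucket probabilities from the removed code; the last entry is the
// probability of a response with exactly maxTokens tokens.
var probs = []float64{0.2, 0.3, 0.2, 0.05, 0.1, 0.15}
var cum []float64

func init() {
	// Precompute the cumulative distribution, as the removed init() did.
	sum := 0.0
	for _, p := range probs {
		sum += p
		cum = append(cum, sum)
	}
}

// sampleLen picks a response length in [1, maxTokens]: small maxTokens is
// sampled uniformly; otherwise a random draw selects a bucket, the last
// bucket yields exactly maxTokens, and the rest cover [1, maxTokens-1]
// in equal slices.
func sampleLen(maxTokens int) int {
	if maxTokens <= 1 {
		return maxTokens
	}
	if maxTokens <= len(cum) {
		return 1 + rand.Intn(maxTokens) // too few values for the histogram
	}
	r := rand.Float64()
	if r > cum[len(cum)-2] {
		return maxTokens // landed in the special last bucket
	}
	bucket := len(cum) - 1 // fallback if probabilities sum below 1
	for i, c := range cum {
		if r <= c {
			bucket = i
			break
		}
	}
	size := float64(maxTokens-1) / float64(len(cum)-1)
	start := int(size*float64(bucket)) + 1
	end := int(size * float64(bucket+1))
	if end >= maxTokens {
		end = maxTokens - 1 // guard against rounding up to maxTokens
	}
	if end < start {
		end = start
	}
	return start + rand.Intn(end-start+1)
}

func main() {
	for _, m := range []int{1, 4, 100} {
		fmt.Println(sampleLen(m))
	}
}
```

The design point this preserves: capping a response at `maxTokens` with a fixed probability (here 0.15) lets the simulator emit the `length` finish reason at a controlled rate, while the other buckets spread shorter responses across the allowed range.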
