* Show config in yaml
* load or download response dataset
* Init dataset when sim starts and show downloading speed of url
* Fix tests and init dataset when loading sim
* Move dataset init to startSim
* Change db structure and add test cases
* fix test
* remove duplicates in request.go
* Move token generation to simulator
* Generate tokens instead of strings
* Move dataset.go to common
* Refactor: abstract dataset and move response generation from common to dataset
* fix dataset tests
* add tests for custom dataset
* fix custom dataset test case
* Remove unnecessary config
* Add cli arg of dataset path and url, also update readme
* Return random from dataset if prompt hash does not hit
* Respect maxTokens
* Resolve conflicts and fix test case
* Update readme
* Remove unnecessary log
* Ignore test temp folder
* Update README
* flat config
* Use ctx in main
* Update readme and dataset downloading logic
* Pass logger when init dataset
* Improve progress logging, show it every 5 seconds or 10%
* Use in memory database
* Use backup api to load dataset from disk to memory
* Remove duplicated log of Server starting
* use klog
* update readme

---------

Signed-off-by: Qifan Deng <[email protected]>
- `data-parallel-size`: number of ranks to run in a Data Parallel deployment, from 1 to 8, default is 1. The ports are assigned as follows: rank 0 runs on the configured `port`, rank 1 on `port`+1, etc.
- `dataset-path`: optional local file path to the SQLite database file used for generating responses from a dataset.
  - If not set, hardcoded preset responses will be used.
  - If set but the file does not exist, `dataset-url` will be used to download the database to the path specified by `dataset-path`.
  - If the file exists but is currently in use by another process, responses will be randomly generated from preset text (the same behavior as if the path were not set).
  - Responses are retrieved from the dataset by the hash of the conversation history; if no matching history is found, a random dataset response is used instead. In both cases the response is constrained by the maximum output tokens and EoS token handling.
  - Refer to [llm-d converted ShareGPT](https://huggingface.co/datasets/hf07397/inference-sim-datasets/blob/0b60737c2dd2c570f486cef2efa7971b02e3efde/README.md) for detailed information on the expected format of the SQLite database file.
- `dataset-url`: optional URL for downloading the SQLite database file used for response generation.
  - This parameter is only used if `dataset-path` is also set and no file exists at that path.
  - If the file needs to be downloaded, it is saved to the location specified by `dataset-path`.
  - If the file already exists at `dataset-path`, it is not downloaded again.
  - Example URL: `https://huggingface.co/datasets/hf07397/inference-sim-datasets/resolve/91ffa7aafdfd6b3b1af228a517edc1e8f22cd274/huggingface/ShareGPT_Vicuna_unfiltered/conversations.sqlite3`
- `dataset-in-memory`: if true, the entire dataset is loaded into memory for faster access. This may require significant memory depending on the size of the dataset. Default is false.
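The hash-and-fallback lookup described above can be sketched as follows. This is an illustrative sketch only: `conversationKey`, `lookupResponse`, and the in-memory map standing in for the SQLite database are assumed names, not the simulator's actual API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
	"strings"
)

// conversationKey hashes the full conversation history so that identical
// histories always map to the same stored response. (Assumption: the real
// simulator defines its own hashing scheme.)
func conversationKey(messages []string) string {
	h := sha256.Sum256([]byte(strings.Join(messages, "\n")))
	return hex.EncodeToString(h[:])
}

// lookupResponse returns the stored response for the given history, or a
// random response from the fallback pool when the hash does not hit.
func lookupResponse(dataset map[string]string, fallback []string, messages []string) string {
	if resp, ok := dataset[conversationKey(messages)]; ok {
		return resp
	}
	return fallback[rand.Intn(len(fallback))]
}

func main() {
	history := []string{"user: hello"}
	dataset := map[string]string{conversationKey(history): "hi there"}
	fallback := []string{"generic answer"}

	fmt.Println(lookupResponse(dataset, fallback, history))       // exact hit: "hi there"
	fmt.Println(lookupResponse(dataset, fallback, []string{"x"})) // miss: falls back to "generic answer"
}
```

Keying on a hash of the whole history (rather than the last message) means a response is only replayed when the entire conversation matches, which is what makes the random fallback necessary.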
In addition, as we are using klog, the following parameters are available:
- `add_dir_header`: if true, adds the file directory to the header of the log messages
- `alsologtostderr`: log to standard error as well as files (no effect when -logtostderr=true)
// DatasetPath Optional local file path to the SQLite database file used for generating responses from a dataset.
// - If not set, hardcoded preset responses will be used.
// - If set but the file does not exist, the `dataset-url` will be used to download the database to the path specified by `dataset-path`.
// - If the file exists but is currently in use by another process, responses will be randomly generated from preset text (the same behavior as if the path were not set).
// - Responses are retrieved from the dataset by the hash of the conversation history, with a fallback to a random dataset response, constrained by the maximum output tokens and EoS token handling, if no matching history is found.
// - Refer to [llm-d converted ShareGPT](https://huggingface.co/datasets/hf07397/inference-sim-datasets/blob/0b7ac1a4daf0aace1556326964bd75633372299e/README.md) for detailed information on the expected format of the SQLite database file.
// DatasetURL Optional URL for downloading the SQLite database file used for response generation.
// - This parameter is only used if the `dataset-path` is also set and the file does not exist at that path.
// - If the file needs to be downloaded, it will be saved to the location specified by `dataset-path`.
// - If the file already exists at the `dataset-path`, it will not be downloaded again.
// - Example URL: `https://huggingface.co/datasets/hf07397/inference-sim-datasets/resolve/91ffa7aafdfd6b3b1af228a517edc1e8f22cd274/huggingface/ShareGPT_Vicuna_unfiltered/conversations.sqlite3`
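The fields documented above might hang off a configuration struct along these lines. This is a sketch under assumptions: the struct shape, the yaml tags, and the `needsDownload` helper are illustrative, not the simulator's actual definitions.

```go
package main

import "fmt"

// Config holds the dataset-related simulator options documented above.
// Field names mirror the doc comments; tags and layout are assumed.
type Config struct {
	DatasetPath     string `yaml:"dataset-path"`      // local SQLite file; empty means preset responses
	DatasetURL      string `yaml:"dataset-url"`       // used only when DatasetPath is set and the file is missing
	DatasetInMemory bool   `yaml:"dataset-in-memory"` // load the whole dataset into memory
}

// needsDownload reports whether the dataset should be fetched from
// DatasetURL, following the rules in the comments above: both the path
// and the URL must be set, and the file must not already exist.
func (c Config) needsDownload(fileExists bool) bool {
	return c.DatasetPath != "" && c.DatasetURL != "" && !fileExists
}

func main() {
	c := Config{
		DatasetPath: "/tmp/conversations.sqlite3",
		DatasetURL:  "https://example.com/conversations.sqlite3",
	}
	fmt.Println(c.needsDownload(false)) // true: path and URL set, file missing
	fmt.Println(c.needsDownload(true))  // false: file already on disk
}
```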
```go
				// if the response should be created with the maximum number of tokens, the finish reason will be 'length'
				finishReason = LengthFinishReason
			}
		}
	}

	text := GetRandomText(numOfTokens)
	return text, finishReason
}

// getResponseLengthByHistogram calculates the number of tokens to be returned in a response,
// based on the max tokens value and the pre-defined buckets.
// The response length is distributed according to the probabilities defined in respLenBucketsProbabilities.
// The histogram contains equally sized buckets and a last, special bucket, which contains only the maxTokens value.
// The last element of respLenBucketsProbabilities defines the probability of a response with maxTokens tokens.
// The other values define the probabilities for the equally sized buckets.
// If maxTokens is small (smaller than the number of buckets), the response length is randomly selected from the range [1, maxTokens].
func getResponseLengthByHistogram(maxTokens int) int {
	if maxTokens <= 1 {
		return maxTokens
	}
	// maxTokens is small - no need to use the histogram of probabilities, just select a random value in the range [1, maxTokens]
```