Training Data Collection
When collecting training data from scratch, Latin hypercube sampling (LHS) is a better choice than random sampling, because LHS spreads the sample points more evenly across the possible values of each knob. We use the pyDOE library; see its documentation for more details.
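To illustrate why LHS covers the space more evenly, here is a minimal standard-library sketch of the idea. It is not OtterTune's code and does not use pyDOE; the function name `latin_hypercube` is hypothetical. Each dimension is cut into as many equal-width bins as there are samples, one point is drawn per bin, and the bins are shuffled independently per dimension.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims with one point per bin in every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # Draw one value uniformly inside each of the n_samples equal-width bins.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that bins are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-dimension columns into a list of points.
    return [list(point) for point in zip(*columns)]


# 10 sample points over 2 dimensions (for example, 2 knobs).
samples = latin_hypercube(10, 2, seed=0)
for point in samples:
    print(point)
```

With plain random sampling, several points can fall into the same bin of a knob's range while other bins stay empty; the construction above rules that out by design.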
OtterTune supports LHS on the client side. Specify lhs_knob_path and lhs_save_path in the driver configuration: lhs_knob_path is the path to the LHS knob file, and lhs_save_path is where the LHS output is saved. The LHS knob file lists the knobs you may tune and their tuning ranges (min/max values). Choosing the tuning range carefully is important; for example, the maximum value of a memory-related knob should not exceed the total memory of your hardware. We also find that using a knob's default value as its minimum works in most cases. Note that this file is different from the server-side knob fixture. The knob type in this file is one of integer, float, bytes, or time.
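As a rough illustration of how the min/max tuning ranges are used, the sketch below maps points from the unit hypercube onto knob ranges. The dictionary layout, the knob names, and the function `scale_to_knobs` are all hypothetical and are not the actual format of the LHS knob file. The sketch also assumes that every type other than float is rounded to a whole number.

```python
def scale_to_knobs(unit_samples, knobs):
    """Map points in [0, 1]^d onto the [min, max] range of each knob."""
    configs = []
    for point in unit_samples:
        config = {}
        for u, knob in zip(point, knobs):
            # Linear interpolation between the knob's minimum and maximum.
            value = knob["min"] + u * (knob["max"] - knob["min"])
            if knob["type"] != "float":
                # Assumption: non-float knobs take whole-number values.
                value = int(round(value))
            config[knob["name"]] = value
        configs.append(config)
    return configs


# Hypothetical knobs; the names and ranges are for illustration only.
knobs = [
    {"name": "example_buffer_mb", "type": "integer", "min": 128, "max": 4096},
    {"name": "example_cost_factor", "type": "float", "min": 1.0, "max": 4.0},
]
unit_samples = [[0.0, 0.5], [0.5, 0.0], [1.0, 1.0]]
configs = scale_to_knobs(unit_samples, knobs)
for config in configs:
    print(config)
```

Because every sample is interpolated between min and max, a maximum that exceeds the available memory would produce configurations the machine cannot run, which is why the range matters.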
You can then generate the LHS samples. By default, 10 sample points are generated.
fab lhs_samples # 10 samples by default
fab lhs_samples:20 # 20 samples
After generating the LHS samples to lhs_save_path, you can run loops over all the samples in lhs_save_path. We recommend using a no-tuning session. Each LHS sample is run as an experiment, and its results are uploaded to the server.
fab run_lhs