
Training Data Collection


Latin-Hypercube Sampling

When collecting training data from scratch, Latin hypercube sampling (LHS) is a smarter approach than random sampling: LHS spreads the sample points more evenly across all possible values. We use the pyDOE library for this; more details can be found in the pyDOE documentation.
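To illustrate the idea, here is a minimal sketch that generates LHS points with pyDOE and scales them to knob ranges. The knob names and ranges below are made up for illustration only; in practice the driver reads them from the LHS knob file.

```python
# Minimal sketch of LHS sampling with pyDOE.
# The knobs and ranges here are hypothetical examples.
from pyDOE import lhs

# Hypothetical knobs with (min, max) tuning ranges.
knob_ranges = {
    "shared_buffers": (134217728, 8589934592),   # bytes
    "work_mem": (4194304, 1073741824),           # bytes
    "effective_io_concurrency": (1, 200),        # integer
}

num_samples = 10
# lhs() returns points in the unit hypercube, one column per knob.
unit_points = lhs(len(knob_ranges), samples=num_samples)

# Scale each column from [0, 1] to the knob's [min, max] range.
samples = []
for point in unit_points:
    config = {}
    for (knob, (lo, hi)), u in zip(knob_ranges.items(), point):
        config[knob] = int(lo + u * (hi - lo))
    samples.append(config)

for config in samples:
    print(config)
```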

OtterTune supports LHS on the client side. To use it, set lhs_knob_path (the path of the LHS knob file) and lhs_save_path (the location where the generated samples are written) in the driver configuration. The LHS knob file lists the knobs you may tune and their tuning ranges. Choosing the tuning range (min/max values) carefully is important: for example, the maximum value of a memory-related knob should not exceed the total memory of your hardware, and we find that using each knob's default value as its minimum works well in most cases. Note that this file is different from the server-side knob fixture; the knob types in this file are integer/float/bytes/time.
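For illustration, a hypothetical LHS knob file might look like the snippet below, written here as a small Python script that emits JSON. The knob names, ranges, and field names are assumptions; check the example knob files shipped with the driver for the exact schema.

```python
# Hypothetical sketch of an LHS knob file; the real schema is defined by
# the OtterTune driver, so treat the field names below as assumptions.
import json

lhs_knobs = {
    "global.shared_buffers": {
        "type": "bytes",
        "min": 134217728,       # default value used as the minimum
        "max": 8589934592,      # must not exceed the machine's total memory
    },
    "global.effective_io_concurrency": {
        "type": "integer",
        "min": 1,
        "max": 200,
    },
}

with open("lhs_knobs.json", "w") as f:
    json.dump(lhs_knobs, f, indent=4)
```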

You can then generate the LHS samples (10 sample points by default):

fab lhs_samples   # 10 samples by default
fab lhs_samples:20   # 20 samples

After the LHS samples have been generated to lhs_save_path, you can run loops over all of them. We recommend using a no-tuning session. The driver will run an experiment for each LHS sample and upload the results to the server.

fab run_lhs
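Conceptually, this task iterates over the generated samples and runs one experiment loop per configuration. The sketch below is only an approximation: the helper functions and the assumption that each sample is stored as a JSON file are hypothetical, not the driver's actual code.

```python
# Conceptual sketch of the run_lhs loop. The helpers below are hypothetical
# stand-ins for the driver's real steps, and the per-sample JSON layout is
# an assumption for illustration.
import glob
import json

def apply_knob_config(knob_config):
    # Placeholder: the real driver would restart the DBMS with these knobs.
    print("applying knobs:", knob_config)

def run_workload():
    # Placeholder: the real driver would run the benchmark and collect metrics.
    return {"throughput": 0.0}

def upload_result(result):
    # Placeholder: the real driver would upload the result to the server
    # for the no-tuning session.
    print("uploading:", result)

def run_lhs_samples(lhs_save_path):
    # One experiment loop per generated LHS sample.
    for sample_file in sorted(glob.glob(lhs_save_path + "/*.json")):
        with open(sample_file) as f:
            knob_config = json.load(f)
        apply_knob_config(knob_config)
        result = run_workload()
        upload_result(result)
```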