
Commit 2a0dd4c

Merge pull request freqtrade#11558 from viotemp1/optuna
switch hyperopt from scikit-optimize to Optuna
2 parents: befc41a + 575c381

File tree: 15 files changed, +354 −249 lines

docs/advanced-hyperopt.md

Lines changed: 32 additions & 35 deletions

@@ -161,56 +161,53 @@ class MyAwesomeStrategy(IStrategy):
 
 ### Overriding Base estimator
 
-You can define your own estimator for Hyperopt by implementing `generate_estimator()` in the Hyperopt subclass.
+You can define your own optuna sampler for Hyperopt by implementing `generate_estimator()` in the Hyperopt subclass.
 
 ```python
 class MyAwesomeStrategy(IStrategy):
     class HyperOpt:
         def generate_estimator(dimensions: List['Dimension'], **kwargs):
-            return "RF"
+            return "NSGAIIISampler"
 ```
 
-Possible values are either one of "GP", "RF", "ET", "GBRT" (details can be found in the [scikit-optimize documentation](https://scikit-optimize.github.io/)), or an instance of a class that inherits from `RegressorMixin` (from sklearn) and whose `predict` method has an optional `return_std` argument, which returns `std(Y | x)` along with `E[Y | x]`.
+Possible values are either one of "NSGAIISampler", "TPESampler", "GPSampler", "CmaEsSampler", "NSGAIIISampler", "QMCSampler" (details can be found in the [optuna samplers documentation](https://optuna.readthedocs.io/en/stable/reference/samplers/index.html)), or an instance of a class that inherits from `optuna.samplers.BaseSampler`.
 
-Some research will be necessary to find additional Regressors.
+Some research will be necessary to find additional Samplers (from optunahub, for example).
 
-Example for `ExtraTreesRegressor` ("ET") with additional parameters:
+!!! Note
+    While custom estimators can be provided, it's up to you as the user to research possible parameters and analyze / understand which ones should be used.
+    If you're unsure about this, best use one of the defaults (`"NSGAIIISampler"` has proven to be the most versatile) without further parameters.
 
-```python
-class MyAwesomeStrategy(IStrategy):
-    class HyperOpt:
-        def generate_estimator(dimensions: List['Dimension'], **kwargs):
-            from skopt.learning import ExtraTreesRegressor
-            # Corresponds to "ET" - but allows additional parameters.
-            return ExtraTreesRegressor(n_estimators=100)
-```
+??? Example "Using `AutoSampler` from Optunahub"
+    [AutoSampler docs](https://hub.optuna.org/samplers/auto_sampler/)
 
-The `dimensions` parameter is the list of `skopt.space.Dimension` objects corresponding to the parameters to be optimized. It can be used to create isotropic kernels for the `skopt.learning.GaussianProcessRegressor` estimator. Here's an example:
+    Install the necessary dependencies
+
+    ``` bash
+    pip install optunahub cmaes torch scipy
+    ```
+
+    Implement `generate_estimator()` in your strategy
 
-```python
-class MyAwesomeStrategy(IStrategy):
-    class HyperOpt:
-        def generate_estimator(dimensions: List['Dimension'], **kwargs):
-            from skopt.utils import cook_estimator
-            from skopt.learning.gaussian_process.kernels import (Matern, ConstantKernel)
-            kernel_bounds = (0.0001, 10000)
-            kernel = (
-                ConstantKernel(1.0, kernel_bounds) *
-                Matern(length_scale=np.ones(len(dimensions)), length_scale_bounds=[kernel_bounds for d in dimensions], nu=2.5)
-            )
-            kernel += (
-                ConstantKernel(1.0, kernel_bounds) *
-                Matern(length_scale=np.ones(len(dimensions)), length_scale_bounds=[kernel_bounds for d in dimensions], nu=1.5)
-            )
-            return cook_estimator("GP", space=dimensions, kernel=kernel, n_restarts_optimizer=2)
-```
+    ``` python
+    # ...
+    from freqtrade.strategy.interface import IStrategy
+    from typing import List
+    import optunahub
+    # ...
+
+    class my_strategy(IStrategy):
+        class HyperOpt:
+            def generate_estimator(dimensions: List["Dimension"], **kwargs):
+                if "random_state" in kwargs.keys():
+                    return optunahub.load_module("samplers/auto_sampler").AutoSampler(seed=kwargs["random_state"])
+                else:
+                    return optunahub.load_module("samplers/auto_sampler").AutoSampler()
+    ```
+
+    Obviously the same approach will work for all other Samplers optuna supports.
 
-!!! Note
-    While custom estimators can be provided, it's up to you as User to do research on possible parameters and analyze / understand which ones should be used.
-    If you're unsure about this, best use one of the Defaults (`"ET"` has proven to be the most versatile) without further parameters.
 
 ## Space options
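As a companion to the docs change above: `generate_estimator()` can also return a pre-configured sampler instance rather than a name string. A minimal sketch, assuming the stock `optuna.samplers` API and that Hyperopt forwards its `random_state` through `**kwargs` as in the `AutoSampler` example:

```python
# Sketch (not part of this commit): return a configured sampler instance
# instead of a name string. Parameter choices are illustrative only.
from typing import List

import optuna

from freqtrade.strategy.interface import IStrategy


class MyAwesomeStrategy(IStrategy):
    class HyperOpt:
        def generate_estimator(dimensions: List["Dimension"], **kwargs):
            # Forward Hyperopt's random_state (if present) for reproducible runs.
            seed = kwargs.get("random_state")
            # multivariate=True lets TPE model parameter interactions jointly.
            return optuna.samplers.TPESampler(seed=seed, multivariate=True)
```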
docs/faq.md

Lines changed: 1 addition & 4 deletions

@@ -219,10 +219,7 @@ On Windows, the `--logfile` option is also supported by Freqtrade and you can us
 First of all, most indicator libraries don't have GPU support - as such, there would be little benefit for indicator calculations.
 The GPU improvements would only apply to pandas-native calculations - or ones written by yourself.
 
-For hyperopt, freqtrade is using scikit-optimize, which is built on top of scikit-learn.
-Their statement about GPU support is [pretty clear](https://scikit-learn.org/stable/faq.html#will-you-add-gpu-support).
-
-GPU's also are only good at crunching numbers (floating point operations).
+GPUs are only good at crunching numbers (floating point operations).
 For hyperopt, we need both number-crunching (find next parameters) and running python code (running backtesting).
 As such, GPUs are not too well suited for most parts of hyperopt.
docs/hyperopt.md

Lines changed: 2 additions & 2 deletions

@@ -1,10 +1,10 @@
 # Hyperopt
 
 This page explains how to tune your strategy by finding the optimal
-parameters, a process called hyperparameter optimization. The bot uses algorithms included in the `scikit-optimize` package to accomplish this.
+parameters, a process called hyperparameter optimization. The bot uses algorithms included in the `optuna` package to accomplish this.
 The search will burn all your CPU cores, make your laptop sound like a fighter jet and still take a long time.
 
-In general, the search for best parameters starts with a few random combinations (see [below](#reproducible-results) for more details) and then uses Bayesian search with a ML regressor algorithm (currently ExtraTreesRegressor) to quickly find a combination of parameters in the search hyperspace that minimizes the value of the [loss function](#loss-functions).
+In general, the search for best parameters starts with a few random combinations (see [below](#reproducible-results) for more details) and then uses one of optuna's sampler algorithms (currently NSGAIIISampler) to quickly find a combination of parameters in the search hyperspace that minimizes the value of the [loss function](#loss-functions).
 
 Hyperopt requires historic data to be available, just as backtesting does (hyperopt runs backtesting many times with different parameters).
 To learn how to get data for the pairs and exchange you're interested in, head over to the [Data Downloading](data-download.md) section of the documentation.
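To make the new search flow concrete, here is a toy, self-contained optuna study (illustration only, not freqtrade code) showing the pattern the paragraph above describes: a sampler proposing parameter combinations that minimize a loss:

```python
# Toy illustration of sampler-driven minimization; the objective stands in
# for one backtest run, and the parameter names are made up.
import optuna


def objective(trial: optuna.Trial) -> float:
    buy_rsi = trial.suggest_int("buy_rsi", 10, 40)
    sell_rsi = trial.suggest_int("sell_rsi", 60, 90)
    # Pretend loss: minimal at buy_rsi=30, sell_rsi=70.
    return (buy_rsi - 30) ** 2 + (sell_rsi - 70) ** 2


study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.NSGAIIISampler(seed=42),
)
study.optimize(objective, n_trials=100)
print(study.best_params)
```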

freqtrade/optimize/hyperopt/hyperopt.py

Lines changed: 40 additions & 25 deletions

@@ -4,6 +4,7 @@
 This module contains the hyperopt logic
 """
 
+import gc
 import logging
 import random
 from datetime import datetime
@@ -13,7 +14,7 @@
 from typing import Any
 
 import rapidjson
-from joblib import Parallel, cpu_count, delayed, wrap_non_picklable_objects
+from joblib import Parallel, cpu_count
 
 from freqtrade.constants import FTHYPT_FILEVERSION, LAST_BT_RESULT_FN, Config
 from freqtrade.enums import HyperoptState
@@ -35,9 +36,6 @@
 
 INITIAL_POINTS = 30
 
-# Keep no more than SKOPT_MODEL_QUEUE_SIZE models
-# in the skopt model queue, to optimize memory consumption
-SKOPT_MODEL_QUEUE_SIZE = 10
 
 log_queue: Any
 
@@ -92,7 +90,7 @@ def __init__(self, config: Config) -> None:
         self.hyperopt_table_header = 0
         self.print_json = self.config.get("print_json", False)
 
-        self.hyperopter = HyperOptimizer(self.config)
+        self.hyperopter = HyperOptimizer(self.config, self.data_pickle_file)
 
     @staticmethod
     def get_lock_filename(config: Config) -> str:
@@ -158,14 +156,20 @@ def optimizer_wrapper(*args, **kwargs):
                 log_queue, logging.INFO if self.config["verbosity"] < 1 else logging.DEBUG
             )
 
-            return self.hyperopter.generate_optimizer(*args, **kwargs)
+            return self.hyperopter.generate_optimizer_wrapped(*args, **kwargs)
 
-        return parallel(delayed(wrap_non_picklable_objects(optimizer_wrapper))(v) for v in asked)
+        return parallel(optimizer_wrapper(v) for v in asked)
 
     def _set_random_state(self, random_state: int | None) -> int:
         return random_state or random.randint(1, 2**16 - 1)  # noqa: S311
 
-    def get_asked_points(self, n_points: int) -> tuple[list[list[Any]], list[bool]]:
+    def get_optuna_asked_points(self, n_points: int, dimensions: dict) -> list[Any]:
+        asked: list[list[Any]] = []
+        for i in range(n_points):
+            asked.append(self.opt.ask(dimensions))
+        return asked
+
+    def get_asked_points(self, n_points: int, dimensions: dict) -> tuple[list[Any], list[bool]]:
         """
         Enforce points returned from `self.opt.ask` have not been already evaluated
@@ -191,19 +195,19 @@ def unique_list(a_list):
         while i < 5 and len(asked_non_tried) < n_points:
             if i < 3:
                 self.opt.cache_ = {}
-                asked = unique_list(self.opt.ask(n_points=n_points * 5 if i > 0 else n_points))
+                asked = unique_list(
+                    self.get_optuna_asked_points(
+                        n_points=n_points * 5 if i > 0 else n_points, dimensions=dimensions
+                    )
+                )
                 is_random = [False for _ in range(len(asked))]
             else:
                 asked = unique_list(self.opt.space.rvs(n_samples=n_points * 5))
                 is_random = [True for _ in range(len(asked))]
             is_random_non_tried += [
-                rand
-                for x, rand in zip(asked, is_random, strict=False)
-                if x not in self.opt.Xi and x not in asked_non_tried
-            ]
-            asked_non_tried += [
-                x for x in asked if x not in self.opt.Xi and x not in asked_non_tried
+                rand for x, rand in zip(asked, is_random, strict=False) if x not in asked_non_tried
             ]
+            asked_non_tried += [x for x in asked if x not in asked_non_tried]
             i += 1
 
         if asked_non_tried:
@@ -212,7 +216,9 @@ def unique_list(a_list):
                 is_random_non_tried[: min(len(asked_non_tried), n_points)],
             )
         else:
-            return self.opt.ask(n_points=n_points), [False for _ in range(n_points)]
+            return self.get_optuna_asked_points(n_points=n_points, dimensions=dimensions), [
+                False for _ in range(n_points)
+            ]
 
     def evaluate_result(self, val: dict[str, Any], current: int, is_random: bool):
         """
@@ -258,9 +264,7 @@ def start(self) -> None:
         config_jobs = self.config.get("hyperopt_jobs", -1)
         logger.info(f"Number of parallel jobs set as: {config_jobs}")
 
-        self.opt = self.hyperopter.get_optimizer(
-            config_jobs, self.random_state, INITIAL_POINTS, SKOPT_MODEL_QUEUE_SIZE
-        )
+        self.opt = self.hyperopter.get_optimizer(self.random_state)
         self._setup_logging_mp_workaround()
         try:
             with Parallel(n_jobs=config_jobs) as parallel:
@@ -276,9 +280,11 @@ def start(self) -> None:
                 if self.analyze_per_epoch:
                     # First analysis not in parallel mode when using --analyze-per-epoch.
                     # This allows dataprovider to load it's informative cache.
-                    asked, is_random = self.get_asked_points(n_points=1)
-                    f_val0 = self.hyperopter.generate_optimizer(asked[0])
-                    self.opt.tell(asked, [f_val0["loss"]])
+                    asked, is_random = self.get_asked_points(
+                        n_points=1, dimensions=self.hyperopter.o_dimensions
+                    )
+                    f_val0 = self.hyperopter.generate_optimizer(asked[0].params)
+                    self.opt.tell(asked[0], [f_val0["loss"]])
                     self.evaluate_result(f_val0, 1, is_random[0])
                     pbar.update(task, advance=1)
                     start += 1
@@ -290,9 +296,17 @@ def start(self) -> None:
                     n_rest = (i + 1) * jobs - (self.total_epochs - start)
                     current_jobs = jobs - n_rest if n_rest > 0 else jobs
 
-                    asked, is_random = self.get_asked_points(n_points=current_jobs)
-                    f_val = self.run_optimizer_parallel(parallel, asked)
-                    self.opt.tell(asked, [v["loss"] for v in f_val])
+                    asked, is_random = self.get_asked_points(
+                        n_points=current_jobs, dimensions=self.hyperopter.o_dimensions
+                    )
+
+                    f_val = self.run_optimizer_parallel(
+                        parallel,
+                        [asked1.params for asked1 in asked],
+                    )
+                    f_val_loss = [v["loss"] for v in f_val]
+                    for o_ask, v in zip(asked, f_val_loss, strict=False):
+                        self.opt.tell(o_ask, v)
 
                     for j, val in enumerate(f_val):
                         # Use human-friendly indexes here (starting from 1)
@@ -301,6 +315,7 @@ def start(self) -> None:
                         self.evaluate_result(val, current, is_random[j])
                         pbar.update(task, advance=1)
                     logging_mp_handle(log_queue)
+                    gc.collect()
 
         except KeyboardInterrupt:
             print("User interrupted..")
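For readers unfamiliar with the ask-and-tell interface this file now builds on, here is a standalone sketch (assumed optuna API; the `dimensions` dict and loss are made up to echo the code above):

```python
# Ask/tell pattern underlying get_optuna_asked_points() and self.opt.tell():
# ask the study for trials, evaluate them, then report the losses back.
import optuna
from optuna.distributions import FloatDistribution, IntDistribution

# Mirrors hyperopter.o_dimensions: the search space as optuna distributions.
dimensions = {
    "buy_rsi": IntDistribution(10, 40),
    "stoploss": FloatDistribution(-0.35, -0.02),
}

study = optuna.create_study(sampler=optuna.samplers.NSGAIIISampler(seed=7))

for _ in range(20):
    trial = study.ask(dimensions)  # like get_optuna_asked_points()
    params = trial.params          # what generate_optimizer() receives
    # Pretend backtest loss in place of a real epoch.
    loss = abs(params["stoploss"]) + params["buy_rsi"] / 100
    study.tell(trial, loss)        # like self.opt.tell(o_ask, v)
```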

freqtrade/optimize/hyperopt/hyperopt_auto.py

Lines changed: 1 addition & 1 deletion

@@ -12,7 +12,7 @@
 
 
 with suppress(ImportError):
-    from skopt.space import Dimension
+    from freqtrade.optimize.space import Dimension
 
     from freqtrade.optimize.hyperopt.hyperopt_interface import EstimatorType, IHyperOpt
 
freqtrade/optimize/hyperopt/hyperopt_interface.py

Lines changed: 7 additions & 7 deletions

@@ -8,19 +8,18 @@
 from abc import ABC
 from typing import TypeAlias
 
-from sklearn.base import RegressorMixin
-from skopt.space import Categorical, Dimension, Integer
+from optuna.samplers import BaseSampler
 
 from freqtrade.constants import Config
 from freqtrade.exchange import timeframe_to_minutes
 from freqtrade.misc import round_dict
-from freqtrade.optimize.space import SKDecimal
+from freqtrade.optimize.space import Categorical, Dimension, Integer, SKDecimal
 from freqtrade.strategy import IStrategy
 
 
 logger = logging.getLogger(__name__)
 
-EstimatorType: TypeAlias = RegressorMixin | str
+EstimatorType: TypeAlias = BaseSampler | str
 
 
 class IHyperOpt(ABC):
@@ -44,10 +43,11 @@ def __init__(self, config: Config) -> None:
     def generate_estimator(self, dimensions: list[Dimension], **kwargs) -> EstimatorType:
         """
         Return base_estimator.
-        Can be any of "GP", "RF", "ET", "GBRT" or an instance of a class
-        inheriting from RegressorMixin (from sklearn).
+        Can be any of "TPESampler", "GPSampler", "CmaEsSampler", "NSGAIISampler",
+        "NSGAIIISampler", "QMCSampler" or an instance of a class
+        inheriting from BaseSampler (from optuna.samplers).
         """
-        return "ET"
+        return "NSGAIIISampler"
 
     def generate_roi_table(self, params: dict) -> dict[int, float]:
         """
