|
75 | 75 | "import jax\n", |
76 | 76 | "import jax.numpy as jnp\n", |
77 | 77 | "\n", |
78 | | - "import qdax.tasks.brax.v1 as environments\n", |
| 78 | + "import qdax.tasks.brax as environments\n", |
79 | 79 | "from qdax.baselines.dads import DADS, DadsConfig, DadsTrainingState\n", |
80 | 80 | "from qdax.core.neuroevolution.buffers.buffer import QDTransition, ReplayBuffer\n", |
81 | 81 | "from qdax.core.neuroevolution.sac_td3_utils import do_iteration_fn, warmstart_buffer\n", |
82 | 82 | "\n", |
83 | 83 | "from qdax.utils.plotting import plot_skills_trajectory\n", |
84 | 84 | "\n", |
85 | | - "from IPython.display import HTML\n", |
86 | | - "from brax.v1.io import html" |
| 85 | + "from IPython.display import HTML" |
87 | 86 | ] |
88 | 87 | }, |
89 | 88 | { |
|
94 | 93 | "\n", |
95 | 94 | "Most hyperparameters are similar to those introduced in [SAC paper](https://arxiv.org/abs/1801.01290), [DIAYN paper](https://arxiv.org/abs/1802.06070) and [DADS paper](https://arxiv.org/abs/1907.01657).\n", |
96 | 95 | "\n", |
97 | | - "The parameter `descriptor_full_state` is less straightforward, it concerns the information used for diversity seeking and dynamics. In DADS, one can use the full state for diversity seeking, but one can also use a prior to focus on an interesting aspect of the state. Actually, priors are often used in experiments, for instance, focusing on the x/y position rather than the full position. When `descriptor_full_state` is set to True, it uses the full state, when it is set to False, it uses the 'state descriptor' retrieved by the environment. Hence, it is required that the environment has one. (All the `_uni`, `_omni` do, same for `anttrap`, `antmaze` and `pointmaze`.) In the future, we will add an option to use a prior function directly on the full state." |
| 96 | + "The parameter `descriptor_full_state` is less straightforward, it concerns the information used for diversity seeking and dynamics. In DADS, one can use the full state for diversity seeking, but one can also use a prior to focus on an interesting aspect of the state. Actually, priors are often used in experiments, for instance, focusing on the x/y position rather than the full position. When `descriptor_full_state` is set to True, it uses the full state, when it is set to False, it uses the 'state descriptor' retrieved by the environment. Hence, it is required that the environment has one. In the future, we will add an option to use a prior function directly on the full state." |
98 | 97 | ] |
99 | 98 | }, |
100 | 99 | { |
|
385 | 384 | "source": [ |
386 | 385 | "## Plot the trajectories of the skills at the end of the training\n", |
387 | 386 | "\n", |
388 | | - "This only works when the state descriptor considered is two-dimensional, and as a real interest only when this state descriptor is the x/y position. Hence, on all \"omni\" tasks, on pointmaze, anttrap and antmaze." |
| 387 | + "This only works when the state descriptor considered is two-dimensional, and as a real interest only when this state descriptor is the x/y position." |
389 | 388 | ] |
390 | 389 | }, |
391 | 390 | { |
|
419 | 418 | "cell_type": "markdown", |
420 | 419 | "metadata": {}, |
421 | 420 | "source": [ |
422 | | - "# Visualize the skills in the physical simulation\n", |
423 | | - "\n", |
424 | | - "WARNING: this does not work with \"pointmaze\"" |
425 | | - ] |
426 | | - }, |
427 | | - { |
428 | | - "cell_type": "code", |
429 | | - "execution_count": null, |
430 | | - "metadata": {}, |
431 | | - "outputs": [], |
432 | | - "source": [ |
433 | | - "assert env_name != \"pointmaze\", \"No visualisation available for pointmaze at the moment\"" |
| 421 | + "# Visualize the skills in the physical simulation" |
434 | 422 | ] |
435 | 423 | }, |
436 | 424 | { |
|