Release 0.2.0 · google-deepmind/acme

Highlights

Now uses stable releases of TensorFlow (>=2.3.0), Reverb, and TensorFlow Probability.
Added Critic Regularized Regression (code, paper)
Added Discrete Batch-Constrained Deep Q-learning (code, paper)
Added EnvironmentLoop.run_episode() for running a single episode.
Updated EnvironmentLoop.run() to accept num_steps, so a run can be bounded by step count rather than only by episode count.
Added more distribution types (e.g. GaussianMixture) that policies can use.
Added an environment wrapper for action repeats.
Improvements/tuning to datasets exposed by make_dataset.
Added support for nested / multidimensional rewards and discounts.
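
To make the loop and wrapper items above concrete, here is a minimal, dependency-free sketch of the ideas behind run_episode(), a step-bounded run(), and an action-repeat wrapper. It does not use Acme and is not Acme's implementation: CountingEnv, ActionRepeat, and ToyLoop are hypothetical names, and the (observation, reward, done) step tuple is a simplification chosen for illustration.

```python
from typing import Optional


class CountingEnv:
    """Hypothetical toy environment: fixed-length episodes, reward 1.0 per step."""

    def __init__(self, episode_length: int = 5):
        self.episode_length = episode_length
        self._t = 0

    def reset(self) -> int:
        self._t = 0
        return self._t

    def step(self, action: int):
        self._t += 1
        done = self._t >= self.episode_length
        return self._t, 1.0, done


class ActionRepeat:
    """Hypothetical wrapper: repeats each action and sums the rewards."""

    def __init__(self, env, num_repeats: int):
        self._env = env
        self._num_repeats = num_repeats

    def reset(self):
        return self._env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self._num_repeats):
            obs, reward, done = self._env.step(action)
            total_reward += reward
            if done:  # Stop repeating as soon as the episode ends.
                break
        return obs, total_reward, done


class ToyLoop:
    """Hypothetical loop that always takes action 0 (assumption for brevity)."""

    def __init__(self, env):
        self._env = env

    def run_episode(self) -> dict:
        # Run exactly one episode and report its length and return.
        self._env.reset()
        steps, episode_return, done = 0, 0.0, False
        while not done:
            _, reward, done = self._env.step(0)
            steps += 1
            episode_return += reward
        return {"episode_length": steps, "episode_return": episode_return}

    def run(self, num_episodes: Optional[int] = None,
            num_steps: Optional[int] = None) -> dict:
        # This sketch requires at least one bound so the loop terminates.
        if num_episodes is None and num_steps is None:
            raise ValueError("Set num_episodes or num_steps.")
        episodes, steps = 0, 0
        while True:
            if num_episodes is not None and episodes >= num_episodes:
                break
            if num_steps is not None and steps >= num_steps:
                break
            result = self.run_episode()
            episodes += 1
            steps += result["episode_length"]
        return {"episodes": episodes, "steps": steps}


if __name__ == "__main__":
    print(ToyLoop(CountingEnv(5)).run(num_steps=12))
    print(ToyLoop(ActionRepeat(CountingEnv(5), num_repeats=2)).run_episode())
```

In this sketch the step budget is checked only between episodes, so the final episode always runs to completion and the total step count can exceed num_steps. Whether a real loop should instead cut an episode short is a design choice this example does not settle.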

Minor changes and fixes

Added a ConstantInfo logger for logging constant information.
Added a should_update parameter to the EnvironmentLoop.
Various modifications and optimizations to the make_reverb_dataset() function.
Improvements to typing and pytype usage.
Other minor bug and documentation fixes.
