@@ -59,7 +59,7 @@ Distributed Reinforcement Learning using RPC and RRef
 -----------------------------------------------------
 
 This section describes steps to build a toy distributed reinforcement learning
-model using RPC to solve CartPole-v1 from `OpenAI Gym <https://gym.openai.com>`__.
+model using RPC to solve CartPole-v1 from `OpenAI Gym <https://www.gymlibrary.dev/environments/classic_control/cart_pole/>`__.
 The policy code is mostly borrowed from the existing single-thread
 `example <https://github.com/pytorch/examples/blob/master/reinforcement_learning>`__
 as shown below. We will skip details of the ``Policy`` design, and focus on RPC
@@ -156,7 +156,7 @@ send commands. Applications don't need to worry about the lifetime of ``RRefs``.
 The owner of each ``RRef`` maintains a reference counting map to track its
 lifetime, and guarantees the remote data object will not be deleted as long as
 there is any live user of that ``RRef``. Please refer to the ``RRef``
-`design doc <https://pytorch.org/docs/master/notes/rref.html>`__ for details.
+`design doc <https://pytorch.org/docs/stable/rpc/rref.html>`__ for details.
 
 
 .. code:: python
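
As a quick illustration of the reference counting described above, here is a
minimal sketch, assuming ``rpc.init_rpc`` has already been called on two
workers and using the hypothetical worker name ``"owner"``:

.. code:: python

    import torch
    import torch.distributed.rpc as rpc

    # rpc.remote() returns immediately with an RRef that refers to a Tensor
    # created and owned by the remote worker.
    rref = rpc.remote("owner", torch.add, args=(torch.ones(2), 1.0))

    # The owner's reference counting map keeps the Tensor alive as long as any
    # user RRef is alive; to_here() copies the value to the local worker.
    value = rref.to_here()

    # Once the last user RRef goes out of scope, the owner is free to delete
    # the remote Tensor.
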
@@ -531,7 +531,7 @@ the given arguments (i.e., ``lr=0.05``).
 In the training loop, it first creates a distributed autograd context, which
 will help the distributed autograd engine to find gradients and involved RPC
 send/recv functions. The design details of the distributed autograd engine can
-be found in its `design note <https://pytorch.org/docs/master/notes/distributed_autograd.html>`__.
+be found in its `design note <https://pytorch.org/docs/stable/rpc/distributed_autograd.html>`__.
 Then, it kicks off the forward pass as if it is a local
 model, and runs the distributed backward pass. For the distributed backward, you
 only need to specify a list of roots, in this case, it is the loss ``Tensor``.
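
To make the training-loop flow above concrete, here is a minimal sketch of one
distributed autograd step; ``compute_loss`` and ``opt`` are hypothetical
placeholders for the tutorial's loss computation and its ``DistributedOptimizer``:

.. code:: python

    import torch.distributed.autograd as dist_autograd

    # Each iteration runs inside a fresh distributed autograd context, so the
    # engine can associate RPC send/recv functions with this particular pass.
    with dist_autograd.context() as context_id:
        loss = compute_loss()  # forward pass, as if the model were local
        # The distributed backward pass only needs the list of root tensors.
        dist_autograd.backward(context_id, [loss])
        # Gradients are accumulated in the context rather than in .grad fields,
        # so the DistributedOptimizer takes the context_id when stepping.
        opt.step(context_id)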