Unity-Technologies
diff --git a/com.unity.ml-agents/Documentation~/Background-Machine-Learning.md b/com.unity.ml-agents/Documentation~/Background-Machine-Learning.md
Lines changed: 29 additions & 168 deletions
diff --git a/com.unity.ml-agents/Documentation~/Background-PyTorch.md b/com.unity.ml-agents/Documentation~/Background-PyTorch.md
Lines changed: 3 additions & 27 deletions
diff --git a/com.unity.ml-agents/Documentation~/Background-Unity.md b/com.unity.ml-agents/Documentation~/Background-Unity.md
Lines changed: 2 additions & 7 deletions
diff --git a/com.unity.ml-agents/Documentation~/Custom-GridSensors.md b/com.unity.ml-agents/Documentation~/Custom-GridSensors.md
Lines changed: 5 additions & 10 deletions
diff --git a/com.unity.ml-agents/Documentation~/Custom-SideChannels.md b/com.unity.ml-agents/Documentation~/Custom-SideChannels.md
Lines changed: 18 additions & 53 deletions
diff --git a/com.unity.ml-agents/Documentation~/ELO-Rating-System.md b/com.unity.ml-agents/Documentation~/ELO-Rating-System.md
Lines changed: 3 additions & 7 deletions
@@ -1,35 +1,11 @@
 # Background: PyTorch
 
-As discussed in our
-[machine learning background page](Background-Machine-Learning.md), many of the
-algorithms we provide in the ML-Agents Toolkit leverage some form of deep
-learning. More specifically, our implementations are built on top of the
-open-source library [PyTorch](https://pytorch.org/). In this page we
-provide a brief overview of PyTorch and TensorBoard
-that we leverage within the ML-Agents Toolkit.
+As discussed in our [machine learning background page](Background-Machine-Learning.md), many of the algorithms we provide in the ML-Agents Toolkit leverage some form of deep learning. More specifically, our implementations are built on top of the open-source library [PyTorch](https://pytorch.org/). In this page we provide a brief overview of PyTorch and TensorBoard that we leverage within the ML-Agents Toolkit.
 
 ## PyTorch
 
-[PyTorch](https://pytorch.org/) is an open source library for
-performing computations using data flow graphs, the underlying representation of
-deep learning models. It facilitates training and inference on CPUs and GPUs in
-a desktop, server, or mobile device. Within the ML-Agents Toolkit, when you
-train the behavior of an agent, the output is a model (.onnx) file that you can
-then associate with an Agent. Unless you implement a new algorithm, the use of
-PyTorch is mostly abstracted away and behind the scenes.
+[PyTorch](https://pytorch.org/) is an open source library for performing computations using data flow graphs, the underlying representation of deep learning models. It facilitates training and inference on CPUs and GPUs in a desktop, server, or mobile device. Within the ML-Agents Toolkit, when you train the behavior of an agent, the output is a model (.onnx) file that you can then associate with an Agent. Unless you implement a new algorithm, the use of PyTorch is mostly abstracted away and behind the scenes.
 
 ## TensorBoard
 
-One component of training models with PyTorch is setting the values of
-certain model attributes (called _hyperparameters_). Finding the right values of
-these hyperparameters can require a few iterations. Consequently, we leverage a
-visualization tool called
-[TensorBoard](https://www.tensorflow.org/tensorboard).
-It allows the visualization of certain agent attributes (e.g. reward) throughout
-training which can be helpful in both building intuitions for the different
-hyperparameters and setting the optimal values for your Unity environment. We
-provide more details on setting the hyperparameters in the
-[Training ML-Agents](Training-ML-Agents.md) page. If you are unfamiliar with
-TensorBoard we recommend our guide on
-[using TensorBoard with ML-Agents](Using-Tensorboard.md) or this
-[tutorial](https://github.com/dandelionmane/tf-dev-summit-tensorboard-tutorial).
+One component of training models with PyTorch is setting the values of certain model attributes (called _hyperparameters_). Finding the right values of these hyperparameters can require a few iterations. Consequently, we leverage a visualization tool called [TensorBoard](https://www.tensorflow.org/tensorboard). It allows the visualization of certain agent attributes (e.g. reward) throughout training which can be helpful in both building intuitions for the different hyperparameters and setting the optimal values for your Unity environment. We provide more details on setting the hyperparameters in the [Training ML-Agents](Training-ML-Agents.md) page. If you are unfamiliar with TensorBoard we recommend our guide on [using TensorBoard with ML-Agents](Using-Tensorboard.md) or this [tutorial](https://github.com/dandelionmane/tf-dev-summit-tensorboard-tutorial).
@@ -1,11 +1,6 @@
 # Background: Unity
 
-If you are not familiar with the [Unity Engine](https://unity3d.com/unity), we
-highly recommend the [Unity Manual](https://docs.unity3d.com/Manual/index.html)
-and [Tutorials page](https://unity3d.com/learn/tutorials). The
-[Roll-a-ball tutorial](https://learn.unity.com/project/roll-a-ball)
-is a fantastic resource to learn all the basic concepts of Unity to get started
-with the ML-Agents Toolkit:
+If you are not familiar with the [Unity Engine](https://unity3d.com/unity), we highly recommend the [Unity Manual](https://docs.unity3d.com/Manual/index.html) and [Tutorials page](https://unity3d.com/learn/tutorials). The [Roll-a-ball tutorial](https://learn.unity.com/project/roll-a-ball) is a fantastic resource to learn all the basic concepts of Unity to get started with the ML-Agents Toolkit:
 
 - [Editor](https://docs.unity3d.com/Manual/sprite/sprite-editor/use-editor.html)
 - [Scene](https://docs.unity3d.com/Manual/CreatingScenes.html)
@@ -15,5 +10,5 @@ with the ML-Agents Toolkit:
 - [Scripting](https://docs.unity3d.com/Manual/ScriptingSection.html)
 - [Physics](https://docs.unity3d.com/Manual/PhysicsSection.html)
 - [Ordering of event functions](https://docs.unity3d.com/Manual/ExecutionOrder.html)
-  (e.g. FixedUpdate, Update)
+(e.g. FixedUpdate, Update)
 - [Prefabs](https://docs.unity3d.com/Manual/Prefabs.html)
@@ -8,23 +8,19 @@ One extra feature with Grid Sensors is that you can derive from the Grid Sensor
 To create a custom grid sensor, you'll need to derive from two classes: `GridSensorBase` and `GridSensorComponent`.
 
 ## Deriving from `GridSensorBase`
-This is the implementation of your sensor. This defines how your sensor process detected colliders,
-what the data looks like, and how the observations are constructed from the detected objects.
-Consider overriding the following methods depending on your use case:
+This is the implementation of your sensor. It defines how your sensor processes detected colliders, what the data looks like, and how the observations are constructed from the detected objects. Consider overriding the following methods depending on your use case:
 * `protected virtual int GetCellObservationSize()`: Return the observation size per cell. Default to `1`.
 * `protected virtual void GetObjectData(GameObject detectedObject, int tagIndex, float[] dataBuffer)`: Constructs observations from the detected object. The input provides the detected GameObject and the index of its tag (0-indexed). The observations should be written to the given `dataBuffer` and the buffer size is defined in `GetCellObservationSize()`. This data will be gathered from each cell and sent to the trainer as observation.
 * `protected virtual bool IsDataNormalized()`: Return whether the observation is normalized to 0~1. This affects whether you're able to use compressed observations as compressed data only supports normalized data. Return `true` if all the values written in `GetObjectData` are within the range of (0, 1), otherwise return `false`. Default to `false`.
 
-  There might be cases when your data is not in the range of (0, 1) but you still wish to use compressed data to speed up training. If your data is naturally bounded within a range, normalize your data first to the possible range and fill the buffer with normalized data. For example, since the angle of rotation is bounded within `0 ~ 360`, record an angle `x` as `x/360` instead of `x`. If your data value is not bounded (position, velocity, etc.), consider setting a reasonable min/max value and use that to normalize your data.
+There might be cases when your data is not in the range of (0, 1) but you still wish to use compressed data to speed up training. If your data is naturally bounded within a range, normalize your data first to the possible range and fill the buffer with normalized data. For example, since the angle of rotation is bounded within `0 ~ 360`, record an angle `x` as `x/360` instead of `x`. If your data value is not bounded (position, velocity, etc.), consider setting a reasonable min/max value and use that to normalize your data.
 * `protected internal virtual ProcessCollidersMethod GetProcessCollidersMethod()`: Return the method to process colliders detected in a cell. This defines the sensor behavior when multiple objects with detectable tags are detected within a cell.
-  Currently two methods are provided:
+Currently, two methods are provided:
   * `ProcessCollidersMethod.ProcessClosestColliders` (Default): Process the closest collider to the agent. In this case each cell's data is represented by one object.
   * `ProcessCollidersMethod.ProcessAllColliders`: Process all detected colliders. This is useful when the data from each cell is additive, for instance, the count of detected objects in a cell. When using this option, the input `dataBuffer` in `GetObjectData()` will contain processed data from other colliders detected in the cell. You'll more likely want to add/subtract values from the buffer instead of overwrite it completely.
 
 ## Deriving from `GridSensorComponent`
-To create your sensor, you need to override the sensor component and add your sensor to the creation.
-Specifically, you need to override `GetGridSensors()` and return an array of grid sensors you want to use in the component.
-It can be used to create multiple different customized grid sensors, or you can also include the ones provided in our package (listed in the next section).
+To create your sensor, you need to override the sensor component and add your sensor to the creation. Specifically, you need to override `GetGridSensors()` and return an array of grid sensors you want to use in the component. It can be used to create multiple different customized grid sensors, or you can also include the ones provided in our package (listed in the next section).
 
 Example:
 ```csharp
@@ -38,8 +34,7 @@ public class CustomGridSensorComponent : GridSensorComponent
 ```
 
 ## Grid Sensor Types
-Here we list out two types of grid sensor provided in the package: `OneHotGridSensor` and `CountingGridSensor`.
-Their implementations are also a good reference for making you own ones.
+Here we list two types of grid sensor provided in the package: `OneHotGridSensor` and `CountingGridSensor`. Their implementations are also a good reference for making your own.
 
 ### OneHotGridSensor
 This is the default sensor used by `GridSensorComponent`. It detects objects with detectable tags and the observation is the one-hot representation of the detected tag index.
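
As an illustration of the one-hot representation mentioned above, here is a minimal Python sketch. It is not the package's C# implementation; the function name `one_hot` and the parameter `num_tags` are hypothetical and only show what the encoding of a 0-indexed tag index looks like.

```python
from typing import List


def one_hot(tag_index: int, num_tags: int) -> List[float]:
    """Encode a 0-indexed tag index as a one-hot vector of length num_tags.

    Hypothetical helper for illustration only; not part of ML-Agents.
    """
    if not 0 <= tag_index < num_tags:
        raise ValueError("tag_index must be in the range [0, num_tags)")
    # Exactly one entry is 1.0: the position of the detected tag.
    return [1.0 if i == tag_index else 0.0 for i in range(num_tags)]


# A cell that detected the tag with index 2 out of 4 detectable tags.
print(one_hot(2, 4))  # [0.0, 0.0, 1.0, 0.0]
```

Every value in such a vector is either 0 or 1, so the data is already within the normalized range discussed under `IsDataNormalized()`.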
 
@@ -1,77 +1,55 @@
 # Custom Side Channels
 
-You can create your own side channel in C# and Python and use it to communicate
-custom data structures between the two. This can be useful for situations in
-which the data to be sent is too complex or structured for the built-in
+You can create your own side channel in C# and Python and use it to communicate custom data structures between the two. This can be useful for situations in which the data to be sent is too complex or structured for the built-in
 `EnvironmentParameters`, or is not related to any specific agent, and therefore
 inappropriate as an agent observation.
 
 ## Overview
 
-In order to use a side channel, it must be implemented as both Unity and Python
-classes.
+In order to use a side channel, it must be implemented as both Unity and Python classes.
 
 ### Unity side
 
-The side channel will have to implement the `SideChannel` abstract class and the
-following method.
+The side channel will have to implement the `SideChannel` abstract class and the following method.
 
 - `OnMessageReceived(IncomingMessage msg)` : You must implement this method and
-  read the data from IncomingMessage. The data must be read in the order that it
-  was written.
+read the data from IncomingMessage. The data must be read in the order that it was written.
 
 The side channel must also assign a `ChannelId` property in the constructor. The
 `ChannelId` is a Guid (or UUID in Python) used to uniquely identify a side
-channel. This Guid must be the same on C# and Python. There can only be one side
-channel of a certain id during communication.
+channel. This Guid must be the same on C# and Python. There can only be one side channel of a certain id during communication.
 
-To send data from C# to Python, create an `OutgoingMessage` instance, add data
-to it, call the `base.QueueMessageToSend(msg)` method inside the side channel,
-and call the `OutgoingMessage.Dispose()` method.
+To send data from C# to Python, create an `OutgoingMessage` instance, add data to it, call the `base.QueueMessageToSend(msg)` method inside the side channel, and call the `OutgoingMessage.Dispose()` method.
 
 To register a side channel on the Unity side, call
 `SideChannelManager.RegisterSideChannel` with the side channel as only argument.
 
 ### Python side
 
-The side channel will have to implement the `SideChannel` abstract class. You
-must implement :
+The side channel will have to implement the `SideChannel` abstract class. You must implement :
 
 - `on_message_received(self, msg: "IncomingMessage") -> None` : You must
-  implement this method and read the data from IncomingMessage. The data must be
-  read in the order that it was written.
+implement this method and read the data from IncomingMessage. The data must be read in the order that it was written.
 
-The side channel must also assign a `channel_id` property in the constructor.
-The `channel_id` is a UUID (referred in C# as Guid) used to uniquely identify a
-side channel. This number must be the same on C# and Python. There can only be
-one side channel of a certain id during communication.
+The side channel must also assign a `channel_id` property in the constructor. The `channel_id` is a UUID (referred in C# as Guid) used to uniquely identify a side channel. This number must be the same on C# and Python. There can only be one side channel of a certain id during communication.
 
-To assign the `channel_id` call the abstract class constructor with the
-appropriate `channel_id` as follows:
+To assign the `channel_id` call the abstract class constructor with the appropriate `channel_id` as follows:
 
 ```python
 super().__init__(my_channel_id)
 ```
 
-To send a byte array from Python to C#, create an `OutgoingMessage` instance,
-add data to it, and call the `super().queue_message_to_send(msg)` method inside
-the side channel.
+To send a byte array from Python to C#, create an `OutgoingMessage` instance, add data to it, and call the `super().queue_message_to_send(msg)` method inside the side channel.
 
-To register a side channel on the Python side, pass the side channel as argument
-when creating the `UnityEnvironment` object. One of the arguments of the
-constructor (`side_channels`) is a list of side channels.
+To register a side channel on the Python side, pass the side channel as argument when creating the `UnityEnvironment` object. One of the arguments of the constructor (`side_channels`) is a list of side channels.
 
 ## Example implementation
 
-Below is a simple implementation of a side channel that will exchange ASCII
-encoded strings between a Unity environment and Python.
+Below is a simple implementation of a side channel that will exchange ASCII encoded strings between a Unity environment and Python.
 
 ### Example Unity C# code
 
-The first step is to create the `StringLogSideChannel` class within the Unity
-project. Here is an implementation of a `StringLogSideChannel` that will listen
-for messages from python and print them to the Unity debug log, as well as send
-error messages from Unity to python.
+The first step is to create the `StringLogSideChannel` class within the Unity project. Here is an implementation of a `StringLogSideChannel` that will listen for messages from python and print them to the Unity debug log, as well as send error messages from Unity to python.
 
 ```csharp
 using UnityEngine;
@@ -108,14 +86,7 @@ public class StringLogSideChannel : SideChannel
 }
 ```
 
-Once we have defined our custom side channel class, we need to ensure that it is
-instantiated and registered. This can typically be done wherever the logic of
-the side channel makes sense to be associated, for example on a MonoBehaviour
-object that might need to access data from the side channel. Here we show a
-simple MonoBehaviour object which instantiates and registers the new side
-channel. If you have not done it already, make sure that the MonoBehaviour which
-registers the side channel is attached to a GameObject which will be live in
-your Unity scene.
+Once we have defined our custom side channel class, we need to ensure that it is instantiated and registered. This can typically be done wherever the logic of the side channel makes sense to be associated, for example on a MonoBehaviour object that might need to access data from the side channel. Here we show a simple MonoBehaviour object which instantiates and registers the new side channel. If you have not done it already, make sure that the MonoBehaviour which registers the side channel is attached to a GameObject which will be live in your Unity scene.
 
 ```csharp
 using UnityEngine;
@@ -160,8 +131,7 @@ public class RegisterStringLogSideChannel : MonoBehaviour
 
 ### Example Python code
 
-Now that we have created the necessary Unity C# classes, we can create their
-Python counterparts.
+Now that we have created the necessary Unity C# classes, we can create their Python counterparts.
 
 ```python
 from mlagents_envs.environment import UnityEnvironment
@@ -196,9 +166,7 @@ class StringLogChannel(SideChannel):
         super().queue_message_to_send(msg)
 ```
 
-We can then instantiate the new side channel, launch a `UnityEnvironment` with
-that side channel active, and send a series of messages to the Unity environment
-from Python using it.
+We can then instantiate the new side channel, launch a `UnityEnvironment` with that side channel active, and send a series of messages to the Unity environment from Python using it.
 
 ```python
 # Create the channel
@@ -223,7 +191,4 @@ for i in range(1000):
 env.close()
 ```
 
-Now, if you run this script and press `Play` the Unity Editor when prompted, the
-console in the Unity Editor will display a message at every Python step.
-Additionally, if you press the Space Bar in the Unity Engine, a message will
-appear in the terminal.
+Now, if you run this script and press `Play` in the Unity Editor when prompted, the console in the Unity Editor will display a message at every Python step. Additionally, if you press the Space Bar in the Unity Editor, a message will appear in the terminal.
@@ -1,11 +1,9 @@
 # ELO Rating System
-In adversarial games, the cumulative environment reward may **not be a meaningful metric** by which to track
-learning progress.
+In adversarial games, the cumulative environment reward may **not be a meaningful metric** by which to track learning progress.
 
 This is because the cumulative reward is **entirely dependent on the skill of the opponent**.
 
-An agent at a particular skill level will get more or less reward against a worse or better agent,
-respectively.
+An agent at a particular skill level will get more or less reward against a worse or better agent, respectively.
 
 Instead, it's better to use the ELO rating system, a method to calculate **the relative skill level between two players in a zero-sum game**.
 
@@ -45,9 +43,7 @@ The winning player takes points from the losing one:
 - We start to train our agents.
 - Both of them have the same skill level, so each starts with the ELO score defined by the parameter `initial_elo = 1200.0`.
 
-We calculate the expected score E:
-Ea = 0.5
-Eb = 0.5
+We calculate the expected score E: Ea = 0.5 and Eb = 0.5
 
 This means that each player has a 50% chance of winning the point.
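
The 0.5 values above can be reproduced with a short Python sketch. The formula itself falls in a part of the page that this hunk does not show, so the sketch assumes the standard ELO expected-score formula with a 400-point scale; the function name `expected_score` is hypothetical, and only `initial_elo = 1200.0` comes from the text.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B.

    Assumes the standard ELO formula: 1 / (1 + 10 ** ((Rb - Ra) / 400)).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


initial_elo = 1200.0  # value taken from the text above

# Both players start with the same rating, so neither is favored.
ea = expected_score(initial_elo, initial_elo)
eb = expected_score(initial_elo, initial_elo)
print(ea, eb)  # 0.5 0.5
```

Under this formula the two expected scores always sum to 1, so any rating gap raises one player's expected score by exactly as much as it lowers the other's.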