
Commit 4970cc4

More line break removal
1 parent 70fcf6a commit 4970cc4


4 files changed: 16 additions, 38 deletions


com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

Lines changed: 1 addition & 1 deletion
@@ -94,7 +94,7 @@ Then, edit the new `RollerAgent` script:
 using Unity.MLAgents.Actuators;
 ```
 then change the base class from `MonoBehaviour` to `Agent`.
-1. Delete `Update()` since we are not using it, but keep `Start()`.
+3. Delete `Update()` since we are not using it, but keep `Start()`.

 So far, these are the basic steps that you would use to add ML-Agents to any Unity project. Next, we will add the logic that will let our Agent learn to roll to the cube using reinforcement learning. More specifically, we will need to extend three methods from the `Agent` base class:
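(Context, not part of the commit.) For readers skimming the renumbered step above, a minimal sketch of what the `RollerAgent` script looks like at this point in the tutorial: base class switched from `MonoBehaviour` to `Agent`, `Update()` removed, `Start()` kept, and the three methods named in the paragraph above stubbed out. The `Rigidbody` field follows the tutorial's pattern; the stub bodies are placeholders.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Base class is now Agent; Update() has been deleted, Start() is kept.
public class RollerAgent : Agent
{
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    // The three methods the tutorial extends next:
    public override void OnEpisodeBegin() { /* reset the agent and the target */ }
    public override void CollectObservations(VectorSensor sensor) { /* add observations */ }
    public override void OnActionReceived(ActionBuffers actions) { /* apply actions, assign rewards */ }
}
```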

com.unity.ml-agents/Documentation~/Learning-Environment-Design-Agents.md

Lines changed: 14 additions & 29 deletions
@@ -8,23 +8,16 @@ The `Policy` class abstracts out the decision making logic from the Agent itself

 When you create an Agent, you should usually extend the base Agent class. This includes implementing the following methods:

-- `Agent.OnEpisodeBegin()` — Called at the beginning of an Agent's episode,
-  including at the beginning of the simulation.
-- `Agent.CollectObservations(VectorSensor sensor)` — Called every step that the Agent
-  requests a decision. This is one possible way for collecting the Agent's observations of the environment; see [Generating Observations](#generating-observations) below for more options.
-- `Agent.OnActionReceived()` — Called every time the Agent receives an action to
-  take. Receives the action chosen by the Agent. It is also common to assign a reward in this method.
-- `Agent.Heuristic()` - When the `Behavior Type` is set to `Heuristic Only` in
-  the Behavior Parameters of the Agent, the Agent will use the `Heuristic()` method to generate the actions of the Agent. As such, the `Heuristic()` method writes to the array of floats provided to the Heuristic method as argument. __Note__: Do not create a new float array of action in the `Heuristic()` method, as this will prevent writing floats to the original action array.
+- `Agent.OnEpisodeBegin()` — Called at the beginning of an Agent's episode, including at the beginning of the simulation.
+- `Agent.CollectObservations(VectorSensor sensor)` — Called every step that the Agent requests a decision. This is one possible way for collecting the Agent's observations of the environment; see [Generating Observations](#generating-observations) below for more options.
+- `Agent.OnActionReceived()` — Called every time the Agent receives an action to take. Receives the action chosen by the Agent. It is also common to assign a reward in this method.
+- `Agent.Heuristic()` - When the `Behavior Type` is set to `Heuristic Only` in the Behavior Parameters of the Agent, the Agent will use the `Heuristic()` method to generate the actions of the Agent. As such, the `Heuristic()` method writes to the array of floats provided to the Heuristic method as argument. __Note__: Do not create a new float array of action in the `Heuristic()` method, as this will prevent writing floats to the original action array.

 As a concrete example, here is how the Ball3DAgent class implements these methods:

-- `Agent.OnEpisodeBegin()` — Resets the agent cube and ball to their starting
-  positions. The function randomizes the reset values so that the training generalizes to more than a specific starting position and agent cube orientation.
-- `Agent.CollectObservations(VectorSensor sensor)` — Adds information about the
-  orientation of the agent cube, the ball velocity, and the relative position between the ball and the cube. Since the `CollectObservations()` method calls `VectorSensor.AddObservation()` such that vector size adds up to 8, the Behavior Parameters of the Agent are set with vector observation space with a state size of 8.
-- `Agent.OnActionReceived()` — The action results
-  in a small change in the agent cube's rotation at each step. In this example, an Agent receives a small positive reward for each step it keeps the ball on the agent cube's head and a larger, negative reward for dropping the ball. An Agent's episode is also ended when it drops the ball so that it will reset with a new ball for the next simulation step.
+- `Agent.OnEpisodeBegin()` — Resets the agent cube and ball to their starting positions. The function randomizes the reset values so that the training generalizes to more than a specific starting position and agent cube orientation.
+- `Agent.CollectObservations(VectorSensor sensor)` — Adds information about the orientation of the agent cube, the ball velocity, and the relative position between the ball and the cube. Since the `CollectObservations()` method calls `VectorSensor.AddObservation()` such that vector size adds up to 8, the Behavior Parameters of the Agent are set with vector observation space with a state size of 8.
+- `Agent.OnActionReceived()` — The action results in a small change in the agent cube's rotation at each step. In this example, an Agent receives a small positive reward for each step it keeps the ball on the agent cube's head and a larger, negative reward for dropping the ball. An Agent's episode is also ended when it drops the ball so that it will reset with a new ball for the next simulation step.
 - `Agent.Heuristic()` - Converts the keyboard inputs into actions.

 ## Decisions
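(Illustrative only, not part of the commit.) A minimal sketch of the `Heuristic()` pattern described in the list above, writing into the action buffer that is passed in rather than allocating a new one. The class name and input axes are placeholders, and the signature assumes a recent ML-Agents release where actions arrive as `ActionBuffers`.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

public class KeyboardControlledAgent : Agent
{
    public override void Heuristic(in ActionBuffers actionsOut)
    {
        // Write directly into the provided buffer; a newly allocated array
        // would be discarded and the original actions would stay at zero.
        var continuousActions = actionsOut.ContinuousActions;
        continuousActions[0] = Input.GetAxis("Horizontal");
        continuousActions[1] = Input.GetAxis("Vertical");
    }
}
```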
@@ -36,16 +29,12 @@ In order for an agent to learn, the observations should include all the informat

 ### Generating Observations
 ML-Agents provides multiple ways for an Agent to make observations:
-1. Overriding the `Agent.CollectObservations()` method and passing the
-   observations to the provided `VectorSensor`.
-1. Adding the `[Observable]` attribute to fields and properties on the Agent.
-1. Implementing the `ISensor` interface, using a `SensorComponent` attached to
-   the Agent to create the `ISensor`.
+1. Overriding the `Agent.CollectObservations()` method and passing the observations to the provided `VectorSensor`.
+2. Adding the `[Observable]` attribute to fields and properties on the Agent.
+3. Implementing the `ISensor` interface, using a `SensorComponent` attached to the Agent to create the `ISensor`.

 #### Agent.CollectObservations()
-Agent.CollectObservations() is best used for aspects of the environment which are numerical and non-visual. The Policy class calls the
-`CollectObservations(VectorSensor sensor)` method of each Agent. Your
-implementation of this function must call `VectorSensor.AddObservation` to add vector observations.
+Agent.CollectObservations() is best used for aspects of the environment which are numerical and non-visual. The Policy class calls the `CollectObservations(VectorSensor sensor)` method of each Agent. Your implementation of this function must call `VectorSensor.AddObservation` to add vector observations.

 The `VectorSensor.AddObservation` method provides a number of overloads for adding common types of data to your observation vector. You can add Integers and booleans directly to the observation vector, as well as some common Unity data types such as `Vector2`, `Vector3`, and `Quaternion`.
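(Illustrative only, not part of the commit.) A minimal sketch of a `CollectObservations()` override using several of the `AddObservation` overloads mentioned above. The class name and the `target` and `rb` fields are placeholders; for this sketch the observations add up to 11 floats, so the Behavior Parameters Space Size would be 11.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class ExampleObservationAgent : Agent
{
    public Transform target;   // placeholder reference assigned in the Inspector
    Rigidbody rb;

    public override void Initialize()
    {
        rb = GetComponent<Rigidbody>();
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // 3 floats: relative position to the target
        sensor.AddObservation(target.position - transform.position);
        // 3 floats: the agent's velocity
        sensor.AddObservation(rb.velocity);
        // 4 floats: the agent's orientation as a Quaternion
        sensor.AddObservation(transform.rotation);
        // 1 float: booleans (and ints) can be added directly as well
        sensor.AddObservation(rb.IsSleeping());
    }
}
```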

@@ -90,8 +79,7 @@ public class Ball3DHardAgent : Agent {
 }
 }
 ```
-`ObservableAttribute` currently supports most basic types (e.g. floats, ints,
-bools), as well as `Vector2`, `Vector3`, `Vector4`, `Quaternion`, and enums.
+`ObservableAttribute` currently supports most basic types (e.g. floats, ints, bools), as well as `Vector2`, `Vector3`, `Vector4`, `Quaternion`, and enums.

 The behavior of `ObservableAttribute`s are controlled by the "Observable Attribute Handling" in the Agent's `Behavior Parameters`. The possible values for this are:
 * **Ignore** (default) - All ObservableAttributes on the Agent will be ignored. If there are no ObservableAttributes on the Agent, this will result in the fastest initialization time.
@@ -102,9 +90,7 @@ The behavior of `ObservableAttribute`s are controlled by the "Observable Attribu

 Internally, ObservableAttribute uses reflection to determine which members of the Agent have ObservableAttributes, and also uses reflection to access the fields or invoke the properties at runtime. This may be slower than using CollectObservations or an ISensor, although this might not be enough to noticeably affect performance.

-**NOTE**: you do not need to adjust the Space Size in the Agent's
-`Behavior Parameters` when you add `[Observable]` fields or properties to an
-Agent, since their size can be computed before they are used.
+**NOTE**: you do not need to adjust the Space Size in the Agent's `Behavior Parameters` when you add `[Observable]` fields or properties to an Agent, since their size can be computed before they are used.

 #### ISensor interface and SensorComponents
 The `ISensor` interface is generally intended for advanced users. The `Write()` method is used to actually generate the observation, but some other methods such as returning the shape of the observations must also be implemented.
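(Illustrative only, not part of the commit.) A minimal sketch of the `[Observable]` approach discussed in the two hunks above: a field and a property exposed as observations, with no Space Size adjustment needed. The class and member names are placeholders, and the `Unity.MLAgents.Sensors.Reflection` namespace is assumed to be where `ObservableAttribute` lives.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors.Reflection;
using UnityEngine;

public class ObservableExampleAgent : Agent
{
    // Each [Observable] member is collected via reflection (when "Observable
    // Attribute Handling" is not set to Ignore); their sizes are computed
    // automatically, so Space Size in Behavior Parameters is left untouched.
    [Observable]
    public float Fuel;

    [Observable]
    public Vector3 VelocityEstimate { get; set; }
}
```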
@@ -616,8 +602,7 @@ Multi Agent Groups should be used with the MA-POCA trainer, which is explicitly

 See the [Cooperative Push Block](Learning-Environment-Examples.md#cooperative-push-block) environment for an example of how to use Multi Agent Groups, and the [Dungeon Escape](Learning-Environment-Examples.md#dungeon-escape) environment for an example of how the Multi Agent Group can be used with agents that are removed from the scene mid-episode.

-**NOTE**: Groups differ from Teams (for competitive settings) in the following way - Agents
-working together should be added to the same Group, while agents playing against each other should be given different Team Ids. If in the Scene there is one playing field and two teams, there should be two Groups, one for each team, and each team should be assigned a different Team Id. If this playing field is duplicated many times in the Scene (e.g. for training speedup), there should be two Groups _per playing field_, and two unique Team Ids _for the entire Scene_. In environments with both Groups and Team Ids configured, MA-POCA and self-play can be used together for training. In the diagram below, there are two agents on each team, and two playing fields where teams are pitted against each other. All the blue agents should share a Team Id (and the orange ones a different ID), and there should be four group managers, one per pair of agents.
+**NOTE**: Groups differ from Teams (for competitive settings) in the following way - Agents working together should be added to the same Group, while agents playing against each other should be given different Team Ids. If in the Scene there is one playing field and two teams, there should be two Groups, one for each team, and each team should be assigned a different Team Id. If this playing field is duplicated many times in the Scene (e.g. for training speedup), there should be two Groups _per playing field_, and two unique Team Ids _for the entire Scene_. In environments with both Groups and Team Ids configured, MA-POCA and self-play can be used together for training. In the diagram below, there are two agents on each team, and two playing fields where teams are pitted against each other. All the blue agents should share a Team Id (and the orange ones a different ID), and there should be four group managers, one per pair of agents.

 <p align="center"> <img src="images/groupmanager_teamid.png" alt="Group Manager vs Team Id" width="650" border="10" /> </p>
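(Illustrative only, not part of the commit.) A minimal sketch of the Group-versus-Team split described above: one `SimpleMultiAgentGroup` per cooperating side on a playing field, while the competing sides keep distinct Team Ids in their `Behavior Parameters`. The class, field, and method names are placeholders; the `SimpleMultiAgentGroup` calls follow the pattern used in the toolkit's cooperative examples.

```csharp
using System.Collections.Generic;
using Unity.MLAgents;
using UnityEngine;

// One instance of this controller per playing field.
public class PlayingFieldController : MonoBehaviour
{
    public List<Agent> blueTeamAgents;    // Team Id 0 set in each agent's Behavior Parameters
    public List<Agent> orangeTeamAgents;  // Team Id 1 set in each agent's Behavior Parameters

    SimpleMultiAgentGroup blueGroup;
    SimpleMultiAgentGroup orangeGroup;

    void Start()
    {
        // Two Groups per playing field: cooperating agents share a group.
        blueGroup = new SimpleMultiAgentGroup();
        orangeGroup = new SimpleMultiAgentGroup();
        foreach (var agent in blueTeamAgents) blueGroup.RegisterAgent(agent);
        foreach (var agent in orangeTeamAgents) orangeGroup.RegisterAgent(agent);
    }

    public void OnBlueTeamScored()
    {
        // Group-level rewards and episode termination for both sides.
        blueGroup.AddGroupReward(1f);
        orangeGroup.AddGroupReward(-1f);
        blueGroup.EndGroupEpisode();
        orangeGroup.EndGroupEpisode();
    }
}
```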

com.unity.ml-agents/Documentation~/ML-Agents-Overview.md

Lines changed: 1 addition & 2 deletions
@@ -245,8 +245,7 @@ Unlike other platforms, where the agent’s observation might be limited to a si
 When visual observations are utilized, the ML-Agents Toolkit leverages convolutional neural networks (CNN) to learn from the input images. We offer three network architectures:

 - a simple encoder which consists of two convolutional layers
-- the implementation proposed by
-  [Mnih et al.](https://www.nature.com/articles/nature14236), consisting of three convolutional layers,
+- the implementation proposed by [Mnih et al.](https://www.nature.com/articles/nature14236), consisting of three convolutional layers,
 - the [IMPALA Resnet](https://arxiv.org/abs/1802.01561) consisting of three stacked layers, each with two residual blocks, making a much larger network than the other two.

 The choice of the architecture depends on the visual complexity of the scene and the available computational resources.

com.unity.ml-agents/Documentation~/Package-Settings.md

Lines changed: 0 additions & 6 deletions
@@ -21,9 +21,3 @@ By clicking the gear on the top right you'll see all available settings listed i
 This allows you to create different settings for different scenarios. For example, you can create two separate settings for training and inference, and specify which one you want to use according to what you're currently running.

 ![Multiple Settings](images/multiple-settings.png)
-
-
-
-
-
-
