
Commit 3370a65

Merge branch 'master' into development
2 parents: b110d5a + 5ec6dc3

File tree (5 files changed, +23 −23 lines changed):

- docs/Getting-Started-with-Balance-Ball.md
- docs/Limitations-&-Common-Issues.md
- docs/Unity-Agents---Python-API.md
- docs/best-practices-ppo.md
- docs/installation.md

docs/Getting-Started-with-Balance-Ball.md

Lines changed: 18 additions & 18 deletions
@@ -2,9 +2,9 @@

![Balance Ball](../images/balance.png)

-This tutorial will walk through the end-to-end process of installing Unity Agents, building an example environment, training an agent in it, and finally embedding the trained model into the Unity environment.
+This tutorial will walk through the end-to-end process of installing Unity Agents, building an example environment, training an agent in it, and finally embedding the trained model into the Unity environment.

-Unity ML Agents contains a number of example environments which can be used as templates for new environments, or as ways to test a new ML algorithm to ensure it is functioning correctly.
+Unity ML Agents contains a number of example environments which can be used as templates for new environments, or as ways to test a new ML algorithm to ensure it is functioning correctly.

In this walkthrough we will be using the **3D Balance Ball** environment. The environment contains a number of platforms and balls. Platforms can act to keep the ball up by rotating either horizontally or vertically. Each platform is an agent which is rewarded the longer it can keep a ball balanced on it, and provided a negative reward for dropping the ball. The goal of the training process is to have the platforms learn to never drop the ball.
@@ -15,7 +15,7 @@ Let's get started!
In order to install and set-up the Python and Unity environments, see the instructions [here](installation.md).

## Building Unity Environment
-Launch the Unity Editor, and log in, if necessary.
+Launch the Unity Editor, and log in, if necessary.

1. Open the `unity-environment` folder using the Unity editor. *(If this is not first time running Unity, you'll be able to skip most of these immediate steps, choose directly from the list of recently opened projects)*
    - On the initial dialog, choose `Open` on the top options
@@ -38,24 +38,24 @@ Launch the Unity Editor, and log in, if necessary.

To launch jupyter, run in the command line:

-`jupyter notebook`
+`jupyter notebook`

Then navigate to `localhost:8888` to access the notebooks. If you're new to jupyter, check out the [quick start guide](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/execute.html) before you continue.

To ensure that your environment and the Python API work as expected, you can use the `python/Basics` Jupyter notebook. This notebook contains a simple walkthrough of the functionality of the API. Within `Basics`, be sure to set `env_name` to the name of the environment file you built earlier.

### Training with PPO
-In order to train an agent to correctly balance the ball, we will use a Reinforcement Learning algorithm called Proximal Policy Optimization (PPO). This is a method that has been shown to be safe, efficient, and more general purpose than many other RL algorithms, as such we have chosen it as the example algorithm for use with ML Agents. For more information on PPO, OpenAI has a recent [blog post](https://blog.openai.com/openai-baselines-ppo/) explaining it.
+In order to train an agent to correctly balance the ball, we will use a Reinforcement Learning algorithm called Proximal Policy Optimization (PPO). This is a method that has been shown to be safe, efficient, and more general purpose than many other RL algorithms, as such we have chosen it as the example algorithm for use with ML Agents. For more information on PPO, OpenAI has a recent [blog post](https://blog.openai.com/openai-baselines-ppo/) explaining it.

In order to train the agents within the Ball Balance environment:

1. Open `python/PPO.ipynb` notebook from Jupyter.
-2. Set `env_name` to whatever you named your environment file.
-3. (optional) Set `run_path` directory to your choice.
+2. Set `env_name` to the name of your environment file earlier.
+3. (optional) Set `run_path` directory to your choice.
4. Run all cells of notebook with the exception of the last one under "Export the trained Tensorflow graph."

### Observing Training Progress
-In order to observe the training process in more detail, you can use Tensorboard.
+In order to observe the training process in more detail, you can use Tensorboard.
In your command line, enter into `python` directory and then run :

`tensorboard --logdir=summaries`
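
A note on the `python/Basics` check mentioned in the hunk above: the notebook itself is not part of this diff, but a minimal sketch of that sanity check might look like the following, assuming the Python package shipped in the repository's `python` directory is importable as `unityagents` and that `env_name` matches the environment binary you built (both are assumptions, not shown in this commit):

```python
# Minimal sanity check in the spirit of python/Basics (a sketch, not the
# notebook's actual cells).
from unityagents import UnityEnvironment

env_name = "3DBall"                          # assumed name of your built environment
env = UnityEnvironment(file_name=env_name)   # launches the environment binary
print(env.brain_names)                       # Brains exposed by the environment

brain_name = env.brain_names[0]
info = env.reset(train_mode=True)            # dict of brain name -> BrainInfo
print(info[brain_name].rewards)              # one reward per agent on that brain
env.close()
```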
@@ -73,28 +73,28 @@ From Tensorboard, you will see the summary statistics of six variables:
## Embedding Trained Brain into Unity Environment _[Experimental]_
Once the training process displays an average reward of ~75 or greater, and there has been a recently saved model (denoted by the `Saved Model` message) you can choose to stop the training process by stopping the cell execution. Once this is done, you now have a trained TensorFlow model. You must now convert the saved model to a Unity-ready format which can be embedded directly into the Unity project by following the steps below.

-### Setting up TensorFlowSharp Support
+### Setting up TensorFlowSharp Support
Because TensorFlowSharp support is still experimental, it is disabled by default. In order to enable it, you must follow these steps. Please note that the `Internal` Brain mode will only be available once completing these steps.

1. Make sure you are using Unity 2017.1 or newer.
-2. Make sure the TensorFlowSharp plugin is in your Asset folder. A Plugins folder which includes TF# can be downloaded [here](https://s3.amazonaws.com/unity-agents/TFSharpPlugin.unitypackage).
+2. Make sure the TensorFlowSharp plugin is in your `Assets` folder. A Plugins folder which includes TF# can be downloaded [here](https://s3.amazonaws.com/unity-agents/TFSharpPlugin.unitypackage). Double click and import it once downloaded.
3. Go to `Edit` -> `Project Settings` -> `Player`
4. For each of the platforms you target (**`PC, Mac and Linux Standalone`**, **`iOS`** or **`Android`**):
    1. Go into `Other Settings`.
-    2. Select `Scripting Runtime Version` to `Experimental (.NET 4.6 Equivalent)`
+    2. Select `Scripting Runtime Version` to `Experimental (.NET 4.6 Equivalent)`
    3. In `Scripting Defined Symbols`, add the flag `ENABLE_TENSORFLOW`
5. Restart the Unity Editor.

### Embedding the trained model into Unity

-1. Run the final cell of the notebook under "Export the trained TensorFlow graph" to produce an `<env_name >.bytes` file.
-2. Move `<env_name>.bytes` from `python/models/...` into `unity-environment/Assets/ML-Agents/Examples/3DBall/TFModels/`.
-3. Open the Unity Editor, and select the `3DBall` scene as described above.
-4. Select the `3DBallBrain` object from the Scene hierarchy.
+1. Run the final cell of the notebook under "Export the trained TensorFlow graph" to produce an `<env_name >.bytes` file.
+2. Move `<env_name>.bytes` from `python/models/ppo/` into `unity-environment/Assets/ML-Agents/Examples/3DBall/TFModels/`.
+3. Open the Unity Editor, and select the `3DBall` scene as described above.
+4. Select the `Ball3DBrain` object from the Scene hierarchy.
5. Change the `Type of Brain` to `Internal`.
6. Drag the `<env_name>.bytes` file from the Project window of the Editor to the `Graph Model` placeholder in the `3DBallBrain` inspector window.
-7. Set the `Graph Placeholder` size to 1 (_Note that step 7 and 8 are done because 3DBall is a continuous control environment, and the TensorFlow model requires a noise parameter to decide actions. In cases with discrete control, epsilon is not needed_).
-8. Add a placeholder called `epsilon` with a type of `floating point` and a range of values from 0 to 0.
+7. Set the `Graph Placeholder` size to 1 (_Note that step 7 and 8 are done because 3DBall is a continuous control environment, and the TensorFlow model requires a noise parameter to decide actions. In cases with discrete control, epsilon is not needed_).
+8. Add a placeholder called `epsilon` with a type of `floating point` and a range of values from `0` to `0`.
9. Press the Play button at the top of the editor.

-If you followed these steps correctly, you should now see the trained model being used to control the behavior of the balance ball within the Editor itself. From here you can re-build the Unity binary, and run it standalone with your agent's new learned behavior built right in.
+If you followed these steps correctly, you should now see the trained model being used to control the behavior of the balance ball within the Editor itself. From here you can re-build the Unity binary, and run it standalone with your agent's new learned behavior built right in.
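
The "Export the trained TensorFlow graph" cell referenced in the steps above is not shown in this diff. Purely as a hedged illustration of what producing an `<env_name>.bytes` file generally involves with TensorFlow 1.x (the checkpoint path and output node name below are placeholders, not the project's actual ones):

```python
# Illustration only -- not the PPO notebook's actual export cell. Freezes the
# graph's variables into constants and writes a binary protobuf with a .bytes
# extension that the TensorFlowSharp plugin can load.
import tensorflow as tf
from tensorflow.python.framework import graph_util

env_name = "3DBall"                                             # assumed example name
with tf.Session() as sess:
    saver = tf.train.import_meta_graph("models/model.meta")    # hypothetical checkpoint
    saver.restore(sess, tf.train.latest_checkpoint("models"))
    frozen = graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["action"])                       # hypothetical output node
    tf.train.write_graph(frozen, "models", env_name + ".bytes", as_text=False)
```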

docs/Limitations-&-Common-Issues.md

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ If you receive a file-not-found error while attempting to launch an environment,

If you receive an exception `"Couldn't launch new environment because communication port {} is still in use. "`, you can change the worker number in the python script when calling

-`UnityEnvironment(file_name=filename, worker_num=X)`
+`UnityEnvironment(file_name=filename, worker_id=X)`

### Mean reward : nan
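
Regarding the `worker_id` change above: as a brief sketch (not part of the commit, and assuming the `unityagents` package from the repository's `python` directory), each environment you keep open can be given its own `worker_id` so that it binds a distinct communication port:

```python
# Two environments opened side by side; distinct worker_id values avoid the
# "communication port {} is still in use" exception described above.
from unityagents import UnityEnvironment

env_a = UnityEnvironment(file_name="3DBall", worker_id=0)  # "3DBall" is an assumed example name
env_b = UnityEnvironment(file_name="3DBall", worker_id=1)
env_a.close()
env_b.close()
```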

docs/Unity-Agents---Python-API.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ env = UnityEnvironment(file_name=filename, worker_num=0)
A BrainInfo object contains the following fields:

* **`observations`** : A list of 4 dimensional numpy arrays. Matrix n of the list corresponds to the n<sup>th</sup> observation of the brain.
-* **`states`** : A two dimensional numpy array of dimension `(batch size, state size)` if the state space is continuous and `(batch size, state size)` if the state space is discrete.
+* **`states`** : A two dimensional numpy array of dimension `(batch size, state size)` if the state space is continuous and `(batch size, 1)` if the state space is discrete.
* **`memories`** : A two dimensional numpy array of dimension `(batch size, memory size)` which corresponds to the memories sent at the previous step.
* **`rewards`** : A list as long as the number of agents using the brain containing the rewards they each obtained at the previous step.
* **`local_done`** : A list as long as the number of agents using the brain containing `done` flags (wether or not the agent is done).
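
A short sketch of reading these fields (not part of the commit; it assumes the `unityagents` package and a built environment binary, here called "3DBall"):

```python
# Reset the environment and inspect the BrainInfo fields listed above.
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="3DBall", worker_id=0)
brain_name = env.brain_names[0]
info = env.reset(train_mode=False)[brain_name]   # BrainInfo for that brain

print(len(info.observations))    # list of 4 dimensional numpy arrays
print(info.states.shape)         # (batch size, state size); (batch size, 1) for a discrete state space
print(info.memories.shape)       # (batch size, memory size)
print(len(info.rewards))         # one reward per agent using the brain
print(info.local_done)           # per-agent done flags
env.close()
```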

docs/best-practices-ppo.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
# Best Practices when training with PPO

-The process of training a Reinforcement Learning model can often involve the need to tune the hyperparameters in order to achieve
+The process of training a Reinforcement Learning model can often involve the need to tune the hyperparameters in order to achieve
a level of performance that is desirable. This guide contains some best practices for tuning the training process when the default
parameters don't seem to be giving the level of performance you would like.

@@ -72,7 +72,7 @@ Typical Range: `5e5 - 1e7`

## Training Statistics

-To view training statistics, use Tensorboard. For information on launching and using Tensorboard, see [here](../Getting-Started-with-Balance-Ball.md#observing-training-progress).
+To view training statistics, use Tensorboard. For information on launching and using Tensorboard, see [here](./Getting-Started-with-Balance-Ball.md#observing-training-progress).

### Cumulative Reward

docs/installation.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ In order to train an agent within the framework, you will need to install Python

### Windows Users

-If you are a Windows user who is new to Python/TensorFlow, follow [this guide](https://nitishmutha.github.io/tensorflow/2017/01/22/TensorFlow-with-gpu-for-windows.html) to set up your Python environment.
+If you are a Windows user who is new to Python/TensorFlow, follow [this guide](https://unity3d.college/2017/10/25/machine-learning-in-unity3d-setting-up-the-environment-tensorflow-for-agentml-on-windows-10/) to set up your Python environment.

### Requirements
* Jupyter
