Commit cd3de90

Fix bug with result type. Update documentation and CMakeLists
1 parent badf71e commit cd3de90

7 files changed, +63 -281 lines changed

CMakeLists.txt

Lines changed: 2 additions & 0 deletions
@@ -102,13 +102,15 @@ FILE(GLOB SRCS src/rlenvs/*.cpp
 src/rlenvs/envs/gymnasium/classic_control/*.cpp
 src/rlenvs/envs/gymnasium/classic_control/vector/*.cpp
 src/rlenvs/envs/gdrl/*.cpp
+src/rlenvs/envs/multi_armed_bandits/*.cpp
 #src/rlenvs/envs/gym_pybullet_drones/*.cpp
 src/rlenvs/envs/grid_world/*.cpp
 src/rlenvs/envs/connect2/*.cpp
 src/rlenvs/dynamics/*.cpp
 src/rlenvs/utils/*.cpp
 src/rlenvs/utils/io/*.cpp
 src/rlenvs/utils/io/tensor_board_server/*.cpp
+src/rlenvs/utils/maths/statistics/distributions/*.cpp
 src/rlenvs/utils/geometry/*.cpp
 src/rlenvs/utils/geometry/shapes/*.cpp
 src/rlenvs/utils/geometry/mesh/*.cpp

README.md

Lines changed: 1 addition & 265 deletions
@@ -6,270 +6,6 @@ using C++. In addition, the library provides various utilities such as experime
 representing trajectories via waypoints and simple implementation of popular dynamics such as
 quadrotor dynamics.
 
-## Environments
-
-Currently, ```rlenvscpp``` provides the following environments:
-
-| Environment | Use REST | Example |
-| :---------------- | :----------: | :----: |
-| FrozenLake 4x4 map | Yes | <a href="examples/example_1/example_1.cpp">example_1</a> |
-| FrozenLake 8x8 map | Yes | TODO |
-| Blackjack | Yes | <a href="examples/example_1/example_1.cpp">example_1</a> |
-| CliffWalking | Yes | <a href="examples/example_1/example_1.cpp">example_1</a> |
-| CartPole | Yes | TODO |
-| MountainCar | Yes | TODO |
-| Taxi | Yes | <a href="examples/example_1/example_1.cpp">example_1</a> |
-| Pendulum | Yes | <a href="examples/example_6/example_6.cpp">example_6</a> |
-| Acrobot | Yes | TODO |
-| GymWalk | Yes | TODO |
-| gym-pybullet-drones | TODO | TODO |
-| GridWorld | No | <a href="examples/example_5/example_5.cpp">example_5</a> |
-| Connect2 | No | <a href="examples/example_7/example_7.cpp">example_7</a> |
-
-The Gymnasium (former OpenAI-Gym) environments utilise a REST API to communicate requests to/from the
-environment and ```rlenvscpp```.
-
-Some environments have a vector implementation meaning multiple instances of the same
-environment. Currently, ```rlenvscpp``` provides the following vector environments:
-
-| Environment | Use REST | Example |
-| :---------------- | :----------: | :----: |
-| AcrobotV | Yes | <a href="examples/example_8/example_8.cpp">example_8</a> |
-
 Various RL algorithms using the environments can be found at <a href="https://github.com/pockerman/cuberl/tree/master">cuberl</a>.
 
-### How to use
-
-The following is an example how to use the
-```FrozenLake``` environment from <a href="https://github.com/Farama-Foundation/Gymnasium/tree/main">Gymnasium</a>.
-
-```cpp
-#include "rlenvs/rlenvs_types_v2.h"
-#include "rlenvs/envs/gymnasium/toy_text/frozen_lake_env.h"
-#include "rlenvs/envs/api_server/apiserver.h"
-
-#include <iostream>
-#include <string>
-#include <unordered_map>
-#include <any>
-
-namespace example_1{
-
-const std::string SERVER_URL = "http://0.0.0.0:8001/api";
-
-using rlenvscpp::envs::gymnasium::FrozenLake;
-using rlenvscpp::envs::RESTApiServerWrapper;
-
-
-void test_frozen_lake(const RESTApiServerWrapper& server){
-
-FrozenLake<4> env(server);
-
-std::cout<<"Environame URL: "<<env.get_url()<<std::endl;
-
-// make the environment
-std::unordered_map<std::string, std::any> options;
-options.insert({"is_slippery", false});
-env.make("v1", options);
-
-std::cout<<"Is environment created? "<<env.is_created()<<std::endl;
-std::cout<<"Is environment alive? "<<env.is_alive()<<std::endl;
-std::cout<<"Number of valid actions? "<<env.n_actions()<<std::endl;
-std::cout<<"Number of states? "<<env.n_states()<<std::endl;
-
-// reset the environment
-auto time_step = env.reset(42, std::unordered_map<std::string, std::any>());
-
-std::cout<<"Reward on reset: "<<time_step.reward()<<std::endl;
-std::cout<<"Observation on reset: "<<time_step.observation()<<std::endl;
-std::cout<<"Is terminal state: "<<time_step.done()<<std::endl;
-
-//...print the time_step
-std::cout<<time_step<<std::endl;
-
-// take an action in the environment
-// 2 = RIGHT
-auto new_time_step = env.step(2);
-
-std::cout<<new_time_step<<std::endl;
-
-// get the dynamics of the environment for the given state and action
-auto state = 0;
-auto action = 1;
-auto dynamics = env.p(state, action);
-
-std::cout<<"Dynamics for state="<<state<<" and action="<<action<<std::endl;
-
-for(auto item:dynamics){
-
-std::cout<<std::get<0>(item)<<std::endl;
-std::cout<<std::get<1>(item)<<std::endl;
-std::cout<<std::get<2>(item)<<std::endl;
-std::cout<<std::get<3>(item)<<std::endl;
-}
-
-action = env.sample_action();
-new_time_step = env.step(action);
-
-std::cout<<new_time_step<<std::endl;
-
-// synchronize the environment
-env.sync(std::unordered_map<std::string, std::any>());
-
-auto copy_env = env.make_copy(1);
-copy_env.reset();
-
-std::cout<<"Org env cidx: "<<env.cidx()<<std::endl;
-std::cout<<"Copy env cidx: "<<copy_env.cidx()<<std::endl;
-
-copy_env.close();
-
-// close the environment
-env.close();
-
-}
-
-}
-
-
-int main(){
-
-using namespace example_1;
-
-RESTApiServerWrapper server(SERVER_URL, true);
-
-std::cout<<"Testing FrozenLake..."<<std::endl;
-example_1::test_frozen_lake(server);
-std::cout<<"===================="<<std::endl;
-return 0;
-}
-
-```
-
-In general, the environments exposed by the library follow the semantics in <a href="https://github.com/deepmind/dm_env/blob/master/docs/index.md">Environment API and Semantics</a> specification.
-For more details see the <a href="doc/env_spec.md">```rlenvscpp``` environment specification</a> document.
-
-The general use case is to build the library and link it with your driver code to access its functionality.
-The environments specified as using REST in the tables above, that is all ```Gymnasium```, ```gym_pybullet_drones``` and ```GymWalk```
-environments are accessed via a client/server pattern. Namely, they are exposed via an API developed using
-<a href="https://fastapi.tiangolo.com/">FastAPI</a>.
-You need to fire up the FastAPI server, see dependencies, before using the environments in your code.
-To do so
-
-```
-./start_uvicorn.sh
-```
-
-By default the ```uvicorn``` server listents on port 8001. Change this if needed. You can access the OpenAPI specification at
-
-```
-http://0.0.0.0:8001/docs
-```
-
-Note that currently the implementation is not thread/process safe i.e. if multiple threads/processes access the environment
-a global instance of the environment is manipulated. Thus no session based environment exists.
-However, you can create copies of the same environment and access this via its dedicate index.
-If just one thread/process touches this specific environment you should be ok.
-Notice that the FastAPI server only uses a single process to manage all the environments.
-In addition, if you need multiple instances of the same environment you can also use one
-of the exissting vectorised environments (see table above).
-
-Finally, you can choose to launch several instances of ```uvirocrn``` (listening on different ports).
-However in this case you need to implement all the interactions logic yourself as currently no implementation exists to handle such a scenario.
-
-## Dynamics
-
-Apart from the exposed environments, ```rlenvscpp``` exposes classes that
-describe the dynamics of some popular rigid bodies:
-
-| Dynamics | Example |
-| :---------------- | :----------------------------------------------------------: |
-| Differential drive | <a href="examples/example_9/example_9.cpp">example_9</a> |
-| Quadrotor | <a href="examples/example_10/example_10.cpp">example_10</a> |
-| Bicycle vehicle | TODO |
-
-## Miscellaneous
-
-| Item | Example |
-| :---------------- | :----------------------------------------------------------: |
-| Environment trajectory | <a href="examples/example_3/example_3.cpp">example_3</a> |
-| WaypointTrajectory | <a href="examples/example_11/example_11.cpp">example_11</a> |
-| TensorboardServer | <a href="examples/example_12/example_12.cpp">example_12</a> |
-
-## Dependencies
-
-The library has the following general dependencies
-
-- A compiler that supports C++20 e.g. g++-11
-- <a href="https://www.boost.org/">Boost C++</a>
-- <a href="https://cmake.org/">CMake</a> >= 3.10
-- <a href="https://github.com/google/googletest">Gtest</a> (if configured with tests)
-- <a href="https://eigen.tuxfamily.org/index.php?title=Main_Page">Eigen3</a>
-
-Using the Gymnasium environments requires <a href="https://github.com/Farama-Foundation/Gymnasium/tree/main">Gymnasium</a> installed on your machine.
-In addition, you need to install
-
-- <a href="https://fastapi.tiangolo.com/">FastAPI</a>
-- <a href="https://www.uvicorn.org/">Uvicorn</a>
-- <a href="https://docs.pydantic.dev/latest/">Pydantic</a>
-
-By installing the requirement under ```requirements.txt``` should set your Python environment up correctly.
-
-In addition, the library also incorporates, see ```(src/extern)```, the following libraries
-
-- <a href="https://github.com/elnormous/HTTPRequest">HTTPRequest</a>
-- <a href="https://github.com/nlohmann/json">nlohmann/json</a>
-
-There are extra dependencies if you want to generate the documentation. Namely,
-
-- Doxygen
-- Sphinx
-- sphinx_rtd_theme
-- breathe
-- m2r2
-
-## Installation
-
-The usual CMake based installation process is used. Namely
-
-```
-mkdir build && cd build && cmake ..
-make install
-```
-
-You can toggle the following variables
-
-- CMAKE_BUILD_TYPE (default is RELEASE)
-- ENABLE_TESTS_FLAG (default is OFF)
-- ENABLE_EXAMPLES_FLAG (default is OFF)
-- ENABLE_DOC_FLAG (default is OFF)
-
-For example enbling the examples
-
-```
-cmake -DENABLE_EXAMPLES_FLAG=ON ..
-make install
-```
-
-
-### Run the tests
-
-You can execute all the tests by running the helper script ```execute_tests.sh```.
-
-### Issues
-
-#### Could not find ```boost_system```
-
-It is likely that you are missing the boost_system library with your local Boost installation. This may be the case
-is you installed boost via a package manager. On a Ubuntu machine the following should resolve the issue
-
-```
-sudo apt-get update -y
-sudo apt-get install -y libboost-system-dev
-```
-
-#### FastAPI throws 422 Unpocessable entity
-
-Typically, this is a problem with how the client (400-range error) specified the data
-to be sent to the server.
-
+The documentation for the library can be found <a href="https://rlenvscpp.readthedocs.io/en/latest/">here</a>

docs/overview.rst

Lines changed: 24 additions & 12 deletions
@@ -6,10 +6,13 @@ using C++. In addition, the library provides various utilities such as experime
 representing trajectories via waypoints and simple implementation of popular dynamics such as
 quadrotor dynamics.
 
-Environments
-------------
+Various RL algorithms using the environments can be found at `cuberl <https://github.com/pockerman/cuberl/tree/master>`_.
+
+Gymnasium environments
+-----------------------
 
-Currently, ``rlenvscpp`` provides the following environments:
+Currently, ``rlenvscpp`` provides the following environments.
+Note that you will need to have Gymnasium installed.
 
 +---------------------+--------------+-----------------------------------------------------------------------------------------------------+
 | Environment | Use REST | Example |
@@ -32,18 +35,13 @@ Currently, ``rlenvscpp`` provides the following environments:
 +---------------------+--------------+-----------------------------------------------------------------------------------------------------+
 | Acrobot | Yes | TODO |
 +---------------------+--------------+-----------------------------------------------------------------------------------------------------+
-| GymWalk | Yes | TODO |
-+---------------------+--------------+-----------------------------------------------------------------------------------------------------+
-| gym-pybullet-drones | TODO | TODO |
-+---------------------+--------------+-----------------------------------------------------------------------------------------------------+
-| GridWorld | No | `example_1 <https://github.com/pockerman/rlenvscpp/blob/master/examples/example_5/example_5.cpp>`_ |
-+---------------------+--------------+-----------------------------------------------------------------------------------------------------+
-| Connect2 | No | `example_1 <https://github.com/pockerman/rlenvscpp/blob/master/examples/example_7/example_7.cpp>`_ |
-+---------------------+--------------+-----------------------------------------------------------------------------------------------------+
 
 The Gymnasium (former OpenAI-Gym) environments utilise a REST API to communicate requests to/from the
 environment and ``rlenvscpp``.
 
+Gymnasium vector environments
+-----------------------------
+
 Some environments have a vector implementation meaning multiple instances of the same
 environment. Currently, ``rlenvscpp`` provides the following vector environments:
 
@@ -53,8 +51,22 @@ environment. Currently, ``rlenvscpp`` provides the following vector environments
 | AcrobotV | Yes | `example_8 <https://github.com/pockerman/rlenvscpp/blob/master/examples/example_8/example_8.cpp>`_ |
 +---------------------+--------------+-----------------------------------------------------------------------------------------------------+
 
-Various RL algorithms using the environments can be found at `cuberl <https://github.com/pockerman/cuberl/tree/master>`_.
+Miscellaneous environments
+--------------------------
 
++---------------------+--------------+-----------------------------------------------------------------------------------------------------+
+| Environment | Use REST | Example |
++=====================+==============+=====================================================================================================+
+| GymWalk | Yes | TODO |
++---------------------+--------------+-----------------------------------------------------------------------------------------------------+
+| gym-pybullet-drones | TODO | TODO |
++---------------------+--------------+-----------------------------------------------------------------------------------------------------+
+| GridWorld | No | `example_5 <https://github.com/pockerman/rlenvscpp/blob/master/examples/example_5/example_5.cpp>`_ |
++---------------------+--------------+-----------------------------------------------------------------------------------------------------+
+| Connect2 | No | `example_7 <https://github.com/pockerman/rlenvscpp/blob/master/examples/example_7/example_7.cpp>`_ |
++---------------------+--------------+-----------------------------------------------------------------------------------------------------+
+| MultiArmedBandits | No | TODO |
++---------------------+--------------+-----------------------------------------------------------------------------------------------------+
 
 Dynamics
 ---------
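
The commit lists a MultiArmedBandits environment (no REST, example still TODO) and adds the `multi_armed_bandits` and `statistics/distributions` sources to the build. The library's actual class is not shown in this diff, so the following is only a stand-alone sketch of what a Bernoulli multi-armed bandit environment could look like. The class name `BernoulliBandit`, its constructor arguments and its `step` signature are assumptions made for illustration and are not the rlenvscpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-alone sketch: BernoulliBandit is NOT the rlenvscpp class.
class BernoulliBandit {
public:
    // success_probs[i] is the assumed probability that arm i pays reward 1.
    BernoulliBandit(std::vector<double> success_probs, unsigned int seed)
        : probs_(std::move(success_probs)), rng_(seed) {
        assert(!probs_.empty() && "at least one arm is required");
        for (double p : probs_) {
            if (p < 0.0 || p > 1.0) {
                throw std::invalid_argument("probabilities must lie in [0, 1]");
            }
        }
    }

    // Number of arms, i.e. the size of the action space.
    std::size_t n_actions() const noexcept { return probs_.size(); }

    // Pull one arm: returns 1.0 with that arm's probability, else 0.0.
    double step(std::size_t arm) {
        if (arm >= probs_.size()) {
            throw std::out_of_range("invalid arm index");
        }
        std::bernoulli_distribution pull(probs_[arm]);
        return pull(rng_) ? 1.0 : 0.0;
    }

private:
    std::vector<double> probs_;  // per-arm success probabilities
    std::mt19937 rng_;           // seeded generator for reproducible pulls
};
```

An agent would call `step` repeatedly and keep a running reward average per arm. Because no server round trip is involved, such an environment fits the "Use REST: No" column of the table above.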
