# ScalaPy Gym Facade
A [ScalaPy](https://scalapy.dev/) Facade for OpenAI Gym!
## Quality
[DeepSource](https://deepsource.io/gh/cric96/scalapy-gym/?ref=repository-badge)
[Codacy grade](https://www.codacy.com/gh/cric96/scalapy-gym/dashboard?utm_source=github.com&utm_medium=referral&utm_content=cric96/scalapy-gym&utm_campaign=Badge_Grade)
[Codacy coverage](https://www.codacy.com/gh/cric96/scalapy-gym/dashboard?utm_source=github.com&utm_medium=referral&utm_content=cric96/scalapy-gym&utm_campaign=Badge_Coverage)
## CI status
| Main | Develop |
|---|---|
|  |  |
## Links
[Scaladoc](https://cric96.github.io/scalapy-gym/latest/api/)

## What this project supports
The main aim of this facade is to use the basic environments described in [OpenAI Gym](http://gym.openai.com/envs/#classic_control).

Currently, there is no interest in creating environments on the Scala side. The intended workflow is:
- develop your reinforcement learning algorithms in Scala,
- create a functional facade to interact with ScalaPy Gym,
- test your algorithms on the OpenAI Gym environments and share your results!
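
As a toy illustration of the first step (reinforcement learning logic written in pure Scala, independent of any Gym binding), here is a minimal tabular Q-learning update. Everything here — the object name, the `Int` state/action encoding, the default hyperparameters — is made up for the sketch and is not part of this library:

```scala
object QLearningSketch {
  /** One tabular Q-learning update.
    * `q` maps (state, action) pairs to estimated values;
    * `alpha` is the learning rate, `gamma` the discount factor.
    */
  def update(
      q: Map[(Int, Int), Double],
      state: Int,
      action: Int,
      reward: Double,
      nextState: Int,
      actions: Seq[Int],
      alpha: Double = 0.1,
      gamma: Double = 0.99
  ): Map[(Int, Int), Double] = {
    val old      = q.getOrElse((state, action), 0.0)
    val bestNext = actions.map(a => q.getOrElse((nextState, a), 0.0)).max
    q.updated((state, action), old + alpha * (reward + gamma * bestNext - old))
  }

  def main(args: Array[String]): Unit = {
    // Single update from an empty table: new value is alpha * reward
    val q = update(Map.empty, state = 0, action = 1, reward = 1.0, nextState = 2, actions = 0 to 3)
    println(q((0, 1))) // prints 0.1
  }
}
```

Code like this only needs the environment to hand it states, actions, and rewards, which is exactly what the facade below provides.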

## Installation
First of all, set up your ScalaPy project correctly; please refer to the [ScalaPy documentation](https://scalapy.dev/docs/).

Then, add this library as a dependency in your sbt build file:
```
libraryDependencies += "io.github.cric96" %% "scalapy-gym" % "<x.y.z>"
```
The latest version is shown here: [Maven Central](https://maven-badges.herokuapp.com/maven-central/io.github.cric96/scalapy-gym_2.13/badge.svg)

Then install the OpenAI Gym Python dependencies. I suggest using `pyenv`. The main dependencies are:
- gym
- scipy

See [requirements.txt](/requirements.txt).

To use other environments (`box2d`, `MuJoCo`, or `Atari`), please refer to the [OpenAI Gym documentation](http://gym.openai.com/docs/).

## How to use

This library tries to make environments type-safe, so you have to define:
- the action type
- the observation type
- the action space type
- the observation space type

For example, for [FrozenLake](http://gym.openai.com/envs/FrozenLake-v0/) you should write:
```scala
val env = Gym.make[Int, Int, Discrete, Discrete]("FrozenLake-v0")
```

If you do not care about the action and observation types, you can write:
```scala
val env = Gym.unsafe("FrozenLake-v0")
```

A simple loop that advances the simulation could be:
```scala
import io.github.cric96.gym.Gym

val env = Gym.unsafe("FrozenLake-v0") // or EnvFactory.ToyText.frozenLakeV0
env.reset()
val observations = (0 to 1000)
  .tapEach(_ => env.render())
  .map(_ => env.step(env.actionSpace.sample()))
env.close()
```
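
The results collected in `observations` can then be processed with ordinary collection operations. Below is a sketch in plain Scala, assuming each step yields an `(observation, reward, done)` triple mirroring Gym's convention; the exact result type exposed by this facade may differ, and `RewardSketch` is a name invented for the example:

```scala
object RewardSketch {
  // Sum the rewards of one episode, ignoring anything after the first
  // terminal step. The (obs, reward, done) shape mirrors Gym's step
  // convention; the facade's actual result type may differ.
  def episodeReward(steps: Seq[(Int, Double, Boolean)]): Double = {
    val (running, rest) = steps.span(s => !s._3)
    (running ++ rest.take(1)).map(_._2).sum
  }

  def main(args: Array[String]): Unit = {
    val steps = Seq((1, 0.0, false), (2, 0.0, false), (3, 1.0, true), (4, 5.0, false))
    println(episodeReward(steps)) // prints 1.0: rewards after `done` are discarded
  }
}
```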

The Python counterpart is:
```python
import gym

env = gym.make("FrozenLake-v0")
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample())  # take a random action
env.close()
```

As you can see, the experience is very similar :)

Some environments already have the correct typing (see [EnvFactory](/src/main/scala/gym/envs/EnvFactory.scala)).

### Typings
- ToyText
  - [x] FrozenLake
  - [x] GuessingGame
  - [x] HotterColder
  - [x] NChain
  - [x] Roulette
- ClassicControl
  - [x] Acrobot
  - [x] CartPole
  - [x] MountainCar
  - [x] MountainCarContinuous
  - [x] Pendulum
- [ ] Atari
- [ ] Box2D
- [ ] MuJoCo
- [ ] Robotics
- [ ] Algorithms

## Example
- [Basic Q-Learning implementation](https://github.com/cric96/scala-rl-examples/blob/main/qlearning.ipynb)