Merged
2 changes: 1 addition & 1 deletion packages/tasks/src/tasks/placeholder/spec/output.json
@@ -9,7 +9,7 @@
"properties": {
"meaningful_output_name": {
"type": "string",
"description": "TODO: Describe what is outputed by the inference here"
"description": "TODO: Describe what is outputted by the inference here"
}
},
"required": ["meaningfulOutputName"]
2 changes: 1 addition & 1 deletion packages/tasks/src/tasks/reinforcement-learning/about.md
@@ -48,7 +48,7 @@ Observations and states are the information our agent gets from the environment.

Inference in reinforcement learning differs from other modalities, in which there's a model and test data. In reinforcement learning, once you have trained an agent in an environment, you try to run the trained agent for additional steps to get the average reward.

- A typical training cycle consists of gathering experience from the environment, training the agent, and running the agent on a test environment to obtain average reward. Below there's a snippet on how you can interact with the environment using the `gymnasium` library, train an agent using `stable-baselines3`, evalute the agent on test environment and infer actions from the trained agent.
+ A typical training cycle consists of gathering experience from the environment, training the agent, and running the agent on a test environment to obtain average reward. Below there's a snippet on how you can interact with the environment using the `gymnasium` library, train an agent using `stable-baselines3`, evaluate the agent on test environment and infer actions from the trained agent.

```python
# Here we are running 20 episodes of CartPole-v1 environment, taking random actions