-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[RLlib] Fix failing env step in MultiAgentEnvRunner
.
#55567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR @kamil-kaczmarek! Thanks a lot for working through this complex setup. I left some comments and request another debug iteration to understand better why this happens now and the intended process is somehow interrupted.
@@ -361,6 +363,9 @@ def _sample( | |||
# Try stepping the environment. | |||
results = self._try_env_step(actions_for_env) | |||
if results == ENV_STEP_FAILURE: | |||
logging.warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice message. Can we unify all messages patterns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't notice single pattern established across RLlib. If you like this direction I can unify for this PR first, then expand and unify component by component. WDYT?
@@ -346,7 +348,7 @@ def _sample( | |||
metrics_prefix_key=(MODULE_TO_ENV_CONNECTOR,), | |||
) | |||
# In case all environments had been terminated `to_module` will be | |||
# empty and no actions are needed b/c we reset all environemnts. | |||
# empty and no actions are needed b/c we reset all environments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this happens now that the to_module
is None
. Can we debug another round and see where this happens? Then check the autoreset and the connector run (I know this is complex).
What should happen is: env resets automatically; init obs goes through the to_module
connector pipeline and produces to_module
which can in turn passed through the module and the to_env
pipeline to produce an action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me investigate this more. This first happened last Thursday in the release tests. Will look into the code diff.
MultiAgentEnvRunner
MultiAgentEnvRunner
.
… true after env.step(). Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
This pull request has been automatically marked as stale because it has not had You can always ask for help on our discussion forum or Ray's public slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed. |
Signed-off-by: Kamil Kaczmarek <[email protected]>
unstale |
Why are these changes needed?
Fix failing release test:
learning_tests_multi_agent_cartpole_appo_multi_gpu
.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.