[Feature,BugFix] Fix composite entropy for nested keys #3101
Conversation
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3101. Note: links to docs will display an error until the docs builds have completed. No failures, 32 pending as of commit 1adc46f with merge base 978424e. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Hi @juandelos! Thank you for your pull request and welcome to our community.

Action required: in order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: in order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations, and the pull request will be tagged. If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
I think that overall this logic probably isn't necessary. I think it's okay for the user to always have to provide the full, nested key. |
@juandelos thanks for this PR! Let me know when I should review it. |
```diff
@@ -918,15 +919,37 @@ def _weighted_loss_entropy(
     return -self.entropy_coeff * entropy
```
General comment: I feel like this method could use a bit more inline comments to guide the reader.
Yep agree on that
I can give it a shot
Giving it a look now.
We're getting:

```
FAILED test/test_cost.py::TestPPO::test_ppo_composite_dists - KeyError: "Missing entropy coeff for head '('agent0', 'action', 'action1', 'sub_action1_log_prob')'"
FAILED test/test_cost.py::TestPPO::test_weighted_entropy_mapping - KeyError: "Missing entropy coeff for head '('head_0', 'action_log_prob')'"
```
I will write a bunch of comments on the `_weighted_loss_entropy` function to keep track of everything it's doing, and push them to your branch if that works for you!
torchrl/objectives/ppo.py (outdated)

```diff
@@ -1075,7 +1079,7 @@ def __init__(
     clip_epsilon: float = 0.2,
     entropy_bonus: bool = True,
     samples_mc_entropy: int = 1,
-    entropy_coeff: float | Mapping[str, float] | None = None,
+    entropy_coeff: float | Mapping[str | tuple | list, float] | None = None,
```
@louisfaury `NestedKey` here too, right?

Suggested change:

```diff
-    entropy_coeff: float | Mapping[str | tuple | list, float] | None = None,
+    entropy_coeff: float | Mapping[NestedKey | tuple | list, float] | None = None,
```
Force-pushed from 4db6eed to 1adc46f.
Follow-up: #3109
Description
Updated `_weighted_loss_entropy` so that the entropy coefficient can be set per head across different nested-key definitions. The function now walks the nested keys down to the leaves: if a coefficient is defined for a leaf, that value is used; otherwise it steps up one level at a time until it finds a key that is present in the entropy-coefficient map.
Motivation and Context
Before, the function had problems with nested keys: it would loop through the key elements and try to look up the group name 'agents' in the entropy-coefficient map, which is not where the per-head coefficients live.
It was also not possible to define different entropy coefficients for different action heads of the same agent, or distinct coefficients across agents such as agent1 and agent2.
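As an illustration of what the change enables, a per-head coefficient map might look like this. The agent and head names here are hypothetical; the real keys depend on your policy's composite distribution.

```python
# With nested-key support, two heads under the same agent group can get
# different entropy coefficients, and a shorter key can cover a whole agent.
entropy_coeff = {
    ("agents", "agent1", "head_0"): 0.01,   # exploration-heavy head
    ("agents", "agent1", "head_1"): 0.001,  # nearly deterministic head
    ("agents", "agent2"): 0.005,            # one coefficient for all of agent2
}
```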
I have raised an issue to propose this change (required for new features and bug fixes)
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an `x` in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!