New Algorithm: MAPO implementation #388

ZiyiTsang · 2025-09-25T13:02:47Z

This is an implementation of the MAPO paper.

It doesn't require much code refactoring; it is still in progress.

More "ready-to-go" algorithms and examples are the key to making a repo popular, especially RL-related ones.

Co-authored-by: Copilot <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 8 comments.

areal/utils/data.py

docs/algorithms/mapo.md

areal/utils/data.py

docs/algorithms/mapo.md

areal/utils/data.py

ZiyiTsang · 2025-09-28T18:13:46Z

Ready for review.

areal/api/cli_args.py

areal/utils/data.py

garrett4wade · 2025-09-30T08:18:28Z

areal/utils/data.py

+        # Since the advantage is the same within a trajectory, we can take the trajectory-level advantage from the first token,
+        # based on the assumption that the advantages along the last dim are identical.
+
+        advantages_ = advantages[:, 0]  # advantages shape [batch_size*group_size]


This line has no effect and should be removed.

No, it is useful. Please see the code comment.

The advantage of the first token is extracted and used in the logic below.
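To make the point in this thread concrete, here is a minimal sketch of the first-token extraction. It uses plain Python lists in place of the tensor indexing `advantages[:, 0]` from the diff. The helper name `trajectory_level_advantages` and the list-of-rows layout are hypothetical, and the consistency check is an illustration of the stated assumption, not something the PR is known to do.

```python
def trajectory_level_advantages(advantages):
    """Return one advantage per trajectory from per-token advantage rows.

    `advantages` is a list of rows, one row per trajectory (hypothetical layout
    standing in for a [batch_size * group_size, seq_len] tensor).
    """
    for row in advantages:
        # Assumption from the code comment: all tokens in a trajectory
        # share the same advantage, so the first token is representative.
        assert all(a == row[0] for a in row), "advantage varies within a trajectory"
    # Equivalent in spirit to advantages[:, 0] on a 2-D tensor.
    return [row[0] for row in advantages]


adv = [[0.5, 0.5, 0.5], [-1.0, -1.0, -1.0]]
print(trajectory_level_advantages(adv))
```

If the assumption does not hold (for example, with token-level advantage estimates), the first column would no longer summarize the trajectory and the check above would fail.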

garrett4wade · 2025-09-30T08:43:05Z

areal/utils/data.py

+        return (
+            1 - trajectory_reweight
+        ) * deviation_base_norm + trajectory_reweight * mean_base_norm


Please double-check the formula. Since trajectory_reweight is computed as 4p(1-p) rather than 1-4p(1-p), should we reverse the weighting of these two norms?

My mistake. Thank you.
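For readers following this thread, here is a minimal sketch of the weighting question. It rests on assumptions the diff excerpt does not confirm: rewards are binary, p is the fraction of successful trajectories in a group, `deviation_base_norm` is the standard-deviation-normalized advantage, and `mean_base_norm` is the mean-normalized one. The helper name `mixed_advantage` and the `eps` constant are hypothetical. With `trajectory_reweight = 4p(1-p)`, the weight is 1 at p = 0.5 and 0 at p = 0 or p = 1, so the reversed form the reviewer asks about puts `trajectory_reweight` on `deviation_base_norm`.

```python
from statistics import mean, pstdev


def mixed_advantage(rewards, eps=1e-6):
    """Mix two advantage normalizations for one group of trajectories.

    Sketch only: assumes binary rewards and the reversed weighting
    discussed in the review comment above.
    """
    p = mean(rewards)        # success rate of the group (assumed meaning of p)
    mu = p                   # group mean reward
    sigma = pstdev(rewards)  # population std of the group rewards
    trajectory_reweight = 4 * p * (1 - p)  # 1 at p = 0.5, 0 at p in {0, 1}

    mixed = []
    for r in rewards:
        deviation_base_norm = (r - mu) / (sigma + eps)  # std-normalized
        mean_base_norm = (r - mu) / (mu + eps)          # mean-normalized
        mixed.append(
            trajectory_reweight * deviation_base_norm
            + (1 - trajectory_reweight) * mean_base_norm
        )
    return mixed


print(mixed_advantage([1, 0, 1, 0]))  # uncertain group: std-normalized term dominates
print(mixed_advantage([1, 1, 1, 0]))  # more certain group: the terms are blended
```

Under these assumptions, a group with p = 0.5 is scored purely by the standard-deviation-normalized term, and the mean-normalized term takes over as p approaches 0 or 1.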

areal/utils/data.py

ZiyiTsang

Ready for second review.

github-actions · 2025-10-29T01:19:07Z

This pull request has been automatically marked as stale because it has not had recent activity within the last 14 days.

Please add a comment or push new commits to keep it active.

Thank you for your contribution!

ZiyiTsang added 6 commits September 25, 2025 12:52

MAPO implementation

5f74710

.

e8681a9

.

05bf6b2

.

a9de698

.

62e7cfc

merge

e4cc054

ZiyiTsang had a problem deploying to AReaL-unittests September 27, 2025 11:23 — with GitHub Actions Error

.

a92564b

ZiyiTsang had a problem deploying to AReaL-unittests September 27, 2025 11:30 — with GitHub Actions Error

ZiyiTsang requested a review from Copilot September 27, 2025 11:57

Update docs/algorithms/mapo.md

e1bbe4b

Co-authored-by: Copilot <[email protected]>

ZiyiTsang had a problem deploying to AReaL-unittests September 27, 2025 12:06 — with GitHub Actions Error

thank you gemini..code change

b6d0228

thank you gemini..code change

fba252f

ZiyiTsang had a problem deploying to AReaL-unittests September 28, 2025 07:05 — with GitHub Actions Error

ZiyiTsang requested a review from Copilot September 28, 2025 18:08

Copilot AI reviewed Sep 28, 2025

commit for review

ed51e5f

garrett4wade reviewed Sep 30, 2025

.

ff69805

ZiyiTsang had a problem deploying to AReaL-unittests October 12, 2025 15:36 — with GitHub Actions Failure

.

488e652

ZiyiTsang had a problem deploying to AReaL-unittests October 13, 2025 06:39 — with GitHub Actions Failure

ZiyiTsang commented Oct 13, 2025

ZiyiTsang added 2 commits October 12, 2025 23:32

.

63cd9a3

Merge branch 'main' into MAPO

987044e

ZiyiTsang had a problem deploying to AReaL-unittests October 13, 2025 08:36 — with GitHub Actions Error

ZiyiTsang requested a review from garrett4wade October 13, 2025 08:36

ZiyiTsang had a problem deploying to AReaL-unittests October 13, 2025 08:37 — with GitHub Actions Failure

.

71f51de

ZiyiTsang had a problem deploying to AReaL-unittests October 13, 2025 09:18 — with GitHub Actions Failure

github-actions bot added the stale label Oct 29, 2025

github-actions bot removed the stale label Nov 15, 2025
