DNXie
Member

@DNXie DNXie commented Aug 28, 2025

Integrate the metric logger into the GRPO training loop with some basic logging.

@meta-cla meta-cla bot added the CLA Signed label Aug 28, 2025
Contributor

You probably want to make sure that the input is a dictionary, like here: https://github.com/pytorch/torchtune/blob/67ab86b94de9e7ac7dd9850113ebe69e2bbd307c/recipes/full_finetune_distributed.py#L909

I expect we will have an abundance of dataset and reliability metrics. This is how I envisioned it being used: https://fb.workplace.com/groups/1189731669410969/permalink/1279384097112392/

I understand that this is a simple PR just to get logging started. Just sharing where I think we should land after a few iterations.
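
To make the dictionary-based pattern concrete, here is a minimal sketch. `InMemoryLogger` is a hypothetical stand-in written only for illustration, not the logger class in this PR; the `log_dict(payload, step)` shape is assumed from the linked torchtune recipe, and the metric names and values are made up.

```python
from typing import Dict, List, Tuple


class InMemoryLogger:
    """Hypothetical stand-in for a metric logger (not this repo's class)."""

    def __init__(self) -> None:
        # Each record is (step, metric name, value).
        self.records: List[Tuple[int, str, float]] = []

    def log(self, name: str, value: float, step: int) -> None:
        # Record a single scalar metric.
        self.records.append((step, name, float(value)))

    def log_dict(self, payload: Dict[str, float], step: int) -> None:
        # Reject non-dict input early so call sites cannot pass a bare scalar.
        if not isinstance(payload, dict):
            raise TypeError(f"expected dict, got {type(payload).__name__}")
        for name, value in payload.items():
            self.log(name, value, step)


logger = InMemoryLogger()
metrics = {"loss": 0.42, "reward_mean": 1.5}  # illustrative values only
logger.log_dict(metrics, step=1)
```

Collecting everything into one dictionary per step would let dataset and reliability metrics be merged in later without changing the call site.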

Member

Yeah, great callout, but let's land this one for now and iterate to that point.

Member Author

@felipemello1 Thanks for the suggestions! This PR only integrates the logger. In my next PR, I will add logging for various metrics using log_dict.
@joecummings Thanks for the approval. I will go ahead and land this one for now.

@DNXie DNXie merged commit 1aa5ab3 into meta-pytorch:main Aug 28, 2025
4 checks passed
@DNXie DNXie deleted the integrate_metric_logger branch September 10, 2025 19:11
3 participants