feat(examples): add richer evaluation metrics to advanced-pytorch #6713
Open
SalimELMARDI wants to merge 3 commits into flwrlabs:main from
Pull request overview
This PR enhances the examples/advanced-pytorch evaluation reporting so federated runs expose richer quality signals (top-k and per-class accuracy) while keeping existing metric keys intact (accuracy/loss server-side, eval_acc/eval_loss client-side).
Changes:
- Extend `test(...)` to compute top-3 accuracy and per-class top-1 accuracy (Fashion-MNIST, 10 classes).
- Emit the new metrics from both `ClientApp.evaluate` and server-side centralized evaluation.
- Introduce a shared `NUM_CLASSES` constant to drive per-class metric generation.
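To make the first bullet concrete, here is a minimal sketch of how top-k and per-class top-1 accuracy could be computed for one batch with `torch.topk` and `torch.bincount`. It is not the PR's actual `test(...)` implementation: the function name `topk_and_per_class_accuracy` and its signature are hypothetical, and it assumes `logits` has shape `(N, num_classes)` and `labels` is a 1-D integer tensor.

```python
import torch

NUM_CLASSES = 10  # Fashion-MNIST has 10 classes


def topk_and_per_class_accuracy(logits, labels, k=3, num_classes=NUM_CLASSES):
    """Hypothetical helper: return (top-1 acc, top-k acc, per-class top-1 accs)."""
    # Indices of the k highest-scoring classes per sample, best first: shape (N, k)
    topk = logits.topk(k, dim=1).indices
    top1_correct = topk[:, 0] == labels
    topk_correct = (topk == labels.unsqueeze(1)).any(dim=1)

    # Number of samples per class, and number of correctly classified samples per class
    totals = torch.bincount(labels, minlength=num_classes)
    hits = torch.bincount(labels[top1_correct], minlength=num_classes)
    # clamp(min=1) avoids division by zero for classes absent from the batch;
    # such classes are reported as 0.0 in this sketch
    per_class = (hits.float() / totals.clamp(min=1).float()).tolist()

    top1 = top1_correct.float().mean().item()
    topk_acc = topk_correct.float().mean().item()
    return top1, topk_acc, per_class
```

In a full evaluation loop the `hits` and `totals` counts would more likely be accumulated across batches and divided once at the end, so that per-class accuracy is not skewed by uneven batch composition.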
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| examples/advanced-pytorch/pytorch_example/task.py | Computes top-3 and per-class accuracies and returns a metrics dict from test(...). |
| examples/advanced-pytorch/pytorch_example/client_app.py | Adds eval_acc_top3 and eval_acc_class_{i} metrics while preserving existing keys. |
| examples/advanced-pytorch/pytorch_example/server_app.py | Adds accuracy_top3 and accuracy_class_{i} metrics while preserving existing keys. |
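The key layout in the table can be illustrated with a small sketch. The helper names `client_metrics` and `server_metrics` are hypothetical and not taken from the PR; only the metric key names come from the description above. The point is that the existing keys stay in place and the new ones are added alongside them.

```python
NUM_CLASSES = 10  # assumed shared constant, as described in the overview


def client_metrics(loss, acc, acc_top3, class_acc):
    """Hypothetical helper: client-side metrics with the original keys preserved."""
    metrics = {"eval_loss": loss, "eval_acc": acc, "eval_acc_top3": acc_top3}
    # One entry per class: eval_acc_class_0 ... eval_acc_class_9
    metrics.update({f"eval_acc_class_{i}": a for i, a in enumerate(class_acc)})
    return metrics


def server_metrics(loss, acc, acc_top3, class_acc):
    """Hypothetical helper: server-side metrics with the original keys preserved."""
    metrics = {"loss": loss, "accuracy": acc, "accuracy_top3": acc_top3}
    # One entry per class: accuracy_class_0 ... accuracy_class_9
    metrics.update({f"accuracy_class_{i}": a for i, a in enumerate(class_acc)})
    return metrics
```

Because each per-class value gets its own flat key, anything that consumes scalar metrics per key can handle them without special cases, at the cost of `NUM_CLASSES` extra entries per evaluation.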
Issue
Description
The `advanced-pytorch` example currently reports only basic evaluation metrics (`loss`/`accuracy` on the server side and `eval_loss`/`eval_acc` on the client side). This limits visibility into ranking quality and class-level behavior during federated runs.
Related issues/PRs
Supersedes the earlier quickstart-focused PR #6638 (moved to `advanced-pytorch` based on @chongshenng's feedback).
Proposal
Explanation
This PR extends evaluation reporting in `examples/advanced-pytorch` while keeping existing metric keys backward-compatible.
Changes:
- `examples/advanced-pytorch/pytorch_example/task.py`: extends `test(...)` to compute top-3 accuracy and per-class accuracy (`class_accuracy_0` ... `class_accuracy_9`), with `torch.bincount`-based accumulation for per-class stats.
- `examples/advanced-pytorch/pytorch_example/client_app.py`: keeps `eval_loss` and `eval_acc`; adds `eval_acc_top3` and `eval_acc_class_0` ... `eval_acc_class_9`.
- `examples/advanced-pytorch/pytorch_example/server_app.py`: keeps `loss` and `accuracy`; adds `accuracy_top3` and `accuracy_class_0` ... `accuracy_class_9`.
Validation:
`flwr run .\examples\advanced-pytorch --stream --run-config "num-server-rounds=1 fraction-train=0.2 fraction-evaluate=0.2"`
Checklist
Any other comments?
No API-breaking changes. Existing metric keys were preserved for compatibility.