feat(examples): add richer evaluation metrics to advanced-pytorch#6713

Open
SalimELMARDI wants to merge 3 commits into flwrlabs:main from SalimELMARDI:feat/advanced-pytorch-rich-metrics

Conversation


@SalimELMARDI SalimELMARDI commented Mar 7, 2026

Issue

Description

The advanced-pytorch example currently reports only basic evaluation metrics: loss and accuracy on the server side, and eval_loss and eval_acc on the client side. This limits visibility into ranking quality and class-level behavior during federated runs.

Related issues/PRs

Supersedes the earlier quickstart-focused PR #6638 (moved to advanced-pytorch following @chongshenng's feedback).

Proposal

Explanation

This PR extends evaluation reporting in examples/advanced-pytorch while keeping existing metric keys backward-compatible.

Changes:

  • Updated examples/advanced-pytorch/pytorch_example/task.py:
    • Extended test(...) to compute:
      • top-1 accuracy (existing behavior)
      • top-3 accuracy
      • per-class top-1 accuracy for Fashion-MNIST (class_accuracy_0 ... class_accuracy_9)
    • Uses torch.bincount-based accumulation for per-class stats.
  • Updated examples/advanced-pytorch/pytorch_example/client_app.py:
    • Kept existing eval_loss and eval_acc
    • Added eval_acc_top3
    • Added eval_acc_class_0 ... eval_acc_class_9
  • Updated examples/advanced-pytorch/pytorch_example/server_app.py:
    • Kept existing loss and accuracy
    • Added accuracy_top3
    • Added accuracy_class_0 ... accuracy_class_9
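The metric computation described above could look roughly like the sketch below. This is an illustration of the approach (torch.topk for top-3, torch.bincount for per-class stats), not the PR's actual code; the function name evaluation_metrics and the exact metric keys are assumptions for the sake of the example.

```python
import torch

NUM_CLASSES = 10  # Fashion-MNIST has 10 classes

def evaluation_metrics(logits: torch.Tensor, labels: torch.Tensor) -> dict:
    """Sketch of extended evaluation metrics (top-1, top-3, per-class top-1)."""
    # Top-1 accuracy (existing behavior)
    preds = logits.argmax(dim=1)
    top1 = (preds == labels).float().mean().item()

    # Top-3 accuracy: correct if the label is among the 3 highest logits
    top3_preds = logits.topk(3, dim=1).indices
    top3 = (top3_preds == labels.unsqueeze(1)).any(dim=1).float().mean().item()

    # Per-class top-1 accuracy via torch.bincount-based accumulation:
    # count correct predictions and total samples per class, then divide
    correct_per_class = torch.bincount(labels[preds == labels], minlength=NUM_CLASSES)
    total_per_class = torch.bincount(labels, minlength=NUM_CLASSES)
    per_class = correct_per_class / total_per_class.clamp(min=1)

    metrics = {"accuracy": top1, "accuracy_top3": top3}
    metrics.update(
        {f"accuracy_class_{i}": per_class[i].item() for i in range(NUM_CLASSES)}
    )
    return metrics
```

In a real federated run, test(...) would also accumulate these counts across batches before dividing, so that per-class accuracy is exact rather than batch-averaged.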

Validation:

  • Ran a 1-round simulation locally:
    • flwr run .\examples\advanced-pytorch --stream --run-config "num-server-rounds=1 fraction-train=0.2 fraction-evaluate=0.2"
  • Confirmed new client and server metrics are present in logs.

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Any other comments?

No API-breaking changes. Existing metric keys were preserved for compatibility.

Copilot AI left a comment


Pull request overview

This PR enhances the examples/advanced-pytorch evaluation reporting so federated runs expose richer quality signals (top-k and per-class accuracy) while keeping existing metric keys intact (accuracy/loss server-side, eval_acc/eval_loss client-side).

Changes:

  • Extend test(...) to compute top-3 accuracy and per-class top-1 accuracy (Fashion-MNIST, 10 classes).
  • Emit the new metrics from both ClientApp.evaluate and server-side centralized evaluation.
  • Introduce a shared NUM_CLASSES constant to drive per-class metric generation.
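Since the client and server apps emit parallel per-class key families (eval_acc_class_i vs. accuracy_class_i), a shared NUM_CLASSES constant can drive key generation in both places. A hypothetical helper sketching that idea (per_class_keys is my name, not the PR's):

```python
NUM_CLASSES = 10  # Fashion-MNIST has 10 classes

def per_class_keys(prefix: str) -> list[str]:
    """Build the per-class metric key family for a given prefix."""
    return [f"{prefix}_class_{i}" for i in range(NUM_CLASSES)]

# Client-side keys: per_class_keys("eval_acc")  -> eval_acc_class_0 ... eval_acc_class_9
# Server-side keys: per_class_keys("accuracy") -> accuracy_class_0 ... accuracy_class_9
```

Deriving both key sets from one constant keeps client and server metric names in sync if the number of classes ever changes.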

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

  • examples/advanced-pytorch/pytorch_example/task.py: Computes top-3 and per-class accuracies and returns a metrics dict from test(...).
  • examples/advanced-pytorch/pytorch_example/client_app.py: Adds eval_acc_top3 and eval_acc_class_{i} metrics while preserving existing keys.
  • examples/advanced-pytorch/pytorch_example/server_app.py: Adds accuracy_top3 and accuracy_class_{i} metrics while preserving existing keys.


@github-actions github-actions bot added the Contributor label (used to determine what PRs (mainly) come from external contributors) Mar 7, 2026