
Add old stack logging support to new stack#889

Open
quic-abhamidi wants to merge 1 commit into quic:ft_experimental from quic-abhamidi:loggerv2

Conversation

@quic-abhamidi

Added the following support for easy visualization of training and validation statistics:
1. A train_logger callback that captures the per-epoch time, per-epoch loss metric, and per-epoch perplexity.
2. This function also captures the number of trainable parameters and the number of samples in the training and eval datasets.
3. All of these are logged to a log file whose path the user can provide via the --log_file_path flag in the input config .yaml file.

# Compute perplexity safely
train_metric = None
if train_loss is not None:
    train_metric = math.exp(train_loss)
Contributor


Verify the train_metric values and check whether there is a step-wise match with the old FT stack. Use the same SDK, and the same seed and data_seed on both stacks, for reproducibility.

Contributor


Also use a try block to handle the case where the metric value overflows.
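A minimal sketch of the suggested guard (the helper name `compute_perplexity` is hypothetical; the PR computes `train_metric` inline): `math.exp` raises `OverflowError` once the loss exceeds roughly 709, so the fallback returns infinity instead of crashing the epoch.

```python
import math

def compute_perplexity(loss):
    """Return exp(loss), guarded against overflow (illustrative helper)."""
    if loss is None:
        return None
    try:
        return math.exp(loss)
    except OverflowError:
        # math.exp overflows for loss values above ~709.78
        return float("inf")
```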

Author


Yes, will add this check.

Contributor

@quic-akuruvil left a comment


Also check that the PP, DDP, and multi-node DDP APIs are logging properly with the new change.

if self.rank != 0:
    return
logger.log_rank_zero(text)
with open(self.log_file, "a") as f:
Contributor


It would be better to put this inside a try block, to catch any write errors.

from QEfficient.finetune.experimental.core.utils.training_config_utils import prepare_training_config

logger = Logger(__name__)
train_logger = TrainingLogger(rank=0)
Contributor


In the DDP case, this will fail, I think. Please check; I believe we can't hardcode 0 here.
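One way to avoid the hardcoded rank is to read it from the launcher environment: torchrun (and torch.distributed.launch) export `RANK` for every process, and defaulting to 0 keeps plain single-process runs working. A sketch (the `get_rank` helper is hypothetical; `TrainingLogger` is the class added in this PR):

```python
import os

def get_rank():
    """Process rank from the launcher environment; torchrun sets RANK,
    so fall back to 0 for single-process runs."""
    return int(os.environ.get("RANK", 0))

# Hypothetical usage in place of the hardcoded value:
# train_logger = TrainingLogger(rank=get_rank())
```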

Author


Will change this.

# ----------------------------------------------------
# Safe write to log (only rank 0)
# ----------------------------------------------------
def _write(self, text):
Contributor


Usually a single leading underscore marks a private method, but _write is called from outside the class in finetune_experimental. Please check.
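One way to resolve the naming concern is to expose a public method and keep `_write` internal; outside callers such as finetune_experimental would then use the public name. A sketch (the class body here is illustrative, not the PR's actual implementation):

```python
class TrainingLogger:
    """Illustrative shape only, to show the public/private split."""

    def __init__(self, rank=0):
        self.rank = rank
        self.lines = []  # stand-in sink for the real log file

    def write(self, text):
        """Public entry point for callers outside the class."""
        self._write(text)

    def _write(self, text):
        # Internal detail: rank filtering and the actual sink live here.
        if self.rank != 0:
            return
        self.lines.append(text)
```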

 Added the following support for easy visualization of training and validation statistics:
    1. A train_logger callback that captures the per-epoch time, per-epoch loss metric, and per-epoch perplexity.
    2. This function also captures the number of trainable parameters and the number of samples in the training and eval datasets.
    3. All of these are logged to a log file whose path the user can provide via the --log_file_path flag in the input config .yaml file.

Signed-off-by: Anusha Bhamidipati <abhamidi@qti.qualcomm.com>