Add adapt_checkpoint_hparams hook for customizing checkpoint hyperparameter loading #21408
Conversation
Add adapt_checkpoint_hparams hook for customizing checkpoint hyperparameter loading

Fixes Lightning-AI#21255

This commit adds the adapt_checkpoint_hparams() public method to LightningCLI, allowing users to customize hyperparameters loaded from checkpoints before they are used to instantiate model classes. This is particularly useful when using checkpoints from a TrainingModule with a different InferenceModule class that has different __init__ parameters.

Problem: When loading a checkpoint trained with TrainingModule(lr=1e-3) into an InferenceModule() that doesn't accept 'lr' as a parameter, the CLI would fail during instantiation because it tries to pass all checkpoint hyperparameters to the new module class.

Solution: Added an adapt_checkpoint_hparams() hook that is called in _parse_ckpt_path() after loading checkpoint hyperparameters but before applying them. Users can override this method to:

- Remove training-specific hyperparameters (e.g., lr, weight_decay)
- Modify _class_path for subclass mode
- Transform hyperparameter names/values
- Completely disable checkpoint hyperparameters by returning {}

Example usage:

```python
class MyCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, checkpoint_hparams):
        checkpoint_hparams.pop('lr', None)
        checkpoint_hparams.pop('weight_decay', None)
        return checkpoint_hparams
```

This approach is preferable to:

- Disabling checkpoint loading entirely (loses valuable hyperparameter info)
- Adding CLI arguments (deviates from the Trainer parameter pattern)
- Modifying private methods (breaks encapsulation)

The hook provides maximum flexibility while maintaining backward compatibility (the default implementation returns the hyperparameters unchanged).
Pull request overview
This PR adds a public adapt_checkpoint_hparams() hook to LightningCLI that enables users to customize hyperparameters loaded from checkpoints before model instantiation. This addresses the issue of loading checkpoints across different module classes (e.g., from TrainingModule to InferenceModule) where incompatible __init__ parameters would otherwise cause failures.
Key Changes:
- Added `adapt_checkpoint_hparams()` public method with comprehensive documentation
- Integrated the hook into `_parse_ckpt_path()` to allow customization before hyperparameter application
- Maintained backward compatibility with a default no-op implementation
src/lightning/pytorch/cli.py
Outdated
```python
def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
    """Adapt checkpoint hyperparameters before instantiating the model class.

    This method allows for customization of hyperparameters loaded from a checkpoint when
    using a different model class than the one used for training. For example, when loading
    a checkpoint from a TrainingModule to use with an InferenceModule that has different
    ``__init__`` parameters, you can remove or modify incompatible hyperparameters.

    Args:
        checkpoint_hparams: Dictionary of hyperparameters loaded from the checkpoint.

    Returns:
        Dictionary of adapted hyperparameters to be used for model instantiation.

    Example::

        class MyCLI(LightningCLI):
            def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
                # Remove training-specific hyperparameters not needed for inference
                checkpoint_hparams.pop("lr", None)
                checkpoint_hparams.pop("weight_decay", None)
                return checkpoint_hparams

    Note:
        If subclass module mode is enabled and ``_class_path`` is present in the checkpoint
        hyperparameters, you may need to modify it as well to point to your new module class.
    """
    return checkpoint_hparams
```
Copilot AI (Dec 6, 2025)
The new adapt_checkpoint_hparams() hook lacks test coverage. Given that tests/tests_pytorch/test_cli.py contains comprehensive tests for checkpoint loading functionality (e.g., test_lightning_cli_ckpt_path_argument_hparams and test_lightning_cli_ckpt_path_argument_hparams_subclass_mode), tests should be added to verify:
- The hook is called when loading checkpoint hyperparameters
- Modifications made in the hook are applied correctly
- Returning an empty dict properly skips checkpoint hyperparameter loading
- The hook works in both regular and subclass modes
src/lightning/pytorch/cli.py
Outdated
```python
    else:
        self.config = parser.parse_args(args)


def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
```
Copilot AI (Dec 6, 2025)
Use lowercase dict instead of Dict for type annotations to align with the modern Python 3.9+ style used throughout this file. Change Dict[str, Any] to dict[str, Any] in both the parameter and return type annotations.
src/lightning/pytorch/cli.py
Outdated
```python
    Example::

        class MyCLI(LightningCLI):
            def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
```
Copilot AI (Dec 6, 2025)
Use lowercase dict instead of Dict for type annotations to align with the modern Python 3.9+ style used throughout this file. Change Dict[str, Any] to dict[str, Any] in both the parameter and return type annotations.
mauvilsa left a comment
It is looking good. However, the subcommand parameter is missing. Also please add unit tests.
src/lightning/pytorch/cli.py
Outdated
```python
    else:
        self.config = parser.parse_args(args)


def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
```
Suggested change:

```diff
-def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
+def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
```
As mentioned in my proposal, the method should receive a subcommand parameter.
src/lightning/pytorch/cli.py
Outdated
| checkpoint_hparams.pop("lr", None) | ||
| checkpoint_hparams.pop("weight_decay", None) |
In this example, removing lr and weight_decay should not be done if the subcommand is fit.
src/lightning/pytorch/cli.py
Outdated
```python
        return

    # Allow customization of checkpoint hyperparameters via adapt_checkpoint_hparams hook
    hparams = self.adapt_checkpoint_hparams(hparams)
```
Suggested change:

```diff
-    hparams = self.adapt_checkpoint_hparams(hparams)
+    hparams = self.adapt_checkpoint_hparams(subcommand, hparams)
```
…ook and add tests

- Update adapt_checkpoint_hparams signature to include a subcommand parameter, allowing context-aware customization of checkpoint hyperparameters
- Change type annotations to use lowercase dict (Python 3.9+ style)
- Update docstring with subcommand parameter documentation
- Add example showing conditional logic based on subcommand
- Add comprehensive unit tests:
  - test_adapt_checkpoint_hparams_hook: tests that the hook is called and modifications are applied
  - test_adapt_checkpoint_hparams_hook_empty_dict: tests disabling checkpoint hparams loading
  - Tests cover both regular and subclass modes
Thanks for the response. I already updated the signature to take the subcommand parameter, and also added:

```python
def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
    if subcommand != "fit":
        checkpoint_hparams.pop("lr", None)  # Remove training params for inference
    return checkpoint_hparams
```

I also included 2 comprehensive tests: test_adapt_checkpoint_hparams_hook and test_adapt_checkpoint_hparams_hook_empty_dict.
- Split method signature across multiple lines to stay within the 120 char limit
- Improves code readability in the documentation example
mauvilsa left a comment
It is looking good. But the two tests fail. You will need to implement a new Model class for these tests.
```python
    assert cli.model.layer.out_features == 4


def test_adapt_checkpoint_hparams_hook(cleandir):
```
Suggested change:

```diff
-def test_adapt_checkpoint_hparams_hook(cleandir):
+def test_adapt_checkpoint_hparams_hook_pop_keys(cleandir):
```
```python
    def add_arguments_to_parser(self, parser):
        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
```
Suggested change:

```diff
-    def add_arguments_to_parser(self, parser):
-        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
```
Linking of arguments is not relevant to test this hook. Better to not have it to avoid distraction.
```python
    def add_arguments_to_parser(self, parser):
        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
```
Suggested change:

```diff
-    def add_arguments_to_parser(self, parser):
-        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
```
Linking of arguments is not relevant to test this hook. Better to not have it to avoid distraction.
```python
    # First, create a checkpoint
    cli_args = ["fit", "--model.out_dim=3", "--trainer.max_epochs=1"]
    with mock.patch("sys.argv", ["any.py"] + cli_args):
        cli = AdaptHparamsEmptyCLI(BoringCkptPathModel)
```
The test fails because `BoringCkptPathModel` has a module `torch.nn.Linear(32, out_dim)`. If `out_dim` is changed, there is a tensor size mismatch.
Instead of using `BoringCkptPathModel`, implement a new class for these two tests that just sets an attribute which can be asserted after instantiation.
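For illustration, a minimal model along these lines might look like the sketch below (the class name and the extra `note` hyperparameter are made up for this example, not part of the PR):

```python
from lightning.pytorch.demos.boring_classes import BoringModel


class HparamsRecordingModel(BoringModel):
    """Stores an extra hyperparameter that is never used to size a layer,
    so the hook's effect can be asserted without any tensor shape mismatch."""

    def __init__(self, note: str = "default"):
        super().__init__()
        self.save_hyperparameters()
        self.note = note
```

A test could then fit once with `--model.note=from_training`, reload the resulting checkpoint through a CLI whose `adapt_checkpoint_hparams` drops or rewrites `note`, and simply assert on `cli.model.note`.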
What does this PR do?
Fixes #21255
This PR adds a public `adapt_checkpoint_hparams()` hook to `LightningCLI` that allows users to customize hyperparameters loaded from checkpoints before they are used to instantiate model classes. This solves the problem of loading checkpoints across different module classes (e.g., from `TrainingModule` to `InferenceModule`).

Problem
When using `LightningCLI` with checkpoints, hyperparameters saved during training are automatically loaded and applied when running other subcommands (test, predict, etc.). This is convenient when using the same module class, but fails when using a different class with incompatible `__init__` parameters.

Example scenario:
Running `cli predict --ckpt_path checkpoint.ckpt` with `InferenceModule` fails because the CLI tries to pass `lr=1e-3` from the checkpoint to `InferenceModule.__init__()`.
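As a rough illustration of this scenario (the module definitions below are hypothetical, reduced to the minimum needed to show the mismatch):

```python
from lightning.pytorch import LightningModule


class TrainingModule(LightningModule):
    def __init__(self, lr: float = 1e-3, weight_decay: float = 0.0):
        super().__init__()
        self.save_hyperparameters()  # "lr" and "weight_decay" are stored in the checkpoint


class InferenceModule(LightningModule):
    def __init__(self):
        # No lr/weight_decay parameters, so passing them from the checkpoint raises a TypeError
        super().__init__()
```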
Solution

Added an `adapt_checkpoint_hparams()` public method that users can override to customize loaded hyperparameters; see the example use cases below.
Implementation Details

- Added the `adapt_checkpoint_hparams()` public method in `LightningCLI`
- Updated `_parse_ckpt_path()` to call the hook after loading but before applying hyperparameters

Why This Approach?
As discussed in #21255, this is superior to alternatives:
- Disabling checkpoint loading entirely (loses valuable hyperparameter info)
- Adding CLI arguments (deviates from the Trainer parameter pattern)
- Modifying private methods (breaks encapsulation)
- Argument linking (e.g., for `hidden_dim`)

Testing
The implementation:
- Documents `_class_path` modification when needed

Example Use Cases
Remove training-only parameters:
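A minimal sketch, assuming the `subcommand`-aware signature adopted in this PR and illustrative hyperparameter names:

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class InferenceCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # Strip optimizer settings for everything except training runs.
        if subcommand != "fit":
            checkpoint_hparams.pop("lr", None)
            checkpoint_hparams.pop("weight_decay", None)
        return checkpoint_hparams
```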
Change module class in subclass mode:
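A sketch for subclass mode; the target class path `my_project.InferenceModule` is hypothetical:

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class SwapClassCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # In subclass mode the checkpoint stores the original class under "_class_path";
        # point it at the class that should be instantiated instead.
        if subcommand != "fit" and "_class_path" in checkpoint_hparams:
            checkpoint_hparams["_class_path"] = "my_project.InferenceModule"
        return checkpoint_hparams
```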
Disable all checkpoint hyperparameters:
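A sketch; returning an empty dict skips applying any checkpoint hyperparameters, so only CLI and config values are used:

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class NoCkptHparamsCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # Ignore everything stored in the checkpoint.
        return {}
```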
Does your PR introduce any breaking changes?
No, this is a purely additive change. The default implementation returns hyperparameters unchanged, preserving existing behavior.
Before submitting
PR review
cc: @mauvilsa @ziw-liu
📚 Documentation preview 📚: https://pytorch-lightning--21408.org.readthedocs.build/en/21408/