Commit f014639

Merge branch 'master' into feat/dynamo_export_onnx
2 parents: 819d3c8 + c6b6553

File tree

8 files changed: +51 -9 lines changed

.github/CONTRIBUTING.md
Lines changed: 30 additions & 0 deletions

@@ -109,6 +109,36 @@
 
 ## Guidelines
 
+### Development environment
+
+To set up a local development environment, we recommend using `uv`, which can be installed by following its [installation instructions](https://docs.astral.sh/uv/getting-started/installation/).
+
+Once `uv` has been installed, begin by cloning the repository:
+
+```bash
+git clone https://github.com/Lightning-AI/lightning.git
+cd lightning
+```
+
+Once at the root level of the repository, create a new virtual environment and install the project dependencies:
+
+```bash
+uv venv
+# uv venv --python 3.11  # use this instead if you need a specific Python version
+
+source .venv/bin/activate  # command may differ based on your shell
+uv pip install ".[dev, examples]"
+```
+
+Once the dependencies have been installed, install pre-commit and set up the git hook scripts:
+
+```bash
+uv pip install pre-commit
+pre-commit install
+```
+
+For more details on the uv commands used above, refer to uv's documentation on its [pip interface](https://docs.astral.sh/uv/pip/).
+
 ### Development scripts
 
 To build the documentation locally, simply execute the following commands from the project root (only for Unix):

requirements/ci.txt
Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 setuptools <80.9.1
 wheel <0.46.0
-awscli >=1.30.0, <1.41.0
+awscli >=1.30.0, <1.42.0
 twine ==6.1.0
 importlib-metadata <9.0.0
 wget

requirements/fabric/test.txt
Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-coverage ==7.9.1
+coverage ==7.9.2
 numpy >=1.17.2, <1.27.0
 pytest ==8.4.1
 pytest-cov ==6.2.1

requirements/pytorch/base.txt
Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 # NOTE: the upper bound for the package version is only set for CI stability, and it is dropped while installing this package
 # in case you want to preserve/enforce restrictions on the latest compatible version, add "strict" as an in-line comment
 
-torch >=2.1.0, <2.8.0
+torch >=2.1.0, <=2.8.0
 tqdm >=4.57.0, <4.68.0
 PyYAML >5.4, <6.1.0
 fsspec[http] >=2022.5.0, <2025.6.0

requirements/pytorch/test.txt
Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-coverage ==7.9.1
+coverage ==7.9.2
 pytest ==8.4.1
 pytest-cov ==6.2.1
 pytest-timeout ==2.4.0

src/lightning/pytorch/callbacks/model_checkpoint.py
Lines changed: 8 additions & 2 deletions

@@ -133,9 +133,15 @@ class ModelCheckpoint(Checkpoint):
             will only save checkpoints at epochs 0 < E <= N
             where both values for ``every_n_epochs`` and ``check_val_every_n_epoch`` evenly divide E.
         save_on_train_epoch_end: Whether to run checkpointing at the end of the training epoch.
-            If this is ``False``, then the check runs at the end of the validation.
+            If ``True``, checkpoints are saved at the end of every training epoch.
+            If ``False``, checkpoints are saved at the end of validation.
+            If ``None`` (default), the behavior is inferred from the training configuration:
+            if ``check_val_every_n_epoch != 1``, checkpointing is not performed at the end of
+            every training epoch; if there are no validation batches, checkpointing occurs at the
+            end of the training epoch; and if there is a non-default number of validation runs per
+            training epoch (``val_check_interval != 1``), checkpointing is performed after validation.
         enable_version_counter: Whether to append a version to the existing file name.
-            If this is ``False``, then the checkpoint files will be overwritten.
+            If ``False``, then the checkpoint files will be overwritten.
 
     Note:
         For extra customization, ModelCheckpoint includes the following attributes:
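As a minimal usage sketch of the documented `save_on_train_epoch_end` values (not part of this commit; the checkpoint directory and trainer settings are illustrative placeholders):

```python
from lightning.pytorch import Trainer
from lightning.pytorch.callbacks import ModelCheckpoint

# Always checkpoint at the end of each training epoch.
ckpt_train_end = ModelCheckpoint(dirpath="checkpoints/", save_on_train_epoch_end=True)

# Checkpoint after validation instead.
ckpt_after_val = ModelCheckpoint(dirpath="checkpoints/", save_on_train_epoch_end=False)

# Default (None): validation runs four times per epoch (val_check_interval != 1),
# so per the docstring above, checkpointing follows validation.
trainer = Trainer(max_epochs=3, val_check_interval=0.25,
                  callbacks=[ModelCheckpoint(dirpath="checkpoints/")])
```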

src/lightning/pytorch/trainer/connectors/accelerator_connector.py
Lines changed: 4 additions & 3 deletions

@@ -453,10 +453,11 @@ def _check_strategy_and_fallback(self) -> None:
 
         if (
             strategy_flag in FSDPStrategy.get_registered_strategies() or type(self._strategy_flag) is FSDPStrategy
-        ) and self._accelerator_flag not in ("cuda", "gpu"):
+        ) and not (self._accelerator_flag in ("cuda", "gpu") or isinstance(self._accelerator_flag, CUDAAccelerator)):
             raise ValueError(
-                f"The strategy `{FSDPStrategy.strategy_name}` requires a GPU accelerator, but got:"
-                f" {self._accelerator_flag}"
+                f"The strategy `{FSDPStrategy.strategy_name}` requires a GPU accelerator, but received "
+                f"`accelerator={self._accelerator_flag!r}`. Please set `accelerator='cuda'`, `accelerator='gpu'`,"
+                " or pass a `CUDAAccelerator()` instance to use FSDP."
             )
         if strategy_flag in _DDP_FORK_ALIASES and "fork" not in torch.multiprocessing.get_all_start_methods():
            raise ValueError(
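As a hedged illustration of the relaxed check (not part of the diff; assumes a machine with at least one CUDA GPU), the following configurations would all pass:

```python
from lightning.pytorch import Trainer
from lightning.pytorch.accelerators import CUDAAccelerator

# String flags were accepted before this change and still are.
Trainer(strategy="fsdp", accelerator="cuda")
Trainer(strategy="fsdp", accelerator="gpu")

# A CUDAAccelerator instance is now also accepted via the isinstance() branch.
Trainer(strategy="fsdp", accelerator=CUDAAccelerator())

# Any other accelerator still raises the (now clearer) ValueError:
# Trainer(strategy="fsdp", accelerator="cpu")  # requires a GPU accelerator
```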

tests/tests_pytorch/trainer/connectors/test_accelerator_connector.py
Lines changed: 5 additions & 0 deletions

@@ -582,6 +582,11 @@ class AcceleratorSubclass(CPUAccelerator):
     Trainer(accelerator=AcceleratorSubclass(), strategy=FSDPStrategySubclass())
 
 
+@RunIf(min_cuda_gpus=1)
+def test_check_fsdp_strategy_and_fallback_with_cudaaccelerator():
+    Trainer(strategy="fsdp", accelerator=CUDAAccelerator())
+
+
 @mock.patch.dict(os.environ, {}, clear=True)
 def test_unsupported_tpu_choice(xla_available, tpu_available):
     # if user didn't set strategy, _Connector will choose the SingleDeviceXLAStrategy or XLAStrategy
