Commit d6b8ae2

Merge branch 'master' into fix/save-last-checkpoint
2 parents edd891a + ce038e8 commit d6b8ae2

53 files changed, +393 -84 lines changed
.github/CONTRIBUTING.md

Lines changed: 45 additions & 1 deletion
@@ -3,7 +3,7 @@
 Welcome to the PyTorch Lightning community! We're building the most advanced research platform on the planet to implement the latest, best practices
 and integrations that the amazing PyTorch team and other research organization rolls out!

-If you are new to open source, check out [this blog to get started with your first Open Source contribution](https://devblog.pytorchlightning.ai/quick-contribution-guide-86d977171b3a).
+If you are new to open source, check out [this blog to get started with your first Open Source contribution](https://medium.com/pytorch-lightning/quick-contribution-guide-86d977171b3a).

 ## Main Core Value: One less thing to remember

@@ -109,6 +109,50 @@ ______________________________________________________________________

 ## Guidelines

+### Development environment
+
+To set up a local development environment, we recommend using `uv`, which can be installed following their [instructions](https://docs.astral.sh/uv/getting-started/installation/).
+
+Once `uv` has been installed, begin by cloning the forked repository:
+
+```bash
+git clone https://github.com/{YOUR_GITHUB_USERNAME}/pytorch-lightning.git
+cd pytorch-lightning
+```
+
+> If you're using [Lightning Studio](https://lightning.ai) or already have your `uv venv` activated, you can quickly set up the project by running:
+
+```bash
+make setup
+```
+
+This will:
+
+- Install all required dependencies.
+- Perform an editable install of the `pytorch-lightning` project.
+- Install and configure `pre-commit`.
+
+#### Manual Setup (Optional)
+
+If you prefer more fine-grained control over the dependencies, you can set up the environment manually:
+
+```bash
+uv venv
+# uv venv --python 3.11 # use this instead if you need a specific python version
+
+source .venv/bin/activate # command may differ based on your shell
+uv pip install ".[dev, examples]"
+```
+
+Once the dependencies have been installed, install pre-commit and set up the git hook scripts:
+
+```bash
+uv pip install pre-commit
+pre-commit install
+```
+
+If you would like more information regarding the uv commands, please refer to uv's documentation for more information on their [pip interface](https://docs.astral.sh/uv/pip/).
+
 ### Developments scripts

 To build the documentation locally, simply execute the following commands from project root (only for Unix):

.github/checkgroup.yml

Lines changed: 2 additions & 1 deletion
@@ -135,7 +135,8 @@ subprojects:
       - "build-pl (3.11, 2.4, 12.1.1)"
       - "build-pl (3.12, 2.5, 12.1.1)"
       - "build-pl (3.12, 2.6, 12.4.1)"
-      - "build-pl (3.12, 2.7, 12.6.3, true)"
+      - "build-pl (3.12, 2.7, 12.6.3)"
+      - "build-pl (3.12, 2.8, 12.6.3, true)"

   # SECTION: lightning_fabric

.github/markdown-links-config.json

Lines changed: 5 additions & 1 deletion
@@ -22,5 +22,9 @@
         "Accept-Encoding": "zstd, br, gzip, deflate"
       }
     }
-  ]
+  ],
+  "timeout": "20s",
+  "retryOn429": true,
+  "retryCount": 5,
+  "fallbackRetryDelay": "20s"
 }
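
Note on the new link-checker options: the key names match the options documented for `markdown-link-check`, although this page does not say which tool reads the file, so that is an assumption. Under that assumption, the sketch below shows how `retryOn429`, `retryCount`, and `fallbackRetryDelay` are expected to interact. It is an illustration, not the tool's implementation: `check_link`, `parse_seconds`, and the injected `fetch` and `sleep` callables are hypothetical names, and `timeout` (the per-request limit) is not modelled.

```python
import time
from typing import Any, Callable, Iterator, List, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (HTTP status, lower-cased headers)


def parse_seconds(value: str) -> float:
    """Convert "20s" to 20.0; only the plain-seconds form is handled in this sketch."""
    return float(value.rstrip("s"))


def check_link(
    fetch: Callable[[], Response],
    config: Mapping[str, Any],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the final status after applying the retry-on-429 settings."""
    retries_left = int(config["retryCount"]) if config.get("retryOn429") else 0
    fallback = parse_seconds(config["fallbackRetryDelay"])
    while True:
        status, headers = fetch()
        if status != 429 or retries_left == 0:
            return status
        retries_left -= 1
        retry_after = headers.get("retry-after")
        # Assumption: Retry-After is given in seconds; HTTP-date values are not parsed here.
        sleep(float(retry_after) if retry_after is not None else fallback)


# Values taken from the diff above.
config = {"timeout": "20s", "retryOn429": True, "retryCount": 5, "fallbackRetryDelay": "20s"}

# Simulated server: two rate-limited answers, then success.
responses: Iterator[Response] = iter([(429, {}), (429, {"retry-after": "3"}), (200, {})])
waits: List[float] = []
final_status = check_link(lambda: next(responses), config, sleep=waits.append)
```

With these settings, a link that keeps answering 429 is requested at most six times (one initial attempt plus five retries) before the check gives up.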

.github/workflows/call-clear-cache.yml

Lines changed: 2 additions & 2 deletions
@@ -23,7 +23,7 @@ on:
 jobs:
   cron-clear:
     if: github.event_name == 'schedule' || github.event_name == 'pull_request'
-    uses: Lightning-AI/utilities/.github/workflows/cleanup-caches.yml@v0.14.3
+    uses: Lightning-AI/utilities/.github/workflows/cleanup-caches.yml@v0.15.0
     with:
       scripts-ref: v0.14.3
       dry-run: ${{ github.event_name == 'pull_request' }}
@@ -32,7 +32,7 @@ jobs:

   direct-clear:
     if: github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request'
-    uses: Lightning-AI/utilities/.github/workflows/cleanup-caches.yml@v0.14.3
+    uses: Lightning-AI/utilities/.github/workflows/cleanup-caches.yml@v0.15.0
     with:
       scripts-ref: v0.14.3
       dry-run: ${{ github.event_name == 'pull_request' }}

.github/workflows/ci-schema.yml

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ on:

 jobs:
   check:
-    uses: Lightning-AI/utilities/.github/workflows/check-schema.yml@v0.14.3
+    uses: Lightning-AI/utilities/.github/workflows/check-schema.yml@v0.15.0
     with:
       # skip azure due to the wrong schema file by MSFT
       # https://github.com/Lightning-AI/lightning-flash/pull/1455#issuecomment-1244793607

.github/workflows/docker-build.yml

Lines changed: 7 additions & 2 deletions
@@ -49,7 +49,8 @@ jobs:
           - { python_version: "3.11", pytorch_version: "2.4", cuda_version: "12.1.1" }
           - { python_version: "3.12", pytorch_version: "2.5", cuda_version: "12.1.1" }
           - { python_version: "3.12", pytorch_version: "2.6", cuda_version: "12.4.1" }
-          - { python_version: "3.12", pytorch_version: "2.7", cuda_version: "12.6.3", latest: "true" }
+          - { python_version: "3.12", pytorch_version: "2.7", cuda_version: "12.6.3" }
+          - { python_version: "3.12", pytorch_version: "2.8", cuda_version: "12.6.3", latest: "true" }
     steps:
       - uses: actions/checkout@v4
         with:
@@ -97,7 +98,7 @@ jobs:
         # adding dome more images as Thunder mainly using python 3.10,
         # and we need to support integrations as for example LitGPT
         python_version: ["3.10"]
-        pytorch_version: ["2.6.0", "2.7.1"]
+        pytorch_version: ["2.7.1", "2.8.0"]
         cuda_version: ["12.6.3"]
         include:
           # These are the base images for PL release docker images.
@@ -109,6 +110,7 @@ jobs:
           - { python_version: "3.12", pytorch_version: "2.5.1", cuda_version: "12.1.1" }
           - { python_version: "3.12", pytorch_version: "2.6.0", cuda_version: "12.4.1" }
           - { python_version: "3.12", pytorch_version: "2.7.1", cuda_version: "12.6.3" }
+          - { python_version: "3.12", pytorch_version: "2.8.0", cuda_version: "12.6.3" }
     steps:
       - uses: actions/checkout@v4
       - uses: docker/setup-buildx-action@v3
@@ -129,6 +131,7 @@ jobs:
             PYTHON_VERSION=${{ matrix.python_version }}
             PYTORCH_VERSION=${{ matrix.pytorch_version }}
             CUDA_VERSION=${{ matrix.cuda_version }}
+            MAKE_FLAGS="-j2"
           file: dockers/base-cuda/Dockerfile
           push: ${{ env.PUSH_NIGHTLY }}
           tags: "pytorchlightning/pytorch_lightning:base-cuda-py${{ matrix.python_version }}-torch${{ env.PT_VERSION }}-cuda${{ matrix.cuda_version }}"
@@ -157,6 +160,8 @@ jobs:
         continue-on-error: true
         uses: docker/build-push-action@v6
         with:
+          build-args: |
+            PYTORCH_VERSION="25.04"
           file: dockers/nvidia/Dockerfile
           push: false
         timeout-minutes: 55
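
Note on why `.github/checkgroup.yml` changes alongside this workflow: GitHub names each matrix job after the job id followed by the matrix values in parentheses, so moving `latest: "true"` from the 2.7 entry to the new 2.8 entry renames the required checks. The sketch below reproduces that naming for the two affected entries. The job id `build-pl` is inferred from the check names in `checkgroup.yml` rather than shown in this hunk, and `check_name` is a hypothetical helper used only for illustration.

```python
from typing import Dict, List


def check_name(job_id: str, matrix_entry: Dict[str, str]) -> str:
    """Render the default check name for one matrix combination: 'job (v1, v2, ...)'."""
    values = ", ".join(str(value) for value in matrix_entry.values())
    return f"{job_id} ({values})"


# The two matrix entries touched by this commit, in their new form.
matrix_include: List[Dict[str, str]] = [
    {"python_version": "3.12", "pytorch_version": "2.7", "cuda_version": "12.6.3"},
    {"python_version": "3.12", "pytorch_version": "2.8", "cuda_version": "12.6.3", "latest": "true"},
]

# Assumed job id: "build-pl" (taken from the names listed in checkgroup.yml).
names = [check_name("build-pl", entry) for entry in matrix_include]
```

The resulting names are the two strings added to `checkgroup.yml` in this commit, which is why the old `(3.12, 2.7, 12.6.3, true)` entry had to be replaced rather than kept.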

Makefile

Lines changed: 18 additions & 1 deletion
@@ -1,4 +1,4 @@
-.PHONY: test clean docs
+.PHONY: test clean docs setup

 # to imitate SLURM set only single node
 export SLURM_LOCALID=0
@@ -7,6 +7,23 @@ export SPHINX_MOCK_REQUIREMENTS=1
 # install only Lightning Trainer packages
 export PACKAGE_NAME=pytorch

+setup:
+	uv pip install -r requirements.txt \
+	    -r requirements/pytorch/base.txt \
+	    -r requirements/pytorch/test.txt \
+	    -r requirements/pytorch/extra.txt \
+	    -r requirements/pytorch/strategies.txt \
+	    -r requirements/fabric/base.txt \
+	    -r requirements/fabric/test.txt \
+	    -r requirements/fabric/strategies.txt \
+	    -r requirements/typing.txt \
+	    -e ".[all]" \
+	    pre-commit
+	pre-commit install
+	@echo "-----------------------------"
+	@echo "✅ Environment setup complete. Ready to Contribute ⚡️!"
+
+
 clean:
 	# clean all temp runs
 	rm -rf $(shell find . -name "mlruns")

README.md

Lines changed: 6 additions & 0 deletions
@@ -55,6 +55,12 @@ ______________________________________________________________________

 &nbsp;

+# Why PyTorch Lightning?
+
+Training models in plain PyTorch is tedious and error-prone - you have to manually handle things like backprop, mixed precision, multi-GPU, and distributed training, often rewriting code for every new project. PyTorch Lightning organizes PyTorch code to automate those complexities so you can focus on your model and data, while keeping full control and scaling from CPU to multi-node without changing your core code. But if you want control of those things, you can still opt into more DIY.
+
+Fun analogy: If PyTorch is Javascript, PyTorch Lightning is ReactJS or NextJS.
+
 # Lightning has 2 core packages

 [PyTorch Lightning: Train and deploy PyTorch at scale](#why-pytorch-lightning).

_notebooks

dockers/base-cuda/Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -19,8 +19,9 @@ ARG CUDA_VERSION=11.7.1
 FROM nvidia/cuda:${CUDA_VERSION}-runtime-ubuntu${UBUNTU_VERSION}

 ARG PYTHON_VERSION=3.10
-ARG PYTORCH_VERSION=2.1
+ARG PYTORCH_VERSION=2.8
 ARG MAX_ALLOWED_NCCL=2.22.3
+ARG MAKE_FLAGS="-j$(nproc)"

 SHELL ["/bin/bash", "-c"]
 # https://techoverflow.net/2019/05/18/how-to-fix-configuring-tzdata-interactive-input-when-building-docker-images/
@@ -30,8 +31,7 @@ ENV \
     PATH="$PATH:/root/.local/bin" \
     CUDA_TOOLKIT_ROOT_DIR="/usr/local/cuda" \
     MKL_THREADING_LAYER="GNU" \
-    # MAKEFLAGS="-j$(nproc)"
-    MAKEFLAGS="-j2"
+    MAKEFLAGS=${MAKE_FLAGS}

 RUN \
     CUDA_VERSION_MM=${CUDA_VERSION%.*} && \
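
Note on the last context line: `${CUDA_VERSION%.*}` is a shell parameter expansion that removes the shortest suffix matching `.*`, which turns a three-part CUDA version into its major.minor form. A minimal Python equivalent is sketched below for readers less familiar with the shell syntax. `strip_last_component` is a hypothetical name, and the sample values are the versions that appear in this commit.

```python
def strip_last_component(version: str) -> str:
    """Mimic the shell expansion ${version%.*}: drop everything from the last dot onward."""
    head, separator, _tail = version.rpartition(".")
    # Without a dot the shell pattern does not match, so the value is returned unchanged.
    return head if separator else version


cuda_version_mm = strip_last_component("12.6.3")   # version used by the newest matrix entries
default_mm = strip_last_component("11.7.1")        # Dockerfile default for CUDA_VERSION
```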
