Bug description
When running the tests on macOS with the MPS backend enabled, one test fails because it attempts to use float64 tensors. The MPS backend only supports float32, float16, and (in newer versions) bfloat16 -- not float64. As a result, the test parametrized with precision="64-true" fails on MPS:
FAILED tests/tests_pytorch/utilities/test_model_summary.py::test_model_size_precision[64-true-mps] - TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
Apple's documentation confirms the lack of float64 support: https://developer.apple.com/documentation/metalperformanceshaders/mpsdatatype
Similar problems have been reported in other projects:
- https://developer.apple.com/forums/thread/797778
- https://discuss.pytorch.org/t/apple-m1max-pytorch-error-typeerror-cannot-convert-a-mps-tensor-to-float64-dtype-as-the-mps-framework-doesnt-support-float64-please-use-float32-instead/196669
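The dtype constraint described above can be captured in a small, self-contained sketch. The set of supported floating-point dtypes is transcribed from Apple's documentation and the PyTorch error message; the helper name is hypothetical and exists only for illustration:

```python
# Floating-point dtypes the MPS backend supports, per Apple's MPSDataType
# documentation and the PyTorch error above; float64 is notably absent.
MPS_SUPPORTED_FLOATS = {"float32", "float16", "bfloat16"}


def mps_supports(dtype_name: str) -> bool:
    """Return True if the given floating-point dtype is usable on MPS."""
    return dtype_name in MPS_SUPPORTED_FLOATS
```

This is why only the "64-true" parametrization trips the error while "16-true" and "32-true" pass.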
How to reproduce the bug
The reproduction steps follow https://github.com/Lightning-AI/pytorch-lightning/tree/master/tests. The commands are:
# clone the repo
git clone https://github.com/Lightning-AI/lightning.git
cd lightning
# install dependencies
export PACKAGE_NAME=pytorch
python -m pip install ".[dev, examples]"
# (optional) set up pre-commit
python -m pip install pre-commit
pre-commit install
# download legacy checkpoints
bash .actions/pull_legacy_checkpoints.sh
# run tests
make test
Failed Test
The failing test is test_model_size_precision in tests/tests_pytorch/utilities/test_model_summary.py; it fails when parametrized with precision="64-true".
The code for the test is the following:
@pytest.mark.parametrize(
    "accelerator",
    [
        pytest.param("gpu", marks=RunIf(min_cuda_gpus=1)),
        pytest.param("mps", marks=RunIf(mps=True)),
    ],
)
@pytest.mark.parametrize("precision", ["16-true", "32-true", "64-true"])
def test_model_size_precision(tmp_path, accelerator, precision):
    """Test model size for different precision types."""
    model = PreCalculatedModel(precision=int(precision.split("-")[0]))
    trainer = Trainer(
        default_root_dir=tmp_path,
        accelerator=accelerator,
        devices=1,
        max_steps=1,
        max_epochs=1,
        precision=precision,
    )
    trainer.fit(model)
    summary = summarize(model)
    assert model.pre_calculated_model_size == summary.model_size
Suggestions
If the maintainers confirm this report, I suggest skipping the float64 parametrizations on MPS, for example by adding a skip mark (e.g. pytest.mark.skipif) for unsupported precisions on MPS.
If a fix is deemed necessary, I would like to work on it as my first contribution to the project, if possible.
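A minimal sketch of the proposed skip condition, assuming nothing beyond the parametrization values shown above (the helper name is hypothetical; in the actual test this would more likely become a skip mark on the "64-true" parametrization or an extension of the existing RunIf helper):

```python
def requires_float64_on_mps(accelerator: str, precision: str) -> bool:
    """Return True for parametrizations that need float64 on the MPS backend.

    MPS supports float32/float16 (and bfloat16 on newer versions) but not
    float64, so the "64-true" precision cannot run there and should be skipped.
    """
    return accelerator == "mps" and precision == "64-true"
```

Expressed in the test itself, this could mirror how the existing `pytest.param("mps", marks=RunIf(mps=True))` entries gate accelerators, e.g. by attaching a skip mark to the "64-true" parameter.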
Error messages and logs
Error message of summarized test results:
FAILED tests/tests_pytorch/utilities/test_model_summary.py::test_model_size_precision[64-true-mps] - TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
Long error message describing test error:
_______________________________________________________ test_model_size_precision[64-true-mps] _______________________________________________________
tmp_path = PosixPath('/private/var/folders/y7/3kz_jfr578x9fgj7d4zzkdv80000gn/T/pytest-of-XXX/pytest-0/test_model_size_precision_64_t0')
accelerator = 'mps', precision = '64-true'
> ???
/XXX/test/pytorch-lightning/tests/tests_pytorch/utilities/test_model_summary.py:337:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/lightning/pytorch/trainer/trainer.py:575: in fit
call._call_and_handle_interrupt(
src/lightning/pytorch/trainer/call.py:49: in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/lightning/pytorch/trainer/trainer.py:613: in _fit_impl
self._run(model, ckpt_path=ckpt_path)
src/lightning/pytorch/trainer/trainer.py:1002: in _run
self.strategy.setup(self)
src/lightning/pytorch/strategies/strategy.py:155: in setup
self.model_to_device()
src/lightning/pytorch/strategies/single_device.py:79: in model_to_device
self.model.to(self.root_device)
src/lightning/fabric/utilities/device_dtype_mixin.py:59: in to
return super().to(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../../../mamba/envs/pl-dev/lib/python3.12/site-packages/torch/nn/modules/module.py:1369: in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
../../../../../mamba/envs/pl-dev/lib/python3.12/site-packages/torch/nn/modules/module.py:928: in _apply
module._apply(fn)
../../../../../mamba/envs/pl-dev/lib/python3.12/site-packages/torch/nn/modules/module.py:955: in _apply
param_applied = fn(param)
^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
t = Parameter containing:
tensor([[ 1.2140e-02, -1.7626e-01, -5.8662e-02, ..., -1.5010e-02,
1.7339e-01, -1.4900...4901e-02, 8.5089e-02, ..., -8.8245e-02,
-2.1469e-02, 2.5758e-03]], dtype=torch.float64, requires_grad=True)
def convert(t):
try:
if convert_to_format is not None and t.dim() in (4, 5):
return t.to(
device,
dtype if t.is_floating_point() or t.is_complex() else None,
non_blocking,
memory_format=convert_to_format,
)
> return t.to(
device,
dtype if t.is_floating_point() or t.is_complex() else None,
non_blocking,
)
E TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
../../../../../mamba/envs/pl-dev/lib/python3.12/site-packages/torch/nn/modules/module.py:1355: TypeError
Environment
Current environment:
- macOS Sequoia 15.3.2 (24D81)
- PyTorch 2.8.0
- PyTorch Lightning 2.6.0dev0 (commit b7ca4d3)
- Python 3.12.11
- Device: Apple M4 Pro (MPS backend)