
Commit 346c9b6

Merge branch 'corajr-main' into main
This improves Mac M1 installation instructions and makes the environment easier to install.
2 parents: c5e95ad + a528706

3 files changed: +71 -57 lines

README-Mac-MPS.md

Lines changed: 22 additions & 33 deletions
@@ -12,8 +12,7 @@ issue](https://github.com/CompVis/stable-diffusion/issues/25), and generally on
 
 You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work.
 
-BTW, I haven't tested any of this on Intel Macs but I have read that one person
-got it to work.
+Tested on a 2022 Macbook M2 Air with 10-core GPU and 24 GB unified memory.
 
 How to:
 
@@ -22,24 +21,23 @@ git clone https://github.com/lstein/stable-diffusion.git
 cd stable-diffusion
 
 mkdir -p models/ldm/stable-diffusion-v1/
-ln -s /path/to/ckpt/sd-v1-1.ckpt models/ldm/stable-diffusion-v1/model.ckpt
+PATH_TO_CKPT="$HOME/Documents/stable-diffusion-v-1-4-original" # or wherever yours is.
+ln -s "$PATH_TO_CKPT/sd-v1-4.ckpt" models/ldm/stable-diffusion-v1/model.ckpt
 
-conda env create -f environment-mac.yaml
+CONDA_SUBDIR=osx-arm64 conda env create -f environment-mac.yaml
 conda activate ldm
 
 python scripts/preload_models.py
-python scripts/orig_scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
+python scripts/dream.py --full_precision # half-precision requires autocast and won't work
 ```
 
-We have not gotten lstein's dream.py to work yet.
-
-After you follow all the instructions and run txt2img.py you might get several errors. Here's the errors I've seen and found solutions for.
+After you follow all the instructions and run dream.py you might get several errors. Here's the errors I've seen and found solutions for.
 
 ### Is it slow?
 
 Be sure to specify 1 sample and 1 iteration.
 
-    python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
+    python ./scripts/orig_scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
 
 ### Doesn't work anymore?
 
@@ -94,10 +92,6 @@ get quick feedback.
 
     python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
 
-### MAC: torch._C' has no attribute '_cuda_resetPeakMemoryStats' #234
-
-We haven't fixed gotten dream.py to work on Mac yet.
-
 ### OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'...
 
     python scripts/preload_models.py
@@ -108,7 +102,7 @@ Example error.
 
 ```
 ...
-NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
+NotImplementedError: The operator 'aten::_index_put_impl_' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
 ```
 
 The lstein branch includes this fix in [environment-mac.yaml](https://github.com/lstein/stable-diffusion/blob/main/environment-mac.yaml).
@@ -137,27 +131,18 @@ still working on it.
 
     OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
 
-There are several things you can do. First, you could use something
-besides Anaconda like miniforge. I read a lot of things online telling
-people to use something else, but I am stuck with Anaconda for other
-reasons.
-
-Or you can try this.
-
-    export KMP_DUPLICATE_LIB_OK=True
-
-Or this (which takes forever on my computer and didn't work anyway).
+You are likely using an Intel package by mistake. Be sure to run conda with
+the environment variable `CONDA_SUBDIR=osx-arm64`, like so:
 
-    conda install nomkl
+`CONDA_SUBDIR=osx-arm64 conda install ...`
 
-This error happens with Anaconda on Macs, and
-[nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
-is supposed to fix the issue (it isn't a module but a fix of some
-sort). [There's more
-suggestions](https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial),
-like uninstalling tensorflow and reinstalling. I haven't tried them.
+This error happens with Anaconda on Macs when the Intel-only `mkl` is pulled in by
+a dependency. [nomkl](https://stackoverflow.com/questions/66224879/what-is-the-nomkl-python-package-used-for)
+is a metapackage designed to prevent this, by making it impossible to install
+`mkl`, but if your environment is already broken it may not work.
 
-Since I switched to miniforge I haven't seen the error.
+Do *not* use `os.environ['KMP_DUPLICATE_LIB_OK']='True'` or equivalents as this
+masks the underlying issue of using Intel packages.
 
 ### Not enough memory.
 
@@ -226,4 +211,8 @@ What? Intel? On an Apple Silicon?
     The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
     The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
 
-This was actually the issue that I couldn't solve until I switched to miniforge.
+This is due to the Intel `mkl` package getting picked up when you try to install
+something that depends on it-- Rosetta can translate some Intel instructions but
+not the specialized ones here. To avoid this, make sure to use the environment
+variable `CONDA_SUBDIR=osx-arm64`, which restricts the Conda environment to only
+use ARM packages, and use `nomkl` as described above.
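
Since most of the failure modes above come down to Intel-only packages sneaking into the environment, a quick sanity check (a minimal sketch, not part of the committed README) is to confirm from inside the activated `ldm` environment that the interpreter and PyTorch build are ARM-native and that the MPS device is visible:

```python
# Sanity-check sketch: confirm the Python interpreter and PyTorch are ARM-native.
# Assumes the `ldm` conda environment is active and a PyTorch >= 1.12 nightly build.
import platform

import torch

print(platform.machine())                 # expect 'arm64'; 'x86_64' means Rosetta/Intel Python
print(torch.backends.mps.is_built())      # True if this PyTorch build was compiled with MPS support
print(torch.backends.mps.is_available())  # True on macOS 12.3+ when MPS is actually usable
```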

environment-mac.yaml

Lines changed: 46 additions & 22 deletions
@@ -1,33 +1,57 @@
 name: ldm
 channels:
-  - apple
-  - conda-forge
   - pytorch-nightly
-  - defaults
+  - conda-forge
 dependencies:
-  - python=3.10.4
-  - pip=22.1.2
+  - python==3.9.13
+  - pip==22.2.2
+
+  # pytorch-nightly, left unpinned
   - pytorch
+  - torchmetrics
   - torchvision
-  - numpy=1.23.1
+
+  # I suggest to keep the other deps sorted for convenience.
+  # If you wish to upgrade to 3.10, try to run this:
+  #
+  # ```shell
+  # CONDA_CMD=conda
+  # sed -E 's/python==3.9.13/python==3.10.5/;s/ldm/ldm-3.10/;21,99s/- ([^=]+)==.+/- \1/' environment-mac.yaml > /tmp/environment-mac-updated.yml
+  # CONDA_SUBDIR=osx-arm64 $CONDA_CMD env create -f /tmp/environment-mac-updated.yml && $CONDA_CMD list -n ldm-3.10 | awk ' {print " - " $1 "==" $2;} '
+  # ```
+  #
+  # Unfortunately, as of 2022-08-31, this fails at the pip stage.
+  - albumentations==1.2.1
+  - coloredlogs==15.0.1
+  - einops==0.4.1
+  - grpcio==1.46.4
+  - humanfriendly
+  - imageio-ffmpeg==0.4.7
+  - imageio==2.21.2
+  - imgaug==0.4.0
+  - kornia==0.6.7
+  - mpmath==1.2.1
+  - nomkl
+  - numpy==1.23.2
+  - omegaconf==2.1.1
+  - onnx==1.12.0
+  - onnxruntime==1.12.1
+  - opencv==4.6.0
+  - pudb==2022.1
+  - pytorch-lightning==1.6.5
+  - scipy==1.9.1
+  - streamlit==1.12.2
+  - sympy==1.10.1
+  - tensorboard==2.9.0
+  - transformers==4.21.2
   - pip:
-    - albumentations==0.4.6
-    - opencv-python==4.6.0.66
-    - pudb==2019.2
-    - imageio==2.9.0
-    - imageio-ffmpeg==0.4.2
-    - pytorch-lightning==1.4.2
-    - omegaconf==2.1.1
-    - test-tube>=0.7.5
-    - streamlit==1.12.0
-    - pillow==9.2.0
-    - einops==0.3.0
-    - torch-fidelity==0.3.0
-    - transformers==4.19.2
-    - torchmetrics==0.6.0
-    - kornia==0.6.0
-    - -e git+https://github.com/openai/CLIP.git@main#egg=clip
+    - invisible-watermark
+    - test-tube
+    - tokenizers
+    - torch-fidelity
+    - -e git+https://github.com/huggingface/[email protected]#egg=diffusers
     - -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
+    - -e git+https://github.com/openai/CLIP.git@main#egg=clip
     - -e git+https://github.com/lstein/k-diffusion.git@master#egg=k-diffusion
     - -e .
 variables:
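
The `nomkl` entry above is what keeps the Intel-only `mkl` BLAS out of the environment. As a rough check that it worked (a sketch, not part of the commit), NumPy's build configuration can be inspected once the environment is created:

```python
# Sketch: confirm NumPy in the `ldm` env is linked against an ARM-friendly BLAS
# (OpenBLAS/Accelerate) rather than Intel MKL. Assumes the env is active.
import numpy as np

np.show_config()  # 'mkl' appearing here means the Intel package slipped in despite nomkl
```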

ldm/simplet2i.py

Lines changed: 3 additions & 2 deletions
@@ -272,14 +272,15 @@ def process_image(image,seed):
         if not(width == self.width and height == self.height):
             width, height, _ = self._resolution_check(width, height, log=True)
 
-        scope = autocast if self.precision == 'autocast' else nullcontext
+        scope = autocast if self.precision == 'autocast' and torch.cuda.is_available() else nullcontext
 
         if sampler_name and (sampler_name != self.sampler_name):
             self.sampler_name = sampler_name
             self._set_sampler()
 
         tic = time.time()
-        torch.cuda.reset_peak_memory_stats() if self.device == 'cuda' else None
+        if torch.cuda.is_available():
+            torch.cuda.reset_peak_memory_stats()
         results = list()
 
         try:
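
Both changes to `ldm/simplet2i.py` follow the same idea: anything CUDA-specific is gated behind `torch.cuda.is_available()`, so the code degrades cleanly on MPS or CPU. A standalone sketch of that pattern (illustrative only; `run_inference`, `model`, and `batch` are hypothetical names, not the project's API):

```python
# Illustrative sketch of the device-guard pattern used in this commit: enter
# autocast only when CUDA is available, and skip CUDA-only bookkeeping otherwise.
from contextlib import nullcontext

import torch
from torch import autocast


def run_inference(model, batch, precision="autocast"):
    # Half-precision autocast is only used on CUDA; elsewhere fall back to a no-op context.
    scope = autocast if precision == "autocast" and torch.cuda.is_available() else nullcontext

    # Peak-memory statistics are a CUDA-only feature, so guard the call.
    if torch.cuda.is_available():
        torch.cuda.reset_peak_memory_stats()

    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    with scope(device_type):
        return model(batch)
```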
