
Commit 4c54204

Compiled Models in Python
1 parent: bea0442


2 files changed: 5 additions & 5 deletions


README.md

Lines changed: 3 additions & 3 deletions
@@ -429,7 +429,7 @@ This generally takes 15-20 minutes on an M1 MacBook Pro. Upon successful executi

  - `--refiner-version`: The refiner version name as published on the [Hugging Face Hub](https://huggingface.co/models?search=stable-diffusion). This is optional and if specified, this argument will convert and bundle the refiner unet alongside the model unet.

- - `--bundle-resources-for-swift-cli`: Compiles all 4 models and bundles them along with necessary resources for text tokenization into `<output-mlpackages-directory>/Resources` which should provided as input to the Swift package. This flag is not necessary for the diffusers-based Python pipeline.
+ - `--bundle-resources-for-swift-cli`: Compiles all 4 models and bundles them along with necessary resources for text tokenization into `<output-mlpackages-directory>/Resources`, which should be provided as input to the Swift package. This flag is not necessary for the diffusers-based Python pipeline. However, using these compiled models in Python significantly speeds up inference; see the loading sketch below.

  - `--quantize-nbits`: Quantizes the weights of unet and text_encoder models down to 2, 4, 6 or 8 bits using a globally optimal k-means clustering algorithm. By default all models are weight-quantized to 16 bits even if this argument is not specified. Please refer to [this section](#compression-6-bits-and-higher) for details and further guidance on weight compression.
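For readers who want to try the weight compression mentioned in the `--quantize-nbits` bullet above: k-means weight quantization corresponds to weight palettization in coremltools. A minimal sketch, assuming coremltools ≥ 7 and a placeholder `Unet.mlpackage` path; the conversion script's actual implementation may differ:

```python
import coremltools as ct
from coremltools.optimize.coreml import (
    OpPalettizerConfig,
    OptimizationConfig,
    palettize_weights,
)

# Placeholder path to a converted (uncompiled) model package.
mlmodel = ct.models.MLModel("Unet.mlpackage")

# Cluster each weight tensor into 2**nbits values via k-means,
# mirroring what `--quantize-nbits 6` requests during conversion.
config = OptimizationConfig(global_config=OpPalettizerConfig(mode="kmeans", nbits=6))
compressed = palettize_weights(mlmodel, config=config)
compressed.save("Unet-palettized.mlpackage")
```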

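On the speed-up claimed in the added line: a compiled `.mlmodelc` bundle can be loaded directly in Python via coremltools. A minimal sketch, assuming coremltools ≥ 6.1 (which provides `CompiledMLModel`); the path and input name below are placeholders rather than the pipeline's real interface:

```python
import coremltools as ct

# Load a compiled Core ML bundle directly. Unlike an .mlpackage, a .mlmodelc
# skips on-device compilation at load time, which is where the speed-up
# comes from. The path below is a placeholder for a file inside the
# Resources folder produced by --bundle-resources-for-swift-cli.
model = ct.models.CompiledMLModel(
    "Resources/TextEncoder.mlmodelc",
    compute_units=ct.ComputeUnit.ALL,
)

# Inference takes a dict keyed by the model's input names. The name and
# value here are illustrative, not the text encoder's real interface:
# outputs = model.predict({"input_ids": input_ids})
```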
@@ -455,11 +455,11 @@ This generally takes 15-20 minutes on an M1 MacBook Pro. Upon successful executi

  Run text-to-image generation using the example Python pipeline based on [diffusers](https://github.com/huggingface/diffusers):

  ```shell
- python -m python_coreml_stable_diffusion.pipeline --prompt "a photo of an astronaut riding a horse on mars" -i <output-mlpackages-directory> -o </path/to/output/image> --compute-unit ALL --seed 93
+ python -m python_coreml_stable_diffusion.pipeline --prompt "a photo of an astronaut riding a horse on mars" -i <core-ml-model-directory> -o </path/to/output/image> --compute-unit ALL --seed 93
  ```

  Please refer to the help menu for all available arguments: `python -m python_coreml_stable_diffusion.pipeline -h`. Some notable arguments:

- - `-i`: Should point to the `-o` directory from Step 4 of [Converting Models to Core ML](#converting-models-to-coreml) section from above. If you had specified `--bundle-resources-for-swift-cli` during conversion, then `-i` should point to the resulting `Resources` folder which holds the compiled `.mlmodelc` files. The compiled models load much faster after first use.
+ - `-i`: Should point to the `-o` directory from Step 4 of the [Converting Models to Core ML](#converting-models-to-coreml) section above. If you specified `--bundle-resources-for-swift-cli` during conversion, then use the resulting `Resources` folder (which holds the compiled `.mlmodelc` files). The compiled models load much faster after first use.
  - `--model-version`: If you overrode the default model version while converting models to Core ML, you will need to specify the same model version here.
  - `--compute-unit`: Note that the most performant compute unit for this particular implementation may differ across different hardware. `CPU_AND_GPU` or `CPU_AND_NE` may be faster than `ALL`. Please refer to the [Performance Benchmark](#performance-benchmark) section for further guidance.
  - `--scheduler`: If you would like to experiment with different schedulers, you may specify it here. For available options, please see the help menu. You may also specify a custom number of inference steps via `--num-inference-steps`, which defaults to 50.
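The `--compute-unit` values correspond to the `coremltools.ComputeUnit` enum. A small sketch of how such a flag can be resolved; the helper below is hypothetical, not part of the pipeline:

```python
import coremltools as ct

# Hypothetical helper (not part of the pipeline): resolve a CLI string such
# as "CPU_AND_NE" to the coremltools.ComputeUnit member passed at model load.
def resolve_compute_unit(name: str) -> ct.ComputeUnit:
    try:
        return ct.ComputeUnit[name]
    except KeyError:
        valid = ", ".join(cu.name for cu in ct.ComputeUnit)
        raise ValueError(f"Unknown compute unit '{name}'; expected one of: {valid}")

print(resolve_compute_unit("CPU_AND_NE"))  # ComputeUnit.CPU_AND_NE
```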

python_coreml_stable_diffusion/coreml_model.py

Lines changed: 2 additions & 2 deletions
@@ -159,8 +159,8 @@ def _load_mlpackage(submodule_name,

      logger.info(f"Loading {submodule_name} mlmodelc")

      # FixMe: Submodule names and compiled resources names differ. Can change if names match in the future.
-     submodule_names = ["text_encoder", "text_encoder_2", "unet", "vae_decoder"]
-     compiled_names = ['TextEncoder', 'TextEncoder2', 'Unet', 'VAEDecoder', 'VAEEncoder']
+     submodule_names = ["text_encoder", "text_encoder_2", "unet", "vae_decoder", "vae_encoder", "safety_checker"]
+     compiled_names = ['TextEncoder', 'TextEncoder2', 'Unet', 'VAEDecoder', 'VAEEncoder', 'SafetyChecker']
      name_map = dict(zip(submodule_names, compiled_names))

      cname = name_map[submodule_name] + '.mlmodelc'
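A note on the fix: `dict(zip(...))` truncates to the shorter iterable, so the old four-vs-five length mismatch silently dropped `'VAEEncoder'`, and a newly requested submodule such as `vae_encoder` would fail at `name_map[submodule_name]`. A quick illustration:

```python
# Before this commit the two lists had different lengths; dict(zip(...))
# stops at the shorter iterable, so the fifth compiled name was dropped.
old_map = dict(zip(
    ["text_encoder", "text_encoder_2", "unet", "vae_decoder"],
    ["TextEncoder", "TextEncoder2", "Unet", "VAEDecoder", "VAEEncoder"],
))
print("VAEEncoder" in old_map.values())  # False -- fifth name was dropped
print(old_map.get("vae_encoder"))        # None  -- name_map[...] would raise KeyError
```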
