@@ -0,0 +1,35 @@
Copyright (c) 2021, 2023 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,138 @@
# flux-1-finetuning

This repo is designed to quickly prepare a demo that showcases how to use OCI GPU shapes to fine-tune LoRA models of FLUX.1-dev.

FLUX is a family of diffusion models created by Black Forest Labs and is subject to [licensing terms](https://github.com/black-forest-labs/flux/blob/main/model_licenses/LICENSE-FLUX1-dev).
In this blog we show how to fine-tune FLUX.1-dev, which is available on [Hugging Face](https://huggingface.co/black-forest-labs/FLUX.1-dev).

There are several projects that can be used to work with Flux models:
- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) is a powerful and user-friendly tool for creating high-quality images using AI, including the FLUX models. It offers a modular workflow design that allows users to create custom image generation processes by connecting different components.
- [AI Toolkit](https://github.com/ostris/ai-toolkit) is a tool that simplifies the Flux fine-tuning experience, especially by reducing VRAM requirements.
- [SimpleTuner](https://github.com/bghira/SimpleTuner) is a set of scripts that simplify distributed fine-tuning on multiple GPUs.

Prerequisites:
- A Linux-based GPU VM with a recent NVIDIA driver and CUDA toolkit
- git and Miniconda installed
- A Hugging Face account that you can log in to with `huggingface-cli login` (see the example below)
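For example, once the environment below is installed (it includes huggingface-hub, which provides the CLI), you can authenticate like this; note that FLUX.1-dev is a gated model, so you must also accept its license on the model page first:

```
# Authenticate with a Hugging Face access token (create one under Settings > Access Tokens)
huggingface-cli login
# Verify that the login worked
huggingface-cli whoami
```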

## Installing AI toolkit ##

Use aitoolkit.yaml to prepare a conda environment with the required packages:

```
conda env create -f aitoolkit.yaml
conda activate aitoolkit
```

Then clone the ai-toolkit repository:
```
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
git submodule update --init --recursive
```

## Dataset generation ##

You can take 20-30 pictures of yourself. Use high-resolution images, ideally at least 1024x1024 pixels. If your images are larger, crop them to a 1:1 aspect ratio (square), centering your face or the main subject in each image. Ensure all images are sharp, in focus, and free of blur or artifacts. Avoid including any low-quality or poorly lit images. Each image should feature only you as the main subject, clearly visible and centered. Avoid group photos or images with distracting backgrounds. Maximize diversity by taking photos in different environments, with varied backgrounds, lighting conditions, facial expressions, and outfits. This helps the model generalize and prevents overfitting to a single look or scenario.
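As a sketch, the dataset is a flat folder of images; assuming ai-toolkit's default dataset handling, each image may have a matching `.txt` caption file, and `[trigger]` in a caption is replaced by the configured trigger word:

```
dataset/
├── photo_01.jpg
├── photo_01.txt   # e.g. "photo of [trigger] outdoors, smiling"
├── photo_02.jpg
├── photo_02.txt
└── ...
```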


## Training

AI Toolkit has a large set of options that can be used to train a LoRA model for FLUX.1. You can find examples in the config/examples/ directory.
Depending on your GPU, you can use them either to reduce video memory consumption or to improve training performance. The most relevant options are listed below; a config sketch follows the list.


- `folder_path: "/path/to/images/folder"` specifies where the dataset is.
- `gradient_checkpointing: true` reduces the memory footprint, but increases computation time by about 35%. On large-memory GPUs it is convenient to set it to false.
- `quantize: true` (under `model`) uses intermediate 8-bit quantization to reduce the memory footprint; the final model will still be 16-bit, so turn it on only on small GPUs.
- `low_vram: true` (under `model`) further reduces the memory footprint on very small GPUs.
- `prompts:` is a list of prompts used to create intermediate images to check quality; when analyzing performance you can remove them.
- `batch_size: 1` — increasing the batch size on a single GPU degrades performance, so it is recommended to stick with 1.
- `trigger_word: "a GPU Specialist"` sets the keyword that you can later use in prompts.
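Putting these together, here is a minimal config sketch modeled on config/examples/train_lora_flux_24gb.yaml; the job name, paths, and step counts are placeholder values, and options not discussed above follow the example defaults:

```
job: extension
config:
  name: "my_flux_lora"          # checkpoints land under output/my_flux_lora/
  process:
    - type: "sd_trainer"
      training_folder: "output"
      device: cuda:0
      trigger_word: "a GPU Specialist"
      network:
        type: "lora"
        linear: 16
        linear_alpha: 16
      datasets:
        - folder_path: "/path/to/images/folder"
          caption_ext: "txt"
          resolution: [512, 768, 1024]
      train:
        batch_size: 1
        steps: 2000
        gradient_checkpointing: true   # set to false on large-memory GPUs
        optimizer: "adamw8bit"
        lr: 1e-4
        dtype: bf16
      model:
        name_or_path: "black-forest-labs/FLUX.1-dev"
        is_flux: true
        quantize: true                 # 8-bit intermediate quantization for small GPUs
        low_vram: false
      sample:
        sample_every: 250
        prompts:                       # remove when analyzing performance
          - "photo of a GPU Specialist riding a bike"
```

Training is then started from the ai-toolkit directory with `python run.py config/my_flux_lora.yaml`.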

## Installing ComfyUI

ComfyUI can be used to test the generated LoRA model. It can be installed in the same conda env as AI Toolkit.

```
git clone --branch v0.3.10 https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
python main.py
```

You can connect to the ComfyUI GUI by pointing your browser to port 8188; depending on the network configuration, a port forward might be required.
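For example, if the VM is remote, an SSH tunnel is a simple way to reach the GUI (the user and host below are placeholders):

```
# Forward local port 8188 to ComfyUI on the VM, then browse to http://localhost:8188
ssh -L 8188:localhost:8188 ubuntu@<vm-ip-address>
```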

Then you need to download the models that are required by the workflow.

Download the [CLIP text encoder safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors) to ComfyUI/models/clip/.
This model plays a crucial role in text-to-image generation tasks by processing and encoding the textual input.

Download the [T5-XXL text encoder safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp8_e4m3fn.safetensors) to ComfyUI/models/clip/.

Download the [VAE safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/blob/main/ae.safetensors) to ComfyUI/models/vae/.

Download the [FLUX.1-dev UNET model](https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main) to ComfyUI/models/unet/.
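As a sketch, the same files can be fetched with the Hugging Face CLI from inside the ComfyUI directory; the FLUX.1-dev download requires the gated-model login from the prerequisites, and flux1-dev.safetensors is the filename published in that repo:

```
huggingface-cli download comfyanonymous/flux_text_encoders clip_l.safetensors --local-dir models/clip
huggingface-cli download comfyanonymous/flux_text_encoders t5xxl_fp8_e4m3fn.safetensors --local-dir models/clip
huggingface-cli download black-forest-labs/FLUX.1-schnell ae.safetensors --local-dir models/vae
huggingface-cli download black-forest-labs/FLUX.1-dev flux1-dev.safetensors --local-dir models/unet
```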

## Testing Lora models with ComfyUI

Every time you create a LoRA model with AI Toolkit, you can copy it to ComfyUI/models/loras.
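For example, assuming the default ai-toolkit output layout (the job name below is a placeholder):

```
cp ai-toolkit/output/my_flux_lora/my_flux_lora.safetensors ComfyUI/models/loras/
```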

Import the workflow by opening the file workflow-lora.json.

You will then be able to select the model in the Load LoRA box. Also make sure the proper models are selected in the Load Diffusion Model, DualCLIPLoader, and Load VAE boxes.

You can write your own prompt in the CLIP Text Encode box; remember to refer to the keyword used when training the LoRA.
![Alt text](files/ComfyUI.png?raw=true "ComfyUI Lora workflow")

## Installing SimpleTuner

```
git clone --branch=release https://github.com/bghira/SimpleTuner.git
```
Copy config/config.json.example to config/config.json.

Then execute the training with:

```
./train.sh
```

Parallel training is possible using Accelerate (the DeepSpeed implementation for Flux is buggy at the time of writing).
When more GPUs are used, the batch size is increased automatically, so the number of steps required to process one full epoch is reduced proportionally.


If present, the Accelerate configuration will be taken from the config file in

~/.cache/huggingface/accelerate/default_config.yaml

```
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU
downcast_bf16: 'no'
enable_cpu_affinity: true
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```

If this file is not present, you can create a file config/config.env and use it to set this environment variable:

```
TRAINING_NUM_PROCESSES=4
```
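For example, a minimal sketch to run on 4 GPUs (the variable mirrors the num_processes value from the Accelerate config above):

```
echo "TRAINING_NUM_PROCESSES=4" > config/config.env
./train.sh
```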






@@ -0,0 +1,173 @@
name: aitoolkit
channels:
- defaults
- https://repo.anaconda.com/pkgs/main
- https://repo.anaconda.com/pkgs/r
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- bzip2=1.0.8=h5eee18b_6
- ca-certificates=2024.11.26=h06a4308_0
- ld_impl_linux-64=2.40=h12ee557_0
- libffi=3.4.4=h6a678d5_1
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- libuuid=1.41.5=h5eee18b_0
- ncurses=6.4=h6a678d5_0
- openssl=3.0.15=h5eee18b_0
- pip=24.2=py311h06a4308_0
- python=3.11.10=he870216_0
- readline=8.2=h5eee18b_0
- setuptools=75.1.0=py311h06a4308_0
- sqlite=3.45.3=h5eee18b_0
- tk=8.6.14=h39e8969_0
- wheel=0.44.0=py311h06a4308_0
- xz=5.4.6=h5eee18b_1
- zlib=1.2.13=h5eee18b_1
- pip:
  - absl-py==2.1.0
  - accelerate==1.2.1
  - aiofiles==23.2.1
  - albucore==0.0.16
  - albumentations==1.4.15
  - annotated-types==0.7.0
  - antlr4-python3-runtime==4.9.3
  - anyio==4.7.0
  - attrs==24.3.0
  - bitsandbytes==0.45.0
  - certifi==2024.12.14
  - charset-normalizer==3.4.0
  - clean-fid==0.1.35
  - click==8.1.7
  - clip-anytorch==2.6.0
  - controlnet-aux==0.0.7
  - dctorch==0.1.2
  - diffusers==0.32.0.dev0
  - docker-pycreds==0.4.0
  - einops==0.8.0
  - eval-type-backport==0.2.0
  - fastapi==0.115.6
  - ffmpy==0.5.0
  - filelock==3.16.1
  - flatten-json==0.1.14
  - fsspec==2024.12.0
  - ftfy==6.3.1
  - gitdb==4.0.11
  - gitpython==3.1.43
  - gradio==5.9.1
  - gradio-client==1.5.2
  - grpcio==1.68.1
  - h11==0.14.0
  - hf-transfer==0.1.8
  - httpcore==1.0.7
  - httpx==0.28.1
  - huggingface-hub==0.27.0
  - idna==3.10
  - imageio==2.36.1
  - importlib-metadata==8.5.0
  - invisible-watermark==0.2.0
  - jinja2==3.1.4
  - jsonmerge==1.9.2
  - jsonschema==4.23.0
  - jsonschema-specifications==2024.10.1
  - k-diffusion==0.1.1.post1
  - kornia==0.7.4
  - kornia-rs==0.1.7
  - lazy-loader==0.4
  - lpips==0.1.4
  - lycoris-lora==1.8.3
  - markdown==3.7
  - markdown-it-py==3.0.0
  - markupsafe==2.1.5
  - mdurl==0.1.2
  - mpmath==1.3.0
  - networkx==3.4.2
  - ninja==1.11.1.3
  - numpy==1.26.4
  - nvidia-cublas-cu12==12.4.5.8
  - nvidia-cuda-cupti-cu12==12.4.127
  - nvidia-cuda-nvrtc-cu12==12.4.127
  - nvidia-cuda-runtime-cu12==12.4.127
  - nvidia-cudnn-cu12==9.1.0.70
  - nvidia-cufft-cu12==11.2.1.3
  - nvidia-curand-cu12==10.3.5.147
  - nvidia-cusolver-cu12==11.6.1.9
  - nvidia-cusparse-cu12==12.3.1.170
  - nvidia-nccl-cu12==2.21.5
  - nvidia-nvjitlink-cu12==12.4.127
  - nvidia-nvtx-cu12==12.4.127
  - omegaconf==2.3.0
  - open-clip-torch==2.29.0
  - opencv-python==4.10.0.84
  - opencv-python-headless==4.10.0.84
  - optimum-quanto==0.2.4
  - orjson==3.10.12
  - oyaml==1.0
  - packaging==24.2
  - pandas==2.2.3
  - peft==0.14.0
  - pillow==11.0.0
  - platformdirs==4.3.6
  - prodigyopt==1.1.1
  - protobuf==5.29.2
  - psutil==6.1.1
  - pydantic==2.10.4
  - pydantic-core==2.27.2
  - pydub==0.25.1
  - pygments==2.18.0
  - python-dateutil==2.9.0.post0
  - python-dotenv==1.0.1
  - python-multipart==0.0.20
  - python-slugify==8.0.4
  - pytorch-fid==0.3.0
  - pytz==2024.2
  - pywavelets==1.8.0
  - pyyaml==6.0.2
  - referencing==0.35.1
  - regex==2024.11.6
  - requests==2.32.3
  - rich==13.9.4
  - rpds-py==0.22.3
  - ruff==0.8.4
  - safehttpx==0.1.6
  - safetensors==0.4.5
  - scikit-image==0.25.0
  - scipy==1.14.1
  - semantic-version==2.10.0
  - sentencepiece==0.2.0
  - sentry-sdk==2.19.2
  - setproctitle==1.3.4
  - shellingham==1.5.4
  - six==1.17.0
  - smmap==5.0.1
  - sniffio==1.3.1
  - starlette==0.41.3
  - sympy==1.13.1
  - tensorboard==2.18.0
  - tensorboard-data-server==0.7.2
  - text-unidecode==1.3
  - tifffile==2024.12.12
  - timm==1.0.12
  - tokenizers==0.21.0
  - toml==0.10.2
  - tomlkit==0.13.2
  - torch==2.5.1
  - torchdiffeq==0.2.5
  - torchsde==0.2.6
  - torchvision==0.20.1
  - tqdm==4.67.1
  - trampoline==0.1.2
  - transformers==4.47.1
  - triton==3.1.0
  - typer==0.15.1
  - typing-extensions==4.12.2
  - tzdata==2024.2
  - urllib3==2.2.3
  - uvicorn==0.34.0
  - wandb==0.19.1
  - wcwidth==0.2.13
  - websockets==14.1
  - werkzeug==3.1.3
  - zipp==3.21.0
prefix: /home/ubuntu/anaconda3/envs/aitoolkit2