Generate your very own "Leichte Sprache" images by creating a dataset and fine-tuning SDXL. This repository walks you through dataset creation, fine-tuning, image generation, and evaluation. It consists of four directories:
- dataset-preparation: Scrapes "Leichte Sprache" images together with their descriptions and prepares the folder structure for fine-tuning. In total, four different dataset variants are created.
- image-generation: Applies the LoRAs to SDXL via diffusers and generates images using the scraped image descriptions as prompts.
- evaluation: Applies ImageReward and FID to the generated images.
- storage: All datasets and generated images are stored here, together with an example configuration.toml for LoRA fine-tuning. Feel free to change the hyperparameters.
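
For orientation, a LoRA config for kohya-ss sd-scripts typically contains hyperparameters like the following. This is an illustrative excerpt with example values, not the repository's actual settings; the real ones live in storage/basic-lora-config.toml:

```toml
# Illustrative excerpt of a kohya-ss sd-scripts LoRA config.
# Values are examples only, not the repository's actual settings.
pretrained_model_name_or_path = "stabilityai/stable-diffusion-xl-base-1.0"
train_batch_size = 1        # reduce if CUDA runs out of memory
learning_rate = 1e-4
network_dim = 32            # LoRA rank
network_alpha = 16
max_train_epochs = 10
```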
- Install the necessary packages
  ```shell
  pip install -r requirements.txt
  ```
- Create the basic dataset by following the notebook at dataset-preparation/create-dataset.ipynb
- Process the basic dataset and create different variations of it by following the notebook at dataset-preparation/process-dataset.ipynb
- (optional) You can find some suggested visualization steps in dataset-preparation/visualize-dataset.ipynb
- Fill in the missing absolute paths in the fine-tuning configuration file storage/basic-lora-config.toml. You can vary the hyperparameters (e.g. reduce the batch size if your CUDA device runs out of memory).
- Clone the fine-tuning scripts repository for Stable Diffusion
  ```shell
  cd ..
  git clone https://github.com/kohya-ss/sd-scripts.git
  ```
  and follow the instructions in its README.md to install the necessary requirements
- Start fine-tuning with
  ```shell
  python sdxl_train_network.py --config_file=<absolute-path-to>/basic-lora-config.toml
  ```
- Run generate_images in image-generation/generate-images.py with the base directory of one of the four dataset variants.
It will generate an image for each test image (<dataset-path>/test) and LoRA checkpoint (<dataset-path>/loras). The images are saved in <dataset-path>/output/<checkpoint-name>.
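
The directory layout that step works with can be sketched as follows. This is a hypothetical helper, not the repository's actual code; it only mirrors the pairing of checkpoints and test images described above:

```python
from pathlib import Path

def plan_outputs(dataset_path):
    """Pair every LoRA checkpoint in <dataset-path>/loras with every test
    image in <dataset-path>/test and return (checkpoint, image, output-path)
    triples, where the output path is <dataset-path>/output/<checkpoint-name>.
    Hypothetical helper mirroring the layout described above."""
    base = Path(dataset_path)
    plans = []
    for ckpt in sorted((base / "loras").glob("*.safetensors")):
        out_dir = base / "output" / ckpt.stem  # <checkpoint-name> = file stem
        for img in sorted((base / "test").iterdir()):
            plans.append((ckpt, img, out_dir / img.name))
    return plans
```

With one checkpoint and one test image, this yields a single planned output under output/<checkpoint-name>/.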
- Calculate the FID score by running calculate_fid in image-evaluation/frechet-inception-distance.py with the base directory of one of the four dataset variants.
Keep in mind that you need generated images in <dataset-path>/output/<checkpoint-name> and <dataset-path>/test-images-only/<checkpoint-name>.
- Calculate the ImageReward score by running calculate_image_reward in image-evaluation/image-reward.py with the base directory of one of the four dataset variants.
The same directory requirements apply: generated images must exist in <dataset-path>/output/<checkpoint-name> and <dataset-path>/test-images-only/<checkpoint-name>.
- (optional) Check the example visualization of ImageReward and FID in image-evaluation/visualize-evaluation.ipynb
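
For intuition on the FID metric: it measures the Fréchet distance between Gaussian statistics of Inception features extracted from real and generated images. In one dimension the formula reduces to the snippet below; this is an illustration of the metric only, not the repository's implementation, which operates on high-dimensional feature means and covariance matrices:

```python
import math

def fid_1d(mu1, var1, mu2, var2):
    """Frechet distance between two univariate Gaussians:
    (mu1 - mu2)^2 + var1 + var2 - 2*sqrt(var1*var2).
    The real FID applies the matrix analogue of this formula to
    Inception feature statistics of the two image sets."""
    return (mu1 - mu2) ** 2 + var1 + var2 - 2.0 * math.sqrt(var1 * var2)
```

Identical distributions give a distance of 0; lower FID means the generated images are statistically closer to the test images.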
This repository has been created as part of a bachelor's thesis titled 'Leichte Sprache und generative KI: Bilder als Unterstützung für Texte in Leichter Sprache.'
The dataset names in this repository follow a slightly different naming strategy than the thesis. Below is a mapping of the dataset names:
| Name in repository | Name in thesis |
|---|---|
| translated-description_random-split | RS_D |
| generated-description_random-split | RS_IC |
| translated-description_category-split | CS_D |
| generated-description_category-split | CS_IC |
The original output is available at storage/CS_D/output, storage/CS_IC/output, storage/RS_D/output and storage/RS_IC/output.