This repository contains the official code for the paper: "Beyond-the-Brush: Fully-automated Crafting of Realistic Inpainted Images" presented at WIFS, 2024.
Beyond the Brush (BtB) is a fully automated pipeline for generating realistic inpainted images. It consists of three main modules:
- Mask Extraction: identifies meaningful regions in an image to be inpainted using a segmentation procedure.
- Prompt Generation: uses a Visual Language Model to determine the replacement content for the identified regions.
- Inpainting: inpaints the input images using the extracted masks and generated prompts.
1. Clone the repository
git clone https://github.com/IAPP-Group/Beyond-the-Brush.git
cd Beyond-the-Brush
2. Set up the environment
conda create -n btb python=3.8
conda activate btb
pip install -r requirements.txt
3. Download Pre-trained Models
Download the RAM model weights and place them in a directory.
This module extracts meaningful regions for inpainting.
python process_images.py --output_path MASKS_FOLDER --top_masks 10 --ram_model_path RAM_WEIGHTS.pth --device cuda input_list_path
- --output_path specifies the directory to save the extracted masks (default: ./out)
- --top_masks is the number of top masks to extract per image (corresponding to the highest score)
- --ram_model_path is the path to the downloaded RAM model
- --device specifies the device to use for the extraction (cuda or cpu, default: cuda)
- input_list_path is the path to a text file containing a list of image paths to process.
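The input list file can be produced by hand or with a small helper. The sketch below is one possible way to build it, assuming the script expects one image path per line (the exact format is not stated here) and that the images have common extensions; `write_image_list` and `IMAGE_EXTENSIONS` are hypothetical names, not part of this repository.

```python
from pathlib import Path

# Assumption: which file extensions count as images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, list_path):
    """Write one absolute image path per line and return how many were written.

    Assumption: the mask extraction script reads one path per line.
    """
    paths = sorted(
        p.resolve()
        for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return len(paths)


if __name__ == "__main__":
    # Hypothetical folder and file names, replace with your own.
    count = write_image_list("images", "input_list.txt")
    print(f"wrote {count} image paths")
```

The resulting file can then be passed as the positional input_list_path argument of process_images.py.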
The extracted data is saved in JSON and NPZ formats:
- JSON files store metadata and mask information.
- NPZ files contain the binary mask arrays.
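To check what an extraction run produced, the output folder can be inspected before post-processing. The sketch below uses only the standard library and relies on the fact that an NPZ file is a zip archive of .npy members, so array names can be listed without loading the arrays. It assumes each JSON file has an NPZ file with the same base name, which is not confirmed here; `npz_array_names` and `inspect_outputs` are hypothetical helpers, and the actual keys and array names depend on the extraction script.

```python
import json
import zipfile
from pathlib import Path


def npz_array_names(npz_path):
    """List the array names stored in an NPZ file without loading the arrays."""
    with zipfile.ZipFile(npz_path) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]


def inspect_outputs(masks_folder):
    """Summarize each JSON file and, if present, the NPZ file with the same stem.

    Assumption: JSON and NPZ files for one image share a base name.
    """
    summary = {}
    for json_path in sorted(Path(masks_folder).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            metadata = json.load(f)
        npz_path = json_path.with_suffix(".npz")
        summary[json_path.stem] = {
            "json_keys": sorted(metadata) if isinstance(metadata, dict) else None,
            "npz_arrays": npz_array_names(npz_path) if npz_path.exists() else None,
        }
    return summary
```

To read the mask arrays themselves, `numpy.load` on the NPZ file is the usual route.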
After extraction, post-process the masks for the next step.
python post_processing.py --input_dir MASKS_FOLDER --num_images 500 --dataset_name Flickr30k --save_dir_masks MASKS_DESTINATION_FOLDER --save_dir_bb BB_DESTINATION_FOLDER
- --input_dir is the directory with .json and .npz files from mask extraction
- --num_images is the number of images to sample (default: None, which means all)
- --dataset_name is the name of the source dataset (e.g., Flickr30k, VISION, FloreView)
- --save_dir_masks is the directory to save masks as .png files
- --save_dir_bb is the directory to save the images with green bounding boxes
This script generates PNG masks of three different sizes (small, medium, and large) for each input image, and PNG images with a green bounding box for prompt generation. It also generates three txt files within the project directory:
- "best_mask_labels_dataset_name.txt": it contains the list of images with their labels and sizes. Specifically, each line in the file follows the format: ImageName_MaskAreaSize - Label
- "sampled_num_images_images_dataset_name.txt": it contains the list of sampled json files from the input_dir
- "source_images_path_dataset_name.txt": it contains the paths of the sampled source images.
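The label file format described above (ImageName_MaskAreaSize - Label) can be read back with a few lines of Python. This is a minimal sketch, assuming the mask size is the text after the last underscore on the left side and that the label follows the last " - " separator; `parse_label_line` and `read_labels` are hypothetical helpers, not functions from this repository.

```python
def parse_label_line(line):
    """Split 'ImageName_MaskAreaSize - Label' into (image_name, mask_size, label)."""
    # Assumption: the last ' - ' on the line separates the label from the rest.
    left, sep, label = line.strip().rpartition(" - ")
    if not sep:
        raise ValueError(f"missing ' - ' separator: {line!r}")
    # Assumption: the mask size follows the last underscore, so image names
    # that themselves contain underscores are kept intact.
    image_name, underscore, mask_size = left.rpartition("_")
    if not underscore:
        raise ValueError(f"missing '_' before mask size: {line!r}")
    return image_name, mask_size, label.strip()


def read_labels(path):
    """Parse every non-empty line of a label file into a list of tuples."""
    with open(path, encoding="utf-8") as f:
        return [parse_label_line(line) for line in f if line.strip()]
```

Splitting from the right on both separators is a design choice that tolerates underscores in image names; it would misparse a label that itself contains " - ".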
This module generates text prompts for the inpainting step using a Visual Language Model.
python generate_prompts.py --output_dir PROMPT_DESTINATION_FOLDER --bounding_box_dir BB_DESTINATION_FOLDER --labels_file best_masks_labels_Flickr30k.txt --num_prompt 5
- --output_dir is the directory to store generated prompts as .json files for each image
- --bounding_box_dir is the directory containing the images with the green bounding box
- --labels_file is the txt file containing the labels for each image produced in the post-processing step
- --num_prompt is the number of prompts to generate for each image (default: 5)
This module inpaints the images using Fooocus.
- Install Fooocus and Fooocus-API by following their setup instructions.
- Start the Fooocus-API app.
python inpaint_images_fooocus.py --images_path source_images_path_Flickr30k.txt --masks_path MASKS_DESTINATION_FOLDER --prompt PROMPT_DESTINATION_FOLDER --save_path BTB_DESTINATION_FOLDER --fooocus_api_dir FOOOCUS_API_FOLDER
- --images_path is the txt file with the paths of the source images to be inpainted
- --masks_path is the directory containing the extracted masks
- --prompt is the directory containing the generated prompts
- --save_path is the directory to save the inpainted images
- --fooocus_api_dir is the path of the Fooocus-API directory
The Beyond the Brush dataset used in the paper is available on Hugging Face 🤗.
If you use this code, please cite our paper:
@inproceedings{bertazzini2024beyond,
title={Beyond the Brush: Fully-automated Crafting of Realistic Inpainted Images},
author={Bertazzini, Giulia and Albisani, Chiara and Baracchi, Daniele and Shullani, Dasara and Piva, Alessandro},
booktitle={2024 IEEE International Workshop on Information Forensics and Security (WIFS)},
pages={1--6},
year={2024},
organization={IEEE}
}
