Clawdia Monet is an interactive AI art application that transforms photos of your cats into beautiful sketches and paintings. Powered by Google Gemini models, this Streamlit application simulates a commission with the virtual artist, Clawdia Monet, who specializes in feline art.
This project showcases a multi-agent workflow in which different AI agents collaborate on a creative task. Users upload a photo, and the application verifies that it contains a cat. If it does, a series of AI agents generate a detailed sketch and then a full-color painting based on the original image.
The application is built with Python and Streamlit, making it easy to run locally or deploy as a web app.
The core of Clawdia Monet is its pipeline of specialized AI agents. Each agent has a distinct role, and they pass information to one another to complete the final artwork.
flowchart TB
User[User]
Upload[Upload Image]
IsCat{Is it a Cat?}
CatCheck[1. Cat Check Agent]
InstructSketch[2. Sketch Instructor Agent]
CatSketch[3. Sketch Artist Agent]
InstructPaint[4. Painting Instructor Agent]
CatPaint[5. Painting Artist Agent]
Exit[End Session]
DisplaySketch[Display Sketch]
DisplayPainting[Display Painting]
User -- Uploads Photo --> Upload
Upload -- Passes Image --> CatCheck
CatCheck -- Analyzes Image --> IsCat
IsCat -- No --> Exit
IsCat -- Yes --> InstructSketch
InstructSketch -- Creates Instructions --> CatSketch
CatSketch -- Generates Sketch --> DisplaySketch
DisplaySketch -- User Clicks 'Paint' & Passes Sketch --> InstructPaint
InstructPaint -- Creates Instructions --> CatPaint
CatPaint -- Generates Painting --> DisplayPainting
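The flow in the diagram can be sketched as a plain function that calls each agent in order. This is only an illustration of the data flow, not the code in app.py: the `run_commission` function, the `wants_painting` flag, and the stub agents (which stand in for Gemini calls and return strings instead of images) are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical stand-ins for the five agents; the real ones call Gemini models.
stub_agents: dict[str, Callable[..., object]] = {
    "cat_check": lambda photo: "cat" in photo,
    "instruct_sketch": lambda photo: f"sketch steps for {photo}",
    "cat_sketch": lambda instructions, photo: f"sketch({photo})",
    "instruct_artist": lambda photo, sketch: f"paint steps for {sketch}",
    "cat_paint": lambda instructions, sketch: f"painting({sketch})",
}


def run_commission(photo, agents, wants_painting=True):
    """Walk the diagram: check for a cat, sketch, then optionally paint."""
    result = {"photo": photo, "sketch": None, "painting": None}

    # "Is it a Cat?" decision: a "No" ends the session early.
    if not agents["cat_check"](photo):
        return result

    # Instructor writes the steps, artist produces the sketch from them.
    sketch_instructions = agents["instruct_sketch"](photo)
    result["sketch"] = agents["cat_sketch"](sketch_instructions, photo)

    # The painting stage only runs once the user clicks "Paint".
    if not wants_painting:
        return result

    paint_instructions = agents["instruct_artist"](photo, result["sketch"])
    result["painting"] = agents["cat_paint"](paint_instructions, result["sketch"])
    return result


print(run_commission("cat.jpg", stub_agents))
print(run_commission("dog.jpg", stub_agents))
```

Passing the agents in as a dictionary keeps the control flow testable without any API calls; in the Streamlit app the same gating is driven by the user's button clicks.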
- Cat Check Agent (`cat_check`): An AI agent that analyzes the uploaded image to verify it contains a cat. It returns a structured JSON response.
- Sketch Instructor Agent (`instruct_sketch`): If a cat is found, this agent generates detailed, step-by-step instructions for creating a pencil sketch from the image.
- Sketch Artist Agent (`cat_sketch`): This agent takes the instructions and the original image to generate a new image: the sketch.
- Painting Instructor Agent (`instruct_artist`): After the sketch is approved, this agent creates new, detailed instructions for turning the sketch into a painting, referencing both the original photo and the sketch.
- Painting Artist Agent (`cat_paint`): The final agent uses the painting instructions and the sketch to generate the final, full-color painting.
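To make the structured JSON response of the Cat Check Agent concrete, here is a minimal sketch of validating such a reply. The `is_cat` and `observation` fields come from the `CatCheck` schema described in the workflow section; the project lists pydantic among its libraries, so the dataclass below is only a dependency-free stand-in, and `parse_cat_check` and the sample reply text are invented for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CatCheck:
    """Stand-in for the response schema; the project may define it with pydantic."""
    is_cat: bool
    observation: str


def parse_cat_check(raw: str) -> CatCheck:
    """Validate a JSON reply and convert it into a CatCheck (hypothetical helper)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("is_cat"), bool):
        raise ValueError("is_cat must be a boolean")
    if not isinstance(data.get("observation"), str):
        raise ValueError("observation must be a string")
    return CatCheck(is_cat=data["is_cat"], observation=data["observation"])


# Invented sample reply, shaped like the schema described in this README.
reply = '{"is_cat": true, "observation": "A regal tabby, ready for her portrait."}'
check = parse_cat_check(reply)
print(check.is_cat, "-", check.observation)
```

Constraining the reply to a fixed schema lets the app branch on a real boolean instead of interpreting free-form model text.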
- Image Upload & Validation (`upload_workflow`):
  - The user is prompted to upload an image of their cat.
  - The application checks if the file is a valid image format (JPG, PNG) and opens it.
  - The image is resized if it's larger than 1024x1024 pixels to ensure it can be processed by the models.
- Cat Identification (`cat_check_workflow`):
  - The first agent, `cat_check`, is invoked to determine if the uploaded image actually contains a cat.
  - This agent uses the `gemini-2.0-flash` model with a JSON response schema (`CatCheck`) to return a boolean `is_cat` and a creative `observation` message for the user.
  - If no cat is found, the process stops, and the user is prompted to start over.
- Sketch Instruction & Generation (`draw_cat_workflow`):
  - If a cat is identified, the `instruct_sketch` agent is called. This agent acts as an "art instructor," generating detailed, step-by-step instructions on how to draw the cat and its background in a traditional pencil-on-brown-paper style.
  - These instructions are then passed to the `cat_sketch` agent, which embodies the artist "Clawdia Monet." This agent uses the `gemini-2.0-flash-preview-image-generation` model to generate a sketch based on the original image and the detailed instructions.
  - The generated sketch is displayed to the user, who can then choose to "Start Painting" or "Sketch Again."
- Painting Instruction & Generation (`paint_cat_workflow`):
  - When the user proceeds, the `instruct_artist` agent is called. This agent acts as an "artist's assistant," creating a new set of instructions for Clawdia Monet. It uses both the original photo and the newly created sketch as references to describe how to turn the drawing into a painting in a specific artistic style.
  - Finally, the `cat_paint` agent takes these painting instructions and the sketch to generate the final painting, again using the `gemini-2.0-flash-preview-image-generation` model.
  - The final painting is displayed, and the user can choose to "Paint Again" or "Start Over."
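The resize rule from the upload step can be illustrated with a small size calculation. The README only states that images larger than 1024x1024 are resized; preserving the aspect ratio, and the `fit_within` helper itself, are assumptions for this sketch rather than a description of app.py.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Return dimensions scaled down to fit inside a max_side x max_side box.

    Images that already fit are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= max_side and height <= max_side:
        return width, height

    # Scale by the longer side so both dimensions end up within the limit.
    scale = max_side / max(width, height)
    # Round to whole pixels, but never let a very thin image collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4032, 3024))  # (1024, 768)
print(fit_within(800, 600))    # (800, 600)
```

The resulting tuple could be passed to Pillow's `Image.resize`. Pillow's `Image.thumbnail` performs a similar aspect-preserving shrink in place; whether the application uses either call is not stated here.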
- Backend: Python
- Frontend: Streamlit
- AI Models: Google Gemini (including `gemini-2.0-flash` and image generation models)
- Core Libraries: `google-genai`, `streamlit`, `pydantic`, `pillow`
- Deployment: Docker
Here is a summary of the key files in this repository:
.
├── .gitignore # Standard Python gitignore file.
├── .streamlit/
│ └── config.toml # Streamlit theme and configuration.
├── Dockerfile # Docker configuration for containerizing the app.
├── LICENSE # Apache 2.0 License.
├── README.md # You are here!
├── app.py # The main Streamlit application logic and agent definitions.
├── flow.mmd # Mermaid diagram of the agentic workflow.
├── images/ # Contains the app icon and other static images.
├── requirements.txt # Python package dependencies.
└── run.py # Startup script to modify Streamlit's HTML before running the app.
To run this project locally, you will need Python 3.12+ and an environment management tool like venv.
- Clone the repository:

      git clone https://github.com/peterjakubowski/clawdia-monet.git
      cd clawdia-monet

- Create and activate a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

- Install the dependencies:

      pip install -r requirements.txt

- Set up your API Key: You need a Google AI API key to use the Gemini models.
  - Create a file named `.env` in the root of the project.
  - Add your API key to this file:

        GOOGLE_API_KEY="YOUR_API_KEY_HERE"

  The application will load this key automatically.
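This README does not say how the key is loaded, so the following is only a rough illustration of what reading a `.env` file involves. Both helpers (`parse_env` and `require_api_key`) are hypothetical and written for this example; a dedicated library such as python-dotenv handles many more cases than this sketch does.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def require_api_key(env: dict[str, str]) -> str:
    """Return the key, or fail early if it is missing or still the placeholder."""
    key = env.get("GOOGLE_API_KEY", "")
    if not key or key == "YOUR_API_KEY_HERE":
        raise RuntimeError("Set GOOGLE_API_KEY in .env before starting the app")
    return key


sample = '# local secrets\nGOOGLE_API_KEY="abc123"\n'
print(require_api_key(parse_env(sample)))
```

Failing early on a missing or placeholder key gives a clearer error than a rejected request to the Gemini API later on.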
Once the setup is complete, you can run the application using Streamlit.
Local Development:
Launch the Streamlit server with the following command:

    streamlit run app.py --server.port 8501 --server.headless false

Alternatively, you can run the `run.py` script first to apply branding changes and then launch the app:

    python run.py
    streamlit run app.py

Using Docker:
This repository includes a Dockerfile for easy containerization.

- Build the Docker image:

      docker build -t clawdia-monet .

- Run the Docker container. Make sure to pass your `GOOGLE_API_KEY` as an environment variable:

      docker run -p 8501:8501 -e GOOGLE_API_KEY="YOUR_API_KEY_HERE" clawdia-monet

  The application will be available at http://localhost:8501.
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.
