@ChristopheZhao ChristopheZhao commented Mar 15, 2025

What I'm trying to accomplish

This PR aims to enhance the text-to-image generation workflow by adding intelligent prompt processing capabilities. Specifically, it:

  1. Makes the WebUI more accessible to non-English speakers by automatically detecting and translating prompts
  2. Improves image generation quality through prompt optimization using LangGraph-based agent workflows
  3. Provides a seamless experience that works within the existing txt2img interface

Summary of changes in code

  • Added scripts/txt2img_prompt_optimizer.py - A new script that:

    • Implements a Script class that integrates with the WebUI's txt2img tab
    • Uses LangGraph to create an agent-based workflow for prompt processing
    • Detects non-English text and translates it to English
    • Optimizes prompts to improve generation quality while preserving intent
    • Handles API key management through environment variables
    • Provides graceful fallbacks when optional dependencies are missing
  • Updated requirements.txt to include:

    • python-dotenv for environment variable management
    • langgraph for building the agent workflow
  • Updated requirements_versions.txt with specific versions:

    • Added compatible versions of new dependencies
    • Ensured version compatibility with existing dependencies
  • Updated .gitignore to exclude:

    • .env files containing sensitive API keys
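
The behaviour listed above for the new script (detect, translate, optimize, fall back gracefully) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `API_KEY_ENV`, `looks_non_latin`, and `process_prompt` are hypothetical, the `translate` and `optimize` callables stand in for whatever the LangGraph workflow actually calls, and the detection heuristic is an assumption that only catches non-Latin scripts. Latin-script languages such as French, Spanish, or Kiswahili would need a model or a language-identification library instead.

```python
import importlib.util
import os
from typing import Callable, Optional

# Hypothetical variable name; the PR does not say which key it reads.
API_KEY_ENV = "PROMPT_OPTIMIZER_API_KEY"

# True only when the optional dependency can be imported.
HAS_LANGGRAPH = importlib.util.find_spec("langgraph") is not None

Transform = Callable[[str], str]


def api_key_available() -> bool:
    """Report whether a non-empty API key is set in the environment."""
    return bool(os.environ.get(API_KEY_ENV))


def looks_non_latin(prompt: str, threshold: float = 0.3) -> bool:
    """Return True when at least `threshold` of the letters are non-ASCII.

    Assumed heuristic: it flags scripts such as Chinese or Japanese but
    cannot tell French or Spanish apart from English.
    """
    letters = [ch for ch in prompt if ch.isalpha()]
    if not letters:
        return False
    non_ascii = sum(1 for ch in letters if ord(ch) > 127)
    return non_ascii / len(letters) >= threshold


def process_prompt(
    prompt: str,
    translate: Optional[Transform] = None,
    optimize: Optional[Transform] = None,
) -> str:
    """Translate if needed, then optimize; return the input on any failure."""
    result = prompt
    try:
        if translate is not None and looks_non_latin(result):
            result = translate(result)
        if optimize is not None:
            result = optimize(result)
    except Exception:
        # Graceful fallback: generation proceeds with the user's own prompt.
        return prompt
    return result


if __name__ == "__main__":
    # Stand-in callables; a real setup would call a model here.
    fake_translate = lambda _text: "a kitten under a pine tree"
    fake_optimize = lambda text: text + ", detailed, soft light"
    print(process_prompt("松树下的小猫", fake_translate, fake_optimize))
```

Passing the translation and optimization steps in as callables keeps the sketch runnable without LangGraph or an API key; a caller could check `HAS_LANGGRAPH` and `api_key_available()` first and pass `None` for both steps when either is missing, which leaves the prompt untouched.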

Issues fixed

This PR addresses Issue #4576, a feature request for multilingual prompt support that was previously marked as "not planned".

The implementation:

  1. Adds multilingual support through automatic translation of non-English prompts
  2. Goes beyond the original request by also implementing prompt optimization
  3. Integrates seamlessly with the existing txt2img interface without requiring changes to the core pipeline

Screenshots/videos:

The screenshots below show how the system handles prompts in several languages: the backend view shows the translation, and the frontend view shows the generated result. Each test uses the prompt 'a kitten under a pine tree' written in the language named.

Chinese (Simplified):

  • backend

image

  • frontend

image

Japanese:

  • backend

image

  • frontend

image

French:

  • backend

image

  • frontend

image

Spanish:

  • backend

image

  • frontend

image

Vietnamese:

  • backend

image

  • frontend

image

Kiswahili:

  • backend

image

  • frontend

image

English (English prompts are also optimized automatically):

  • backend

image

  • frontend

image

Checklist:

- Add txt2img_prompt_optimizer.py script for automatic prompt translation and optimization
- Support non-English prompts with automatic translation to English
- Implement prompt optimization using LangGraph workflow
- Add python-dotenv and langgraph dependencies
- Update requirements.txt and requirements_versions.txt with new dependencies

@catboxanon (Collaborator)

This should be an extension.

@catboxanon catboxanon closed this May 2, 2025