ComfyUI_Nano_Banana

License: MIT Python 3.12 Built with Gemini

A set of custom nodes for ComfyUI that leverage both Google Vertex AI and Google Generative AI SDK to generate images from text prompts, single images, and multiple images with configurable aspect ratios and resolutions using the Gemini Image model.

What's New

Version 6.1.0 - Nano Banana 2 Support

This minor update adds support for the new Nano Banana 2 model (gemini-3.1-flash-image-preview), optimized for speed and high-volume use cases.

New Features:

New Nano Banana 2 AIO Node: A dedicated node for the gemini-3.1-flash-image-preview model with support for:

  • Up to 14 reference images (10 objects + 4 characters)
  • New aspect ratios: 1:4, 4:1, 1:8, 8:1
  • 512px (0.5K) resolution option
  • Google Image Search grounding alongside Web Search

New Nano Banana 2 Multi-Turn Chat Node: A conversational image generation and editing node with all Nano Banana 2 features:

  • Multi-turn conversations with preserved context
  • Up to 14 reference images for iterative editing
  • Extreme aspect ratios and 512px resolution
  • Google Image Search grounding

Important: Before using Nano Banana 2 nodes, please update the google-genai package to the latest version:

pip install google-genai --upgrade

Version 6.0.2 - Auto Aspect Ratio Feature

This patch update adds the "Auto" aspect ratio option, allowing the AI to automatically determine the best aspect ratio for your generated images.

New Features:

Fixed Issue #14: Added "Auto" as an additional aspect ratio option in both NanoBananaAIO and NanoBananaMultiTurnChat nodes. When selected, the AI automatically chooses the optimal aspect ratio based on the prompt content.

Version 6.0.1 - Fix for MALFORMED_FUNCTION_CALL Issue

This patch update fixes the MALFORMED_FUNCTION_CALL error that occurred when using the Google GenAI SDK with the gemini-3-pro-image-preview model and Automatic Function Calling (AFC).

Fixes:

Fixed Issue #12: The MALFORMED_FUNCTION_CALL error has been resolved by explicitly disabling Automatic Function Calling (AFC) configuration in the Google GenAI SDK, preventing malformed function call errors when using the gemini-3-pro-image-preview model. The update also includes enhanced error handling.


Version 6.0 - The Multi-Turn Chat & Interactive Image Generation Update

This major update introduces a Multi-Turn Chat node that enables conversational image generation and editing with preserved context across multiple interactions.

New Features:

Multi-Turn Chat Node: A new "Nano Banana Multi-Turn Chat" node supports conversational image generation and editing. It maintains conversation history and allows iterative image modifications by referencing previous images as context for new generations. A reset option starts a fresh conversation.

Enhanced Conversation Context: The node preserves conversation flow across multiple node executions, allowing for iterative improvements and refinements to generated images. It accepts initial images to start a conversation and builds on them in subsequent turns.


For a complete history of changes, see the CHANGELOG.md file.

Installation

  1. Clone this repository into your custom_nodes folder.
    cd ComfyUI/custom_nodes
    git clone https://github.com/ru4ls/ComfyUI_Nano_Banana.git
  2. Install the required dependencies:
    pip install -r ComfyUI_Nano_Banana/requirements.txt
  3. For Nano Banana 2 AIO and Multi-Turn Chat support, update the google-genai package to the latest version:
    pip install google-genai --upgrade

Configuration Setup

You can use either the Google Generative AI API approach (simpler) or the Google Vertex AI approach (more powerful), depending on your needs.

Option 1: Google Generative AI API

To use the simpler API approach, you need a Google AI API key, which you can obtain from Google AI Studio. Note that the Gemini API is a paid service and may incur costs.

  1. Copy the .env.api.template file to create your .env file:
    cp .env.api.template .env
  2. Edit the .env file and replace YOUR_API_KEY with your actual API key:
    GOOGLE_API_KEY="your-actual-api-key-here"
    

Option 2: Google Vertex AI

For access to the full functionality, including the thinking process output and enhanced grounding capabilities, use the Vertex AI approach with your Google Cloud project.

This approach requires a Google Cloud project with the Vertex AI API enabled.

  1. Enable the Vertex AI API: Follow the instructions in the Google Cloud documentation to enable the API for your project.

  2. Authenticate Your Environment: This node uses Application Default Credentials (ADC) to securely authenticate with Google Cloud. Run the following gcloud command in your terminal to log in and set up your credentials. This is a one-time setup.

    gcloud auth application-default login

    The node authenticates directly through the installed Python libraries and does not depend on the gcloud.cmd executable being available in your system's PATH at runtime.

  3. Create a .env file: Copy the .env.vertexai.template file to create your .env file:

    cp .env.vertexai.template .env

    Then edit the .env file and add your Google Cloud project details:

    PROJECT_ID="your-gcp-project-id"
    LOCATION="your-gcp-location"  # e.g., us-central1 (Note: gemini-3-pro models automatically use global endpoint)
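
Before launching ComfyUI, it can help to confirm that the .env file parses the way you expect, in particular that the inline comment after LOCATION is not read as part of the value. The sketch below is a minimal stand-alone parser written for illustration; `parse_env` is a hypothetical helper, and how the node itself loads the file is not shown here and may differ.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=value lines, handling quotes and trailing '# comments'.

    Hypothetical helper for sanity-checking a .env file; not part of the node.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, full-line comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep only what sits between the quotes.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing inline comment.
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


sample = 'PROJECT_ID="your-gcp-project-id"\nLOCATION="us-central1"  # e.g., us-central1\n'
print(parse_env(sample))
```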
    

Automatic Approach Detection

The system automatically detects and uses the available credentials:

  • If both PROJECT_ID and LOCATION are set, it uses the Vertex AI approach
  • If only GOOGLE_API_KEY is set, it uses the API approach
  • If neither is available, an error is shown
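
The detection rules above can be sketched as a small function. This is an illustration of the described precedence, not the node's actual code: `detect_approach` is a hypothetical name, and the handling of a partially filled Vertex AI configuration (falling back to the API key if one is present) is an assumption.

```python
import os
from typing import Mapping, Optional


def detect_approach(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "vertexai" or "api" based on which credentials are set."""
    env = os.environ if env is None else env
    project_id = env.get("PROJECT_ID", "").strip()
    location = env.get("LOCATION", "").strip()
    api_key = env.get("GOOGLE_API_KEY", "").strip()

    # Vertex AI needs both values; it takes priority when both are present.
    if project_id and location:
        return "vertexai"
    # Otherwise fall back to the API key (assumed behavior for partial configs).
    if api_key:
        return "api"
    raise RuntimeError(
        "No credentials found: set PROJECT_ID and LOCATION, or GOOGLE_API_KEY, in .env"
    )


print(detect_approach({"GOOGLE_API_KEY": "your-actual-api-key-here"}))
```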

Nodes

Nano Banana All-in-One (AIO)

This unified node combines all features from the existing nodes into a single, powerful interface. It dynamically adapts its behavior based on the image_count parameter: generating a single image (like NanoBananaGrounding) or multiple images (1-10) with the same powerful grounding, search, and thinking capabilities. This is the recommended node for new workflows.

Inputs:

  • model_name (STRING): The Gemini model to use. Defaults to gemini-3-pro-image-preview, which offers advanced capabilities.
  • prompt (STRING): The text prompt for image generation or manipulation.
  • image_count (INT): Number of images to generate (1-10). When set to 1, behaves like NanoBananaGrounding; when >1, generates multiple sequential images (default: 1).
  • use_search (BOOLEAN): Toggle to enable or disable Google Search functionality (default: True).
  • image_1 to image_6 (IMAGE, optional): Up to six reference images. Provide at least one image for image-to-image generation.
  • aspect_ratio (STRING): The output aspect ratio for the generated image. Options include: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9, Auto (default: 1:1). When set to Auto, the AI automatically determines the optimal aspect ratio based on the prompt content.
  • image_size (STRING): The output image quality/size. Options include: 1K, 2K, 4K (default: 2K).
  • temperature (FLOAT, optional): Controls the creative randomness of the output. Higher values (e.g., 1.2) are more creative, lower values (e.g., 0.5) are more deterministic.

Available Aspect Ratios & Resolutions:

  • 1:1 - 1024x1024 (square)
  • 2:3 - 832x1248 (portrait)
  • 3:2 - 1248x832 (landscape)
  • 3:4 - 864x1184 (portrait)
  • 4:3 - 1184x864 (landscape)
  • 4:5 - 896x1152 (portrait)
  • 5:4 - 1152x896 (landscape)
  • 9:16 - 768x1344 (vertical/video)
  • 16:9 - 1344x768 (horizontal/video)
  • 21:9 - 1536x672 (ultrawide)
  • Auto - AI automatically selects the best aspect ratio
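
For scripting around the node (for example, sizing a preview canvas), the table above can be expressed as a lookup. This is a sketch only: the names `BASE_RESOLUTIONS`, `resolution_for`, and `ratio_error` are hypothetical, and the pixel sizes are copied from the list as given. Several of them approximate the nominal ratio rather than matching it exactly, which the second helper makes visible.

```python
# Pixel sizes copied from the list above; "Auto" has no fixed size.
BASE_RESOLUTIONS = {
    "1:1": (1024, 1024),
    "2:3": (832, 1248),
    "3:2": (1248, 832),
    "3:4": (864, 1184),
    "4:3": (1184, 864),
    "4:5": (896, 1152),
    "5:4": (1152, 896),
    "9:16": (768, 1344),
    "16:9": (1344, 768),
    "21:9": (1536, 672),
}


def resolution_for(aspect_ratio: str):
    """Return (width, height) for a listed ratio, or None for "Auto"."""
    if aspect_ratio == "Auto":
        return None
    return BASE_RESOLUTIONS[aspect_ratio]


def ratio_error(aspect_ratio: str) -> float:
    """Relative difference between the listed size and the nominal ratio."""
    width, height = BASE_RESOLUTIONS[aspect_ratio]
    ratio_w, ratio_h = map(int, aspect_ratio.split(":"))
    nominal = ratio_w / ratio_h
    return abs(width / height - nominal) / nominal


print(resolution_for("16:9"), round(ratio_error("16:9"), 4))
```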

Outputs:

  • images (IMAGE): Batch of generated images (single image when image_count=1, multiple images when image_count>1).
  • thinking (STRING): The AI's thought process and reasoning (only available with the Vertex AI approach; API users see a helpful message instead).
  • grounding_sources (STRING): Citation information with source URLs and search queries used to generate the response.

Note: When using the Google Generative AI API approach (as opposed to Vertex AI), the thinking and grounding_sources outputs will include helpful messages about using Vertex AI for full capabilities.

Nano Banana Multi-Turn Chat

This node supports conversational image generation and editing with preserved context across multiple interactions. It maintains conversation history and allows iterative image modifications by referencing previous images as context for new generations. A reset option starts a fresh conversation.

Inputs:

  • model_name (STRING): The Gemini model to use. Defaults to gemini-3-pro-image-preview, which offers advanced capabilities.
  • prompt (STRING): The text prompt for image generation or modification based on previous conversation context.
  • reset_chat (BOOLEAN): Toggle to reset the conversation history and start a fresh chat session (default: False).
  • aspect_ratio (STRING): The output aspect ratio for the generated image. Options include: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9, Auto (default: 1:1). When set to Auto, the AI automatically determines the optimal aspect ratio based on the prompt content.
  • image_size (STRING): The output image quality/size. Options include: 1K, 2K, 4K (default: 2K).
  • temperature (FLOAT): Controls the creative randomness of the output. Higher values (e.g., 1.2) are more creative, lower values (e.g., 0.5) are more deterministic (default: 1.0).
  • image_input (IMAGE, optional): Initial image to start the conversation with. Use this to provide an initial image for the first interaction in a conversation.

Available Aspect Ratios & Resolutions:

  • 1:1 - 1024x1024 (square)
  • 2:3 - 832x1248 (portrait)
  • 3:2 - 1248x832 (landscape)
  • 3:4 - 864x1184 (portrait)
  • 4:3 - 1184x864 (landscape)
  • 4:5 - 896x1152 (portrait)
  • 5:4 - 1152x896 (landscape)
  • 9:16 - 768x1344 (vertical/video)
  • 16:9 - 1344x768 (horizontal/video)
  • 21:9 - 1536x672 (ultrawide)
  • Auto - AI automatically selects the best aspect ratio

Outputs:

  • image (IMAGE): The generated image based on the current prompt and conversation context.
  • response_text (STRING): The AI's response text to the current prompt.
  • metadata (STRING): Generation metadata including finish reason and safety ratings.
  • chat_history (STRING): Complete conversation history with all prompts and responses.
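
The interaction between prompt, reset_chat, and chat_history can be pictured with a small state holder. This is a conceptual sketch, not the node's implementation: `ChatSession` and its methods are hypothetical, the model call is replaced by a placeholder string, and it assumes that a reset clears the history before the current prompt is processed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChatSession:
    """Keeps (prompt, response) pairs across executions of the same node."""

    turns: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, prompt: str, reset_chat: bool = False) -> str:
        if reset_chat:
            # Assumed behavior: forget earlier turns, then handle this prompt.
            self.turns.clear()
        # Placeholder for the real model call, which would receive the
        # earlier turns as context and return an image plus text.
        response = f"[image for turn {len(self.turns) + 1}]"
        self.turns.append((prompt, response))
        return response

    @property
    def chat_history(self) -> str:
        return "\n".join(f"User: {p}\nModel: {r}" for p, r in self.turns)


session = ChatSession()
session.run("Create an image of a clear perfume bottle sitting on a vanity")
session.run("Change the color of the liquid to a vibrant royal blue")
print(session.chat_history)
```

Because the history lives on the session object, reusing the same node instance is what keeps the context; a new instance starts empty.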

Nano Banana 2 AIO

A dedicated node for the gemini-3.1-flash-image-preview model, optimized for speed and high-volume use cases. This node provides all the latest features including support for up to 14 reference images, new extreme aspect ratios, and Google Image Search grounding.

Note: Requires the latest google-genai package. Update with: pip install google-genai --upgrade

Inputs:

  • model_name (STRING): The Gemini model to use. Defaults to gemini-3.1-flash-image-preview for high-efficiency image generation.
  • prompt (STRING): The text prompt for image generation or manipulation.
  • image_count (INT): Number of images to generate (1-10). When set to 1, behaves like single image generation; when >1, generates multiple sequential images (default: 1).
  • use_search (BOOLEAN): Toggle to enable or disable Google Search functionality (default: False).
  • use_image_search (BOOLEAN): Toggle to enable Google Image Search grounding for visual reference accuracy (default: False). This feature requires use_search to be enabled as well.
  • image_1 to image_14 (IMAGE, optional): Up to fourteen reference images. Gemini 3.1 Flash supports up to 10 object images with high fidelity and up to 4 character images for consistency.
  • aspect_ratio (STRING): The output aspect ratio for the generated image. Options include: 1:1, 1:4, 1:8, 2:3, 3:2, 3:4, 4:1, 4:3, 4:5, 5:4, 8:1, 9:16, 16:9, 21:9, Auto (default: 1:1). When set to Auto, the AI automatically determines the optimal aspect ratio.
  • image_size (STRING): The output image quality/size. Options include: 512px, 1K, 2K, 4K (default: 2K). The 512px option is exclusive to Nano Banana 2.
  • temperature (FLOAT, optional): Controls the creative randomness of the output. Higher values (e.g., 1.2) are more creative, lower values (e.g., 0.5) are more deterministic.
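
The constraints in the input list above can be checked up front before queueing a run. The following is a sketch under stated assumptions: `check_nb2_inputs` is a hypothetical helper, the limits are taken from the descriptions in this section, and the node's own validation (if any) may behave differently.

```python
NB2_ASPECT_RATIOS = {
    "1:1", "1:4", "1:8", "2:3", "3:2", "3:4", "4:1", "4:3",
    "4:5", "5:4", "8:1", "9:16", "16:9", "21:9", "Auto",
}
NB2_IMAGE_SIZES = {"512px", "1K", "2K", "4K"}
MAX_REFERENCE_IMAGES = 14  # image_1 to image_14


def check_nb2_inputs(
    image_count: int = 1,
    reference_images: int = 0,
    use_search: bool = False,
    use_image_search: bool = False,
    aspect_ratio: str = "1:1",
    image_size: str = "2K",
) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 1 <= image_count <= 10:
        problems.append("image_count must be between 1 and 10")
    if not 0 <= reference_images <= MAX_REFERENCE_IMAGES:
        problems.append("at most 14 reference images are supported")
    if use_image_search and not use_search:
        problems.append("use_image_search requires use_search to be enabled")
    if aspect_ratio not in NB2_ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {aspect_ratio}")
    if image_size not in NB2_IMAGE_SIZES:
        problems.append(f"unsupported image_size: {image_size}")
    return problems


print(check_nb2_inputs(use_image_search=True))
```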

Available Aspect Ratios & Resolutions:

  • 1:1 - 1024x1024 (square)
  • 1:4 - 512x2048 (extreme portrait) - Nano Banana 2 Exclusive
  • 1:8 - 256x2048 (ultra extreme portrait) - Nano Banana 2 Exclusive
  • 2:3 - 832x1248 (portrait)
  • 3:2 - 1248x832 (landscape)
  • 3:4 - 864x1184 (portrait)
  • 4:3 - 1184x864 (landscape)
  • 4:1 - 2048x512 (extreme landscape) - Nano Banana 2 Exclusive
  • 4:5 - 896x1152 (portrait)
  • 5:4 - 1152x896 (landscape)
  • 8:1 - 2048x256 (ultra extreme landscape) - Nano Banana 2 Exclusive
  • 9:16 - 768x1344 (vertical/video)
  • 16:9 - 1344x768 (horizontal/video)
  • 21:9 - 1536x672 (ultrawide)
  • Auto - AI automatically selects the best aspect ratio
  • 512px - 512x512 base resolution - Nano Banana 2 Exclusive

Outputs:

  • images (IMAGE): Batch of generated images (single image when image_count=1, multiple images when image_count>1).
  • thinking (STRING): The AI's thought process and reasoning (only available with the Vertex AI approach; API users see a helpful message instead).
  • grounding_sources (STRING): Citation information with source URLs and search queries used to generate the response. Includes both web and image search results when enabled.

Note: When using the Google Generative AI API approach (as opposed to Vertex AI), the thinking and grounding_sources outputs will include helpful messages about using Vertex AI for full capabilities.

Nano Banana 2 Multi-Turn Chat

A conversational image generation and editing node using the gemini-3.1-flash-image-preview model. Supports multi-turn conversations with preserved context, allowing iterative image modifications. Includes all Nano Banana 2 features: 14 reference images, extreme aspect ratios, and Image Search grounding.

Note: Requires the latest google-genai package. Update with: pip install google-genai --upgrade

Inputs:

  • model_name (STRING): The Gemini model to use. Defaults to gemini-3.1-flash-image-preview for high-efficiency image generation.
  • prompt (STRING): The text prompt for image generation or modification based on previous conversation context.
  • reset_chat (BOOLEAN): Toggle to reset the conversation history and start a fresh chat session (default: False).
  • use_search (BOOLEAN): Toggle to enable or disable Google Search functionality (default: False).
  • use_image_search (BOOLEAN): Toggle to enable Google Image Search grounding for visual reference accuracy (default: False). This feature requires use_search to be enabled as well.
  • aspect_ratio (STRING): The output aspect ratio for the generated image. Options include: 1:1, 1:4, 1:8, 2:3, 3:2, 3:4, 4:1, 4:3, 4:5, 5:4, 8:1, 9:16, 16:9, 21:9, Auto (default: 1:1). When set to Auto, the AI automatically determines the optimal aspect ratio.
  • image_size (STRING): The output image quality/size. Options include: 512px, 1K, 2K, 4K (default: 2K). The 512px option is exclusive to Nano Banana 2.
  • temperature (FLOAT): Controls the creative randomness of the output. Higher values (e.g., 1.2) are more creative, lower values (e.g., 0.5) are more deterministic (default: 1.0).
  • image_1 to image_14 (IMAGE, optional): Up to fourteen reference images. Gemini 3.1 Flash supports up to 10 object images with high fidelity and up to 4 character images for consistency.

Available Aspect Ratios & Resolutions:

  • 1:1 - 1024x1024 (square)
  • 1:4 - 512x2048 (extreme portrait) - Nano Banana 2 Exclusive
  • 1:8 - 256x2048 (ultra extreme portrait) - Nano Banana 2 Exclusive
  • 2:3 - 832x1248 (portrait)
  • 3:2 - 1248x832 (landscape)
  • 3:4 - 864x1184 (portrait)
  • 4:3 - 1184x864 (landscape)
  • 4:1 - 2048x512 (extreme landscape) - Nano Banana 2 Exclusive
  • 4:5 - 896x1152 (portrait)
  • 5:4 - 1152x896 (landscape)
  • 8:1 - 2048x256 (ultra extreme landscape) - Nano Banana 2 Exclusive
  • 9:16 - 768x1344 (vertical/video)
  • 16:9 - 1344x768 (horizontal/video)
  • 21:9 - 1536x672 (ultrawide)
  • Auto - AI automatically selects the best aspect ratio
  • 512px - 512x512 base resolution - Nano Banana 2 Exclusive

Outputs:

  • image (IMAGE): The generated image based on the current prompt and conversation context.
  • response_text (STRING): The AI's response text to the current prompt.
  • metadata (STRING): Generation metadata including finish reason and safety ratings.
  • chat_history (STRING): Complete conversation history with all prompts and responses.

Example Workflow:

  • First execution: Connect reference images and enter "Create a product shot of this perfume bottle on a marble pedestal"
  • Second execution: "Change the background to a sunset beach scene"
  • Third execution: "Add water droplets on the bottle for a fresh look"

Note: When using the Google Generative AI API approach (as opposed to Vertex AI), outputs will include helpful messages about using Vertex AI for full capabilities.

Example Usage

Text to Image Generation (with configurable aspect ratio)

  1. Add the Nano Banana AIO node to your workflow.
  2. Select your desired aspect_ratio from the dropdown (e.g., 16:9 for wide landscape, 9:16 for vertical, etc.).
  3. Enter a prompt.
  4. Ensure none of the image_1 to image_6 inputs are connected.
  5. Connect the images output to a PreviewImage or SaveImage node to see the result.

Sample Prompt: "A cinematic close-up of a transparent glass chess piece (a knight) aimed at the camera. Inside the glass piece, a tiny glowing galaxy swirls with purple and gold nebulae. The background is a dimly lit library with dust motes dancing in a single shaft of volumetric golden hour light striking the chess piece, creating caustic light refractions on the wooden table."

[Images: screenshot (2025-11-23) and generated output NanoBanana_Pro_00010_]

Image Editing and Image Fusion Generation (with configurable aspect ratio and 1 to 6 reference images)

  1. Add the Nano Banana AIO node to your workflow.
  2. Select your desired aspect_ratio from the dropdown (the original images will be adapted to this output aspect ratio).
  3. Connect one or more LoadImage nodes (up to 6) to the image_1 to image_6 inputs.
  4. Enter a prompt describing the desired changes or outcome.
  5. Connect the images output to a PreviewImage or SaveImage node to see the result.
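
Because reference images are adapted to the selected output ratio, it can be useful to pick the supported ratio closest to the source image so that less of it needs to be cropped or extended. The helper below is a user-side sketch, not something the node provides; `closest_aspect_ratio` is a hypothetical name, and comparing ratios on a log scale is a design choice that treats portrait and landscape deviations symmetrically.

```python
import math

SUPPORTED_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def closest_aspect_ratio(width: int, height: int, options=SUPPORTED_RATIOS) -> str:
    """Return the option whose width/height ratio is nearest to the image's."""
    target = math.log(width / height)

    def distance(option: str) -> float:
        ratio_w, ratio_h = map(int, option.split(":"))
        return abs(math.log(ratio_w / ratio_h) - target)

    return min(options, key=distance)


# A 1920x1080 reference image maps to the 16:9 option.
print(closest_aspect_ratio(1920, 1080))
```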

Sample Prompt: "A high-speed freeze-frame photograph of the glass chess knight shattering into thousands of sharp, crystalline shards. The galaxy inside is bursting outward, spilling purple and gold nebulae mist into the room. The glass fragments are suspended in mid-air, each one refracting the golden sunlight and the internal galaxy light. The wooden table is covered in glittering debris. Maintain the realistic depth of field and volumetric dust."

[Images: screenshot (2025-11-23) and generated output NanoBanana_Pro_00011_]

Grounding with Search Results Generation

  1. Add the Nano Banana AIO node to your workflow.
  2. Enter a prompt that requires current data or information from the web (e.g., weather forecasts, current events, trending topics).
  3. Toggle the use_search parameter to True to enable Google Search functionality.
  4. Optionally connect reference images if needed.
  5. Set the desired aspect_ratio and image_size based on your needs.
  6. Connect the two outputs, images and grounding_sources, to appropriate display nodes.
  7. The grounding_sources output will contain citations and links to the sources used in generating the response.

Sample Prompt: "Search for and visualize the current weather forecast for the next 5 days in Jakarta in a clean, modern glass hud style with the city as a background weather chart. Add a realistic visual of what I could wear each day."

Example Workflow:

  • The node will perform a Google search based on your prompt
  • Generate an image based on the search results
  • List all sources and citations used in the generation process

[Images: screenshot (2025-11-23) and generated output NanoBanana_Pro_00012_]

Grounding with Search and Multi-Image Results Generation

  1. Add the Nano Banana AIO node to your workflow.
  2. Enter a prompt that requires current data or information from the web (e.g., weather forecasts, current events, trending topics), along with a per-image variation guide (e.g., image 1 of 4 ..., image 2 of 4 ...).
  3. Set image_count to the desired number of images.
  4. Toggle the use_search parameter to True to enable Google Search functionality.
  5. Optionally connect reference images if needed.
  6. Set the desired aspect_ratio and image_size based on your needs.
  7. Connect the two outputs, images and grounding_sources, to appropriate display nodes.
  8. The grounding_sources output will contain citations and links to the sources used in generating the response.

Sample Prompt: "Using provided image ensure style consistency, composition and how data displayed. Search for and visualize the current weather forecast for the next 5 days in [CITY], with the city iconic spot as a background weather chart.

CITY image 1 of 4 Jakarta. image 2 of 4 Tokyo. image 3 of 4 London. image 4 of 4 Amsterdam."
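
The "image N of M" variation guide in the sample prompt can be generated from a list rather than typed by hand. This is a sketch for building the prompt text only; `build_multi_image_prompt` is a hypothetical helper, and the exact wording the model responds to best is not specified by this document.

```python
def build_multi_image_prompt(base_prompt: str, placeholder: str, values: list) -> str:
    """Append an 'image i of n' guide that fills `placeholder` per image."""
    total = len(values)
    guide = " ".join(
        f"image {index} of {total} {value}."
        for index, value in enumerate(values, start=1)
    )
    return f"{base_prompt}\n\n{placeholder} {guide}"


cities = ["Jakarta", "Tokyo", "London", "Amsterdam"]
prompt = build_multi_image_prompt(
    "Search for and visualize the current weather forecast for the next 5 days in [CITY].",
    "CITY",
    cities,
)
print(prompt)
# Set image_count on the node to len(cities) so each entry gets one image.
```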

Example Workflow:

  • The node will perform a Google search based on your prompt
  • Generate a set of images that follow the reference image's style, based on the search results
  • List all sources and citations used in the generation process

[Images: screenshot (2025-11-24) and generated outputs NanoBanana_AIO_00037_ to NanoBanana_AIO_00040_]

Multi-Turn Chat Conversation (with preserved context)

  1. Add the NanoBananaMultiTurnChat node to your workflow.
  2. Enter your initial prompt to generate the first image.
  3. Optionally connect an initial image_input to start the conversation with a specific image.
  4. Set your desired aspect_ratio and image_size parameters.
  5. Execute the node to generate the initial image and response.
  6. For subsequent interactions, use the same node instance with the same parameters but change the prompt to continue the conversation and modify the image iteratively.
  7. Use reset_chat to start a fresh conversation when needed.
  8. Connect the output image to a PreviewImage or SaveImage node to see the results.
  9. The chat_history output shows the complete conversation history.

Example Workflow:

  • First execution: "Create an image of a clear perfume bottle sitting on a vanity"
  • Second execution: "Change the color of the liquid inside the glass bottle to a vibrant royal blue"
  • Third execution: "Extreme close-up on the glass texture and silver cap of the blue perfume bottle. The framing is cropped tightly and weighted to the left"

[Images: three screenshots (2025-11-30) and generated outputs NanoBanana_Multiturn_00001_ to NanoBanana_Multiturn_00003_]

License

This project is licensed under the MIT License - see the LICENSE file for details.
