Fix rgb_to_hsv, hsv_to_rgb, and rgb_to_grayscale channel validation for channels_first with keras.Input#22511

Open
xingzihai wants to merge 2 commits into keras-team:master from xingzihai:fix-rgb-to-hsv-channels-first-validation

Conversation

@xingzihai

Fixes #22472

Summary

When keras.Input is created with shape=(H, W) and then passed to an op with the channels_first format, Keras builds a 3D symbolic tensor of shape (None, H, W), where None is the batch dimension. The original code used channels_axis=-3 for channels_first, which for this 3D input points at the batch dimension rather than a channel dimension. As a result, channel validation was skipped and the op produced an incorrect output shape instead of raising an error.
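
The axis mix-up can be seen with plain Python indexing, without importing Keras. This is a minimal illustration that assumes the symbolic shape is represented as a list with None for the batch dimension; the variable names are hypothetical.

```python
# Shape Keras builds for keras.Input(shape=(4, 3)): the batch dim is None.
symbolic_shape = [None, 4, 3]
# Shape of a batched channels_first image input, for comparison.
batched_shape = [None, 3, 4, 4]

channels_axis = -3  # axis used for channels_first before the fix

# For the 4D shape, -3 is the channel dimension.
print(batched_shape[channels_axis])   # 3
# For the 3D symbolic shape, -3 lands on the batch dimension instead.
print(symbolic_shape[channels_axis])  # None
```

Because the entry found at that axis is None (unknown) rather than a concrete channel count, there is nothing for a channel-count check to compare against.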

Changes

This fix:

  1. Detects when a 3D input with channels_first has None as the first dimension (indicating it came from keras.Input)
  2. Raises a clear error explaining the issue and how to fix it
  3. Fixes the channel axis calculation to use 1 for 4D inputs and 0 for 3D inputs with channels_first
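
The three steps above can be sketched as a standalone function. This is an illustration only: the function name and the shortened error message are hypothetical, and the real change lives inside the compute_output_spec methods listed below.

```python
def channels_first_axis(shape):
    """Hypothetical helper mirroring the three steps of the fix."""
    rank = len(shape)
    # Steps 1 and 2: a 3D shape whose first entry is None is treated as a
    # batched input from keras.Input and rejected with an explicit error.
    if rank == 3 and shape[0] is None:
        raise ValueError(
            "3D channels_first input with unknown first dimension; "
            f"received shape={shape}"
        )
    # Step 3: channel axis is 1 for 4D inputs and 0 for 3D inputs.
    return 1 if rank == 4 else 0


for shape in [(None, 3, 4, 4), (3, 4, 4), (None, 4, 3)]:
    try:
        print(shape, "-> channel axis", channels_first_axis(shape))
    except ValueError as e:
        print(shape, "-> rejected:", e)
```

The sketch only handles the 3D and 4D cases discussed in this PR; it makes no attempt to validate other ranks.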

Affected Operations

  • RGBToHSV.compute_output_spec
  • HSVToRGB.compute_output_spec
  • RGBToGrayscale.compute_output_spec

Tests Added

  • test_rgb_to_hsv_invalid_channels_first_with_batched_3d_input
  • test_hsv_to_rgb_invalid_channels_first_with_batched_3d_input
  • test_rgb_to_grayscale_invalid_channels_first_with_batched_3d_input

Reproduction (from issue #22472)

import keras
import numpy as np

# Create test input
test_input = np.random.random((4, 4, 3)).astype(np.float32)

# Test with eager tensor (dynamic)
print("Testing with eager tensor:")
try:
    dynamic_result = keras.ops.image.rgb_to_hsv(test_input, data_format="channels_first")
    print(f"Dynamic output shape: {dynamic_result.shape}")
except Exception as e:
    print(f"Dynamic execution error: {e}")

# Test with Keras.Input placeholder (static)
print("\nTesting with Keras.Input placeholder:")
try:
    placeholder_input = keras.Input(shape=(4, 3), dtype="float32")
    static_result = keras.ops.image.rgb_to_hsv(placeholder_input, data_format="channels_first")
    print(f"Static output shape: {static_result.shape}")
except Exception as e:
    print(f"Static execution error: {e}")

Before fix:

  • Eager tensor: input must have 3 channels but input only has 4 channels
  • keras.Input placeholder: returns output shape (None, 4, 3) without raising an error

After fix:

  • Both eager and symbolic inputs correctly raise a ValueError explaining the issue

- Add PositionalEncoding layer implementing sinusoidal positional encodings
- Add TransformerBlock layer with multi-head self-attention and feed-forward network
- Add create_transformer_classifier function demonstrating model composition
- Add comprehensive test suite covering:
  - PositionalEncoding: shape preservation, encoding formula, serialization
  - TransformerBlock: shape preservation, configuration validation, training modes
  - Integration tests: model training, inference, gradient flow

All layers use backend-agnostic Keras operations (keras.ops) for compatibility
with TensorFlow, JAX, and PyTorch backends.
…or channels_first with keras.Input

@google-cla

google-cla bot commented Mar 26, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical issue in Keras image color conversion operations (rgb_to_hsv, hsv_to_rgb, rgb_to_grayscale) where channel validation for channels_first inputs was flawed when dealing with symbolic 3D inputs from keras.Input. The fix ensures proper validation and provides actionable error messages, preventing silent failures and unexpected output shapes. Additionally, a new comprehensive example for building custom Transformer components (Positional Encoding and Transformer Block) has been introduced, complete with detailed tests.

Highlights

  • Improved Channel Validation: Implemented detection for 3D inputs with channels_first format that have a None batch dimension (typically from keras.Input), which previously led to incorrect behavior.
  • Enhanced Error Messaging: Introduced clear ValueError messages for invalid channels_first inputs, guiding users on how to correctly define input shapes for batched 3D or 4D images.
  • Corrected Channel Axis Calculation: Adjusted the channel axis calculation for channels_first format to correctly identify the channel dimension as index 1 for 4D inputs and 0 for 3D inputs.
  • New Transformer Example: Added a new example demonstrating custom Keras layers for Positional Encoding and Transformer Blocks, showcasing backend-agnostic operations.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces custom Keras PositionalEncoding and TransformerBlock layers, complete with a demo and comprehensive tests. It also includes fixes for keras.ops.image functions (RGBToGrayscale, RGBToHSV, HSVToRGB) to correctly handle channels_first input shapes, especially for 3D inputs from keras.Input, and adds corresponding tests. Feedback suggests that the new transformer example and tests should be moved to a separate PR for better focus. Additionally, there's a request to update a misleading comment in the PositionalEncoding layer and to refactor duplicated input validation logic in the image operations into helper functions for improved maintainability. The author is also asked to replace [Your Name] in the example file's metadata.

Comment on lines +1 to +8
"""
Title: Custom Transformer Block with Positional Encoding
Author: [Your Name]
Date created: 2024/03/26
Last modified: 2024/03/26
Description: Creating custom positional encoding and transformer block layers
from scratch using Keras 3 backend-agnostic operations.
"""

medium

This new example file (and its corresponding test file) seems unrelated to the main purpose of this pull request, which is to fix image ops. To keep pull requests focused and easier to review, it would be better to move these files to a separate PR.

@@ -0,0 +1,462 @@
"""
Title: Custom Transformer Block with Positional Encoding
Author: [Your Name]

medium

Please replace [Your Name] with your actual name or GitHub handle.

Comment on lines +102 to +104
# Use scatter_update to fill in sine values for even indices
# and cosine values for odd indices
# Note: We need to handle this carefully for backend compatibility

medium

This comment is misleading as scatter_update is not used. The implementation uses stack and reshape to interleave the sine and cosine values, which is a good approach. Please update the comment to reflect the actual implementation.

Suggested change
- # Use scatter_update to fill in sine values for even indices
- # and cosine values for odd indices
- # Note: We need to handle this carefully for backend compatibility
+ # Create positional encodings by interleaving sine and cosine values.
+ # `pe[:, 0::2] = sin(position * div_term)`
+ # `pe[:, 1::2] = cos(position * div_term)`
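
The stack-and-reshape interleaving the reviewer describes can be sketched as follows. The PR's actual layer code is not shown in this conversation, so this is an assumption-laden illustration: it uses NumPy in place of keras.ops, the function name is hypothetical, and the div_term formula is the conventional sinusoidal one rather than something confirmed by the source.

```python
import numpy as np


def sinusoidal_encoding(seq_len, d_model):
    """Sketch of sinusoidal positional encodings; d_model must be even."""
    position = np.arange(seq_len, dtype=np.float64)[:, None]  # (seq_len, 1)
    # Conventional frequency term (assumed, not taken from the PR).
    div_term = np.exp(
        np.arange(0, d_model, 2, dtype=np.float64) * (-np.log(10000.0) / d_model)
    )  # (d_model / 2,)
    angles = position * div_term  # (seq_len, d_model / 2)
    # Stack to (seq_len, d_model / 2, 2), then flatten the last two axes so
    # sine lands on even indices and cosine on odd indices.
    stacked = np.stack([np.sin(angles), np.cos(angles)], axis=-1)
    return stacked.reshape(seq_len, d_model)


pe = sinusoidal_encoding(seq_len=4, d_model=6)
print(pe.shape)  # (4, 6)
```

No in-place indexed assignment is needed with this approach, which is presumably why it suits backends whose tensors are immutable.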

Comment on lines +28 to +47
+ rank = len(images_shape)
+ # Check for invalid batched 3D input with channels_first.
+ if (
+     rank == 3
+     and self.data_format == "channels_first"
+     and images_shape[0] is None
+ ):
+     raise ValueError(
+         "Invalid input shape for channels_first format: "
+         "3D input with unknown first dimension detected. "
+         "For channels_first format with keras.Input, use "
+         "shape=(3, height, width) to create a 4D batched input "
+         "(batch, channels, height, width), or pass a 3D numpy array "
+         "of shape (3, height, width) for unbatched input. "
+         f"Received: images.shape={images_shape}"
+     )
  if self.data_format == "channels_last":
      images_shape[-1] = 1
- else:
-     images_shape[-3] = 1
+ else:  # channels_first
+     images_shape[1 if rank == 4 else 0] = 1

medium

There's some duplicated logic across RGBToGrayscale, RGBToHSV, and HSVToRGB for validating channels_first inputs and determining the channel axis. To improve maintainability, you could extract this logic into helper functions.

For example, you could have:

  1. A validation helper:
def _validate_channels_first_input(images_shape, data_format):
    rank = len(images_shape)
    if (
        rank == 3
        and data_format == "channels_first"
        and images_shape[0] is None
    ):
        raise ValueError(
            "Invalid input shape for channels_first format: "
            # ... (rest of the error message)
        )
  2. A helper for the channel axis logic:
def _get_channels_first_axis(rank):
    return 1 if rank == 4 else 0

This would make the compute_output_spec methods cleaner and avoid repeating the same code.
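
As a rough sketch of how the two suggested helpers might be combined, the function below applies them to the grayscale case, where the channel dimension of the output is set to 1. The helper names come from the review comment; grayscale_output_shape is a hypothetical stand-in for the shape logic in compute_output_spec, and the error message is shortened for the example.

```python
def _validate_channels_first_input(images_shape, data_format):
    rank = len(images_shape)
    if rank == 3 and data_format == "channels_first" and images_shape[0] is None:
        raise ValueError(
            "Invalid input shape for channels_first format. "  # shortened
            f"Received: images.shape={images_shape}"
        )


def _get_channels_first_axis(rank):
    return 1 if rank == 4 else 0


def grayscale_output_shape(images_shape, data_format):
    """Hypothetical stand-in for the shape logic in compute_output_spec."""
    images_shape = list(images_shape)
    _validate_channels_first_input(images_shape, data_format)
    if data_format == "channels_last":
        images_shape[-1] = 1
    else:  # channels_first
        images_shape[_get_channels_first_axis(len(images_shape))] = 1
    return tuple(images_shape)


print(grayscale_output_shape((None, 3, 8, 8), "channels_first"))  # (None, 1, 8, 8)
print(grayscale_output_shape((8, 8, 3), "channels_last"))  # (8, 8, 1)
```

With the shared helpers in place, RGBToHSV and HSVToRGB would only need the validation call, since their output shape matches the input shape.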

@codecov-commenter

codecov-commenter commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.32%. Comparing base (cc7078b) to head (10361fd).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #22511      +/-   ##
==========================================
+ Coverage   83.20%   83.32%   +0.12%     
==========================================
  Files         596      596              
  Lines       67621    67634      +13     
  Branches    10531    10536       +5     
==========================================
+ Hits        56266    56358      +92     
+ Misses       8630     8538      -92     
- Partials     2725     2738      +13     
| Flag | Coverage Δ |
|---|---|
| keras | 83.14% <100.00%> (+0.11%) ⬆️ |
| keras-jax | 60.00% <100.00%> (+<0.01%) ⬆️ |
| keras-numpy | 54.26% <100.00%> (+0.01%) ⬆️ |
| keras-openvino | 51.17% <100.00%> (+<0.01%) ⬆️ |
| keras-tensorflow | 61.31% <100.00%> (+0.07%) ⬆️ |
| keras-torch | 60.18% <100.00%> (+0.08%) ⬆️ |


Collaborator

@hertschuh hertschuh left a comment

Please also accept the CLA.

Comment on lines +1 to +8
"""
Title: Custom Transformer Block with Positional Encoding
Author: [Your Name]
Date created: 2024/03/26
Last modified: 2024/03/26
Description: Creating custom positional encoding and transformer block layers
from scratch using Keras 3 backend-agnostic operations.
"""

The demo files are unrelated to the bug, please remove.


Development

Successfully merging this pull request may close these issues.

rgb_to_hsv does not validate channel count for Keras Input with channels_first
