Fix rgb_to_hsv, hsv_to_rgb, and rgb_to_grayscale channel validation for channels_first with keras.Input#22511
Conversation
- Add PositionalEncoding layer implementing sinusoidal positional encodings - Add TransformerBlock layer with multi-head self-attention and feed-forward network - Add create_transformer_classifier function demonstrating model composition - Add comprehensive test suite covering: - PositionalEncoding: shape preservation, encoding formula, serialization - TransformerBlock: shape preservation, configuration validation, training modes - Integration tests: model training, inference, gradient flow All layers use backend-agnostic Keras operations (keras.ops) for compatibility with TensorFlow, JAX, and PyTorch backends.
…or channels_first with keras.Input Fixes keras-team#22472 When using keras.Input with shape=(H, W) and channels_first format, keras creates a 3D tensor (None, H, W) where None is the batch dimension. The original code used channels_axis=-3 for channels_first, which would point to the batch dimension for this 3D input, skipping the channel validation and producing incorrect output shapes. This fix: 1. Detects when a 3D input with channels_first has None as the first dimension (indicating it came from keras.Input) 2. Raises a clear error explaining the issue and how to fix it 3. Fixes the channel axis calculation to use 1 for 4D inputs and 0 for 3D inputs with channels_first The fix is applied to: - RGBToHSV.compute_output_spec - HSVToRGB.compute_output_spec - RGBToGrayscale.compute_output_spec Tests added: - test_rgb_to_hsv_invalid_channels_first_with_batched_3d_input - test_hsv_to_rgb_invalid_channels_first_with_batched_3d_input - test_rgb_to_grayscale_invalid_channels_first_with_batched_3d_input
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical issue in Keras image color conversion operations ( Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces custom Keras PositionalEncoding and TransformerBlock layers, complete with a demo and comprehensive tests. It also includes fixes for keras.ops.image functions (RGBToGrayscale, RGBToHSV, HSVToRGB) to correctly handle channels_first input shapes, especially for 3D inputs from keras.Input, and adds corresponding tests. Feedback suggests that the new transformer example and tests should be moved to a separate PR for better focus. Additionally, there's a request to update a misleading comment in the PositionalEncoding layer and to refactor duplicated input validation logic in the image operations into helper functions for improved maintainability. The author is also asked to replace [Your Name] in the example file's metadata.
| """ | ||
| Title: Custom Transformer Block with Positional Encoding | ||
| Author: [Your Name] | ||
| Date created: 2024/03/26 | ||
| Last modified: 2024/03/26 | ||
| Description: Creating custom positional encoding and transformer block layers | ||
| from scratch using Keras 3 backend-agnostic operations. | ||
| """ |
| @@ -0,0 +1,462 @@ | |||
| """ | |||
| Title: Custom Transformer Block with Positional Encoding | |||
| Author: [Your Name] | |||
| # Use scatter_update to fill in sine values for even indices | ||
| # and cosine values for odd indices | ||
| # Note: We need to handle this carefully for backend compatibility |
There was a problem hiding this comment.
This comment is misleading as scatter_update is not used. The implementation uses stack and reshape to interleave the sine and cosine values, which is a good approach. Please update the comment to reflect the actual implementation.
| # Use scatter_update to fill in sine values for even indices | |
| # and cosine values for odd indices | |
| # Note: We need to handle this carefully for backend compatibility | |
| # Create positional encodings by interleaving sine and cosine values. | |
| # `pe[:, 0::2] = sin(position * div_term)` | |
| # `pe[:, 1::2] = cos(position * div_term)` |
| rank = len(images_shape) | ||
| # Check for invalid batched 3D input with channels_first. | ||
| if ( | ||
| rank == 3 | ||
| and self.data_format == "channels_first" | ||
| and images_shape[0] is None | ||
| ): | ||
| raise ValueError( | ||
| "Invalid input shape for channels_first format: " | ||
| "3D input with unknown first dimension detected. " | ||
| "For channels_first format with keras.Input, use " | ||
| "shape=(3, height, width) to create a 4D batched input " | ||
| "(batch, channels, height, width), or pass a 3D numpy array " | ||
| "of shape (3, height, width) for unbatched input. " | ||
| f"Received: images.shape={images_shape}" | ||
| ) | ||
| if self.data_format == "channels_last": | ||
| images_shape[-1] = 1 | ||
| else: | ||
| images_shape[-3] = 1 | ||
| else: # channels_first | ||
| images_shape[1 if rank == 4 else 0] = 1 |
There was a problem hiding this comment.
There's some duplicated logic across RGBToGrayscale, RGBToHSV, and HSVToRGB for validating channels_first inputs and determining the channel axis. To improve maintainability, you could extract this logic into helper functions.
For example, you could have:
- A validation helper:
def _validate_channels_first_input(images_shape, data_format):
rank = len(images_shape)
if (
rank == 3
and data_format == "channels_first"
and images_shape[0] is None
):
raise ValueError(
"Invalid input shape for channels_first format: "
# ... (rest of the error message)
)- A helper for the channel axis logic:
def _get_channels_first_axis(rank):
return 1 if rank == 4 else 0This would make the compute_output_spec methods cleaner and avoid repeating the same code.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #22511 +/- ##
==========================================
+ Coverage 83.20% 83.32% +0.12%
==========================================
Files 596 596
Lines 67621 67634 +13
Branches 10531 10536 +5
==========================================
+ Hits 56266 56358 +92
+ Misses 8630 8538 -92
- Partials 2725 2738 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
hertschuh
left a comment
There was a problem hiding this comment.
Please also accept the CLA.
| """ | ||
| Title: Custom Transformer Block with Positional Encoding | ||
| Author: [Your Name] | ||
| Date created: 2024/03/26 | ||
| Last modified: 2024/03/26 | ||
| Description: Creating custom positional encoding and transformer block layers | ||
| from scratch using Keras 3 backend-agnostic operations. | ||
| """ |
There was a problem hiding this comment.
The demo files are unrelated to the bug, please remove.
Fixes #22472
Summary
When using
keras.Inputwithshape=(H, W)andchannels_firstformat, keras creates a 3D tensor(None, H, W)whereNoneis the batch dimension. The original code usedchannels_axis=-3forchannels_first, which would point to the batch dimension for this 3D input, skipping the channel validation and producing incorrect output shapes.Changes
This fix:
channels_firsthasNoneas the first dimension (indicating it came fromkeras.Input)1for 4D inputs and0for 3D inputs withchannels_firstAffected Operations
RGBToHSV.compute_output_specHSVToRGB.compute_output_specRGBToGrayscale.compute_output_specTests Added
test_rgb_to_hsv_invalid_channels_first_with_batched_3d_inputtest_hsv_to_rgb_invalid_channels_first_with_batched_3d_inputtest_rgb_to_grayscale_invalid_channels_first_with_batched_3d_inputReproduction (from issue #22472)
Before fix:
input must have 3 channels but input only has 4 channelskeras.Inputplaceholder: returns output shape(None, 4, 3)without raising an errorAfter fix: