fix:integerlookup one_hot shape inference for 2D inputs #22592

Open
maitry63 wants to merge 2 commits into keras-team:master from maitry63:fix_one_hot_shape

@maitry63
Contributor

This PR fixes a regression in IntegerLookup with output_mode="one_hot" where 2D inputs of shape (batch, sequence_length) produced incorrect output shapes in symbolic mode.
Fixes: #22520
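
For readers unfamiliar with the shape in question, here is a minimal sketch in plain Python of what one would expect from one_hot encoding: each integer becomes a vector of length depth, so a (batch, sequence_length) input should gain a trailing axis. This is not the Keras implementation; the helper name `expected_one_hot_shape` is hypothetical, and any special cases the real layer handles are ignored.

```python
def expected_one_hot_shape(input_shape, depth):
    """Append a one-hot axis of size `depth` to `input_shape`.

    Hypothetical helper for illustration only. `None` marks an unknown
    (symbolic) dimension such as the batch size.
    """
    return tuple(input_shape) + (depth,)


# 2D symbolic input (batch, sequence_length) with an assumed depth of 10.
shape_2d = expected_one_hot_shape((None, 5), depth=10)
print(shape_2d)  # (None, 5, 10)
```

The output rank is the input rank plus one, which is the property the linked issue says was reported incorrectly for symbolic inputs.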

Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request addresses a shape collapse issue in the IndexLookup layer when using one_hot output mode with 2D symbolic inputs. The changes update compute_output_shape and compute_output_spec to correctly preserve input dimensions and add regression tests for these scenarios. The review feedback identifies significant code duplication between these two methods regarding depth calculation and one_hot logic, recommending a refactor into a shared helper method to improve code maintainability.

Comment on lines +577 to +585
if self.output_mode == "one_hot":
    depth = (
        self.max_tokens
        if self.pad_to_max_tokens and self.max_tokens is not None
        else self.vocabulary_size()
    )
    output_shape = input_shape + (depth,)
else:
    output_shape = self.compute_output_shape(input_shape)
Contributor

medium

There is some code duplication between compute_output_spec and compute_output_shape that could be refactored to improve maintainability.

  1. The calculation of depth is identical in both methods (here and in lines 561-564). This could be extracted to a private helper method to avoid redundancy.
  2. The logic for one_hot output shape calculation is also duplicated. It seems compute_output_shape is now correct for all cases. If possible, compute_output_spec could be simplified to call compute_output_shape for all modes except 'int', which would remove the duplication.

If this duplication is intentional and necessary for the fix to work correctly in all execution modes, adding a comment explaining the reason would be very helpful for future maintenance.
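
One way the suggested refactor could look, as a rough sketch rather than the actual Keras code: the depth calculation moves into a single private helper, and compute_output_spec delegates to compute_output_shape. The class `LookupShapeSketch`, the helper name `_one_hot_depth`, and the `Spec` tuple (a stand-in for whatever symbolic tensor the real method returns) are all hypothetical, and only the "int" and "one_hot" modes are modeled.

```python
from collections import namedtuple

# Hypothetical stand-in for the symbolic tensor a real
# compute_output_spec would return; it carries only a shape.
Spec = namedtuple("Spec", ["shape"])


class LookupShapeSketch:
    """Toy stand-in for the layer, holding only the attributes used here."""

    def __init__(self, output_mode, vocab_size, max_tokens=None,
                 pad_to_max_tokens=False):
        self.output_mode = output_mode
        self.max_tokens = max_tokens
        self.pad_to_max_tokens = pad_to_max_tokens
        self._vocab_size = vocab_size

    def vocabulary_size(self):
        return self._vocab_size

    def _one_hot_depth(self):
        # Single source of truth for the depth calculation that the
        # review found duplicated (hypothetical helper name).
        if self.pad_to_max_tokens and self.max_tokens is not None:
            return self.max_tokens
        return self.vocabulary_size()

    def compute_output_shape(self, input_shape):
        input_shape = tuple(input_shape)
        if self.output_mode == "int":
            return input_shape
        if self.output_mode == "one_hot":
            # Preserve every input dimension and append the one-hot axis.
            return input_shape + (self._one_hot_depth(),)
        raise NotImplementedError("other modes are outside this sketch")

    def compute_output_spec(self, input_shape):
        # Delegate instead of repeating the one_hot branch.
        return Spec(shape=self.compute_output_shape(input_shape))


layer = LookupShapeSketch("one_hot", vocab_size=7)
print(layer.compute_output_shape((None, 5)))  # (None, 5, 7)

padded = LookupShapeSketch("one_hot", vocab_size=7, max_tokens=20,
                           pad_to_max_tokens=True)
print(padded.compute_output_spec((None, 5)).shape)  # (None, 5, 20)
```

Whether the real compute_output_spec can delegate this way depends on details of the layer not shown in this thread, which is why the reviewer asks for an explanatory comment if the duplication is intentional.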

@codecov-commenter

codecov-commenter commented Mar 29, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.30%. Comparing base (8f5ef11) to head (e38e26d).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| keras/src/layers/preprocessing/index_lookup.py | 91.66% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #22592   +/-   ##
=======================================
  Coverage   83.30%   83.30%           
=======================================
  Files         596      596           
  Lines       67962    67960    -2     
  Branches    10580    10578    -2     
=======================================
  Hits        56615    56615           
+ Misses       8600     8599    -1     
+ Partials     2747     2746    -1     
| Flag | Coverage Δ |
| --- | --- |
| keras | 83.11% <91.66%> (+<0.01%) ⬆️ |
| keras-jax | 59.72% <91.66%> (+<0.01%) ⬆️ |
| keras-numpy | 54.34% <87.50%> (+<0.01%) ⬆️ |
| keras-openvino | 51.80% <4.16%> (+<0.01%) ⬆️ |
| keras-tensorflow | 61.03% <91.66%> (+<0.01%) ⬆️ |
| keras-torch | 59.91% <91.66%> (+<0.01%) ⬆️ |

@keerthanakadiri added the stat:awaiting keras-eng (Awaiting response from Keras engineer) label on Mar 30, 2026
@keerthanakadiri
Contributor

Code Review

This pull request addresses a shape collapse issue in the IndexLookup layer when using one_hot output mode with 2D symbolic inputs. The changes update compute_output_shape and compute_output_spec to correctly preserve input dimensions and add regression tests for these scenarios. The review feedback identifies significant code duplication between these two methods regarding depth calculation and one_hot logic, recommending a refactor into a shared helper method to improve code maintainability.

Hi @maitry63, can you check this once? It looks outdated. Thanks!

Collaborator

@hertschuh left a comment

Most changes are spurious formatting changes. Please undo all of these to keep only the actual changes.

Please also rebase.

Thanks!

@hertschuh added the stat:awaiting response from contributor label and removed the stat:awaiting keras-eng (Awaiting response from Keras engineer) label on Mar 31, 2026
Development

Successfully merging this pull request may close these issues.

IntegerLookup(one_hot) reports incorrect output rank for symbolic inputs

5 participants