Skip to content

Conversation

@JortBergfeld
Copy link
Contributor

@JortBergfeld JortBergfeld commented Dec 3, 2025

This pull request fixes a problem with type validation in the experimental dataset and sample modules. When we use the is_list boolean of score_field, a complex ndarray type is generated that includes an Any type, which is not accepted by isinstance. To circumvent problems with complex types, whenever isintance fails with a type error, we only validate against the origin type (ndarray instead of ndarray[float32], for example)

Type validation improvements

  • Updated the _validate_attribute_type method in src/datumaro/experimental/dataset.py to correctly handle type validation for generic types by using origin when available, improving support for complex type annotations.

Test enhancements

  • Added a new test, test_sample_with_is_list, in tests/unit/experimental/test_sample.py to verify that samples with list-type fields (using is_list=True in score_field) are created without validation errors.

Resolves #1971

Copilot AI review requested due to automatic review settings December 3, 2025 09:50
@JortBergfeld JortBergfeld requested a review from a team as a code owner December 3, 2025 09:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes type validation for complex generic types in the experimental dataset module. The validation logic was incorrectly handling generic type annotations, causing validation errors for fields with list-type specifications.

Key Changes:

  • Fixed _validate_attribute_type to use the origin type for generic type validation
  • Added test coverage for samples with list-type fields

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/datumaro/experimental/dataset.py Updated type validation to handle generic types by checking the origin type
tests/unit/experimental/test_sample.py Added test for creating samples with list-type fields to verify the validation fix

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@JortBergfeld JortBergfeld force-pushed the jort/fix_sample_validation_bug branch from 1bbdec3 to 5e15aa8 Compare December 3, 2025 09:52
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@JortBergfeld JortBergfeld force-pushed the jort/fix_sample_validation_bug branch from 5e15aa8 to 6431a18 Compare December 3, 2025 10:34
Copy link
Contributor

@leoll2 leoll2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix looks good to me, only one comment

@JortBergfeld JortBergfeld force-pushed the jort/fix_sample_validation_bug branch from 5751690 to f075b50 Compare December 5, 2025 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sample dtype validation fails with optional numpy arrays

5 participants