fix: Handle list[Model], dict[str, Model] and Optional[Model] in nested type detection #630

Closed
andreahlert wants to merge 2 commits into flyteorg:main from andreahlert:fix/nested-types-lists-maps
@andreahlert
Contributor

@andreahlert andreahlert commented Feb 8, 2026

Motivation

PR #426 added support for nested Pydantic/dataclass schema detection via $ref, but as noted in flyteorg/flyte#6887 (item 1), it does not handle list and map containers with nested models.

When a schema contains list[NestedModel], dict[str, NestedModel], or Optional[NestedModel], the $ref inside items, additionalProperties, or anyOf was not resolved, causing KeyError or incorrect type inference (falling back to str).
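To make the three locations concrete, here is a small sketch. The schema is hand-written to approximate what Pydantic v2 emits for a model with `list[Inner]`, `dict[str, Inner]` and `Optional[Inner] = None` fields; the model names, field names and the `ref_locations` helper are illustrative assumptions, not code from this PR.

```python
# Hand-written schema (an approximation of Pydantic v2 output, not generated here).
SCHEMA = {
    "$defs": {
        "Inner": {
            "type": "object",
            "properties": {"x": {"type": "integer"}},
            "required": ["x"],
        }
    },
    "type": "object",
    "properties": {
        "items": {"type": "array", "items": {"$ref": "#/$defs/Inner"}},
        "by_name": {
            "type": "object",
            "additionalProperties": {"$ref": "#/$defs/Inner"},
        },
        "maybe": {
            "anyOf": [{"$ref": "#/$defs/Inner"}, {"type": "null"}],
            "default": None,
        },
    },
    "required": ["items", "by_name"],
}


def ref_locations(prop: dict) -> list[str]:
    """Return the keys under which a property schema carries a $ref (hypothetical helper)."""
    found = []
    if "$ref" in prop:
        found.append("$ref")
    for key in ("items", "additionalProperties"):
        child = prop.get(key)
        # additionalProperties may also be a bare boolean, so check the type first.
        if isinstance(child, dict) and "$ref" in child:
            found.append(key)
    if any("$ref" in option for option in prop.get("anyOf", [])):
        found.append("anyOf")
    return found


REF_SITES = {name: ref_locations(prop) for name, prop in SCHEMA["properties"].items()}
```

None of the three properties has a top-level `$ref`, which is why code that only checks `property_val["$ref"]` misses all of them.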

Changes

Centralized $ref resolution

Instead of inlining $ref resolution at every call site, this PR follows the same approach as flytekit PR #3375: _get_element_type now accepts an optional schema parameter and resolves $ref centrally. This means any container type (list, dict, Optional, or nested combinations) automatically gets $ref support.
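A rough sketch of the idea, assuming local references of the form `#/$defs/Name`. The names `resolve_ref` and `element_schema` are illustrative only and are not the signatures used in this PR or in flytekit; JSON Pointer escaping (`~0`, `~1`) is ignored for brevity.

```python
def resolve_ref(ref: str, schema: dict) -> dict:
    """Follow a local reference such as '#/$defs/Inner' into the root schema."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local refs are supported in this sketch: {ref!r}")
    node = schema
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def element_schema(element: dict, schema: dict) -> dict:
    """Return the element's schema, resolving $ref in exactly one place."""
    if "$ref" in element:
        return resolve_ref(element["$ref"], schema)
    return element


SCHEMA = {
    "$defs": {"Inner": {"title": "Inner", "type": "object", "properties": {}}},
    "properties": {
        "items": {"type": "array", "items": {"$ref": "#/$defs/Inner"}},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}

# Callers pass the raw element plus the root schema and never touch $ref themselves.
resolved = element_schema(SCHEMA["properties"]["items"]["items"], SCHEMA)
plain = element_schema(SCHEMA["properties"]["tags"]["items"], SCHEMA)
```

Because every container path goes through the same function, a new container shape only needs to hand its element to that function to get `$ref` support.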

New helper functions

  • _resolve_ref: Resolves a $ref path to its schema definition in one place
  • _resolve_property_type: Resolves a JSON schema property to a (name, type) tuple, handling $ref, anyOf, and all type variants
  • _resolve_single_type: Resolves the inner type of Optional[T] from the first anyOf element, properly handling $ref, array, object, and primitive types
  • _resolve_typed_property: Resolves a property with a known type string (array, object, primitives)
  • _is_optional_anyof: Validates that an anyOf truly represents Optional[T] (exactly 2 elements, second is null) instead of blindly assuming any anyOf with $ref is Optional
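The Optional check described in the last bullet can be sketched as follows. This is a minimal illustration of the stated rule (exactly 2 elements, second is null), written under a hypothetical name; it is not the PR's actual function body.

```python
def is_optional_anyof(prop: dict) -> bool:
    """True only when anyOf has exactly two options and the second one is null."""
    options = prop.get("anyOf")
    if not isinstance(options, list) or len(options) != 2:
        return False
    second = options[1]
    return isinstance(second, dict) and second.get("type") == "null"


# Optional[Inner]: one real option followed by null.
OPTIONAL_MODEL = {"anyOf": [{"$ref": "#/$defs/Inner"}, {"type": "null"}]}
# Union[A, B]: contains $ref entries but is not Optional.
UNION_OF_MODELS = {"anyOf": [{"$ref": "#/$defs/A"}, {"$ref": "#/$defs/B"}]}
```

A check like this keeps `Union[TypeA, TypeB]` from being wrapped in Optional just because one of its branches contains a `$ref`.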

Bug fixes

  • Optional[T] default values: Fields not in the schema's required list now get default=None in the generated dataclass, so constructing with omitted optional fields works correctly
  • Optional[List[Model]] / Optional[Dict[str, Model]]: Previously would crash with KeyError because the code accessed property_val["items"] instead of anyOf[0]["items"] - now handled correctly via _resolve_single_type
  • Union[TypeA, TypeB]: Previously any anyOf with $ref was blindly wrapped in Optional - now properly validated
  • Optional field inclusion: Properties not in required are included in property_order so fields with defaults (like Optional[X] = None) are not skipped during schema traversal
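As an illustration of the default-value fix, the sketch below builds a dataclass with the standard library's `dataclasses.make_dataclass` and gives every field outside `required` a `None` default. The `build_dataclass` helper and its field ordering are assumptions made for this example; the PR's generated class may be assembled differently.

```python
import dataclasses
from typing import Any, Optional


def build_dataclass(name: str, properties: dict[str, Any], required: list[str]) -> type:
    """Build a dataclass where fields outside `required` default to None."""
    required_fields = []
    optional_fields = []
    for field_name, field_type in properties.items():
        if field_name in required:
            required_fields.append((field_name, field_type))
        else:
            optional_fields.append(
                (field_name, Optional[field_type], dataclasses.field(default=None))
            )
    # Dataclasses require fields with defaults to come after fields without them.
    return dataclasses.make_dataclass(name, required_fields + optional_fields)


Outer = build_dataclass("Outer", {"name": str, "inner": dict}, required=["name"])
omitted = Outer(name="a")
provided = Outer(name="a", inner={"x": 1})
```

Without the default, `Outer(name="a")` would raise a `TypeError` for the missing `inner` argument.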

__init__ wrapper

Extended dict-to-object conversion to handle list[dict] -> list[Model] and dict[str, dict] -> dict[str, Model].
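One way such a wrapper could look is sketched below. The `convert_nested` decorator, the `Inner`/`Outer` classes and the field names are all hypothetical and only illustrate the conversion; they are not the PR's implementation.

```python
import dataclasses
import functools
from typing import Any


def _coerce(value: Any, model: type) -> Any:
    """Turn a plain dict into `model`; leave existing instances untouched."""
    return model(**value) if isinstance(value, dict) else value


def convert_nested(list_fields: dict[str, type], dict_fields: dict[str, type]):
    """Class decorator that wraps __init__ to coerce nested dicts into models."""

    def decorate(cls: type) -> type:
        original_init = cls.__init__

        @functools.wraps(original_init)
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            original_init(self, *args, **kwargs)
            for name, model in list_fields.items():
                setattr(self, name, [_coerce(v, model) for v in getattr(self, name)])
            for name, model in dict_fields.items():
                current = getattr(self, name)
                setattr(self, name, {k: _coerce(v, model) for k, v in current.items()})

        cls.__init__ = __init__
        return cls

    return decorate


@dataclasses.dataclass
class Inner:
    x: int


@convert_nested(list_fields={"items": Inner}, dict_fields={"by_name": Inner})
@dataclasses.dataclass
class Outer:
    items: list
    by_name: dict


# Mixed initialization: a dict and an Inner instance in the same list.
outer = Outer(items=[{"x": 1}, Inner(x=2)], by_name={"a": {"x": 3}})
empty = Outer(items=[], by_name={})
```

Passing already-constructed instances through unchanged is what allows mixed initialization, and empty collections fall out of the comprehensions without special handling.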

Tests

Added test_nested_lists_maps.py with 10 test cases covering:

  • list[NestedModel] roundtrip (Pydantic + dataclass)
  • dict[str, NestedModel] roundtrip (Pydantic + dataclass)
  • Optional[NestedModel] with value and with None
  • Optional[NestedModel] omitted (default to None)
  • Mixed initialization (dict + object in same list)
  • Empty collections ([], {})

All existing type engine tests pass with zero regressions.

Ref: flyteorg/flyte#6887

@andreahlert andreahlert force-pushed the fix/nested-types-lists-maps branch 3 times, most recently from cc2218c to 2ebaea8 Compare February 8, 2026 08:41
@andreahlert
Contributor Author

@wild-endeavor @kumare3 Hey! Picked up item 1 from #6887 - the list/map gap in PR #426.

list[NestedModel], dict[str, NestedModel] and Optional[NestedModel] were all hitting KeyError because $ref wasn't being resolved in array items, additionalProperties, or anyOf.

Bonus bug: optional fields with defaults were silently skipped because property_order only iterated over required, so Optional[Model] = None never made it into the generated class.

@kumare3
Contributor

kumare3 commented Feb 8, 2026

cc @AdilFayyaz can you PTAL?

@andreahlert andreahlert force-pushed the fix/nested-types-lists-maps branch from 2ebaea8 to 9a3f813 Compare February 8, 2026 23:39
…ed type detection

The nested schema type detection from PR flyteorg#426 did not resolve $ref for
list items, dict additionalProperties, or anyOf (Optional) fields.

This caused KeyError or incorrect type inference when using Pydantic or
dataclass models with fields like list[NestedModel], dict[str, NestedModel],
or Optional[NestedModel].

Changes:
- Add $ref resolution to _get_element_type for centralized handling
- Extract _resolve_ref, _resolve_property_type, _resolve_single_type,
  and _resolve_typed_property helpers to eliminate duplicated $ref logic
- Properly validate Optional via anyOf (check len==2 and null second element)
- Add default=None for optional fields not in required list
- Include optional properties in property_order so fields with defaults
  are not skipped during schema traversal
- Extend __init__ wrapper to convert list[dict] and dict[str, dict]
  to their respective nested types

Ref: flyteorg/flyte#6887
Signed-off-by: André Ahlert <andre@aex.partners>
@andreahlert andreahlert force-pushed the fix/nested-types-lists-maps branch from 9a3f813 to 76597a5 Compare February 9, 2026 04:15
@AdilFayyaz
Collaborator

Thank you for working on this.

I do have some concerns. I believe _get_element_type still has gaps that this PR does not address. The new helper functions _resolve_single_type and _resolve_typed_property handle array, object, and anyOf with $ref at the property level. However, _get_element_type is the recursive function that handles nested elements, which means anything beyond one level of nesting would break.

Can you please confirm whether these cases pass? I don't see any tests for recursive/nested components:

  • List[Optional[int]]
  • List[List[int]]
  • List[Dict[str, int]]

@andreahlert
Contributor Author

PTAL

You're right. _get_element_type doesn't handle nested arrays or objects recursively, so List[List[int]] and List[Dict[str, int]] would break.

List[Optional[int]] should work since we already handle anyOf.

Will add the recursive cases and tests for all three.

@andreahlert
Contributor Author

Good catch. Pushed a fix - _get_element_type now recurses into array items and object additionalProperties, so List[List[int]] and List[Dict[str, int]] work correctly. Added roundtrip tests for all three cases (List[List[int]], List[Dict[str, int]], List[Optional[int]]), all passing.
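For readers following along, the recursion can be sketched roughly like this. It is a simplified stand-in under a hypothetical name (`element_type`), it omits `$ref` handling, and it is not the actual body of _get_element_type.

```python
from typing import Any, Dict, List, Optional

_PRIMITIVES = {"string": str, "integer": int, "number": float, "boolean": bool}


def element_type(element: dict) -> Any:
    """Map a JSON schema fragment to a Python type, recursing into containers."""
    if "anyOf" in element:
        options = element["anyOf"]
        if len(options) == 2 and options[1].get("type") == "null":
            return Optional[element_type(options[0])]
        raise ValueError("only Optional-style anyOf is handled in this sketch")
    kind = element.get("type")
    if kind == "array":
        # Recurse into the item schema so List[List[...]] nests to any depth.
        return List[element_type(element["items"])]
    if kind == "object" and isinstance(element.get("additionalProperties"), dict):
        # Recurse into the value schema for Dict[str, ...].
        return Dict[str, element_type(element["additionalProperties"])]
    if kind in _PRIMITIVES:
        return _PRIMITIVES[kind]
    raise ValueError(f"unsupported schema fragment: {element!r}")


LIST_OF_LISTS = {"type": "array", "items": {"type": "array", "items": {"type": "integer"}}}
LIST_OF_DICTS = {
    "type": "array",
    "items": {"type": "object", "additionalProperties": {"type": "integer"}},
}
LIST_OF_OPTIONALS = {
    "type": "array",
    "items": {"anyOf": [{"type": "integer"}, {"type": "null"}]},
}
```

Each container branch calls back into the same function, so the three review cases are handled by the same code path rather than by per-depth special cases.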

Address review feedback: _get_element_type now handles nested containers (List[List[int]], List[Dict[str, int]]) by recursing into array items and object additionalProperties. Also adds List[Optional[int]] confirmation via tests.
@andreahlert andreahlert force-pushed the fix/nested-types-lists-maps branch from 2ea6160 to 3fd84ca Compare February 11, 2026 21:18
@AdilFayyaz
Collaborator

Thanks for the contribution. We were able to handle this issue ourselves in the meantime. Here's the relevant PR: #640
