Skip to content

Conversation

@randolf-scholz
Copy link
Contributor

@randolf-scholz randolf-scholz commented Sep 21, 2025

Fixes #19896, #19902

Some additional changes:

  • I added some code that ensures that the default type always appears last in the union (relevant when a union of multiple keys is given)
  • I ensure that the original value-type is use instead of its proper_type. This simplifies the return in testRecursiveTypedDictMethods.

@github-actions

This comment has been minimized.

@randolf-scholz
Copy link
Contributor Author

randolf-scholz commented Sep 21, 2025

  • ops/model.py:1149: false positive on master (T | None despite key being required)
  • ops/model.py:1223: false positive on master (T | None despite key being required)
  • ops/_private/harness.py:2280 false positive on master (T | None despite key being required)
  • homeassistant/components/evohome/storage.py:98 false positive on master (T | None despite key being required)
  • homeassistant/components/fritz/coordinator.py false negative on master (T | None despite key being required)
  • archinstall/lib/models/users.py:204 false negative on master (T | None despite key being required)
  • steam errors seem mostly related to using some hacky custom enum types.

@randolf-scholz randolf-scholz marked this pull request as ready for review September 21, 2025 14:28
@randolf-scholz randolf-scholz changed the title More precise return types form TypedDict.get More precise return types for TypedDict.get Sep 22, 2025
@randolf-scholz
Copy link
Contributor Author

Just noticed this fixes another bug on master by looking at the discord results: #19902

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

Updated comments in checkmember.py for clarity on overload definitions.
@github-actions

This comment has been minimized.

Copy link
Collaborator

@sterliakov sterliakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks! Primer errors reduce my confidence a bit, but that shouldn't be a big problem.

Could you also add a test or two for incremental mode? Since this plugin performs some node modifications, better check its interaction with the cache explicitly.

One comment that might be critical:

object_type = mx.chk.named_type("builtins.object")

# type variable for default value
tvar = TypeVarType(
Copy link
Collaborator

@sterliakov sterliakov Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that TypeVarType should ever be introduced without a corresponding symbol (TypeVarExpr). Any other place that needs to synthesize persistent (not single-use like a fake constructor in checkexpr) typevars does so in two steps (grep for TypeVarExpr in semanal_namedtuple.py or dataclasses plugin). I suspect that some nasal demons are expected if TypeVarType isn't anchored to some real symbol, this might bite us later.

@randolf-scholz
Copy link
Contributor Author

Primer errors reduce my confidence a bit, but that shouldn't be a big problem.

Which ones specifically? I already looked in detail at most of them (#19897 (comment)), and the discord ones I looked at are all instances of #19902.

Could you also add a test or two for incremental mode?

Is it essentially enough to just copy-paste some of the TypedDictGetMethod tests into check-incremental.test or are there more steps involved?

@sterliakov
Copy link
Collaborator

sterliakov commented Sep 23, 2025

Which ones specifically

The crash mostly:)

just copy-paste some

There's a detailed explanation in check-incremental.test, written better than I'll be able to sum up myself. I'd probably want to check the total/non-total distinction and varying keys count (0, 1, 2?), testing optional/required/unknown keys every time, but the main point of interest here is having typeddicts defined in a separate file (without any errors), so that the testcase body can actually hit some cached defs.

Also, are there any .get tests with Required and NotRequired?

@JelleZijlstra
Copy link
Member

The crash seems unrelated, it shows up both in the "old" and "new" mypy in mypy-primer. It's an arguable mypy-primer bug that if mypy crashes, the crash always shows up as a "diff" because the stack trace changes.

@sterliakov
Copy link
Collaborator

sterliakov commented Sep 23, 2025

The crash seems unrelated, it shows up both in the "old" and "new" mypy in mypy-primer

Yeah, I know and saw it on a bunch of my own PRs (#19844). But it also means we don't know whether there's any diff on that package, we only have the "damn it crashes" result.

I won't call that a bug - I really think that any crashes should be very prominent in the output

@randolf-scholz
Copy link
Contributor Author

@sterliakov Ok so I:

  1. expanded the existing tests for total/non-total
  2. added an incremental variant of these tests
  3. added an appropriate namespace= to the TypeVarId for the T@get type variable.

@github-actions

This comment has been minimized.

Copy link
Collaborator

@sterliakov sterliakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, now this is covered by tests really well, thanks! I'm still concerned about "out of thin air" typevars that have no corresponding symbo, but the rest looks perfect

@randolf-scholz
Copy link
Contributor Author

randolf-scholz commented Sep 25, 2025

I'm still concerned about "out of thin air" typevars that have no corresponding symbo, but the rest looks perfect

Maybe @ilevkivskyi can comment on that, I saw that he added for instance the TypeVarExpr to semanal_namedtuple in #6330.

If it is necessary, we can add a TypeVarExpr as well, but I am not sure where exactly one would put it.

@github-actions

This comment has been minimized.

@randolf-scholz
Copy link
Contributor Author

No hidden issues on autosplit 👍

@ilevkivskyi
Copy link
Member

Maybe @ilevkivskyi can comment on that

For generated symbols (like a method generated by @dataclass) it is more prudent to add a TypeVarExpr (just to emulate everything 1:1 to how it would look for a real user-defined method). But for (short-lived) types it is not needed, just be sure TypeVarId is unique, to avoid clashes (for that purpose I guess namespace may be sufficient).

@randolf-scholz
Copy link
Contributor Author

randolf-scholz commented Sep 27, 2025

What confuses me a bit is that here we are talking about methods for which the type-var is bound to the method, not the class. When one writes something like:

class A:
    def method1[T](self, other: T) -> T: ...
    def method2[T](self, other: T) -> T: ...

then if I understood correctly, internally mypy still attaches a TypeVarExpr for T@method1 to the TypeInfo of A?

What if multiple methods like in the example above use the same name for the type-var? Does it do some automatic name-mangling / ID-selection?

@ilevkivskyi
Copy link
Member

internally mypy still attaches a TypeVarExpr for T@method1 to the TypeInfo of A?

Yes, bu this is only needed so that T can be resolved as a name (in appropriate scopes). As I mentioned you should not need this for your use case.

What if multiple methods like in the example above use the same name for the type-var? Does it do some automatic name-mangling / ID-selection?

This is actually a good question, I don't know how it is done for PEP 695. But if you are asking about the type-checking phase (as opposite to semantic analysis), then yes, we use freshen_all_functions_type_vars() to generate new TypeVarIds for a generic function/method. So I think you should be good, your fixed -1 value will be replaced with a fresh one when checking a call to this method.

@github-actions

This comment has been minimized.

Copy link
Collaborator

@sterliakov sterliakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Your discussion with @ilevkivskyi resolves my last question, LG

fallback=mx.chk.named_type("builtins.function"),
name=name,
)
elif name == "get":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am actually worried about the performance implications of this part of the PR. Some people have TypedDicts with few hundreds keys (not an exaggeration). This will then create an enormous overload. And overloads are slow to type-check (for various reasons). What exactly do we need this part for? Is it just for the better reveal_type(), or also for something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would say that is mostly for reveal_type. It will also change the error message if a user passes illegal arguments, but I don't think that is as important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be possible to:

  1. cache the overloads on the typeddict type itself (or make it a lazy property)
  2. Call the plugin early to skip all the overload checks.

Alternative, I could roll back this part of the PR and only keep the updated logic in the default plugin.

Copy link
Member

@ilevkivskyi ilevkivskyi Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling existing plugin hook early would be a breaking change in the plugin system (IIRC we pass "default return type" as part of the plugin context), and adding a new hook is too big change for something like this. FWIW I don't think reveal_type() is really worth the potential problems, so I think it would be best to roll back this part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There does seem to be one test where it makes a difference: (#19902)

# https://github.com/python/mypy/issues/19902
from typing import TypedDict, Union
from typing_extensions import TypeAlias, NotRequired
class A(TypedDict):
    key: NotRequired[int]

class B(TypedDict):
    key: NotRequired[int]

class C(TypedDict):
    key: NotRequired[int]

A_or_B: TypeAlias = Union[A, B]
A_or_B_or_C: TypeAlias = Union[A_or_B, C]

def test(d: A_or_B_or_C) -> None:
    reveal_type(d.get("key"))  # N: Revealed type is "Union[builtins.int, None]"

(on master this just gives object)

I think this should be fixable by doing a flatten_nested_union inside check_union_call_expr.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@randolf-scholz
Copy link
Contributor Author

I reverted the changes made to checkmember that achieved a nicer reveal_type, due to the performance concerns.

To make a unit test for #19902 pass it was necessary to insert a flatten_nested_unions in the check_union_call_expr function. I didn't debug this completely, my guess is that for some reason the plugin hook wasn't called or working correctly in this case.

@ilevkivskyi
Copy link
Member

@randolf-scholz

I didn't debug this completely, my guess is that for some reason the plugin hook wasn't called or working correctly in this case.

I think it didn't work because flatten_nested_unions() call in UnionType constructor doesn't handle type aliases (and this call in checkexpr.py is intentionally non-recursive).

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

operator (https://github.com/canonical/operator)
- ops/model.py:1150: error: Incompatible types in assignment (expression has type "str | None", variable has type "str")  [assignment]
- ops/model.py:1224: error: Incompatible types in assignment (expression has type "str | None", variable has type "str")  [assignment]
- ops/_private/harness.py:2281: error: Value expression in dictionary comprehension has incompatible type "str | int | float | None"; expected type "str | int | float"  [misc]

core (https://github.com/home-assistant/core)
+ homeassistant/components/fritz/coordinator.py:396: error: Statement is unreachable  [unreachable]
+ homeassistant/components/fritz/coordinator.py:413: error: Incompatible types in assignment (expression has type "bool | None", variable has type "bool")  [assignment]

steam.py (https://github.com/Gobot1234/steam.py)
- steam/http.py:349: error: Value of "last_assetid" has incompatible type "str | int"; expected "str"  [typeddict-item]
- steam/user.py:125: error: Argument "game_name" to "CMsgClientPersonaStateFriend" has incompatible type "str | int"; expected "str"  [arg-type]

discord.py (https://github.com/Rapptz/discord.py)
- discord/scheduled_event.py:139: error: Incompatible types in assignment (expression has type "object", variable has type "str | None")  [assignment]
- discord/scheduled_event.py:145: error: Incompatible types in assignment (expression has type "object", variable has type "str | None")  [assignment]
- discord/scheduled_event.py:146: error: Incompatible types in assignment (expression has type "object", variable has type "int")  [assignment]
- discord/scheduled_event.py:150: error: Argument 1 to "store_user" of "ConnectionState" has incompatible type "object"; expected "User | PartialUser"  [arg-type]
- discord/scheduled_event.py:155: error: No overload variant of "parse_time" matches argument type "object"  [call-overload]
- discord/scheduled_event.py:155: note: Possible overload variants:
- discord/scheduled_event.py:155: note:     def parse_time(timestamp: None) -> None
- discord/scheduled_event.py:155: note:     def parse_time(timestamp: str) -> datetime
- discord/scheduled_event.py:155: note:     def parse_time(timestamp: str | None) -> datetime | None
- discord/scheduled_event.py:159: error: Argument 1 to "_unroll_metadata" of "ScheduledEvent" has incompatible type "object"; expected "EntityMetadata | None"  [arg-type]
- discord/poll.py:449: error: Item "None" of "PollMedia | None" has no attribute "get"  [union-attr]
- discord/poll.py:459: error: Argument "question" to "Poll" has incompatible type "str | Any | None"; expected "PollMedia | str"  [arg-type]
- discord/app_commands/models.py:224: error: No overload variant of "int" matches argument type "object"  [call-overload]
- discord/app_commands/models.py:224: note: Possible overload variants:
- discord/app_commands/models.py:224: note:     def __new__(cls, str | Buffer | SupportsInt | SupportsIndex | SupportsTrunc = ..., /) -> int
- discord/app_commands/models.py:224: note:     def __new__(cls, str | bytes | bytearray, /, base: SupportsIndex) -> int
- discord/app_commands/models.py:231: error: Incompatible types in assignment (expression has type "object", variable has type "bool")  [assignment]
- discord/app_commands/models.py:237: error: Argument 1 to "_from_value" of "ArrayFlags" has incompatible type "object"; expected "Sequence[int]"  [arg-type]
- discord/app_commands/models.py:243: error: Argument 1 to "_from_value" of "ArrayFlags" has incompatible type "object"; expected "Sequence[int]"  [arg-type]
- discord/app_commands/models.py:245: error: Incompatible types in assignment (expression has type "object", variable has type "bool")  [assignment]
- discord/app_commands/models.py:246: error: Argument 1 to "_to_locale_dict" has incompatible type "object"; expected "dict[str, str]"  [arg-type]
- discord/app_commands/models.py:247: error: Argument 1 to "_to_locale_dict" has incompatible type "object"; expected "dict[str, str]"  [arg-type]
- discord/app_commands/models.py:1065: error: Argument 1 to "_to_locale_dict" has incompatible type "object"; expected "dict[str, str]"  [arg-type]
- discord/app_commands/models.py:1066: error: Argument 1 to "_to_locale_dict" has incompatible type "object"; expected "dict[str, str]"  [arg-type]
- discord/app_commands/models.py:1164: error: Argument 1 to "_to_locale_dict" has incompatible type "object"; expected "dict[str, str]"  [arg-type]
- discord/app_commands/models.py:1165: error: Argument 1 to "_to_locale_dict" has incompatible type "object"; expected "dict[str, str]"  [arg-type]
- discord/message.py:868: error: "object" has no attribute "items"  [attr-defined]
- discord/interactions.py:247: error: Need type annotation for "raw_channel"  [var-annotated]
- discord/guild.py:244: error: Incompatible types in assignment (expression has type "int | None", variable has type "int")  [assignment]
- discord/guild.py:245: error: Incompatible types in assignment (expression has type "int | None", variable has type "int")  [assignment]

archinstall (https://github.com/archlinux/archinstall)
+ archinstall/lib/models/users.py:206: error: Left operand of "or" is always false  [redundant-expr]
+ archinstall/lib/models/authentication.py:48: error: Statement is unreachable  [unreachable]

@ilevkivskyi ilevkivskyi merged commit 843d133 into python:master Oct 14, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

More precise return types form TypedDict.get

4 participants