Skip to content

Conversation

Tpt
Copy link
Contributor

@Tpt Tpt commented Sep 12, 2025

inspect::TypeHint is an AST of a Python type hint.

It is supposed to be created using a set of const constructors and used only in const context.

Introspection data generation is done using two associated methods. The serialized data is also an AST.

I make use of the AST to generate nice from X import Y as Z lines in the type stubs with code that auto-detect conflicts.

Miscellaneous changes:

  • Rename PyType{Info,Check}::TYPE_INFO into TYPE_HINT
  • Drop redundant PyClassImpl::TYPE_NAME

@Tpt Tpt changed the title Introduce TypeHint struct Introspection: Introduce TypeHint struct Sep 12, 2025
@Tpt Tpt force-pushed the tpt/moduleandtype branch 3 times, most recently from 7d6b267 to 7bb5b56 Compare September 12, 2025 13:24
pub struct TypeHint {
/// The type hint annotation
#[doc(hidden)]
pub annotation: &'static str,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred it to be an enum but sadly I don't see how to serialize it: const functions cannot concatenate and recursive macros are not possible

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the enum definition be? I can see if I can come up with any crazy ideas if I know the preferred form.

Copy link
Contributor Author

@Tpt Tpt Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

enum TypeHint {clashes
   BuiltIn { name: &'static str },
   Member { module: &'static str, name: &'static str },
   Union(&[TypeHint]),
   Subscript { value: TypeHint, parameters: &[TypeHint] },
   Callable { arguments: &[TypeHint], return: TypeHint } // Special case because of the nested [] syntax
}

This way it's easy to ensure the syntax is valid and to do manipulation like introducing aliases in case of name conflicts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think if we split the const fn up into one which (recursively) calculates the final length, and a second which serializes to it with &mut [u8] similar to what I did in the concat.rs module, I think we could probably make the const fn work?

Unfortunately this would require MSRV 1.83. But this is currently an experimental feature, I would personally be ok with having it limited to MSRV 1.83.

Copy link
Contributor Author

@Tpt Tpt Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

This works fairly well. I have not introduced macros to build the TypeHint enum yet, I plan to do it in a follow up. My guess is that the real use case is for the custom type hint in the signature, most of INPUT_TYPE and OUTPUT_TYPE are likely fine with just the const constructors.

I now serialize the type hint as an AST, this allows to manipulate it when building the type stubs. As an example, I wrote some code that generates nice from X import Y imports and generate aliases in case of name conflicts.

@Tpt Tpt force-pushed the tpt/moduleandtype branch 5 times, most recently from 47a4fc1 to 4ece557 Compare September 12, 2025 14:15
@Tpt Tpt marked this pull request as ready for review September 12, 2025 15:01
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks like the right direction for making the type hint definitions reliable wrt import resolution.

I think we can iterate a bit to find the best design for users, I wrote down some ideas.

}};
}

/// Allows to build a [`TypeHint`] that is the subscripted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sentence looks unfinished.

pub struct TypeHint {
/// The type hint annotation
#[doc(hidden)]
pub annotation: &'static str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the enum definition be? I can see if I can come up with any crazy ideas if I know the preferred form.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the annotation_stub stuff is still present, should we remove it at the same time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code to extract modules was in the stubs.rs module and got removed

inspect::TypeHint is composed of an "annotation" string and a list of "imports" ("from X import Y" kind)

The type is expected to be built using the macros `type_hint!(module, name)`, `type_hint_union!(*args)` and `type_hint_subscript(main, *args)` that take care of maintaining the import list

Introspection data generation is done using the hidden type_hint_json macro to avoid that the proc macros generate too much code

Sadly, outside `type_hint` these macros can't be converted into const functions because they need to do some concatenation. I introduced `type_hint!` for consistency, happy to convert it to a const function.

Miscellaneous changes:
- Rename PyType{Info,Check}::TYPE_INFO into TYPE_HINT
- Drop redundant PyClassImpl::TYPE_NAME
@Tpt Tpt force-pushed the tpt/moduleandtype branch from 4ece557 to 02ce980 Compare September 23, 2025 14:23
#[derive(Debug, Eq, PartialEq, Clone, Hash)]
pub enum TypeHint {
Ast(TypeHintExpr),
Plain(String),
Copy link
Contributor Author

@Tpt Tpt Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for custom type hint witten in the #[pyo3(signature = ] attribute. I plan a follow up to parse the value in the proc-macros and generate an AST. I tend to think it's fine to restrict the possible type hint grammar and expand it in the future if needs arise (the opposite of what we did by allowing anything inside of a string)

@Tpt Tpt force-pushed the tpt/moduleandtype branch 2 times, most recently from 1f56ea9 to 3fee134 Compare September 23, 2025 14:38
/// Provides the full python type as a type hint.
#[cfg(feature = "experimental-inspect")]
const PYTHON_TYPE: &'static str = "typing.Any";
const TYPE_HINT: TypeHint = TypeHint::module_member("typing", "Any");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can drop this and only have the field on PyTypeCheck. I will do it in a follow up except if you prefer it done in this MR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the blanket impl PyTypeCheck for T: PyTypeInfo need this here?

Copy link
Contributor Author

@Tpt Tpt Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can write in this blanket impl:

    const TYPE_HINT: TypeHint = if let Some(module) = Self::MODULE {
       TypeHint::module_member(module, Self::NAME)
    } else {
       TypeHint:builtins(Self::NAME)
    };

@Tpt Tpt force-pushed the tpt/moduleandtype branch from 3fee134 to a43f077 Compare September 25, 2025 14:39
@Tpt Tpt force-pushed the tpt/moduleandtype branch from a43f077 to 751482e Compare September 25, 2025 14:42
@Tpt
Copy link
Contributor Author

Tpt commented Oct 3, 2025

@davidhewitt If you have time for an other review it would be amazing 🥺

@davidhewitt davidhewitt mentioned this pull request Oct 4, 2025
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow review, have been spinning many plates this week.

This is looking very cool, nicely done! It looks like it fixes a lot of the problems we had and makes the API a lot cleaner.

I have quite a few thoughts & questions on this one 😁

Comment on lines +485 to +487
fn deserialize_type_hint<'de, D: Deserializer<'de>>(
deserializer: D,
) -> Result<Option<ChunkTypeHint>, D::Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please expand on why the ChunkTypeHint type needs to exist (seems very similar to TypeHint in model.rs)? Maybe add a doc comment to the type.

Copy link
Contributor Author

@Tpt Tpt Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a comment. I duplicated the type to make it clear this ChunkTypeHint type is designed for extraction of the introspected data and will have to deal with backward compatibility whereas the TypeHint type is for the user facing API. I guess they will diverge in the future. Is it fine with you?

Comment on lines +609 to +610
"from bat import A as A2",
"from foo import A as A3, B"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the aliasing introduced here cause IDEs to hint the type to the user as "A2" ?

If so, I wonder if we should go the other direction and have import bat and import foo here and use fully qualified names below?

I think I've seen the Python protobuf compiler handled this same problem, we might find it useful to see what they did for that implementation.

... I wonder if rather than A2 and A3, nicer aliases might be bat_A and foo_A?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the aliasing introduced here cause IDEs to hint the type to the user as "A2" ?

It is not the case in PyCharm (I properly get A). I have not tested in other IDEs.

If so, I wonder if we should go the other direction and have import bat and import foo here and use fully qualified names below?

We can, it might make regular type stubs a bit less readable (a lot of full paths). We might still have some case of naming conflicts (e.g. between a module and a function) but sounds less likely. Happy to do the change if you want.

I think I've seen the Python protobuf compiler handled this same problem, we might find it useful to see what they did for that implementation.

Thanks! Will have a look. However, I guess their lives is a bit easier than ours because they do not have to deal with arbitrary imports (they only generate code from protobuf whereas we have to deal with arbitrary type hint from user code that can use whatever Python type).

... I wonder if rather than A2 and A3, nicer aliases might be bat_A and foo_A?

Indeed. I have added a TODO. My feeling is that it's a polishing that can be far in the future. WDYT?

/// Provides the full python type as a type hint.
#[cfg(feature = "experimental-inspect")]
const PYTHON_TYPE: &'static str = "typing.Any";
const TYPE_HINT: TypeHint = TypeHint::module_member("typing", "Any");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the blanket impl PyTypeCheck for T: PyTypeInfo need this here?

Comment on lines 233 to 235
value.serialize_for_introspection(output);
output
.split_at_mut(value.serialized_len_for_introspection())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid iterating through value twice it might make sense for serialize_for_introspection to return the number of bytes written?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Done.

/// const T: TypeHint = TypeHint::module_member("datetime", "time");
/// assert_eq!(T.to_string(), "datetime.time");
/// ```
pub const fn module_member(module: &'static str, attr: &'static str) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about module_attr? It's slighty shorter, matches the argument names and enum name, and it feels a bit more correct to me than module_member somehow 🤔

Also, does this work for nested classes?

e.g.

# a.py
class A:
    class B:
        ...

... I guess the import would be from a import A, and the annotation would be A.B.

... I guess the answer is yes, this works, as long as in the import aliasing code we only import & alias the ident up the the first dot (i.e. if another "A" is in scope, this might be aliased as A2.B)?

Copy link
Contributor Author

@Tpt Tpt Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about module_attr?

Done

I guess the answer is yes, this works, as long as in the import aliasing code we only import & alias the ident up the the first dot (i.e. if another "A" is in scope, this might be aliased as A2.B)?

Exactly. I just added it in 53119ed (this was missing)

@Tpt
Copy link
Contributor Author

Tpt commented Oct 6, 2025

I have quite a few thoughts & questions on this one 😁

Thank you for taking the time for the thoughtful review. I was not expected so much even if I expected a lot of comments.

Follow ups in 1f30ee7 and 53119ed

Tpt added 2 commits October 6, 2025 09:53
# Conflicts:
#	noxfile.py
#	src/types/weakref/anyref.rs
#	src/types/weakref/proxy.rs
@Tpt Tpt force-pushed the tpt/moduleandtype branch from 613f2a6 to 1f30ee7 Compare October 6, 2025 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants