TYP: improve type annotations and remove unnecessary type ignores #62315

simonjayhawkins · 2025-09-11T10:58:28Z

Add and refine type hints in core modules (algorithms, base, indexes, _version).
Remove redundant # type: ignore[no-untyped-call] comments in groupby/categorical.py.
Refactor variable annotations for clarity and consistency.
No functional changes; improves code quality and type safety.

- Add and refine type hints in core modules (algorithms, base, indexes, _version). - Remove redundant `# type: ignore[no-untyped-call]` comments in groupby/categorical.py. - Refactor variable annotations for clarity and consistency. - No functional changes; improves code quality and type safety.

mroeschke · 2025-09-11T16:06:08Z

pandas/_version.py



-def get_versions():
+def get_versions() -> dict[str, Any]:


Suggested change

def get_versions() -> dict[str, Any]:

def get_versions() -> dict[str, str | None]:

?

dict is invariant, so it's generally speaking not a good idea to use unions as type arguments

I don’t understand this point. Why would being more specific not be better?

It's good to be more specific, but you should keep in mind that dict[str, str] is not assignable to dict[str, str | None] due to its invariance. Depending on how this will be used, that could lead to unexpected typing errors down the line.

If you want to be specific here, then the best you can do is use a TypedDict, or a union thereof. If this function is frequently used, then the effort might be worth it.

Another option would be to return a Mapping[str, str | None] instead. The Mapping value type parameter is covariant, so both dict[str, str] and dict[str, str | None] can be assigned to it (since dict <: Mapping).

The "Any trick" might also work, i.e. dict[str, str | Any]. Personally I'm not a fan of this "Any trick"` and don't think it should be promoted like that, but in this specific case it's not all that bad.

Thanks all for the input.

A few observations

the goal of this PR is to remove some #type: ignore from the pandas codebase. This annotation is on vendored code. So firstly, I wasn't particularly concerned about using Any here, but that's just my opinion and maybe because we effectively maintain this code we should treat it equally with the pandas code. Secondly, because it is vendored i would have probably have just created a .pyi file. However, i noticed that others have added type hints to the file directly. adding a .pyi would shadow these. However, i could have added the existing to the .pyi as well so also an option.

i've not contributed greatly to the type annotations of late so my understanding of policies may be outdated. But IIRC we always used to be precise in the return types, wrt to type and therefore would not have used Mapping for return types in the past. Happy to change if this is now acceptable and used elsewhere. If the reason for the original comment here is the reluctance to use Any then I would also be happy using just dict without any type arguments. Still better than untyped function.

Strictness flags and other mypy command line options, or even grep, easily allows us to audit the use of Any. In the past i've not be adverse to use Any. To me this is a lesser offence than a #type: ignore. Hence adding a Any to remove a #type: ignore is IMO a win?

If you want to be specific here, then the best you can do is use a TypedDict, or a union thereof. If this function is frequently used, then the effort might be worth it.

I guess this is likely to be the most acceptable to all?

jorenham

I picked some nits; hope you don't mind

jorenham · 2025-09-11T16:57:31Z

pandas/_version.py



-def get_versions():
+def get_versions() -> dict[str, Any]:


dict is invariant, so it's generally speaking not a good idea to use unions as type arguments

jorenham · 2025-09-11T16:59:56Z

pandas/core/algorithms.py

+@overload
+def unique(values: np.ndarray) -> np.ndarray: ...
+@overload
+def unique(values: Index) -> Index: ...
+@overload
+def unique(values: Series) -> np.ndarray: ...
+@overload
+def unique(values: Categorical) -> Categorical: ...
+@overload
+def unique(values: ExtensionArray) -> ExtensionArray: ...


The Series -> ndarray and ndarray -> ndarray overloads can be combined, e.g.:

Suggested change

@overload

def unique(values: np.ndarray) -> np.ndarray: ...

@overload

def unique(values: Index) -> Index: ...

@overload

def unique(values: Series) -> np.ndarray: ...

@overload

def unique(values: Categorical) -> Categorical: ...

@overload

def unique(values: ExtensionArray) -> ExtensionArray: ...

@overload

def unique(values: Index) -> Index: ...

@overload

def unique(values: Categorical) -> Categorical: ...

@overload

def unique(values: ExtensionArray) -> ExtensionArray: ...

@overload

def unique(values: np.ndarray | Series) -> np.ndarray: ...

Additionally, you could merge the three other overloads by using a TypeVar(..., bound=Index | Categorical | ExtensionArray)

jorenham · 2025-09-11T17:01:19Z

pandas/core/base.py

        if not isinstance(values, np.ndarray):
            # i.e. ExtensionArray
-            result = values.unique()
+            result: np.ndarray | ExtensionArray = values.unique()


If you run mypy with --local-partial-types --allow-redefinition-new this wouldn't be needed anymore.

https://mypy.readthedocs.io/en/latest/command_line.html#cmdoption-mypy-allow-redefinition-new

I've started to look into this, and our global settings were set a long while ago now so a review is perhaps long overdue. Also bearing in mind that these flags will become the defaults in the future then also worth being ahead of the game. My initial impression is that i'm +1 on this. After a quick review it looks like in most cases people have being adding ignores which is IMO worse than adding the union variable annotation up front. But if mypy can now infer these without explicitly defining them then i assume that'll be a plus. 200-300 mypy errors at the moment. Adding the explicit type annotations for the variables defined outside the local scope will probably take a while but i've started on that and will do this as a precursor PR. Will probably also want to remove the variable annotations used like here where mypy can now infer the types. The only issue here maybe that mypy has flagged this as experimental and others may be against. Just to reiterate this looks like a good idea. Thanks for the suggestion.

jorenham · 2025-09-11T17:01:42Z

pandas/core/indexes/base.py

                    # This is faster, because Index.unique() checks for uniqueness
                    # before calculating the unique values.
-                    res = algos.unique1d(res_indexer)
+                    res: Index | ExtensionArray | np.ndarray = algos.unique1d(


ok, so for now have added ignore[assignment] instead, since this PR was specifically targeting ignore[no-untyped-call]. These should either be unnecessary with a precursor PR or can merge before if consider these the lesser of two evils.

simonjayhawkins · 2025-09-12T09:11:17Z

I picked some nits; hope you don't mind

not at all. Thanks for taking the time to review.

…- precursor PR to change global config so these won't be necessary

mroeschke · 2025-09-12T17:18:54Z

Thanks @simonjayhawkins

simonjayhawkins requested a review from rhshadrach as a code owner September 11, 2025 10:58

simonjayhawkins added the Typing type annotations, mypy/pyright type checking label Sep 11, 2025

mroeschke reviewed Sep 11, 2025

View reviewed changes

jorenham reviewed Sep 11, 2025

View reviewed changes

simonjayhawkins added 4 commits September 12, 2025 11:36

Merge remote-tracking branch 'upstream/main' into anno2

a33515e

remove Any for now while discussions ongoing

647c1d4

prefer ignore[assignment] over explicit variable annotations for now …

babc396

…- precursor PR to change global config so these won't be necessary

combine the overloads

3607e75

mroeschke added this to the 3.0 milestone Sep 12, 2025

mroeschke approved these changes Sep 12, 2025

View reviewed changes

mroeschke merged commit c99e52e into pandas-dev:main Sep 12, 2025
41 checks passed

simonjayhawkins mentioned this pull request Sep 12, 2025

TYP: add type annotations for holidays and dtypes modules #62327

Merged

simonjayhawkins deleted the anno2 branch September 12, 2025 20:33

	def get_versions() -> dict[str, Any]:
	def get_versions() -> dict[str, str \| None]:

Uh oh!

TYP: improve type annotations and remove unnecessary type ignores #62315

TYP: improve type annotations and remove unnecessary type ignores #62315

Uh oh!

Conversation

simonjayhawkins commented Sep 11, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jorenham left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

simonjayhawkins commented Sep 12, 2025

Uh oh!

Uh oh!

mroeschke commented Sep 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants