TYP: improve type annotations and remove unnecessary type ignores #62315
Changes from 1 commit
957efd0
a33515e
647c1d4
babc396
3607e75
@@ -12,6 +12,7 @@
     TYPE_CHECKING,
     Literal,
     cast,
+    overload,
 )
 import warnings

@@ -314,6 +315,18 @@ def _check_object_for_strings(values: np.ndarray) -> str:
 # --------------- #


+@overload
+def unique(values: np.ndarray) -> np.ndarray: ...
+@overload
+def unique(values: Index) -> Index: ...
+@overload
+def unique(values: Series) -> np.ndarray: ...
+@overload
+def unique(values: Categorical) -> Categorical: ...
+@overload
+def unique(values: ExtensionArray) -> ExtensionArray: ...
Suggested change
-@overload
-def unique(values: np.ndarray) -> np.ndarray: ...
-@overload
-def unique(values: Index) -> Index: ...
-@overload
-def unique(values: Series) -> np.ndarray: ...
-@overload
-def unique(values: Categorical) -> Categorical: ...
-@overload
-def unique(values: ExtensionArray) -> ExtensionArray: ...
+@overload
+def unique(values: Index) -> Index: ...
+@overload
+def unique(values: Categorical) -> Categorical: ...
+@overload
+def unique(values: ExtensionArray) -> ExtensionArray: ...
+@overload
+def unique(values: np.ndarray | Series) -> np.ndarray: ...
Additionally, you could merge the three other overloads by using a `TypeVar(..., bound=Index | Categorical | ExtensionArray)`.
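A minimal sketch of what the TypeVar-based variant might look like. The classes below are hypothetical stand-ins so the snippet runs without pandas or NumPy, and the name `SameTypeT` and the runtime body are assumptions for illustration only, not the code in this PR.

```python
from __future__ import annotations

from typing import TypeVar, Union, overload


# Hypothetical stand-ins for np.ndarray and the pandas classes (assumption:
# the real classes are not imported here so the sketch stays self-contained).
class NDArray: ...
class Series: ...
class Index: ...
class ExtensionArray: ...
class Categorical(ExtensionArray): ...


# One bound TypeVar covers the three "same type in, same type out" overloads.
SameTypeT = TypeVar("SameTypeT", bound=Union[Index, Categorical, ExtensionArray])


@overload
def unique(values: SameTypeT) -> SameTypeT: ...
@overload
def unique(values: NDArray | Series) -> NDArray: ...
def unique(values):
    # Placeholder body: only mirrors the input/output types of the overloads,
    # it does not compute unique values.
    if isinstance(values, (NDArray, Series)):
        return NDArray()
    return type(values)()
```

With this shape, a type checker would infer `Index` for `unique(Index())` and `NDArray` for `unique(Series())`, using two overloads instead of four.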
@@ -1100,7 +1100,7 @@ def unique(self):
         values = self._values
         if not isinstance(values, np.ndarray):
             # i.e. ExtensionArray
-            result = values.unique()
+            result: np.ndarray | ExtensionArray = values.unique()
         else:
             result = algorithms.unique1d(values)
         return result
@@ -3301,7 +3301,9 @@ def _intersection(self, other: Index, sort: bool = False):
         if is_numeric_dtype(self.dtype):
             # This is faster, because Index.unique() checks for uniqueness
             # before calculating the unique values.
-            res = algos.unique1d(res_indexer)
+            res: Index | ExtensionArray | np.ndarray = algos.unique1d(
+                res_indexer
+            )
         else:
             result = self.take(indexer)
             res = result.drop_duplicates()
?
`dict` is invariant, so it's generally speaking not a good idea to use unions as type arguments.
I don’t understand this point. Why would being more specific not be better?
It's good to be more specific, but you should keep in mind that `dict[str, str]` is not assignable to `dict[str, str | None]` due to its invariance. Depending on how this will be used, that could lead to unexpected typing errors down the line.

If you want to be specific here, then the best you can do is use a `TypedDict`, or a union thereof. If this function is frequently used, then the effort might be worth it.

Another option would be to return a `Mapping[str, str | None]` instead. The `Mapping` value type parameter is covariant, so both `dict[str, str]` and `dict[str, str | None]` can be assigned to it (since `dict <: Mapping`).

The "`Any` trick" might also work, i.e. `dict[str, str | Any]`. Personally I'm not a fan of this "`Any` trick" and don't think it should be promoted like that, but in this specific case it's not all that bad.
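To make the invariance point concrete, here is a small sketch. The function names `read_via_dict` and `read_via_mapping` and the sample data are hypothetical, chosen only for illustration. The code runs as-is. The comments describe what a type checker such as mypy is expected to report, not runtime behaviour.

```python
from __future__ import annotations

from typing import Mapping


def read_via_dict(d: dict[str, str | None]) -> int:
    # Hypothetical consumer that asks for the exact dict type.
    return len(d)


def read_via_mapping(m: Mapping[str, str | None]) -> int:
    # Hypothetical consumer that only needs read access.
    return len(m)


plain: dict[str, str] = {"engine": "c"}

# Expected to type-check: Mapping is covariant in its value type.
n_mapping = read_via_mapping(plain)

# Runs fine, but a type checker is expected to flag this call, because
# dict is invariant in its value type.
n_dict = read_via_dict(plain)
```

The same reasoning applies to return types: a function annotated to return `dict[str, str | None]` hands callers a value they cannot pass on where `dict[str, str]` is expected.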
Thanks all for the input.

A few observations:

The goal of this PR is to remove some `# type: ignore` comments from the pandas codebase. This annotation is on vendored code. So firstly, I wasn't particularly concerned about using `Any` here, but that's just my opinion, and maybe because we effectively maintain this code we should treat it equally with the pandas code. Secondly, because it is vendored I would probably have just created a `.pyi` file. However, I noticed that others have added type hints to the file directly, and adding a `.pyi` would shadow these. I could have added the existing annotations to the `.pyi` as well, so that is also an option.

I've not contributed greatly to the type annotations of late, so my understanding of policies may be outdated. But IIRC we always used to be precise in the return types, and therefore would not have used `Mapping` for return types in the past. Happy to change if this is now acceptable and used elsewhere. If the reason for the original comment here is the reluctance to use `Any`, then I would also be happy using just `dict` without any type arguments. Still better than an untyped function.

Strictness flags and other mypy command-line options, or even grep, easily allow us to audit the use of `Any`. In the past I've not been averse to using `Any`. To me this is a lesser offence than a `# type: ignore`. Hence adding an `Any` to remove a `# type: ignore` is IMO a win?

I guess this is likely to be the most acceptable to all?