update testing.integrate_kernel type hints#5137
update testing.integrate_kernel type hints#5137neutrinoceros merged 3 commits intoyt-project:mainfrom
Conversation
neutrinoceros
left a comment
There was a problem hiding this comment.
thanks for catching this. However, note that np.ndarray is not a valid type annotation, we should use NDArray[np.floating] instead (from numpy.typing import NDArray).
Furthermore, as numpy's type annotations get more and more refined, it is becoming apparent that mypy is not the correct typechecker we should be using. Numpy devs recommend basedpyright over it. I might start a migration later, but I just wanted to raise awareness for now. See https://github.com/numpy/numpy/releases/tag/v2.2.1
|
I swear I knew that at one point :) but in this case I actually just copied the type hinting in the previous function without thinking. so i also added a commit here to switch those other occurrences of np.ndarray in type hints in this file. |
|
Looks like you accidentally committed an unrelated (and massive cpp file) |
yt/testing.py
Outdated
| # tested: volume integral is 1. | ||
| def cubicspline_python( | ||
| x: float | np.ndarray, | ||
| x: float | NDArray[np.floating], |
There was a problem hiding this comment.
I guess the return annotation should be changed as well
There was a problem hiding this comment.
oh ya, thought I had put that in. maybe i forgot to commit it.
There was a problem hiding this comment.
(Or did you just not push your changes yet?)
There was a problem hiding this comment.
hah, sorry, didn't push any changes yet, only fixed the stray .cpp
yt/testing.py
Outdated
| kernelfunc: Callable[[float | NDArray[np.floating]], float | NDArray[np.floating]], | ||
| b: float, | ||
| hsml: float, | ||
| ) -> float: |
There was a problem hiding this comment.
the return value is actually np.float64, but it'd be safer to resolve this disparity by actually returning a float. Can you update the return statement ?
There was a problem hiding this comment.
hmm. so the typing here doesn't seem to be consistent with how integrate_kernel is used elsewhere. here's a test that uses arrays for b and hsml:
yt/yt/data_objects/tests/test_rays.py
Lines 120 to 138 in d444721
not certain if the typing for this function needs to updated so that b and hsml are float | NDArray or if that test there is not quite right. @nastasha-w I believe you wrote that test, any thoughts?
There was a problem hiding this comment.
Yeah sorry, the annotations here are a bit of a mess. I basically just edited until the type checker stopped throwing errors; I think I tried adding that these were float arrays at some point, but I didn't get that to work. Feel free to change the types here.
There was a problem hiding this comment.
By the way, I am already editing that function for #5121 , so this will be a fun rebase for me later. Current status:
def integrate_kernel(
kernelfunc: Callable[[float | np.ndarray], np.ndarray],
b: float | np.ndarray,
hsml: float | np.ndarray,
nsample: int = 500,
) -> float:although I'm now realizing that's wrong, and I actually count on it to return arrays in some cases in a different function.
There was a problem hiding this comment.
Yeah sorry, the annotations here are a bit of a mess. I basically just edited until the type checker stopped throwing errors;
No worries! I'm forever learning about proper python type checking myself...
and I actually count on it to return arrays in some cases in a different function.
So sounds like having b, hsml and the return value be float | NDArray is consistent with how you're using it. I'll go with that here. also, FYI in case you didn't read through all the discussion in this PR , this comment on np.ndarray vs np.typing.NDArray in type hints might be helpful for your changes.
| pos3_i1: NDArray[np.floating], | ||
| periodic: tuple[bool, bool, bool] = (True,) * 3, | ||
| periods: np.ndarray = _zeroperiods, | ||
| ) -> np.ndarray: |
There was a problem hiding this comment.
actually here the output dtype is bound to be the same as pos3_i3, so, instead of np.floating, we should use _FloatingT = TypeVar("_FloatingT", bound=np.floating) here
There was a problem hiding this comment.
ok, i think i understand this -- using TypeVar("_FloatingT", bound=np.floating) would ensure the input/output precisions match? in that case though, should I use npt.NBitBase for this (https://numpy.org/doc/stable/reference/typing.html#numpy.typing.NBitBase) ?
There was a problem hiding this comment.
nevermind about the NBitBase. i'm learning. np.floating is the way i think.
baf44ca to
ca9cda0
Compare
|
the stray .cpp is gone now. |
|
Sorry, I seem to have made a mess of the type annotations in those test functions. In case it helps, I think these are all the files where I use the problem function in
|
|
@neutrinoceros just fyi, this is ready for you to look at when you get a chance (no worries if you can't get to it immediately, just wanted to make sure you weren't waiting on me). |
|
Welp, I'm learning some new type annotation options here! Would it make sense to also use the _FloatingT type for the kernel integrator? That one also doesn't make assumptions on the exact float precision of the inputs. |
neutrinoceros
left a comment
There was a problem hiding this comment.
almost there. Sorry for the delay !
yt/testing.py
Outdated
| ], | ||
| b: float | npt.NDArray[np.floating], | ||
| hsml: float | npt.NDArray[np.floating], | ||
| ) -> float | npt.NDArray[np.floating]: |
There was a problem hiding this comment.
It's often much preferable to be strict on the return type. Here I don't see any reason not to be
| ) -> float | npt.NDArray[np.floating]: | |
| ) -> float: |
yt/testing.py
Outdated
| result = pre * integral | ||
| if isinstance(result, np.floating): | ||
| return result.item() | ||
| return result |
There was a problem hiding this comment.
| result = pre * integral | |
| if isinstance(result, np.floating): | |
| return result.item() | |
| return result | |
| return float(pre * integral) |
There was a problem hiding this comment.
I don't think this is a good idea. This function can be called on an arrays of b and hsml values and float(array) will not convert an array to the np.float type. Being strict on return type is fine, but then we should either revert to always returning an array, and changing the dtype, or do that type cast before picking out the single element for a 0d-array.
There was a problem hiding this comment.
... e.g. np.float32(pre * integral) would work for arrays though. We'd need to pick which floating point precision to specify in that case though.
There was a problem hiding this comment.
I'm confused. Why would np.float32(...) work but not float(...) ?
There was a problem hiding this comment.
I'm not sure, but that's what worked and didn't work when I tried it out:
Python 3.13.2 (main, Feb 4 2025, 14:51:09) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.arange(3)
>>> float(a)
Traceback (most recent call last):
File "<python-input-2>", line 1, in <module>
float(a)
~~~~~^^^
TypeError: only length-1 arrays can be converted to Python scalars
>>> np.float32(a)
array([0., 1., 2.], dtype=float32)
>>> np.float(a)
Traceback (most recent call last):
File "<python-input-4>", line 1, in <module>
np.float(a)
^^^^^^^^
File "/Users/nastasha/code/venvs/ytdev_pixav/lib/python3.13/site-packages/numpy/__init__.py", line 397, in __getattr__
raise AttributeError(__former_attrs__[attr], name=None)
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
depends. Can you point me to the exact function you're refering to ? |
|
@neutrinoceros I meant the function |
|
Then how is the decision to return an array taken, and would it make sense to unconditionally return an array instead ? |
|
Yeah, the version before this PR would always return an array; if the inputs were floats, the result would be a 0d-array. I just called if not (isinstance(b, np.ndarray) or isinstance(hsml, np.ndarray)):
return result.item()
return result |
This sounds like a good solution to me. I agree that the conditional return type is not ideal if we can simplify it.
how about always returning an array with |
|
Latest push updates integrate_kernel to always return an array (using Don't think there was anything else left to do? |
|
It looks good to me, but then again, so did the initial mess I made here |
|
Ah, now it looks like we can't merge yet because the jenkins server is apparently down ? |
Close #5136