update testing.integrate_kernel type hints by chrishavlin · Pull Request #5137 · yt-project/yt

chrishavlin · 2025-03-19T21:06:20Z

neutrinoceros

thanks for catching this. However, note that a bare np.ndarray is not a precise type annotation, since it carries no dtype information. We should use NDArray[np.floating] instead (from numpy.typing import NDArray).
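For illustration, a minimal sketch of the annotation style suggested here. The function name scale and its parameters are hypothetical and are not part of yt; the only point is how NDArray[np.floating] is spelled in a signature.

```python
from __future__ import annotations

import numpy as np
from numpy.typing import NDArray


def scale(
    x: float | NDArray[np.floating],
    factor: float,
) -> float | NDArray[np.floating]:
    # Hypothetical helper: accepts either a Python float or a float array
    # and returns the same kind of object, multiplied by a scalar factor.
    return x * factor
```

A type checker can then tell a float array apart from, say, an integer array, which a bare np.ndarray annotation does not express.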

Furthermore, as numpy's type annotations get more and more refined, it is becoming apparent that mypy is not the correct typechecker we should be using. Numpy devs recommend basedpyright over it. I might start a migration later, but I just wanted to raise awareness for now. See https://github.com/numpy/numpy/releases/tag/v2.2.1

chrishavlin · 2025-03-20T19:05:25Z

I swear I knew that at one point :) but in this case I actually just copied the type hinting from the previous function without thinking. So I also added a commit here to switch the other occurrences of np.ndarray in type hints in this file.

neutrinoceros · 2025-03-21T08:34:59Z

Looks like you accidentally committed an unrelated (and massive) cpp file

neutrinoceros · 2025-03-21T08:35:47Z

yt/testing.py

 # tested: volume integral is 1.
 def cubicspline_python(
-    x: float | np.ndarray,
+    x: float | NDArray[np.floating],


I guess the return annotation should be changed as well

oh ya, thought I had put that in. maybe i forgot to commit it.

Did you forget again ? 😅

(Or did you just not push your changes yet?)

hah, sorry, didn't push any changes yet, only fixed the stray .cpp

neutrinoceros · 2025-03-21T08:39:26Z

yt/testing.py

+    kernelfunc: Callable[[float | NDArray[np.floating]], float | NDArray[np.floating]],
+    b: float,
+    hsml: float,
 ) -> float:


the return value is actually np.float64, but it'd be safer to resolve this disparity by actually returning a float. Can you update the return statement ?
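As a rough sketch of the disparity being described (the helper name as_builtin_float is hypothetical, not something in yt): np.float64 is a subclass of the builtin float, so it passes isinstance checks, but an explicit float(...) call is what produces the exact builtin type.

```python
import numpy as np


def as_builtin_float(value: np.floating) -> float:
    # Hypothetical helper: convert a NumPy floating scalar to a builtin float.
    return float(value)


numpy_scalar = np.float64(1.5)
# np.float64 subclasses the builtin float, np.float32 does not.
is_subclass_64 = isinstance(numpy_scalar, float)
is_subclass_32 = isinstance(np.float32(1.5), float)
converted = as_builtin_float(numpy_scalar)
```

Here type(converted) is float, while type(numpy_scalar) is np.float64.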

hmm. so the typing here doesn't seem to be consistent with how integrate_kernel is used elsewhere. here's a test that uses arrays for b and hsml:

yt/yt/data_objects/tests/test_rays.py

Lines 120 to 138 in d444721

start1 = np.array((1.53, 0.53, 1.0))
end1 = np.array((1.53, 0.53, 3.0))
ray1 = ds.ray(start1, end1)
b1 = np.array([np.sqrt(2.0) * 0.03] * 2)
hsml1 = np.array([0.05] * 2)
len1 = np.sqrt(np.sum((end1 - start1) ** 2))
# for a ParticleDataset like this one, the Ray object attempts
# to generate the 't' and 'dts' fields using the grid method
ray1.field_data["t"] = ray1.ds.arr(ray1._generate_container_field_sph("t"))
ray1.field_data["dts"] = ray1.ds.arr(ray1._generate_container_field_sph("dts"))
# not demanding too much precision;
# from kernel volume integrals, the linear interpolation
# restricts you to 4 -- 5 digits precision
assert_equal(ray1["t"].shape, (2,))
assert_rel_equal(ray1["t"], np.array([0.25, 0.75]), 5)
assert_rel_equal(
    ray1["gas", "position"].v, np.array([[1.5, 0.5, 1.5], [1.5, 0.5, 2.5]]), 5
)
dl1 = integrate_kernel(kernelfunc, b1, hsml1)

not certain if the typing for this function needs to be updated so that b and hsml are float | NDArray or if that test there is not quite right. @nastasha-w I believe you wrote that test, any thoughts?

Yeah sorry, the annotations here are a bit of a mess. I basically just edited until the type checker stopped throwing errors; I think I tried adding that these were float arrays at some point, but I didn't get that to work. Feel free to change the types here.

By the way, I am already editing that function for #5121, so this will be a fun rebase for me later. Current status:

def integrate_kernel(
    kernelfunc: Callable[[float | np.ndarray], np.ndarray],
    b: float | np.ndarray,
    hsml: float | np.ndarray,
    nsample: int = 500,
) -> float:

although I'm now realizing that's wrong, and I actually count on it to return arrays in some cases in a different function.

Yeah sorry, the annotations here are a bit of a mess. I basically just edited until the type checker stopped throwing errors;

No worries! I'm forever learning about proper python type checking myself...

and I actually count on it to return arrays in some cases in a different function.

So it sounds like having b, hsml and the return value be float | NDArray is consistent with how you're using it. I'll go with that here. Also, FYI in case you didn't read through all the discussion in this PR, this comment on np.ndarray vs np.typing.NDArray in type hints might be helpful for your changes.

neutrinoceros · 2025-03-21T08:41:11Z

yt/testing.py

+    pos3_i1: NDArray[np.floating],
    periodic: tuple[bool, bool, bool] = (True,) * 3,
-    periods: np.ndarray = _zeroperiods,
-) -> np.ndarray:


actually here the output dtype is bound to be the same as that of the position inputs (pos3_i1 above), so, instead of np.floating, we should use _FloatingT = TypeVar("_FloatingT", bound=np.floating) here

ok, i think i understand this -- using TypeVar("_FloatingT", bound=np.floating) would ensure the input/output precisions match? in that case though, should I use npt.NBitBase for this (https://numpy.org/doc/stable/reference/typing.html#numpy.typing.NBitBase) ?

nevermind about the NBitBase. i'm learning. np.floating is the way i think.
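A minimal sketch of what the bound TypeVar expresses, assuming a made-up function (wrap_periodic is hypothetical and is not the function under review): the same _FloatingT appears in the input and output annotations, so a type checker infers that a float32 array in gives a float32 array out.

```python
from __future__ import annotations

from typing import TypeVar

import numpy as np
from numpy.typing import NDArray

_FloatingT = TypeVar("_FloatingT", bound=np.floating)


def wrap_periodic(pos: NDArray[_FloatingT], period: float) -> NDArray[_FloatingT]:
    # Hypothetical helper: wrap positions into [0, period).
    # Casting the period to the array's own scalar type keeps the output
    # dtype identical to the input dtype at runtime as well.
    return np.mod(pos, pos.dtype.type(period))
```

With a plain NDArray[np.floating] on both sides, the checker would only know that some floating dtype comes back, not that it matches the input.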

chrishavlin · 2025-03-21T15:27:54Z

the stray .cpp is gone now.

nastasha-w · 2025-03-24T12:08:25Z

Sorry, I seem to have made a mess of the type annotations in those test functions. In case it helps, I think these are all the files where I use the problem function in testing.py:

yt/visualization/tests/test_offaxisprojection_pytestonly.py
yt/geometry/coordinates/tests/test_sph_pixelization.py
yt/geometry/coordinates/tests/test_sph_pixelization_pytestonly.py

chrishavlin · 2025-03-27T14:53:56Z

@neutrinoceros just fyi, this is ready for you to look at when you get a chance (no worries if you can't get to it immediately, just wanted to make sure you weren't waiting on me).

nastasha-w · 2025-03-27T20:17:40Z

Welp, I'm learning some new type annotation options here! Would it make sense to also use the _FloatingT type for the kernel integrator? That one also doesn't make assumptions on the exact float precision of the inputs.

neutrinoceros

almost there. Sorry for the delay !

neutrinoceros · 2025-04-01T10:39:12Z

yt/testing.py

+    ],
+    b: float | npt.NDArray[np.floating],
+    hsml: float | npt.NDArray[np.floating],
+) -> float | npt.NDArray[np.floating]:


It's often much preferable to be strict on the return type. Here I don't see any reason not to be

Suggested change

- ) -> float | npt.NDArray[np.floating]:
+ ) -> float:

neutrinoceros · 2025-04-01T10:39:35Z

yt/testing.py

+    result = pre * integral
+    if isinstance(result, np.floating):
+        return result.item()
+    return result


Suggested change

-    result = pre * integral
-    if isinstance(result, np.floating):
-        return result.item()
-    return result
+    return float(pre * integral)

I don't think this is a good idea. This function can be called on arrays of b and hsml values, and float(array) raises a TypeError for any array with more than one element instead of casting it to a float type. Being strict on the return type is fine, but then we should either revert to always returning an array and changing the dtype, or do that type cast before picking out the single element of a 0d-array.

... e.g. np.float32(pre * integral) would work for arrays. We'd need to pick which floating point precision to specify in that case, though.

I'm confused. Why would np.float32(...) work but not float(...) ?

I'm not sure, but that's what worked and didn't work when I tried it out:

Python 3.13.2 (main, Feb 4 2025, 14:51:09) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.arange(3)
>>> float(a)
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    float(a)
    ~~~~~^^^
TypeError: only length-1 arrays can be converted to Python scalars
>>> np.float32(a)
array([0., 1., 2.], dtype=float32)
>>> np.float(a)
Traceback (most recent call last):
  File "<python-input-4>", line 1, in <module>
    np.float(a)
    ^^^^^^^^
  File "/Users/nastasha/code/venvs/ytdev_pixav/lib/python3.13/site-packages/numpy/__init__.py", line 397, in __getattr__
    raise AttributeError(__former_attrs__[attr], name=None)
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

neutrinoceros · 2025-04-01T10:43:10Z

Would it make sense to also use the _FloatingT type for the kernel integrator? That one also doesn't make assumptions on the exact float precision of the inputs.

depends.
_FloatingT, as defined here, doesn't just act as a placeholder for an unknown dtype: it is also a TypeVar and as such, it expresses a relation between input types and output types.

Can you point me to the exact function you're referring to?

nastasha-w · 2025-04-01T12:24:11Z

@neutrinoceros I meant the function integrate_kernel in testing.py (line 120 in my version of testing.py). As-is, the output will depend on the exact float types used for b and hsml, although it isn't obvious to me which type will be used if they aren't the same... Perhaps your explicit type-casting suggestion is a better idea here, but I would like to retain the ability to return arrays and not just single float values.

neutrinoceros · 2025-04-01T13:40:31Z

Then how is the decision to return an array taken, and would it make sense to unconditionally return an array instead ?

nastasha-w · 2025-04-01T14:20:30Z

Yeah, the version before this PR would always return an array; if the inputs were floats, the result would be a 0d-array. I just called np.array on the hsml and b inputs. 0d-arrays are annoying though, I'd be fine with returning a float if neither b nor hsml was input as an array. e.g.

if not (isinstance(b, np.ndarray) or isinstance(hsml, np.ndarray)):
    return result.item()
return result
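To make the 0d-array point concrete, here is a small sketch under stated assumptions: product_scalar_or_array is a hypothetical stand-in (a plain product instead of the real kernel integral), used only to show the conditional return pattern from the snippet above.

```python
from __future__ import annotations

import numpy as np
from numpy.typing import NDArray


def product_scalar_or_array(
    b: float | NDArray[np.floating],
    hsml: float | NDArray[np.floating],
) -> float | NDArray[np.floating]:
    # Hypothetical stand-in for the real computation.
    # np.asarray guarantees an ndarray; for scalar inputs it is 0d, shape ().
    result = np.asarray(np.multiply(b, hsml))
    if not (isinstance(b, np.ndarray) or isinstance(hsml, np.ndarray)):
        # Neither input was an array: unwrap the 0d-array to a builtin float.
        return result.item()
    return result
```

The price of this convenience is the union return type, which is what the rest of the thread tries to avoid.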

chrishavlin · 2025-04-01T14:38:57Z

would it make sense to unconditionally return an array instead ? (@neutrinoceros )

This sounds like a good solution to me. I agree that the conditional return type is not ideal if we can simplify it.

0d-arrays are annoying though (@nastasha-w )

how about always returning an array with np.atleast_1d (i.e., return np.atleast_1d(pre * integral))?
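A quick sketch of how np.atleast_1d behaves for the cases discussed; the wrapper name always_array is hypothetical and only there to attach a strict return annotation to the call.

```python
from __future__ import annotations

import numpy as np
from numpy.typing import NDArray


def always_array(value: float | NDArray[np.floating]) -> NDArray[np.floating]:
    # np.atleast_1d promotes scalars and 0d-arrays to shape (1,)
    # and leaves arrays with one or more dimensions unchanged.
    return np.atleast_1d(value)


from_float = always_array(0.5)
from_0d = always_array(np.array(0.5))
from_1d = always_array(np.array([0.1, 0.2]))
```

All three results are ndarrays with ndim >= 1, so callers never have to special-case a float or a 0d-array.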

chrishavlin · 2025-04-03T20:01:27Z

Latest push updates integrate_kernel to always return an array (using np.atleast_1d so it's never a 0d array).

Don't think there was anything else left to do?

nastasha-w · 2025-04-03T20:55:05Z

It looks good to me, but then again, so did the initial mess I made here

neutrinoceros · 2025-04-04T12:27:57Z

Ah, now it looks like we can't merge yet because the jenkins server is apparently down ?

neutrinoceros requested changes Mar 20, 2025

neutrinoceros added this to the 4.4.1 milestone Mar 20, 2025

neutrinoceros reviewed Mar 21, 2025

update testing.integrate_kernel type hints

ca9cda0

chrishavlin force-pushed the fix_integrate_kernel_typing branch from baf44ca to ca9cda0 Compare March 21, 2025 15:27

update cubic_spline, integrate_kernel, distancematrix typing

d89cbc0

chrishavlin mentioned this pull request Mar 27, 2025

Updating type hints for numpy arrays #5144

Open

21 tasks

neutrinoceros requested changes Apr 1, 2025

nastasha-w mentioned this pull request Apr 1, 2025

WIP: add pixel-averaging option for SPH projections #5121

Draft

4 tasks

integrate_kernel now always returns at least a 1D ndarray

d15afe6

neutrinoceros approved these changes Apr 4, 2025

neutrinoceros added the bug label Apr 4, 2025

neutrinoceros merged commit 69acde4 into yt-project:main Apr 4, 2025
12 of 13 checks passed

meeseeksmachine pushed a commit to meeseeksmachine/yt that referenced this pull request Apr 4, 2025

Backport PR yt-project#5137: update testing.integrate_kernel type hints

f64bfd3

meeseeksmachine mentioned this pull request Apr 4, 2025

Backport PR #5137 on branch yt-4.4.x (update testing.integrate_kernel type hints) #5148

Merged

