Recorded loop over points #970

tomas16 · 2023-11-09T02:42:59Z


Hi, I'm developing an algorithm in which I have a set of N points (Nx3 numpy array input, converted to Point3f). I need to loop over these points, and for each one of them spawn a set of R rays and get the first intersection. Along with each point, I have a linear array of N integers, belonging to each of the N points.

It's similar to having an outer loop over pixels (in this case my points) and an inner loop over rays.

I can use a regular python for loop, but I was wondering if there's a way to use a recorded loop for this. The issue I have is that from the point of view of the input points, N is the width of the data. However, the actual width is R, as I want to process one point at a time in a loop.

I've been playing around with this and made some observations:

- Point3f can only be indexed with a Python int or a Mitsuba scalar int. But indexing gives me the X, Y or Z coordinate for all the points. 2D indexing is possible, but not slicing, so I'd end up with something like origin = mi.Point3f(scoords[0, iteration], scoords[1, iteration], scoords[2, iteration])
- The loop construct doesn't seem to work with scalar integers (ruling out the Point3f indexing)

I'm starting to think this isn't what the loop is intended for. Should I basically have a python loop over my original numpy array?

Related question: it's unclear to me what becomes part of the compiled kernel, specifically wrt. the inputs. For instance, if I use a python loop in the above, do I compile a kernel for each iteration? Or can mitsuba somehow figure out that it's the same kernel with different inputs? Or, more likely, does it become one giant kernel with all the input data baked into it?

TL;DR: looking for some guidance around when to use recorded vs regular loops.

Answered by njroussel

njroussel · 2023-11-09T15:39:18Z

njroussel
Nov 9, 2023
Collaborator

Hi @tomas16

I'm starting to think this isn't what the loop is intended for. Should I basically have a python loop over my original numpy array?

Indeed, if I understood your problem correctly, you don't need a loop at all. One way to think about writing Mitsuba/Dr.Jit code, is that every variable and operation is vectorized. We are basically only doing operations on arrays.
The most efficient implementation of your setup would be something like this:

n_points = get_points()  # placeholder helper; dr.shape(n_points) == [3, N]
repeated_points = dr.repeat(n_points, R)  # dr.shape(repeated_points) == [3, N * R]
rays = spawn_ray(repeated_points)  # placeholder helper; dr.width(rays) == N * R
si = scene.ray_intersect(rays)

The idea here is to "unroll" the loop into array operations. By doing this, we gain parallelism.
Of course, this isn't always possible. In a path tracer, for example, you will want to repeat this ray intersection routine for a maximum of M bounces. In that case, you need a loop.
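
To make the flattened layout concrete, here is a minimal NumPy sketch rather than Dr.Jit code. It assumes the R directions are shared by all points, and it replaces scene.ray_intersect with a hypothetical stand-in (distance to the plane z = 0). The helper name flatten_points_and_rays is made up for illustration. np.repeat orders elements the same way as dr.repeat (each point R times in a row), and np.tile plays the role of dr.tile.

```python
import numpy as np


def flatten_points_and_rays(points, directions):
    """Pair every point with every direction in one flat batch of N * R rays."""
    n, r = len(points), len(directions)
    origins = np.repeat(points, r, axis=0)     # p0, p0, ..., p1, p1, ...
    dirs = np.tile(directions, (n, 1))         # d0, d1, ..., d0, d1, ...
    point_index = np.repeat(np.arange(n), r)   # input point that owns each ray
    return origins, dirs, point_index


# N = 2 points, R = 3 unit directions (illustrative values only).
points = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 2.0]])
directions = np.array([[0.0, 0.0, -1.0], [0.0, 0.6, -0.8], [0.6, 0.0, -0.8]])

origins, dirs, point_index = flatten_points_and_rays(points, directions)

# Stand-in for the ray intersection: distance along each ray to the plane z = 0.
t = -origins[:, 2] / dirs[:, 2]

# Per-point results are recovered by viewing the flat batch as (N, R).
nearest_per_point = t.reshape(len(points), len(directions)).min(axis=1)
```

The point_index array is also what you would use to look up the per-point integers mentioned in the question, since it maps each of the N * R rays back to its source point.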

Related question: it's unclear to me what becomes part of the compiled kernel, specifically wrt. the inputs. For instance, if I use a python loop in the above, do I compile a kernel for each iteration? Or can mitsuba somehow figure out that it's the same kernel with different inputs? Or, more likely, does it become one giant kernel with all the input data baked into it?

If you haven't already, I'd recommend going through this introduction to Dr.Jit. It goes over kernel caching and recorded loops.
In short, Mitsuba/Dr.Jit will keep everything in a single kernel for as long as possible. When a variable needs to be evaluated, a kernel is launched. You can log these launches to get a better understanding of whether your code is being merged, cached, or reused across one or many kernels.
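
As a rough mental model of "record first, launch on evaluation, reuse cached kernels", here is a toy pure-Python sketch. It is not Dr.Jit's implementation and uses none of its API; the ToyJit and Lazy classes are hypothetical. It only shows why a Python loop that records the same operations on different input data does not have to compile a new kernel every iteration.

```python
class ToyJit:
    """Toy model of trace-then-launch. Not Dr.Jit's actual implementation."""

    def __init__(self):
        self.kernel_cache = {}  # recorded trace -> "compiled" function
        self.compiles = 0
        self.launches = 0

    def evaluate(self, lazy):
        key = lazy.trace  # the recorded operations, without the input data
        if key not in self.kernel_cache:
            self.compiles += 1
            self.kernel_cache[key] = self._compile(key)
        self.launches += 1
        kernel = self.kernel_cache[key]
        return [kernel(x) for x in lazy.data]

    @staticmethod
    def _compile(trace):
        def kernel(x):
            for op, operand in trace:
                x = x + operand if op == "add" else x * operand
            return x
        return kernel


class Lazy:
    """Array whose operations are only recorded until evaluation is forced."""

    def __init__(self, data, trace=()):
        self.data = data
        self.trace = trace

    def __add__(self, c):
        return Lazy(self.data, self.trace + (("add", c),))

    def __mul__(self, c):
        return Lazy(self.data, self.trace + (("mul", c),))


jit = ToyJit()
results = []
for chunk in ([1, 2], [3, 4], [5, 6]):
    y = Lazy(chunk) * 2 + 1           # recorded only, nothing computed yet
    results.append(jit.evaluate(y))   # forces a "kernel launch"
```

In this toy, the three iterations cause three launches but only one compile, because the recorded trace is identical each time. The constants are part of the trace, so changing one per iteration would produce a different key and a new compile.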

2 replies

tomas16 Nov 10, 2023
Author

Thanks, that worked.
I've read all the docs multiple times over, but I still get confused 😉

One more question around this though: what if N*R is so large it doesn't fit in memory? Would you use a python for loop in that case?

The 2nd phase of this algorithm needs to trace a bunch of rays with multiple bounces, starting from every triangle of the mesh. So I may run into a similar problem, where now N = number of faces in the mesh, R = number of rays per face, and a recorded loop over the bounces. When N is very large, I may need an outer (python?) loop to make things fit in memory.

njroussel Nov 10, 2023
Collaborator

The only variables that are evaluated, and therefore need to be in memory are the inputs and outputs to a kernel. Every variable in between those will only be in registers.

For example, if I render a 1920 x 1080 image with 4192 spp, I will have 1920 * 1080 * 4192 rays. Even if I somehow managed to pack a ray into 4 bytes, all of those rays would consume more than 32 GB. This would never fit in VRAM. However, because this is a "temporary" variable between the input (scene) and the output (rendered image), it will only be stored in thread registers.
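
The arithmetic behind that estimate can be checked in a few lines of Python. The 4 bytes per ray is the deliberately optimistic figure from the example above, not a real ray size.

```python
width, height, spp = 1920, 1080, 4192
bytes_per_ray = 4  # optimistic figure from the example, not a real ray layout

n_rays = width * height * spp
total_bytes = n_rays * bytes_per_ray

print(f"{n_rays:,} rays")
print(f"{total_bytes / 1e9:.2f} GB ({total_bytes / 2**30:.2f} GiB)")
```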

This sounds very nice. However, in practice we sometimes have no other option than to evaluate some very wide variable. In our previous example, let's say I wanted to print the intersection points of all of my rays. That variable must then be written to memory, which will indeed be a problem.
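
If a wide result really does have to be materialized, one option is the outer Python loop over chunks of points that the question proposes, so that only a bounded number of rays exists at once. The following is a NumPy sketch of that idea under stated assumptions, not a Mitsuba recipe: the function name, the max_rays budget, and the plane z = 0 standing in for the ray intersection are all hypothetical.

```python
import numpy as np


def nearest_hit_chunked(points, directions, max_rays):
    """Process points in chunks so that at most max_rays rays exist at once."""
    r = len(directions)
    chunk = max(1, max_rays // r)        # number of points per chunk
    out = np.empty(len(points))
    peak_rays = 0
    for start in range(0, len(points), chunk):
        block = points[start:start + chunk]
        origins = np.repeat(block, r, axis=0)
        dirs = np.tile(directions, (len(block), 1))
        # Stand-in for the ray intersection: distance to the plane z = 0.
        t = -origins[:, 2] / dirs[:, 2]
        out[start:start + len(block)] = t.reshape(len(block), r).min(axis=1)
        peak_rays = max(peak_rays, len(t))
    return out, peak_rays


# 5 points at heights 1..5, 3 unit directions, budget of 6 rays per chunk.
points = np.array([[0.0, 0.0, float(z)] for z in range(1, 6)])
directions = np.array([[0.0, 0.0, -1.0], [0.0, 0.6, -0.8], [0.6, 0.0, -0.8]])
nearest, peak_rays = nearest_hit_chunked(points, directions, max_rays=6)
```

Only the small per-point output grows with N here. The N * R intermediate never exists in full, at the cost of one pass through the loop body per chunk.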
