Two RNGs of different dimensions in a single mi.Loop instance #573
Question

Hello, I'd like to use two random number generators of different sizes inside a single rendering loop:

lhs_rng = mi.PCG32(size=dr.prod(image_res))
M = 16  # the number of samples required to estimate each rendering integral
rhs_rng = mi.PCG32(size=dr.prod(image_res) * M)

In addition, I'd like to iterate this sampling process multiple times to accumulate results or train a neural network over many iterations. However, the code below runs into two errors:

RuntimeError: jit_var_loop_init(): loop state variables have an inconsistent size (262144 vs 16384)!
RuntimeError: jit_var_gather(): cannot gather from a placeholder variable!

Any comment on resolving these problems is welcome.

Example Code

import drjit as dr
import mitsuba as mi
mi.set_variant("llvm_ad_rgb")
dr.set_log_level(dr.LogLevel.Info)
scene = mi.load_dict(mi.cornell_box())
# Camera origin in world space
cam_origin = mi.Point3f(0, 1, 3)
# Camera view direction in world space
cam_dir = dr.normalize(mi.Vector3f(0, -0.5, -1))
# Camera width and height in world space
cam_width = 2.0
cam_height = 2.0
# Image pixel resolution
image_res = [256, 256]
# Construct a grid of 2D coordinates
x, y = dr.meshgrid(
    dr.linspace(mi.Float, -cam_width / 2, cam_width / 2, image_res[0]),
    dr.linspace(mi.Float, -cam_height / 2, cam_height / 2, image_res[1]),
)
# Ray origin in local coordinates
ray_origin_local = mi.Vector3f(x, y, 0)
# Ray origin in world coordinates
ray_origin = mi.Frame3f(cam_dir).to_world(ray_origin_local) + cam_origin
ray = mi.Ray3f(o=ray_origin, d=cam_dir)
si = scene.ray_intersect(ray)
ambient_range = 0.75
ambient_ray_count = 256
# Initialize the random number generator
rng = mi.PCG32(size=dr.prod(image_res))
############################Start###############################
M = 16
integral_rng = mi.PCG32(size=dr.prod(image_res) * M)
############################End################################
# Loop iteration counter
i = mi.UInt32(0)
# Accumulated result
result = mi.Float(0)
# Initialize the loop state (listing all variables that are modified inside the loop)
############################Start###############################
loop = mi.Loop(name="", state=lambda: (rng, integral_rng, i, result))
############################End################################
while loop(si.is_valid() & (i < ambient_ray_count)):
    # 1. Draw some random numbers
    sample_1, sample_2 = rng.next_float32(), rng.next_float32()
    # 2. Compute directions on the hemisphere using the random numbers
    wo_local = mi.warp.square_to_uniform_hemisphere([sample_1, sample_2])
    # Alternatively, we could also sample a cosine-weighted hemisphere
    # wo_local = mi.warp.square_to_cosine_hemisphere([sample_1, sample_2])
    # 3. Transform the sampled directions to world space
    wo_world = si.sh_frame.to_world(wo_local)
    # 4. Spawn a new ray starting at the surface interactions
    ray_2 = si.spawn_ray(wo_world)
    # 5. Set a maximum intersection distance to only account for the close-by geometry
    ray_2.maxt = ambient_range
    # 6. Accumulate a value of 1 if not occluded (0 otherwise)
    result[~scene.ray_test(ray_2)] += 1.0
    ############################Start###############################
    indices = dr.arange(mi.UInt, 0, dr.prod(image_res))
    indices = dr.repeat(indices, M)
    si = dr.gather(type(si), si, indices)
    sample_3, sample_4 = integral_rng.next_float32(), integral_rng.next_float32()
    wi = mi.warp.square_to_uniform_hemisphere([sample_3, sample_4])
    ray_3 = si.spawn_ray(si.sh_frame.to_world(wi))
    ray_3.maxt = ambient_range
    hit = scene.ray_test(ray_3)
    hit = hit.torch().reshape(dr.prod(image_res), M).sum(dim=1)
    result += mi.Float(hit)
    ############################End################################
    # 7. Increase loop iteration counter
    i += 1
# Divide the result by the number of samples
result = result / ambient_ray_count / (1 + M)
image = mi.TensorXf(result, shape=image_res)
import matplotlib.pyplot as plt
plt.imsave("out.png", image, cmap="gray")
plt.axis("off")
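The first RuntimeError comes from the recorded loop requiring every state variable to share one vectorization width; note that the sizes in the message (16384 and 262144 = 16384 * 16) correspond to image_res = [128, 128] rather than the [256, 256] in the listing above. A toy pure-Python model of that check (a hypothetical function, not Dr.Jit's actual implementation) reproduces the message:

```python
def check_loop_state(**state):
    # Toy model of the width check that a recorded loop performs on its
    # state variables: every one of them must have the same width.
    widths = sorted({len(v) for v in state.values()}, reverse=True)
    if len(widths) > 1:
        raise RuntimeError(
            "loop state variables have an inconsistent size ("
            + " vs ".join(str(w) for w in widths) + ")!")
    return widths[0]

n_pixels, M = 128 * 128, 16                  # sizes matching the reported error
rng_state = [0] * n_pixels                   # stand-in for the width-N PCG32
integral_rng_state = [0] * (n_pixels * M)    # width N*M: inconsistent

try:
    check_loop_state(rng=rng_state, integral_rng=integral_rng_state)
except RuntimeError as err:
    print(err)  # -> loop state variables have an inconsistent size (262144 vs 16384)!
```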
Hi @Mephisto405

Yes, this is a hard limitation of recorded loops. Every single variable in the loop must have the same size (vectorization width). Think of the loop as a parallel computation in which each thread runs a 1-wide version of the loop. You cannot suddenly have some piece of code in your loop that requires a larger width.

However, I believe you can add a nested loop by doing something like this:
M = 16
j = dr.zeros(mi.UInt32, shape=dr.prod(image_res))
integral_rng = mi.PCG32(size=dr.prod(image_res))
inner_result = mi.Float(0)
inner_si = mi.SurfaceInteraction3f(si)
loop2 = mi.Loop(name="", state=lambda: (integral_rng, j, inner_result))
while loop2(inner_si.is_valid() & (j < M)):
    sample_3, sample_4 = integral_rng.next_float32(), integral_rng.next_float32()
    wi = mi.warp.square_to_uniform_hemisphere([sample_3, sample_4])
    ray_3 = inner_si.spawn_ray(inner_si.sh_frame.to_world(wi))
    ray_3.maxt = ambient_range
    hit = scene.ray_test(ray_3)
    value = dr.select(hit, 1, 0)
    inner_result += mi.Float(value)
    j += 1
result += inner_result
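This nested loop computes the same per-pixel quantity as the reshape-and-sum the question attempted, only at a constant width of one lane per pixel. A pure-Python sketch of that equivalence (illustrative sizes; in the real renderer the two variants would consume different random streams, so they agree as estimators rather than bit-for-bit):

```python
import random

n_pixels, M = 4, 16
random.seed(0)

# One occlusion test per (pixel, sample) pair, flattened pixel-major,
# as the question's width-(N*M) RNG would produce them.
hits = [random.random() < 0.5 for _ in range(n_pixels * M)]

# Wide reduction: what hit.reshape(N, M).sum(dim=1) computes.
wide = [sum(hits[p * M:(p + 1) * M]) for p in range(n_pixels)]

# Nested loop at constant width N: M iterations, each consuming one
# sample per pixel and accumulating into inner_result (the j-loop above).
inner_result = [0] * n_pixels
for j in range(M):
    for p in range(n_pixels):
        inner_result[p] += hits[p * M + j]

assert wide == inner_result
```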
(...) A note on this line:

hit = hit.torch().reshape(dr.prod(image_res), M).sum(dim=1)

Any conversion from mitsuba/drjit to some other framework like PyTorch (the .torch() call here) forces an evaluation of the variable. Inside a recorded loop the state only exists as placeholder variables, which cannot be evaluated or gathered from; this is the source of the second RuntimeError.
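If the M-sample pass is restructured to run outside the recorded loop, the per-pixel reduction can stay on the Dr.Jit side as a block-wise sum over contiguous blocks of a wide array (Dr.Jit provides dr.block_sum for this), with no framework conversion at all. A pure-Python model of what that reduction computes:

```python
def block_sum(values, block_size):
    # Pure-Python model of a block-wise reduction over a flat array
    # (what dr.block_sum performs on wide Dr.Jit arrays): sum each
    # contiguous run of `block_size` entries into one output entry.
    assert len(values) % block_size == 0
    return [sum(values[i:i + block_size])
            for i in range(0, len(values), block_size)]

# M = 4 occlusion samples for each of 2 pixels, flattened pixel-major:
hits = [1, 0, 1, 1,   0, 0, 1, 0]
print(block_sum(hits, 4))  # -> [3, 1]
```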