Replies: 2 comments
-
Hi @BDoignies
There is nothing specific to samplers, as far as I know. In short, Dr.Jit will "read"/"trace" through the entire path tracer source code with a single thread. Once it has established the computation graph that will be compiled into LLVM, it executes it in parallel with many threads.
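A minimal sketch of that trace-then-execute model (the variable names and array size here are illustrative, not from your code):

import drjit as dr
from drjit.llvm import Float

x = dr.arange(Float, 1_000_000)  # symbolic: nothing is computed yet
y = dr.sqrt(x) * 2.0             # still single-threaded graph construction
dr.eval(y)                       # the fused kernel is compiled and run by many threads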
I think this is mostly because we were trying to be cautious. We've never tried it ourselves and haven't fully thought through all the implications of trying to differentiate the samples themselves.
We've been aware of issues in the task (thread pool) manager for a while, but have never figured out the root cause. If you're willing to share more of your code, we might finally be able to narrow down this long-standing issue.
That is one option. The easiest would be to replace the line you found in the …
-
Thanks for your reply! I will check later with finite-difference methods whether the magnitude and sign of the gradients are correct. It is indeed true that convergence seems very slow (even with a higher learning rate), but I thought that was just because the targets have very small gradients. I followed the tutorial here; if you spot anything wrong, please let me know.

import mitsuba as mi
import drjit as dr
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm

# Some parameters for the sampler, global
# so they can be shared outside the class
MAX_SAMPLES = 22
SAMPLE_NAMES = [f"samples_{i}" for i in range(MAX_SAMPLES)]

mi.set_variant('llvm_ad_rgb')

def get_ref(spp=128):
    # Render a reference image of the Cornell box
    scene = mi.cornell_box()
    mscene = mi.load_dict(scene)
    return mi.render(mscene, spp=spp)

def mse(img, ref):
    return dr.mean(dr.sqr(img - ref))

class SamplerOptim(mi.Sampler):
    def __init__(self, props):
        super().__init__(props)
        self.shape = (1, MAX_SAMPLES)
        # Instantiate variables independently
        self.samples = []
        # Start in the middle of the cube (arbitrary)
        for _ in range(np.prod(self.shape)):
            self.samples.append(mi.Float(0.5))
        self.sample_id = 0

    def next_1d(self: mi.Sampler, active: bool = True) -> mi.Float:
        idx = self.sample_id
        self.sample_id = (self.sample_id + 1) % self.shape[1]
        return self.samples[idx]

    def next_2d(self: mi.Sampler, active: bool = True) -> mi.Point2f:
        x = self.next_1d(active)
        y = self.next_1d(active)
        return mi.Point2f(x, y)

    def traverse(self, callback):
        # Trick to enable differentiability
        f_before = callback.flags
        callback.flags = int(mi.ParamFlags.Differentiable)
        for i, sp in enumerate(self.samples):
            callback.put_parameter(SAMPLE_NAMES[i], sp,
                                   mi.ParamFlags.Differentiable | mi.ParamFlags.Discontinuous)
        # Restore flags
        callback.flags = f_before

mi.register_sampler("sampler_optim", lambda props: SamplerOptim(props))

ref = get_ref()

scene = mi.cornell_box()
scene["sensor"]["sampler"]["type"] = "sampler_optim"
scene["integrator"]["max_depth"] = 2
mscene = mi.load_dict(scene)

params = mi.traverse(mscene)
keys = ["sensor.sampler." + sname for sname in SAMPLE_NAMES]
params.keep(keys)

# Optimizer
opt = mi.ad.Adam(lr=1e-4)
for k in keys:
    opt[k] = params[k]
    dr.enable_grad(params[k])
params.update(opt)

image = mi.render(mscene, params, spp=1)
bimg = mi.Bitmap(image)
mi.util.write_bitmap("init.png", bimg)
# mi.util.write_bitmap("init.exr", bimg)

errors = []
for it in tqdm(range(2500)):
    image = mi.render(mscene, params, spp=1)
    loss = mse(image, ref)
    dr.backward(loss)
    opt.step()
    # Keep the samples inside the unit cube
    for k in keys:
        opt[k] = dr.clamp(opt[k], 0.0, 1.0)
    params.update(opt)
    errors.append(loss[0])  # store a plain float for plotting/saving

    # Save regularly because of crashes
    if it % 50 == 0:
        # Retrieve sample values
        out_samples = []
        for k in keys:
            out_samples.append(opt[k])
        # print(errors[0], errors[-1])
        bimg = mi.Bitmap(image)
        plt.plot(errors)
        plt.savefig("errors.png")
        plt.close()
        np.savetxt("errors.txt", errors)
        np.savetxt("samples.txt", out_samples)
        mi.util.write_bitmap("out.png", bimg)
        # mi.util.write_bitmap("out.exr", bimg)
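For the finite-difference check mentioned at the top of this reply, a minimal sketch could look like the following (the fd_grad helper and the eps value are hypothetical; it reuses mscene, params, keys, ref and mse from the listing above):

def fd_grad(key, eps=1e-3, spp=1):
    # Central finite difference of the loss w.r.t. one sample parameter
    base = mi.Float(params[key])  # copy the current value
    params[key] = base + eps
    params.update()
    loss_p = mse(mi.render(mscene, spp=spp), ref)[0]
    params[key] = base - eps
    params.update()
    loss_m = mse(mi.render(mscene, spp=spp), ref)[0]
    params[key] = base  # restore the original value
    params.update()
    return (loss_p - loss_m) / (2 * eps)

# Compare against the AD gradient, e.g. for the first sample:
# print(fd_grad(keys[0]))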
-
Hello,
I've been playing with Mitsuba 3 (in Python) lately and I'm interested in the differentiable aspect of the renderer. In particular, I'm interested in differentiation with respect to samples. Fortunately, the code allows you to register custom plugins (samplers), override functionality, and register parameters. Unfortunately for me, however, differentiation with respect to any sampler parameter is disabled (if I'm not mistaken, this is the line that disables it).
It is possible to trick the system, and here is a simple, minimal plugin (for a single-spp sampler) I used:
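(The full listing is in my reply above; the core of the trick, condensed here for reference, is the traverse() override that temporarily forces the callback flags so the sample parameters are exposed as differentiable:)

def traverse(self, callback):
    # Temporarily force the callback to accept differentiable parameters
    f_before = callback.flags
    callback.flags = int(mi.ParamFlags.Differentiable)
    for i, sp in enumerate(self.samples):
        callback.put_parameter(SAMPLE_NAMES[i], sp,
                               mi.ParamFlags.Differentiable | mi.ParamFlags.Discontinuous)
    # Restore flags
    callback.flags = f_before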
And... surprisingly, it roughly works. With a simple MSE loss it changes the sample positions and reduces the loss. But sometimes the code crashes. The exact error is: "Assertion failed in /project/ext/drjit-core/ext/nanothread/src/queue.cpp:354: remain == 1", which I think is due to multi-threading. I do not know whether I am using the plugin correctly, but forcing some sample values in the initialisation loop gave coherent results.
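If the assertion is indeed a thread-pool race, one way to test that hypothesis, assuming your Dr.Jit version exposes this function (an assumption on my part, not something confirmed in this discussion), would be to shrink the worker pool before rendering:

import drjit as dr
# Assumption: dr.set_thread_count() is available in your Dr.Jit version.
# Serializes kernel execution; slow, but isolates multi-threading issues.
dr.set_thread_count(1)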
I know this is hacky, and to be honest, I did not expect the code to work at all. But it seems to work, at least partially. So I have a few questions:
I have not been able to dig into the code and answer these questions myself (especially Q2/Q3). If any information is missing, please let me know.
Thank you!
I'm sorry if I've missed any previous discussion on this. A search with the keywords "sample"/"sampler" did not return any similar problems. Here is some system information if it helps: