-
Hi @osylum, these numbers don't seem unreasonable. I think you might just have hit a system limit. Is there anything in particular that leads you to believe this can be sped up?
-
I see. Thanks, it is helpful to know that I have reached a limit. I was just unsure whether it was possible because of my lack of experience in rendering. I was comparing with NeRF, which can learn from 1k images and recover per-voxel color and density. It is not the same type of learning, of course, but I thought I could still learn the BSSDF as quickly. From my understanding, NeRF only needs to do ray casting for rendering; maybe that is why it is significantly faster - not sure though.
-
I was surprised that activating the cuda variant was not faster than using llvm. Maybe that is because of loading the data to the GPU. Should I nevertheless expect a (strong) speedup in this case?
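For reference, a minimal timing sketch for comparing the two variants (this is only a sketch, not my actual script: the built-in Cornell box stands in for my real scene, and the spp/iteration counts are arbitrary). The dr.sync_thread() calls are there so that the asynchronously queued Dr.Jit kernels have actually finished before the timer is read:

# Minimal variant-timing sketch (Cornell box used as a stand-in scene)
import time
import drjit as dr
import mitsuba as mi

def time_variant(variant, spp=256, n=10):
    mi.set_variant(variant)
    scene = mi.load_dict(mi.cornell_box())   # placeholder; the real scene would go here
    mi.render(scene, spp=spp)                # warm-up render to trigger kernel compilation
    dr.sync_thread()
    t0 = time.perf_counter()
    for _ in range(n):
        dr.eval(mi.render(scene, spp=spp))
    dr.sync_thread()                         # wait for queued kernels before reading the clock
    return (time.perf_counter() - t0) / n

for v in ['llvm_ad_rgb', 'cuda_ad_rgb']:
    print(v, f'{time_variant(v):.3f} s/render')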
-
I am first generating a dataset using traditional rendering. The dataset is created from 10 different models, variations in the 3 BSSDF parameters and rotations of the env map, giving a total of around 10k images. Then I use the dataset (part of it as the train set) for inverse rendering (pbrvolpath) to learn the BSSDF model parameters through a torch CNN. At each epoch I have to render the whole train set using the predicted BSSDF model parameters, which is expensive.
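For context, the per-image training step looks roughly like the sketch below. This is a simplified sketch, not my exact code: scene.xml, cnn, input_image, target and the optimizer are placeholders, the parameter keys are the ones from the script further down in this thread, and it assumes the torch-to-Dr.Jit bridge dr.wrap_ad is available in this Dr.Jit version.

# Simplified sketch of the per-image step (scene path, cnn and target are placeholders)
import drjit as dr
import mitsuba as mi
import torch

mi.set_variant('cuda_ad_rgb')

scene = mi.load_file('scene.xml')            # placeholder: one of the scenes from create_scene
params = mi.traverse(scene)
key_sigma_t = 'object.interior_medium.sigma_t.value.value'
key_albedo = 'object.interior_medium.albedo.value.value'
key_g = 'object.interior_medium.phase_function.g'

@dr.wrap_ad(source='torch', target='drjit')
def render_bssdf(sigma_t, albedo, g, spp=64):
    # gradients flow from the torch loss back to the CNN through this render;
    # the sigma_t/albedo keys may expect an mi.Color3f, so adjust to the actual types
    params[key_sigma_t] = sigma_t
    params[key_albedo] = albedo
    params[key_g] = g
    params.update()
    return mi.render(scene, params, spp=spp)

# inside the epoch loop (cnn, input_image, target, optimizer are hypothetical):
# sigma_t, albedo, g = cnn(input_image)
# img = render_bssdf(sigma_t, albedo, g)
# loss = torch.nn.functional.mse_loss(img, target)
# loss.backward(); optimizer.step()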
-
I believe I re-use the scene. I first define a list of scenes, one per model and env map rotation (maybe the latter can be changed with traverse as well). In the rendering loop I pick one scene from the list, traverse it, update the BSSDF parameters and render (mi.render(scene)). Here are some timings for rendering 11 images (a bust geo model):
[timings and rendered image for illustration]
Below is the part of the code I use for rendering. Enabling the line mi.render(scene_00) was faster for cuda compared to grabbing the scene from the scenes list, but still not significantly different from using llvm. I followed https://mitsuba.readthedocs.io/en/stable/src/rendering/editing_a_scene.html to re-use/update the scene, but maybe I am doing something suboptimal here in the re-use of the scene?

# imports assumed by this snippet; create_scene and datasets_path are defined elsewhere,
# and timer is assumed to be timeit.default_timer
import os
import numpy as np
import pyexr
import mitsuba as mi
from timeit import default_timer as timer

mi.set_variant('cuda_ad_rgb')  # variant selection not shown in the original snippet

render_resolution = [128, 128]
nsamples = 2048 # 1024 2048
# note: taking half because Mitsuba raises the sample count to the next power of two
nsamples_str = f'{nsamples*2}'
traintest = 'test' # train, test, all
dataset_path = os.path.join(datasets_path, f'{render_resolution[0]}x{render_resolution[1]}')
print(f'{dataset_path=}')
# define mesh models
meshmodels_train = ['armadillo', 'buddha', 'bun', 'cube', 'dragon', 'star_smooth'] # train models
meshmodels_test = ['bust', 'cap'] #['bunny', 'bust', 'cap', 'lucy', 'soap'] # test models
meshmodels_all = ['armadillo', 'buddha', 'bun', 'bunny', 'bust', 'cap', 'cube', 'dragon', 'lucy', 'soap', 'star_smooth']
numvertices = [43243, 49929, 23787, 34835, 18187, 158708, 8887, 50000, 50001, 1233, 1152]
if traintest == 'train':
    meshmodels = meshmodels_train
elif traintest == 'test':
    meshmodels = meshmodels_test
elif traintest == 'all':
    meshmodels = meshmodels_all
list_angle = [0., -30., -60., -90., -120., -150., -180.]
list_sigma_t = [30, 36, 44, 54, 67, 82, 100, 122, 150, 184, 225, 276]
list_albedo = [0.39, 0.59, 0.74, 0.87, 0.95]
list_g = [0., 0.2, 0.4, 0.6, 0.8]
dataset_params = [{
        'filename': f"{meshmodel}_e{angle}_d{sigma_t}_a{albedo}_g{g}_q{nsamples_str}.exr",
        'filename_weights': f"{meshmodel}_e{angle}_d{sigma_t}_a{albedo}_g{g}_q{nsamples_str}.exr",
        'filename_mask': f"{meshmodel}_mask.exr",
        'meshmodel': meshmodel,
        'angle': angle,
        'sigma_t': sigma_t,
        'albedo': albedo,
        'g': g
    }
    for meshmodel in meshmodels
    for angle in list_angle
    for sigma_t in list_sigma_t
    for albedo in list_albedo
    for g in list_g
]
print(f'len(dataset_params): {len(dataset_params)}')
dataset_numsamples = len(meshmodels) * len(list_angle) * len(list_sigma_t) * len(list_albedo) * len(list_g)
print(f'dataset_numsamples: {dataset_numsamples}')
print(f'dataset_params\n: {dataset_params[:10]}')
renders_path = os.path.join(dataset_path, 'renders')
if not os.path.exists(renders_path):
    os.makedirs(renders_path)
# create scene for each model
scenes = {}
for meshmodel in meshmodels:
    for angle in list_angle:  # FIXME: how to update angle
        scenes[meshmodel + '_' + str(angle)], _ = create_scene(meshmodel, angle,
            sigmaT=30., albedo=0.8, g=0.2,
            nsamples=nsamples, render_resolution=render_resolution,
            integrator_type='vol')
# access param and define keys to change params values
print('scenes keys:', scenes.keys())
scene_name = meshmodels[0] + '_' + str(list_angle[0])
print(f'{scene_name=}')
scene_00 = scenes[scene_name]
scene_params = mi.traverse(scene_00)
print(scene_params)
key_sigma_t = 'object.interior_medium.sigma_t.value.value'
key_albedo = 'object.interior_medium.albedo.value.value'
key_g = 'object.interior_medium.phase_function.g'
# export renders
meshmodel_prev = ""
start_time = timer()
for i, dataset_param in enumerate(dataset_params):
    if i > 10:  # REMOVE ME
        break
    meshmodel = dataset_param['meshmodel']
    angle = dataset_param['angle']
    sigma_t = dataset_param['sigma_t']
    albedo = dataset_param['albedo']
    g = dataset_param['g']
    #scene = scenes[meshmodel + '_' + str(angle)]
    scene_params = mi.traverse(scene_00)
    scene_params[key_sigma_t] = sigma_t
    scene_params[key_albedo] = albedo
    scene_params[key_g] = g
    scene_params.update()
    #img = mi.render(scene).numpy()
    img = mi.render(scene_00).numpy()
    filename = f"{meshmodel}_e{angle}_d{sigma_t}_a{albedo}_g{g}_q{nsamples_str}"
    filepath = os.path.join(renders_path, filename + '.exr')
    pyexr.write(filepath, img.astype(np.float32))
    elapsed_time = timer() - start_time
    print(f'wrote file {i}: {filename} (elapsed time: {elapsed_time}, per step: {elapsed_time / (i+1)})')
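Two things I would double-check in the loop above (a sketch under the same setup, not something I have benchmarked): mi.traverse is called once per iteration although the result could be reused, and the .numpy() conversion forces a synchronization and a device-to-host copy before pyexr writes the file. As far as I know, mi.util.write_bitmap accepts the rendered tensor directly and avoids that round trip.

# Sketch of a leaner version of the loop above (same behaviour, assuming the setup above)
scene_params = mi.traverse(scene_00)              # traverse once, outside the loop
start_time = timer()
for i, dataset_param in enumerate(dataset_params[:11]):
    scene_params[key_sigma_t] = dataset_param['sigma_t']
    scene_params[key_albedo] = dataset_param['albedo']
    scene_params[key_g] = dataset_param['g']
    scene_params.update()
    img = mi.render(scene_00)                     # keep the result as a Dr.Jit tensor
    filename = (f"{dataset_param['meshmodel']}_e{dataset_param['angle']}"
                f"_d{dataset_param['sigma_t']}_a{dataset_param['albedo']}"
                f"_g{dataset_param['g']}_q{nsamples_str}.exr")
    # write_bitmap takes the tensor directly, avoiding the explicit .numpy() copy
    mi.util.write_bitmap(os.path.join(renders_path, filename), img)
    elapsed_time = timer() - start_time
    print(f'wrote file {i}: {filename} (elapsed: {elapsed_time:.2f}s, per step: {elapsed_time / (i + 1):.2f}s)')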
-
Yes, the last line should be scene_00. I have corrected it.
-
Summary
As a target, I would like to learn an SVBSSDF on 50k images at 1k resolution, but currently my volumetric renders are too slow (even without counting the other operations during training). I don't know whether I am reaching a hard limit in rendering time, or whether I am not setting things up correctly. Could you please help me with that?
System configuration
OS: Windows-10
CPU: Intel64 Family 6 Model 165 Stepping 5, GenuineIntel
GPU: NVIDIA RTX A4000
Python: 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
NVidia driver: 517.40
CUDA: 10.0.130
LLVM: 15.-1.-1
Dr.Jit: 0.4.0
Mitsuba: 3.2.0
Is custom build? False
Compiled with: MSVC 19.34.31937.0
Variants:
scalar_rgb
scalar_spectral
cuda_ad_rgb
llvm_ad_rgb
Description
I am rendering 128x128 images using volpath (or pbrvolpath). Currently it takes on the order of 1 s per image: a dataset of 10'000 images takes me about a day, and learning from half of the dataset for 50 epochs takes about 20 days. This is already a lot for iterating on a solution - my training does not work yet. If I now want to render 1k-resolution (or at the very least 512-resolution) images and have 50'000 of them, the renders are far too slow (see the rough estimate below).
How can I render much faster?
Thank you
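For scale, a back-of-envelope estimate, assuming the render time grows linearly with the pixel count at fixed spp:

# rough scaling estimate (assumption: cost is linear in pixel count at fixed spp)
t_per_image = 1.0                    # ~1 s per 128x128 render, as observed
pixel_scale = (1024 / 128) ** 2      # 64x more pixels at 1k resolution
n_images = 50_000
days = t_per_image * pixel_scale * n_images / 86400
print(f'{days:.0f} days for one full pass over the dataset')   # ~37 days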
Steps to reproduce
The scene is shown below. Meshes have at most around 50k vertices. I can create a minimal example if necessary, but I would have to provide the model and env map as well.