Training Error #37

@shilpa-ullas

Hi @zhyever ,

I recently tried out the PatchFusion model using this repo.
I am now trying to run the training script to train a model myself.

Following the training steps in [https://github.com/zhyever/PatchFusion/blob/main/docs/user_training.md], I was able to run coarse and fine model training for the depth_anything_vitb model.
However, I get the error below when running training for the fusion model.

rank0: File "PatchFusion/estimator/trainer/trainer.py", line 326, in run
rank0: self.train_epoch(epoch_idx)
rank0: File "PatchFusion/estimator/trainer/trainer.py", line 250, in train_epoch
rank0: self.optimizer_wrapper.update_params(total_loss)
rank0: File "lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 196, in update_params
rank0: self.backward(loss)
rank0: File "lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 220, in backward
rank0: loss.backward(**kwargs)
rank0: File "lib/python3.8/site-packages/torch/_tensor.py", line 525, in backward
rank0: torch.autograd.backward(
rank0: File "lib/python3.8/site-packages/torch/autograd/__init__.py", line 260, in backward
rank0: grad_tensors_ = _make_grads(tensors, grad_tensors, is_grads_batched=False)
rank0: File "lib/python3.8/site-packages/torch/autograd/__init__.py", line 133, in _make_grads
rank0: raise RuntimeError(
rank0: RuntimeError: grad can be implicitly created only for scalar outputs
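
For context, PyTorch raises this message when `backward()` is called on a tensor that has more than one element. The sketch below reproduces it outside PatchFusion with a toy loss. It assumes the fusion `total_loss` reaches `update_params` as a non-scalar tensor, which I have not confirmed from the trainer code. `to_scalar_loss` is a hypothetical helper, not something in this repo or in MMEngine.

```python
import torch


def to_scalar_loss(loss: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: reduce a non-scalar loss to a 0-dim tensor."""
    # A 0-dim tensor is already a valid target for backward().
    return loss if loss.dim() == 0 else loss.mean()


pred = torch.ones(4, requires_grad=True)
target = torch.zeros(4)

# Per-element squared error: shape (4,), so it is not a scalar.
per_element = (pred - target) ** 2

try:
    # Fails before any gradient is computed, so the graph stays usable.
    per_element.backward()
    error_message = ""
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)

# Reducing to a scalar first lets backward() run.
scalar = to_scalar_loss(per_element)
scalar.backward()
print(pred.grad)
```

If this is the cause, the shape of `total_loss` right before `update_params` would show it. Whether a mean is the right reduction for the fusion loss is a separate question.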

Could you please advise on this?
Does anything in the script need to be modified?

One more question: is there an ONNX/TensorRT version, or any other deployment version, of PatchFusion?
