This project aims to reduce inference time while maintaining performance for monocular depth estimation. Report
- Added `PixelFormer_new.py` to accommodate the new attention mechanisms.
- Added `SAM_cosine.py`, which replaces the dot-product window attention with cosine-similarity window attention.
- Replaced the 7×7 window attention used in the original work with the two attention mechanisms below, which fuse encoder and decoder features with global context and improve on the baseline model's performance.
- Added `SAM_efficient.py`, which implements Efficient Attention inside the skip attention module.
- Added `SAM_fast.py`, which implements FAVOR+ attention inside the skip attention module.
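The cosine-similarity variant differs from standard window attention only in how the logits are computed: queries and keys are L2-normalized so the scores are bounded cosine similarities, scaled by a temperature. This is a minimal NumPy sketch of that idea, not the code in `SAM_cosine.py`; the 49 tokens correspond to one 7×7 window, and the fixed temperature `tau` is an illustrative stand-in for what would be a learnable parameter in practice.

```python
import numpy as np

def dot_product_attention(q, k, v):
    # Standard scaled dot-product attention (the original window attention).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def cosine_attention(q, k, v, tau=0.1):
    # Cosine-similarity attention: L2-normalize queries and keys so logits
    # are cosine similarities in [-1, 1], then divide by a temperature tau
    # (learnable in practice; fixed here for illustration).
    qn = q / np.linalg.norm(q, axis=-1, keepdims=True)
    kn = k / np.linalg.norm(k, axis=-1, keepdims=True)
    scores = (qn @ kn.T) / tau
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 49, 32))  # 49 tokens = one 7x7 window
out = cosine_attention(q, k, v)
print(out.shape)  # (49, 32)
```

Bounding the logits this way keeps the softmax from saturating when feature magnitudes grow, which is the usual motivation for swapping dot products for cosine similarity.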
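Both global mechanisms avoid forming the full n×n attention matrix by associating the key-value product first, which is what makes them cheaper than windowed softmax attention at large resolutions. The sketch below is a simplified NumPy illustration of the two ideas, not the implementations in `SAM_efficient.py` or `SAM_fast.py`; the shapes and the random-feature count `m` are arbitrary choices for the example.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def efficient_attention(q, k, v):
    # Efficient Attention: softmax over the feature axis of Q and the token
    # axis of K, then compute (K^T V) first. Cost drops from O(n^2 d) to
    # O(n d^2) -- linear in the token count n.
    qs = softmax(q, axis=-1)   # (n, d): each row sums to 1
    ks = softmax(k, axis=0)    # (n, d): each column sums to 1
    return qs @ (ks.T @ v)     # (n, d_v) without an n x n matrix

def favor_plus_attention(q, k, v, m=256, seed=0):
    # FAVOR+ (Performer-style): approximate the softmax kernel with positive
    # random features phi(x) = exp(w.x - |x|^2 / 2) / sqrt(m).
    d = q.shape[-1]
    q, k = q / d ** 0.25, k / d ** 0.25     # split the usual 1/sqrt(d) scaling
    w = np.random.default_rng(seed).normal(size=(m, d))
    phi = lambda x: np.exp(x @ w.T - 0.5 * (x ** 2).sum(-1, keepdims=True)) / np.sqrt(m)
    qp, kp = phi(q), phi(k)                 # (n, m) non-negative features
    num = qp @ (kp.T @ v)                   # numerator, again linear in n
    den = qp @ kp.sum(axis=0)[:, None]      # per-token softmax normalizer
    return num / den

rng = np.random.default_rng(1)
q, k, v = rng.normal(size=(3, 49, 32))
print(efficient_attention(q, k, v).shape)   # (49, 32)
print(favor_plus_attention(q, k, v).shape)  # (49, 32)
```

Efficient Attention changes the attention semantics (it is not an approximation of softmax attention), while FAVOR+ is an unbiased estimator of it whose variance shrinks as `m` grows; that trade-off is why the two are worth comparing as skip-attention replacements.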
- Download all the models from this link and place them in the `pretrained` folder.
- Environment setup is described in `PixelFormer.ipynb`.
- By default, the code is configured to train and evaluate the variant that uses Efficient Attention.
- Changes to the training config can be made in `configs/arguments_train_nyu.txt`.
- Run the following command to train the model on the NYU Depth V2 dataset.

  ```
  python pixelformer/train.py configs/arguments_train_nyu.txt
  ```
- Run the following two commands for evaluation and testing.

  ```
  python pixelformer/eval.py configs/arguments_eval_nyu.txt
  python pixelformer/test.py configs/arguments_test_nyu.txt
  ```
The code utilized in this project has been adapted from the PixelFormer repository. For a comprehensive view of the entire architecture, please refer to the original repository.