Commit ed750c2

author
Fangchang Ma
committed
fixed minor bug in dataloaders; updated README.
1 parent f45eb79 commit ed750c2

4 files changed, with 35 additions and 30 deletions.

README.md

Lines changed: 30 additions & 23 deletions
@@ -1,21 +1,21 @@
 sparse-to-dense.pytorch
 ============================

-This repo implements the training and testing of deep regression neural networks for ["Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image"](https://arxiv.org/pdf/1709.07492.pdf) by [Fangchang Ma](http://www.mit.edu/~fcma) and [Sertac Karaman](http://karaman.mit.edu/) at MIT. A video demonstration is available on [YouTube](https://youtu.be/vNIIT_M7x7Y).
+This repo implements the training and testing of deep regression neural networks for ["Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image"](https://arxiv.org/pdf/1709.07492.pdf) by [Fangchang Ma](http://www.mit.edu/~fcma) and [Sertac Karaman](http://karaman.mit.edu/) at MIT. A video demonstration is available on [YouTube](https://youtu.be/vNIIT_M7x7Y).
 <p align="center">
 <img src="http://www.mit.edu/~fcma/images/ICRA2018.png" alt="photo not available" width="50%" height="50%">
 <img src="https://j.gifs.com/Z4qDow.gif" alt="photo not available" height="50%">
 </p>

-This repo can be used for training and testing of
+This repo can be used for training and testing of
 - RGB (or grayscale image) based depth prediction
 - sparse depth based depth prediction
 - RGBd (i.e., both RGB and sparse depth) based depth prediction

-The original Torch implementation of the paper can be found [here](https://github.com/fangchangma/sparse-to-dense). This PyTorch version is under development and is subject to major modifications in the future.
+The original Torch implementation of the paper can be found [here](https://github.com/fangchangma/sparse-to-dense).

 ## Thanks
-Thanks to [Tim](https://github.com/timethy) and [Akari](https://github.com/AkariAsai) for their contribution.
+Thanks to [Tim](https://github.com/timethy) and [Akari](https://github.com/AkariAsai) for their contributions.

 ## Contents
 0. [Requirements](#requirements)
@@ -27,30 +27,31 @@ Thanks to [Tim](https://github.com/timethy) and [Akari](https://github.com/Akari

 ## Requirements
 This code was tested with Python 3 and PyTorch 0.4.0.
-- Install [PyTorch](http://pytorch.org/) on a machine with CUDA GPU.
+- Install [PyTorch](http://pytorch.org/) on a machine with CUDA GPU.
 - Install the [HDF5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) and other dependencies (files in our pre-processed datasets are in HDF5 formats).
 ```bash
 sudo apt-get update
 sudo apt-get install -y libhdf5-serial-dev hdf5-tools
-pip install h5py matplotlib imageio scikit-image
+pip3 install h5py matplotlib imageio scikit-image opencv-python
 ```
-- Download the preprocessed [NYU Depth V2](http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html) dataset in HDF5 formats, and place them under the `data` folder. The downloading process might take an hour or so. The NYU dataset requires 32G of storage space.
+- Download the preprocessed [NYU Depth V2](http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html) and/or [KITTI Odometry](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) dataset in HDF5 formats, and place them under the `data` folder. The downloading process might take an hour or so. The NYU dataset requires 32G of storage space, and KITTI requires 81G.
 ```bash
-mkdir data
-cd data
-wget http://datasets.lids.mit.edu/sparse-to-dense/data/nyudepthv2.tar.gz
-tar -xvf nyudepthv2.tar.gz && rm -f nyudepthv2.tar.gz
+mkdir data; cd data
+wget http://datasets.lids.mit.edu/sparse-to-dense/data/nyudepthv2.tar.gz
+tar -xvf nyudepthv2.tar.gz && rm -f nyudepthv2.tar.gz
+wget http://datasets.lids.mit.edu/sparse-to-dense/data/kitti.tar.gz
+tar -xvf kitti.tar.gz && rm -f kitti.tar.gz
 cd ..
 ```
 ## Training
-The training scripts come with several options, which can be listed with the `--help` flag. Currently this repo only supports training on the NYU dataset.
+The training scripts come with several options, which can be listed with the `--help` flag. Currently this repo only supports training on the NYU dataset.
 ```bash
 python3 main.py --help
 ```

-For instance, run the following command to train a network with ResNet50 as the encoder, deconvolutions of kernel size 3 as the decoder, and both RGB and 100 random sparse depth samples as the input to the network.
+For instance, run the following command to train a network with ResNet50 as the encoder, deconvolutions of kernel size 3 as the decoder, and both RGB and 100 random sparse depth samples as the input to the network.
 ```bash
-python3 main.py -a resnet50 -d deconv3 -m rgbd -s 100
+python3 main.py -a resnet50 -d deconv3 -m rgbd -s 100 -data nyudepthv2
 ```

 Training results will be saved under the `results` folder.
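
The training command in this hunk feeds the network "100 random sparse depth samples". As a rough illustration of what that input could look like, here is a minimal sketch that keeps a fixed number of randomly chosen valid pixels from a dense depth map and zeroes the rest. The function name `sample_sparse_depth`, the zero-means-missing convention, and uniform sampling are assumptions made for illustration; this is not the repo's sparsifier. Nested lists are used only to keep the sketch dependency-free.

```python
import random


def sample_sparse_depth(depth, num_samples, seed=None):
    """Return a copy of `depth` (a list of rows) that keeps only
    `num_samples` randomly chosen valid pixels; all others become 0.0.

    Assumption: a depth value <= 0 means "no measurement".
    """
    rng = random.Random(seed)
    # Coordinates of every pixel that carries a measurement.
    valid = [(r, c)
             for r, row in enumerate(depth)
             for c, d in enumerate(row)
             if d > 0]
    # Never ask for more samples than there are valid pixels.
    keep = rng.sample(valid, min(num_samples, len(valid)))
    sparse = [[0.0] * len(row) for row in depth]
    for r, c in keep:
        sparse[r][c] = depth[r][c]
    return sparse


# A tiny 4x5 "depth map" in which every pixel is valid.
dense = [[1.0 + r + 0.1 * c for c in range(5)] for r in range(4)]
sparse = sample_sparse_depth(dense, 6, seed=0)
```

With real data the same idea would typically be written with array operations, and the sample count would come from the `-s` option.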
@@ -59,14 +60,14 @@ Training results will be saved under the `results` folder.
 ## Testing
 To test the performance of a trained model, simply run main.py with the `-e` option, along with other model options. For instance,
 ```bash
-python3 main.py -e -a resnet50 -d deconv3 -m rgbd -s 100
+python3 main.py -e -a resnet50 -d deconv3 -m rgbd -s 100 -data nyudepthv2
 ```

 ## Trained Models
-Trained models will be released later.
+Trained models will be released later.

 ## Benchmark
-The following numbers are from the original Torch repo.
+The following numbers are from the original Torch repo.
 - Error metrics on NYU Depth v2:

 | RGB | rms | rel | delta1 | delta2 | delta3 |
@@ -107,14 +108,20 @@ The following numbers are from the original Torch repo.

 Note: our networks are trained on the KITTI odometry dataset, using only sparse labels from laser measurements.

-## Citation
-If you use our code or method in your work, please cite:
+## Citation
+If you use our code or method in your work, please consider citing the following:

 @article{Ma2017SparseToDense,
-title={Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image},
-author={Ma, Fangchang and Karaman, Sertac},
-journal={arXiv preprint arXiv:1709.07492},
-year={2017}
+title={Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image},
+author={Ma, Fangchang and Karaman, Sertac},
+booktitle={ICRA},
+year={2018}
+}
+@article{ma2018self,
+title={Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera},
+author={Ma, Fangchang and Cavalheiro, Guilherme Venturelli and Karaman, Sertac},
+journal={arXiv preprint arXiv:1807.00275},
+year={2018}
 }

 Please direct any questions to [Fangchang Ma](http://www.mit.edu/~fcma) at [email protected].
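
The benchmark table in the README diff above has columns `rms`, `rel`, and `delta1` to `delta3`, but the diff does not define them. The sketch below uses the definitions that are common in the depth-prediction literature: root mean squared error, mean absolute relative error, and the fraction of pixels whose ratio `max(pred/target, target/pred)` is below `1.25**i`. Treat these definitions, the function name `depth_metrics`, and the rule of skipping non-positive values as assumptions rather than as the repo's evaluation code.

```python
import math


def depth_metrics(pred, target):
    """Compute rms, rel and delta1..delta3 over paired depth values.

    Assumption: only pairs where both values are positive are evaluated.
    """
    pairs = [(p, t) for p, t in zip(pred, target) if p > 0 and t > 0]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid depth pairs to evaluate")
    rms = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n)
    rel = sum(abs(p - t) / t for p, t in pairs) / n
    # Symmetric ratio: 1.0 means a perfect prediction for that pixel.
    ratios = [max(p / t, t / p) for p, t in pairs]
    metrics = {"rms": rms, "rel": rel}
    for i in (1, 2, 3):
        metrics["delta%d" % i] = sum(r < 1.25 ** i for r in ratios) / n
    return metrics


# Three pixels: two exact predictions and one that is 1.5x too deep.
metrics = depth_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 2.0])
```

Lower is better for `rms` and `rel`; higher is better for the `delta` values, which are fractions between 0 and 1 in this sketch.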

dataloaders/dataloader.py

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ def h5_loader(path):

 class MyDataloader(data.Dataset):
     modality_names = ['rgb', 'rgbd', 'd'] # , 'g', 'gd'
+    color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4)

     def __init__(self, root, type, sparsifier=None, modality='rgb', loader=h5_loader):
         classes, class_to_idx = find_classes(root)
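
This hunk moves `color_jitter` from the dataset modules into the base class, so subclasses reach it as `self.color_jitter`. The following minimal sketch shows that pattern in isolation. `ColorJitterStub`, `BaseDataset`, `IndoorDataset`, and `OutdoorDataset` are hypothetical stand-ins, not the repo's classes, and the stub returns its input unchanged instead of perturbing colors.

```python
class ColorJitterStub:
    """Hypothetical stand-in for a color-jitter transform."""

    def __init__(self, brightness, contrast, saturation):
        self.settings = (brightness, contrast, saturation)

    def __call__(self, image):
        # A real transform would randomly perturb the image here.
        return image


class BaseDataset:
    # Created once, when the class body is executed, and shared by every
    # subclass and instance through normal attribute lookup.
    color_jitter = ColorJitterStub(0.4, 0.4, 0.4)


class IndoorDataset(BaseDataset):
    def augment(self, image):
        return self.color_jitter(image)


class OutdoorDataset(BaseDataset):
    def augment(self, image):
        return self.color_jitter(image)


indoor, outdoor = IndoorDataset(), OutdoorDataset()
```

Because the attribute is a callable object rather than a plain function, looking it up through `self` does not bind it as a method, so it still receives only the image.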

dataloaders/kitti_dataloader.py

Lines changed: 2 additions & 4 deletions
@@ -2,11 +2,9 @@
 import dataloaders.transforms as transforms
 from dataloaders.dataloader import MyDataloader

-color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4)
-
 class KITTIDataset(MyDataloader):
     def __init__(self, root, type, sparsifier=None, modality='rgb'):
-        super(KITTIDataset, self).__init__(root, type, sparsifier=None, modality='rgb')
+        super(KITTIDataset, self).__init__(root, type, sparsifier, modality)
         self.output_size = (228, 912)

     def train_transform(self, rgb, depth):
@@ -24,7 +22,7 @@ def train_transform(self, rgb, depth):
             transforms.HorizontalFlip(do_flip)
         ])
         rgb_np = transform(rgb)
-        rgb_np = color_jitter(rgb_np) # random color jittering
+        rgb_np = self.color_jitter(rgb_np) # random color jittering
         rgb_np = np.asfarray(rgb_np, dtype='float') / 255
         # Scipy affine_transform produced RuntimeError when the depth map was
         # given as a 'numpy.ndarray'
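
The `__init__` change in this file is presumably the "minor bug" named in the commit message: the old call passed the literal defaults `sparsifier=None, modality='rgb'` to the base class, so whatever the caller supplied was discarded. The sketch below reproduces that behavior with hypothetical classes (`Base`, `BuggyDataset`, `FixedDataset`) that only record their arguments; the argument values are made up for illustration.

```python
class Base:
    def __init__(self, root, sparsifier=None, modality='rgb'):
        self.root = root
        self.sparsifier = sparsifier
        self.modality = modality


class BuggyDataset(Base):
    def __init__(self, root, sparsifier=None, modality='rgb'):
        # Restates the defaults, so the caller's arguments are ignored.
        super().__init__(root, sparsifier=None, modality='rgb')


class FixedDataset(Base):
    def __init__(self, root, sparsifier=None, modality='rgb'):
        # Forwards what the caller actually passed.
        super().__init__(root, sparsifier, modality)


buggy = BuggyDataset('data', sparsifier='uniform-100', modality='rgbd')
fixed = FixedDataset('data', sparsifier='uniform-100', modality='rgbd')
print(buggy.sparsifier, buggy.modality)
print(fixed.sparsifier, fixed.modality)
```

The buggy variant raises no error, which makes this kind of mistake easy to miss: the dataset is simply built with the default settings.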

dataloaders/nyu_dataloader.py

Lines changed: 2 additions & 3 deletions
@@ -3,11 +3,10 @@
 from dataloaders.dataloader import MyDataloader

 iheight, iwidth = 480, 640 # raw image size
-color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4)

 class NYUDataset(MyDataloader):
     def __init__(self, root, type, sparsifier=None, modality='rgb'):
-        super(NYUDataset, self).__init__(root, type, sparsifier=None, modality='rgb')
+        super(NYUDataset, self).__init__(root, type, sparsifier, modality)
         self.output_size = (228, 304)

     def train_transform(self, rgb, depth):
@@ -25,7 +24,7 @@ def train_transform(self, rgb, depth):
             transforms.HorizontalFlip(do_flip)
         ])
         rgb_np = transform(rgb)
-        rgb_np = color_jitter(rgb_np) # random color jittering
+        rgb_np = self.color_jitter(rgb_np) # random color jittering
         rgb_np = np.asfarray(rgb_np, dtype='float') / 255
         depth_np = transform(depth_np)
