|
 # TorchSparse

-## News
+TorchSparse is a high-performance neural network library for point cloud processing.

-2020/09/20: We released `torchsparse` v1.1, which is significantly faster than our `torchsparse` v1.0 and also achieves a **1.9x** speedup over [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) v0.5 alpha when running MinkUNet18C!
-
-2020/08/30: We released `torchsparse` v1.0.
-
-## Overview
+## Installation

-We release `torchsparse`, a high-performance computing library for efficient 3D sparse convolution. This library aims at accelerating sparse computation in 3D, in particular the Sparse Convolution operation.
+TorchSparse depends on the [Google Sparse Hash](https://github.com/sparsehash/sparsehash) library.

-<img src="https://hanlab.mit.edu/projects/spvnas/figures/sparseconv_illustration.gif" width="1080">
+* On Ubuntu, it can be installed by

-The major advantage of this library is that we support all computation on the GPU, especially the kernel map construction (which is done on the CPU in the latest [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) V0.4.3).
+  ```bash
+  sudo apt-get install libsparsehash-dev
+  ```

-## Installation
+* On macOS, it can be installed by

-You may run the following command to install torchsparse.
+  ```bash
+  brew install google-sparsehash
+  ```

-```bash
-pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git
-```
+* You can also compile the library locally (if you do not have sudo permission) and add the library path to the environment variable `CPLUS_INCLUDE_PATH`.

-Note that this library depends on Google's [sparse hash map project](https://github.com/sparsehash/sparsehash). In order to install this library, you may run
+The latest released TorchSparse (v1.4.0) can then be installed by

 ```bash
-sudo apt-get install libsparsehash-dev
+pip install --upgrade git+https://github.com/mit-han-lab/[email protected]
 ```

-on Ubuntu servers. If you do not have sudo permission, please clone Google's codebase, compile it, and install it locally. Finally, add the path to this library to your `CPLUS_INCLUDE_PATH` environment variable.
-
-For GPU server users, we currently support PyTorch 1.6.0 + CUDA 10.2 + cuDNN 7.6.2. For CPU users, we support PyTorch 1.6.0 (CPU version); the MKL-DNN backend is optional.
-
-## Usage
-
-Our [SPVNAS](https://github.com/mit-han-lab/e3d) project (ECCV 2020) is built with torchsparse. You may navigate to this project and follow the instructions in that codebase to get started.
-
-Here, we also provide a walk-through of some important concepts in torchsparse.
-
-### Sparse Tensor and Point Tensor
-
-In torchsparse, we have two data structures for point cloud storage, namely `torchsparse.SparseTensor` and `torchsparse.PointTensor`. Both structures have two data fields, `C` (coordinates) and `F` (features). In `SparseTensor`, we assume that all coordinates are **integer** and **do not duplicate**. However, in `PointTensor`, all coordinates are **floating-point** and can duplicate.
-
-### Sparse Quantize and Sparse Collate
-
-To convert a point cloud into a `SparseTensor` that can be consumed by networks built with Sparse Convolution or Sparse Point-Voxel Convolution, use the function `torchsparse.utils.sparse_quantize`. An example is given here:
-
-```python
-inds, labels, inverse_map = sparse_quantize(pc, feat, labels, return_index=True, return_invs=True)
-```
+If you use TorchSparse in your code, please remember to pin the exact version in your dependencies.
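+
+For example, a pinned dependency in a `requirements.txt` could look like this (a sketch; pip accepts such VCS URLs, and the `#egg=` suffix names the package for older pip versions):
+
+```
+git+https://github.com/mit-han-lab/[email protected]#egg=torchsparse
+```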

-where `pc`, `feat`, `labels` correspond to the point cloud coordinates (which should be integer), the features, and the ground-truth labels. `inds` denotes the unique indices in the point cloud coordinates, and `inverse_map` denotes the unique index each point corresponds to. The inverse map is used to restore the full point cloud prediction from the downsampled prediction.
+## Benchmark

-To combine a list of `SparseTensor`s into a batch, you may want to use the `torchsparse.utils.sparse_collate_fn` function.
+We compare TorchSparse with [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) (where the latency is measured on an NVIDIA GTX 1080 Ti):

-A detailed example is given in the [SemanticKITTI dataset preprocessing code](https://github.com/mit-han-lab/e3d/blob/master/spvnas/core/datasets/semantic_kitti.py) in our [SPVNAS](https://github.com/mit-han-lab/e3d) project.
+| Network                  | MinkowskiEngine v0.4.3 | TorchSparse v1.0.0 |
+| :----------------------- | :--------------------: | :----------------: |
+| MinkUNet18C (MACs / 10)  | 224.7 ms               | 124.3 ms           |
+| MinkUNet18C (MACs / 4)   | 244.3 ms               | 160.9 ms           |
+| MinkUNet18C (MACs / 2.5) | 269.6 ms               | 214.3 ms           |
+| MinkUNet18C              | 323.5 ms               | 294.0 ms           |
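+
+For reference, a generic way to measure such latencies (a sketch, not necessarily the exact protocol used above: it assumes a CUDA `model` and a prepared `SparseTensor` input, warms up first, and synchronizes around the timed loop):
+
+```python
+import time
+
+import torch
+
+
+@torch.no_grad()
+def measure_latency(model, inputs, warmup=10, iters=100):
+    for _ in range(warmup):  # exclude warm-up (allocation / caching) effects
+        model(inputs)
+    torch.cuda.synchronize()
+    start = time.time()
+    for _ in range(iters):
+        model(inputs)
+    torch.cuda.synchronize()
+    return (time.time() - start) / iters * 1000  # average latency in ms
+```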

-### Computation API
+## Getting Started

-The computation interface in torchsparse is straightforward and very similar to that of PyTorch. The example here defines a basic convolution block:
+### Sparse Tensor

-```python
-class BasicConvolutionBlock(nn.Module):
-    def __init__(self, inc, outc, ks=3, stride=1, dilation=1):
-        super().__init__()
-        self.net = nn.Sequential(
-            spnn.Conv3d(inc, outc, kernel_size=ks, dilation=dilation, stride=stride),
-            spnn.BatchNorm(outc),
-            spnn.ReLU(True)
-        )
-
-    def forward(self, x):
-        out = self.net(x)
-        return out
-```
+The sparse tensor (`SparseTensor`) is the main data structure for point clouds, and it has two data fields:
+* Coordinates (`coords`): a 2D integer tensor with a shape of N x 4, where the first three dimensions correspond to quantized x, y, z coordinates, and the last dimension denotes the batch index.
+* Features (`feats`): a 2D tensor with a shape of N x C, where C is the number of feature channels.

-where `spnn` denotes `torchsparse.nn`, `spnn.Conv3d` is the 3D sparse convolution operation, and `spnn.BatchNorm` and `spnn.ReLU` denote 3D sparse tensor batch normalization and activation, respectively. We also support direct convolution kernel calls via `torchsparse.nn.functional`, for example:
+Most existing datasets provide raw point cloud data with float coordinates. We can use `sparse_quantize` (provided in `torchsparse.utils.quantize`) to voxelize x, y, z coordinates and remove duplicates:

 ```python
-outputs = torchsparse.nn.functional.conv3d(inputs, kernel, stride=1, dilation=1, transpose=False)
+import numpy as np
+import torch
+from torchsparse import SparseTensor
+from torchsparse.utils.quantize import sparse_quantize
+
+coords -= np.min(coords, axis=0, keepdims=True)  # `coords` / `feats` come from the dataset
+coords, indices = sparse_quantize(coords, voxel_size, return_index=True)
+coords = torch.tensor(coords, dtype=torch.int)
+feats = torch.tensor(feats[indices], dtype=torch.float)
+tensor = SparseTensor(coords=coords, feats=feats)
 ```
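+
+If you also need to map predictions back to the original points, `sparse_quantize` can additionally return an inverse map. A hedged sketch (this assumes the `return_inverse` flag alongside `return_index` and starts again from the raw float `coords`; `voxel_pred` is a hypothetical per-voxel prediction tensor):
+
+```python
+# `inverse` maps every original point to the row of its voxel, so
+# per-voxel predictions can be restored to full resolution.
+_, indices, inverse = sparse_quantize(coords, voxel_size,
+                                      return_index=True, return_inverse=True)
+full_pred = voxel_pred[inverse]  # hypothetical per-voxel predictions -> per-point
+```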

-where we need to define `inputs` (a SparseTensor) and `kernel` (of shape k^3 x OC x IC when k > 1, or OC x IC when k = 1, where k denotes the kernel size and IC / OC denote the input / output channels). The output is still a SparseTensor.
+We can then use `sparse_collate_fn` (provided in `torchsparse.utils.collate`) to assemble a batch of `SparseTensor`s (and add the batch dimension to `coords`), as in the sketch below. Please refer to [this example](https://github.com/mit-han-lab/torchsparse/blob/dev/pre-commit/examples/example.py) for more details.
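+
+A minimal, self-contained sketch of such batching (random data stands in for a real dataset; the dict layout mirrors the linked example):
+
+```python
+import torch
+
+from torchsparse import SparseTensor
+from torchsparse.utils.collate import sparse_collate_fn
+
+# Two single-sample dicts, each holding an unbatched SparseTensor with
+# N x 3 integer coords and N x 4 features.
+samples = [{
+    'input': SparseTensor(
+        coords=torch.randint(0, 10, size=(5, 3), dtype=torch.int),
+        feats=torch.randn(5, 4),
+    ),
+} for _ in range(2)]
+
+batch = sparse_collate_fn(samples)
+print(batch['input'].coords.shape)  # torch.Size([10, 4]): batch index appended
+```
+
+The same function can be passed as `collate_fn` to a PyTorch `DataLoader`.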

-Detailed examples are given [here](https://github.com/mit-han-lab/e3d/blob/master/spvnas/core/modules/dynamic_sparseop.py), where we use the `torchsparse.nn.functional` interfaces to implement weight-shared 3D-NAS modules.
+### Sparse Neural Network

-### Sparse Hashmap API
-
-Sparse hash map queries are important in 3D sparse computation. They are mainly used to infer a point's memory location (*i.e.*, its index) given its coordinates. For example, we use this operation in the kernel map construction part of 3D sparse convolution, and also in sparse voxelization / devoxelization in [Sparse Point-Voxel Convolution](https://arxiv.org/abs/2007.16100). Here, we provide the following example for the hash map API:
+The neural network interface in TorchSparse is very similar to that of PyTorch:

 ```python
-source_hash = torchsparse.nn.functional.sphash(torch.floor(source_coords).int())
-target_hash = torchsparse.nn.functional.sphash(torch.floor(target_coords).int())
-idx_query = torchsparse.nn.functional.sphashquery(source_hash, target_hash)
+from torch import nn
+from torchsparse import nn as spnn
+
+model = nn.Sequential(
+    spnn.Conv3d(in_channels, out_channels, kernel_size),
+    spnn.BatchNorm(out_channels),
+    spnn.ReLU(True),
+)
 ```
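+
+For instance, the block above can be instantiated with concrete (hypothetical) channel sizes and applied to the batch assembled in the previous section; on a GPU machine, move both the model and the input to CUDA first:
+
+```python
+# 4 input channels, 32 output channels, 3 x 3 x 3 kernels.
+model = nn.Sequential(
+    spnn.Conv3d(4, 32, kernel_size=3),
+    spnn.BatchNorm(32),
+    spnn.ReLU(True),
+)
+
+# The model consumes and produces SparseTensors; a stride-1 convolution
+# keeps the point count, so the output features have shape N x 32.
+out = model(batch['input'])
+print(out.feats.shape)
+```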

-In this example, `sphash` is the function that converts integer coordinates to hash values, and `sphashquery(source_hash, target_hash)` performs the hash table lookup. Here, the hash map has keys `target_hash` and values corresponding to point indices in the target point cloud tensor. For each point in `source_coords`, we find the index of the point in `target_coords` that has the same coordinates.
-
-### Dummy Training Example
-
-We provide an entire training example with dummy input [here](examples/example.py). In this example, we cover
-
-- how to start from point cloud data and convert it to the SparseTensor format;
-- how to implement SparseTensor batching;
-- how to train a semantic segmentation SparseConvNet.
-
-You are also welcome to check out our [SPVNAS](https://github.com/mit-han-lab/e3d) project to implement training / inference with real data.
-
-### Mixed Precision (float16) Support
-
-Mixed precision training is supported via `torch.cuda.amp.autocast` and `torch.cuda.amp.GradScaler`. Enabling mixed precision training can speed up training and reduce GPU memory usage. If you wrap your training code in a `torch.cuda.amp.autocast` block, feature tensors will automatically be converted to float16 where possible. See [here](examples/example.py) for a complete example.
-
-## Speed Comparison Between torchsparse and MinkowskiEngine
-
-We benchmark the performance of torchsparse against the latest [MinkowskiEngine V0.4.3](https://github.com/NVIDIA/MinkowskiEngine); latency is measured on an NVIDIA GTX 1080 Ti GPU:
-
-| Network                  | Latency (ME V0.4.3) | Latency (torchsparse V1.0.0) |
-| :----------------------: | :-----------------: | :--------------------------: |
-| MinkUNet18C (MACs / 10)  | 224.7 ms            | 124.3 ms                     |
-| MinkUNet18C (MACs / 4)   | 244.3 ms            | 160.9 ms                     |
-| MinkUNet18C (MACs / 2.5) | 269.6 ms            | 214.3 ms                     |
-| MinkUNet18C              | 323.5 ms            | 294.0 ms                     |
-
 ## Citation

-If you find this code useful, please consider citing:
+If you use TorchSparse in your research, please use the following BibTeX entry:

 ```bibtex
-@inproceedings{
-  tang2020searching,
-  title = {Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution},
-  author = {Tang, Haotian* and Liu, Zhijian* and Zhao, Shengyu and Lin, Yujun and Lin, Ji and Wang, Hanrui and Han, Song},
-  booktitle = {European Conference on Computer Vision},
+@inproceedings{tang2020searching,
+  title = {{Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution}},
+  author = {Tang, Haotian and Liu, Zhijian and Zhao, Shengyu and Lin, Yujun and Lin, Ji and Wang, Hanrui and Han, Song},
+  booktitle = {European Conference on Computer Vision (ECCV)},
   year = {2020}
 }
 ```

 ## Acknowledgements

-This library is inspired by [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), [SECOND](https://github.com/traveller59/second.pytorch) and [SparseConvNet](https://github.com/facebookresearch/SparseConvNet).
+TorchSparse is inspired by many existing open-source libraries, including (but not limited to) [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), [SECOND](https://github.com/traveller59/second.pytorch) and [SparseConvNet](https://github.com/facebookresearch/SparseConvNet).