Commit 59561a8

misc: update readme and third_party code
1 parent 8479b87 commit 59561a8

23 files changed, +5589 -8 lines changed

.gitignore

Lines changed: 0 additions & 3 deletions
@@ -125,8 +125,6 @@ compile_commands.json
 data
 results
 
-third_party
-
 figs
 stats
 temp
@@ -137,7 +135,6 @@ temp
 setup_gscodec.py
 *.ipynb
 Readme_GSCodec.md
-third_party
 temp
 stats
 figs

README.md

Lines changed: 15 additions & 5 deletions
@@ -2,6 +2,8 @@
 
 GSCodec Studio is an open-source framework for Gaussian Splats Compression, including static and dynamic splats representation, reconstruction and compression. It is built upon an open-source 3D Gaussian Splatting library, [gsplat](https://github.com/nerfstudio-project/gsplat), and extended to support 1) dynamic splats representation, 2) training-time compression simulation, and 3) more test-time compression strategies.
 
+[Teaser](./assets/Teaser.png)
+
 ## Installation
 
 **Dependency**: Please install [PyTorch](https://pytorch.org/get-started/locally/) first.
@@ -13,7 +15,7 @@ pip install .
 # pip install -e . (develop mode)
 ```
 
-## Evaluation
+## Examples
 
 **Preparations**
 As with gsplat, we need to install some extra dependencies and download the relevant datasets before the evaluation.
@@ -26,15 +28,23 @@ python datasets/download_dataset.py
 # place other dataset under 'data' folder
 ```
 
-**Static Gaussian Splats Compression**
+For now, we also use a third-party library, 'python-fpnge', to accelerate image saving during experiments.
+
+```bash
+cd ../third_party/python-fpnge-master
+pip install .
+```
+
+**Static Gaussian Splats Training and Compression**
 
-We provide a script that enables more memory-efficient Gaussian splats while maintaining high visual quality, such as representing the Truck scene with only about 8MB of storage.
+We provide a script that enables more memory-efficient Gaussian splats while maintaining high visual quality, such as representing the Truck scene with only about 8MB of storage. The script includes 1) static splats training with compression simulation, 2) compression of the trained static splats, and 3) metric evaluation of the uncompressed and compressed static splats.
 
 ```bash
 bash benchmarks/compression/final_exp/mcmc_tt_sim.sh
 ```
 
-## More examples
-**Dynamic Gaussian Splats Compression**
+**Dynamic Gaussian Splats Training and Compression**
+(Will be provided on 2/15/2025)
 
 **Extract Per-Frame Static Gaussian from Dynamic Splats**
+(Will be provided on 2/15/2025)

assets/Teaser.png

463 KB
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+SCENE_DIR="data/neural_3d"
+SCENE_LIST="coffee_martini cook_spinach cut_roasted_beef flame_salmon_1 flame_steak sear_steak"
+#
+
+RESULT_DIR="results/stg_neu3d_compression_final_lambda_0.01"
+
+NUM_FRAME=50
+
+run_single_scene() {
+    local GPU_ID=$1
+    local SCENE=$2
+
+    echo "Running $SCENE"
+
+    CUDA_VISIBLE_DEVICES=$GPU_ID python simple_trainer_STG.py compression_sim \
+        --model_path $RESULT_DIR/$SCENE/ \
+        --data_dir $SCENE_DIR/$SCENE/colmap_0 \
+        --result_dir $RESULT_DIR/$SCENE/ \
+        --duration $NUM_FRAME \
+        --entropy_model_opt \
+        --rd_lambda 0.01
+        # --profiler_enabled
+
+    CUDA_VISIBLE_DEVICES=$GPU_ID python simple_trainer_STG.py default \
+        --model_path $RESULT_DIR/$SCENE/ \
+        --data_dir $SCENE_DIR/$SCENE/colmap_0 \
+        --result_dir $RESULT_DIR/$SCENE/ \
+        --duration $NUM_FRAME \
+        --lpips_net vgg \
+        --compression stg \
+        --ckpt $RESULT_DIR/$SCENE/ckpts/ckpt_best_rank0.pt
+}
+
+GPU_LIST=(0 1 2 3 4 5)
+GPU_COUNT=${#GPU_LIST[@]}
+
+SCENE_IDX=-1
+
+for SCENE in $SCENE_LIST;
+do
+    SCENE_IDX=$((SCENE_IDX + 1))
+    {
+        run_single_scene ${GPU_LIST[$SCENE_IDX]} $SCENE
+    } &
+
+done
+
+wait
+
+# Zip the compressed files and summarize the stats
+if command -v zip &> /dev/null
+then
+    echo "Zipping results"
+    python benchmarks/stg/summarize_stats.py --results_dir $RESULT_DIR --scenes $SCENE_LIST --num_frame $NUM_FRAME
+else
+    echo "zip command not found, skipping zipping"
+fi
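
The loop in this script indexes `GPU_LIST` directly with `SCENE_IDX`, which works because there are six scenes and six GPUs; `GPU_COUNT` is computed but never used. If the scene list ever outgrew the GPU list, one common option is round-robin assignment. The following is a minimal sketch of that idea in Python, not code from this commit; `assign_gpus` is a hypothetical helper name.

```python
from itertools import cycle


def assign_gpus(scenes, gpu_ids):
    """Pair each scene with a GPU id, wrapping around when scenes outnumber GPUs.

    Hypothetical helper for illustration only; it is not part of the repository.
    """
    if not gpu_ids:
        raise ValueError("gpu_ids must not be empty")
    # cycle() repeats the GPU ids indefinitely; zip() stops at the last scene.
    return list(zip(scenes, cycle(gpu_ids)))


scenes = (
    "coffee_martini cook_spinach cut_roasted_beef "
    "flame_salmon_1 flame_steak sear_steak"
).split()

# Assume only four GPUs are available, so two of them receive a second scene.
plan = assign_gpus(scenes, [0, 1, 2, 3])
for scene, gpu in plan:
    print(f"GPU {gpu}: {scene}")
```

Note that wrapping around only decides the pairing. If all jobs are still launched in the background at once, two scenes would train on the same GPU concurrently, so a real launcher would probably also need to limit the number of jobs per device.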
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+from .grid import GridEncoder

third_party/gridencoder/backend.py

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+import os
+from torch.utils.cpp_extension import load
+
+_src_path = os.path.dirname(os.path.abspath(__file__))
+
+nvcc_flags = [
+    '-O3', '-std=c++14',
+    '-U__CUDA_NO_HALF_OPERATORS__', '-U__CUDA_NO_HALF_CONVERSIONS__', '-U__CUDA_NO_HALF2_OPERATORS__',
+]
+
+if os.name == "posix":
+    c_flags = ['-O3', '-std=c++14']
+elif os.name == "nt":
+    c_flags = ['/O2', '/std:c++17']
+
+    # find cl.exe
+    def find_cl_path():
+        import glob
+        for edition in ["Enterprise", "Professional", "BuildTools", "Community"]:
+            paths = sorted(glob.glob(r"C:\\Program Files (x86)\\Microsoft Visual Studio\\*\\%s\\VC\\Tools\\MSVC\\*\\bin\\Hostx64\\x64" % edition), reverse=True)
+            if paths:
+                return paths[0]
+
+    # If cl.exe is not on path, try to find it.
+    if os.system("where cl.exe >nul 2>nul") != 0:
+        cl_path = find_cl_path()
+        if cl_path is None:
+            raise RuntimeError("Could not locate a supported Microsoft Visual C++ installation")
+        os.environ["PATH"] += ";" + cl_path
+
+_backend = load(name='_grid_encoder',
+                extra_cflags=c_flags,
+                extra_cuda_cflags=nvcc_flags,
+                sources=[os.path.join(_src_path, 'src', f) for f in [
+                    'gridencoder.cu',
+                    'bindings.cpp',
+                ]],
+                )
+
+__all__ = ['_backend']
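
In `backend.py`, `c_flags` is only assigned when `os.name` is `"posix"` or `"nt"`, so on any other platform the later `load(...)` call would fail with a `NameError` rather than a clear message. A minimal sketch of making that branch explicit is shown below; `host_cflags` is a hypothetical helper, not part of this commit, and the flag values are copied from the file above.

```python
import os


def host_cflags(os_name=None):
    """Return host compiler flags for a given os.name value.

    Hypothetical helper for illustration; the flag lists mirror backend.py.
    """
    os_name = os.name if os_name is None else os_name
    flags = {
        "posix": ["-O3", "-std=c++14"],
        "nt": ["/O2", "/std:c++17"],
    }
    if os_name not in flags:
        # Fail early with a readable message instead of a NameError later on.
        raise RuntimeError(f"unsupported platform: {os_name!r}")
    # Return a copy so callers can append flags without mutating the table.
    return list(flags[os_name])


print(host_cflags("posix"))
```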

third_party/gridencoder/grid.py

Lines changed: 215 additions & 0 deletions
@@ -0,0 +1,215 @@
+import numpy as np
+
+import torch
+import torch.nn as nn
+from torch.autograd import Function
+from torch.autograd.function import once_differentiable
+from torch.cuda.amp import custom_bwd, custom_fwd
+
+try:
+    import _gridencoder as _backend
+except ImportError:
+    from .backend import _backend
+
+_gridtype_to_id = {
+    'hash': 0,
+    'tiled': 1,
+}
+
+_interp_to_id = {
+    'linear': 0,
+    'smoothstep': 1,
+}
+
+class STE_binary(torch.autograd.Function):
+    @staticmethod
+    def forward(ctx, input):
+        ctx.save_for_backward(input)
+        # out = torch.sign(input)
+        p = (input >= 0) * (+1.0)
+        n = (input < 0) * (-1.0)
+        out = p + n
+        return out
+    @staticmethod
+    def backward(ctx, grad_output):
+        # mask: pass gradients only where x lies in [-1, 1]
+        input, = ctx.saved_tensors
+        i2 = input.clone().detach()
+        i3 = torch.clamp(i2, -1, 1)
+        mask = (i3 == i2) + 0.0
+        return grad_output * mask
+
+
+class STE_multistep(torch.autograd.Function):
+    @staticmethod
+    def forward(ctx, input, Q):
+        return torch.round(input/Q)*Q
+    @staticmethod
+    def backward(ctx, grad_output):
+        return grad_output, None
+
+
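
The two classes above are straight-through estimators: the forward pass applies a non-differentiable quantizer, and the backward pass substitutes a simple surrogate gradient. The sketch below restates that arithmetic on plain Python lists so it can be checked without PyTorch or a GPU. The function names are hypothetical and the code is an illustration of the same rules, not a replacement for the autograd functions.

```python
def ste_binary_forward(x):
    """Binarize to +1.0 / -1.0; zero maps to +1.0, matching (input >= 0)."""
    return [1.0 if v >= 0 else -1.0 for v in x]


def ste_binary_backward(x, grad_output):
    """Pass the gradient through only where the input lies in [-1, 1].

    This mirrors the clamp-and-compare mask: clamp(x, -1, 1) == x holds
    exactly when -1 <= x <= 1, endpoints included.
    """
    return [g if -1.0 <= v <= 1.0 else 0.0 for v, g in zip(x, grad_output)]


def ste_multistep_forward(x, q):
    """Quantize each value to the nearest multiple of the step size q."""
    return [round(v / q) * q for v in x]


def ste_multistep_backward(grad_output):
    """Identity surrogate gradient: rounding is treated as if it were y = x."""
    return list(grad_output)


x = [-0.5, 0.0, 2.0]
print(ste_binary_forward(x))
print(ste_binary_backward(x, [0.1, 0.2, 0.3]))
print(ste_multistep_forward([0.26, -0.6], 0.25))
```

The binary variant zeroes the gradient for inputs outside [-1, 1], which keeps latent weights from drifting arbitrarily far from the values they quantize to, while the multistep variant passes gradients through unchanged.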
+class _grid_encode(Function):
+    @staticmethod
+    @custom_fwd
+    def forward(ctx, inputs, embeddings, offsets_list, resolutions_list, calc_grad_inputs=False, max_level=None):
+        # inputs: [N, num_dim], float in [0, 1]
+        # embeddings: [sO, n_features], float. self.embeddings = nn.Parameter(torch.empty(offset, n_features))
+        # offsets_list: [n_levels + 1], int
+        # RETURN: [N, F], float
+
+        inputs = inputs.contiguous()
+
+        N, num_dim = inputs.shape # batch size, coord dim # N_rays, 3
+        n_levels = offsets_list.shape[0] - 1 # level # number of levels (e.g. 16)
+        n_features = embeddings.shape[1] # embedding dim for each level # i.e. the number of channels (e.g. 2)
+
+        max_level = n_levels if max_level is None else min(max_level, n_levels)
+
+        # manually handle autocast (only use half precision embeddings, inputs must be float for enough precision)
+        # if n_features % 2 != 0, force float, since half for atomicAdd is very slow.
+        if torch.is_autocast_enabled() and n_features % 2 == 0:
+            embeddings = embeddings.to(torch.half)
+
+        # n_levels first, optimize cache for cuda kernel, but needs an extra permute later
+        outputs = torch.empty(n_levels, N, n_features, device=inputs.device, dtype=embeddings.dtype) # allocate a buffer for the CUDA kernel to fill
+        # outputs = [n_hash_levels (e.g. 16), N_rays, channels (e.g. 2)]
+
+        # zero init if we only calculate partial levels
+        if max_level < n_levels: outputs.zero_()
+
+        if calc_grad_inputs: # inputs.requires_grad
+            dy_dx = torch.empty(N, n_levels * num_dim * n_features, device=inputs.device, dtype=embeddings.dtype)
+            if max_level < n_levels: dy_dx.zero_()
+        else:
+            dy_dx = None
+
+        _backend.grid_encode_forward(
+            inputs,
+            embeddings,
+            offsets_list,
+            resolutions_list,
+            outputs,
+            N, num_dim, n_features, n_levels, max_level,
+            dy_dx
+            )
+
+        # permute back to [N, n_levels * n_features] # [N_rays, n_hash_levels (e.g. 16) * channels (e.g. 2)]
+        outputs = outputs.permute(1, 0, 2).reshape(N, n_levels * n_features)
+
+        ctx.save_for_backward(inputs, embeddings, offsets_list, resolutions_list, dy_dx)
+        ctx.dims = [N, num_dim, n_features, n_levels, max_level]
+
+        return outputs
+
+    @staticmethod
+    #@once_differentiable
+    @custom_bwd
+    def backward(ctx, grad):
+
+        inputs, embeddings, offsets_list, resolutions_list, dy_dx = ctx.saved_tensors
+        N, num_dim, n_features, n_levels, max_level = ctx.dims
+
+        # grad: [N, n_levels * n_features] --> [n_levels, N, n_features]
+        grad = grad.view(N, n_levels, n_features).permute(1, 0, 2).contiguous()
+
+        # Placeholder for the gradient, with the same shape as the embeddings; the kernel accumulates into it directly, so it must start as all zeros
+        grad_embeddings = torch.zeros_like(embeddings)
+
+        if dy_dx is not None:
+            grad_inputs = torch.zeros_like(inputs, dtype=embeddings.dtype)
+        else:
+            grad_inputs = None
+
+        _backend.grid_encode_backward(
+            grad,
+            inputs,
+            embeddings,
+            offsets_list,
+            resolutions_list,
+            grad_embeddings,
+            N, num_dim, n_features, n_levels, max_level,
+            dy_dx,
+            grad_inputs
+            )
+
+        if dy_dx is not None:
+            grad_inputs = grad_inputs.to(inputs.dtype)
+
+        return grad_inputs, grad_embeddings, None, None, None, None
+
+
+grid_encode = _grid_encode.apply
+
+
+class GridEncoder(nn.Module):
+    def __init__(self,
+                 num_dim=3,
+                 n_features=2,
+                 resolutions_list=(16, 23, 32, 46, 64, 92, 128, 184, 256, 368, 512, 736),
+                 log2_hashmap_size=19,
+                 ste_binary=False,
+                 ):
+        super().__init__()
+
+        resolutions_list = torch.tensor(resolutions_list).to(torch.int)
+        n_levels = resolutions_list.numel()
+
+        self.num_dim = num_dim # coord dims, 2 or 3
+        self.n_levels = n_levels # num levels, one per entry in resolutions_list
+        self.n_features = n_features # encode channels per level
+        self.log2_hashmap_size = log2_hashmap_size
+        self.output_dim = n_levels * n_features
+        self.ste_binary = ste_binary
+
+        # allocate parameters
+        offsets_list = [] # cumulative sum of the per-level hash table lengths
+        offset = 0 # running total of the hash table length needed across all levels
+        self.max_params = 2 ** log2_hashmap_size # per-level cap on the hash table length, following the paper's algorithm
+        for i in range(n_levels):
+            resolution = resolutions_list[i].item()
+            params_in_level = min(self.max_params, resolution ** num_dim) # limit max number
+            params_in_level = int(np.ceil(params_in_level / 8) * 8) # make divisible
+            offsets_list.append(offset)
+            offset += params_in_level
+        offsets_list.append(offset)
+        offsets_list = torch.from_numpy(np.array(offsets_list, dtype=np.int32))
+        self.register_buffer('offsets_list', offsets_list)
+        self.register_buffer('resolutions_list', resolutions_list)
+
+        self.n_params = offsets_list[-1] * n_features # total number of parameters
+
+        # parameters
+        self.embeddings = nn.Parameter(torch.empty(offset, n_features))
+
+        self.reset_parameters()
+
+        self.n_output_dims = n_levels * n_features
+
+    def reset_parameters(self):
+        std = 1e-4
+        self.embeddings.data.uniform_(-std, std)
+
+    def __repr__(self):
+        return f"GridEncoder: num_dim={self.num_dim} n_levels={self.n_levels} n_features={self.n_features} resolutions={self.resolutions_list.tolist()} log2_hashmap_size={self.log2_hashmap_size} params={tuple(self.embeddings.shape)} ste_binary={self.ste_binary}"
+
+    def forward(self, inputs, max_level=None):
+        # inputs: [..., num_dim], normalized real world positions in [-1, 1]
+        # max_level: only calculate first max_level levels (None will use all levels)
+        # return: [..., n_levels * n_features]
+
+        #print('inputs', inputs.shape, inputs.dtype, inputs.min().item(), inputs.max().item())
+
+        prefix_shape = list(inputs.shape[:-1])
+        inputs = inputs.view(-1, self.num_dim)
+
+        if self.ste_binary:
+            embeddings = STE_binary.apply(self.embeddings)
+        else:
+            embeddings = self.embeddings
+        outputs = grid_encode(inputs, embeddings, self.offsets_list, self.resolutions_list, inputs.requires_grad, max_level)
+        outputs = outputs.view(prefix_shape + [self.output_dim])
+
+        #print('outputs', outputs.shape, outputs.dtype, outputs.min().item(), outputs.max().item())
+
+        return outputs
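
The size of the `embeddings` table follows from the offset loop in `GridEncoder.__init__`: each level stores `resolution ** num_dim` entries, capped at `2 ** log2_hashmap_size` and rounded up to a multiple of 8. The sketch below reproduces that arithmetic in plain Python for the default arguments; `level_offsets` is a hypothetical name used only for this illustration.

```python
import math


def level_offsets(resolutions, num_dim=3, log2_hashmap_size=19):
    """Return cumulative table offsets, one entry per level plus a final total.

    Hypothetical helper mirroring the loop in GridEncoder.__init__.
    """
    max_params = 2 ** log2_hashmap_size  # per-level cap on table length
    offsets = [0]
    for res in resolutions:
        n = min(max_params, res ** num_dim)  # dense grid size, capped
        n = math.ceil(n / 8) * 8             # round up to a multiple of 8
        offsets.append(offsets[-1] + n)
    return offsets


resolutions = (16, 23, 32, 46, 64, 92, 128, 184, 256, 368, 512, 736)
offsets = level_offsets(resolutions)
sizes = [b - a for a, b in zip(offsets, offsets[1:])]

print(sizes)            # per-level table lengths
print(offsets[-1])      # number of rows in the embeddings table
print(offsets[-1] * 2)  # total parameters with n_features=2
```

With the defaults, the five coarsest levels fit below the cap and are stored densely, while the seven levels from resolution 92 upward all hit the cap of 2^19 entries and therefore rely on hashing.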
