**Results:** You can find below the results obtained when running the code from the GitHub repository. Details on the training and on the implementation can be found in the following <a href="/assets/pdf/Report_Deephoughvoting.pdf">pdf</a>.
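The core idea of deep Hough voting is that each seed point regresses a vote toward its object's center, and the votes are then grouped into clusters that become box proposals. Below is a toy numpy sketch of the grouping step only, using a greedy radius-based clustering rather than the paper's learned sampling-and-grouping (the radius value is illustrative):

```python
import numpy as np

def cluster_votes(votes, radius=0.3):
    """Greedily group 3D votes: each cluster center is the mean of all
    votes within `radius` of a seed vote (a stand-in for VoteNet's
    sampling-and-grouping stage, not the trained module)."""
    remaining = votes.copy()
    centers = []
    while len(remaining) > 0:
        seed = remaining[0]
        d = np.linalg.norm(remaining - seed, axis=1)
        members = remaining[d < radius]
        centers.append(members.mean(axis=0))
        remaining = remaining[d >= radius]
    return np.array(centers)
```

Votes that fall near the same object center collapse into a single proposal, which is why voting is robust to the object surface being far from its centroid.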
<div class="row">
    <div class="col-sm mt-3 mt-md-0">
        {% include figure.html path="assets/img/project/Npm3d/project_deephoughvoting_results.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
<div class="caption">
    Point cloud, ground-truth and prediction on some scans of the 3rd testing batch of ScannetV2.
</div>
<div class="row">
    <div class="col-sm mt-4 mt-md-0">
        {% include figure.html path="assets/img/project/Npm3d/project_deephoughvoting_results_3.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
<div class="caption">
    Point cloud, ground-truth and prediction on some scans of the 10th testing batch of Sun RGB-D v2.
</div>
**Summary:** Video inpainting is the task of reconstructing missing pixels in a video. It is an important problem in computer vision and an essential feature in many imaging and graphics applications, e.g. object removal, image restoration, manipulation, retargeting, image composition and rendering. While image inpainting is an almost solved problem, video inpainting is harder: approaches often fail to maintain sharp edges, produce blurry results, and cannot remove the effects correlated with an object; some also lack temporal coherence. Although modern approaches overcome some of these problems, most of them require a complex input mask, cannot handle multiple deletions, and are unable to remove the effects associated with an object. Recently, a paper proposed a new way to combine objects and their effects, creating masks containing both subjects and effects in a self-supervised manner, using only masks and coarse segmentation images. It does this by decomposing a video into a set of RGBA layers representing the appearance of the different objects and their effects in the video. Although this requires training one model per video, it enables many applications.
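The RGBA-layer decomposition can be recombined into the original frame by standard back-to-front alpha compositing. A minimal numpy sketch (the layer ordering and the [0, 1] value range are assumptions, not the paper's exact pipeline):

```python
import numpy as np

def composite_layers(background, layers):
    """Composite per-object RGBA layers over a background, back to front.

    background: (H, W, 3) float array in [0, 1]
    layers: list of (H, W, 4) float arrays in [0, 1], RGB channels + alpha
    """
    frame = background.astype(np.float64)
    for layer in layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        # "over" operator: the layer replaces the frame where alpha is high
        frame = alpha * rgb + (1.0 - alpha) * frame
    return frame
```

Editing applications then amount to dropping, reordering, or retiming individual layers before compositing.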
**Results:** You can find my personal results below. Details on the training and on the implementation can be found in the following <a href="/assets/pdf/Report_Omnimatte">pdf</a>. The hardest parts were pre-processing the videos (computing homographies, optical flow, binary masks, etc.) and writing a notebook to run the code on Colab.
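As a small illustration of the homography step in the pre-processing, applying a 3x3 homography to 2D points goes through homogeneous coordinates (a generic sketch, not the repository's code):

```python
import numpy as np

def apply_homography(H, pts):
    """Apply a 3x3 homography H to (N, 2) points.

    Points are lifted to homogeneous coordinates, transformed, then
    divided by the third coordinate to return to the image plane.
    """
    ones = np.ones((pts.shape[0], 1))
    homog = np.concatenate([pts, ones], axis=1) @ H.T
    return homog[:, :2] / homog[:, 2:3]
```

In the pipeline, such homographies register each frame against a reference frame so that background motion can be explained by camera motion alone.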
<div class="row">
    <div class="col-sm mt-3 mt-md-0">
        {% include figure.html path="assets/img/project/DeepL/project_omnimatte_results1.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
<div class="caption">
    Our results and the ones of the paper on "Drift chicane".
</div>
<div class="row">
    <div class="col-sm mt-3 mt-md-0">
        {% include figure.html path="assets/img/project/DeepL/project_omnimatte_results3.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
<div class="row">
    <div class="col-sm mt-3 mt-md-0">
        {% include figure.html path="assets/img/project/DeepL/project_omnimatte_results4.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
<div class="caption">
    Our results and the ones of the paper on "Blackswan".
</div>
Several modern methods to reconstruct a 3D mesh can be grouped into two categories.
**Results:** You can find below the results obtained after training DeepSDF, Occupancy Networks and Shape as Points on the sofa category of ShapeNet. Details on the training and on the implementation can be found in the following <a href="/assets/pdf/Report_DeepSDF.pdf">pdf</a>.
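As a reminder of the implicit representations compared here, a signed distance function is negative inside the shape, zero on the surface, and positive outside, and thresholding its sign yields the occupancy decision an occupancy network predicts directly. A toy analytic example on a sphere (not a learned network):

```python
import numpy as np

def sphere_sdf(points, center=(0.0, 0.0, 0.0), radius=1.0):
    """Analytic signed distance to a sphere: negative inside,
    zero on the surface, positive outside."""
    return np.linalg.norm(points - np.asarray(center), axis=-1) - radius

# Thresholding the sign gives occupancy, as an occupancy network would predict.
pts = np.array([[0.0, 0.0, 0.0],   # center: inside
                [2.0, 0.0, 0.0],   # outside
                [1.0, 0.0, 0.0]])  # exactly on the surface
inside = sphere_sdf(pts) < 0
```

A trained model such as DeepSDF replaces the analytic function with a latent-conditioned MLP, and the mesh is then extracted from the zero level set (e.g. with marching cubes).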
<div class="row justify-content-md-center">
    <div class="col-sm">
    </div>
    <div class="col-auto-4">
        {% include figure.html path="assets/img/project/RecVis/project_recvis_results.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
layout: page
title: KP-conv
description: 3D semantic segmentation
img: assets/img/project/kpconv/profile.png
importance: 1
category: master
---
**Results:** To solve this challenge, I used KP-Conv, a deep neural network built to classify and segment 3D point clouds. I ranked first in the competition with a score of 0.9400 on the private leaderboard. Details on the training and on the implementation can be found in the following <a href="/assets/pdf/Report_IC">pdf</a>.
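For context, a rigid KP-Conv aggregates neighbor features with influence weights given by a linear correlation between each neighbor's offset and a fixed set of kernel points. A minimal numpy sketch for one output point (the shapes and `sigma` value are illustrative, not the trained configuration):

```python
import numpy as np

def kpconv_point(neighbor_offsets, neighbor_feats, kernel_points, weights, sigma=0.3):
    """One rigid KP-Conv output.

    neighbor_offsets: (N, 3) neighbor positions relative to the center point
    neighbor_feats:   (N, C_in) input features of the neighbors
    kernel_points:    (K, 3) fixed kernel point positions
    weights:          (K, C_in, C_out) one weight matrix per kernel point
    """
    # Distance between every neighbor and every kernel point
    d = np.linalg.norm(neighbor_offsets[:, None, :] - kernel_points[None, :, :], axis=-1)
    # Linear correlation: influence decays to 0 at distance sigma
    influence = np.maximum(0.0, 1.0 - d / sigma)          # (N, K)
    # Aggregate neighbor features per kernel point, then apply its weight matrix
    agg = influence.T @ neighbor_feats                    # (K, C_in)
    return np.einsum('kc,kcd->d', agg, weights)           # (C_out,)
```

Running this for every point of the cloud (with neighborhoods from a radius search) gives one convolution layer; the real network stacks such layers with pooling and skip connections.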
<div class="row">
    <div class="col-sm mt-3 mt-md-0">
        {% include figure.html path="assets/img/project/IC/results_1.png" title="example image" class="img-fluid rounded z-depth-1" %}
    </div>
</div>
<div class="caption">Results obtained on the test set. Some classes are over-represented, which corresponds to the distribution of classes in the training scans.</div>
**Resources:** I trained the network on Google Colab Pro using a P100 GPU for 10 hours.
<p align="justify">Bird's-eye View (BeV) representations have emerged as the de-facto shared space in driving applications, offering a unified space for sensor data fusion and supporting various downstream tasks. However, conventional models use grids with fixed resolution and range and face computational inefficiencies due to the uniform allocation of resources across all cells. To address this, we propose PointBeV, a novel sparse BeV segmentation model operating on sparse BeV cells instead of dense grids. This approach offers precise control over memory usage, enabling the use of long temporal contexts and accommodating memory-constrained platforms. PointBeV employs an efficient two-pass strategy for training, enabling focused computation on regions of interest. At inference time, it can be used with various memory/performance trade-offs and flexibly adjusts to new specific use cases. PointBeV achieves state-of-the-art results on the nuScenes dataset for vehicle, pedestrian, and lane segmentation, showcasing superior performance in static and temporal settings despite being trained solely with sparse signals. We will release our code along with two new efficient modules used in the architecture: Sparse Feature Pulling, designed for the effective extraction of features from images to BeV, and Submanifold Attention, which enables efficient temporal modeling.</p>
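The two-pass strategy can be sketched as: score a coarse, strided subset of BeV cells first, then densify computation only around the most promising ones. A minimal numpy illustration of the cell selection (not the released code; `score_fn`, `stride` and `top_k` are hypothetical parameters):

```python
import numpy as np

def two_pass_cells(score_fn, H, W, stride=4, top_k=8):
    """Coarse-to-fine selection of BeV cells.

    score_fn maps an (N, 2) integer array of (row, col) cells to N scores.
    Pass 1 scores a strided subset of cells; pass 2 densifies the
    stride x stride block around each of the top_k coarse cells.
    """
    # Pass 1: strided coarse grid
    ys, xs = np.meshgrid(np.arange(0, H, stride), np.arange(0, W, stride), indexing="ij")
    coarse = np.stack([ys.ravel(), xs.ravel()], axis=1)
    anchors = coarse[np.argsort(score_fn(coarse))[-top_k:]]

    # Pass 2: dense blocks around the best coarse cells
    fine = []
    for y, x in anchors:
        dy, dx = np.meshgrid(np.arange(stride), np.arange(stride), indexing="ij")
        fine.append(np.stack([(y + dy).ravel(), (x + dx).ravel()], axis=1))
    fine = np.concatenate(fine, axis=0)
    fine = fine[(fine[:, 0] < H) & (fine[:, 1] < W)]

    # Deduplicate cells visited by both passes
    return np.unique(np.concatenate([coarse, fine], axis=0), axis=0)
```

Only the returned cells are ever featurized, which is what gives direct control over the memory/performance trade-off at inference time.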
<hr>
<div class="text-center">
    <div class="mt-3 mt-md-0">{% include figure.html path="assets/img/paper/2024_pointbev/pointbev.PNG" class="img-fluid rounded z-depth-1" %}</div>
<div class="caption mt-3">GaussRender is a 3D Occupancy module that can be plugged into any 3D Occupancy model to enhance its predictions and ensure 2D-3D consistency while improving mIoU, IoU, and RayIoU.</div>
<div class="caption">GaussRender can be plugged into any model. The core idea is to transform voxels into Gaussians before performing depth and semantic rendering.</div>
JAFAR improves metrics on many downstream tasks (semantic segmentation, depth estimation, feature activation, zero-shot open vocabulary, and bird's-eye view segmentation) by upsampling features from any backbone.
</div>
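The depth and semantic rendering used by GaussRender reduces, per pixel, to front-to-back alpha compositing of depth-ordered primitives along a ray. A one-ray toy sketch (scalar opacities instead of projected 3D Gaussians, an intentional simplification):

```python
import numpy as np

def render_ray(depths, alphas, semantics):
    """Front-to-back alpha compositing along one ray.

    depths:    (N,) primitive depths along the ray
    alphas:    (N,) opacities in [0, 1]
    semantics: (N, C) per-primitive class scores
    """
    order = np.argsort(depths)
    d, a, s = depths[order], alphas[order], semantics[order]
    # Transmittance: fraction of light surviving past the previous primitives
    T = np.concatenate([[1.0], np.cumprod(1.0 - a)[:-1]])
    w = T * a
    return np.sum(w * d), w @ s  # rendered depth, rendered semantics
```

Supervising these rendered depth and semantic images against 2D targets is what enforces 2D-3D consistency in the voxel predictions.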
Once trained, JAFAR can efficiently upsample any backbone features to any resolution.
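A naive baseline for feature upsampling, which learned upsamplers like JAFAR aim to beat, is plain bilinear interpolation of the backbone feature map. A self-contained numpy sketch:

```python
import numpy as np

def bilinear_upsample(feat, out_h, out_w):
    """Bilinearly upsample an (H, W, C) feature map to (out_h, out_w, C)."""
    H, W, _ = feat.shape
    ys = np.linspace(0, H - 1, out_h)
    xs = np.linspace(0, W - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    # Blend the four surrounding feature vectors per output location
    top = feat[y0][:, x0] * (1 - wx) + feat[y0][:, x1] * wx
    bot = feat[y1][:, x0] * (1 - wx) + feat[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy
```

Bilinear interpolation blurs object boundaries because it ignores image content, which is exactly where a learned, image-guided upsampler can do better.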