CIS565-Fall-2022 · Liang-Hao-Quan · Jan 16, 2021 · Sep 10, 2021 · Sep 10, 2021 · Sep 18, 2021
diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -73,6 +73,18 @@ set(headers
     src/sceneStructs.h
     src/preview.h
     src/utilities.h
+    src/ImGui/imconfig.h
+    src/tiny_obj_loader.h
+
+    src/ImGui/imgui.h
+    src/ImGui/imconfig.h
+    src/ImGui/imgui_impl_glfw.h
+    src/ImGui/imgui_impl_opengl3.h
+    src/ImGui/imgui_impl_opengl3_loader.h
+    src/ImGui/imgui_internal.h
+    src/ImGui/imstb_rectpack.h
+    src/ImGui/imstb_textedit.h
+    src/ImGui/imstb_truetype.h
     )
 
 set(sources
@@ -84,6 +96,14 @@ set(sources
     src/scene.cpp
     src/preview.cpp
     src/utilities.cpp
+
+    src/ImGui/imgui.cpp
+    src/ImGui/imgui_demo.cpp
+    src/ImGui/imgui_draw.cpp
+    src/ImGui/imgui_impl_glfw.cpp
+    src/ImGui/imgui_impl_opengl3.cpp
+    src/ImGui/imgui_tables.cpp
+    src/ImGui/imgui_widgets.cpp
     )
 
 list(SORT headers)
@@ -92,6 +112,7 @@ list(SORT sources)
 source_group(Headers FILES ${headers})
 source_group(Sources FILES ${sources})
 
+#add_subdirectory(src/ImGui)
 #add_subdirectory(stream_compaction)  # TODO: uncomment if using your stream compaction
 
 cuda_add_executable(${CMAKE_PROJECT_NAME} ${sources} ${headers})

diff --git a/INSTRUCTION.md b/INSTRUCTION.md
diff --git a/README.md b/README.md
@@ -1,13 +1,149 @@
-CUDA Path Tracer
+CUDA Denoiser For CUDA Path Tracer
 ================
 
-**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3**
+**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**
 
-* (TODO) YOUR NAME HERE
-* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Haoquan Liang
+  * [LinkedIn](https://www.linkedin.com/in/leohaoquanliang/)
+* Tested on: Windows 10, Ryzen 7 5800X 8 Core 3.80 GHz, NVIDIA GeForce RTX 3080 Ti 12 GB
 
-### (TODO: Your README)
+# Overview
+This project is a CUDA-based denoiser for a path tracer that uses geometry buffers (G-buffers) to guide a smoothing filter. It is based on the paper "Edge-Avoiding A-Trous Wavelet Transform for fast Global Illumination Filtering" and produces a smoother-looking path-traced image with fewer samples per pixel. 
 
-*DO NOT* leave the README to the last minute! It is a crucial part of the
-project, and we will not be able to grade you without a good README.
+Denoiser Off | Denoiser On
+:----------:|:-----------:
+![](img/Denoiser/denoise-off.png) | ![](img/Denoiser/denoise-on.png) 
 
+# Table of Contents  
+* [Features](#features)   
+
+* [Performance Analysis](#performance)   
+* [Reference](#reference)
+
+# <a name="features"> Features</a>
+### Core features
+* **G-Buffer Visualization**
+
+We use per-pixel normal, position, and time-to-intersect data as weights so that the blur avoids crossing edges.   
+This data can be visualized by clicking `Show GBuffer` in the GUI. The user can switch between buffers by pressing 0 for time to intersect, 1 for position, and 2 for normal. 
+
+|Normal | Position | Time to Intersect |
+|:-----: | :-----: |:-----: |
+|![](img/Denoiser/g-buffer-nor.png) | ![](img/Denoiser/g-buffer-pos.png) | ![](img/Denoiser/g-buffer-t.png) |
+
+* **A-Trous Filtering**
+
+A-Trous Filtering is the key to our high-performance denoiser. Instead of sampling every neighboring pixel within the radius as a Gaussian blur does, A-Trous Filtering iteratively applies sparse blurs of increasing size. This lets a small filter approximate the result of a much larger one. 
+
+|No Filter | Filter Size = 16 | Filter Size = 64 |
+|:-----: | :-----: |:-----: |
+|![](img/Denoiser/Non-ATroused.png) | ![](img/Denoiser/ATroused-16.png) | ![](img/Denoiser/ATroused-64.png) |
+
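A minimal host-side C++ sketch of the sparse sampling pattern described above, not the project's CUDA kernel. It assumes the step width doubles on every pass and that passes stop once the 5x5 footprint (`4 * step + 1` pixels wide) covers the filter size. That mapping, and the names `atrousStepWidths` and `atrousOffsets`, are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Step widths 1, 2, 4, ... for successive passes. A 5x5 kernel with taps
// spaced `step` pixels apart spans 4 * step + 1 pixels, so we stop once that
// span reaches the requested filter size (assumed stopping rule).
std::vector<int> atrousStepWidths(int filterSize) {
    assert(filterSize >= 1);
    std::vector<int> steps;
    for (int step = 1;; step *= 2) {
        steps.push_back(step);
        if (4 * step + 1 >= filterSize) break;
    }
    return steps;
}

// The 25 pixel offsets read by one pass: a 5x5 grid stretched by `step`,
// leaving "holes" between the taps.
std::vector<std::pair<int, int>> atrousOffsets(int step) {
    assert(step >= 1);
    std::vector<std::pair<int, int>> taps;
    for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
            taps.emplace_back(dx * step, dy * step);
        }
    }
    return taps;
}
```

Under these assumptions a filter size of 64 needs five passes (step widths 1, 2, 4, 8, 16), and every pass reads only 25 taps per pixel.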
+* **Edge-Avoiding Filtering**
+
+Although A-Trous Filtering removes noise effectively, it also blurs the details and focus of the image. We want the image to preserve its key details. With the information from the G-buffer, we can do this by not blurring across edges. A sharp change in position, normal, or depth usually indicates an edge. By decreasing the blur weight at those edges, the denoiser removes noise while keeping edges sharp. 
+
+|No Filter | A-Trous (64) | A-Trous with Edge-Avoiding (64) |
+|:-----: | :-----: |:-----: |
+|![](img/Denoiser/Non-ATroused.png) | ![](img/Denoiser/ATroused-64.png) | ![](img/Denoiser/ATroused-Edge-avoiding.png) |
+
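The edge-avoiding idea above can be sketched as a per-tap weight that falls off as the G-buffer values of two pixels diverge. This is an illustrative C++ sketch in the spirit of the referenced paper, not the project's kernel. The `GBufferPixel` layout, the function names, and the sigma values in the usage note are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Squared Euclidean distance between two vectors.
float squaredDistance(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Weight in (0, 1]: exactly 1 when the two values match, decaying
// exponentially as they move apart. A smaller sigma is stricter.
float edgeStoppingWeight(float sqDist, float sigma) {
    assert(sigma > 0.0f);
    return std::min(std::exp(-sqDist / (sigma * sigma)), 1.0f);
}

// Hypothetical per-pixel G-buffer entry (normal and position only).
struct GBufferPixel { Vec3 normal; Vec3 position; };

// Combined weight for blending neighbor q into pixel p: the product of the
// per-channel weights, so a mismatch in any channel suppresses the tap.
float combinedWeight(const GBufferPixel& p, const GBufferPixel& q,
                     float sigmaNormal, float sigmaPosition) {
    return edgeStoppingWeight(squaredDistance(p.normal, q.normal), sigmaNormal) *
           edgeStoppingWeight(squaredDistance(p.position, q.position), sigmaPosition);
}
```

In a filter pass this weight would multiply the kernel coefficient of each tap. Two pixels on the same flat wall get a weight of 1, while a neighbor whose normal is perpendicular gets a weight near zero for a sigma such as 0.35.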
+### Additional features
+* **<a name="gaussian"> Gaussian Filtering </a>**
+
+As mentioned above, a Gaussian filter blurs an image by sampling all the neighboring pixels of each pixel and computing its new color as their weighted average, with closer pixels weighted more heavily.    
+In my results, the Gaussian filter seems to produce a blurrier image than A-Trous with edge-avoiding turned off, and a slightly noisy image with edge-avoiding turned on.
+
+|No Filter | A-Trous (64)  | Gaussian (64) |
+|:-------: | :-----------: |:------------: |
+|![](img/Denoiser/Non-ATroused.png) | ![](img/Denoiser/ATroused-64.png) | ![](img/Denoiser/Gaussian-64.png) |
+| Edge-Avoiding | ![](img/Denoiser/ATroused-Edge-avoiding.png) | ![](img/Denoiser/Gaussian-Edge-avoiding.png) |
+
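To make "closer pixels have a higher weight" concrete, here is a small C++ sketch that builds normalized 1D Gaussian weights. The function name `gaussianWeights` and the choice to pass `radius` and `sigma` explicitly are assumptions, since the README does not say how the project derives them from the filter size.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1D Gaussian weights for offsets -radius..radius.
// The weights sum to 1 and peak at the center offset.
std::vector<float> gaussianWeights(int radius, float sigma) {
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> weights(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float w = std::exp(-(i * i) / (2.0f * sigma * sigma));
        weights[i + radius] = w;
        sum += w;
    }
    for (float& w : weights) {
        w /= sum;  // normalize so the blur preserves overall brightness
    }
    return weights;
}
```

The 2D weight of a neighbor at offset `(dx, dy)` is the product of the two 1D weights. A dense blur evaluates it for all `(2 * radius + 1)^2` neighbors of every pixel, which is the cost that A-Trous avoids.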
+
+
+# <a name="performance">Performance Analysis</a>
+* **How much time denoising adds to the renders**
+
+The denoiser runs after the path tracer has generated an image, so its runtime should be independent of the complexity of the scene. It should depend only on the resolution of the image and the filter size.    
+My data supports this. For an 800x800 image with an 80x80 filter, the additional runtime added by denoising shows no relation to the number of iterations: it consistently adds about 3.55 ms. The analysis below further shows how image resolution and filter size affect the time that denoising adds to a render.       
+
+![](img/Denoiser/chart1.png)
+
+
+* **How denoising influences the number of iterations needed to get an "acceptably smooth" result**
+
+It is worth mentioning that "acceptably smooth" is a subjective judgment. Different people have different tolerances for image noise, and noise is less obvious in some scenes because of color contrast, geometric composition, and so on.          
+In the `cornell_ceiling_light` scene without denoising, the image looks very smooth at 200 iterations. With denoising on, the image starts looking smooth at 60 iterations. This means that the denoiser reduces the number of iterations needed by **70%**.   
+However, in the cow scene below, the denoised image only looks comparable to the original image (400 iterations) at around 200 iterations, a reduction of only **50%**.   
+My conclusion is that the benefit varies greatly with the scene, but overall denoising significantly reduces the number of iterations needed. 
+
+No Denoising, 200 iterations | Denoised, 60 iterations
+:----------:|:-----------:
+![](img/Denoiser/200-iter.png) | ![](img/Denoiser/60-iter-denoised.png) 
+
+No Denoising, 400 iterations | Denoised, 200 iterations
+:----------:|:-----------:
+![](img/Denoiser/400-iter.png) | ![](img/Denoiser/200-iter-denoised.png) 
+
+* **How denoising at different resolutions impacts runtime**
+
+Since denoising operates on the final image, its input size is the number of pixels in the image. When the pixel count quadruples (for example, from 200x200 to 400x400), the additional runtime would be expected to quadruple as well.   
+However, in my results, the runtime increases by only about 50% when the image size doubles, and by only about 7 times when the pixel count increases 36 times (from 200x200 to 1200x1200). So the runtime is not linearly proportional to the resolution, although a higher resolution does result in a longer runtime.   
+
+![](img/Denoiser/chart2.png)
+
+
+* **How varying filter sizes affect performance**
+
+As expected, the larger the filter size, the more runtime denoising adds, since more neighboring pixels are sampled to compute the new color of each pixel.       
+The following chart was generated with the `cornell_ceiling_light` scene at 800x800 resolution. The additional time grows with the filter size, but not linearly.  
+
+![](img/Denoiser/chart3.png)
+
+* **How visual results vary with filter size -- does the visual quality scale uniformly with filter size?**
+
+The visual results of denoising do not scale uniformly with filter size. The following images are generated with the `cornell_ceiling_light` scene at 1 iteration.    
+The visual improvement from 0 to 20 is tremendous and the improvement from 20 to 40 is noticeable, but beyond that point the visual improvements are very small.    
+
+Filter Size = 0 | Filter Size = 20 | Filter Size = 40 | Filter Size = 60 | Filter Size = 80 | Filter Size = 100 |
+:----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|
+![](img/Denoiser/filter-0.png) | ![](img/Denoiser/filter-20.png) | ![](img/Denoiser/filter-40.png) | ![](img/Denoiser/filter-60.png) | ![](img/Denoiser/filter-80.png) | ![](img/Denoiser/filter-100.png) |  
+
+* **How effective/ineffective is this method with different material types**   
+
+This method works very well with diffuse materials, although the sense of "roughness" is reduced and the surface looks very soft because the color is smoothed.    
+It works less well on specular materials, because the reflection is blurred very noticeably and the shininess of the material is reduced.    
+It does not work well on refractive materials, as the translucency is marred by averaging the color, and the specular component is also reduced. 
+
+Diffuse | Specular | Refractions
+:----------:|:-----------:|:-----------:
+![](img/Denoiser/diffuse.png) | ![](img/Denoiser/specular.png)  | ![](img/Denoiser/refract.png) 
+
+
+* **How do results compare across different scenes - for example, between `cornell.txt` and `cornell_ceiling_light.txt`. Does one scene produce better denoised results? Why or why not?**
+
+The results vary greatly across scenes. For example, the denoiser works exceptionally well on the `cornell_ceiling_light` scene, but not as well on the regular `cornell` scene.    
+From my testing, the denoiser seems to work better on bright scenes. Digging deeper, I realized that the deciding factor is not the brightness itself but the color variation. In a bright scene, most pixels are lit uniformly and the LTE computation converges more quickly. When the scene is dark, the bright pixels are sparser and there is inherently more noise in the scene, which makes denoising harder.     
+Since we use normal, position, and time to intersect to avoid the edges, the algorithm is expected to work better when the surfaces on either side of an edge actually differ in those values.    
+It is worth mentioning that different scenes also require different norm/pos/t configurations to look the best.   
+
+Bright Scene | Dark Scene 
+:----------:|:-----------:
+![](img/Denoiser/big-light.png) | ![](img/Denoiser/small-light.png) 
+![](img/Denoiser/big-light-cow.png) | ![](img/Denoiser/small-light-cow.png) 
+
+
+* **A-Trous Filtering vs. Gaussian Filtering: Performance Analysis**
+
+Please see [here](#gaussian) for visual comparison.    
+For performance comparison, as expected, A-Trous Filtering outperforms Gaussian Filtering significantly. Specifically, the performance of A-Trous and Gaussian is comparable at a filter size of 10 (resolution 800x800), but the runtime of A-Trous increases almost linearly with filter size, whereas the runtime of Gaussian grows much faster than linearly. They are also comparable at very small resolutions, but again the runtime of Gaussian grows much faster with resolution than that of A-Trous, which increases only linearly.    
+This makes sense, since the A-Trous algorithm always takes 5x5 samples for each pixel and only increases the number of iterations as the filter size increases. Gaussian blur, however, takes nxn samples per pixel, which is a quadratic increase.     
+
+![](img/Denoiser/chart4.png)
+
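The sample-count argument above can be checked with a few lines of C++. This is a rough sketch: it counts taps per pixel only, ignores memory and launch overhead, and assumes the A-Trous step width doubles each pass until the 5x5 footprint (`4 * step + 1`) covers the filter size. Both function names are hypothetical.

```cpp
#include <cassert>

// Dense n x n blur: every output pixel reads n * n neighbors.
long long denseTapsPerPixel(int filterSize) {
    assert(filterSize >= 1);
    return 1LL * filterSize * filterSize;
}

// A-Trous: 25 taps per pass, with one pass per doubling of the step width
// (assumed stopping rule: the footprint 4 * step + 1 covers the filter size).
long long atrousTapsPerPixel(int filterSize) {
    assert(filterSize >= 1);
    long long taps = 0;
    for (long long step = 1;; step *= 2) {
        taps += 25;
        if (4 * step + 1 >= filterSize) break;
    }
    return taps;
}
```

With these assumptions, a filter size of 64 costs 4096 taps per pixel for the dense blur versus 125 for A-Trous, while at a filter size of 10 the two are close (100 versus 75), which is consistent with the comparable runtimes reported above.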
+
+
+# <a name="reference">Reference</a>
+* [Edge-Avoiding A-Trous Wavelet Transform for fast Global Illumination Filtering](https://jo.dreggn.org/home/2010_atrous.pdf)
+* [Spatiotemporal Variance-Guided Filtering](https://research.nvidia.com/publication/2017-07_Spatiotemporal-Variance-Guided-Filtering%3A)
+* [A Survey of Efficient Representations for Independent Unit Vectors](http://jcgt.org/published/0003/02/01/paper.pdf)
+* [ocornut/imgui](https://github.com/ocornut/imgui)
diff --git a/cmake/CUDAComputesList.cmake b/cmake/CUDAComputesList.cmake
@@ -60,6 +60,8 @@ IF(    CUDA_COMPUTE_20
     OR CUDA_COMPUTE_70
     OR CUDA_COMPUTE_72
     OR CUDA_COMPUTE_75
+    OR CUDA_COMPUTE_80
+    OR CUDA_COMPUTE_86
     )
     SET(FALLBACK OFF)
 ELSE()
@@ -70,8 +72,8 @@ LIST(LENGTH COMPUTES_DETECTED_LIST COMPUTES_LEN)
 IF(${COMPUTES_LEN} EQUAL 0 AND ${FALLBACK})
     MESSAGE(STATUS "You can use -DCOMPUTES_DETECTED_LIST=\"AB;XY\" (semicolon separated list of CUDA Compute versions) to enable the specified computes")
     MESSAGE(STATUS "Individual compute versions flags are also available under CMake Advanced options")
-    LIST(APPEND COMPUTES_DETECTED_LIST "30" "50" "60" "70")
-    MESSAGE(STATUS "No computes detected. Fall back to 30, 50, 60 70")
+    LIST(APPEND COMPUTES_DETECTED_LIST "30" "50" "60" "70" "80")
+    MESSAGE(STATUS "No computes detected. Fall back to 30, 50, 60, 70, 80")
 ENDIF()
 
 LIST(LENGTH COMPUTES_DETECTED_LIST COMPUTES_LEN)
@@ -90,7 +92,7 @@ MACRO(SET_COMPUTE VERSION)
 ENDMACRO(SET_COMPUTE)
 
 # Iterate over compute versions. Create variables and enable computes if needed
-FOREACH(VER 20 30 32 35 37 50 52 53 60 61 62 70 72 75)
+FOREACH(VER 20 30 32 35 37 50 52 53 60 61 62 70 72 75 80 86)
     OPTION(CUDA_COMPUTE_${VER} "CUDA Compute Capability ${VER}" OFF)
     MARK_AS_ADVANCED(CUDA_COMPUTE_${VER})
     IF(${CUDA_COMPUTE_${VER}})

diff --git a/cmake/FindGLFW.cmake b/cmake/FindGLFW.cmake
@@ -20,66 +20,66 @@
 include(FindPackageHandleStandardArgs)
 
 if (WIN32)
-	# Find include files
-	find_path(
-		GLFW_INCLUDE_DIR
-		NAMES GLFW/glfw3.h
-		PATHS
-		$ENV{PROGRAMFILES}/include
-		${GLFW_ROOT_DIR}/include
-		DOC "The directory where GLFW/glfw.h resides")
+  # Find include files
+  find_path(
+    GLFW_INCLUDE_DIR
+    NAMES GLFW/glfw3.h
+    PATHS
+    $ENV{PROGRAMFILES}/include
+    ${GLFW_ROOT_DIR}/include
+    DOC "The directory where GLFW/glfw.h resides")
 
-	# Use glfw3.lib for static library
-	if (GLFW_USE_STATIC_LIBS)
-		set(GLFW_LIBRARY_NAME glfw3)
-	else()
-		set(GLFW_LIBRARY_NAME glfw3dll)
-	endif()
+  # Use glfw3.lib for static library
+  if (GLFW_USE_STATIC_LIBS)
+    set(GLFW_LIBRARY_NAME glfw3)
+  else()
+    set(GLFW_LIBRARY_NAME glfw3dll)
+  endif()
 
-	# Find library files
-	find_library(
-		GLFW_LIBRARY
-		NAMES ${GLFW_LIBRARY_NAME}
-		PATHS
-		$ENV{PROGRAMFILES}/lib
-		${GLFW_ROOT_DIR}/lib)
+  # Find library files
+  find_library(
+    GLFW_LIBRARY
+    NAMES ${GLFW_LIBRARY_NAME}
+    PATHS
+    $ENV{PROGRAMFILES}/lib
+    ${GLFW_ROOT_DIR}/lib)
 
-	unset(GLFW_LIBRARY_NAME)
+  unset(GLFW_LIBRARY_NAME)
 else()
-	# Find include files
-	find_path(
-		GLFW_INCLUDE_DIR
-		NAMES GLFW/glfw.h
-		PATHS
-		/usr/include
-		/usr/local/include
-		/sw/include
-		/opt/local/include
-		DOC "The directory where GL/glfw.h resides")
+  # Find include files
+  find_path(
+    GLFW_INCLUDE_DIR
+    NAMES GLFW/glfw.h
+    PATHS
+    /usr/include
+    /usr/local/include
+    /sw/include
+    /opt/local/include
+    DOC "The directory where GL/glfw.h resides")
 
-	# Find library files
-	# Try to use static libraries
-	find_library(
-		GLFW_LIBRARY
-		NAMES glfw3
-		PATHS
-		/usr/lib64
-		/usr/lib
-		/usr/local/lib64
-		/usr/local/lib
-		/sw/lib
-		/opt/local/lib
-		${GLFW_ROOT_DIR}/lib
-		DOC "The GLFW library")
+  # Find library files
+  # Try to use static libraries
+  find_library(
+    GLFW_LIBRARY
+    NAMES glfw3
+    PATHS
+    /usr/lib64
+    /usr/lib
+    /usr/local/lib64
+    /usr/local/lib
+    /sw/lib
+    /opt/local/lib
+    ${GLFW_ROOT_DIR}/lib
+    DOC "The GLFW library")
 endif()
 
 # Handle REQUIRED argument, define *_FOUND variable
 find_package_handle_standard_args(GLFW DEFAULT_MSG GLFW_INCLUDE_DIR GLFW_LIBRARY)
 
 # Define GLFW_LIBRARIES and GLFW_INCLUDE_DIRS
 if (GLFW_FOUND)
-	set(GLFW_LIBRARIES ${OPENGL_LIBRARIES} ${GLFW_LIBRARY})
-	set(GLFW_INCLUDE_DIRS ${GLFW_INCLUDE_DIR})
+  set(GLFW_LIBRARIES ${OPENGL_LIBRARIES} ${GLFW_LIBRARY})
+  set(GLFW_INCLUDE_DIRS ${GLFW_INCLUDE_DIR})
 endif()
 
 # Hide some variables