
Commit d7f3dc5

Update proj1.html
1 parent 54febbe commit d7f3dc5

File tree

1 file changed: +43 −1 lines changed

project-1/proj1.html

Lines changed: 43 additions & 1 deletion
@@ -15,7 +15,6 @@ <h1>Project 1: Colorizing the Prokudin-Gorskii photo collection</h1>
 
 <div align="center">
 <img src="images/intro.jpg" alt="intro.jpg" width="50%">
-<figcaption style="margin-bottom:6px;">see submission/code/sobel.py</figcaption>
 </div>
 
 <!-- Section 1 -->
@@ -73,7 +72,9 @@ <h2>Naive Search</h2>
 
 </div>
 
+<p>
 Unfortunately, because each crop has dimensions of <i>(7W / 8)</i> &times; <i>(5H / 24)</i>, the total number of NCC computations for each alignment is <i>((W - 7W / 8) + 1)</i> &times; <i>((H/3 - 5H / 24) + 1)</i> = <i>(W / 8 + 1)</i> &times; <i>(H / 8 + 1)</i> = <i>O(HW)</i>. Since each NCC computation requires <i>O(HW)</i> operations, aligning an image of dimensions <i>W</i> &times; <i>3W</i> takes <i>O(W<sup>4</sup>)</i> time using the naive search above. Because the .tif files are about 9 times larger than the .jpg files in both dimensions, performing the same search on these files will take more than 9<sup>4</sup> &approx; 6500x longer to compute. Even if we don't crop horizontally at all, the search would still have a complexity of <i>O(W<sup>3</sup>)</i> and require more than 9<sup>3</sup> &approx; 730x more time on .tif files. A more efficient method is required.
+</p>
 <hr>
 
 <!-- Section 4 -->
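The naive search above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's actual code; the function names (`ncc`, `naive_align`) and the `max_shift` search radius are assumptions for the example, and `np.roll` stands in for the crop-and-shift described in the text:

```python
import numpy as np

def ncc(a, b):
    # Normalized cross-correlation between two equally sized patches.
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def naive_align(ref, mov, max_shift=15):
    # Exhaustively score every displacement in the window and keep the best.
    best, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ncc(ref, np.roll(mov, (dy, dx), axis=(0, 1)))
            if score > best:
                best, best_shift = score, (dy, dx)
    return best_shift
```

Each candidate displacement costs one full NCC pass over the crop, which is where the O(HW)-per-evaluation factor in the analysis above comes from.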
@@ -85,7 +86,9 @@ <h2>Image Pyramid</h2>
 <img src="images/Image_pyramid_svg.png" alt="Image_pyramid_svg.png" width="25%">
 <figcaption style="margin-bottom:6px;">Source: <a href="https://en.wikipedia.org/wiki/Pyramid_%28image_processing%29">Wikipedia</a></figcaption>
 </div>
+<p>
 Now, we only calculate the best displacement directly if the given width is below a certain threshold. For images above this threshold, we first downscale the image by 2x, pass it back to the function, and scale the returned shifts up by 2x to get the best shifts on the input image. This means that the best shifts in the base case (<i>x</i><sub>lowest</sub>, <i>y</i><sub>lowest</sub>) will be scaled up and used in the previous recursive call, one level above. This continues until we return to the top-level call, at which point the returned shifts will be within 1 or 2 pixels of the best overall displacement. The last thing to keep in mind is to downscale the cropping box as well, which is simple to do since it is computed on the full image: one can simply divide its coordinates by 2 for each recursive call. In practice, setting W<sub>min</sub> = 72 gives the best tradeoff between the search size and the number of rescales. With these optimizations in place, the computation is now much faster:
+</p>
 
 <div style="display:flex; flex-wrap:wrap; justify-content:center; text-align:center;">
 
@@ -191,5 +194,44 @@ <h2>Image Pyramid</h2>
 </p>
 <hr>
 
+<!-- Section 5 -->
+<h2>Cropping with Sobel</h2>
+<p>
+Although a fixed cropping dimension worked for aligning the images, artifacts from the borders still remain, because the final result is simply the intersection of the translations. Is there a way to crop the borders automatically? The answer is yes. We can start by applying the Sobel operator
+</p>
+
+<div style="display:flex; flex-wrap:wrap; justify-content:center; text-align:center;">
+
+<figure style="flex: 1 0 40%; margin:12px;">
+<img src="images/equationX.png" alt="equationX.png" width="50%">
+</figure>
+
+<figure style="flex: 1 0 40%; margin:12px;">
+<img src="images/equationY.png" alt="equationY.png" width="50%">
+</figure>
+
+</div>
+
+<p>
+with convolution to the base image, and add the results together. This produces a composite of the detected edges along the <i>x</i>- and <i>y</i>-axes. The next step is to reconstruct the borders of each plate from the image. Because convolution is shift-invariant, we can convolve the kernel on only the valid patches, resulting in a new image of dimensions <i>(W - 2)</i> &times; <i>(H - 2)</i> for the 3 &times; 3 Sobel kernels.
+</p>
+
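The valid-patch convolution can be written out explicitly in plain NumPy. This is a slow but readable reference sketch (a real implementation would use a vectorized library routine such as `scipy.signal.convolve2d` with `mode="valid"`); the names `convolve_valid` and `edge_composite` are invented for the example:

```python
import numpy as np

# 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def convolve_valid(img, kernel):
    # 'valid' convolution: only full-overlap positions are kept, so a
    # 3x3 kernel shrinks an H x W image to (H-2) x (W-2).
    kh, kw = kernel.shape
    k = kernel[::-1, ::-1]  # flip the kernel for true convolution
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def edge_composite(img):
    # Sum of the x- and y-gradient responses, as in the caption below.
    return convolve_valid(img, SOBEL_X) + convolve_valid(img, SOBEL_Y)
```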
+<div align="center">
+<img src="images/edges.png" alt="edges.png" width="50%">
+<figcaption style="margin-bottom:6px;">sobelX &starf; img + sobelY &starf; img</figcaption>
+</div>
+
+<p>
+Now, we can take the average of each row; the rows with the lowest and highest averages will be the edges. By splitting the resulting array into 4 subarrays [0 : <i>H/8</i>], [<i>H/8</i> : <i>H/2</i>], [<i>H/2</i> : <i>7H / 8</i>], and [<i>7H / 8</i> : <i>H</i>], we can find the argmax and argmin in each subarray, which will be used as the horizontal borders of each plate image.
+</p>
+
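The band-splitting step can be sketched as follows. For simplicity the sketch takes only the argmax in each band (the text also mentions the argmin, and a real border detector would pick between them per band); `horizontal_borders` and its input `edges` are assumed names:

```python
import numpy as np

def horizontal_borders(edges):
    # Average the edge response per row, then take the strongest row in
    # each of the four bands [0, H/8), [H/8, H/2), [H/2, 7H/8), [7H/8, H).
    h = edges.shape[0]
    row_avg = edges.mean(axis=1)
    cuts = [0, h // 8, h // 2, 7 * h // 8, h]
    borders = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        borders.append(lo + int(np.argmax(row_avg[lo:hi])))
    return borders
```

Restricting each extremum to its own band is what keeps the three plate boundaries from all collapsing onto the single strongest edge in the image.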
+<div align="center">
+<img src="images/y.png" alt="y.png" width="50%">
+</div>
+
+The vertical borders can be estimated using a similar strategy, except that we need to split the column averages into 3 subsections based on the values found in the previous part. This ensures that we are able to find the vertical edges of each image plate. Although the center plate
+
+<hr>
 </body>
 </html>
