2025/02/15 Weekly Meeting Notes #134

himanshunaidu (Maintainer) · 2025-02-16T02:12:49Z

The following notes contain the progress and next steps for each sub-task:

Computer Vision Model Update
Assigned to Himanshu Naidu.
While the DeepLabv3 TensorFlow 1 model has been successfully converted to CoreML and has better reported performance, it has more issues than initially understood.
Underperformance on important classes remains a concern, but the bigger problem is that real-time segmentation performance degrades much more rapidly than expected, to the point where the model can barely process 1 frame per second. Further performance analysis will be done (such as checking the CPU/GPU/Neural Engine load); however, it would be prudent to focus on other tasks for now.
Computer Vision ML Pipeline
Assigned to Himanshu Naidu.
As mentioned before, the following ML pipeline is being set up to train models from scratch.
https://github.com/himanshunaidu/CoreML_Pipeline_iOSPointMapper
It would be wise to focus on this, since it will eventually give us more control over the model we produce (preferably starting with a TorchVision model).
Currently, only the data loader/pre-processor for Cityscapes has been merged, but further work is expected to be merged soon.
Depth Aware Instance Segmentation
Assigned to Himanshu Naidu.
The current plan is for the first version of depth-aware instance segmentation to combine the watershed algorithm with depth-based clustering.
After further analysis, it became clear that developing a watershed algorithm from scratch is not very practical when existing solutions are available. Fortunately, OpenCV can be integrated into Swift development, although this is not well documented.
Still, this seems to be the most viable way to integrate the watershed algorithm.

OpenCV has now been integrated into the application with some basic methods for experimentation.
The watershed algorithm will be integrated into the application next.

We still need to determine the best way to integrate depth-based clustering with DBSCAN.
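To make the depth-based clustering idea concrete, here is a minimal sketch of DBSCAN applied to one-dimensional depth values. It is written in plain Python for readability and is not the planned in-app implementation. The function name dbscan_depth, the parameters eps and min_samples, and the choice to cluster raw depth values (rather than, say, pixel position plus depth) are all assumptions for illustration.

```python
from typing import List, Optional

NOISE = -1  # label for points that belong to no cluster


def dbscan_depth(depths: List[float], eps: float, min_samples: int) -> List[int]:
    """Cluster depth values with DBSCAN.

    Hypothetical helper: returns one cluster label per input value,
    with NOISE (-1) for values that are not density-reachable.
    """
    n = len(depths)
    labels: List[Optional[int]] = [None] * n  # None means "not visited yet"

    def neighbors(i: int) -> List[int]:
        # All points within eps of point i (the point itself is included).
        return [j for j in range(n) if abs(depths[i] - depths[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return [label if label is not None else NOISE for label in labels]


if __name__ == "__main__":
    # Two objects at roughly 1 m and 5 m, plus one stray reading at 9 m.
    print(dbscan_depth([1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 9.0], eps=0.15, min_samples=2))
```

One way this could combine with watershed is to run the clustering on the depth values inside each watershed region, so that objects of the same class at clearly different depths get separate labels. That pairing is only one possibility and is still an open question.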

Segmentation Mask Post-Processing
Assigned to Nik Wilson.
Nik has gone through the segment_streets code of the OASIS framework at a high level.
We discussed further how the algorithm works and established some high-level next steps, which include going deeper into the centroid-tracking and homography transformation code.
Note: Since the post-processing code will likely depend largely on OpenCV, the current integration of OpenCV into the application will benefit development greatly.

himanshunaidu (Maintainer, Author) · 2025-02-17T03:29:21Z

Notes from Nik

Review the following functions for comprehension:
find_objects_tracking
find_homo
centroidtracker
Segment_sidewalks
Review the following file:
centroidtracker.py

Find & read literature on:
Homography transformation

From a higher-level perspective (Himanshu will research):
What inputs do we need?
Possibly use OpenCV on pictures

Description of find_homo(im1, im2)

The function find_homo is designed to find the homography matrix between two images using feature matching. This matrix can be used to warp one image to align with another. The function takes two images as input: im1, the image to be warped, and im2, the reference image.
First, the function converts both images to grayscale using OpenCV's cvtColor function. This is a common preprocessing step in computer vision tasks to simplify the data and reduce computational load, as color information is not necessary for feature detection.
Next, the function detects ORB (Oriented FAST and Rotated BRIEF) features and computes their descriptors for both grayscale images. ORB is a fast and efficient feature detector and descriptor extractor. The ORB_create method initializes the ORB detector with a maximum number of features to detect, specified by MAX_FEATURES.
The function then matches the descriptors from both images using a brute-force matcher with the Hamming distance as the metric. The matches are sorted by their distance, with the closest matches (i.e., the best matches) being selected. The number of good matches is determined by the GOOD_MATCH_PERCENT parameter, which specifies the top percentage of matches to retain.
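The matching step above can be illustrated without OpenCV. The sketch below is a plain-Python stand-in, not the OASIS code: it treats each descriptor as a bytes object, pairs every descriptor in the first image with its nearest neighbor in the second by Hamming distance, sorts by distance, and keeps the top fraction. The helper names hamming and best_matches and the value 0.15 for GOOD_MATCH_PERCENT are assumptions; the notes do not state the actual value.

```python
from typing import List, Tuple

GOOD_MATCH_PERCENT = 0.15  # assumed value; the notes do not state it

Match = Tuple[int, int, int]  # (index in desc1, index in desc2, distance)


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def best_matches(
    desc1: List[bytes],
    desc2: List[bytes],
    keep: float = GOOD_MATCH_PERCENT,
) -> List[Match]:
    """Brute-force match desc1 against desc2 and keep the closest fraction."""
    matches: List[Match] = []
    for i, d1 in enumerate(desc1):
        # Nearest descriptor in the second image for this query descriptor.
        j, dist = min(
            ((j, hamming(d1, d2)) for j, d2 in enumerate(desc2)),
            key=lambda pair: pair[1],
        )
        matches.append((i, j, dist))
    matches.sort(key=lambda m: m[2])       # smallest distance first
    n_keep = int(len(matches) * keep)      # top fraction to retain
    return matches[:n_keep]
```

With four descriptors per image and keep=0.5, only the two closest pairs survive, which is the same filtering idea the description attributes to GOOD_MATCH_PERCENT.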
After selecting the good matches, the function extracts the locations of these matches from both images. These locations are stored in two arrays, points1 and points2, which hold the coordinates of the matched keypoints.
Finally, the function estimates the homography matrix from points1 and points2 using the RANSAC (Random Sample Consensus) algorithm, which keeps the estimate robust to incorrect matches. This matrix, h, represents the transformation needed to warp im1 to align with im2. The function returns h, which can be used for further image processing tasks such as image stitching or perspective correction.
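To show what the returned matrix does, here is a small sketch of how a 3x3 homography maps a single pixel coordinate from im1 into the frame of im2. It uses plain Python lists rather than OpenCV arrays, and the name warp_point is hypothetical; it is not part of the reviewed code.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]  # 3x3 homography as nested lists


def warp_point(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Map (x, y) through homography h using homogeneous coordinates."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Dividing by w converts back from homogeneous to pixel coordinates.
    return xh / w, yh / w


if __name__ == "__main__":
    # A pure translation by (+10, -5) expressed as a homography.
    shift = [[1.0, 0.0, 10.0], [0.0, 1.0, -5.0], [0.0, 0.0, 1.0]]
    print(warp_point(shift, 2.0, 3.0))
```

Warping a whole image applies the same mapping to every pixel. The division by w is what lets a homography represent perspective changes in addition to rotation, scaling, and translation.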
