SightLinks is a computer vision system that detects and georeferences crosswalks in aerial imagery. It processes .jpg (or .jpeg/.png) images with their corresponding .jgw world files, as well as georeferenced .tif files, and produces oriented bounding boxes with latitude and longitude coordinates. The system combines image segmentation, MobileNet-based classification screening, YOLO-based detection, georeferencing, and filtering to accurately identify and locate crosswalks in aerial photographs.
## Table of Contents

- Features
- Prerequisites
- Installation
- Quick Start
- Usage Guide
- Project Structure
- Technical Details
- Troubleshooting
- Contributing
## Features

- Supports .jpg images with .jgw world files, as well as .tif files
- Automatic extraction and handling of zip input files
- Crosswalk detection using YOLO-based models (multiple variants available)
- Automatic georeferencing and filtering of detected crosswalks
- Multiple output formats (JSON/TXT)
- Progress tracking with detailed progress bars
- Organized output with timestamped directories
- Handles both single files and batch processing
- Visualization tool for comparing detection results.
## Prerequisites

- Python 3.10 or higher
- pip package manager
- Git
- CUDA-capable GPU recommended for faster processing
- Sufficient disk space for image processing
- Required Python packages (installed via requirements.txt)
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/UCL-SightLinks/SightLinks-Main.git
  cd SightLinks-Main
  ```

- Create and activate a virtual environment:

  ```bash
  # Create virtual environment
  python -m venv venv

  # Activate virtual environment
  # On Windows:
  venv\Scripts\activate
  # On Linux:
  source venv/bin/activate
  # On macOS: not necessary, as conda is used instead (see below)
  ```

- Install dependencies:

  For Windows and Linux machines:

  ```bash
  sudo apt update
  sudo apt install gdal-bin libgdal-dev
  pip install -r requirements.txt
  ```

  For macOS machines:

  ```bash
  conda create --name venv python=3.10.12
  conda activate venv
  conda install -c conda-forge gdal=3.6.4
  pip install -r requirements.txt
  ```

**IMPORTANT NOTE:**

- Please verify that exactly GDAL 3.6.4 has been installed; the system will not run with a newer version.
- You can check the installed version in your terminal with: `gdalinfo --version`
- GDAL is difficult to manage through Homebrew, as it only keeps the most recent versions of GDAL.
- If a newer version is installed:
  - Uninstall the current GDAL
  - Install version 3.6.4 specifically
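If you want to automate this check, the line printed by `gdalinfo --version` can be parsed and compared against the required version; a minimal sketch (the sample version lines below are illustrative):

```python
def gdal_version_ok(version_line, required="3.6.4"):
    """Check a `gdalinfo --version` line, e.g. "GDAL 3.6.4, released ..."."""
    # The version is the second whitespace-separated token, minus the comma.
    version = version_line.split()[1].rstrip(",")
    return version == required

print(gdal_version_ok("GDAL 3.6.4, released 2023/04/17"))  # True
print(gdal_version_ok("GDAL 3.8.0, released 2023/11/13"))  # False
```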
## Quick Start

- Place input files in the `input` directory:
  - For .jpg/.jgw data: place zip files containing .jpg/.jgw files (e.g. from Digimap), or place the .jpg/.jgw files directly.
  - For .tif data: place zip files containing .tif files, or place the .tif files directly.
- Run the system:

  ```bash
  python run.py
  ```

  ```python
  # This snippet of code is from run.py
  from main import execute

  execute(
      uploadDir="input",             # Input directory
      inputType="0",                 # "0" for .jpg/.jgw, "1" for .tif files
      classificationThreshold=0.35,
      predictionThreshold=0.5,
      saveLabeledImage=False,
      outputType="0",                # "0" for JSON, "1" for TXT
      yolo_model_type="n"            # "n" for nano model
  )
  ```

## Usage Guide

### Output Formats

- JSON Format (`outputType="0"`):
  ```json
  [
    {
      "image": "image_name.jpg",
      "coordinates": [
        [[lon1, lat1], [lon2, lat2], [lon3, lat3], [lon4, lat4]],
        [[lon1, lat1], [lon2, lat2], [lon3, lat3], [lon4, lat4]]
      ]
    }
  ]
  ```

  (The first inner list is crosswalk 1, the second crosswalk 2.)

- TXT Format (`outputType="1"`):
  - One file per original image
  - Each line represents one crosswalk:

    ```
    lon1,lat1 lon2,lat2 lon3,lat3 lon4,lat4
    ```
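The JSON output described above can be consumed directly with the standard library; a minimal sketch (the sample coordinates below are illustrative, not real detector output):

```python
import json

# Sample output in the documented format: one entry per image, each
# detection a quadrilateral of [lon, lat] corner points.
sample = """
[
  {
    "image": "image_name.jpg",
    "coordinates": [
      [[-0.134, 51.524], [-0.133, 51.524], [-0.133, 51.525], [-0.134, 51.525]]
    ]
  }
]
"""

detections = json.loads(sample)
for entry in detections:
    for i, quad in enumerate(entry["coordinates"], start=1):
        # Average the four corners to get a rough centre point.
        centre_lon = sum(p[0] for p in quad) / 4
        centre_lat = sum(p[1] for p in quad) / 4
        print(f"{entry['image']} crosswalk {i}: centre = ({centre_lon}, {centre_lat})")
```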
### Output Directory Structure

```
run/output/YYYYMMDD_HHMMSS/    # Timestamp-based directory
├── output.json                # If JSON output selected
├── image_name.txt             # If TXT output selected (one per image)
└── labeledImages/             # Optional: images with visualized detections
```
## Project Structure

```
SightLinks-Main/
├── classificationScreening/           # Classification screening module
│   ├── classify.py                    # Main classification logic
│   └── utils/                         # Classification utilities
├── imageSegmentation/                 # Image segmentation modules
│   ├── boundBoxSegmentation.py        # Bounding box segmentation
│   └── classificationSegmentation.py  # Classification segmentation
├── models/                            # Model files
│   ├── yolo-n.pt                      # YOLO nano model
│   └── mn3_vs55.pth                   # Classification model
├── georeference/                      # Georeferencing utilities
│   └── Georeference.py                # Coordinate conversion functions
├── utils/                             # Utility functions
│   ├── extract.py                     # File extraction handling
│   ├── compress.py                    # File compression handling
│   ├── filterOutput.py                # Filters bounding boxes to remove duplicates
│   ├── saveToOutput.py                # Saves stored coordinates to the output file
│   └── visualize.py                   # Result analysis tools
├── run/                               # Runtime directories
│   └── output/                        # Timestamped outputs
├── input/                             # Input file directory
├── requirements.txt                   # Python dependencies
├── main.py                            # Main execution module
└── run.py                             # Quick start script
```
## Technical Details

- **File Extraction**
  - Handles .jpg/.jgw files and .tif files
  - Filters out system files and unsupported formats
  - Organizes files for processing
- **Image Segmentation**
  - Segments large aerial images
  - Prepares chunks for classification
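The segmentation step amounts to slicing a large image into overlapping tiles; a minimal numpy sketch (the tile size and overlap here are illustrative, not the values used by `boundBoxSegmentation.py`):

```python
import numpy as np

def segment_image(image, tile=256, overlap=32):
    """Slice an H x W x C array into overlapping square tiles.

    Returns a list of ((row, col), chunk) pairs; edge tiles may be
    smaller than `tile` when the image size is not a multiple of the step.
    """
    step = tile - overlap
    h, w = image.shape[:2]
    tiles = []
    for row, y in enumerate(range(0, max(h - overlap, 1), step)):
        for col, x in enumerate(range(0, max(w - overlap, 1), step)):
            tiles.append(((row, col), image[y:y + tile, x:x + tile]))
    return tiles

aerial = np.zeros((512, 512, 3), dtype=np.uint8)
tiles = segment_image(aerial)
print(len(tiles))  # 9
```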
- **Image Classification**
  - Processes the segmented chunks with the classification model
  - Returns True when the model's confidence exceeds the classification threshold
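The screening decision is a straightforward comparison against `classificationThreshold`; a sketch with made-up scores (the real confidences come from the MobileNet checkpoint in `models/`):

```python
def contains_crosswalk(confidence, threshold=0.35):
    # True when the classifier is confident enough that this chunk
    # is worth passing on to YOLO detection.
    return confidence > threshold

# Hypothetical (row, col) chunk positions with classifier scores.
chunk_scores = {(0, 0): 0.91, (0, 1): 0.12, (1, 0): 0.40}
interesting = [pos for pos, score in chunk_scores.items()
               if contains_crosswalk(score)]
print(interesting)  # [(0, 0), (1, 0)]
```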
- **Image Segmentation (second pass)**
  - Re-segments the images based on the rows and columns of interest (where the classification model returned True)
  - Prepares chunks for detection
- **Crosswalk Detection**
  - Uses the selected YOLO model variant
  - Applies confidence thresholds
  - Supports multiple model types for different speed/accuracy trade-offs
- **Georeferencing**
  - Converts pixel coordinates to geographical coordinates
  - Uses .jgw world files, or the metadata stored in .tif files, for accurate mapping
  - Handles coordinate system transformations
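A .jgw world file is six lines defining an affine transform from pixel (col, row) to world (x, y); a minimal sketch of that conversion (the coefficient values are illustrative, and `georeference/Georeference.py` is the authoritative implementation — world coordinates from sources like Digimap are typically in a projected grid and still need transforming to lat/lon):

```python
def pixel_to_world(col, row, world_file_params):
    """Apply the 6-parameter .jgw affine transform.

    world_file_params = (A, D, B, E, C, F), in the order the six
    lines appear in a .jgw file:
      A: x pixel size; D, B: rotation terms; E: y pixel size (negative);
      C, F: world x/y of the centre of the top-left pixel.
    """
    A, D, B, E, C, F = world_file_params
    x = A * col + B * row + C
    y = D * col + E * row + F
    return x, y

# Illustrative world file: 0.1 m pixels, no rotation, origin (530000, 180000).
params = (0.1, 0.0, 0.0, -0.1, 530000.0, 180000.0)
print(pixel_to_world(100, 200, params))  # (530010.0, 179980.0)
```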
- **Filtering**
  - Removes duplicate bounding boxes using Non-Maximum Suppression
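Non-Maximum Suppression keeps the highest-scoring box and discards any remaining box that overlaps it beyond a threshold; a minimal axis-aligned sketch (the real filter in `utils/filterOutput.py` works on oriented boxes, so this is simplified):

```python
def iou(a, b):
    # Boxes are (x1, y1, x2, y2) axis-aligned rectangles.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    # Visit boxes in descending score order; keep a box only if it does
    # not overlap an already-kept box by more than the threshold.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```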
- **Output Generation**
  - Creates timestamped directories
  - Generates the selected output format
  - Optionally saves labeled images
- GPU acceleration for faster processing
- Row- and column-based filtering for a more optimised filter
- Progress tracking with a progress bar
- Configurable model selection for speed/accuracy balance
## Troubleshooting

Common issues and solutions:

- Memory errors: reduce the batch size or use the nano model
- Missing files: check the input directory structure