Modular research toolkit for deep learning, computer vision, data preprocessing and analysis, 3D visualization, and associated workflows on HPC systems. Features shared utilities for analysis, logging, I/O, and visualization, plus applications for depth processing, multi-view stereo analysis, and scene rendering. Includes SLURM-based automations and configuration-driven execution for reproducible HPC workflows.
Modular research codebase optimized for HPC environments. It combines reusable shared utilities with specialized application modules to support scalable, automated, and reproducible workflows in deep learning, computer vision, and 3D processing, managed by HPC job schedulers.
Repository Structure
code/
├── research_utils/ # Shared utilities package
│ ├── src/research_utils/
│ │ ├── core/
│ │ ├── io/
│ │ ├── logging/
│ │ ├── ops/
│ │ └── viz/
│ └── pyproject.toml
│
├── apps/ # Application-specific code
│ ├── analysis/
│ │ ├── depth_overlay/
│ │ │ ├── src/
│ │ │ ├── scripts/
│ │ │ ├── configs/
│ │ │ ├── output/
│ │ │ ├── logs/
│ │ │ └── pyproject.toml
│ │ └── depth_overlay_compare/
│ │ ├── src/
│ │ ├── scripts/
│ │ ├── configs/
│ │ ├── output/
│ │ ├── logs/
│ │ └── pyproject.toml
│ │
│ └── rendering/
│ └── viser/
│ ├── dl3dv/
│ │ ├── drafts/
│ │ ├── logs/
│ │ ├── render_scene.py
│ │ ├── render_scene_v2.py
│ │ └── pyproject.toml
│ └── wai-vis/ # 3D rendering
│ ├── modules/
│ ├── main.py
│ ├── pyproject.toml
│ └── requirements.txt
│
└── scripts/ # HPC workflow and utility scripts
├── python/
├── slurm/
│ └── da3/
└── archive/
Reusable Python package with core utilities for research workflows:
- Core: I/O operations and logging infrastructure
- Logging: Advanced logging handlers with color support and JSON logging
- Operations: Geometry and mathematical operations
- Visualization: Overlay and plotting utilities
Installation:
cd research_utils
pip install -e .
Application-specific modules that depend on research_utils. Each application includes a pyproject.toml file for dependency management and can be installed in editable mode:
- depth_overlay/: Visualizes depth maps generated by deep learning models via RGB-depth overlays.
- depth_overlay_compare/: Side-by-side comparison of depth maps extracted from multiple models (e.g., Depth-Anything, MoGe, MVSAnywhere) overlaid on the same RGB images to evaluate model performance.
- dl3dv/: Scene rendering scripts for 3D visualization
- wai-vis/: A real-time, interactive web-based 3D visualization tool for WAI/Nerfstudio datasets. It allows researchers to inspect preprocessing results, specifically RGB images coupled with depth maps, by projecting 2D RGB-D data into an interactive 3D point cloud environment, complete with camera frustums, accessible directly via a web browser.
  - See: WAI-Viser
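To illustrate the kind of RGB-depth overlay these analysis tools produce, here is a minimal sketch using NumPy arrays. The function name and the red-to-blue colormap are illustrative assumptions, not the apps' actual API:

```python
import numpy as np

def overlay_depth(rgb: np.ndarray, depth: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Alpha-blend a normalized depth map over an RGB image (HxWx3 uint8 output)."""
    # Normalize depth to [0, 1], guarding against a constant depth map
    d = depth.astype(np.float64)
    span = d.max() - d.min()
    d = (d - d.min()) / span if span > 0 else np.zeros_like(d)
    # Map depth to a simple red-to-blue gradient (near = red, far = blue)
    depth_rgb = np.stack([1.0 - d, np.zeros_like(d), d], axis=-1)
    # Blend the colorized depth over the RGB image
    blended = (1 - alpha) * (rgb.astype(np.float64) / 255.0) + alpha * depth_rgb
    return (np.clip(blended, 0.0, 1.0) * 255).astype(np.uint8)
```

The real applications additionally handle file I/O (EXR depth maps, JPEG/PNG images) and save figures to their output/ directories.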
Top-level scripts directory containing automated HPC workflows and utility tools:
- python/: Utilities for local development and common data processing and research tasks, suited to rapid validation of deep learning outputs and dataset integrity before scaling to full HPC pipelines.
- slurm/: Automations for distributed HPC execution. These manage the end-to-end lifecycle of model inference, multi-view stereo analysis, large-scale data engineering, and preprocessing workflows, including resource allocation, environment staging, and structured error handling.
This codebase follows a modular architecture that promotes code reuse and maintainability:
The research_utils package serves as a centralized library of reusable components that all applications can import and use. This design pattern provides several benefits:
- Code Reuse: Common functionality (logging, I/O, visualization) is implemented once and shared across all applications
- Consistency: All applications use the same logging system, configuration loading, and utility functions
- Maintainability: Bug fixes and improvements to shared utilities benefit all applications automatically
- Separation of Concerns: Application-specific logic is isolated from general-purpose utilities
Example Usage:
from research_utils import plot_overlay, print_args, setup_logging, load_config
# All apps can use these shared utilities
setup_logging("configs/default_logging.json")
config = load_config("configs/default.json")
plot_overlay(rgb_path="...", depth_path="...", save_dir="...")
Each application in the apps/ directory is:
- Self-contained: Has its own configuration files, scripts, and output directories
- Independent: Can be run separately without affecting other applications
- Configurable: Uses JSON configuration files for easy customization
- CLI-based: Provides command-line interfaces for HPC job submission
Applications are designed to be configuration-driven, making them ideal for HPC workflows where parameters may vary between job runs:
- Default Configurations: Each app includes default JSON configs in configs/
- Custom Configurations: Override defaults via command-line arguments
- Auto-Discovery: The load_config() utility automatically searches for config files relative to the calling script
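Such auto-discovery might be implemented along these lines. This is only a sketch; the real search logic lives in research_utils and may differ, and the helper names here are hypothetical:

```python
import json
from pathlib import Path
from typing import Optional

def find_config(name: str, start: Path) -> Optional[Path]:
    """Walk upward from `start`, returning the first configs/<name> found."""
    for directory in [start, *start.parents]:
        candidate = directory / "configs" / name
        if candidate.is_file():
            return candidate
    return None

def load_config_sketch(name: str, start: Path) -> dict:
    """Load a JSON config found relative to the calling script's directory."""
    path = find_config(name, start)
    if path is None:
        raise FileNotFoundError(f"No configs/{name} found above {start}")
    return json.loads(path.read_text())
```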
Example:
# Use default config
python -m scripts.run_depth_overlay --rgb_path /path/to/rgb.jpg --depth_path /path/to/depth.exr
# Use custom config
python -m scripts.run_depth_overlay \
--config_path configs/custom.json \
--log_config configs/logging_verbose.json \
--rgb_path /path/to/rgb.jpg \
--depth_path /path/to/depth.exr
All applications follow a consistent CLI pattern:
- Required arguments: Essential inputs (e.g., file paths)
- Optional arguments: Configurable parameters with sensible defaults
- Config integration: Arguments can override or complement JSON configs
- Logging setup: Logging is initialized early via the --log_config argument
Applications can be run directly or submitted as HPC jobs.
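The CLI pattern above can be sketched with argparse. Only the argument names shown earlier in this document are taken from the repository; everything else is illustrative:

```python
import argparse
import logging

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Depth overlay (CLI pattern sketch)")
    # Required arguments: essential inputs such as file paths
    parser.add_argument("--rgb_path", required=True)
    parser.add_argument("--depth_path", required=True)
    # Optional arguments with sensible defaults; values can override JSON configs
    parser.add_argument("--config_path", default="configs/default.json")
    parser.add_argument("--log_config", default="configs/default_logging.json")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    # Logging is initialized early, before any real work starts
    logging.getLogger(__name__).info("Args parsed", extra={"input_args": vars(args)})
    return args
```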
Multi-formatter system optimized for interactive and batch use:
- Multiple Formatters:
  - JSON Formatter: Structured logging for programmatic analysis and HPC job monitoring
  - Color Formatters: Human-readable console output with syntax highlighting
  - Readable Formatters: Plain text formats for log files
- Structured Logging: Support for custom extra fields that are automatically captured:
  logger.info("Processing file", extra={"path": "/data/file.csv", "request_id": "abc123"})
- Configurable via JSON: Logging behavior is configured through JSON files, allowing different log levels and handlers per application
- Multiple Handlers: Simultaneous output to:
  - Console (stdout): Colored, human-readable output for interactive use
  - File: Persistent logs with rotation support
  - JSONL files: Structured logs for post-processing
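To make the JSON formatter concrete, here is a minimal sketch of what a formatter like JSONFormatter could do. The real research_utils.logging.handlers.JSONFormatter likely captures more fields (timestamps, function names, line numbers); this class name is a stand-in:

```python
import json
import logging

# Attributes present on every bare LogRecord; anything else came from extra={...}
_RESERVED = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {"message", "asctime"}

class JSONFormatterSketch(logging.Formatter):
    """Serialize each record, plus any custom `extra` fields, as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        }
        # Custom extra={...} fields appear as non-standard attributes on the record
        payload.update({k: v for k, v in record.__dict__.items() if k not in _RESERVED})
        return json.dumps(payload)
```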
Logging is configured via JSON files (e.g., configs/default_logging.json):
{
  "version": 1,
  "formatters": {
    "json": {
      "()": "research_utils.logging.handlers.JSONFormatter"
    },
    "readable": {
      "format": "%(asctime)s %(levelname)s %(name)s: %(message)s"
    },
    "readable_color_stdout": {
      "()": "research_utils.logging.handlers.ReadableColorFormatter"
    }
  },
  "handlers": {
    "file": {
      "class": "logging.handlers.RotatingFileHandler",
      "formatter": "readable",
      "filename": "logs/app.log"
    },
    "stdout": {
      "class": "logging.StreamHandler",
      "formatter": "readable_color_stdout",
      "level": "DEBUG"
    }
  },
  "root": {
    "level": "INFO",
    "handlers": ["stdout", "file"]
  }
}
Applications initialize logging early in their execution:
from research_utils import setup_logging
import logging
# setup logging from config file
setup_logging("configs/default_logging.json")
logger = logging.getLogger(__name__)
# use structured logging
logger.info("Processing started", extra={"input_args": vars(args)})
logger.error("File not found", extra={"path": file_path})
- Post-Processing: JSON logs can be parsed and analyzed after job completion
- Debugging: Detailed logs with function names, line numbers, and timestamps
- Monitoring: Structured logs enable automated job monitoring and error detection
- Reproducibility: Log configurations are version-controlled alongside code
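For example, JSONL logs can be scanned after a job completes to surface failures. This is a sketch; the exact field names depend on the formatter configuration in use:

```python
import json
from pathlib import Path
from typing import List

def collect_errors(jsonl_path: Path) -> List[dict]:
    """Return all ERROR-level records from a JSONL log file."""
    errors = []
    for line in jsonl_path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("level") == "ERROR":
            errors.append(record)
    return errors
```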
This codebase is designed to integrate with HPC job schedulers (e.g., SLURM, PBS). When running on HPC systems:
- Environment Setup: Use virtual environments (.venvs/ directory) to manage dependencies per job or workflow
- Resource Management: Configure job scripts to request appropriate compute resources (CPU, GPU, memory)
- Data I/O: Ensure data paths are accessible from compute nodes (shared filesystems, network storage)
- Logging: Structured logging outputs (JSON, JSONL) are designed for post-processing and analysis of HPC job outputs
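A typical SLURM submission script for one of these apps might look like the following. All resource values, environment paths, and the RGB_PATH/DEPTH_PATH variables are illustrative assumptions, not taken from the repository:

```shell
#!/bin/bash
#SBATCH --job-name=depth_overlay
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out

# Environment staging: activate the per-workflow virtual environment
source .venvs/depth_overlay/bin/activate

# Configuration-driven execution; input paths are placeholders
python -m scripts.run_depth_overlay \
    --config_path configs/default.json \
    --log_config configs/default_logging.json \
    --rgb_path "$RGB_PATH" \
    --depth_path "$DEPTH_PATH"
```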
- Deep Learning: PyTorch, TorchVision
- Core & Math: NumPy, OpenCV, Matplotlib, pathlib, OpenEXR, imageio, Pillow
- 3D Processing & Geometry: trimesh, scipy, shapely, manifold3d, rtree
- 3D Formats & Parsing: lxml, jsonschema, pycollada, xxhash
See individual pyproject.toml and requirements.txt files in each application directory for specific dependencies.
All applications depend on research-utils, which must be installed first.
- Install the shared utilities package:
  cd research_utils
  pip install -e .
- Install application-specific dependencies. Each app can be installed in editable mode using its pyproject.toml:
# Install a specific application
cd apps/analysis/depth_overlay
pip install -e .
# Or install multiple apps
cd apps/rendering/viser/wai-vis
pip install -e .
Each application typically includes:
- Configuration files in configs/
- Scripts in scripts/
- Output directories for results
- Logging directories for job outputs
Refer to individual application directories for specific usage instructions.
- Python 3.8+
- Virtual environments are strongly recommended to avoid dependency version conflicts
- Follow the existing structure when adding new applications or utilities
- This repository is part of a larger HPC workflow system
- Configuration files use JSON format