Open
37 commits
3f6a949
Refactor code structure for improved readability and maintainability
surendhar-palanisamy Mar 10, 2026
e554d7c
Implement cold-start stabilization and queue management enhancements
surendhar-palanisamy Mar 10, 2026
366044a
Increase bootstrap passes for cold-start stabilization and update mod…
surendhar-palanisamy Mar 11, 2026
7216e15
Comment out destination list and planner tests in main function for c…
surendhar-palanisamy Mar 11, 2026
c0f68b4
Enhance cold-start stabilization and queue management with improved l…
surendhar-palanisamy Mar 11, 2026
935c7bf
Add detailed logging for planner and localization calls with parameters
surendhar-palanisamy Mar 11, 2026
b283a74
Reduce bootstrap passes from 3 to 2 for refinement queue processing
surendhar-palanisamy Mar 11, 2026
4230ff3
Add top_k optimization suggestions for improved candidate matching
surendhar-palanisamy Mar 11, 2026
430f211
Set default value for top_k parameter in UnavServer class to 50
surendhar-palanisamy Mar 12, 2026
944b6e9
Set top_k default to None in UnavServer class
surendhar-palanisamy Mar 12, 2026
59f47b5
add h200
surendhar-palanisamy Mar 12, 2026
8d45dac
add h200
surendhar-palanisamy Mar 12, 2026
c13d451
add Agents.md file
surendhar-palanisamy Mar 13, 2026
a5f82fa
refactor navigation
surendhar-palanisamy Mar 13, 2026
f270207
Refactor: Extract initialization and navigation logic to separate mod…
surendhar-palanisamy Mar 13, 2026
51525b8
remove bunch of un-used methods
surendhar-palanisamy Mar 13, 2026
a50bbb8
Fix: Remove duplicate placeholder methods causing empty dict returns
surendhar-palanisamy Mar 13, 2026
86aa2b2
Clean up: Remove duplicate placeholder methods
surendhar-palanisamy Mar 13, 2026
b5fce70
Clean up: Remove duplicate placeholder methods
surendhar-palanisamy Mar 13, 2026
ddb1a11
Refactor: Move get_places to logic/places.py
surendhar-palanisamy Mar 13, 2026
437d49e
Extract maps, utils, and VLM functions to logic/ modules
surendhar-palanisamy Mar 13, 2026
8ce752f
Extract _safe_serialize to logic/utils.py
surendhar-palanisamy Mar 13, 2026
f82b999
Extract localize_user to thin wrapper calling run_localize_user
surendhar-palanisamy Mar 13, 2026
3b48917
remove @method decordator
surendhar-palanisamy Mar 13, 2026
4df48ce
Remove redundant wrapper methods, call run_* functions directly from …
surendhar-palanisamy Mar 13, 2026
a596f98
Remove remaining wrapper methods and unused imports
surendhar-palanisamy Mar 13, 2026
2f0629e
Update AGENTS.md with new project structure and refactoring patterns
surendhar-palanisamy Mar 13, 2026
0dee938
Fix: call run_get_places and run_ensure_maps_loaded directly instead …
surendhar-palanisamy Mar 13, 2026
6da6850
planner change
surendhar-palanisamy Mar 13, 2026
5ba0868
Add detailed logging for planner and localize_user functions
surendhar-palanisamy Mar 13, 2026
aca6b26
Add timing and error logging to planner and localize_user
surendhar-palanisamy Mar 13, 2026
7db5492
Add bootstrap pass logging for cold start localization
surendhar-palanisamy Mar 13, 2026
01bb8fa
Update test parameters for 17_floor
surendhar-palanisamy Mar 13, 2026
37550e8
Simplify navigation to match original.py - single pass localize, prop…
surendhar-palanisamy Mar 13, 2026
a7631d0
Revert "Simplify navigation to match original.py - single pass locali…
surendhar-palanisamy Mar 13, 2026
d1002be
fix image shape
surendhar-palanisamy Mar 13, 2026
5d302c0
Fix refinement queue handling to match unav-server
surendhar-palanisamy Mar 13, 2026
5 changes: 3 additions & 2 deletions .github/workflows/deploy-unav-v2-modal.yml
@@ -17,6 +17,7 @@ on:
           - t4
           - any
           - a100
+          - h200
       ram_mb:
         description: "RAM reservation in MiB (max in this workflow: 98304)"
         required: true
@@ -58,8 +59,8 @@ jobs:
           if parsed <= 0:
               raise ValueError("scaledown_window must be a positive integer")
           gpu = os.environ["UNAV_GPU_TYPE"].strip().lower()
-          if gpu not in {"a10", "t4", "any", "a100"}:
-              raise ValueError("gpu_type must be one of: a10, t4, any, a100")
+          if gpu not in {"a10", "t4", "any", "a100", "h200"}:
+              raise ValueError("gpu_type must be one of: a10, t4, any, a100, h200")
           ram_mb = int(os.environ["UNAV_RAM_MB"])
           max_ram_mb = 98304
           if ram_mb <= 0:
197 changes: 197 additions & 0 deletions AGENTS.md
@@ -0,0 +1,197 @@
# UNav-Server Agent Guidelines

This document provides guidelines for agents working on the UNav-Server codebase.

## Project Overview

UNav-Server provides a serverless implementation for indoor navigation using computer vision. It leverages Modal for deployment and offers features like visual localization, path planning, and navigation guidance.

## Project Structure

```
UNav-Server/
├── src/
│   └── modal_functions/
│       ├── unav_v1/                        # Legacy version
│       ├── unav_v2/                        # Current production version
│       │   ├── unav_modal.py               # Main Modal app (~200 lines)
│       │   ├── logic/                      # Extracted business logic
│       │   │   ├── __init__.py             # Exports all run_* functions
│       │   │   ├── navigation.py           # run_planner, run_localize_user
│       │   │   ├── init.py                 # Initialization & monkey-patching
│       │   │   ├── places.py               # run_get_places, run_get_fallback_places
│       │   │   ├── maps.py                 # run_ensure_maps_loaded
│       │   │   ├── utils.py                # run_safe_serialize, etc.
│       │   │   └── vlm.py                  # run_vlm_on_image
│       │   ├── server_methods/
│       │   │   └── helpers.py              # Queue utility functions
│       │   ├── test_modal_functions.py
│       │   ├── modal_config.py
│       │   ├── deploy_config.py
│       │   ├── destinations_service.py
│       │   └── media/                      # Test images
│       └── volume_utils/                   # Volume management utilities
├── .github/workflows/                      # CI/CD workflows
├── requirements.txt                        # Python dependencies
└── TODO.md                                 # Technical documentation
```

## Build/Lint/Test Commands

### Python Version
- Minimum: Python 3.10+
- Recommended: Python 3.11 (used in CI/CD)

### Setup
```bash
# Create virtual environment
python -m venv .venv

# Activate (macOS/Linux)
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

### Running Tests
```bash
# Navigate to the module directory
cd src/modal_functions/unav_v2

# Run a single test file
python test_modal_functions.py

# Run with pytest (if installed)
pytest test_modal_functions.py -v
```

### Deployment Commands
```bash
# Deploy to Modal (from unav_v2 directory)
cd src/modal_functions/unav_v2
modal deploy unav_modal.py

# Deploy with custom parameters
UNAV_SCALEDOWN_WINDOW=600 UNAV_GPU_TYPE=t4 UNAV_RAM_MB=73728 modal deploy unav_modal.py
```

### GitHub Actions Deployment
1. Go to Actions -> "Deploy UNav v2 Modal" -> "Run workflow"
2. Set inputs: scaledown_window, gpu_type, ram_mb
3. Requires secrets: MODAL_TOKEN_ID, MODAL_TOKEN_SECRET

## Code Style Guidelines

### Import Organization
Order: stdlib -> third-party -> local imports, with blank lines between groups.

```python
import os
import json
from typing import Dict, List, Any, Optional

import modal
import cv2
import numpy as np

from .deploy_config import get_scaledown_window
from .logic import run_planner, run_localize_user
```

### Naming Conventions
- **Functions/variables**: snake_case (e.g., `get_destinations_list`, `image_data`)
- **Classes**: PascalCase (e.g., `UnavServer`, `FacilityNavigator`)
- **Constants**: UPPER_SNAKE_CASE (e.g., `BUILDING`, `PLACE`)
- **Logic functions**: prefix with `run_` (e.g., `run_planner`, `run_safe_serialize`)
- **Private methods**: prefix with underscore (e.g., `_configure_middleware_tracing`)

### Type Hints
Use type hints for function parameters and return values.

```python
def get_destinations_list_impl(
    server: Any,
    floor: str = "6_floor",
    place: str = "New_York_City",
    enable_multifloor: bool = False,
) -> Dict[str, Any]:
```

### Refactoring Pattern: Logic Extraction

When extracting code from `unav_modal.py`:

1. **Keep `@method()` decorators in `unav_modal.py`** - Modal requires them
2. **Move logic to `logic/` directory** - Each function gets `run_` prefix
3. **Thin wrapper pattern** - Method in unav_modal.py just calls the logic function

```python
# unav_modal.py - thin wrapper
@method()
def planner(self, session_id: str, ...):
    return run_planner(self, session_id=session_id, ...)

# logic/navigation.py - actual logic
def run_planner(self, session_id: str, ...) -> Dict[str, Any]:
    # Full implementation here
    pass
```

**DO NOT create wrapper methods** for internal functions (e.g., `get_session`, `update_session`) - call `run_*` functions directly from logic modules.

### Error Handling
- Use try/except blocks for operations that may fail
- Catch specific exceptions when possible
- Return error dictionaries for recoverable errors
- Use print statements with emojis for logging

```python
try:
    result = some_function()
except ValueError as e:
    print(f"❌ Invalid value: {e}")
    return {"status": "error", "message": str(e)}
except Exception as e:
    print(f"❌ Unexpected error: {e}")
    raise
```

### Code Formatting
- Maximum line length: 100 characters (soft limit)
- Use 4 spaces for indentation (no tabs)
- Use blank lines to separate logical sections
- Use trailing commas in multi-line collections
- Use f-strings for string interpolation

### Logging Patterns
- `print("🔧 [Phase X] ...")` - Initialization steps
- `print("✅ ...")` - Success messages
- `print("⚠️ ...")` - Warnings
- `print("❌ ...")` - Errors
- `print(f"[DEBUG] ...")` - Debug info

### Testing Guidelines
- Test files: `test_modal_functions.py`
- Use descriptive test parameters (BUILDING, PLACE, FLOOR, etc.)
- Include error handling for Modal class lookup
- Test both success and failure paths when applicable

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| UNAV_SCALEDOWN_WINDOW | 300 | Modal scaledown window (seconds) |
| UNAV_GPU_TYPE | t4 | GPU type (a10, t4, a100, any, h200) |
| UNAV_RAM_MB | 73728 | RAM reservation in MiB |
| MODAL_TOKEN_ID | - | Modal token (GitHub secret) |
| MODAL_TOKEN_SECRET | - | Modal secret (GitHub secret) |
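
The variables above could be read and validated along the lines of the check in the deploy workflow. The following is a minimal sketch, not the contents of `deploy_config.py`: `load_deploy_settings` is a hypothetical name, the defaults are taken from the table, and the allowed GPU set and the 98304 MiB cap are taken from the workflow's validation step.

```python
import os
from typing import Dict, Mapping, Optional, Union

ALLOWED_GPUS = {"a10", "t4", "a100", "any", "h200"}
MAX_RAM_MB = 98304  # cap stated in the deploy workflow input description


def load_deploy_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[int, str]]:
    """Hypothetical helper: read the UNAV_* variables, apply defaults, validate."""
    env = os.environ if env is None else env

    scaledown = int(env.get("UNAV_SCALEDOWN_WINDOW", "300"))
    if scaledown <= 0:
        raise ValueError("scaledown_window must be a positive integer")

    # Normalize the GPU name the same way the workflow does before checking it.
    gpu = env.get("UNAV_GPU_TYPE", "t4").strip().lower()
    if gpu not in ALLOWED_GPUS:
        allowed = ", ".join(sorted(ALLOWED_GPUS))
        raise ValueError(f"gpu_type must be one of: {allowed}")

    ram_mb = int(env.get("UNAV_RAM_MB", "73728"))
    if not 0 < ram_mb <= MAX_RAM_MB:
        raise ValueError(f"ram_mb must be between 1 and {MAX_RAM_MB}")

    return {"scaledown_window": scaledown, "gpu_type": gpu, "ram_mb": ram_mb}
```

Passing a plain dict instead of `os.environ` keeps the check easy to exercise locally without touching the real environment.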

## Notes for Agents

- This is a Modal-based serverless application
- Tests require a deployed Modal app to run against
- The codebase uses the unav-core library internally (runtime dependency - LSP errors are expected locally)
- Code changes may require redeployment to take effect
- Check TODO.md for technical context on implementation decisions
- Runtime imports (torch, unav, middleware, google.genai) only exist in Modal container
121 changes: 121 additions & 0 deletions TODO.md
@@ -0,0 +1,121 @@
# UNav-VIS4ION Alignment Tasks

## Context
The output differences between the Modal deployment and the static-GPU server come from the **wrapper/orchestration layer**, not from unav-core itself.

## Root Cause Analysis

### Why unav-server (static GPU) works better:
1. **Map loading**: Loads ALL floors at startup (`localizer.load_maps_and_features()`)
2. **Session persistence**: Sessions stay in memory across requests, so the queue persists
3. **Queue works**: Subsequent calls use refinement queue → better accuracy

### Why Modal fails:
1. **Floor-locked maps**: Only loads target floor, not all floors
2. **No session persistence**: `enable_memory_snapshot=False`, and each call hits a different container
3. **Queue doesn't work**: Session/queue lost on cold-start = **ALWAYS cold start**

This is why the X-value drifts: every single localization is a cold start!

## Codex Comparison Results
- Test case: `vianys_640_360_langone_elevator.jpeg`, NYU Langone, destination context set to `15_floor`
- Floor-lock issue: 0/5 successful localizations (floor-locked) → 5/5 successful (multifloor)
- Cold-start stabilization: XY error improved from 160.79 px → 58.90 px → 47.92 px

---

## Changes Implemented (Chronological)

### 1. Enable Multifloor by Default
- **Changed**: `enable_multifloor` default from `False` to `True` in both `planner()` and `localize_user()`
- **Why**: Matches unav-server behavior - loads all floors for the building instead of just target floor
- **Impact**: Fixes 0/5 → 5/5 success rate

### 2. Queue Bucketing by Image Shape
- **Added** helper functions:
- `_get_queue_key_for_image_shape()` - generates queue key based on `image.shape[:2]` (e.g., "360x640")
- `_get_refinement_queue_for_map()` - retrieves queue for specific map_key and queue_key
- `_update_refinement_queue()` - updates queue for specific map_key and queue_key
- **Modified** queue structure to be nested: `{best_map_key: {queue_key: {pairs, initial_poses, pps}}}`
- **Note**: Less critical since queue doesn't persist in serverless anyway

### 3. Cold-Start Multi-Pass Stabilization (v1 - 2 passes)
- **Initial**: Ran 2 localization passes on cold-start, averaged results
- **Why**: Since queue doesn't work in serverless, each request needs self-correction

### 4. Cold-Start Multi-Pass Stabilization (v2 - 3 passes)
- **Changed**: Upgraded from 2 passes to 3 passes
- **Updated**: bootstrap_mode label from "mean_pass2_pass3" to "mean_all_passes"
- **Why**: Averaging over more samples improved the result, although returns diminish beyond 3 passes

### 5. Add Debug Fields
Added `debug_info` to responses:
- `map_scope`: "building_level_multifloor" or "floor_locked"
- `bootstrap_mode`: "mean_all_passes", "single_pass", or "none"
- `bootstrap_passes`: number of passes run
- `queue_key`: image shape bucket key
- `n_frames`: number of frames in queue
- `top_candidates_count`: number of VPR candidates
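
As an illustration of how these fields fit together, here is a minimal sketch that assembles them into one dictionary. `build_debug_info` is a hypothetical helper, and deriving `map_scope` directly from `enable_multifloor` is an assumption; the actual response-building code is not shown in this document.

```python
from typing import Any, Dict, Optional, Sequence

VALID_BOOTSTRAP_MODES = {"mean_all_passes", "single_pass", "none"}


def build_debug_info(
    enable_multifloor: bool,
    bootstrap_mode: str,
    bootstrap_passes: int,
    image_shape: Optional[Sequence[int]],
    n_frames: int,
    top_candidates_count: int,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the debug fields listed above."""
    if bootstrap_mode not in VALID_BOOTSTRAP_MODES:
        raise ValueError(f"unknown bootstrap_mode: {bootstrap_mode!r}")

    # Same bucketing rule as the documented queue key: "<height>x<width>".
    if image_shape is None:
        queue_key = "default"
    else:
        h, w = image_shape[:2]
        queue_key = f"{h}x{w}"

    return {
        # Assumption: map scope follows the multifloor flag one-to-one.
        "map_scope": "building_level_multifloor" if enable_multifloor else "floor_locked",
        "bootstrap_mode": bootstrap_mode,
        "bootstrap_passes": bootstrap_passes,
        "queue_key": queue_key,
        "n_frames": n_frames,
        "top_candidates_count": top_candidates_count,
    }
```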

---

## Technical Details

### Helper Functions (lines 13-42)
```python
def _get_queue_key_for_image_shape(image_shape):
    """Get a queue key based on image shape for bucket-based refinement queue handling."""
    if image_shape is None:
        return "default"
    h, w = image_shape[:2]
    return f"{h}x{w}"

def _get_refinement_queue_for_map(queue_dict, map_key, queue_key):
    """Get the refinement queue for a specific map_key and queue_key."""
    ...

def _update_refinement_queue(queue_dict, map_key, queue_key, new_queue_state):
    """Update the refinement queue for a specific map_key and queue_key."""
    ...
```
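
The bodies of the two queue helpers are elided above. One way they could work against the nested `{map_key: {queue_key: {...}}}` structure is sketched below. The function names are hypothetical stand-ins, and the empty-queue shape (three empty lists) is an assumption based only on the field names `pairs`, `initial_poses`, and `pps`.

```python
from typing import Any, Dict

QueueState = Dict[str, Any]
QueueDict = Dict[str, Dict[str, QueueState]]


def empty_queue_state() -> QueueState:
    # Assumed shape: the three fields named in the nested queue structure.
    return {"pairs": [], "initial_poses": [], "pps": []}


def get_queue_sketch(queue_dict: QueueDict, map_key: str, queue_key: str) -> QueueState:
    """Return the queue for (map_key, queue_key), or a fresh empty one.

    The lookup does not modify queue_dict when the bucket is missing.
    """
    return queue_dict.get(map_key, {}).get(queue_key, empty_queue_state())


def update_queue_sketch(
    queue_dict: QueueDict, map_key: str, queue_key: str, new_state: QueueState
) -> None:
    """Store new_state under queue_dict[map_key][queue_key], creating levels as needed."""
    queue_dict.setdefault(map_key, {})[queue_key] = new_state
```

Keeping one bucket per image shape means frames of different resolutions never share a refinement queue.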

### Cold-Start Stabilization Logic
- Triggered when `len(refinement_queue) == 0` (always in serverless)
- Runs 3 bootstrap passes on empty queue
- Averages XY coordinates and angle from all 3 passes
- Updates queue after each pass for next iteration
- Falls back to single pass if fewer than 2 succeed
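
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in `planner()` or `localize_user()`: `bootstrap_localize` and `localize_once` are hypothetical names, a pose is assumed to be a dict with `x`, `y`, and `angle` in degrees, and the circular mean used for the angle is an assumption (the text only says the angle is averaged).

```python
import math
from typing import Any, Callable, Dict, List, Optional

Pose = Dict[str, float]  # assumed keys: "x", "y", "angle" (degrees)


def bootstrap_localize(
    localize_once: Callable[[dict], Optional[Pose]],
    queue: dict,
    passes: int = 3,
) -> Optional[Dict[str, Any]]:
    """Run several passes starting from an empty queue and average the successes."""
    results: List[Pose] = []
    for _ in range(passes):
        # Assumption: localize_once updates `queue` in place, so each pass
        # sees the state left behind by the previous one.
        pose = localize_once(queue)
        if pose is not None:
            results.append(pose)

    if not results:
        return None
    if len(results) < 2:
        # Fewer than 2 successes: fall back to the single available pose.
        return {"pose": results[0], "bootstrap_mode": "single_pass",
                "bootstrap_passes": passes}

    n = len(results)
    # Circular mean, so headings on either side of 0 degrees do not average to 180.
    sin_sum = sum(math.sin(math.radians(p["angle"])) for p in results)
    cos_sum = sum(math.cos(math.radians(p["angle"])) for p in results)
    mean_pose = {
        "x": sum(p["x"] for p in results) / n,
        "y": sum(p["y"] for p in results) / n,
        "angle": math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0,
    }
    return {"pose": mean_pose, "bootstrap_mode": "mean_all_passes",
            "bootstrap_passes": passes}
```

A plain arithmetic mean of the angles would be simpler, but it misbehaves when headings straddle the 0/360 boundary, which is why the sketch averages unit vectors instead.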

### Code Locations
- `planner()`: lines ~1470-1520
- `localize_user()`: lines ~1902-1952

---

## Expected Improvements

| Metric | Before | After |
|--------|--------|-------|
| Success rate (floor-locked) | 0/5 | 5/5 |
| XY error (cold-start) | ~160px | ~50px |
| Map scope | floor-locked | building-level |

---

## Future Considerations

1. **Enable memory snapshots**: Could persist queue across cold-starts (but adds ~5-10s restore time)
2. **Client-side queue**: Pass queue with each request
3. **External storage**: Redis for queue persistence
4. **top_k optimization**: Experiment with different top_k values:
- Default (None) uses config value (~10-20)
- Lower top_k = faster but fewer candidates
- Higher top_k = slower but more candidates to match against
- Consider making it dynamic based on image quality

---

## Notes
- Test image: `vianys_640_360_langone_elevator.jpeg`
- Expected floor: `15_floor`
- Reference mean output should be used for comparison
1 change: 1 addition & 0 deletions src/modal_functions/unav_v2/deploy_config.py
@@ -27,6 +27,7 @@ def get_gpu_config() -> List[str]:
         "a10": "A10",
         "a10g": "A10",
         "a100": "A100",
+        "h200": "H200",
         "any": "any",
     }

13 changes: 9 additions & 4 deletions src/modal_functions/unav_v2/destinations_service.py
@@ -1,5 +1,8 @@
 from typing import Any
 
+from .logic.places import run_get_places
+from .logic.maps import run_ensure_maps_loaded
+
 
 def get_destinations_list_impl(
     server: Any,
@@ -32,15 +35,17 @@ def _run():
         print(f"🎯 [Phase 3] Getting destinations for {place}/{building}/{floor}")
 
         # Ensure maps are loaded for this location.
-        server.ensure_maps_loaded(
-            place,
-            building,
+        run_ensure_maps_loaded(
+            server=server,
+            place=place,
+            building=building,
             floor=floor,
             enable_multifloor=enable_multifloor,
         )
 
         if enable_multifloor:
-            places = server.get_places(
+            places = run_get_places(
+                server,
                 target_place=place,
                 target_building=building,
                 enable_multifloor=True,