6 changes: 6 additions & 0 deletions .gitignore
@@ -8,7 +8,13 @@ wheels/

# Virtual environments
.venv
venv/
apps/api-inference/.venv
apps/api-inference-yolo/.venv

# SAM3 Model Weights
apps/sam3.pt
sam3.pt

# Environment variables
.env
229 changes: 203 additions & 26 deletions README.md
@@ -35,6 +35,14 @@
## ✨ Features

- **⚡ Automated Segmentation**: SAM3 inference runs locally or via optimized endpoints to auto-segment objects instantly. Use text prompts or bounding boxes to get pixel-perfect masks in milliseconds.
- **Text Prompts**: Describe objects in natural language ("person walking", "car on road")
- **Bounding Box Exemplars**: Draw boxes around objects to find all similar instances
- **Smart Prompt Memory**: Automatically remembers the last text prompt used for each label class

- **🔄 Intelligent Modes**: Choose the workflow that matches your task
- **Single Mode**: Process one image at a time with full control
- **Auto-Apply Mode**: Set your prompt once, automatically processes each new image
- **Batch Mode**: Select multiple images and process them all with the same prompts

- **🎯 Manual Precision**: Need to tweak the AI's work? Use our pixel-perfect pen, rectangle, and polygon tools for fine-tuning your annotations with complete control.

@@ -52,75 +52,237 @@

## Architecture

AnnotateANU is a simple monorepo with two independent applications and **two backend options**:

```
annotate-anu/ # Simple Monorepo
├── apps/
│ ├── web/ # React annotation interface
│ │ ├── src/
│ │ ├── Dockerfile
│ │ └── package.json
│ ├── api-inference/ # FastAPI HuggingFace SAM3 backend (gated model)
│ │ ├── src/app/
│ │ ├── Dockerfile
│ │ └── pyproject.toml
│ ├── api-inference-yolo/ # FastAPI Ultralytics SAM3 backend (recommended)
│ │ ├── src/app/
│ │ ├── Dockerfile
│ │ └── pyproject.toml
│ └── sam3.pt # SAM3 model weights (not included, see setup)
├── venv/ # Python virtual environment
├── docker-compose.yml # Orchestrates all services
├── setup-yolo.sh/bat # Setup script for Ultralytics backend
├── setup-hf.sh/bat # Setup script for HuggingFace backend
├── run-yolo.sh/bat # Run script for Ultralytics backend
├── run-hf.sh/bat # Run script for HuggingFace backend
├── Makefile # Development commands
├── package.json # Root config
└── README.md
```

### Backend Options

**🚀 Ultralytics SAM3 (Recommended)** - `api-inference-yolo`
- ✅ Faster inference with FP16 support
- ✅ Better text prompt segmentation with semantic understanding
- ✅ Bounding box exemplar-based segmentation for finding similar objects
- ✅ No HuggingFace token required at runtime (an account is still needed once to download the gated weights)
- ✅ Supports single, auto-apply, and batch processing modes
- 📦 Uses: `ultralytics`, PyTorch, SAM3SemanticPredictor

**🔄 HuggingFace SAM3** - `api-inference`
- ✅ Auto-downloads model on first run
- ⚠️ Requires HuggingFace account and gated model access
- 📦 Uses: `transformers`, `huggingface-hub`
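
The FP16 point above is primarily about memory and bandwidth: half-precision values take two bytes instead of four, roughly halving activation and mask-logit storage. A quick stdlib illustration, independent of any SAM3 code:

```python
import struct

# struct format 'e' is IEEE 754 half precision (FP16), 'f' is single precision (FP32)
fp16_bytes = struct.calcsize("e")
fp32_bytes = struct.calcsize("f")

print(f"FP16: {fp16_bytes} bytes, FP32: {fp32_bytes} bytes")  # FP16: 2 bytes, FP32: 4 bytes
```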

## Quick Start

### Prerequisites

- **Python 3.12+** (required for local setup)
- **Node.js 18+** and **npm** (required for frontend)
- **Docker & Docker Compose** (optional, for containerized setup)

### Choose Your Backend

#### Option 1: Ultralytics SAM3 (Recommended) ⚡

**Requirements:**
- Download `sam3.pt` model weights manually (see instructions below)
- No HuggingFace token needed at runtime (an account is used once to download the weights)

**Setup & Run:**

```bash
# 1. Clone the repository
git clone https://github.com/agfianf/annotate-anu.git
cd annotate-anu

# 2. Download SAM3 model weights
# Visit: https://huggingface.co/facebook/sam3
# Request access, then download sam3.pt
# Place it in: apps/sam3.pt

# 3. Run setup script
./setup-yolo.sh # Linux/Mac
setup-yolo.bat # Windows

# 4. Start the application
./run-yolo.sh # Linux/Mac
run-yolo.bat # Windows

# Access the application
# Frontend: http://localhost:5173
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```

#### Option 2: HuggingFace SAM3

**Requirements:**
- HuggingFace account with gated model access
- Model auto-downloads on first run (~2.4GB)

**Setup & Run:**

```bash
# 1. Clone the repository
git clone https://github.com/agfianf/annotate-anu.git
cd annotate-anu

# 2. Setup HuggingFace token
# - Create account: https://huggingface.co/join
# - Request access: https://huggingface.co/facebook/sam3
# - Generate token: https://huggingface.co/settings/tokens

# 3. Run setup script
./setup-hf.sh # Linux/Mac
setup-hf.bat # Windows

# 4. Add your HuggingFace token
# Edit apps/api-inference/.env:
# HF_TOKEN=hf_your_token_here

# 5. Start the application
./run-hf.sh # Linux/Mac
run-hf.bat # Windows
```

### SAM3 Model Weights Setup

**⚠️ IMPORTANT: Unlike other Ultralytics models, SAM3 weights (`sam3.pt`) are NOT automatically downloaded.**

You must manually download the model weights:

1. **Request Access**: Visit [https://huggingface.co/facebook/sam3](https://huggingface.co/facebook/sam3) and click "Request Access"
2. **Wait for Approval**: You'll receive an email when approved (usually within a few hours)
3. **Download Model**: Once approved, go to the "Files" tab and download `sam3.pt` (~2.4GB)
4. **Place File**: Put `sam3.pt` in `apps/sam3.pt` (relative to project root)

```
# Correct location:
annotate-anu/
└── apps/
└── sam3.pt # Place the downloaded file here
```

### Docker Setup (Alternative)

```bash
# 1. Setup environment (choose your backend)
cp apps/api-inference-yolo/.env.example apps/api-inference-yolo/.env
# OR
cp apps/api-inference/.env.example apps/api-inference/.env

# 2. Add credentials and place sam3.pt in apps/

# 3. Start all services
make docker-up

# 4. Access the application
# Frontend: http://localhost:5173
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```

## Troubleshooting

### TypeError: 'SimpleTokenizer' object is not callable

If you encounter this error during prediction with Ultralytics SAM3:

```bash
# Activate your virtual environment first
source venv/bin/activate # Linux/Mac
# OR
venv\Scripts\activate.bat # Windows

# Fix the CLIP package conflict
pip uninstall clip -y
pip install git+https://github.com/ultralytics/CLIP.git
```

This error occurs when the wrong `clip` package is installed. The Ultralytics-specific CLIP package is required.

### Model Loading Issues

**Problem**: "SAM3 model weights not found"
- **Solution**: Ensure the weights file exists at `apps/sam3.pt` (relative to the project root)
- Check file permissions: `ls -la apps/sam3.pt`

**Problem**: "CUDA out of memory"
- **Solution**: Reduce image size or switch to CPU mode in `.env`:
```bash
SAM3_DEVICE=cpu
```
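
For reference, `SAM3_DEVICE=auto` presumably resolves to GPU when one is available and CPU otherwise. A hedged sketch of that logic, with `cuda_available` standing in for `torch.cuda.is_available()` (the real backend may differ):

```python
def resolve_device(setting: str, cuda_available: bool) -> str:
    """Map the SAM3_DEVICE setting to a concrete device string."""
    if setting == "auto":
        return "cuda" if cuda_available else "cpu"
    return setting  # explicit "cpu" or "cuda" passes through unchanged

print(resolve_device("auto", cuda_available=False))  # cpu
```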

**Problem**: Backend shows "0 detections"
- **Solution**: Lower the confidence threshold (default 0.25)
- Try different text prompts (e.g., "object" instead of specific names)
- Check image quality and size
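
The effect of the threshold can be pictured with a small filtering sketch (illustrative only; the real backend applies its threshold inside the model pipeline, and the `label`/`score` fields here are hypothetical):

```python
def filter_detections(detections, threshold=0.25):
    """Keep only detections whose confidence meets the threshold (default 0.25)."""
    return [d for d in detections if d["score"] >= threshold]

raw = [{"label": "person", "score": 0.31}, {"label": "person", "score": 0.12}]
print(len(filter_detections(raw)))                  # 1: only the 0.31 detection survives
print(len(filter_detections(raw, threshold=0.1)))   # 2: lowering the threshold recovers both
```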

### Port Already in Use

```bash
# Linux/Mac
lsof -ti:8000 | xargs kill -9 # Kill backend
lsof -ti:5173 | xargs kill -9 # Kill frontend

# Windows
netstat -ano | findstr :8000 # Find PID
taskkill /F /PID <PID> # Kill process
```
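
To check whether a port is actually free before restarting the services, a small cross-platform sketch:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0  # 0 means the connect succeeded

# Check the backend (8000) and frontend (5173) ports before launching
for p in (8000, 5173):
    if port_in_use(p):
        print(f"port {p} is busy; free it before starting")
```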

## 🚀 Roadmap - Coming Soon

We are constantly evolving. Here's what's shipping next in AnnotateANU:

#### 🎨 Enhanced Annotation Tools
- **Magic Wand Tool**: Click-to-segment for quick region selection
- **Edge Refinement**: AI-powered edge smoothing for precise mask boundaries
- **Annotation Templates**: Save and reuse common annotation patterns

#### 🔌 Bring Your Own Model (BYOM)
Connect your existing custom models via API. Pre-label your images using your own weights to bootstrap the annotation process even faster.

#### 🤖 Advanced AI Features
- **Active Learning**: Intelligently suggest which images to annotate next
- **Cross-Image Tracking**: Track objects across video frames or image sequences
- **Multi-Model Ensemble**: Combine predictions from multiple models for better accuracy

#### ☁️ Enterprise Storage Integration
Move beyond browser storage. We're adding native integration for MinIO and S3-compatible object storage, allowing you to pull and sync datasets directly from your cloud buckets.

#### 👥 Collaboration Features
- **Team Workspaces**: Share projects and annotations across team members
- **Review Mode**: Approve or reject annotations with comment threads
- **Version Control**: Track annotation history and changes over time

#### 📊 Analytics & Insights
- **Annotation Statistics**: Track productivity metrics and dataset balance
- **Quality Checks**: Automated validation for annotation consistency
- **Export Analytics**: Detailed reports on dataset composition


## 🤝 Contributing

@@ -141,10 +311,17 @@ MIT License - see [LICENSE](LICENSE) file for details.

## References

- [SAM3 Model (HuggingFace)](https://huggingface.co/facebook/sam3)
- [Ultralytics SAM3 Documentation](https://docs.ultralytics.com/models/sam-3/)
- [T-REX Label](https://www.trexlabel.com/)
- [MakeSense.ai](https://www.makesense.ai/)

## Acknowledgments

- **Meta AI** for the SAM3 (Segment Anything Model 3) architecture
- **Ultralytics** for the excellent SAM3 implementation and PyTorch optimization
- The open-source community for inspiration and tools


---

28 changes: 28 additions & 0 deletions apps/api-inference-yolo/.env.example
@@ -0,0 +1,28 @@
# Application Settings
DEBUG=true
APP_HOST=0.0.0.0
APP_PORT=8000
LOG_LEVEL=INFO

# HuggingFace Authentication (not required at runtime for this backend; weights load from apps/sam3.pt)
# Get your token from: https://huggingface.co/settings/tokens
HF_TOKEN=hf_your_token_here

# SAM3 Model Settings
SAM3_MODEL_NAME=facebook/sam3
SAM3_DEVICE=auto
SAM3_DEFAULT_THRESHOLD=0.5
SAM3_DEFAULT_MASK_THRESHOLD=0.5

# API Limits
MAX_IMAGE_SIZE_MB=10
MAX_BATCH_SIZE=10
MAX_IMAGE_DIMENSION=4096

# Visualization
VISUALIZATION_FORMAT=PNG
VISUALIZATION_QUALITY=95

# CORS
# Must be a JSON array. Examples: ["*"], ["http://localhost:3000","http://localhost:5173"]
ALLOWED_ORIGINS=["*"]
2 changes: 2 additions & 0 deletions apps/api-inference-yolo/.gitignore
@@ -0,0 +1,2 @@
.venv
.env
1 change: 1 addition & 0 deletions apps/api-inference-yolo/.python-version
@@ -0,0 +1 @@
3.12
45 changes: 45 additions & 0 deletions apps/api-inference-yolo/Dockerfile
@@ -0,0 +1,45 @@
# 1. Change Base Image to NVIDIA CUDA (Devel version ensures compatibility with Transformers/PyTorch extensions)
# Note: Use CUDA 12.1 or 12.4 depending on your PyTorch version requirements. 12.1 is widely supported.
FROM nvidia/cuda:12.1.0-devel-ubuntu22.04

WORKDIR /code

# 2. Install system dependencies
# 'uv' can download and manage Python itself, so no system Python packages are needed here.
RUN apt-get update && apt-get install -y \
git \
curl \
ca-certificates \
libsndfile1 \
&& rm -rf /var/lib/apt/lists/*

# 3. Install uv package manager
COPY --from=ghcr.io/astral-sh/uv:0.9.11 /uv /uvx /bin/

# 4. Configure uv to install Python 3.12 automatically
# Since the base image is Ubuntu without Python 3.12, uv will fetch it.
ENV UV_PYTHON=3.12
ENV UV_COMPILE_BYTECODE=1

# 5. Install dependencies
# Using --frozen ensures we respect the lockfile
RUN --mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
uv sync --frozen --no-cache

# 6. Install transformers (and ensure PyTorch uses CUDA)
# Note: If you need specific CUDA PyTorch, you might need to define extra-index-url in pyproject.toml
RUN uv pip install git+https://github.com/huggingface/transformers.git

# Copy application code
COPY src/ ./src/

EXPOSE 8000

# Set Python path
ENV PYTHONPATH=/code/src

# 7. Run application
# We use 'uv run' which ensures the correct python environment is used
CMD ["uv", "run", "app/main.py"]
Comment on lines +35 to +45


Action required

6. Docker cmd wrong path 🐞 Bug ⛯ Reliability

• The api-inference-yolo image copies code under /code/src but runs uv run app/main.py, which
  does not exist at that path.
• This will cause the backend container to fail immediately on startup, blocking the recommended
  backend option.
Agent Prompt
### Issue description
The `apps/api-inference-yolo` Docker image will not start because the `CMD` points at `app/main.py`, but the file is located at `src/app/main.py` inside the container.

### Issue Context
The Dockerfile copies `src/` to `/code/src` and sets `PYTHONPATH=/code/src`, so Python imports should use `app.*`.

### Fix Focus Areas
- apps/api-inference-yolo/Dockerfile[35-45]
