6 changes: 6 additions & 0 deletions .gitignore
@@ -8,7 +8,13 @@ wheels/

# Virtual environments
.venv
venv/
apps/api-inference/.venv
apps/api-inference-yolo/.venv

# SAM3 Model Weights
apps/sam3.pt
sam3.pt

# Environment variables
.env
229 changes: 203 additions & 26 deletions README.md
@@ -35,6 +35,14 @@
## ✨ Features

- **⚡ Automated Segmentation**: SAM3 inference runs locally or via optimized endpoints to auto-segment objects instantly. Use text prompts or bounding boxes to get pixel-perfect masks in milliseconds.
- **Text Prompts**: Describe objects in natural language ("person walking", "car on road")
- **Bounding Box Exemplars**: Draw boxes around objects to find all similar instances
- **Smart Prompt Memory**: Automatically remembers the last text prompt used for each label class

- **🔄 Intelligent Modes**: Choose the workflow that matches your task
- **Single Mode**: Process one image at a time with full control
- **Auto-Apply Mode**: Set your prompt once, automatically processes each new image
- **Batch Mode**: Select multiple images and process them all with the same prompts

- **🎯 Manual Precision**: Need to tweak the AI's work? Use our pixel-perfect pen, rectangle, and polygon tools for fine-tuning your annotations with complete control.

@@ -52,75 +52,237 @@

## Architecture

AnnotateANU is a simple monorepo with two independent applications and **two backend options**:

```
annotate-anu/ # Simple Monorepo
├── apps/
│ ├── web/ # React annotation interface
│ │ ├── src/
│ │ ├── Dockerfile
│ │ └── package.json
│ ├── api-inference/ # FastAPI HuggingFace SAM3 backend (gated model)
│ │ ├── src/app/
│ │ ├── Dockerfile
│ │ └── pyproject.toml
│ ├── api-inference-yolo/ # FastAPI Ultralytics SAM3 backend (recommended)
│ │ ├── src/app/
│ │ ├── Dockerfile
│ │ └── pyproject.toml
│ └── sam3.pt # SAM3 model weights (not included, see setup)
├── venv/ # Python virtual environment
├── docker-compose.yml # Orchestrates all services
├── setup-yolo.sh/bat # Setup script for Ultralytics backend
├── setup-hf.sh/bat # Setup script for HuggingFace backend
├── run-yolo.sh/bat # Run script for Ultralytics backend
├── run-hf.sh/bat # Run script for HuggingFace backend
├── Makefile # Development commands
├── package.json # Root config
└── README.md
```

### Backend Options

**🚀 Ultralytics SAM3 (Recommended)** - `api-inference-yolo`
- ✅ Faster inference with FP16 support
- ✅ Better text prompt segmentation with semantic understanding
- ✅ Bounding box exemplar-based segmentation for finding similar objects
- ✅ No HuggingFace token required at runtime (an account is still needed once to download the gated weights)
- ✅ Supports single, auto-apply, and batch processing modes
- 📦 Uses: `ultralytics`, PyTorch, SAM3SemanticPredictor

**🔄 HuggingFace SAM3** - `api-inference`
- ✅ Auto-downloads model on first run
- ⚠️ Requires HuggingFace account and gated model access
- 📦 Uses: `transformers`, `huggingface-hub`
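
The FP16 point above is primarily about memory and bandwidth: half-precision values take two bytes instead of four, roughly halving activation and mask-logit storage. A quick stdlib illustration, independent of any SAM3 code:

```python
import struct

# struct format 'e' is IEEE 754 half precision (FP16), 'f' is single precision (FP32)
fp16_bytes = struct.calcsize("e")
fp32_bytes = struct.calcsize("f")

print(f"FP16: {fp16_bytes} bytes, FP32: {fp32_bytes} bytes")  # FP16: 2 bytes, FP32: 4 bytes
```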

## Quick Start

### Prerequisites

- **Python 3.12+** (required for local setup)
- **Node.js 18+** and **npm** (required for frontend)
- **Docker & Docker Compose** (optional, for containerized setup)

### Choose Your Backend

#### Option 1: Ultralytics SAM3 (Recommended) ⚡

**Requirements:**
- Download `sam3.pt` model weights manually (see instructions below)
- No HuggingFace token needed at runtime (an account is used once to download the weights)

**Setup & Run:**

```bash
# 1. Clone the repository
git clone https://github.com/agfianf/annotate-anu.git
cd annotate-anu

# 2. Download SAM3 model weights
# Visit: https://huggingface.co/facebook/sam3
# Request access, then download sam3.pt
# Place it in: apps/sam3.pt

# 3. Run setup script
./setup-yolo.sh # Linux/Mac
setup-yolo.bat # Windows

# 4. Start the application
./run-yolo.sh # Linux/Mac
run-yolo.bat # Windows

# Access the application
# Frontend: http://localhost:5173
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```

#### Option 2: HuggingFace SAM3

**Requirements:**
- HuggingFace account with gated model access
- Model auto-downloads on first run (~2.4GB)

**Setup & Run:**

```bash
# 1. Clone the repository
git clone https://github.com/agfianf/annotate-anu.git
cd annotate-anu

# 2. Setup HuggingFace token
# - Create account: https://huggingface.co/join
# - Request access: https://huggingface.co/facebook/sam3
# - Generate token: https://huggingface.co/settings/tokens

# 3. Run setup script
./setup-hf.sh # Linux/Mac
setup-hf.bat # Windows

# 4. Add your HuggingFace token
# Edit apps/api-inference/.env:
# HF_TOKEN=hf_your_token_here

# 5. Start the application
./run-hf.sh # Linux/Mac
run-hf.bat # Windows
```

### SAM3 Model Weights Setup

**⚠️ IMPORTANT: Unlike other Ultralytics models, SAM3 weights (`sam3.pt`) are NOT automatically downloaded.**

You must manually download the model weights:

1. **Request Access**: Visit [https://huggingface.co/facebook/sam3](https://huggingface.co/facebook/sam3) and click "Request Access"
2. **Wait for Approval**: You'll receive an email when approved (usually within a few hours)
3. **Download Model**: Once approved, go to the "Files" tab and download `sam3.pt` (~2.4GB)
4. **Place File**: Put `sam3.pt` in `apps/sam3.pt` (relative to project root)

```
# Correct location:
annotate-anu/
└── apps/
└── sam3.pt # Place the downloaded file here
```

### Docker Setup (Alternative)

```bash
# 1. Setup environment (choose your backend)
cp apps/api-inference-yolo/.env.example apps/api-inference-yolo/.env
# OR
cp apps/api-inference/.env.example apps/api-inference/.env

# 2. Add credentials and place sam3.pt in apps/

# 3. Start all services
make docker-up

# 4. Access the application
# Frontend: http://localhost:5173
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```

## Troubleshooting

### TypeError: 'SimpleTokenizer' object is not callable

If you encounter this error during prediction with Ultralytics SAM3:

```bash
# Activate your virtual environment first
source venv/bin/activate # Linux/Mac
# OR
venv\Scripts\activate.bat # Windows

# Fix the CLIP package conflict
pip uninstall clip -y
pip install git+https://github.com/ultralytics/CLIP.git
```

This error occurs when the wrong `clip` package is installed. The Ultralytics-specific CLIP package is required.

### Model Loading Issues

**Problem**: "SAM3 model weights not found"
- **Solution**: Ensure the weights file exists at `apps/sam3.pt` (relative to the project root)
- Check file permissions: `ls -la apps/sam3.pt`

**Problem**: "CUDA out of memory"
- **Solution**: Reduce image size or switch to CPU mode in `.env`:
```bash
SAM3_DEVICE=cpu
```
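
For reference, `SAM3_DEVICE=auto` presumably resolves to GPU when one is available and CPU otherwise. A hedged sketch of that logic, with `cuda_available` standing in for `torch.cuda.is_available()` (the real backend may differ):

```python
def resolve_device(setting: str, cuda_available: bool) -> str:
    """Map the SAM3_DEVICE setting to a concrete device string."""
    if setting == "auto":
        return "cuda" if cuda_available else "cpu"
    return setting  # explicit "cpu" or "cuda" passes through unchanged

print(resolve_device("auto", cuda_available=False))  # cpu
```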

**Problem**: Backend shows "0 detections"
- **Solution**: Lower the confidence threshold (default 0.25)
- Try different text prompts (e.g., "object" instead of specific names)
- Check image quality and size
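
The effect of the threshold can be pictured with a small filtering sketch (illustrative only; the real backend applies its threshold inside the model pipeline, and the `label`/`score` fields here are hypothetical):

```python
def filter_detections(detections, threshold=0.25):
    """Keep only detections whose confidence meets the threshold (default 0.25)."""
    return [d for d in detections if d["score"] >= threshold]

raw = [{"label": "person", "score": 0.31}, {"label": "person", "score": 0.12}]
print(len(filter_detections(raw)))                  # 1: only the 0.31 detection survives
print(len(filter_detections(raw, threshold=0.1)))   # 2: lowering the threshold recovers both
```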

### Port Already in Use

```bash
# Linux/Mac
lsof -ti:8000 | xargs kill -9 # Kill backend
lsof -ti:5173 | xargs kill -9 # Kill frontend

# Windows
netstat -ano | findstr :8000 # Find PID
taskkill /F /PID <PID> # Kill process
```
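
To check whether a port is actually free before restarting the services, a small cross-platform sketch:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0  # 0 means the connect succeeded

# Check the backend (8000) and frontend (5173) ports before launching
for p in (8000, 5173):
    if port_in_use(p):
        print(f"port {p} is busy; free it before starting")
```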

## 🚀 Roadmap - Coming Soon

We are constantly evolving. Here's what's shipping next in AnnotateANU:

#### 🎨 Enhanced Annotation Tools
- **Magic Wand Tool**: Click-to-segment for quick region selection
- **Edge Refinement**: AI-powered edge smoothing for precise mask boundaries
- **Annotation Templates**: Save and reuse common annotation patterns

#### 🔌 Bring Your Own Model (BYOM)
Connect your existing custom models via API. Pre-label your images using your own weights to bootstrap the annotation process even faster.

#### 🤖 Advanced AI Features
- **Active Learning**: Intelligently suggest which images to annotate next
- **Cross-Image Tracking**: Track objects across video frames or image sequences
- **Multi-Model Ensemble**: Combine predictions from multiple models for better accuracy

#### ☁️ Enterprise Storage Integration
Move beyond browser storage. We're adding native integration for MinIO and S3-compatible object storage, allowing you to pull and sync datasets directly from your cloud buckets.

#### 👥 Collaboration Features
- **Team Workspaces**: Share projects and annotations across team members
- **Review Mode**: Approve or reject annotations with comment threads
- **Version Control**: Track annotation history and changes over time

#### 📊 Analytics & Insights
- **Annotation Statistics**: Track productivity metrics and dataset balance
- **Quality Checks**: Automated validation for annotation consistency
- **Export Analytics**: Detailed reports on dataset composition


## 🤝 Contributing

@@ -141,10 +311,17 @@ MIT License - see [LICENSE](LICENSE) file for details.

## References

- [SAM3 Model (HuggingFace)](https://huggingface.co/facebook/sam3)
- [Ultralytics SAM3 Documentation](https://docs.ultralytics.com/models/sam-3/)
- [T-REX Label](https://www.trexlabel.com/)
- [MakeSense.ai](https://www.makesense.ai/)

## Acknowledgments

- **Meta AI** for the SAM3 (Segment Anything Model 3) architecture
- **Ultralytics** for the excellent SAM3 implementation and PyTorch optimization
- The open-source community for inspiration and tools


---

28 changes: 28 additions & 0 deletions apps/api-inference-yolo/.env.example
@@ -0,0 +1,28 @@
# Application Settings
DEBUG=true
APP_HOST=0.0.0.0
APP_PORT=8000
LOG_LEVEL=INFO

# HuggingFace Authentication (not required at runtime for this backend; weights load from apps/sam3.pt)
# Get your token from: https://huggingface.co/settings/tokens
HF_TOKEN=hf_your_token_here

# SAM3 Model Settings
SAM3_MODEL_NAME=facebook/sam3
SAM3_DEVICE=auto
SAM3_DEFAULT_THRESHOLD=0.5
SAM3_DEFAULT_MASK_THRESHOLD=0.5

# API Limits
MAX_IMAGE_SIZE_MB=10
MAX_BATCH_SIZE=10
MAX_IMAGE_DIMENSION=4096

# Visualization
VISUALIZATION_FORMAT=PNG
VISUALIZATION_QUALITY=95

# CORS
# Must be a JSON array. Examples: ["*"], ["http://localhost:3000","http://localhost:5173"]
ALLOWED_ORIGINS=["*"]
2 changes: 2 additions & 0 deletions apps/api-inference-yolo/.gitignore
@@ -0,0 +1,2 @@
.venv
.env
1 change: 1 addition & 0 deletions apps/api-inference-yolo/.python-version
@@ -0,0 +1 @@
3.12
45 changes: 45 additions & 0 deletions apps/api-inference-yolo/Dockerfile
@@ -0,0 +1,45 @@
# 1. Change Base Image to NVIDIA CUDA (Devel version ensures compatibility with Transformers/PyTorch extensions)
# Note: Use CUDA 12.1 or 12.4 depending on your PyTorch version requirements. 12.1 is widely supported.
FROM nvidia/cuda:12.1.0-devel-ubuntu22.04

WORKDIR /code

# 2. Install system dependencies
# 'uv' can download and manage Python itself, so no system Python packages are needed here.
RUN apt-get update && apt-get install -y \
git \
curl \
ca-certificates \
libsndfile1 \
&& rm -rf /var/lib/apt/lists/*

# 3. Install uv package manager
COPY --from=ghcr.io/astral-sh/uv:0.9.11 /uv /uvx /bin/

# 4. Configure uv to install Python 3.12 automatically
# Since the base image is Ubuntu without Python 3.12, uv will fetch it.
ENV UV_PYTHON=3.12
ENV UV_COMPILE_BYTECODE=1

# 5. Install dependencies
# Using --frozen ensures we respect the lockfile
RUN --mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
uv sync --frozen --no-cache

# 6. Install transformers (and ensure PyTorch uses CUDA)
# Note: If you need specific CUDA PyTorch, you might need to define extra-index-url in pyproject.toml
RUN uv pip install git+https://github.com/huggingface/transformers.git

# Copy application code
COPY src/ ./src/

EXPOSE 8000

# Set Python path
ENV PYTHONPATH=/code/src

# 7. Run application
# We use 'uv run' which ensures the correct python environment is used
CMD ["uv", "run", "app/main.py"]
Comment on lines +35 to +45


Action required

6. Docker cmd wrong path 🐞 Bug ⛯ Reliability

• The api-inference-yolo image copies code under /code/src but runs uv run app/main.py, which
  does not exist at that path.
• This will cause the backend container to fail immediately on startup, blocking the recommended
  backend option.
Agent Prompt
### Issue description
The `apps/api-inference-yolo` Docker image will not start because the `CMD` points at `app/main.py`, but the file is located at `src/app/main.py` inside the container.

### Issue Context
The Dockerfile copies `src/` to `/code/src` and sets `PYTHONPATH=/code/src`, so Python imports should use `app.*`.

### Fix Focus Areas
- apps/api-inference-yolo/Dockerfile[35-45]
