Commit c65c655

Merge pull request #667 from danactive/moveRaw
Move raw
2 parents baf12e0 + 56a5495 commit c65c655

7 files changed, 107 additions and 47 deletions

.github/workflows/ci.yaml

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ jobs:
         run: npm run typecheck --if-present
 
   unittest:
-    name: Jest
+    name: Unit tests
     runs-on: ubuntu-latest
     steps:
       - name: ⬇️ Checkout repo

Makefile

Lines changed: 2 additions & 1 deletion
@@ -33,7 +33,8 @@ build-ai-api:
 ai-api:
 	# OpenAI model stores in ~/.cache/clip
 	docker run --rm --name ai-api -p 8080:8080 \
-		-v $(HOME)/.cache/clip:/root/.cache/clip
+		-v $(HOME)/.cache/clip:/root/.cache/clip \
+		ai-api
 
 build-test:
 	docker build -f apps/api/Dockerfile --build-arg INSTALL_TEST=true -t ai-api-test .

apps/api/README.md

Lines changed: 74 additions & 28 deletions
@@ -1,45 +1,91 @@
-# 🌿 iNaturalist 2021 Image Classifier (FastAPI + EVA-CLIP)
+# 🌿 iNaturalist 2021 Image Classifier & LAION Aesthetic Scoring API
 
-This project serves a high-accuracy image classification API using a **Vision Transformer** model fine-tuned on the **iNaturalist 2021** biodiversity dataset. It supports top-k prediction and an optional debug mode with detailed logits, scores, and resized input images.
+This project provides a robust FastAPI-based backend for two advanced computer vision endpoints:
 
-## 🧠 Model Details
+- **Biodiversity Image Classification** using a fine-tuned Vision Transformer (ViT) on the iNaturalist 2021 dataset.
+- **Aesthetic Scoring** using the LAION regression head on OpenAI CLIP ViT-B/16 features.
 
-- **Model Family**: [`timm`](https://github.com/rwightman/pytorch-image-models)
-- **Model Name**: `eva02_large_patch14_clip_336.merged2b_ft_inat21`
-- **Source**: Hugging Face Hub via `hf-hub:timm/eva02_large_patch14_clip_336.merged2b_ft_inat21`
-- **Architecture**: Vision Transformer (EVA-CLIP backbone)
-- **Pretraining**: Internally pre-trained CLIP-like architecture
-- **Fine-tuned On**: iNaturalist 2021 (10,000+ species of plants, animals, fungi, and microbes)
-- **Output Classes**: Mapped using `inat21_class_index.json`
-- **Label URL**: Provided via `model.default_cfg['label_url']`
+Both endpoints support raw image uploads, and leverage state-of-the-art models for their respective tasks while keeping our data private.
 
-## 🖼️ Input Format
+---
 
-- Accepts raw image bytes (e.g., `image/jpeg`, `image/png`)
-- Auto-converted to RGB using Pillow
-- Resized to 384x384, then center cropped to 336x336
-- Normalized using CLIP-style mean and std values:
-  - `mean = [0.48145466, 0.4578275, 0.40821073]`
-  - `std = [0.26862954, 0.26130258, 0.27577711]`
+## 🧠 Model Details
 
-## CLI commands
-- `make build-ai-api`
-- `make ai-api`
+### 1. Biodiversity Classifier
 
+- **Model Family:** [`timm`](https://github.com/rwightman/pytorch-image-models)
+- **Model Name:** `eva02_large_patch14_clip_336.merged2b_ft_inat21`
+- **Source:** Hugging Face Hub
+- **Architecture:** Vision Transformer (EVA-CLIP backbone)
+- **Fine-tuned On:** iNaturalist 2021 (10,000+ species)
+- **Output Classes:** Mapped using `inat21_class_index.json`
 
-# Aesthetic Scoring
+### 2. Aesthetic Scoring
 
-This project provides an API for **aesthetic scoring** of images using a regression head trained on top of OpenAI's CLIP ViT-B/16 backbone.
+- **Backbone:** OpenAI CLIP ViT-B/16
+- **Regression Head:** Multilayer Perceptron (MLP) trained for aesthetic prediction ([LAION aesthetic predictor](https://github.com/LAION-AI/aesthetic-predictor))
+- **Head Weights:** `models/aesthetic/sa_0_4_vit_b_16_linear.pth`
+- **Feature Dimension:** 512
 
 ---
 
-## 🧠 Model Details
+## 🚀 API Endpoints
 
-- **Backbone:** OpenAI CLIP ViT-B/16
-- **Regression Head:** Multilayer Perceptron (MLP) trained for aesthetic prediction
-- **Head Weights:** `models/aesthetic/sa_0_4_vit_b_16_linear.pth`
-- **Feature Dimension:** 512
+### 1. `/classify` — Biodiversity Image Classification
 
+**Description:**
+Predicts the top-3 most likely species for a given image using a ViT model fine-tuned on iNaturalist 2021.
+
+**Request:**
+- **Method:** `POST`
+- **Content-Type:** `image/jpeg` or `image/png`
+- **Body:** Raw image bytes
+
+**Example (using curl):**
+```sh
+curl -X POST -H "Content-Type: image/jpeg" --data-binary @your_image.jpg http://localhost:8080/classify
+```
+
+**Response:**
+- **Status:** 200 OK
+- **Content-Type:** `application/json`
+- **Body:** JSON object with top-3 species predictions, e.g.,
+```json
+{
+  "predictions": [
+    {"species": "Cardinalis cardinalis", "score": 0.987},
+    {"species": "Pica pica", "score": 0.005},
+    {"species": "Corvus corax", "score": 0.003}
+  ]
+}
+```
+
+### 2. `/score` — Aesthetic Scoring
+
+**Description:**
+Predicts the aesthetic score of an image on a scale from 0 to 10 using the LAION regression head.
+
+**Request:**
+- **Method:** `POST`
+- **Content-Type:** `image/jpeg` or `image/png`
+- **Body:** Raw image bytes
+
+**Example (using curl):**
+```sh
+curl -X POST -H "Content-Type: image/jpeg" --data-binary @your_image.jpg http://localhost:8080/score
+```
+
+**Response:**
+- **Status:** 200 OK
+- **Content-Type:** `application/json`
+- **Body:** JSON object with the aesthetic score, e.g.,
+```json
+{
+  "score": 7.5
+}
+```
+
+---
 
 ## Local setup
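Reviewer note: the two endpoints documented in this README hunk share one request shape (a `POST` with raw image bytes and an image `Content-Type`), so a single client helper can cover both. The following is a minimal sketch, not code from this commit. `postImage`, `FetchLike`, and `ImageContentType` are hypothetical names, and the default base URL is taken from the curl examples. The fetch function is passed in so the helper can be exercised with a stub.

```typescript
// Hypothetical client helper for the /classify and /score endpoints
// described in the README diff. Not part of this commit.

type ImageContentType = 'image/jpeg' | 'image/png'

// Minimal structural type covering only the parts of a fetch-style
// function that this sketch relies on (assumption, so a stub fits too).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: Uint8Array },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

async function postImage(
  fetchImpl: FetchLike,
  endpoint: '/classify' | '/score',
  bytes: Uint8Array,
  contentType: ImageContentType,
  baseUrl = 'http://localhost:8080', // assumed from the curl examples
): Promise<unknown> {
  // Send the raw bytes as the request body, mirroring `--data-binary`.
  const res = await fetchImpl(`${baseUrl}${endpoint}`, {
    method: 'POST',
    headers: { 'Content-Type': contentType },
    body: bytes,
  })
  if (!res.ok) {
    throw new Error(`Request to ${endpoint} failed with status ${res.status}`)
  }
  // The README documents a JSON body; its exact shape is not validated here.
  return res.json()
}
```

A caller would supply a fetch-compatible function (for example a thin wrapper around the runtime's `fetch`) and then narrow the returned `unknown` to the documented response shape before using it.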

apps/api/aesthetic.py

Lines changed: 2 additions & 3 deletions
@@ -2,7 +2,6 @@
 import torch
 import torch.nn as nn
 import torchvision.transforms as T
-import open_clip
 from PIL import Image
 import logging
 from collections import OrderedDict
@@ -15,7 +14,6 @@
 logger.setLevel(logging.DEBUG)
 
 HEAD_PATH = "models/aesthetic/sa_0_4_vit_b_16_linear.pth"
-CHECKPOINT_PATH = "models/openai_CLIP-ViT-L-16/open_clip_pytorch_model.bin"
 
 device = "cuda" if torch.cuda.is_available() else "cpu"
 
@@ -78,6 +76,7 @@ async def score_aesthetic(req: Request) -> float:
     image_tensor = preprocess(img).unsqueeze(0)
     image_features = _clip_model.encode_image(image_tensor)
     image_features /= image_features.norm(dim=-1, keepdim=True)
-    score = regression_head(image_features).item()
+    score_tensor = regression_head(image_features)
+    score = score_tensor.item()
 
     return float(score)

apps/api/requirements.txt

Lines changed: 4 additions & 4 deletions
@@ -1,9 +1,9 @@
 fastapi==0.115.12
-open_clip_torch==2.24.0
+git+https://github.com/openai/CLIP.git
+numpy==1.26.4
 pillow==11.2.1
 scikit-learn==1.5.0
 timm==1.0.15
-torch==2.7.1
-torchvision==0.22.1
+torch==2.0.1
+torchvision==0.15.2
 uvicorn==0.34.3
-git+https://github.com/openai/CLIP.git

src/lib/__tests__/rename-moveRaw.vitest.ts

Lines changed: 11 additions & 8 deletions
@@ -22,22 +22,24 @@ describe('moveRaws function', () => {
   beforeEach(() => {
     vi.resetAllMocks()
     originalPath = '/'
-    filesOnDisk = ['image.heic', 'photo.heif', 'document.txt']
+    filesOnDisk = ['image.heic', 'photo.heif', 'clip.raw', 'movie.mov', 'document.txt', 'photo.jpg']
     errors = []
     formatErrorMessage = vi.fn((err, msg) => `${msg}: ${err.message}`)
   })
 
 
-  test("should create 'raws' folder and move HEIC/HEIF files", async () => {
+  test("should create 'raws' folder and move all configured raw files", async () => {
     vi.mocked(fs.mkdir).mockResolvedValue(undefined)
     vi.mocked(fs.rename).mockResolvedValue(undefined)
 
     await moveRaws({ originalPath, filesOnDisk, errors, formatErrorMessage })
 
     expect(fs.mkdir).toHaveBeenCalledWith(path.join(originalPath, 'raws'), { recursive: true })
-    expect(fs.rename).toHaveBeenCalledTimes(2)
+    expect(fs.rename).toHaveBeenCalledTimes(4)
     expect(fs.rename).toHaveBeenCalledWith(path.join(originalPath, 'image.heic'), path.join(originalPath, 'raws/image.heic'))
     expect(fs.rename).toHaveBeenCalledWith(path.join(originalPath, 'photo.heif'), path.join(originalPath, 'raws/photo.heif'))
+    expect(fs.rename).toHaveBeenCalledWith(path.join(originalPath, 'clip.raw'), path.join(originalPath, 'raws/clip.raw'))
+    expect(fs.rename).toHaveBeenCalledWith(path.join(originalPath, 'movie.mov'), path.join(originalPath, 'raws/movie.mov'))
     expect(errors).toHaveLength(0)
   })
 
@@ -47,16 +49,17 @@ describe('moveRaws function', () => {
 
     await moveRaws({ originalPath, filesOnDisk, errors, formatErrorMessage })
 
-    expect(errors).toHaveLength(2)
-    expect(formatErrorMessage).toHaveBeenCalledTimes(2)
-    expect(errors[0]).toContain('Error moving HEIF file: image.heic')
+    // Only raw files should trigger errors
+    expect(errors).toHaveLength(4)
+    expect(formatErrorMessage).toHaveBeenCalledTimes(4)
+    expect(errors[0]).toContain('Error moving raw file: image.heic')
   })
 
-  test('should not move non-HEIF files', async () => {
+  test('should not move files not in config', async () => {
     vi.mocked(fs.mkdir).mockResolvedValue(undefined)
     vi.mocked(fs.rename).mockResolvedValue(undefined)
 
-    filesOnDisk = ['document.txt', 'photo.jpg'] // No HEIC/HEIF files
+    filesOnDisk = ['document.txt', 'photo.jpg'] // No raw files
 
     await moveRaws({ originalPath, filesOnDisk, errors, formatErrorMessage })
 

src/lib/rename.ts

Lines changed: 13 additions & 2 deletions
@@ -6,6 +6,7 @@ import { validateRequestBody, type RequestSchema } from '../models/rename'
 import checkPathExists from './exists'
 import { futureFilenamesOutputs } from './filenames'
 import { type ErrorFormatter } from './resize'
+import config from '../models/config'
 
 type ResponseBody = {
   renamed: boolean;
@@ -125,16 +126,26 @@ async function moveRaws(
   const rawsPath = path.join(path.dirname(originalPath), 'raws')
   await fs.mkdir(rawsPath, { recursive: true })
 
+  // Collect all raw extensions from config (lowercase, with dot)
+  const rawExtensions = new Set(
+    [
+      ...['heic', 'heif'],
+      ...(config.rawFileTypes?.photo ?? []),
+      ...(config.rawFileTypes?.video ?? []),
+    ].map(ext => `.${ext.toLowerCase()}`)
+  )
+
   for (const file of filesOnDisk) {
-    if (file.toLowerCase().endsWith('.heic') || file.toLowerCase().endsWith('.heif')) {
+    const ext = path.extname(file).toLowerCase()
+    if (rawExtensions.has(ext)) {
       const sourceFile = path.join(originalPath, file)
       const destinationFile = path.join(rawsPath, file)
 
       try {
         await fs.rename(sourceFile, destinationFile) // Move file
         console.log(`Moved: ${file} → raws`)
       } catch (err) {
-        errors.push(formatErrorMessage(err, `Error moving HEIF file: ${file}`))
+        errors.push(formatErrorMessage(err, `Error moving raw file: ${file}`))
       }
     }
   }
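
Reviewer note: the matching logic introduced in `moveRaws` can be tried in isolation. The sketch below is an illustration under stated assumptions, not the project's code. `RawConfig`, `exampleConfig`, `buildRawExtensions`, and `selectRawFiles` are hypothetical names, and the `raw` and `mov` entries are guesses inferred from the updated test fixtures (`clip.raw`, `movie.mov`), since the real `config.rawFileTypes` values do not appear in this commit.

```typescript
import * as path from 'node:path'

// Hypothetical stand-in for '../models/config'; real values are not shown in this commit.
type RawConfig = { rawFileTypes?: { photo?: string[]; video?: string[] } }

const exampleConfig: RawConfig = {
  rawFileTypes: { photo: ['raw'], video: ['mov'] }, // assumed from the test fixtures
}

// Build the set of lowercase, dot-prefixed extensions treated as raw,
// always including heic/heif as the diff does.
function buildRawExtensions(config: RawConfig): Set<string> {
  return new Set(
    [
      ...['heic', 'heif'],
      ...(config.rawFileTypes?.photo ?? []),
      ...(config.rawFileTypes?.video ?? []),
    ].map((ext) => `.${ext.toLowerCase()}`),
  )
}

// Keep only filenames whose final extension (case-insensitive) is in the set.
function selectRawFiles(files: string[], rawExtensions: Set<string>): string[] {
  return files.filter((file) => rawExtensions.has(path.extname(file).toLowerCase()))
}

const selected = selectRawFiles(
  ['image.HEIC', 'photo.heif', 'clip.raw', 'movie.mov', 'document.txt', 'photo.jpg'],
  buildRawExtensions(exampleConfig),
)
console.log(selected) // [ 'image.HEIC', 'photo.heif', 'clip.raw', 'movie.mov' ]
```

One behavioral difference from the previous `endsWith` check is worth noting: `path.extname` returns an empty string for a dotfile such as `.heic`, so a file with that exact name would no longer be selected.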
