Commit f6577ff

Merge pull request #154 from FocoosAI/feat/export-quant

feat: add classification, quantization and simplify

2 parents cc6ae8b + 249593c · commit f6577ff

46 files changed: +3001 −796 lines

docs/inference.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# How to Use a Computer Vision Model with Focoos
+# Inference with Focoos Models
 Focoos provides a powerful inference framework that makes it easy to deploy and use state-of-the-art computer vision models in production. Whether you're working on object detection, image classification, or other vision tasks, Focoos offers flexible deployment options that adapt to your specific needs.
 
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FocoosAI/focoos/blob/main/tutorials/inference.ipynb)

docs/models/bisenetformer.md

Lines changed: 6 additions & 3 deletions
@@ -121,18 +121,21 @@ Currently, you can find 3 bisenetformer models on the Focoos Hub, all for the se
 ### Quick Start with Pre-trained Model
 
 ```python
-from focoos.model_manager import ModelManager
+from focoos import ASSETS_DIR, ModelManager
+from PIL import Image
 
 # Load a pre-trained BisenetFormer model
 model = ModelManager.get("bisenetformer-m-ade")
 
 # Run inference on an image
-image = Image.open("path/to/image.jpg")
-result = model(image)
+image = ASSETS_DIR / "ADE_val_00000034"
+result = model.infer(image, threshold=0.5, annotate=True)
 
 # Process results
 for detection in result.detections:
     print(f"Class: {detection.label}, Confidence: {detection.conf:.3f}")
+# Visualize image
+Image.fromarray(result.image)
 ```
 
 ### Custom Model Configuration

docs/models/fai_cls.md

Lines changed: 56 additions & 18 deletions
@@ -2,10 +2,25 @@
 
 ## Overview
 
-FAI-CLS is a versatile image classification model developed by FocoosAI that can utilize any backbone architecture for feature extraction. This model is designed for both single-label and multi-label image classification tasks, offering flexibility in architecture choices and training configurations.
+Fai-cls is a versatile image classification model developed by FocoosAI that can utilize any backbone architecture for feature extraction. This model is designed for both single-label and multi-label image classification tasks, offering flexibility in architecture choices and training configurations.
 
 The model employs a simple yet effective approach: a configurable backbone extracts features from input images, followed by a classification head that produces class predictions. This design enables easy adaptation to different domains and datasets while maintaining high performance and computational efficiency.
 
+## Available Models
+
+Currently, you can find 3 fai-cls models on the Focoos Hub, all trained on the COCO dataset for image classification.
+
+| Model Name | Architecture | Domain (Classes) | Dataset | Metric | FPS Nvidia-T4 |
+|------------|--------------|------------------|---------|--------|---------------|
+| fai-cls-n-coco | Classification (STDC-Small) | Common Objects (80) | [COCO](https://cocodataset.org/#home) | F1: 48.66<br>Precision: 58.48<br>Recall: 41.66 | - |
+| fai-cls-s-coco | Classification (STDC-Small) | Common Objects (80) | [COCO](https://cocodataset.org/#home) | F1: 61.92<br>Precision: 68.69<br>Recall: 56.37 | - |
+| fai-cls-m-coco | Classification (STDC-Large) | Common Objects (80) | [COCO](https://cocodataset.org/#home) | F1: 66.98<br>Precision: 73.00<br>Recall: 61.88 | - |
+
+## Supported datasets
+
+- [ROBOFLOW_COCO](/focoos/api/ports/#focoos.ports.DatasetLayout) (multi-class)
+- [CLASSIFICATION_FOLDER](/focoos/api/ports/#focoos.ports.DatasetLayout)
+
 ## Neural Network Architecture
 
 The FAI-CLS architecture consists of two main components:
@@ -20,12 +35,14 @@ The FAI-CLS architecture consists of two main components:
 ### Classification Head
 - **Architecture**: Multi-layer perceptron (MLP) with configurable depth
 - **Components**:
+
     - Global Average Pooling (AdaptiveAvgPool2d) for spatial dimension reduction
     - Flatten layer to convert 2D features to 1D
    - Linear layers with ReLU activation
     - Dropout for regularization
     - Final linear layer for class predictions
 - **Configurations**:
+
     - **Single Layer**: Direct mapping from features to classes
     - **Two Layer**: Hidden layer with ReLU and dropout for better feature transformation
@@ -53,22 +70,26 @@
 ### Single-Label Classification
 - **Output**: Single class prediction per image
 - **Use Cases**:
-  - Image categorization (animals, objects, scenes)
-  - Medical image diagnosis
-  - Quality control in manufacturing
-  - Content moderation
-  - Agricultural crop classification
+
+    - Image categorization (animals, objects, scenes)
+    - Medical image diagnosis
+    - Quality control in manufacturing
+    - Content moderation
+    - Agricultural crop classification
+
 - **Loss**: Cross-entropy or focal loss
 - **Configuration**: Set `multi_label=False`
 
 ### Multi-Label Classification
 - **Output**: Multiple class predictions per image
 - **Use Cases**:
-  - Multi-object recognition
-  - Image tagging and annotation
-  - Scene attribute recognition
-  - Medical condition classification
-  - Content-based image retrieval
+
+    - Multi-object recognition
+    - Image tagging and annotation
+    - Scene attribute recognition
+    - Medical condition classification
+    - Content-based image retrieval
+
 - **Loss**: Binary cross-entropy with logits
 - **Configuration**: Set `multi_label=True`
@@ -96,12 +117,6 @@ The model supports multiple loss function configurations:
 - **Features**: Optional label smoothing for better generalization
 - **Activation**: Softmax for probability distribution
 
-### Focal Loss
-- **Use Case**: Imbalanced datasets with hard-to-classify examples
-- **Parameters**:
-    - Alpha (α): Controls importance of rare class
-    - Gamma (γ): Focuses learning on hard examples
-- **Benefits**: Improved performance on imbalanced datasets
 
 ### Binary Cross-Entropy Loss
 - **Use Case**: Multi-label classification tasks
@@ -144,7 +159,30 @@ AdaptiveAvgPool2d(1) → Flatten → Linear(features → hidden_dim) → ReLU
 This flexible architecture makes FAI-CLS suitable for a wide range of image classification applications, from simple binary classification to complex multi-label scenarios, while maintaining computational efficiency and ease of use.
 
 
-## Example Usage
+### Quick Start with Pre-trained Model
+
+```python
+from focoos import ASSETS_DIR, ModelManager
+from PIL import Image
+
+# Load a pre-trained model
+model = ModelManager.get("fai-cls-m-coco")
+
+image = ASSETS_DIR / "federer.jpg"
+result = model.infer(image, threshold=0.5, annotate=True)
+
+# Process results
+for detection in result.detections:
+    print(f"Class: {detection.label}, Confidence: {detection.conf:.3f}")
+
+# Visualize image
+Image.fromarray(result.image)
+```
+
+For the training process, please refer to the specific section of the documentation.
+
+
+## Custom Model Configuration
 
 ### Single-Label Classification Setup
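The head layout spelled out in `fai_cls.md` (AdaptiveAvgPool2d → Flatten → Linear → ReLU → Dropout → Linear) is small enough to sketch in full. Here is a minimal PyTorch sketch of the two-layer configuration; the 1024/512/80 dimensions and the dropout rate are placeholders, not the actual FAI-CLS defaults:

```python
import torch
import torch.nn as nn

# Two-layer head as described in fai_cls.md: global average pooling,
# flatten, hidden Linear + ReLU + Dropout, then the class projection.
# 1024/512/80 and p=0.2 are placeholder values, not FAI-CLS defaults.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # (B, C, H, W) -> (B, C, 1, 1)
    nn.Flatten(),             # (B, C, 1, 1) -> (B, C)
    nn.Linear(1024, 512),     # features -> hidden_dim
    nn.ReLU(),
    nn.Dropout(p=0.2),        # regularization
    nn.Linear(512, 80),       # hidden_dim -> num_classes (80 for COCO)
)

features = torch.randn(2, 1024, 7, 7)  # stand-in for a backbone feature map
logits = head(features)
print(logits.shape)  # torch.Size([2, 80])
```

The single-layer configuration drops the hidden Linear, ReLU, and Dropout and maps the pooled features directly to class logits.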

docs/models/fai_detr.md

Lines changed: 7 additions & 3 deletions
@@ -133,18 +133,22 @@ Currently, you can find 3 fai-detr models on the Focoos Hub, 2 trained on COCO a
 ### Quick Start with Pre-trained Model
 
 ```python
-from focoos.model_manager import ModelManager
+from focoos import ASSETS_DIR, ModelManager
+from PIL import Image
 
 # Load a pre-trained model
 model = ModelManager.get("fai-detr-m-coco")
 
 # Run inference on an image
-image = Image.open("path/to/image.jpg")
-result = model(image)
+image = ASSETS_DIR / "federer.jpg"
+result = model.infer(image, threshold=0.5, annotate=True)
 
 # Process results
 for detection in result.detections:
     print(f"Class: {detection.label}, Confidence: {detection.conf:.3f}")
+
+# Visualize image
+Image.fromarray(result.image)
 ```
 
 ### Custom Model Configuration

docs/models/fai_mf.md

Lines changed: 7 additions & 3 deletions
@@ -115,18 +115,22 @@ Currently, you can find 5 fai-mf models on the Focoos Hub, 2 for semantic segmen
 ### Quick Start with Pre-trained Model
 
 ```python
-from focoos.model_manager import ModelManager
+from focoos import ASSETS_DIR, ModelManager
+from PIL import Image
 
 # Load a pre-trained FAI-MF model
 model = ModelManager.get("fai-mf-l-ade")
 
 # Run inference on an image
-image = Image.open("path/to/image.jpg")
-result = model(image)
+image = ASSETS_DIR / "ADE_val_00000034"
+result = model.infer(image, threshold=0.5, annotate=True)
 
 # Process results
 for detection in result.detections:
     print(f"Class: {detection.label}, Confidence: {detection.conf:.3f}")
+
+# Visualize image
+Image.fromarray(result.image)
 ```
 
 ### Custom Model Configuration

docs/models/rtmo.md

Lines changed: 6 additions & 3 deletions
@@ -127,18 +127,21 @@ The following RTMO models are available on the Focoos Hub for multi-person pose
 ```python
 from PIL import Image
 
-from focoos.model_manager import ModelManager
+from focoos import ModelManager, ASSETS_DIR
 
 # Load a pre-trained RTMO model
 model = ModelManager.get("rtmo-s-coco")
 
 # Run inference on an image
-image = Image.open("path/to/image.jpg")
-result = model.infer(image)
+image = ASSETS_DIR / "federer.jpg"
+result = model.infer(image, threshold=0.5, annotate=True)
 
 # Process results
 for detection in result.detections:
     print(f"Class: {detection.label}, Confidence: {detection.conf:.3f}")
+
+# Visualize image
+Image.fromarray(result.image)
 ```
 
 ### Custom Model Configuration

docs/quantization.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+# Quantization (Beta)
+
+Quantization of Focoos models is a work in progress; it is currently tested and working for **classification models**.
+
+## Example
+
+```python
+import os
+
+from PIL import Image
+
+from focoos import ASSETS_DIR, MODELS_DIR, InferModel, ModelManager, RuntimeType
+from focoos.infer.quantizer import OnnxQuantizer, QuantizationCfg
+
+image_size = 224  # 224px input size
+model_name = "fai-cls-m-coco"  # you can also load a model from the Focoos Hub with "hub://YOUR_MODEL_REF"
+im = ASSETS_DIR / "federer.jpg"
+
+model = ModelManager.get(model_name)
+
+exported_model = model.export(
+    runtime_type=RuntimeType.ONNX_CPU,  # optimized for edge or CPU
+    image_size=image_size,
+    dynamic_axes=False,  # quantization needs static axes!
+    simplify_onnx=False,  # simplify and optimize the ONNX model graph
+    onnx_opset=18,
+    out_dir=os.path.join(MODELS_DIR, "my_edge_model"),  # save to the models dir
+)
+
+# Benchmark the ONNX model
+exported_model.benchmark(iterations=100)
+
+# Test the ONNX model
+result = exported_model.infer(im, annotate=True)
+Image.fromarray(result.image)
+
+quantization_cfg = QuantizationCfg(
+    size=image_size,  # input size: must match the exported model
+    calibration_images_folder=str(ASSETS_DIR),  # it is strongly recommended to use
+    # the validation split of the dataset the model was trained on; here,
+    # as an example, we use the assets folder
+    format="QDQ",  # QO (QOperator): all quantized operators have their own ONNX definitions, like QLinearConv, MatMulInteger, etc.
+    # QDQ (Quantize-DeQuantize): inserts DeQuantizeLinear(QuantizeLinear(tensor)) between the original operators to simulate quantization and dequantization
+    per_channel=True,  # per-channel quantization: each channel has its own scale/zero-point → more accurate,
+    # especially for convolutions, at the cost of extra memory and computation
+    normalize_images=True,  # some models apply normalization outside of the model forward
+)
+
+quantizer = OnnxQuantizer(
+    input_model_path=exported_model.model_path,
+    cfg=quantization_cfg,
+)
+model_path = quantizer.quantize(
+    benchmark=True  # benchmark both the fp32 and int8 models
+)
+
+quantized_model = InferModel(model_path, runtime_type=RuntimeType.ONNX_CPU)
+
+res = quantized_model.infer(im, annotate=True)
+Image.fromarray(res.image)
+```
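Since `quantizer.quantize(benchmark=True)` reports latency but not accuracy, it is worth sanity-checking that the int8 model still agrees with the fp32 export. A minimal sketch, reusing `result` and `res` from the example above and assuming classification results expose `detections` with `label` and `conf` as in the model docs:

```python
# Compare top-1 predictions of the fp32 export (`result`) and the
# quantized int8 model (`res`). Small confidence drops are expected;
# a changed top-1 label suggests the calibration images are too weak.
fp32_top = max(result.detections, key=lambda d: d.conf)
int8_top = max(res.detections, key=lambda d: d.conf)
print(f"fp32: {fp32_top.label} ({fp32_top.conf:.3f})")
print(f"int8: {int8_top.label} ({int8_top.conf:.3f})")
```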

focoos/__init__.py

Lines changed: 0 additions & 2 deletions
@@ -66,7 +66,6 @@
 from .utils.system import get_cpu_name, get_cuda_version, get_device_name, get_system_info
 from .utils.timer import took
 from .utils.vision import (
-    annotate_frame,
     annotate_image,
     base64mask_to_mask,
     binary_mask_to_base64,
@@ -165,7 +164,6 @@
     "get_device_name",
     "get_gpus_count",
     "get_cuda_version",
-    "annotate_frame",
     "annotate_image",
     "ModelRegistry",
     "InferLatency",

focoos/data/auto_dataset.py

Lines changed: 6 additions & 6 deletions
@@ -22,7 +22,7 @@
     is_inside_sagemaker,
 )
 
-logger = get_logger(__name__)
+logger = get_logger("AutoDataset")
 
 
 class AutoDataset:
@@ -61,23 +61,23 @@ def __init__(
         self.dataset_path = str(dataset_path)
         self.dataset_name = dataset_name
         logger.info(
-            f"✅ Dataset name: {self.dataset_name}, Dataset Path: {self.dataset_path}, Dataset Layout: {self.layout}"
+            f"🔄 Loading dataset {self.dataset_name}, 📁 Dataset Path: {self.dataset_path}, 🗂️ Dataset Layout: {self.layout}"
         )
 
     def _load_split(self, dataset_name: str, split: DatasetSplitType) -> DictDataset:
         if self.layout == DatasetLayout.CATALOG:
-            return DictDataset.from_catalog(ds_name=dataset_name, split=split, root=self.dataset_path)
+            return DictDataset.from_catalog(ds_name=dataset_name, split_type=split, root=self.dataset_path)
         else:
             ds_root = self.dataset_path
             if not check_folder_exists(ds_root):
                 raise FileNotFoundError(f"Dataset {ds_root} not found")
             split_path = self._get_split_path(dataset_root=ds_root, split_type=split)
             if self.layout == DatasetLayout.ROBOFLOW_SEG:
-                return DictDataset.from_roboflow_seg(ds_dir=split_path, task=self.task)
+                return DictDataset.from_roboflow_seg(ds_dir=split_path, task=self.task, split_type=split)
             elif self.layout == DatasetLayout.CLS_FOLDER:
-                return DictDataset.from_folder(root_dir=split_path)
+                return DictDataset.from_folder(root_dir=split_path, split_type=split)
             elif self.layout == DatasetLayout.ROBOFLOW_COCO:
-                return DictDataset.from_roboflow_coco(ds_dir=split_path, task=self.task)
+                return DictDataset.from_roboflow_coco(ds_dir=split_path, task=self.task, split_type=split)
             else:  # Focoos
                 raise NotImplementedError(f"Dataset layout {self.layout} not implemented")
 
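The net effect of these `auto_dataset.py` changes, and the matching `converters.py` changes below, is that every `DictDataset` constructor now receives the split it is loading instead of inferring it. A hypothetical usage sketch with placeholder paths; the `Task.DETECTION` value is an assumption alongside the `Task.SEMSEG` used in `converters.py`:

```python
from focoos.data.datasets.dict_dataset import DictDataset
from focoos.ports import DatasetSplitType, Task

# Each split is now loaded with an explicit split_type, matching the
# signatures in the diff above (paths here are placeholders).
train_ds = DictDataset.from_roboflow_coco(
    ds_dir="datasets/my-det/train", task=Task.DETECTION, split_type=DatasetSplitType.TRAIN
)
val_ds = DictDataset.from_folder(
    root_dir="datasets/my-cls/valid", split_type=DatasetSplitType.VAL
)
```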

focoos/data/converters.py

Lines changed: 7 additions & 3 deletions
@@ -17,7 +17,7 @@
 
 from focoos.data.datasets.dict_dataset import DictDataset
 from focoos.data.transforms.resize_short_length import resize_shortest_length
-from focoos.ports import DatasetMetadata, Task
+from focoos.ports import DatasetMetadata, DatasetSplitType, Task
 from focoos.utils.logger import get_logger
 from focoos.utils.system import list_files_with_extensions
 
@@ -450,10 +450,14 @@ def convert_datasetninja_to_mask_dataset(
     )
 
     task = Task.SEMSEG
-    train_dataset = DictDataset.from_segmentation(ds_dir=os.path.join(dataset_path, train_split_name), task=task)
+    train_dataset = DictDataset.from_segmentation(
+        ds_dir=os.path.join(dataset_path, train_split_name), task=task, split_type=DatasetSplitType.TRAIN
+    )
     logger.info(f"Train dataset: {train_dataset}")
 
-    val_dataset = DictDataset.from_segmentation(ds_dir=os.path.join(dataset_path, val_split_name), task=task)
+    val_dataset = DictDataset.from_segmentation(
+        ds_dir=os.path.join(dataset_path, val_split_name), task=task, split_type=DatasetSplitType.VAL
+    )
     logger.info(f"Val dataset: {val_dataset}")
 
     for split in [(train_dataset, "train"), (val_dataset, "val")]:
