
Commit 3c8f337

Merge pull request #120 from FocoosAI/feat/implement-rtmo
## Key Changes

### ✨ Keypoint Models
- Introduce keypoint models: add RTMO-S/M/L-COCO pretrained keypoint models.

Example:

```python
from focoos import ModelManager
from PIL import Image

im = "https://public.focoos.ai/samples/federer.jpg"
model = ModelManager.get("rtmo-s-coco")
detections = model.infer(im, annotate=True, threshold=0.5)
Image.fromarray(detections.image)  # visualise or save annotated image
```

### 📷 Unified Inference API
- Standardize `infer` method signatures: consistent `infer()` method across `FocoosModel`, `InferModel`, and `RemoteModel` with unified parameters: `infer(image, threshold=0.5, annotate=False)`.
- Use a unified image loader for the `infer` methods (with remote image support).
- Set the default threshold to 0.5.
- Remove the dependency on external `annotate_image()` function calls.
- Streamlined workflow: get detections and visual annotations in a single call.

Example with torch and exported model:

```python
from focoos import ModelManager, RuntimeType
from PIL import Image

im = "https://public.focoos.ai/samples/motogp.jpg"  # remote image, can also be local path, numpy array, or PIL image
model = ModelManager.get("fai-detr-l-obj365")
detections = model.infer(im, annotate=True, threshold=0.5)  # annotate param
# Image.fromarray(detections.image)  # visualise or save annotated image

# export model
model = model.export(RuntimeType.ONNX_CUDA32)
res = model.infer(im, annotate=True, threshold=0.5)
Image.fromarray(res.image)  # visualise or save annotated image
```

Example with remote inference:

```python
from focoos import FocoosHUB
from PIL import Image

hub = FocoosHUB()
model_ref = "fai-detr-l-obj365"  # use any pretrained model on app.focoos.ai or your own model reference
remote_model = hub.get_remote_model(model_ref)
im = "https://public.focoos.ai/samples/federer.jpg"
detections = remote_model.infer(im, annotate=True, threshold=0.5)
Image.fromarray(detections.image)  # visualise or save annotated image
```

### Enhanced FocoosDetections Structure
- Add a new `image` field: stores annotated results as a base64 string or numpy array.
- Migrate from Pydantic to pure Python dataclasses for better performance, improved serialization, and lower memory usage.
- Add a new `keypoints` field.
- Add `pprint` and `print_infer` methods to unify detection printing.

### ⌨️ CLI
- Add a new CLI command, `focoos gradio`, to launch a Gradio interface for image and video inference using Focoos pretrained models.

### 🕹️ Trainer
- Fix missing model preprocessing when `amp=True` (Automatic Mixed Precision) is enabled.
- Add COSINE scheduler with quadratic warmup.
- Add `KeypointEvaluator`.
- Enhance logging with additional info.
- Update Visualizer (preview hook) to save RGB images instead of BGR.
- Restore the TensorBoard hook.

### 📖 ModelRegistry
- The model registry now supports automatic loading of JSON configs from the registry folder instead of declaring model configs manually.

### 🏞️ Processor
- Move `image_size` into `__init__` instead of the preprocess methods.
- Improve image loader performance.
- Add non-blocking image transfer.
- Optimize preprocessor speed.
- Add the Focoos palette to annotators.

### 📖 Docs
- Add RTMO docs.
- Update README, docs, and notebooks to use `from focoos import x` for all exported classes and functions instead of absolute paths.
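As a quick illustration of the new `FocoosDetections` fields and helpers described above, here is a minimal sketch; the exact shape of the `keypoints` payload and the formatting of `pprint()` are assumptions, not taken from this commit:

```python
from focoos import ModelManager
from PIL import Image

# run a keypoint model and inspect the new FocoosDetections fields
model = ModelManager.get("rtmo-s-coco")
detections = model.infer("https://public.focoos.ai/samples/federer.jpg", annotate=True, threshold=0.5)

detections.pprint()  # unified, human-readable print of the detections

for det in detections.detections:
    print(det.label, det.conf, det.bbox)  # per-detection class, confidence, box
    print(det.keypoints)                  # new keypoints field (structure assumed)

Image.fromarray(detections.image)  # new image field holds the annotated result
```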
2 parents 1db51b1 + 3244a55 commit 3c8f337

103 files changed (+8153 / -1186 lines)


.gitignore

Lines changed: 6 additions & 6 deletions
@@ -90,14 +90,14 @@ notebooks/.data
 .venv
 /data
 tests/junit.xml
-notebooks/datasets
-notebooks/experiments
 site/
 /datasets/
 /examples/
-notebooks/test.ipynb
+notebooks/
 gradio/output/
 tutorials/experiments
-experiments/
-notebooks/
-wandb/
+experiments
+experiments_debug
+*.pth
+.vscode
+
.vscode/settings.json

Lines changed: 16 additions & 0 deletions
@@ -9,4 +9,20 @@
   "[python]": {
     "editor.defaultFormatter": "charliermarsh.ruff"
   },
+  "cursorpyright.analysis.autoImportCompletions": true,
+  "cursorpyright.analysis.typeCheckingMode": "basic",
+  "files.autoSave": "afterDelay",
+  "files.autoSaveDelay": 1000,
+  "editor.formatOnSave": true,
+  "[jupyter-notebook]": {
+    "files.autoSave": "off",
+    "editor.formatOnSave": false
+  },
+  "jupyter.interactiveWindow.textEditor.executeSelection": true,
+  "notebook.output.textLineLimit": 30,
+  "jupyter.askForKernelRestart": false,
+  "jupyter.alwaysTrustNotebooks": true,
+  "files.exclude": {
+    "**/.ipynb_checkpoints": true
+  }
 }

README.md

Lines changed: 8 additions & 4 deletions
@@ -1,3 +1,7 @@
+<a href="https://www.focoos.ai" target="_blank">
+  <img src="https://public.focoos.ai/library/focoos_banner.png" alt="FocoosAI" style="max-width:100%;">
+</a>
+
 ![Tests](https://github.com/FocoosAI/focoos/actions/workflows/test.yml/badge.svg??event=push&branch=main)
 [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FocoosAI/focoos/blob/main/tutorials/training.ipynb)
 [![Documentation](https://img.shields.io/badge/docs-latest-blue)](https://focoosai.github.io/focoos/)
@@ -16,7 +20,7 @@ Whether you're working in the cloud or on edge devices, the Focoos library seaml
 ### Key Features 🔑
 
 1. **Frugal Pretrained Models** 🌿
-Get started quickly by selecting one of our efficient, [pre-trained models](https://focoosai.github.io/focoos/models/models/) that best suits your data and application needs.
+Get started quickly by selecting one of our efficient, [pre-trained models](https://focoosai.github.io/focoos/models/) that best suits your data and application needs.
 Focoos Model Registry give access to 11 pretrained models of different size from different families: RTDetr, Maskformer, BisenetFormer
 
 2. **Fine Tune Your Model** ✨ Adapt the model to your specific use case by customize its config and training it on your own dataset.
@@ -41,11 +45,11 @@ uv pip install 'focoos @ git+https://github.com/FocoosAI/focoos.git'
 from focoos import ModelManager
 
 
-im = Image.open("image.jpg")
+im = "https://public.focoos.ai/samples/motogp.jpg"  # can be local/remote path, np.array, PIL image
 
 model = ModelManager.get("fai-detr-l-obj365")  # any models from ModelRegistry, FocoosHub or local folder
 
-detections = model(im)
+detections = model.infer(im, annotate=True)
 
 ```
 
@@ -110,7 +114,7 @@ Using Focoos AI helps you save both time and money while delivering high-perform
 - **4x Cheaper** 💰: Our models require up to 4x less computational power, letting you save on hardware or cloud bill while ensuring high-quality results.
 - **Tons of CO2 saved annually per model** 🌱: Our models are energy-efficient, helping you reduce your carbon footprint by using less powerful hardware with respect to mainstream models.
 
-See the list of our models in the [models](https://focoosai.github.io/focoos/models/models) section.
+See the list of our models in the [models](https://focoosai.github.io/focoos/models/) section.
 
 ---
 ### Start now!
docs/api/hub.md

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
-::: focoos.hub.api_client
 ::: focoos.hub.focoos_hub
 ::: focoos.hub.remote_dataset
 ::: focoos.hub.remote_model

docs/cli.md

Lines changed: 24 additions & 1 deletion
@@ -36,6 +36,9 @@ focoos predict --model fai-detr-m-coco --source image.jpg
 
 # Export model
 focoos export --model fai-detr-m-coco --format onnx
+
+# Launch gradio interface
+focoos gradio
 ```
 
 ## 📚 Usage
@@ -47,7 +50,7 @@ focoos COMMAND [OPTIONS]
 ```
 
 Where:
-- **COMMAND**: Main operations like `train`, `val`, `predict`, `export`, `benchmark`, `hub`
+- **COMMAND**: Main operations like `train`, `val`, `predict`, `export`, `benchmark`, `gradio`, `hub`
 - **OPTIONS**: Command-specific flags and parameters with intelligent defaults
 
 ## 🛠️ Available Commands
@@ -60,6 +63,9 @@ Where:
 | **`predict`** | Run inference on images | `focoos predict --model fai-detr-m-coco --source image.jpg` |
 | **`export`** | Export models to different formats | `focoos export --model fai-detr-m-coco --format onnx` |
 | **`benchmark`** | Benchmark model performance | `focoos benchmark --model fai-detr-m-coco --iterations 100` |
+| **`gradio`** | Launch interactive web interface | `focoos gradio` |
+
+
 
 ### Hub Commands
 | Command | Description | Example |
@@ -217,6 +223,23 @@ focoos hub datasets
 focoos hub datasets --include-shared
 ```
 
+### 🖥️ Interactive Web Interface
+
+```bash
+# Launch Gradio web interface
+focoos gradio
+```
+
+The Gradio interface provides an interactive web-based experience for running inference with Focoos models:
+
+- **Image Inference**: Upload images and run detection/segmentation with real-time results
+- **Video Inference**: Process video files with object detection and tracking
+- **Model Selection**: Choose from available pretrained models
+- **Confidence Tuning**: Adjust detection thresholds interactively
+- **Visual Results**: View annotated outputs with bounding boxes and masks
+
+The interface will automatically open in your default web browser, typically at `http://localhost:7860`.
+
 ## ⚙️ Configuration Options
 
 ### Common Parameters

docs/concepts.md

Lines changed: 16 additions & 9 deletions
@@ -28,6 +28,7 @@ The Focoos Hub is a cloud-based model repository where you can store, share, and
 **Requirements**: Valid API key for private models, internet connection for initial download.
 
 ```python
+from focoos import FocoosHUB, ModelManager
 # Loading from hub using hub:// protocol
 # The model is automatically downloaded and cached locally
 hub = FocoosHUB(api_key="your_api_key")
@@ -51,6 +52,7 @@ The Model Registry contains curated, pretrained models that are immediately avai
 **Requirements**: No internet connection needed, models are bundled with the library.
 
 ```python
+from focoos import ModelRegistry, ModelManager
 # Loading pretrained models from registry
 # Object detection model trained on COCO dataset
 model = ModelManager.get("fai-detr-l-coco")
@@ -59,7 +61,6 @@ model = ModelManager.get("fai-detr-l-coco")
 model = ModelManager.get("fai-mf-l-ade")
 
 # Check available models first
-from focoos import ModelRegistry
 available_models = ModelRegistry.list_models()
 print("Available models:", available_models)
 
@@ -133,7 +134,7 @@ model_info = ModelInfo(
 model = ModelManager.get("custom_detector", model_info=model_info)
 ```
 
-### Predict
+### Inference
 
 Performs end-to-end inference on input images with automatic preprocessing and postprocessing. The model accepts input images in various formats including:
 
@@ -151,7 +152,9 @@ The input images are automatically preprocessed to the correct size and format r
 This provides a simple, unified interface for running inference regardless of the underlying model architecture or task.
 
 **Parameters:**
-- `inputs`: Input images in various supported formats (`PIL.Image.Image`, `numpy.ndarray`, `torch.Tensor`)
+- `image`: Input image in various supported formats (`PIL.Image.Image`, `numpy.ndarray`, `torch.Tensor`, local or remote path)
+- `threshold`: detection confidence threshold
+- `annotate`: whether to annotate detections on the provided image
 - `**kwargs`: Additional arguments passed to postprocessing
 
 **Returns:** [`FocoosDetections`](/focoos/api/ports/#focoos.ports.FocoosDetections) containing detection/segmentation results
@@ -161,15 +164,17 @@ This provides a simple, unified interface for running inference regardless of th
 from PIL import Image
 
 # Load an image
-image = Image.open("example.jpg")
+im_path = "example.jpg"
 
 # Run inference
-detections = model(image)
+detections = model.infer(im_path, threshold=0.5, annotate=True)
 
 # Access results
 for detection in detections.detections:
     print(f"Class: {detection.label}, Confidence: {detection.conf}")
     print(f"Bounding box: {detection.bbox}")
+
+Image.fromarray(detections.image)
 ```
 
 ### Training
@@ -259,7 +264,7 @@ infer_model = model.export(
 results = infer_model(input_image)
 ```
 
-### Predict
+### Inference
 
 Performs end-to-end inference on input images with automatic preprocessing and postprocessing on the selected runtime. The model accepts input images in various formats including:
 
@@ -277,7 +282,9 @@ The input images are automatically preprocessed to the correct size and format r
 This provides a simple, unified interface for running inference regardless of the underlying model architecture or task.
 
 **Parameters:**
-- `inputs`: Input images in various supported formats (`PIL.Image.Image`, `numpy.ndarray`, `torch.Tensor`)
+- `image`: Input image in various supported formats (`PIL.Image.Image`, `numpy.ndarray`, `torch.Tensor`, local or remote path)
+- `threshold`: detection confidence threshold
+- `annotate`: whether to annotate detections on the provided image
 - `**kwargs`: Additional arguments passed to postprocessing
 
 **Returns:** [`FocoosDetections`](/focoos/api/ports/#focoos.ports.FocoosDetections) containing detection/segmentation results
@@ -287,14 +294,14 @@ This provides a simple, unified interface for running inference regardless of th
 from PIL import Image
 
 # Load an image
-image = Image.open("example.jpg")
+image_path = "example.jpg"
 
 # Run inference
 infer_model = model.export(
     runtime_type=RuntimeType.TORCHSCRIPT_32,
     out_dir="./exported_models"
 )
-detections = infer_model(image)
+detections = infer_model.infer(image_path, threshold=0.5, annotate=True)
 
 # Access results
 for detection in detections.detections:
File renamed without changes.

docs/hub/remote_inference.md

Lines changed: 3 additions & 6 deletions
@@ -44,7 +44,7 @@ Remote models can also be called directly like functions:
 
 ```python
 # This is equivalent to calling remote_model.infer()
-results = remote_model("path/to/image.jpg", threshold=0.5)
+results = remote_model.infer("path/to/image.jpg", threshold=0.5)
 ```
 
 ## Supported Input Types
@@ -129,13 +129,10 @@ for i, detection in enumerate(results.detections):
 Visualize results using the built-in utilities:
 
 ```python
-from focoos import annotate_image
 
-results = model.infer(image=image, threshold=0.5)
+results = model.infer(image=image, threshold=0.5, annotate=True)
 
-annotated_image = annotate_image(
-    im=image, detections=results, task=model.model_info.task, classes=model.model_info.classes
-)
+Image.fromarray(results.image)
 
 ```
 ## Model Management for Remote Inference

docs/inference.md

Lines changed: 12 additions & 10 deletions
@@ -56,7 +56,7 @@ Using the model is as simple as it could! Just call it with an image.
 ```python
 from PIL import Image
 image = Image.open("<PATH-TO-IMAGE>")
-detections = model(image)
+detections = model.infer(image)
 ```
 
 `detections` is a [FocoosDetections](/focoos/api/ports/#focoos.ports.FocoosDetections) object, containing a list of [FocoosDet](/focoos/api/ports/#focoos.ports.FocoosDet) objects and optionally a dict of information about the latency of the inference. The `FocoosDet` object contains the following attributes:
@@ -66,13 +66,14 @@ detections = model(image)
 - `cls_id`: Class ID (0-indexed).
 - `label`: Label (name of the class).
 - `mask`: Mask (base64 encoded string having origin in the top left corner of bbox and the same width and height of the bbox).
+- `keypoints`: detected keypoints.
 
-If you want to visualize the result on the image, there's a utily for you.
+If you want to visualize the result on the image, just set `annotate=True`.
 
 ```python
-from focoos import annotate_image
-
-annotate_image(image, detections, task=model.model_info.task, classes=model.model_info.classes).save("predictions.png")
+from PIL import Image
+detections = model.infer(image, annotate=True)
+Image.fromarray(detections.image)
 ```
 
 ## 2. 🔥 PyTorch Inference
@@ -118,9 +119,9 @@ Now, again, you can now run the model by simply passing it an image and visualiz
 ```python
 from focoos import annotate_image
 
-detections = model(image)
+detections = model.infer(image, annotate=True)
 
-annotate_image(image, detections, task=model.model_info.task, classes=model.model_info.classes).save("predictions.png")
+Image.fromarray(detections.image)
 ```
 
 `detections` is a [FocoosDetections](/focoos/api/ports/#focoos.ports.FocoosDetections) object.
@@ -158,15 +159,16 @@ Let's visualize the output. As you will see, there are not differences from the
 ```python
 from focoos import annotate_image
 
-detections = optimized_model(image)
-annotate_image(image, detections, task=model.model_info.task, classes=model.model_info.classes).save("prediction.png")
+detections = optimized_model(image, annotate=True)
+Image.fromarray(detections.image)
 ```
+
 `detections` is a [FocoosDetections](/focoos/api/ports/#focoos.ports.FocoosDetections) object.
 
 
 But, let's see its latency, that should be substantially lower than the pure pytorch model.
 ```python
-optimized_model.benchmark(iterations=10, size=512)
+optimized_model.benchmark(iterations=10)
 ```
 
 You can use different runtimes that may fit better your device, such as TensorRT. See the list of available Runtimes at [`RuntimeTypes`](/focoos/api/ports/#focoos.ports.RuntimeType). Please note that you need to install the relative packages for onnx and tensorRT for using them.
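Building on the snippet above, here is a hedged sketch of exporting to the ONNX CUDA runtime mentioned in this PR and benchmarking it; it assumes the relevant ONNX packages noted in the docs are installed:

```python
from focoos import ModelManager, RuntimeType

model = ModelManager.get("fai-detr-m-coco")

# export to the ONNX CUDA runtime used in the PR description example
optimized_model = model.export(runtime_type=RuntimeType.ONNX_CUDA32, out_dir="./exported_models")

# measure latency on the exported model
optimized_model.benchmark(iterations=10)
```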
Lines changed: 13 additions & 0 deletions
@@ -39,6 +39,19 @@ With the Focoos SDK, you can take advantage of a collection of foundational mode
 | [fai-mf-m-coco-ins](fai_mf.md) | [Mask2Former](https://github.com/facebookresearch/Mask2Former) ([Resnet-101](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py)) | Common Objects (80) | [COCO](https://cocodataset.org/#home) | segm/AP: 43.09<br>segm/AP50: 65.87 | 70 |
 | [fai-mf-l-coco-ins](fai_mf.md) | [Mask2Former](https://github.com/facebookresearch/Mask2Former) ([Resnet-101](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py)) | Common Objects (80) | [COCO](https://cocodataset.org/#home) | segm/AP: 44.23<br>segm/AP50: 67.53 | 55 |
 
+<small> AP = Average Precision averaged by class </small> <br>
+<small> AP50 = Average Precision at IoU threshold 0.50 averaged by class </small> <br>
+<small> FPS = Frames per second computed using TensorRT with resolution 640x640 </small> <br>
+
+## Keypoint Detection 🥷
+
+| Model Name | Architecture | Domain (Classes) | Dataset | Metric | FPS Nvidia-T4 |
+|------------|--------------|------------------|----------|---------|--------------|
+| [rtmo-s-coco](rtmo.md) | [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) ([CSP-Darknet](https://github.com/open-mmlab/mmpose/blob/main/mmpose/models/backbones/csp_darknet.py)) | Persons (1) | [COCO](https://cocodataset.org/#home) | keypoints/AP: 67.94<br>keypoints/AP50: 87.86 | 104 |
+| [rtmo-m-coco](rtmo.md) | [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) ([CSP-Darknet](https://github.com/open-mmlab/mmpose/blob/main/mmpose/models/backbones/csp_darknet.py)) | Persons (1) | [COCO](https://cocodataset.org/#home) | keypoints/AP: 70.94<br>keypoints/AP50: 89.47 | 89 |
+| [rtmo-l-coco](rtmo.md) | [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) ([CSP-Darknet](https://github.com/open-mmlab/mmpose/blob/main/mmpose/models/backbones/csp_darknet.py)) | Persons (1) | [COCO](https://cocodataset.org/#home) | keypoints/AP: 72.14<br>keypoints/AP50: 89.85 | 63 |
+
+
 <small> AP = Average Precision averaged by class </small> <br>
 <small> AP50 = Average Precision at IoU threshold 0.50 averaged by class </small> <br>
 <small> FPS = Frames per second computed using TensorRT with resolution 640x640 </small> <br>
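As a usage sketch for the keypoint entries above (mirroring the RTMO example from the PR description, with the model name swapped to `rtmo-m-coco`):

```python
from focoos import ModelManager
from PIL import Image

model = ModelManager.get("rtmo-m-coco")  # any of the rtmo-{s,m,l}-coco models listed above
detections = model.infer("https://public.focoos.ai/samples/federer.jpg", annotate=True, threshold=0.5)
Image.fromarray(detections.image)  # annotated result (keypoint overlay assumed)
```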
