Document ultra light face detector without postproc variant #14
@@ -0,0 +1,52 @@
# Classification
- tensor-group: yes
- layer-type: output
- use-case: face-detection

# Description

A specialized model that detects faces.

Ultra-Light-Fast-Generic-Face-Detector output tensors:

| Name     | Shape         |
|---       |---            |
| [boxes]  | 1 x COUNT x 4 |
| [scores] | 1 x COUNT x 2 |

Where COUNT is a value selected at training time.

# Tensor Decoding Logic

```
Foreach i in COUNT:
    X = boxes_processed(i, 0)
    Y = boxes_processed(i, 1)
    W = boxes_processed(i, 2)
    H = boxes_processed(i, 3)
    S = scores[i][1]
    If S > threshold:
        detection_candidates[j++] = [X, Y, W, H, S]

detections = non_max_suppression(detection_candidates)
```

Where X, Y, W and H are values between 0 and 1. The boxes have already been
processed from the anchors.
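
As a rough illustration of the loop above, the following C sketch walks flat
`boxes` (COUNT x 4) and `scores` (COUNT x 2) arrays and keeps the entries whose
face score exceeds the threshold. It is not the GStreamer decoder; the struct,
the function names and the small `main` are made up for this example, and
non-maximum suppression is left out.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct {
  float x, y, w, h, score;
} candidate_t;

/* Collect every box whose face score exceeds `threshold`.
 * Returns the number of candidates written to `out` (at most `count`). */
static size_t
collect_candidates (const float *boxes, const float *scores, size_t count,
    float threshold, candidate_t *out)
{
  size_t j = 0;

  for (size_t i = 0; i < count; i++) {
    float s = scores[i * 2 + 1];        /* index 1 holds the face score */

    if (s > threshold) {
      out[j].x = boxes[i * 4 + 0];
      out[j].y = boxes[i * 4 + 1];
      out[j].w = boxes[i * 4 + 2];
      out[j].h = boxes[i * 4 + 3];
      out[j].score = s;
      j++;
    }
  }

  /* A real decoder would now run non-maximum suppression on out[0..j). */
  return j;
}

int
main (void)
{
  /* Two fake entries: one face above the threshold, one below. */
  const float boxes[]  = { 0.10f, 0.20f, 0.30f, 0.40f, 0.50f, 0.50f, 0.10f, 0.10f };
  const float scores[] = { 0.05f, 0.95f, 0.80f, 0.20f };
  candidate_t out[2];

  size_t n = collect_candidates (boxes, scores, 2, 0.7f, out);
  printf ("%zu candidate(s), first score %.2f\n", n, out[0].score);
  return 0;
}
```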
# External References
* https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB

# Models
* [Model source](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master)
* [ONNX pre-trained model](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/models/)

# Tensor Decoders
| Framework | Links |
|---        |---    |
| GStreamer | [perm](https://gitlab.freedesktop.org/gstreamer/gstreamer/-/blob/main/subprojects/gst-plugins-bad/gst/tensordecoders/gstfacedetectortensordecoder.c) |

[boxes]: /tensors/ultra-lightweight-face-detection-rfb-320-v1-variant-1-out-boxes-without-postproc.md
[scores]: /tensors/ultra-lightweight-face-detection-rfb-320-v1-variant-1-out-scores.md
@@ -0,0 +1,80 @@
# Classification

- tensor-group: no
- layer-type: output
- use-case: face-detection
- part-of-tensor-groups:
  - [ultra-lightweight-face-detection-rfb-320-v1-without-postproc-out](/tensor-groups/ultra-lightweight-face-detection-rfb-320-v1-without-postproc-out.md)

# Description
Location of the faces detected.
## Boxes Tensor

- tensor-shape: 1 x COUNT x 4
- tensor-datatype: float32
- tensor-id: ultra-lightweight-face-detection-rfb-320-v1-variant-1-out-boxes-without-postproc

### Encoding

Scheme: (top left X, top left Y, width, height)

The COUNT is variable and depends on the model training.

Other constants are extracted from the training process:
- CENTER_VARIANCE = 0.1
- SIZE_VARIANCE = 0.2
- COUNT

The COUNT can be retrieved from the tensor dimensions.

Each entry in the tensor is the variance on the matching anchor, which is also
defined at training time. Each anchor is defined as the following tuple:
`(anchor-center-x, anchor-center-y, anchor-width, anchor-height)`

Once the following formulas have been applied, the bounding box center and its
size are obtained:
```
bounding-box-center-x = top-left-x * CENTER_VARIANCE * anchor-width  + anchor-center-x
bounding-box-center-y = top-left-y * CENTER_VARIANCE * anchor-height + anchor-center-y
bounding-box-width    = expf (width  * SIZE_VARIANCE) * anchor-width
bounding-box-height   = expf (height * SIZE_VARIANCE) * anchor-height
```

**Collaborator:** what is `expf`?

**Author:** `expf(x)` is the C function that does e^(x); would that be clearer?

**Author:** something like: (suggested change)
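
As a concrete reading of the formula (and of the `expf` question above), here
is a minimal C sketch. It is not taken from the GStreamer decoder; the function
name and array layouts are assumptions, and `raw` holds the four values the
document labels top-left-x, top-left-y, width and height.

```c
#include <math.h>

#define CENTER_VARIANCE 0.1f
#define SIZE_VARIANCE   0.2f

/* Decode one raw box entry against its matching anchor.
 * raw    = the 4 floats stored in the tensor for this box
 * anchor = (anchor-center-x, anchor-center-y, anchor-width, anchor-height)
 * out    = (bounding-box-center-x, bounding-box-center-y, width, height) */
void
decode_box (const float raw[4], const float anchor[4], float out[4])
{
  out[0] = raw[0] * CENTER_VARIANCE * anchor[2] + anchor[0]; /* center-x */
  out[1] = raw[1] * CENTER_VARIANCE * anchor[3] + anchor[1]; /* center-y */
  out[2] = expf (raw[2] * SIZE_VARIANCE) * anchor[2];        /* width    */
  out[3] = expf (raw[3] * SIZE_VARIANCE) * anchor[3];        /* height   */
}
```

`expf` is the single-precision exponential from `<math.h>` (e^x), which also
guarantees that the decoded width and height stay positive.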
| box-1      | box-1      | box-1 | box-1  | box-2      | box-2      | box-2 | box-2  | ... | box-COUNT  | box-COUNT  | box-COUNT | box-COUNT |
|---         |---         |---    |---     |---         |---         |---    |---     |--- |---         |---         |---        |---        |
| top-left-X | top-left-Y | width | height | top-left-X | top-left-Y | width | height | ... | top-left-X | top-left-Y | width     | height    |

Memory layout of tensor data:

| Index               | Value      |
|---                  |---         |
| 0                   | top-left-x |
| 1                   | top-left-y |
| 2                   | width      |
| 3                   | height     |
| 4                   | top-left-x |
| 5                   | top-left-y |
| 6                   | width      |
| 7                   | height     |
| ...                 | ...        |
| (COUNT - 1) x 4     | top-left-x |
| (COUNT - 1) x 4 + 1 | top-left-y |
| (COUNT - 1) x 4 + 2 | width      |
| (COUNT - 1) x 4 + 3 | height     |
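
To make the index arithmetic of the table explicit, a tiny C helper might look
like this; the function name and the example COUNT value are hypothetical.

```c
#include <stdio.h>

/* Flat index of field `f` (0 = top-left-x, 1 = top-left-y, 2 = width,
 * 3 = height) of box `i`, matching the memory layout table above. */
static unsigned
box_field_index (unsigned i, unsigned f)
{
  return i * 4 + f;
}

int
main (void)
{
  unsigned count = 4420;   /* hypothetical COUNT, depends on the model */

  printf ("box 0 top-left-x at index %u\n", box_field_index (0, 0));
  printf ("box 1 top-left-x at index %u\n", box_field_index (1, 0));
  printf ("last box height at index %u\n", box_field_index (count - 1, 3));
  return 0;
}
```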
# Models

* [Model source](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master)
* [ONNX pre-trained model](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/models/)

# Tensor Decoders
| Framework | Links |
|---        |---    |
| GStreamer | [perm](https://gitlab.freedesktop.org/gstreamer/gstreamer/-/blob/main/subprojects/gst-plugins-bad/gst/tensordecoders/gstfacedetectortensordecoder.c) |

Review comments on the bounding-box formula:

> I don't really understand this formula, but maybe once you clarify anchor-center-x and anchor-center-y it will be clearer.

> The formula is taken from the tensor decoder... It moves the anchor box by a certain value, I think.