Panoptic-Deeplab functional full network #2
base: main
models/experimental/panoptic_deeplab/tests/test_resnet52_bottleneck.py
6c5b055 to 4e738ec
Working on a cleaner reference implementation and weight processing; will push after rebasing properly. The ASPP layer will still need some work to align it with Detectron2's implementation of the reference model.
The changes have been pushed.
Hi @mbezuljTT, could you please review the PR?
What is the resolution for the shared performance?
Resolution: 1x3x512x1024
@ianastasijevicTT on my team has reviewed your PR; generally it's an OK starting point for the functional PanopticDeepLab. We have our own version of the model being merged into the main tt-metal repo as we speak; it's optimized for a special case of Blackhole and 20 cores. We would have to figure out the easiest way to have two flavors of the model, and at this time we don't want to merge this into tt-metal. FYI @mbahnasTT @ign-saurav, if you want to continue optimization of this model we can share some pointers.
@mbezuljTT, sure, please share the pointers for optimization.
Overview
This PR integrates Panoptic-DeepLab model support into the Tenstorrent codebase (tt-metal), enabling end-to-end semantic + instance segmentation on TTNN devices.
Input Resolution: 1x3x512x1024
Source
Panoptic-DeepLab Model Tree
panoptic_deeplab
├── backbone
│   ├── stem
│   └── bottleneck
│
├── semantic_segmentation_head
│   ├── aspp
│   ├── res3
│   ├── res2
│   └── semantic_head
│
└── instance_segmentation_head
    ├── aspp
    ├── res3
    ├── res2
    │   ├── center_head
    │   └── offset_head
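When mapping reference weights onto this layout, it can help to enumerate the leaf modules as dotted paths. The sketch below mirrors the tree exactly as printed above. `MODEL_TREE` and `leaf_paths` are hypothetical names used only for illustration, not identifiers from this PR.

```python
# Hypothetical mirror of the module tree listed in the PR description.
# Nesting follows the tree as printed; it is not taken from the PR's code.
MODEL_TREE = {
    "backbone": {"stem": {}, "bottleneck": {}},
    "semantic_segmentation_head": {
        "aspp": {},
        "res3": {},
        "res2": {},
        "semantic_head": {},
    },
    "instance_segmentation_head": {
        "aspp": {},
        "res3": {},
        "res2": {"center_head": {}, "offset_head": {}},
    },
}


def leaf_paths(tree, prefix="panoptic_deeplab"):
    """Return dotted paths for every leaf (childless) node in the tree."""
    paths = []
    for name, children in tree.items():
        path = f"{prefix}.{name}"
        if children:
            # Recurse into sub-modules and keep only their leaves.
            paths.extend(leaf_paths(children, path))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    for p in leaf_paths(MODEL_TREE):
        print(p)
```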
Results
PCC Score
Hardware: Wormhole n150
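For readers unfamiliar with the metric, PCC here is taken to mean the Pearson correlation coefficient between the reference model's output and the TTNN output, computed over flattened tensors. The following is a minimal pure-Python sketch of that definition. The function name `pcc` is an assumption for illustration and is not the helper used by the PR's tests.

```python
import math


def pcc(reference, candidate):
    """Pearson correlation coefficient between two equal-length sequences.

    Illustrative only: inputs stand in for flattened reference and device outputs.
    """
    n = len(reference)
    if n != len(candidate) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_c = sum((c - mean_c) ** 2 for c in candidate)
    if var_r == 0 or var_c == 0:
        raise ValueError("PCC is undefined when either input is constant")
    return cov / math.sqrt(var_r * var_c)


if __name__ == "__main__":
    # A candidate that is a scaled copy of the reference correlates perfectly.
    print(pcc([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

A value close to 1.0 indicates that the device output tracks the reference closely. PCC is insensitive to a uniform scale or offset, so it is usually paired with a visual check of the demo output.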
Tests
Full network test (with real weights and real input from image):
pytest models/experimental/panoptic_deeplab/tests/pcc/test_panoptic_deeplab.py
Demo Test:
python models/experimental/panoptic_deeplab/demo/panoptic_deeplab_demo.py -i models/experimental/panoptic_deeplab/resources/input.png -o <output-dir-path>
Performance:
Total device time: 78,156 µs
FPS: 12.8
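The FPS figure is consistent with the reported device time if it is computed as the reciprocal of the per-frame device time at batch size 1 (matching the 1x3x512x1024 input). That derivation is an assumption; the PR does not state how FPS was measured, and host-side overhead is not included. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def fps_from_device_time_us(device_time_us, batch_size=1):
    """Frames per second implied by a per-batch device time in microseconds.

    Assumes device time is the only cost (no host overhead); illustrative only.
    """
    if device_time_us <= 0:
        raise ValueError("device time must be positive")
    return batch_size * 1_000_000 / device_time_us


if __name__ == "__main__":
    fps = fps_from_device_time_us(78_156)
    print(f"{fps:.1f} FPS")  # prints "12.8 FPS"
```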
Observations:
Checklist