@ign-saurav
Owner

ign-saurav commented Sep 10, 2025

Overview

This PR adds Panoptic-DeepLab model support to the Tenstorrent codebase (tt-metal), enabling end-to-end semantic and instance segmentation on TTNN devices.

Input Resolution: (1x3x512x1024)

Source

Panoptic-DeepLab Model Tree

panoptic_deeplab
├── backbone
│   ├── stem
│   └── bottleneck
├── semantic_segmentation_head
│   ├── aspp
│   ├── res3
│   ├── res2
│   └── semantic_head
└── instance_segmentation_head
    ├── aspp
    ├── res3
    ├── res2
    │   ├── center_head
    │   └── offset_head

Results

PCC Score

Hardware: Wormhole n150

| Head | Real weights and input data | Random weights and input data |
| --- | --- | --- |
| Semantic Head | 0.990338 | 0.999981 |
| Instance Center Head | 0.989921 | 1.0 |
| Instance Offset Head | 0.994153 | 0.999999 |
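
For readers unfamiliar with the metric, the sketch below shows one way a PCC score like those above could be computed, assuming PCC here means the Pearson correlation coefficient between the flattened reference output and the flattened device output. The `pcc` function and the sample values are hypothetical illustrations, not the helper or data used by the PR's tests.

```python
import math


def pcc(reference, actual):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper for illustration; the PR's tests may use a
    different utility from the tt-metal repository.
    """
    if len(reference) != len(actual):
        raise ValueError("inputs must have the same length")
    n = len(reference)
    mean_r = sum(reference) / n
    mean_a = sum(actual) / n
    # Covariance and variances (the 1/n factors cancel in the ratio).
    cov = sum((r - mean_r) * (a - mean_a) for r, a in zip(reference, actual))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_r == 0 or var_a == 0:
        raise ValueError("PCC is undefined for constant input")
    return cov / math.sqrt(var_r * var_a)


# Made-up values standing in for flattened reference and device outputs.
reference_output = [0.1, 0.5, 0.9, 1.3]
device_output = [0.11, 0.49, 0.92, 1.28]
score = pcc(reference_output, device_output)
print(f"PCC: {score:.6f}")
```

A score of 1.0 means the two outputs are perfectly linearly correlated; small numerical differences between the reference and the device output pull the score slightly below 1.0.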

Tests

Full network test (with real weights and real input from image):
pytest models/experimental/panoptic_deeplab/tests/pcc/test_panoptic_deeplab.py

Demo Test:
python models/experimental/panoptic_deeplab/demo/panoptic_deeplab_demo.py -i models/experimental/panoptic_deeplab/resources/input.png -o <output-dir-path>

Performance:

Total device time: 78,156 µs
FPS: 12.8
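
The two figures above are consistent with each other if FPS is taken as the reciprocal of the total device time for a single image. A minimal check, assuming batch size 1 (as the 1x3x512x1024 input suggests) and ignoring any host-side overhead:

```python
# Reported total device time for one forward pass, in microseconds.
device_time_us = 78_156

# Frames per second, assuming one image per pass and no host overhead.
fps = 1_000_000 / device_time_us
print(f"FPS: {fps:.1f}")
```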

Observations:

  • At present, the upsampling, depthwise convolution, and copy operations consume the most device time.

Checklist

@ign-saurav changed the title from "panoptic-deeplab full network functional" to "Panoptic-Deeplab functional full network" on Sep 11, 2025
@ign-navaneethk
Collaborator

Working on a cleaner reference implementation and weight processing, will push after rebasing properly. ASPP layer will still need some work to align it to Detectron2's implementation of the reference model.

@ign-navaneethk
Collaborator

The changes have been pushed.

@ign-saurav marked this pull request as ready for review on September 17, 2025, 14:27
@ign-saurav
Owner Author

Hi @mbezuljTT, could you please review the PR?

@mbezuljTT

What is the input resolution for the shared performance numbers?

@ign-saurav
Owner Author

Resolution: 1x3x512x1024

@mbezuljTT

@ianastasijevicTT on my team has reviewed your PR; generally, it's an OK starting point for a functional PanopticDeepLab.

We have our own version of the model being merged into the main tt-metal repo as we speak; it's optimized for a special case of Blackhole with 20 cores.

We would have to figure out the easiest way to support two flavors of the model, so at this time we don't want to merge this into tt-metal. FYI @mbahnasTT

@ign-saurav, if you want to continue optimizing this model, we can share some pointers.

@ign-saurav
Owner Author

@mbezuljTT, sure, please share the pointers for optimization.

7 participants