37 commits
f3123eb  panoptic-deeplab full network functional (ign-saurav, Sep 10, 2025)
9bd023e  torch output tensor shape error fix (ign-saurav, Sep 11, 2025)
39e0934  cleanUp WIP (ign-febin, Sep 12, 2025)
89f2af7  cleanUp WIP (ign-febin, Sep 12, 2025)
b36d3fe  cleanUp WIP (ign-febin, Sep 12, 2025)
d39d81a  reverted to old custom_preprocessor (ign-febin, Sep 15, 2025)
d6de981  decoder test fix (ign-febin, Sep 15, 2025)
f361094  [WIP] Adds trained weight loading support (ign-navaneethk, Sep 12, 2025)
e88f046  Trained weight loading for unit tests (ign-navaneethk, Sep 15, 2025)
75017ea  Fixes layer names in backbone and minor cleanup (ign-navaneethk, Sep 15, 2025)
912dcfa  refactored full net test (ign-febin, Sep 15, 2025)
5414f87  clean full net test (ign-febin, Sep 15, 2025)
a507e51  README WIP (ign-febin, Sep 15, 2025)
8b18461  custom_preprocessor refactor (ign-febin, Sep 15, 2025)
cc07c6e  refactored decoder test (ign-febin, Sep 15, 2025)
c234a3f  Demo added (ign-saurav, Sep 15, 2025)
4e738ec  uniform test infra (ign-febin, Sep 16, 2025)
5bad0c8  fixed errors due to rebase to latest main (ign-saurav, Sep 16, 2025)
db25fef  activation default set to None as per latest changes (ign-saurav, Sep 16, 2025)
aa7eda5  image path fix (ign-febin, Sep 16, 2025)
05bd832  refactored common.py and some other files (ign-akshayr, Sep 16, 2025)
2860ca7  refactor decoder test and demo bug fix (ign-akshayr, Sep 16, 2025)
cda67fb  panoptic output added (ign-saurav, Sep 16, 2025)
4f12a5f  enabled enable_act_double_buffer (ign-saurav, Sep 16, 2025)
6eb6159  for res3 reduced act_block_h size due to memory overlapping error (ign-saurav, Sep 16, 2025)
0bbe5ab  reduced act_block_h in aspp (ign-saurav, Sep 16, 2025)
a8f8e41  demo file refactoring (ign-akshayr, Sep 17, 2025)
d414cd3  Updates reference model for reduced checkpoint key mappings (ign-navaneethk, Sep 16, 2025)
abfe9d3  Splits common file and fixes import issues (ign-navaneethk, Sep 17, 2025)
9ed15f6  Porting of test files (ign-navaneethk, Sep 17, 2025)
71b864c  Updates ASPP and Res blocks for easier weight loading (ign-navaneethk, Sep 17, 2025)
b474ef5  Fixes copyright string in decoder file (ign-navaneethk, Sep 17, 2025)
204f980  perf_test fix, runner added and fixed multiple run (ign-febin, Sep 17, 2025)
a0d02ce  updated README file and resolved comments (ign-akshayr, Sep 17, 2025)
61734ba  cleanup post-processing and demo files (ign-saurav, Sep 17, 2025)
f229cbb  labels added to panoptic segmentation (ign-saurav, Sep 18, 2025)
5cae782  instances added for the labels with multiple occurrences (ign-saurav, Sep 19, 2025)
159 changes: 159 additions & 0 deletions models/experimental/panoptic_deeplab/README.md
@@ -0,0 +1,159 @@
# Panoptic-DeepLab (TT-NN)

**Platforms:** Wormhole (n150)
**Supported Input Resolution:** `(512, 1024)` = (Height, Width)

## Introduction
Panoptic-DeepLab is a state-of-the-art bottom-up method for panoptic segmentation: it assigns a semantic label (e.g., person, dog, cat) to every pixel in the input image, as well as an instance id (e.g., 1, 2, 3) to pixels belonging to "thing" classes.
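
A minimal sketch of the panoptic encoding this produces (the `label_divisor` convention below is a common assumption for illustration, not taken from this repository):

```python
import numpy as np

# Assumed convention: panoptic_id = semantic_id * label_divisor + instance_id,
# with instance_id = 0 for "stuff" classes.
label_divisor = 1000
semantic = np.array([[11, 11], [13, 13]])  # e.g. person = 11, car = 13 ("thing" classes)
instance = np.array([[1, 1], [1, 2]])      # per-pixel instance ids
panoptic = semantic * label_divisor + instance
print(panoptic)  # [[11001 11001]
                 #  [13001 13002]]
```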

This repository provides:
- A **reference PyTorch model** for correctness.
- A **TT-NN implementation** for Tenstorrent hardware (Wormhole).
- A **demo pipeline**, **tests**, and **resources** (weights + sample assets).

## Table of Contents
- [Prerequisites](#prerequisites)
- [Repository Layout](#repository-layout)
- [Weights](#weights)
- [Quickstart](#quickstart)
- [Run Tests](#run-tests)
- [Run the Demo](#run-the-demo)
- [Custom Images](#custom-images)
- [Performance](#performance)
- [Configuration Notes](#configuration-notes)

## Prerequisites
- Clone the **tt-metal** repository (source code & toolchains):
<https://github.com/tenstorrent/tt-metal>
- Install **TT-Metalium™ / TT-NN™**:
Follow the official instructions: <https://github.com/tenstorrent/tt-metal/blob/main/INSTALLING.md>
- (Optional, for profiling) Build with profiler enabled:
```bash
./build_metal.sh --enable-profiler
```

## Repository Layout
```
models/
└── experimental/
└── panoptic_deeplab/
├── resources/
│ ├── test_inputs/
│ │ └── input_torch_input.pt # generated and stored during runtime
│ ├── input.png
│ ├── Panoptic_Deeplab_R52.pkl # downloaded during runtime if not present in the directory
│ └── panoptic_deeplab_weights_download.sh
├── reference/
│ ├── aspp.py
│ ├── decoder.py
│ ├── head.py
│ ├── panoptic_deeplab.py # TorchPanopticDeepLab (reference)
│ ├── res_block.py
│ ├── resnet52_backbone.py
│ ├── resnet52_bottleneck.py
│ ├── resnet52_stem.py
│ └── utils.py
├── tt/
│ ├── aspp.py
│ ├── backbone.py
│ ├── bottleneck.py
│ ├── custom_peprocessing.py
│ ├── decoder.py
│ ├── head.py
│ ├── panoptic_deeplab.py
│ ├── res_block.py
│ ├── stem.py
│ └── utils.py
├── runner/
│ └── runner.py
├── common.py
├── README.md
├── demo/
│ ├── config.py
│ ├── post_proessing.py
│ └── panoptic_deeplab_demo.py # CLI demo
└── tests/
    ├── perf/
    │   └── test_perf.py
    └── pcc/
        ├── test_panoptic_deeplab.py  # end-to-end pytest
        ├── test_aspp.py
        ├── test_decoder.py
        ├── test_head.py
        ├── test_residual_block.py
        ├── test_resnet52_backbone.py
        ├── test_resnet52_bottleneck.py
        └── test_resnet52_stem.py
```

## Weights
The default model expects `Panoptic_Deeplab_R52.pkl` in:

```
models/experimental/panoptic_deeplab/resources/Panoptic_Deeplab_R52.pkl
```
If missing, the code will attempt to run:
```
models/experimental/panoptic_deeplab/resources/panoptic_deeplab_weights_download.sh
```
Note: The weights are for Cityscapes panoptic segmentation with an R-52 backbone.
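
A hedged sketch of loading the reference model with the helper in `common.py` (this assumes `TorchPanopticDeepLab` takes no constructor arguments; see `common.py` in this PR for the actual loader):

```python
from models.experimental.panoptic_deeplab.common import load_torch_model_state
from models.experimental.panoptic_deeplab.reference.panoptic_deeplab import TorchPanopticDeepLab

model = TorchPanopticDeepLab()         # assumed no-arg constructor
model = load_torch_model_state(model)  # auto-downloads the .pkl via the script if missing
```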

## Quickstart
### Run Tests
```
pytest models/experimental/panoptic_deeplab/tests/pcc/test_panoptic_deeplab.py
```
This runs an end-to-end flow that:
- Loads the Torch reference,
- Runs the TT-NN graph,
- Post-processes outputs,
- Optionally compares results and saves artifacts.

### Run the Demo
```
python models/experimental/panoptic_deeplab/demo/panoptic_deeplab_demo.py \
--input <path/to/image.png> \
--output <path/to/output_dir>
```
### Custom Images
You can place your image(s) under:
```
models/experimental/panoptic_deeplab/resources/
```
Then re-run the demo:
```
python models/experimental/panoptic_deeplab/demo/panoptic_deeplab_demo.py \
    -i models/experimental/panoptic_deeplab/resources/input.png \
    -o models/experimental/panoptic_deeplab/resources
```
Note: The demo currently assumes input images from the Cityscapes dataset, and post-processing is applied accordingly.

To visualize a comparison of the PyTorch and TT-NN head outputs, enable `save_comparison` in `demo/config.py`.


## Performance
### Single Device (BS=1):

- End-to-end performance: `12.81` FPS

To run perf test:
```
pytest models/experimental/panoptic_deeplab/tests/perf/test_perf.py
```

To collect perf reports with the profiler, build with `--enable-profiler` (see [Prerequisites](#prerequisites)).

## Configuration Notes

- Resolution: (H, W) = (512, 1024) is supported end-to-end.

- Device: The demo opens a Wormhole device (default id typically 0). If you need to change it, adjust the DemoConfig or the device open call in the demo.

- Batch Size: Demo/tests are written for BS=1. For larger BS you’ll need to verify memory layouts and tile alignment.

- Memory Layouts: The TT-NN path uses ROW_MAJOR layout for resize ops and may pad channels to multiples of 32 to satisfy kernel/tile alignment (see the sketch after this list).

- Weights: The loader maps Detectron/PDL keys → internal module keys. If the weights are missing, it auto-downloads them via the included script.
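
A minimal sketch of the tile-alignment padding mentioned above, assuming channels are rounded up to the next multiple of 32 (the TT-NN tile dimension):

```python
def pad_to_tile(channels: int, tile: int = 32) -> int:
    """Round channels up to the next multiple of the tile size."""
    return ((channels + tile - 1) // tile) * tile

assert pad_to_tile(3) == 32     # RGB input pads to one full tile
assert pad_to_tile(256) == 256  # already tile-aligned
```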
238 changes: 238 additions & 0 deletions models/experimental/panoptic_deeplab/common.py
@@ -0,0 +1,238 @@
# SPDX-FileCopyrightText: © 2025 Tenstorrent Inc.
# SPDX-License-Identifier: Apache-2.0

import os
import pickle
import ttnn
import torch
import numpy as np
import torchvision.transforms as transforms

from PIL import Image
from loguru import logger
from typing import Tuple, Optional, Any
from ttnn.model_preprocessing import infer_ttnn_module_args

from models.experimental.panoptic_deeplab.reference.resnet52_backbone import ResNet52BackBone as TorchBackbone
from models.experimental.panoptic_deeplab.reference.resnet52_stem import DeepLabStem
from models.experimental.panoptic_deeplab.reference.aspp import ASPPModel
from models.experimental.panoptic_deeplab.reference.decoder import DecoderModel
from models.experimental.panoptic_deeplab.reference.res_block import ResModel
from models.experimental.panoptic_deeplab.reference.head import HeadModel
from models.experimental.panoptic_deeplab.reference.panoptic_deeplab import TorchPanopticDeepLab
from models.experimental.panoptic_deeplab.reference.resnet52_bottleneck import Bottleneck


# ---------------------------
# Key mapping & model loading
# ---------------------------

key_mappings = {
# Semantic head mappings
"sem_seg_head.": "semantic_decoder.",
".predictor.": ".head_1.predictor.",
".head.pointwise.": ".head_1.conv2.",
".head.depthwise.": ".head_1.conv1.",
# Instance head mappings
"ins_embed_head.": "instance_decoder.",
".center_head.0.": ".head_2.conv1.",
".center_head.1.": ".head_2.conv2.",
".center_predictor.": ".head_2.predictor.",
".offset_head.depthwise.": ".head_1.conv1.",
".offset_head.pointwise.": ".head_1.conv2.",
".offset_predictor.": ".head_1.predictor.",
# ASPP mappings (res5 -> aspp)
"decoder.res5.project_conv": "aspp",
# Decoder res3 mappings
".decoder.res3.": ".res3.",
# Decoder res2 mappings
".decoder.res2.": ".res2.",
}


def map_single_key(checkpoint_key):
for key, value in key_mappings.items():
checkpoint_key = checkpoint_key.replace(key, value)
return checkpoint_key
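
# Illustrative example (hypothetical checkpoint key): "sem_seg_head." is first
# rewritten to "semantic_decoder.", then ".decoder.res3." to ".res3.", so
#   "sem_seg_head.decoder.res3.fuse_conv.weight"
# becomes
#   "semantic_decoder.res3.fuse_conv.weight"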


def load_partial_state(torch_model: torch.nn.Module, state_dict, layer_name: str = ""):
partial_state_dict = {}
layer_prefix = layer_name + "."
for k, v in state_dict.items():
if k.startswith(layer_prefix):
partial_state_dict[k[len(layer_prefix) :]] = v
torch_model.load_state_dict(partial_state_dict, strict=True)
    logger.info("Successfully loaded all mapped weights with strict=True")
return torch_model


def load_torch_model_state(torch_model: torch.nn.Module = None, layer_name: str = "", model_location_generator=None):
    if model_location_generator is None or "TT_GH_CI_INFRA" not in os.environ:
model_path = "models"
else:
model_path = model_location_generator("vision-models/panoptic_deeplab", model_subdir="", download_if_ci_v2=True)
if model_path == "models":
if not os.path.exists(
"models/experimental/panoptic_deeplab/resources/Panoptic_Deeplab_R52.pkl"
): # check if Panoptic_Deeplab_R52.pkl is available
os.system(
"models/experimental/panoptic_deeplab/resources/panoptic_deeplab_weights_download.sh"
) # execute the panoptic_deeplab_weights_download.sh file
weights_path = "models/experimental/panoptic_deeplab/resources/Panoptic_Deeplab_R52.pkl"
else:
weights_path = os.path.join(model_path, "Panoptic_Deeplab_R52.pkl")

# Load checkpoint
with open(weights_path, "rb") as f:
checkpoint = pickle.load(f, encoding="latin1")
state_dict = checkpoint["model"]

converted_count = 0
for k, v in state_dict.items():
        if isinstance(v, np.ndarray):  # note: np.array is a function, not a type
state_dict[k] = torch.from_numpy(v)
converted_count += 1

# Get keys
checkpoint_keys = set(state_dict.keys())

# Get key mappings
logger.info("Mapping keys...")
key_mapping = {}
for checkpoint_key in checkpoint_keys: # pickle key
mapped_key = map_single_key(checkpoint_key)
key_mapping[checkpoint_key] = mapped_key

# Apply mappings
mapped_state_dict = {}
for checkpoint_key, model_key in key_mapping.items():
mapped_state_dict[model_key] = state_dict[checkpoint_key]
del mapped_state_dict["pixel_mean"]
del mapped_state_dict["pixel_std"]
logger.debug(f"Mapped {len(mapped_state_dict)} weights")

if isinstance(
torch_model,
(
DeepLabStem,
Bottleneck,
TorchBackbone,
ASPPModel,
ResModel,
HeadModel,
DecoderModel,
),
):
torch_model = load_partial_state(torch_model, mapped_state_dict, layer_name)
elif isinstance(torch_model, TorchPanopticDeepLab):
torch_model.load_state_dict(mapped_state_dict, strict=True)
else:
raise NotImplementedError("Unknown torch model. Weight loading not implemented")

return torch_model.eval()
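
# Illustrative partial load (hypothetical layer_name; it must match the prefix
# of the sub-module's weights in the mapped checkpoint keys):
#   backbone = TorchBackbone()
#   backbone = load_torch_model_state(backbone, layer_name="backbone")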


def _infer_and_set(module, params_holder, attr_name, run_fn):
"""Infer conv args for a TTNN module and set them if present in parameters."""
if hasattr(params_holder, attr_name):
args = infer_ttnn_module_args(model=module, run_model=run_fn, device=None)
getattr(params_holder, attr_name).conv_args = args


def _populate_decoder(torch_dec: torch.nn.Module = None, params_dec: dict = None):
"""Warm up a single decoder (semantic or instance) to populate conv_args."""
if not (torch_dec and params_dec):
return

# Synthetic tensors that match typical Panoptic-DeepLab strides
input_tensor = torch.randn(1, 2048, 32, 64)
res3_tensor = torch.randn(1, 512, 64, 128)
res2_tensor = torch.randn(1, 256, 128, 256)

# ASPP
_infer_and_set(torch_dec.aspp, params_dec, "aspp", lambda m: m(input_tensor))
aspp_out = torch_dec.aspp(input_tensor)

# res3
_infer_and_set(torch_dec.res3, params_dec, "res3", lambda m: m(aspp_out, res3_tensor))
res3_out = torch_dec.res3(aspp_out, res3_tensor)

# res2
_infer_and_set(torch_dec.res2, params_dec, "res2", lambda m: m(res3_out, res2_tensor))
res2_out = torch_dec.res2(res3_out, res2_tensor)

# heads (one or two, if present)
if hasattr(torch_dec, "head_1"):
_infer_and_set(torch_dec.head_1, params_dec, "head_1", lambda m: m(res2_out))
if hasattr(torch_dec, "head_2"):
_infer_and_set(torch_dec.head_2, params_dec, "head_2", lambda m: m(res2_out))


def _populate_all_decoders(torch_model: torch.nn.Module = None, parameters: dict = None):
if hasattr(parameters, "semantic_decoder"):
_populate_decoder(torch_model.semantic_decoder, parameters.semantic_decoder)
if hasattr(parameters, "instance_decoder"):
_populate_decoder(torch_model.instance_decoder, parameters.instance_decoder)


def preprocess_image(
image_path: str, input_width: int, input_height: int, ttnn_device: ttnn.Device, inputs_mesh_mapper: Optional[Any]
) -> Tuple[torch.Tensor, ttnn.Tensor, np.ndarray, Tuple[int, int]]:
"""Preprocess image for both PyTorch and TTNN"""
# Load image
image = Image.open(image_path).convert("RGB")
original_size = image.size # (width, height)
original_array = np.array(image)
preprocess = transforms.Compose(
[transforms.ToTensor(), transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])]
)

# Resize to model input size
target_size = (input_width, input_height) # PIL expects (width, height)
image_resized = image.resize(target_size)

# PyTorch preprocessing
torch_tensor = preprocess(image_resized).unsqueeze(0) # Add batch dimension
torch_tensor = torch_tensor.to(torch.float)

    # TTNN preprocessing
    ttnn_tensor = ttnn.from_torch(
        torch_tensor.permute(0, 2, 3, 1),  # BCHW -> BHWC
        dtype=ttnn.bfloat16,
        device=ttnn_device,
        mesh_mapper=inputs_mesh_mapper,
    )

    # Copy the tensor back to host once and discard the result (round-trip
    # check kept from the original flow)
    _ = ttnn.to_torch(ttnn_tensor)

return torch_tensor, ttnn_tensor, original_array, original_size
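
# Illustrative usage (assumes an open TT device handle; paths follow the default
# resources layout):
#   torch_in, tt_in, orig_np, orig_size = preprocess_image(
#       "models/experimental/panoptic_deeplab/resources/input.png",
#       input_width=1024, input_height=512,
#       ttnn_device=device, inputs_mesh_mapper=None,
#   )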


def save_preprocessed_inputs(torch_input: torch.Tensor, save_dir: str, filename: str):
"""Save preprocessed inputs for testing purposes"""

# Create directory for test inputs
test_inputs_dir = os.path.join(save_dir, "test_inputs")
os.makedirs(test_inputs_dir, exist_ok=True)

# Save torch input tensor
torch_input_path = os.path.join(test_inputs_dir, f"{filename}_torch_input.pt")
torch.save(
{
"tensor": torch_input,
"shape": torch_input.shape,
"dtype": torch_input.dtype,
"mean": torch_input.mean().item(),
"std": torch_input.std().item(),
"min": torch_input.min().item(),
"max": torch_input.max().item(),
},
torch_input_path,
)

logger.info(f"Saved preprocessed torch input to: {torch_input_path}")

return torch_input_path