Polyp Instrument Segmentation Using DeepLabv3+ with Custom Layers

1. Introduction and Motivation

Accurate instrument segmentation in endoscopic imagery is vital for automated surgical systems: it supports tool tracking and scene understanding, and can help make procedures less invasive. This project explores a custom implementation of DeepLabv3+ with a ResNet50 backbone, extended with custom convolution and dilated spatial pyramid pooling (DSPP) layers. It focuses on robust segmentation of endoscopic instruments from the Kvasir-Instrument dataset, combining image preprocessing with threshold-based evaluation of the predictions.

2. Research Objectives

Develop an efficient DeepLabv3+ architecture with custom layers.
Improve segmentation robustness through preprocessing techniques like histogram equalization and dilation.
Evaluate performance using Dice Coefficient and Jaccard Index.
Visualize qualitative performance under varying prediction thresholds.

Key Research Questions

Does preprocessing improve segmentation quality in medical instrument images?
How effective is the custom DSPP module for fine-grained instrument boundary detection?
Can we reach near real-time inference without compromising accuracy?

3. Dataset

Source: Kvasir-Instrument Dataset
Inputs: RGB endoscopy images (512x512)
Masks: Binary instrument masks
Split: 531 training, 80 validation, 59 test images

4. Image Preprocessing

Images: Histogram equalization + Gaussian smoothing (kernel = 4x4)
Masks: Binary conversion and dilation (kernel = 2x2)
Libraries: OpenCV, NumPy

5. Model Architecture

5.1 Custom Layers

ConvBlock: Modular convolutional layer with optional batch normalization and activation.
DilatedSpatialPyramidPooling: Captures multi-scale context using dilated convolutions and global pooling.
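
As a rough sketch of what these two building blocks might look like in Keras, the example below writes them as plain functions rather than layer subclasses. The names `conv_block` and `dspp`, the 256-filter width, and the use of a 1x1 kernel for the rate-1 branch are assumptions for illustration; the notebook's actual classes may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters=256, kernel_size=3, dilation_rate=1,
               use_bn=True, activation="relu"):
    """Convolution with optional batch normalization and activation."""
    x = layers.Conv2D(filters, kernel_size, padding="same",
                      dilation_rate=dilation_rate, use_bias=not use_bn)(x)
    if use_bn:
        x = layers.BatchNormalization()(x)
    if activation is not None:
        x = layers.Activation(activation)(x)
    return x

def dspp(x, filters=256, rates=(1, 6, 12, 18)):
    """Dilated spatial pyramid pooling: parallel dilated convs plus global pooling."""
    # Assumes the feature map has static spatial dimensions.
    h, w = x.shape[1], x.shape[2]

    # Image-level branch: pool to 1x1, project, then upsample back to h x w.
    pooled = layers.AveragePooling2D(pool_size=(h, w))(x)
    pooled = conv_block(pooled, filters, kernel_size=1)
    pooled = layers.UpSampling2D(size=(h, w), interpolation="bilinear")(pooled)

    branches = [pooled]
    for rate in rates:
        # Assumption: rate 1 uses a 1x1 kernel, larger rates use dilated 3x3.
        kernel = 1 if rate == 1 else 3
        branches.append(conv_block(x, filters, kernel_size=kernel,
                                   dilation_rate=rate))

    merged = layers.Concatenate(axis=-1)(branches)
    # Fuse all branches back to a single feature map.
    return conv_block(merged, filters, kernel_size=1)
```

Each branch keeps the spatial size of its input, so the module can be dropped onto any encoder feature map without changing the decoder's upsampling factors.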

5.2 DeepLabv3+ Backbone

Encoder: ResNet50 pretrained on ImageNet.
ASPP: Custom DSPP module with dilation rates [1, 6, 12, 18].
Decoder: Concatenation with low-level features + ConvBlocks + UpSampling2D
Output: 1-channel sigmoid for binary segmentation
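
The encoder/decoder wiring described above might look roughly like the sketch below. It is an assumption-laden outline rather than the notebook's model: `build_sketch` is a hypothetical name, a single 1x1 convolution stands in for the DSPP module to keep the example short, the tapped ResNet50 layers (`conv4_block6_2_relu`, `conv2_block3_2_relu`) and the 48-filter low-level projection follow common DeepLabv3+ practice and are not confirmed by this README, and `weights=None` is the default only to avoid a download (pass `weights="imagenet"` for the pretrained encoder).

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_sketch(input_shape=(512, 512, 3), weights=None):
    """DeepLabv3+-style wiring on ResNet50 (illustrative, DSPP replaced by a 1x1 conv)."""
    inputs = layers.Input(shape=input_shape)
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=weights, input_tensor=inputs)

    # Assumed tap points: 1/16-resolution deep features, 1/4-resolution low-level features.
    deep = backbone.get_layer("conv4_block6_2_relu").output
    low = backbone.get_layer("conv2_block3_2_relu").output

    # Stand-in for the custom DSPP module.
    x = layers.Conv2D(256, 1, padding="same", activation="relu")(deep)
    x = layers.UpSampling2D(size=4, interpolation="bilinear")(x)  # 1/16 -> 1/4

    # Reduce low-level channels before concatenation.
    low = layers.Conv2D(48, 1, padding="same", activation="relu")(low)
    x = layers.Concatenate(axis=-1)([x, low])

    # Refinement convolutions, then upsample to the input resolution.
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling2D(size=4, interpolation="bilinear")(x)  # 1/4 -> 1/1

    # Single-channel sigmoid output for binary segmentation.
    outputs = layers.Conv2D(1, 1, padding="same", activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)
```

With a 512x512 input, the deep features are 32x32 and the low-level features 128x128, so two 4x upsampling steps restore the full resolution.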

5.3 Model Details

Input Shape: (512, 512, 3)
Loss: Binary Crossentropy
Metrics: Dice Loss, Dice Coefficient, Jaccard Index
Optimizer: Adam
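
For reference, the overlap metrics named above can be written in a few lines of NumPy. This is a generic sketch of the standard definitions, not the notebook's code; the function names and the `smooth` term (a small constant that avoids division by zero on empty masks) are assumptions, and a training-time version would need to use tensor operations instead.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary or soft masks."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth)

def jaccard_index(y_true, y_pred, smooth=1e-6):
    """Jaccard (IoU) = |A ∩ B| / |A ∪ B| for binary or soft masks."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    union = y_true.sum() + y_pred.sum() - intersection
    return (intersection + smooth) / (union + smooth)

def dice_loss(y_true, y_pred, smooth=1e-6):
    """Dice loss is one minus the Dice coefficient."""
    return 1.0 - dice_coefficient(y_true, y_pred, smooth)
```

For example, with a ground truth of `[1, 1, 0, 0]` and a prediction of `[1, 0, 0, 0]`, the intersection is 1, giving a Dice of about 2/3 and a Jaccard of about 1/2.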

6. Training Strategy

Epochs: 25
Batch Size: 8
Learning Rate: Adam default (0.001 in Keras)
Callback Strategy: None (no early stopping); validation metrics inspected manually

Training Metrics

Epoch	Val Dice	Val Jaccard
15	0.8672	0.7685
21	0.9725	0.9466
25	0.9731	0.9477

7. Evaluation

Split	Dice	Jaccard	Loss
Train	0.9781	0.9571	0.0071
Validation	0.9761	0.9534	0.0079
Test	0.8649	0.7761	0.1596
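
A quick consistency check on the table: for a single pair of masks, Dice and Jaccard are tied by J = D / (2 - D). The sketch below applies that identity to the reported Dice values; the helper name `jaccard_from_dice` is hypothetical. The train and validation rows agree with the identity to within about 0.0001, while the reported test Jaccard is about 0.014 higher than the value implied by the test Dice. One possible explanation, not confirmed by the README, is that the metrics are averaged per image or per batch: because D / (2 - D) is convex, an averaged Jaccard can exceed the value implied by the averaged Dice.

```python
def jaccard_from_dice(dice):
    """Jaccard implied by a Dice score for a single prediction: J = D / (2 - D)."""
    return dice / (2.0 - dice)

# (Dice, Jaccard) pairs copied from the evaluation table above.
reported = {
    "Train": (0.9781, 0.9571),
    "Validation": (0.9761, 0.9534),
    "Test": (0.8649, 0.7761),
}

for split, (dice, jaccard) in reported.items():
    implied = jaccard_from_dice(dice)
    print(f"{split}: implied Jaccard {implied:.4f}, "
          f"reported {jaccard:.4f}, gap {jaccard - implied:+.4f}")
```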

8. Prediction and Visualization

Evaluated predictions at two thresholds: 0.3 and 0.99
Observed robustness in boundary detection across tools of varying shape and brightness
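
To illustrate what the two thresholds do, the sketch below binarizes a small, made-up probability map at 0.3 and at 0.99. The `binarize` helper and the sample values are hypothetical; in practice the input would be the model's sigmoid output.

```python
import numpy as np

def binarize(prob_map, threshold):
    """Turn sigmoid probabilities into a {0, 1} mask at the given threshold."""
    return (np.asarray(prob_map) > threshold).astype(np.uint8)

# Toy probability map standing in for a model prediction.
probs = np.array([[0.05, 0.35, 0.995],
                  [0.20, 0.80, 0.999]])

loose = binarize(probs, 0.3)    # keeps every pixel above 0.3
strict = binarize(probs, 0.99)  # keeps only very confident pixels

print(loose)
print(strict)
```

The strict mask is always a subset of the loose one, so comparing the two shows how much of a predicted instrument region rests on low-confidence pixels, typically those near the boundary.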

9. Conclusion

This project demonstrates the utility of combining handcrafted preprocessing with custom deep semantic segmentation models. It highlights:

The effectiveness of histogram equalization + dilation for medical imagery
Custom DSPP’s ability to capture fine spatial detail
Competitive segmentation scores with a minimal train-validation gap, though the test Dice (0.8649) trails the validation Dice (0.9761)

Future Work

Apply post-processing (e.g., CRF, morphological refinements)
Explore multi-class segmentation across different tools
Deploy model in real-time surgical assistance setups
