fanqicheng/cbioportal-mcp-sample-code

This repository contains a partial code sample for preprocessing Whole Slide Images (WSIs), such as those in the TCGA and PANDA datasets. It is designed to turn very large raw clinical images into AI-ready inputs.

Key Features

  • Intelligent Tissue Detection: Implements a robust contour detection algorithm using OpenCV to automatically separate histological tissue from background noise.
  • Multi-Resolution Processing: Handles hierarchical TIFF and Zarr structures, allowing for efficient downsampling and high-resolution patch extraction.
  • Heuristic-Based Patch Ranking: Features a scoring mechanism (rank_key) that prioritizes patches based on tissue coverage and edge density, optimizing data quality for downstream models.
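
The patch-ranking idea can be illustrated with a minimal sketch. This is not the repository's rank_key implementation; the function names, thresholds, and weights below are assumptions chosen for illustration, and plain NumPy operations stand in for the OpenCV-based tissue detection. It scores a grayscale patch by combining tissue coverage (the fraction of non-background pixels) with edge density (the fraction of pixels with a strong intensity gradient).

```python
import numpy as np


def tissue_coverage(patch_gray, background_threshold=220):
    """Fraction of pixels darker than an assumed near-white background level."""
    return float(np.mean(patch_gray < background_threshold))


def edge_density(patch_gray, gradient_threshold=20.0):
    """Fraction of pixels whose gradient magnitude exceeds an assumed threshold."""
    gy, gx = np.gradient(patch_gray.astype(np.float32))
    magnitude = np.hypot(gx, gy)
    return float(np.mean(magnitude > gradient_threshold))


def rank_key_sketch(patch_gray, coverage_weight=0.7, edge_weight=0.3):
    """Hypothetical stand-in for rank_key: a weighted sum of the two heuristics."""
    return (coverage_weight * tissue_coverage(patch_gray)
            + edge_weight * edge_density(patch_gray))


# Three synthetic 64x64 grayscale patches: empty glass, half tissue, dense texture.
rng = np.random.default_rng(0)
blank = np.full((64, 64), 255, dtype=np.uint8)
half = blank.copy()
half[:, :32] = 100
textured = rng.integers(0, 200, size=(64, 64), dtype=np.uint8)

patches = {"blank": blank, "half": half, "textured": textured}

# Highest score first, so the most informative patches are kept.
ranked = sorted(patches, key=lambda name: rank_key_sketch(patches[name]), reverse=True)
print(ranked)
```

A weighted sum keeps the score easy to tune. In practice the weights and thresholds would need calibrating against the stain and scanner characteristics of the dataset in use.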

Technical Stack

  • Languages: Python
  • Core Libraries: OpenCV, NumPy, tifffile, Zarr, Pillow (PIL)
  • Domain Focus: Computational Pathology, Cancer Genomics (TCGA)

Relevance to cBioPortal MCP Project

While my professional UI development work at Amazon is proprietary, this project demonstrates my ability to design the backend data logic and pruning strategies required for the cBioPortal MCP server. It shows that I can handle large, high-dimensional datasets from cancer research and translate them into optimized formats for visualization.

About

GSoC 2026 Proposal Code Sample - WSI Processing
