Deep learning-based histomorphological image phenotyping reclassifies small cell lung cancer and improves risk stratification
In this study, we perform comprehensive deep learning-based histomorphological image phenotyping of small cell lung cancer (SCLC) patients from multi-center cohorts. Deep representation learning on tissue images constructs an atlas of histomorphological phenotypes (HIPO) and reveals two robust HIPO-based SCLC subtypes that correlate with distinct survival outcomes, independent of molecular subtyping and clinical features. We propose a clinically applicable image-based histomorphological phenotyping stratification system (HIPOS) to evaluate imaging intratumor ecosystem heterogeneity and improve risk stratification in a discovery cohort of 348 patients, and validate it in two independent cohorts of 67 and 109 patients from other medical centers. HIPOS consistently showed independent prognostic performance in predicting overall survival and disease-free survival and added prognostic value beyond tumor–node–metastasis stage and molecular subtypes.
NVIDIA GeForce RTX 4080 Laptop GPU
This package supports Linux and Windows. It has been tested on the following systems:
Linux 3.10.0-957.el7.x86_64
Windows 11 x64
Python version: 3.8.19
matplotlib version: 3.7.2
pandas version: 2.0.3
numpy version: 1.23.4
pytorch version: 1.11.0
CUDA available: True
CUDA version: 11.3
cuDNN enabled: True
cuDNN version: 8200
Pillow 8.2.0
opencv-python 4.5.5.64
openslide-python 1.1.1
Scikit-learn 0.24.1
R version 4.3.0
We recommend installing the environment on a Linux 3.10.0-957.el7.x86_64 system.
- First, install Anaconda3.
- Then install CUDA 11.x and cuDNN.
- Finally, install the Python libraries listed above.
The installation is estimated to take 1 hour, depending on the network environment.
- Convert the SVS file to PNG format.
- WSIs were segmented into 224x224-pixel tiles at 5x resolution.
- Artifacts were filtered using Otsu's thresholding, retaining tiles with ≥60% tissue coverage.
- Stain normalization was performed with Reinhard's method.
python "./data/H&E Tile Segmentation with watershed.py"
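The tissue-coverage filter described above can be sketched as follows. This is a minimal NumPy illustration, not the repository's script: it assumes the Otsu threshold is computed once on a grayscale slide-level image and that pixels at or below the threshold count as tissue. The function names `otsu_threshold`, `tissue_fraction`, and `keep_tile` are hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold of an 8-bit grayscale image (2-D uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the "dark" class up to each level
    mu = np.cumsum(prob * np.arange(256))    # cumulative intensity mean up to each level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance; levels where one class is empty are set to 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0
    return int(np.argmax(sigma_b))

def tissue_fraction(tile_rgb, threshold):
    """Fraction of pixels in an RGB tile that are at or below the threshold (assumed tissue)."""
    gray = tile_rgb.mean(axis=2)
    return float((gray <= threshold).mean())

def keep_tile(tile_rgb, threshold, min_coverage=0.60):
    """Keep a tile only if its tissue coverage is at least min_coverage (60% in the text)."""
    return tissue_fraction(tile_rgb, threshold) >= min_coverage
```

Computing the threshold at slide level rather than per tile avoids meaningless thresholds on tiles that are entirely background; whether the original pipeline does this is not stated in the source.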
The file CC_Model.py provides an example of how to extract features from each tile, given its coordinates, using a ResNet50 pre-trained on the ImageNet dataset.
Code to train such a model is available here: https://github.com/topics/resnet50.
python "./code/Contrastive Clustering/CC_Model.py"
correlation.py takes two CSV files as input: Patch_Feature.csv, which contains the features of the image patches, and Patch_Cluster.csv, which contains the cluster labels of those patches. The script groups patches by cluster label, calculates the mean feature vector of each cluster, and outputs a correlation matrix (Corrletation_matrix_pd) that quantifies the relationships between clusters. These inter-cluster correlations show which clusters are similar and aid in interpreting the clustering results.
python "./code/Intercluster correlation/correlation.py"
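The computation described above (group patches by cluster, average their features, correlate the cluster means) can be sketched as follows. This is an illustration under assumptions, not the contents of correlation.py: it works on in-memory arrays instead of the CSV files, whose column layout is not documented here, and it assumes Pearson correlation. The function name `cluster_correlation` is hypothetical.

```python
import numpy as np

def cluster_correlation(features, labels):
    """Correlate the mean feature vectors of clusters.

    features: array of shape (n_patches, n_features)
    labels:   array of shape (n_patches,) with one cluster label per patch
    Returns (cluster_ids, corr), where corr is (n_clusters, n_clusters).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    # One row per cluster: the mean of all patch features assigned to it.
    means = np.vstack([features[labels == c].mean(axis=0) for c in cluster_ids])
    # np.corrcoef treats each row as a variable, giving a cluster-by-cluster matrix.
    corr = np.corrcoef(means)
    return cluster_ids, corr
```

Entries close to 1 indicate clusters whose average feature profiles are nearly proportional, which is one way to spot candidate clusters for merging.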
- Patient_Level.py aggregates patch-level cluster information into patient-level features by grouping patches by patient identifier and calculating the proportion of patches assigned to each HIPO type. This generates patient-level representations that can be used for further analysis, such as patient stratification or predictive modeling.
- GMM.R identifies the optimal number of clusters in the similarity matrix using a model-based clustering approach (Mclust). It then classifies the matrix into clusters and reorders it for clear visualization, facilitating the exploration of patterns and relationships between the clustered entities. This helps derive meaningful groupings for downstream analyses, such as feature extraction or classification.
python ./code/GMM_Cluster_to_HIPO/Patient_Level.py
Rscript ./code/GMM_Cluster_to_HIPO/GMM.R
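The patient-level aggregation step can be illustrated with a short standard-library sketch. It is not the code in Patient_Level.py: the input format (parallel lists of patient identifiers and HIPO labels) and the function name `patient_hipo_proportions` are assumptions made for the example.

```python
from collections import Counter, defaultdict

def patient_hipo_proportions(patient_ids, hipo_labels, hipo_types):
    """Return {patient_id: [proportion of patches per HIPO type]}.

    patient_ids: one patient identifier per patch
    hipo_labels: one HIPO label per patch, aligned with patient_ids
    hipo_types:  ordered list of all HIPO types, fixing the feature order
    """
    counts = defaultdict(Counter)
    for pid, hipo in zip(patient_ids, hipo_labels):
        counts[pid][hipo] += 1

    proportions = {}
    for pid, counter in counts.items():
        total = sum(counter.values())
        # HIPO types a patient never shows get a proportion of 0.
        proportions[pid] = [counter[h] / total for h in hipo_types]
    return proportions

# Example: P1 has four patches, P2 has two.
example = patient_hipo_proportions(
    ["P1", "P1", "P1", "P1", "P2", "P2"],
    [0, 0, 1, 2, 1, 1],
    hipo_types=[0, 1, 2],
)
```

Each patient's vector sums to 1, so patients with different numbers of tiles remain comparable.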
The DeepDHP architecture featured two autoencoders:
- A global autoencoder to extract overall tissue characteristics.
- A local autoencoder with an attention mechanism to focus on critical features.
Global and local embeddings were concatenated and passed through a classifier for subtype prediction.
python "./code/DeepDHP_HIPOS/Model Validation.py"
