Team TBG Lab
Task 3

- Load each WSI and divide the useful regions (given by the tissue mask) into 560x560 patches.
- Extract patch-level features and coordinates using Virchow2, a foundation model for histopathology images.
- Cluster the patches with agglomerative clustering and create a single node per cluster from the mean of all patch-level features in that cluster.
- Build a graph for each patient's WSI, adding an unweighted edge whenever the distance between the central coordinates of two nodes is <= 3000 pixels (see the first sketch after this list).
- Extract graph-level embeddings using a custom GCN designed by us (a generic GCN sketch follows below).
- Select the top 2000 genes ranked by deviation between BRS 1/2 and BRS 3.
- Concatenate the GCN embeddings, RNA-seq data, and one-hot encoded data to form the final embedding.
- Finally, train an attention MLP, Random Survival Forest, CoxPH, Survival SVM, and Survival Gradient Boosted Trees, and take a weighted ensemble of the risk scores (see the ensemble sketch below).
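The cluster-pooling and graph-construction steps can be sketched as follows. This is a minimal illustration, assuming the Virchow2 patch features and coordinates are already extracted; the function name, the choice to cluster on patch coordinates, and the number of clusters are assumptions, not the exact pipeline code.

```python
# Sketch: pool patch features into cluster nodes and connect nodes whose
# centroids lie within 3000 pixels. Names and parameters are illustrative.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from scipy.spatial.distance import pdist, squareform


def build_wsi_graph(features: np.ndarray, coords: np.ndarray,
                    n_clusters: int = 50, edge_threshold: float = 3000.0):
    """features: (num_patches, feat_dim) Virchow2 embeddings,
    coords: (num_patches, 2) patch centre coordinates in pixels."""
    # Clustering is shown on coordinates here; the actual pipeline may cluster differently.
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(coords)

    # One node per cluster: mean feature vector and mean (central) coordinate.
    node_feats = np.stack([features[labels == c].mean(axis=0)
                           for c in range(n_clusters)])
    node_coords = np.stack([coords[labels == c].mean(axis=0)
                            for c in range(n_clusters)])

    # Unweighted edge between any two nodes whose centres are <= threshold apart.
    dist = squareform(pdist(node_coords))
    src, dst = np.where((dist <= edge_threshold) & (dist > 0))
    edge_index = np.stack([src, dst])  # shape (2, num_edges), both directions included
    return node_feats, edge_index
```

Clustering on coordinates groups spatially adjacent patches; the 3000-pixel threshold then connects neighbouring cluster centroids into a sparse, unweighted graph.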
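The graph-level embedding step could then look roughly like the generic GCN below (using torch_geometric). This is not the team's custom architecture; the layer sizes and mean-pooling readout are assumptions for illustration only.

```python
# Sketch: a generic two-layer GCN that turns a cluster-node graph into a single
# graph-level embedding. Illustrative only, not the custom architecture used here.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool


class WSIGCN(torch.nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int = 256, out_dim: int = 128):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index, batch):
        # x: (num_nodes, in_dim) cluster-node features,
        # edge_index: (2, num_edges) as a torch.long tensor,
        # batch: graph id per node (all zeros for a single WSI graph).
        x = F.relu(self.conv1(x, edge_index))
        x = self.conv2(x, edge_index)
        return global_mean_pool(x, batch)  # one embedding per graph
```

For a single WSI graph, `batch` is simply a zero tensor of length `num_nodes`, and `edge_index` can be built from the array returned by `build_wsi_graph` above, converted to `torch.long`.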
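The weighted ensemble of risk scores might be assembled as in the sketch below, using scikit-survival estimators for the Random Survival Forest, CoxPH, Survival SVM, and gradient-boosted models (the attention MLP is omitted for brevity). The weights, hyperparameters, and score normalisation are placeholders, not the tuned values.

```python
# Sketch: weighted ensemble of risk scores from several survival models.
# Model choices mirror the pipeline list above; the weights are placeholders.
import numpy as np
from sksurv.ensemble import RandomSurvivalForest, GradientBoostingSurvivalAnalysis
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.svm import FastSurvivalSVM
from sksurv.metrics import concordance_index_censored


def ensemble_risk(X_train, y_train, X_test, weights=(0.3, 0.3, 0.2, 0.2)):
    """y_train is a structured array with event/time fields, as used by sksurv."""
    models = [
        RandomSurvivalForest(n_estimators=200, random_state=42),
        GradientBoostingSurvivalAnalysis(random_state=42),
        CoxPHSurvivalAnalysis(alpha=0.1),
        FastSurvivalSVM(random_state=42),
    ]
    risks = []
    for model in models:
        model.fit(X_train, y_train)
        # With these estimators (ranking objective for the SVM), predict() returns
        # a risk score where higher values indicate higher risk.
        risks.append(model.predict(X_test))
    risks = np.asarray(risks)

    # Normalise each model's scores before mixing so the weights are comparable.
    risks = (risks - risks.mean(axis=1, keepdims=True)) / (risks.std(axis=1, keepdims=True) + 1e-8)
    return np.average(risks, axis=0, weights=weights)


# Evaluation with the concordance index (field names depend on how y was built):
# c_index = concordance_index_censored(y_test["event"], y_test["time"], risk_scores)[0]
```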
Follow the steps below to download the dataset.
Ensure that you have the AWS CLI installed on your system. You can find installation instructions for various platforms (Windows, macOS, Linux) in the link above.
Once AWS CLI is installed, use the following command to sync the dataset from the S3 bucket:
aws s3 sync --no-sign-request s3://chimera-challenge/v2/task3 .
This command will download the dataset to the current directory. Make sure you are in the desired directory where you want the data to be stored.
Note: The --no-sign-request flag ensures you can access the dataset without AWS credentials.
To set up the environment, make sure you have the following dependencies:
- Python: Version 3.10
- CUDA: Version 12.6 (ensure that your GPU supports this version)
Once the prerequisites are met, follow these steps to set up your environment:
- Create a virtual environment:
python -m venv .venv
- Activate the virtual environment:
  - On macOS/Linux:
source .venv/bin/activate
  - On Windows:
.venv\Scripts\activate
- Install the required dependencies:
pip install -r requirements.txt
To visualize the dataset, you can use the provided Jupyter notebook. Open a terminal or command prompt, and run the following command to start the Jupyter notebook:
jupyter notebook notebooks/data_visualization.ipynb
Survival analysis optimization report, using 5-fold cross-validation and Optuna-based hyperparameter tuning (a sketch of the tuning loop follows the recommendations below).
Overall results:
(1) Mean C-index: 0.8211 ± 0.0344
(2) Best C-index: 0.8668
(3) Improvement over baseline: +0.1123
(4) Optimization trials per model: 100
C-index by seed:
(1) Seed 42: 0.8441
(2) Seed 121: 0.8243
(3) Seed 144: 0.7661
(4) Seed 245: 0.8668
(5) Seed 1212: 0.8044
Key findings:
(1) Best performing models: Hyperparameter optimization improved the mean C-index by +0.1123 over the baseline.
(2) Ensemble benefits: The optimized weighted ensemble showed consistent improvements over the individual models.
(3) Parameter insights: Systematic hyperparameter tuning revealed the optimal configuration for each model.
Recommendations:
(1) Use the optimized hyperparameters for production models.
(2) Consider the ensemble approach for the best performance.
(3) Monitor model stability across different seeds.
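For reference, here is a hedged sketch of what an Optuna tuning loop with 5-fold cross-validation and a C-index objective could look like for one of the models (a Random Survival Forest); the search space, fold setup, and survival-label field names are assumptions, not the exact configuration used.

```python
# Sketch: Optuna tuning of a Random Survival Forest, scored by 5-fold CV C-index.
# Search space and fold setup are illustrative only.
import numpy as np
import optuna
from sklearn.model_selection import KFold
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_censored


def objective(trial, X, y, seed=42):
    # y is a sksurv structured array; field names ("event", "time") are assumed here.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 15),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 3, 20),
    }
    scores = []
    for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=seed).split(X):
        model = RandomSurvivalForest(random_state=seed, **params)
        model.fit(X[train_idx], y[train_idx])
        risk = model.predict(X[val_idx])
        scores.append(concordance_index_censored(
            y[val_idx]["event"], y[val_idx]["time"], risk)[0])
    return float(np.mean(scores))


# Usage (100 trials per model, as reported above):
# study = optuna.create_study(direction="maximize")
# study.optimize(lambda t: objective(t, X, y), n_trials=100)
```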
We were unable to submit our model for the competition due to an error in our Docker implementation. However, we will evaluate the model on the hidden test set once it becomes publicly available.
Developers: Madhav Arora, Sumit Kumar, Dhairya Gupta