madhavarora03/chimera-challenge-submission

CHIMERA Challenge Submission

Team TBG Lab
Task 3

Our Approach:

  1. Load each whole-slide image (WSI) and divide the useful regions, as given by the mask, into patches of 560x560 pixels.
  2. Extract patch features and coordinates using Virchow2, a foundation model for histopathology images.
  3. Group the patches with agglomerative clustering and represent each cluster as a single node whose feature vector is the mean of all patch-level features in that cluster.
  4. Build a graph for each patient's WSI, adding an unweighted edge between two nodes when the distance between their central coordinates is <= 3000 pixels.
  5. Extract graph embeddings using our custom GCN.
  6. Select the top 2000 genes based on the BRS deviation between BRS 1/2 and 3.
  7. Concatenate the GCN embeddings, the RNA-seq data, and the one-hot encoded data to form the final embedding.
  8. Finally, train an attention MLP, a Random Survival Forest, CoxPH, a Survival SVM, and Survival Gradient Boosted Trees, and take a weighted ensemble of their risk scores.
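
Steps 3 and 4 can be sketched as follows. This is a minimal illustration and not the repository's implementation. It assumes cluster labels have already been assigned (for example by an agglomerative clustering routine), and it assumes that a node's "central coordinates" are the centroid of its member patch coordinates. The function name `build_cluster_graph` and the toy inputs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def build_cluster_graph(features, coords, labels, max_dist=3000.0):
    """Collapse patches into one node per cluster and connect nearby nodes.

    features: list of per-patch feature vectors (lists of floats)
    coords:   list of per-patch (x, y) pixel coordinates
    labels:   list of per-patch cluster ids (assumed precomputed)
    """
    # Group patch indices by cluster id.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    node_features, node_centers = [], []
    for label in sorted(groups):
        members = groups[label]
        n = len(members)
        dim = len(features[members[0]])
        # Node feature = mean of the member patch features.
        node_features.append(
            [sum(features[i][d] for i in members) / n for d in range(dim)]
        )
        # Node center = centroid of member coordinates (an assumption).
        node_centers.append(
            tuple(sum(coords[i][d] for i in members) / n for d in range(2))
        )

    # Unweighted, undirected edge when the centers are within max_dist pixels.
    edges = [
        (a, b)
        for a, b in combinations(range(len(node_centers)), 2)
        if dist(node_centers[a], node_centers[b]) <= max_dist
    ]
    return node_features, node_centers, edges


# Toy example: five patches in three clusters.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [5.0, 5.0]]
coords = [(0, 0), (560, 0), (2800, 0), (3360, 0), (10000, 0)]
labels = [0, 0, 1, 1, 2]

node_features, node_centers, edges = build_cluster_graph(features, coords, labels)
print(edges)
```

In the toy example only the first two clusters are linked, because their centers are 2800 pixels apart while the third cluster is much farther away.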

Data Download

Follow the steps below to download the dataset.

Step 1: Install AWS CLI

Ensure that the AWS CLI is installed on your system. Installation instructions for Windows, macOS, and Linux are available in the official AWS CLI documentation.

Step 2: Download Dataset

Once AWS CLI is installed, use the following command to sync the dataset from the S3 bucket:

aws s3 sync --no-sign-request s3://chimera-challenge/v2/task3 .

This command will download the dataset to the current directory. Make sure you are in the desired directory where you want the data to be stored.

Note: The --no-sign-request flag ensures you can access the dataset without AWS credentials.

Download dependencies

To set up the environment, make sure you have the following dependencies:

  • Python: Version 3.10
  • CUDA: Version 12.6 (ensure that your GPU supports this version)

Once the prerequisites are met, follow these steps to set up your environment:

  1. Create a virtual environment:

    python -m venv .venv
  2. Activate the virtual environment:

    • On macOS/Linux:
      source .venv/bin/activate
    • On Windows:
      .venv\Scripts\activate
  3. Install the required dependencies:

    pip install -r requirements.txt

Visualize the Dataset

To visualize the dataset, you can use the provided Jupyter notebook. Open a terminal or command prompt, and run the following command to start the Jupyter notebook:

jupyter notebook notebooks/data_visualization.ipynb

Results

Survival analysis optimization report using 5-fold cross-validation and Optuna-based hyperparameter tuning

Summary

(1) Mean C-index: 0.8211 ± 0.0344

(2) Best C-index: 0.8668

(3) Improvement over baseline: +0.1123

(4) Optimization trials per model: 100
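
For readers unfamiliar with the metric, the C-index measures how often the model assigns a higher risk to the patient who experiences the event earlier. The report does not state which variant was used, so the sketch below assumes Harrell's C-index and ignores tied survival times for simplicity. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (ties in time are skipped).

    times:  observed time for each patient
    events: 1 if the event was observed, 0 if censored
    risks:  predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risks)
print(round(c_index, 4))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.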

Individual Seed Results

(1) Seed 42: 0.8441

(2) Seed 121: 0.8243

(3) Seed 144: 0.7661

(4) Seed 245: 0.8668

(5) Seed 1212: 0.8044
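
The summary statistics can be reproduced from the per-seed values above. The short check below uses the population standard deviation, which is the convention that matches the reported ± 0.0344; the variable names are illustrative only.

```python
from statistics import fmean, pstdev

# Per-seed C-index values copied from the list above.
seed_scores = {42: 0.8441, 121: 0.8243, 144: 0.7661, 245: 0.8668, 1212: 0.8044}

values = list(seed_scores.values())
mean_c = fmean(values)      # arithmetic mean over the five seeds
std_c = pstdev(values)      # population standard deviation
best_c = max(values)        # best single-seed result

print(f"Mean C-index: {mean_c:.4f} ± {std_c:.4f}")
print(f"Best C-index: {best_c:.4f}")
```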

Key Findings

(1) Best Performing Models: Hyperparameter optimization improved the C-index by 0.1123 over the baseline.

(2) Ensemble Benefits: Optimized ensembles showed consistent improvements.

(3) Parameter Insights: Systematic hyperparameter tuning revealed optimal configurations.
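
As an illustration of how a weighted ensemble of risk scores might be formed, the sketch below rank-normalizes each model's scores to [0, 1] and then takes a weighted average. The rank normalization, the model names, and the weights are all assumptions made for this example; the source does not specify how scores are scaled or how weights are chosen.

```python
def rank_normalize(scores):
    """Map scores to evenly spaced ranks in [0, 1]; ties keep input order."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for rank, idx in enumerate(order):
        ranks[idx] = rank / (n - 1) if n > 1 else 0.0
    return ranks


def weighted_ensemble(model_scores, weights):
    """Weighted average of rank-normalized risk scores across models."""
    total = sum(weights[name] for name in model_scores)
    n = len(next(iter(model_scores.values())))
    combined = [0.0] * n
    for name, scores in model_scores.items():
        w = weights[name] / total  # normalize weights to sum to 1
        for i, r in enumerate(rank_normalize(scores)):
            combined[i] += w * r
    return combined


# Hypothetical risk scores from two models for three patients.
model_scores = {
    "coxph": [2.0, 0.5, 1.0],
    "rsf": [10.0, 30.0, 20.0],
}
weights = {"coxph": 1.0, "rsf": 3.0}  # hypothetical weights

ensemble_risk = weighted_ensemble(model_scores, weights)
print(ensemble_risk)
```

Rank normalization is used here because survival models output risk scores on different scales, and the C-index depends only on their ordering.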

Recommendations

(1) Use the optimized hyperparameters for production models.

(2) Consider the ensemble approach for the best performance.

(3) Monitor model stability across different seeds; the C-index ranged from 0.7661 to 0.8668 across the five seeds reported above.

NOTE:

We were unable to submit our model for the competition due to an error in our Docker implementation. However, we will evaluate the model on the hidden test set once it becomes publicly available.

Developers: Madhav Arora, Sumit Kumar, Dhairya Gupta
