dxsillydzeko/Multiple-Myeloma-Risk-Classifier-MMRC-

Multiple Myeloma Risk Classifier

License: MIT | Python 3.8+

This repository contains a Python script for classifying multiple myeloma patients into ultra-high-risk and low-risk categories based on RNA-seq gene expression data and survival information.

Data Requirements

  • Expression data: TSV or CSV file with genes (GENE_ID) as rows and sample IDs (e.g., MMRF_XXXX_1_BM) as columns. Values are TPM (transcripts per million).
  • Survival data: CSV file with the columns public_id (patient ID, e.g., MMRF_XXXX) and ttcos (survival time in months).
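
As a rough illustration of how the two inputs relate, the sketch below pairs expression sample IDs with survival records. It assumes the patient ID is the first two underscore-separated fields of the sample ID (so MMRF_XXXX_1_BM maps to MMRF_XXXX). That mapping, the helper names, and the toy values are assumptions made for this example and are not taken from mmrc.py.

```python
import csv
import io


def sample_to_public_id(sample_id: str) -> str:
    """Assumed mapping: keep the first two underscore-separated fields."""
    return "_".join(sample_id.split("_")[:2])


def read_survival(handle) -> dict:
    """Read public_id -> ttcos (months) from a CSV file handle."""
    reader = csv.DictReader(handle)
    return {row["public_id"]: float(row["ttcos"]) for row in reader}


def match_samples(sample_ids, survival) -> dict:
    """Pair each expression sample with its survival time; skip unmatched samples."""
    matched = {}
    for sample_id in sample_ids:
        public_id = sample_to_public_id(sample_id)
        if public_id in survival:
            matched[sample_id] = survival[public_id]
    return matched


# Toy data, invented for the example.
survival_csv = io.StringIO("public_id,ttcos\nMMRF_0001,12.5\nMMRF_0002,60.0\n")
survival = read_survival(survival_csv)
matched = match_samples(["MMRF_0001_1_BM", "MMRF_0003_1_BM"], survival)
print(matched)
```

Samples without a matching survival record are dropped here; whether the actual script drops or reports them is not stated in this README.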

Installation

pip install -r requirements.txt

Usage

python mmrc.py --expression_file path/to/exp.tpm.tsv --survival_file path/to/survival_months.csv --output_dir output/
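
For readers who want to see how a command line like this is typically wired up, here is a minimal argparse sketch that accepts the same three flags. It is not the actual parser in mmrc.py; treating all three flags as required, the help strings, and the build_parser name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags shown in the usage line."""
    parser = argparse.ArgumentParser(
        description="Multiple myeloma risk classifier (illustrative CLI sketch)"
    )
    parser.add_argument("--expression_file", required=True,
                        help="TPM expression matrix (TSV/CSV)")
    parser.add_argument("--survival_file", required=True,
                        help="CSV with public_id and ttcos columns")
    parser.add_argument("--output_dir", required=True,
                        help="Directory for intermediate CSV files and plots")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args([
    "--expression_file", "path/to/exp.tpm.tsv",
    "--survival_file", "path/to/survival_months.csv",
    "--output_dir", "output/",
])
print(args.expression_file, args.survival_file, args.output_dir)
```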

Output

  • Intermediate CSV files and plots in the specified output directory.
  • Console output with model performance.

Notes

  • This is a simplified classification approach. For real clinical use, account for censoring in the survival data and consult domain experts.
  • The list of immunoglobulin genes is fetched via the HGNC API, and those genes are removed from the expression data.
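
The removal step can be pictured as a simple filter over the expression matrix. The sketch below uses a small hard-coded set of immunoglobulin gene symbols in place of the HGNC lookup, and it assumes that GENE_ID values are gene symbols; the README does not say whether they are symbols or another identifier type. The remove_genes helper and the toy TPM values are hypothetical.

```python
def remove_genes(expression: dict, genes_to_remove: set) -> dict:
    """Return a copy of the expression table without the listed genes."""
    return {
        gene: values
        for gene, values in expression.items()
        if gene not in genes_to_remove
    }


# Stand-in for the list the script fetches from HGNC (illustrative subset only).
ig_genes = {"IGHG1", "IGKC", "IGLC2"}

# Toy expression table: gene -> TPM values per sample (invented numbers).
expression = {
    "IGKC": [15000.0, 22000.0],
    "TP53": [12.3, 8.7],
    "IGHG1": [9000.0, 400.0],
    "MYC": [45.1, 60.2],
}

filtered = remove_genes(expression, ig_genes)
print(sorted(filtered))
```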

License

MIT License — see LICENSE for details.

Contributing

Pull requests and suggestions are welcome! Please see CONTRIBUTING.md for guidelines.

About

I created this ML algorithm to classify the risk of multiple myeloma patients from differential gene expression (RNA-seq) datasets and the patients' clinical information. It is a binary classifier with two classes: low-risk and ultra-high-risk. Suggestions and improvements are very welcome, and if this is useful to you, very cool.
