This repository contains the code and data for the project titled 'A Machine Learning framework to uncover actionable genetic insights applied to triple-negative breast cancer'. The project explores mutational profiles in triple-negative breast cancer (TNBC) using statistical and machine learning methods. Its aim is to identify novel therapeutic targets for TNBC patients and to support further precision medicine investigations.
The repository consists of a series of Jupyter notebooks and the datasets they use, which are stored in the 'Data' folder. The initial clinical and mutational data were retrieved from cBioPortal. The 'Data' folder also includes the outputs of MutSig2CV and MutClustSW, generated by following each tool's documentation, and a mutation file with corrected protein positions, which is needed to prepare the input to MutClustSW.
The notebooks are organized sequentially, numbered from 1 to 11, guiding users through the analysis process step by step. Users can reproduce the analysis by starting with the initial datasets and executing the notebooks in order.
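For users who prefer to run the whole sequence non-interactively, the following is a minimal sketch, not part of the repository itself. It assumes that each notebook's filename begins with its number (for example `1_...ipynb`), that the notebooks sit in a single folder, and that `jupyter nbconvert` is installed; the function names `ordered_notebooks` and `run_all` are hypothetical. Sorting by the leading integer rather than by filename keeps notebook 10 after notebook 2.

```python
import re
import subprocess
from pathlib import Path


def ordered_notebooks(folder="."):
    """Return the .ipynb files in `folder`, sorted by their leading number.

    Assumption: filenames start with the notebook number (1 to 11).
    Notebooks without a leading number are skipped.
    """
    numbered = []
    for path in Path(folder).glob("*.ipynb"):
        match = re.match(r"(\d+)", path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]


def run_all(folder="."):
    """Execute each notebook in place, in numeric order.

    Stops at the first notebook that fails, because later notebooks
    are expected to depend on the outputs of earlier ones.
    """
    for notebook in ordered_notebooks(folder):
        print(f"Running {notebook.name}")
        subprocess.run(
            ["jupyter", "nbconvert", "--to", "notebook",
             "--execute", "--inplace", str(notebook)],
            check=True,
        )
```

Calling `run_all()` from the folder that holds the notebooks would execute them one after another. If the actual filenames follow a different pattern, the regular expression in `ordered_notebooks` would need to be adjusted.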