This repository contains a proof-of-concept metabolomics pipeline for the processing and structural assignment of LC-MS/MS data (data-dependent acquisition) using machine learning and GPU-accelerated methods:
A - Files for building a Docker containerised environment that is GPU-enabled (CUDA) and supports both Python and R. Key libraries include TensorFlow and PyTorch for machine-learning-based methods, and XCMS and pyOpenMS for metabolomics functions.
B - Jupyter notebooks covering the initial exploratory data analysis (EDA) of raw data files with pyOpenMS; the extraction of molecular features with XCMS, including deep-learning-based peak filtering with NeatMS; and GPU-accelerated spectral database searching of target spectra with SimMS.
The Jupyter notebooks illustrate a worked example and contain all the required code.
Please note that this work is a prototype and should be validated before use.
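
For readers unfamiliar with spectral database searching (the last step listed under B), the core operation is scoring a query MS/MS spectrum against each reference spectrum. The sketch below is a minimal, CPU-only illustration of one common score, cosine similarity with greedy peak matching. It is not the SimMS implementation or API, and it is not taken from the notebooks; the function name `greedy_cosine`, the `tolerance` parameter, and the example peaks are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def greedy_cosine(query: List[Peak], reference: List[Peak],
                  tolerance: float = 0.01) -> float:
    """Cosine similarity between two spectra with greedy one-to-one
    peak matching inside an absolute m/z tolerance (illustrative only)."""
    # Collect every query/reference peak pair whose m/z values agree
    # within the tolerance, together with its intensity product.
    candidates = []
    for i, (mz_q, int_q) in enumerate(query):
        for j, (mz_r, int_r) in enumerate(reference):
            if abs(mz_q - mz_r) <= tolerance:
                candidates.append((int_q * int_r, i, j))

    # Greedily accept the highest-scoring pairs, using each peak at most once.
    candidates.sort(reverse=True)
    used_q, used_r = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_q and j not in used_r:
            dot += product
            used_q.add(i)
            used_r.add(j)

    # Normalise by the intensity norms of both spectra.
    norm_q = math.sqrt(sum(intensity ** 2 for _, intensity in query))
    norm_r = math.sqrt(sum(intensity ** 2 for _, intensity in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


# Hypothetical example spectra (made-up peaks, not real data).
query_spectrum = [(100.0, 1.0), (150.0, 0.5)]
same_spectrum = [(100.005, 1.0), (150.0, 0.5)]
unrelated_spectrum = [(200.0, 1.0), (250.0, 0.5)]

score_same = greedy_cosine(query_spectrum, same_spectrum)
score_unrelated = greedy_cosine(query_spectrum, unrelated_spectrum)
```

A GPU-accelerated search applies the same kind of pairwise scoring to very large batches of query and reference spectra in parallel; the nested loops here are only intended to make the scoring logic explicit.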