Description
In the tutorial.ipynb workflow, the file data/test_files/ptm_file.csv is loaded. It contains a set of sites and the known PTMs associated with each site (e.g. p, ub, m, etc.).
There are also *_reg columns for some of these sites, but it's not explained what they mean, and I'm unsure to what extent these extra columns are used in the downstream analysis.
For example, in perform_enrichment_analysis_per_protein we supply a ptm_dict which, to my understanding, just tells the function which residues to use for generating the "random" background (i.e. S/T/Y residues that are not necessarily modified are compared against the known phosphorylation sites to test for a statistical difference in structural properties). Is p_reg also relevant for the enrichment analysis here? Are those the background residues?
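To make my understanding of the background concrete, here is a minimal toy sketch. It is my own illustration, not the library's implementation: `split_sites`, the example sequence, and the modified positions are all made up, and I am assuming the background is simply every S/T/Y residue that is not annotated as phosphorylated.

```python
# Toy illustration of my assumption only; NOT the library's actual code.
# `split_sites` is a hypothetical helper written for this question.
def split_sites(sequence, modified_positions, target_residues=("S", "T", "Y")):
    """Split candidate residues into modified sites and background sites.

    Positions are 1-based. A candidate is any residue whose amino acid
    is listed in `target_residues`.
    """
    candidates = [
        pos for pos, aa in enumerate(sequence, start=1) if aa in target_residues
    ]
    modified = [pos for pos in candidates if pos in modified_positions]
    background = [pos for pos in candidates if pos not in modified_positions]
    return modified, background


# Made-up sequence with S/T/Y at positions 2, 3, 5 and 7;
# positions 2 and 5 are treated as known phosphosites (p == 1).
modified, background = split_sites("MSTAYKS", {2, 5})
print("modified:", modified)
print("background:", background)
```

If this picture is right, I would expect the structural properties of the `modified` positions to be tested against those of the `background` positions, which leaves me unsure where the p_reg sites fit in.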
Thanks in advance!