Using Peak-Gene Assignments to Calculate Gene Score #576
A problem we've been running into with our data is that the default distance from the TSS used to calculate gene scores genome-wide is not reliable in cells that are not terminally differentiated. We've tried the various settings, and we think the best way around this would be to first assign peaks to genes (either with ArchR, or with ABCenhancergene or a similar program), then use these assignments/loops as a factor in predicting the gene score, instead of relying only on accessibility within a certain distance of the gene. This would take into account accessibility at the gene's TSS as well as known biological connections between regulatory elements and the TSS. The idea would be to call addGeneScoreMatrix with an option such as geneModel=usePeakToGene(project), or something along those lines. Would this be possible to add? We've tried patching it ourselves for our own use, but haven't managed to get it working so far.
Replies: 1 comment
Hi @emmawwinchester, the main utility of Gene Scores is to identify biological labels associated with clusters based on known marker genes. This method isn't perfect, but it does work surprisingly well for lots of marker genes. Regarding your question, this really isn't possible at the moment because of how the implementation is set up. The best thing I can imagine is splitting the peaks into groups (based on the linked gene assignment) and computing module scores (see #308). [Screenshot from that issue]
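
To make the peak-grouping idea concrete, here is a rough sketch in plain Python. It is not ArchR code and not ArchR's module-score implementation: it only sums the counts of the peaks linked to each gene and scales by each cell's total counts. The function name `linked_peak_scores`, the input shapes, and the `depth_scale` parameter are all hypothetical, chosen for illustration.

```python
from collections import defaultdict


def linked_peak_scores(peak_counts, peak_to_gene, depth_scale=10_000):
    """Per-gene, per-cell score from the peaks linked to each gene.

    peak_counts:  dict of peak id -> list of per-cell counts (hypothetical layout)
    peak_to_gene: iterable of (peak id, gene name) links, e.g. from a
                  peak-to-gene linkage or an enhancer-gene prediction tool
    depth_scale:  constant the depth-normalized sums are multiplied by
    """
    if not peak_counts:
        return {}
    n_cells = len(next(iter(peak_counts.values())))

    # Total counts per cell across all peaks, used for depth normalization.
    totals = [0.0] * n_cells
    for counts in peak_counts.values():
        for i, c in enumerate(counts):
            totals[i] += c

    # Group peaks by linked gene; a set drops duplicated links.
    groups = defaultdict(set)
    for peak, gene in peak_to_gene:
        if peak in peak_counts:  # ignore links to peaks we have no counts for
            groups[gene].add(peak)

    scores = {}
    for gene, peaks in groups.items():
        summed = [0.0] * n_cells
        for peak in peaks:
            for i, c in enumerate(peak_counts[peak]):
                summed[i] += c
        scores[gene] = [
            depth_scale * s / t if t > 0 else 0.0
            for s, t in zip(summed, totals)
        ]
    return scores


# Toy data: 4 peaks x 3 cells, with one duplicated link.
toy_counts = {
    "peak1": [2, 0, 1],
    "peak2": [1, 1, 0],
    "peak3": [0, 3, 1],
    "peak4": [1, 0, 2],
}
toy_links = [
    ("peak1", "GATA1"),
    ("peak2", "GATA1"),
    ("peak3", "SPI1"),
    ("peak1", "GATA1"),
]
toy_scores = linked_peak_scores(toy_counts, toy_links, depth_scale=4)
```

To have TSS accessibility contribute as well, as the original question suggests, the promoter peak of each gene could simply be added to the link list alongside its distal peaks.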