Huilin-Li/DeepTMHMM2TopologyViz

DeepTMHMM2TopologyViz

Visualize DeepTMHMM predictions as a protein topology diagram.

In my case, I wanted to visualize a large transmembrane protein (~2500 AA) so that I could compare and investigate its structure alongside others. However, Protter placed the N-term and C-term at the wrong positions when I submitted the sequence of my protein. Even after I customized the amino-acid positions using the output of DeepTMHMM, the positions of the N-term and C-term were still wrong. I also tried TMRPres2D, but it is much less flexible.

Therefore, I wanted a program that draws the topology of transmembrane proteins in code, with much more flexibility and room for customization.

Example

Example of Q92508 in my case


Usage

I didn't build a Python package for this program, because the current features are very basic. The core module is the deeptmhmm2topology folder, which contains only two important .py files (DeepTMHMM2Topology.py and tools.py). VizPlain.py is just an example that plots the generated circle centers of the topology; as its name suggests, the visualization is plain.

Therefore, feel free to copy the deeptmhmm2topology module folder and use it however you like. An example of usage is in run.py.

Given the TMRs.gff3 and topologies.3line files from DeepTMHMM as input, deeptmhmm2topology produces a dataframe with columns = ["position", "AA", "IMO", "start", "end", "length", "center"], for example:

position,AA,IMO,start,end,length,center
1,M,I,1,8,8,"(0.0, 0.0)"
2,E,I,1,8,8,"(0.2, -0.34641)"
3,P,I,1,8,8,"(0.6, -0.34641)"
4,H,I,1,8,8,"(0.8, 0.0)"
5,V,I,1,8,8,"(0.8, 0.4)"
6,L,I,1,8,8,"(0.8, 0.8)"
7,G,I,1,8,8,"(0.8, 1.2)"
8,A,I,1,8,8,"(0.8, 1.6)"
9,V,M,9,25,17,"(0.8, 2.0)"
10,L,M,9,25,17,"(1.158968, 2.176471)"
11,Y,M,9,25,17,"(0.8, 2.352941)"
12,W,M,9,25,17,"(1.158968, 2.529412)"
13,L,M,9,25,17,"(0.8, 2.705882)"
14,L,M,9,25,17,"(1.158968, 2.882353)"
... ...
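
If the dataframe is saved as CSV in the form shown above, the `center` column comes back as a quoted string rather than a pair of numbers. The following is a minimal sketch of reading it back with the standard library only. The `load_centers` helper and the `SAMPLE` text are hypothetical names used for illustration, not part of deeptmhmm2topology, and the sketch assumes that `M` in the `IMO` column marks membrane residues.

```python
import ast
import csv
import io

# A few rows copied from the example output above (not a complete protein).
SAMPLE = '''position,AA,IMO,start,end,length,center
1,M,I,1,8,8,"(0.0, 0.0)"
2,E,I,1,8,8,"(0.2, -0.34641)"
9,V,M,9,25,17,"(0.8, 2.0)"
10,L,M,9,25,17,"(1.158968, 2.176471)"
'''


def load_centers(handle):
    """Yield (position, residue, region, (x, y)) for each CSV row.

    Hypothetical helper: it assumes the 'center' column holds a string
    such as "(0.8, 2.0)", as in the example output.
    """
    for row in csv.DictReader(handle):
        # literal_eval safely turns the "(x, y)" string into a tuple.
        x, y = ast.literal_eval(row["center"])
        yield int(row["position"]), row["AA"], row["IMO"], (float(x), float(y))


rows = list(load_centers(io.StringIO(SAMPLE)))

# Keep only the residues labelled "M" (assumed to mean membrane).
membrane = [r for r in rows if r[2] == "M"]
for position, residue, region, center in membrane:
    print(position, residue, region, center)
```

To read a real file instead of the sample text, pass an open file handle to `load_centers` in place of the `io.StringIO` object.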

Math

  1. I want the intracellular and extracellular segments of the protein to fold into an S-like curve, so I formally state this as an integer linear optimization problem.

  • $N$ is the number of circles in the S-like curve.
  • $a_i$ is the number of circles in the $i$-th curve peak, and $a_{max}$ is the maximum number of circles allowed in a peak.
  • $b_j$ is the number of circles in the vertical direction for the $j$-th turn, and $b_{max}$ is the maximum number of circles allowed there.
  • $x$ is the number of peak curves.
  • $y$ is the number of turns.
$$\begin{align} \text{Minimize} \quad & y \\[4pt] \text{subject to} \quad & x=2y-1 \\[2pt] & \sum_{i=1}^{x} a_i \;+\; 2\sum_{j=1}^{y} b_j \;=\; N \\[2pt] & a_i \in \{1,a_{max}\}, i=1,2,3,...,x, \text{where } x\in \{1,3,5,…\} \\[2pt] & b_j \in \{1,2,3,...,b_{max}\}, j=1,2,3,...,y, \text{where } y\in \{1,2,3,…\} \\[2pt] \end{align}$$

Therefore, for a given $y$, and because $y \geq 1$ and $N \geq N_{min}$,

$$\begin{align} N_{min} & = x*1+2*(y*1)=2y-1+2*(y*1)=4y-1 \geq 3\\[2pt] N_{max} & = (2y-1)*a_{max}+2*(y*b_{max})=2(a_{max}+b_{max})y-a_{max} \\[2pt] y & \leq \frac{N+1}{4} \\[2pt] \end{align}$$
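
The two expressions for $N_{min}$ and $N_{max}$ bracket the number of turns: $N \geq 4y-1$ gives $y \leq (N+1)/4$, and $N \leq 2(a_{max}+b_{max})y-a_{max}$ gives $y \geq (N+a_{max})/(2(a_{max}+b_{max}))$. The sketch below computes that integer range. The function name `y_bounds` is hypothetical and is not taken from deeptmhmm2topology. The bounds are necessary conditions only, so a value inside the range still has to be checked against the full constraint set.

```python
def y_bounds(n, a_max, b_max):
    """Return (y_low, y_high) implied by N_min(y) <= n <= N_max(y).

    Hypothetical helper illustrating the bounds; it does not solve the
    full optimization problem.
    """
    if n < 3:
        raise ValueError("N must be at least 3 (N_min for y = 1)")
    if a_max < 1 or b_max < 1:
        raise ValueError("a_max and b_max must be at least 1")

    # N <= N_max = 2*(a_max + b_max)*y - a_max
    #   ->  y >= (N + a_max) / (2*(a_max + b_max)), rounded up.
    denom = 2 * (a_max + b_max)
    y_low = -(-(n + a_max) // denom)  # integer ceiling division

    # N >= N_min = 4*y - 1  ->  y <= (N + 1) / 4, rounded down.
    y_high = (n + 1) // 4

    return y_low, y_high


print(y_bounds(50, 5, 6))  # (3, 12)
```

Since the objective is to minimize $y$, `y_low` is the first candidate to try. If `y_low > y_high`, no number of turns satisfies both bounds for that $N$.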

Features

alt text
