Commit b1064c2

Added workflow
1 parent af08f8f commit b1064c2

1 file changed, +5 -1 lines changed
README.md

Lines changed: 5 additions & 1 deletion
@@ -1,4 +1,8 @@
 # Comparative analysis of protein function text-based embeddings and its potential for prediction tasks
 
 
-This thesis explores how information for protein functions can be exploited through embeddings so that the produced information can be used to improve protein function annotations. The underlying hypothesis here is that any pair of proteins with high sequence similarity will also share a similar biological function which would be reflected by the corresponding protein embeddings. The comparison and evaluation of this would be done using two text-driven embedding approaches: word2vec and doc2vec.
+This thesis explores how information for protein functions can be exploited through embeddings so that the produced information can be used to improve protein function annotations. The underlying hypothesis here is that any pair of proteins with high sequence similarity will also share a similar biological function which would be reflected by the corresponding protein embeddings. The comparison and evaluation of this is done using two text-driven embedding approaches: Word2doc2Vec and Hybrid-Word2doc2Vec.
+
+The overall workflow of the thesis is illustrated in the image below:
+
+![](docs/thesis_workflow.png)