SPACE: STRING proteins as complementary embeddings

Introduction

This is the official repository for the Bioinformatics paper "SPACE: STRING proteins as complementary embeddings", for which we precomputed:

cross-species network embeddings
ProtT5 sequence embeddings

for all eukaryotic proteins in STRING v12.0.

You can download all the embeddings from the STRING website:

protein.network.embeddings.v12.0.h5
protein.sequence.embeddings.v12.0.h5

Reproduce the results in the paper

Please follow the instructions in reproduce.md.

How to Cite

If you use this work in your research, please cite the SPACE paper:

Hu, Dewei, et al. "SPACE: STRING proteins as complementary embeddings." Bioinformatics (2025): btaf496. https://doi.org/10.1101/2024.11.25.625140

and the STRING database:

Szklarczyk, D., Nastou, K., Koutrouli, M., Kirsch, R., Mehryary, F., Hachilif, R., ... & von Mering, C. (2025). The STRING database in 2025: protein networks with directionality of regulation. Nucleic Acids Research, 53(D1), D730-D737. https://doi.org/10.1093/nar/gkae1113

How to load the embeddings

The following code reads the cross-species network embedding file for human (NCBI taxonomy ID 9606), 9606.protein.network.embeddings.v12.0.h5.

Python example

pip install h5py

import h5py

filename = '9606.protein.network.embeddings.v12.0.h5'

with h5py.File(filename, 'r') as f:
    meta_keys = f['metadata'].attrs.keys()
    for key in meta_keys:
        print(key, f['metadata'].attrs[key])

    # read the full datasets into memory
    embeddings = f['embeddings'][:]
    proteins = f['proteins'][:]

    # protein names are stored as bytes, convert them to strings
    proteins = [p.decode('utf-8') for p in proteins]
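
Once the arrays are loaded, a common next step is to look up the vector for one specific protein. The following is a minimal sketch, not part of the SPACE code base: `build_embedding_lookup` is a hypothetical helper, and it assumes that row i of the embeddings array belongs to the i-th entry of the protein list, which the loading example suggests but does not state explicitly.

```python
def build_embedding_lookup(proteins, embeddings):
    """Map each protein ID (str) to its embedding row.

    `proteins` may contain bytes (as read from the HDF5 file) or str.
    Assumption: embeddings[i] is the vector for proteins[i].
    """
    if len(proteins) != len(embeddings):
        raise ValueError(
            f"{len(proteins)} protein IDs but {len(embeddings)} embedding rows"
        )

    lookup = {}
    for i, protein in enumerate(proteins):
        # decode bytes to str so that keys are ordinary Python strings
        name = protein.decode('utf-8') if isinstance(protein, bytes) else str(protein)
        lookup[name] = embeddings[i]
    return lookup


# Usage, with `proteins` and `embeddings` loaded as in the example above:
#   lookup = build_embedding_lookup(proteins, embeddings)
#   first_id = next(iter(lookup))
#   print(first_id, lookup[first_id])
```

A plain dictionary keeps the example dependency-free. For repeated lookups across many species, storing only the row index per ID would avoid holding a second reference to every row.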

R example

Install the rhdf5 package from Bioconductor to read the embedding files. The following code reads the same per-species file, 9606.protein.network.embeddings.v12.0.h5.

# Install rhdf5 from Bioconductor if it is not already installed
# install.packages("BiocManager")
# BiocManager::install("rhdf5")

# Load the library
library(rhdf5)

filename <- '9606.protein.network.embeddings.v12.0.h5'

# Read the attributes attached to the "metadata" object and print them
meta_keys <- h5readAttributes(filename, "metadata")
for (key in names(meta_keys)) {
    print(paste(key, meta_keys[[key]]))
}

embeddings <- h5read(filename, "embeddings")
proteins <- h5read(filename, "proteins")

Read combined files

Read the combined network embedding file of all eukaryotes with Python

import h5py

filename = 'protein.network.embeddings.v12.0.h5'

with h5py.File(filename, 'r') as f:
    meta_keys = f['metadata'].attrs.keys()
    for key in meta_keys:
        print(key, f['metadata'].attrs[key])
  
    # each species is stored in its own group under 'species'
    species = '4932'  # brewer's yeast (Saccharomyces cerevisiae)
    embeddings = f['species'][species]['embeddings'][:]
    proteins = f['species'][species]['proteins'][:]

    # protein names are stored as bytes, convert them to strings
    proteins = [p.decode('utf-8') for p in proteins]
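
Because the combined file holds one group per species under `species`, it can be useful to check which species are present before loading any embeddings. The sketch below is an illustration under that layout assumption. `count_proteins_per_species` is a hypothetical helper, not a function from this repository, and it reads only dataset lengths, so no embedding matrix is loaded into memory.

```python
def count_proteins_per_species(h5):
    """Return {species_id: number_of_proteins} for an opened combined file.

    `h5` is an open h5py.File (or any mapping with the same layout).
    Assumption: each species group contains a 'proteins' dataset, as in
    the loading example above.
    """
    species_group = h5['species']
    counts = {}
    for species_id in species_group:
        # len() on an h5py dataset returns the size of its first axis
        # without reading the data itself
        counts[species_id] = len(species_group[species_id]['proteins'])
    return counts


# Usage (requires h5py and the downloaded combined file):
#   import h5py
#   with h5py.File('protein.network.embeddings.v12.0.h5', 'r') as f:
#       counts = count_proteins_per_species(f)
#       print(len(counts), 'species;', counts.get('4932'), 'yeast proteins')
```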

Read the combined file with R

library(rhdf5)

filename <- 'protein.network.embeddings.v12.0.h5'

meta_keys <- h5readAttributes(filename, "metadata")
for (key in names(meta_keys)) {
    print(paste(key, meta_keys[[key]]))
}

species <- '4932'  # for brewer's yeast
embeddings <- h5read(filename, paste0('species/', species, '/embeddings'))
proteins <- h5read(filename, paste0('species/', species, '/proteins'))

Contact

dewei.hu@sund.ku.dk.

License

MIT.
