latest EC product from bioregistry has http nodes and over 200k TrEMBL proteins #444

@realmarcin

We are using the EC product from Bioregistry, and it looks like the latest release uses full HTTP(S) URLs as node ids:
id category name provided_by synonym deprecated iri same_as
https://www.ebi.ac.uk/intenz/query?cmd=SearchEC&ec=1 biolink:OntologyClass Oxidoreductases ec.json https://www.ebi.ac.uk/intenz/query?cmd=SearchEC&ec=1
It also adds TrEMBL annotations … This is a problem because extra data coming in through the KGX ontology transform is harder to control and can even be non-trivial to notice.
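
One way to make this kind of extra data easier to notice is to tally node ids by namespace after the transform. A minimal sketch, assuming the KGX output is a tab-separated nodes file with an `id` column as in the header above. The helper names and the `EXAMPLE:0001` row are hypothetical and are not part of KGX or Bioregistry:

```python
import csv
import io
import re
from collections import Counter


def id_namespace(node_id: str) -> str:
    """Return everything up to and including the last separator in an id."""
    # Separators cover URL-style ids (/, #, =) and CURIE-style ids (:).
    match = re.match(r"^(.*[/#=:])[^/#=:]*$", node_id)
    return match.group(1) if match else node_id


def count_namespaces(tsv_handle) -> Counter:
    """Count node ids per namespace in a KGX-style nodes TSV."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    return Counter(id_namespace(row["id"]) for row in reader)


# Tiny in-memory stand-in for a nodes file; the second row is made up.
sample = (
    "id\tcategory\tname\n"
    "https://www.ebi.ac.uk/intenz/query?cmd=SearchEC&ec=1"
    "\tbiolink:OntologyClass\tOxidoreductases\n"
    "EXAMPLE:0001\tbiolink:OntologyClass\thypothetical row\n"
)
counts = count_namespaces(io.StringIO(sample))
for namespace, n in counts.most_common():
    print(n, namespace)
```

Running something like this on the real nodes file after each ingest would surface an unexpected namespace with a large count, without needing to know in advance what the extra nodes look like.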

236220 TrEMBL proteins are coming in with the Bioregistry EC product. I can open an issue there, but first wanted to check whether this is somehow expected.

(Ideally we can avoid post-processing the KGX transforms, though. That actually ends up being a two-step process, I think, which may be a new precedent?)
For context, this is the download:
url: https://w3id.org/biopragmatics/resources/eccode/eccode.json
local_name: ec.json
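
If the extra nodes turn out to be expected upstream, one possible alternative to post-processing the KGX output would be to trim the downloaded file before the transform. A rough sketch, assuming `ec.json` follows the OBO Graphs JSON layout (`graphs`, each with `nodes` and `edges` that use `sub`, `pred` and `obj`) and that every EC node shares the prefix of the node id shown earlier in this issue. Neither assumption has been checked against the actual file, and `keep_prefix` and the example graph are hypothetical:

```python
import json

# Prefix taken from the node id shown in this issue; assumed to cover all EC nodes.
EC_PREFIX = "https://www.ebi.ac.uk/intenz/query?cmd=SearchEC&ec="


def keep_prefix(obograph: dict, prefix: str) -> dict:
    """Return a copy of an OBO Graphs document restricted to one id prefix."""
    filtered_graphs = []
    for graph in obograph.get("graphs", []):
        nodes = [
            n for n in graph.get("nodes", [])
            if n.get("id", "").startswith(prefix)
        ]
        kept_ids = {n["id"] for n in nodes}
        # Keep an edge only when both of its endpoints survived the filter.
        edges = [
            e for e in graph.get("edges", [])
            if e.get("sub") in kept_ids and e.get("obj") in kept_ids
        ]
        filtered_graphs.append({**graph, "nodes": nodes, "edges": edges})
    return {**obograph, "graphs": filtered_graphs}


# Made-up miniature graph: two EC-style nodes and one unrelated node.
example = {
    "graphs": [
        {
            "id": "hypothetical-graph",
            "nodes": [
                {"id": EC_PREFIX + "1", "lbl": "Oxidoreductases"},
                {"id": EC_PREFIX + "1.1", "lbl": "hypothetical child class"},
                {"id": "EXAMPLE:protein", "lbl": "hypothetical protein"},
            ],
            "edges": [
                {"sub": EC_PREFIX + "1.1", "pred": "is_a", "obj": EC_PREFIX + "1"},
                {"sub": "EXAMPLE:protein", "pred": "is_a", "obj": EC_PREFIX + "1.1"},
            ],
        }
    ]
}
filtered = keep_prefix(example, EC_PREFIX)
print(json.dumps(filtered, indent=2))
```

A filter like this drops everything outside the prefix, including any property definitions stored as nodes, so the allowed prefixes would need to be checked against the real file before relying on it.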
