[Bug] KEGG database is missing reactions in certain modules #207

@jolespin

First off, huge thanks for not only creating this extremely useful package but also compiling all the backend databases into a structured format. I cited your paper in my VEBA 2.0 manuscript, which is in the final stages of peer review. Woltka is very complementary to the methods I've been developing, and I would be interested in making the outputs of VEBA interoperable with the inputs of Woltka.

I've been diving into the KEGG backend files and noticed that some of the reactions are missing. For instance, M00001 includes reaction [RN:R13199], catalyzed by glucose-6-phosphate isomerase [EC:5.3.1.9], which is the 2nd reaction in the module (screenshot below).

Here are my Woltka KEGG database files:

$ md5sum *.txt *.map
d3f65b659bb4854211dad994f58eeb01  compound_name.txt
bcf10734d502d5f6ffbf6a5057f6c9d4  disease_name.txt
a24e59b753ccbc86457d42c240493be5  ko_name.txt
547e766b68dc0d1f65a29942f67937d9  module_name.txt
7bf2998df823875e19789ae15dd37641  pathway_name.txt
6cb322ff27bb3d8d5a7697669352b4f2  rclass_name.txt
0fa3c575ffb0b54b259ac7f2c9ed0e5b  reaction_enzyme.txt
d3557f87e134098d4b3e33d59ebaa6a0  reaction_equation.txt
b4b9ada1d052f06f3923013d9b57e091  reaction_name.txt
0bd6cd56b7877871f767c776b5fb6cbf  ko-to-cog.map
e17d1520cebbfc7d58746bfaa8c57b76  ko-to-disease.map
c3eb10443d0c1b8b6c467331eace47a7  ko-to-ec.map
dfb93ab50a8bd5f844d5b46076039b9b  ko-to-go.map
5b527423df4263cd5afd72c9b779b5ef  ko-to-module.map
fac26369ee0adcc8d143ce3196ea2105  ko-to-pathway.map
7e594ba405a49bcf2a2b420205108b65  ko-to-reaction.map
144a9091168b3e4f0779951abe58bc65  module-to-class.map
51af0c14283b7d511bf67f3bbc52a601  module-to-compound.map
6fdd695aae2170e474f869036272a728  module-to-ko.map
67afb8bc7bc722709a10b6fe8eb245c6  module-to-pathway.map
f1941d180ecdad984fc0a6b508726add  module-to-reaction.map
ff012d04387d92540dd024daa9c133fc  pathway-to-class.map
0e9d4f09bef1c579e5dd19e48fa0f695  pathway-to-compound.map
f92476be31b01721165bcdc0ef776161  pathway-to-disease.map
41c1c17f8dd02ca7cef8de11bafcb21b  pathway-to-ko.map
64e7462dc7f55dd43b4f939a053d4372  pathway-to-module.map
81db79a82fd44d46b2feefbd8ea7459c  reaction-to-ko.map
d348291b388df2dfaadcc885bd5bc6bb  reaction-to-left_compound.map
84d42244057e0788afae1023953c276e  reaction-to-module.map
cf6946a262117edbbab8a5401ae3a7bd  reaction-to-pathway.map
186a6f7c2a661029d59e012e2a681244  reaction-to-rclass.map
cd072d7435189da0d05f74cce32c992c  reaction-to-right_compound.map

I couldn't find this reaction in any of the Woltka KEGG database files:

$ grep -r "R13199" .

Would it be possible to update these backend files for proper pathway calculations?

Also, do you know of any KEGG module definition parsers that can build an edge list representing a module in terms of its KO combinations? I want to try reconstructing some of the modules using NetworkX.
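To illustrate the kind of edge list I have in mind, here is a minimal sketch. It assumes the usual KEGG module definition conventions (spaces separate sequential steps, commas separate alternative KOs, parentheses group sub-expressions). The function names and the toy definition string are hypothetical and are not part of Woltka or KEGG. It only flattens each top-level step into its KOs, so the AND/OR logic inside a step and the `+`/`-` complex notation are not preserved.

```python
import re


def split_top_level(definition, sep=" "):
    """Split a definition on `sep`, but only outside of parentheses."""
    parts, depth, current = [], 0, []
    for ch in definition.strip():
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            # End of a top-level step; skip empty pieces from repeated spaces.
            if current:
                parts.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def module_edges(module_id, definition):
    """Return (module, step) and (step, KO) edges for one module.

    Assumption: each top-level, space-separated block is one step,
    and every KO mentioned inside it is attached to that step.
    """
    edges = []
    for i, step in enumerate(split_top_level(definition), start=1):
        step_node = f"{module_id}_step{i}"
        edges.append((module_id, step_node))
        for ko in re.findall(r"K\d{5}", step):
            edges.append((step_node, ko))
    return edges


# Toy definition (made up for illustration, not a real KEGG module).
toy_definition = "(K00001,K00002) K00003 ((K00004,K00005) K00006,K00007)"
edges = module_edges("M_toy", toy_definition)
for source, target in edges:
    print(source, target, sep="\t")
```

The resulting list of tuples could then be loaded with `networkx.from_edgelist(edges)`. A full parser would need to keep the nested AND/OR structure rather than flattening it, which is the part I'm hoping already exists somewhere.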

[image: screenshot of KEGG module M00001]
