Add new textToKnowledgeGraph reader source #1487

Merged
bgyori merged 25 commits into gyorilab:master from prasham21:add-new-source on Dec 19, 2025

Conversation

prasham21 commented Dec 16, 2025

(rewritten by @bgyori) This PR adds a new source module wrapping the textToKnowledgeGraph (tkg) reader, which uses LLMs to produce BEL statements from text. Key to the integration is that the existing indra.sources.bel module can be leveraged to process the BEL statement strings produced by tkg.

The new API supports both invoking tkg as a package to process PubMed Central articles and processing existing tkg output files offline.

Note that tkg often produces BEL strings that are not valid. These are effectively skipped because indra.sources.bel cannot process them, and overall we still get many valid statements. Additional rule-based string manipulation might turn some of them into valid BEL, but fixing them upstream may be more effective.
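To illustrate the skipping behavior described above, here is a minimal sketch, not the code in this PR. The names `collect_valid_statements` and `toy_parse_bel` are hypothetical. The parser is passed in as a callable because the exact indra.sources.bel entry point is not shown in this description; the toy parser only stands in for it and does not implement real BEL parsing.

```python
from typing import Callable, Iterable, List, Tuple


def collect_valid_statements(
    bel_strings: Iterable[str],
    parse_bel: Callable[[str], list],
) -> Tuple[list, List[str]]:
    """Parse each BEL string, keeping the results and recording failures.

    parse_bel is an assumed callable that returns a list of statements
    and raises an exception (or returns an empty list) on invalid input.
    """
    statements = []
    skipped = []
    for bel in bel_strings:
        try:
            parsed = parse_bel(bel)
        except Exception:
            # Invalid BEL: record it and move on instead of failing the run.
            skipped.append(bel)
            continue
        if not parsed:
            skipped.append(bel)
            continue
        statements.extend(parsed)
    return statements, skipped


def toy_parse_bel(bel: str) -> list:
    """Hypothetical stand-in parser: accepts only 'subject relation object'."""
    parts = bel.split()
    if len(parts) != 3 or parts[1] not in {"increases", "decreases"}:
        raise ValueError(f"invalid BEL string: {bel!r}")
    return [tuple(parts)]
```

Returning the skipped strings alongside the statements makes it easy to count how much output is lost, and gives a concrete list to inspect when deciding whether a fix belongs upstream in tkg.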

bgyori changed the title from "Add new LLM-BEL reader source with full INDRA integration" to "Add new textToKnowledgeGraph reader source" on Dec 17, 2025
bgyori merged commit bd24641 into gyorilab:master on Dec 19, 2025
2 checks passed

2 participants