QA of GO-CAM ingest #339
Open
Description
QA Resources:
- KGX Summary report: https://docs.google.com/spreadsheets/d/1f6i3ASkjdrT6uEOkis82vaJofqJdrbLyzWiyhA43a-k/edit?gid=943440447#gid=943440447
- RIG (original): https://github.com/NCATSTranslator/translator-ingests/blob/main/src/translator_ingest/ingests/go_cam/go_cam_rig.yaml
- RIG (PR with QA changes)
Issues to consider / address for next release:
Generally, the initial release of this ingest was pretty fast and loose, and there are plans to come back and improve things. There are lots of notes to this effect in the RIG's 'future considerations'. Some additional observations:
- uses predicates that violate Translator conventions (e.g. include direction of effect)
- very few edges actually created
- the edges that are created lack detail (e.g. qualifiers and edge properties that could be captured)
- contains relatively large numbers of nonsensical edges (e.g. gene causes gene (878), gene has input gene (1400))
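Counts like the ones above can be reproduced with a small tally of (subject category, predicate, object category) patterns. The following is a minimal sketch, assuming KGX-style node records with `id` and `category` fields and edge records with `subject`, `predicate`, and `object` fields. The function name `count_edge_patterns` and the toy records are hypothetical and are not taken from the ingest code or its output.

```python
from collections import Counter


def count_edge_patterns(nodes, edges):
    """Count (subject category, predicate, object category) patterns."""
    # Map each node id to its first listed category.
    # Assumption: `category` is a list or a single string, as in KGX exports.
    category_of = {}
    for node in nodes:
        cats = node.get("category") or ["unknown"]
        if isinstance(cats, str):
            cats = [cats]
        category_of[node["id"]] = cats[0]

    # Nodes missing from the node list are reported as "unknown".
    return Counter(
        (
            category_of.get(edge["subject"], "unknown"),
            edge["predicate"],
            category_of.get(edge["object"], "unknown"),
        )
        for edge in edges
    )


# Toy records for illustration only; not real GO-CAM data.
nodes = [
    {"id": "GENE:1", "category": ["biolink:Gene"]},
    {"id": "GENE:2", "category": ["biolink:Gene"]},
    {"id": "GO:1", "category": ["biolink:BiologicalProcess"]},
]
edges = [
    {"subject": "GENE:1", "predicate": "biolink:causes", "object": "GENE:2"},
    {"subject": "GENE:2", "predicate": "biolink:has_input", "object": "GENE:1"},
    {"subject": "GENE:1", "predicate": "biolink:participates_in", "object": "GO:1"},
]

patterns = count_edge_patterns(nodes, edges)
for pattern, n in patterns.most_common():
    print(pattern, n)
```

Sorting the result by count makes suspicious patterns such as gene-to-gene `causes` edges easy to spot during QA.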
Also, the RIG is not in sync with what the code is actually doing to produce data. It seems to reference predicates that are not in Biolink (e.g. positively_regulates) and edge properties that are not in Biolink (apparently as placeholders for qualifying info that will be captured using proper SPOQ patterns in future iterations of the ingest).
Issues to consider longer term:
- Some Biolink predicates include directionality (e.g. acts upstream of negative effect), which we capture in qualifiers for Translator use cases. We should not use these predicates in our ingest, and should instead define qualifier-based patterns to represent direction of effect.
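
One possible shape for such a qualifier-based pattern is sketched below. This is only an illustration. The mapping table, the choice of `biolink:acts_upstream_of` as the base predicate, the use of `object_direction_qualifier`, and the `increased` / `decreased` values are all assumptions that would need to be checked against Biolink and agreed on before use. The function name `split_direction` is hypothetical.

```python
# Hypothetical mapping from a directional predicate to
# (direction-free predicate, direction value). Not an agreed Translator pattern.
DIRECTIONAL_PREDICATES = {
    "biolink:acts_upstream_of_positive_effect": ("biolink:acts_upstream_of", "increased"),
    "biolink:acts_upstream_of_negative_effect": ("biolink:acts_upstream_of", "decreased"),
}


def split_direction(edge):
    """Return a copy of `edge` with direction moved from the predicate to a qualifier."""
    out = dict(edge)  # never mutate the caller's edge
    mapped = DIRECTIONAL_PREDICATES.get(edge["predicate"])
    if mapped is None:
        # Predicate carries no direction in this table; leave the edge unchanged.
        return out
    base_predicate, direction = mapped
    out["predicate"] = base_predicate
    # Assumption: the direction belongs in `object_direction_qualifier`.
    out["object_direction_qualifier"] = direction
    return out


# Toy edge for illustration only; identifiers are made up.
example = {
    "subject": "GENE:1",
    "predicate": "biolink:acts_upstream_of_negative_effect",
    "object": "GO:1",
}
converted = split_direction(example)
print(converted)
```

Keeping the mapping in a single table means the convention can be reviewed and changed in one place once the actual pattern is decided.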
Questions: