Consider changing the new schema (incremental_load) #69

@mmaiers-nmdp

In the new schema (image) there are HAS_IPD_ALLELE edges both from GFE to IPD_Allele and from IPD_Accession to IPD_Allele.

The first issue is that both edges should be vectors (currently the one from IPD_Accession to IPD_Allele is a scalar).

But there is a more fundamental issue: this model will not capture situations where the sequence changes but the IPD_Accession and the IPD_Allele stay the same. These are exactly the kind of inconsistencies that this graph database is well suited to discover and catalog. With that in mind, I propose that we update the schema so that IPD_Accession is connected directly to the GFE: a "HAS_IPD_ACCESSION" edge from the GFE node to the IPD_Accession node, symmetrical to the "HAS_IPD_ALLELE" edge from the GFE node to the IPD_Allele node.

This new HAS_IPD_ACCESSION edge should have an array of versions as an attribute.
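To make the proposal concrete, here is a minimal in-memory sketch in Python (not the loader's actual code) of what a GFE to IPD_Accession edge with a versions array would let us find. The record layout, the GFE labels, and the release numbers are made-up placeholders.

```python
from collections import defaultdict

# Hypothetical load records: (gfe, ipd_accession, release_version).
# The GFE labels and release numbers are illustrative placeholders only.
records = [
    ("GFE-A", "HLA00012.1", "3.40.0"),
    ("GFE-A", "HLA00012.1", "3.41.0"),
    ("GFE-B", "HLA00012.1", "3.42.0"),  # sequence changed, accession did not
    ("GFE-C", "HLA00013.1", "3.42.0"),
]


def build_accession_edges(records):
    """Model each HAS_IPD_ACCESSION edge as (gfe, accession) -> list of versions."""
    edges = defaultdict(list)
    for gfe, accession, version in records:
        versions = edges[(gfe, accession)]
        if version not in versions:  # keep the array free of duplicates
            versions.append(version)
    return dict(edges)


def accessions_with_multiple_gfes(edges):
    """Return accessions attached to more than one GFE (possible sequence change)."""
    gfes_by_accession = defaultdict(set)
    for gfe, accession in edges:
        gfes_by_accession[accession].add(gfe)
    return {
        accession: sorted(gfes)
        for accession, gfes in gfes_by_accession.items()
        if len(gfes) > 1
    }


edges = build_accession_edges(records)
conflicts = accessions_with_multiple_gfes(edges)
```

With the direct edge in place, an accession that ends up linked to two different GFEs is visible immediately, and the versions array on each edge shows in which releases each pairing held.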

Also, the minor version should be included in the full accession number (e.g. HLA00012.1, not HLA00012), with the major portion ("HLA00012") stored as a separate attribute, so that queries that join on it do not have to parse the name in Cypher to get only the part left of the ".".
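As a rough illustration, the split could be done once at load time so that the node carries both values. The property names `name`, `major`, and `minor` below are assumptions for this sketch, not names from the existing schema.

```python
def split_accession(full_accession):
    """Split a full accession such as 'HLA00012.1' into its major and minor parts.

    The returned keys (name, major, minor) are hypothetical property names.
    """
    major, sep, minor = full_accession.partition(".")
    return {
        "name": full_accession,           # full accession, minor version included
        "major": major,                   # part left of the ".", used for joins
        "minor": minor if sep else None,  # None when no minor version is present
    }


props = split_accession("HLA00012.1")
```

Queries can then match on the `major` property directly instead of splitting the name string at query time.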

image
