
Adding new association schema to croissant #4278

@DSuveges


Follow-up changes in the association datasets

Datasets whose column and dataset descriptions need updating:

association_by_datasource_direct
association_by_datasource_indirect
association_by_datatype_direct
association_by_datatype_indirect
association_overall_direct
association_overall_indirect

These datasets now capture a temporal dimension: how the association scores developed over time, and whether a given disease/target association can be considered "novel", meaning that the knowledge is based on recent developments in the field.
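To make the temporal dimension concrete, here is a minimal sketch of reading the yearly score trajectory out of one row. It assumes a row has been loaded as a plain Python dict with the column names from the schemas listed in this issue. The function name `score_trajectory`, the identifiers and all numbers in `example_row` are hypothetical, for illustration only.

```python
def score_trajectory(row: dict) -> list[tuple[int, float]]:
    """Return (year, associationScore) pairs from `timeseries`, sorted by year."""
    # Both the array and its elements are nullable in the schema, so guard for None.
    series = row.get("timeseries") or []
    points = [
        (point["year"], point["associationScore"])
        for point in series
        if point is not None
        and point.get("year") is not None
        and point.get("associationScore") is not None
    ]
    return sorted(points)


# Hypothetical row: identifiers and values are made up for illustration.
example_row = {
    "diseaseId": "DISEASE_X",
    "targetId": "TARGET_Y",
    "associationScore": 0.5,
    "timeseries": [
        {"year": 2021, "associationScore": 0.4, "novelty": 0.1, "yearlyEvidenceCount": 3},
        {"year": 2019, "associationScore": 0.2, "novelty": 0.6, "yearlyEvidenceCount": 1},
        None,
    ],
    "currentNovelty": 0.1,
}

trajectory = score_trajectory(example_row)
```

How `currentNovelty` relates to the per-year `novelty` values is not stated in this issue, so the sketch does not try to derive one from the other.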

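The schema is the same for all datasets, except that timeseries.yearlyEvidenceCount is an integer in the two by_datasource datasets and a long in the other four: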

Dataset: association_by_datasource_direct:

root
 |-- diseaseId: string (nullable = true)
 |-- targetId: string (nullable = true)
 |-- aggregationType: string (nullable = true)
 |-- aggregationValue: string (nullable = true)
 |-- associationScore: double (nullable = true)
 |-- evidenceCount: long (nullable = true)
 |-- timeseries: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- year: integer (nullable = true)
 |    |    |-- associationScore: double (nullable = true)
 |    |    |-- novelty: double (nullable = true)
 |    |    |-- yearlyEvidenceCount: integer (nullable = true)
 |-- currentNovelty: double (nullable = true)

Dataset: association_by_datasource_indirect:

root
 |-- diseaseId: string (nullable = true)
 |-- targetId: string (nullable = true)
 |-- aggregationType: string (nullable = true)
 |-- aggregationValue: string (nullable = true)
 |-- associationScore: double (nullable = true)
 |-- evidenceCount: long (nullable = true)
 |-- timeseries: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- year: integer (nullable = true)
 |    |    |-- associationScore: double (nullable = true)
 |    |    |-- novelty: double (nullable = true)
 |    |    |-- yearlyEvidenceCount: integer (nullable = true)
 |-- currentNovelty: double (nullable = true)

Dataset: association_by_datatype_direct:

root
 |-- diseaseId: string (nullable = true)
 |-- targetId: string (nullable = true)
 |-- aggregationType: string (nullable = true)
 |-- aggregationValue: string (nullable = true)
 |-- associationScore: double (nullable = true)
 |-- evidenceCount: long (nullable = true)
 |-- timeseries: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- year: integer (nullable = true)
 |    |    |-- associationScore: double (nullable = true)
 |    |    |-- novelty: double (nullable = true)
 |    |    |-- yearlyEvidenceCount: long (nullable = true)
 |-- currentNovelty: double (nullable = true)

Dataset: association_by_datatype_indirect:

root
 |-- diseaseId: string (nullable = true)
 |-- targetId: string (nullable = true)
 |-- aggregationType: string (nullable = true)
 |-- aggregationValue: string (nullable = true)
 |-- associationScore: double (nullable = true)
 |-- evidenceCount: long (nullable = true)
 |-- timeseries: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- year: integer (nullable = true)
 |    |    |-- associationScore: double (nullable = true)
 |    |    |-- novelty: double (nullable = true)
 |    |    |-- yearlyEvidenceCount: long (nullable = true)
 |-- currentNovelty: double (nullable = true)

Dataset: association_overall_direct:

root
 |-- diseaseId: string (nullable = true)
 |-- targetId: string (nullable = true)
 |-- aggregationType: string (nullable = true)
 |-- aggregationValue: string (nullable = true)
 |-- associationScore: double (nullable = true)
 |-- evidenceCount: long (nullable = true)
 |-- timeseries: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- year: integer (nullable = true)
 |    |    |-- associationScore: double (nullable = true)
 |    |    |-- novelty: double (nullable = true)
 |    |    |-- yearlyEvidenceCount: long (nullable = true)
 |-- currentNovelty: double (nullable = true)

Dataset: association_overall_indirect:

root
 |-- diseaseId: string (nullable = true)
 |-- targetId: string (nullable = true)
 |-- aggregationType: string (nullable = true)
 |-- aggregationValue: string (nullable = true)
 |-- associationScore: double (nullable = true)
 |-- evidenceCount: long (nullable = true)
 |-- timeseries: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- year: integer (nullable = true)
 |    |    |-- associationScore: double (nullable = true)
 |    |    |-- novelty: double (nullable = true)
 |    |    |-- yearlyEvidenceCount: long (nullable = true)
 |-- currentNovelty: double (nullable = true)
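The six printed schemas can be compared programmatically before the column descriptions are written once and reused. The sketch below transcribes the types from the schemas in this issue and reports any column whose type is not identical across datasets. The dotted names such as `timeseries.year` and the helper names are conventions of this sketch, not of the pipeline.

```python
# Leaf columns with identical types in all six schemas printed in this issue.
COMMON_COLUMNS = {
    "diseaseId": "string",
    "targetId": "string",
    "aggregationType": "string",
    "aggregationValue": "string",
    "associationScore": "double",
    "evidenceCount": "long",
    "timeseries.year": "integer",
    "timeseries.associationScore": "double",
    "timeseries.novelty": "double",
    "currentNovelty": "double",
}

# Type of timeseries.yearlyEvidenceCount per dataset, as printed in this issue.
YEARLY_EVIDENCE_COUNT_TYPE = {
    "association_by_datasource_direct": "integer",
    "association_by_datasource_indirect": "integer",
    "association_by_datatype_direct": "long",
    "association_by_datatype_indirect": "long",
    "association_overall_direct": "long",
    "association_overall_indirect": "long",
}


def schema_for(dataset: str) -> dict[str, str]:
    """Flattened column -> type mapping for one dataset."""
    schema = dict(COMMON_COLUMNS)
    schema["timeseries.yearlyEvidenceCount"] = YEARLY_EVIDENCE_COUNT_TYPE[dataset]
    return schema


def differing_columns() -> dict[str, set[str]]:
    """Columns whose type is not the same in every dataset."""
    seen: dict[str, set[str]] = {}
    for dataset in YEARLY_EVIDENCE_COUNT_TYPE:
        for column, dtype in schema_for(dataset).items():
            seen.setdefault(column, set()).add(dtype)
    return {column: types for column, types in seen.items() if len(types) > 1}


differences = differing_columns()
```

With the types as transcribed, the only column reported is `timeseries.yearlyEvidenceCount`, so a single set of column descriptions should cover all six datasets as long as the data type is set per dataset.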

Tasks

  • Update dataset description to reflect the temporal axis.
  • Remove old column descriptions
  • Add new column descriptions
  • Update croissant dataset
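As a starting point for the description tasks above, the following sketch builds placeholder field entries for the new schema. It is not the project's actual Croissant generator, which is not shown in this issue. The keys (`@type`, `name`, `description`, `dataType`, `subField`) follow a reading of the Croissant format and should be checked against the specification, the Spark-to-Croissant type mapping is an assumption, and every `TODO` description is a placeholder to be filled in by the data team.

```python
import json

# Assumed mapping from the Spark types in this issue to data types commonly
# used in Croissant files; the project's real mapping may differ.
SPARK_TO_CROISSANT = {
    "string": "sc:Text",
    "double": "sc:Float",
    "integer": "sc:Integer",
    "long": "sc:Integer",
}


def leaf_field(name: str, spark_type: str, description: str = "TODO") -> dict:
    """Build one field entry with a placeholder description."""
    return {
        "@type": "cr:Field",
        "name": name,
        "description": description,
        "dataType": SPARK_TO_CROISSANT[spark_type],
    }


def association_fields(yearly_count_type: str = "long") -> list[dict]:
    """Field entries for one association dataset.

    `yearly_count_type` is "integer" for the by_datasource datasets and
    "long" for the others, following the schemas printed in this issue.
    """
    # Nested struct members are expressed as sub-fields; how the array
    # nature of `timeseries` should be marked is left open in this sketch.
    timeseries = {
        "@type": "cr:Field",
        "name": "timeseries",
        "description": "TODO",
        "subField": [
            leaf_field("year", "integer"),
            leaf_field("associationScore", "double"),
            leaf_field("novelty", "double"),
            leaf_field("yearlyEvidenceCount", yearly_count_type),
        ],
    }
    return [
        leaf_field("diseaseId", "string"),
        leaf_field("targetId", "string"),
        leaf_field("aggregationType", "string"),
        leaf_field("aggregationValue", "string"),
        leaf_field("associationScore", "double"),
        leaf_field("evidenceCount", "long"),
        timeseries,
        leaf_field("currentNovelty", "double"),
    ]


fields = association_fields()
print(json.dumps(fields[0], indent=2))
```

Because all six datasets share the same columns, the same list could be generated once per dataset, with only `yearly_count_type` and the dataset-level description changing.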


Labels: Data (relates to Open Targets data team), croissant
