Commit e95779d

bherr2 and johardi authored
v0.10.0 (#46)
* Proposed tweaks to speed up ds-graph performance (#36)
* Proposed tweaks to speed up ds-graph performance
* Stop using owl.template in donor.yaml
* Remove "links back to" relationship
* Remove "partially overlaps" relationship
* Simplify the "sex_id" and "race_id" annotations
* Remove the redundant extraction_site relationship
  This relationship is already handled by the rui_location (or formally it is defined as "has_registration_location").
* Add metadata module to generate the ontology metadata

---------

Co-authored-by: Josef Hardi <[email protected]>

* Improve error handling to help on debugging
* Fix when a sample block has no registration location
  For example, samples from CellxGene repository
* Commit the sssom schema
* Convert the attributes to first class slots
* Removing 'equals_string' expression due to known LinkML bug
  Refer to linkml/linkml#1855 ticket to learn more about the bug.
* Improve execution logging to debug better
* Remove unnecessary prefixes
* Explicitly declare slot_uri and class_uri to avoid mix-ups during OWL generation.
* Fix info logging should use the name variable
* Exclude Eukaryota organism hierarchy
* Remove unused variables
* Fix typo
  Thanks to Andi who pointed this out.
* Commit schema for ctann DO
* Initial commit for ctann data normalization
* Initial commit for CTAnn data enrichment.
* Update LinkML versions
* Replace with full IRI
* Added references as part of dataset metadata
* Write a module to extract DOI and BioRxiv references from text
* Implement reference extraction from description text in metadata normalization
* Include references mentioned explicitly in the metadata.yaml file
* Set www. as optional in the regex
* Add references field to README.md and index.html files
* Align donor schema with raw data schema.
* Rename external_link to link per raw schema
* Explicitly mention slot_uri
* Remove collision summaries and corridors as they belong to spatial entities
* Remove unused class and slot
* Add collision summaries, corridor, and cell summaries to spatial entity
* Rename external_link to link per raw data schema
* Rename cell_summaries to summaries per raw data schema
* Add cell_count, gene_count and organ_id to dataset schema
* Remove unused argument
* Explicitly mention slot_uri on each slot
* Rename file_url to file per raw data schema
* Set corridor's parent to spatialEntity, not tissue block
* Rename collision_items to collisions per raw data schema
* Refactor code
* Rename slots to align with the raw data schema
* Rename slots to align with raw data schema
* Explicitly mention slot_uri on each slot
* Ensure a consistent definition of percentage
* Fix missing aggregated_summary_count and aggregated_summaries
* Fix argument name
* Updated top-level attribute names for uniqueness and to prevent collisions with existing attributes
* Fix missing spatialEntity reference
* Make collisions as optional
* Expand any compact ASCTB-TEMP terms to its full IRI
* Remove ASCTB-TEMP prefix to prevent automatic IRI compaction by CLI tool
  The ASCTB-TEMP prefix was causing the expanded IRI to be automatically compacted by the command line tool, resulting in invalid compacted names (e.g., ASCTB-TEMP:-cells-ins-). This could lead to parsing errors when the tool encounters these improperly formatted names.
* Explicitly append CCF prefix to entity type names to ensure correct IRI generation
* Add testing framework for evaluating enriched graph data
* Rename test to validate
* Normalize the aggregated cell summaries
* More on renaming slots to align with the raw data
* Use EntityID to simplify the slot definition
* Fix description
* Fix the schema to support sample comes_from donor relationship
* Updating test case queries
* Add Sample class for backward compatibility
* Add a new sample_type slot
* Rename slot
* Map description slot to rdfs:comment
* Swap values between rdfs:label and skos:prefLabel
* Update the query to find tissue blocks and tissue sections
* Add references in the dataset object
* Map to its original slot name
* Update the test query to if references exists
* Fix the missing pref_label property
* Return error log excluding the long message field
* Add anatomical structure collision object to schema
* Update cell summary data to use blank nodes
* Update collision data to use blank nodes
* Update collision data to use blank nodes
* Add creator first and last name properties
* Fix date off by one day
* Suppress warning messages
* Updating the number types
* Suppress log messages from mmdc command
* Remove unused slots
* Move spatial placement into its own record in the model
* Adjust enrichment logic to reflect recent model changes
* Patch ds-graph context to include @reverse property support
* Create utility script for converting YAML into JSON-LD format
* Update SPARQL test cases to ensure accurate query validation
* Disambiguate dataset entities: use Dataset to refer to experimental dataset and RawDataset for source input for creating a DO graph
* Skip date processing when format is already YYYY-MM-DD
* Update test cases to handle regular ds-graph structure (non hra-pop)
* Add usage comments
* Set range as reference ID to actual entities
* Get rid of settings
* Improve code to normalize various date formats to YYYY-MM-DD format
* Enable normalization and enrichment of ds-graph data
* Update requirements-freeze.txt
* Disable error report parsing, as it fails with large reports
* Remove label definition because it's already specified in entity-base schema
* Remove ASCTB-TEMP prefix to enforce full IRI format for entity ID
* Set default label
* Use Date object to parse date string
* Fix file naming case collision in documentation
* Regenerating ER diagrams
* Add patch to define the ASCTB-TEMP prefix
* Update URIs for gene_expr and mean_gene_expr_value slots
* Enforce full URI usage for ASCTB-TEMP terms to prevent prefix-related issues
* Add support for DOI URLs containing dx.doi.org
* Set author ORCID field as optional in the schema
* Remove unused variable
* Add new slots
* Include the range for the create_date slot
* Set normalizeDate as a global function
* Patch the slot_id to use dcterms
* Fix passing parameter
* Return null when the RUI location is empty
* Collect the references from the array only
* Display the references in the README file, if available
* Change float range to decimal
* Remove references
* Update ER diagrams
* Breakdown long function
* Update OMAP schema to support reverse data reconstruction
* Bump version

---------

Co-authored-by: Josef Hardi <[email protected]>
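
Several items in the commit message concern extracting DOI and bioRxiv references from description text, with `www.` optional and `dx.doi.org` accepted. The module itself is not shown on this page, so the following is only a minimal Python sketch of that idea. The pattern names, the function name `extract_references`, and the exact regular expressions are assumptions made for illustration, not the repository's code.

```python
import re

# Hypothetical patterns: the commit's real regexes are not visible here.
# "www." and "dx." are both optional, mirroring the commit message items
# "Set www. as optional in the regex" and "Add support for DOI URLs
# containing dx.doi.org".
DOI_URL = re.compile(
    r"https?://(?:www\.)?(?:dx\.)?doi\.org/10\.\d{4,9}/[^\s\"<>]+",
    re.IGNORECASE,
)
BIORXIV_URL = re.compile(
    r"https?://(?:www\.)?biorxiv\.org/content/[^\s\"<>]+",
    re.IGNORECASE,
)


def extract_references(text: str) -> list:
    """Return DOI and bioRxiv URLs found in text, without duplicates.

    DOI URLs are listed first, then bioRxiv URLs, each group in order
    of appearance.
    """
    found = []
    for pattern in (DOI_URL, BIORXIV_URL):
        for match in pattern.finditer(text):
            # Drop sentence punctuation that the greedy match swallowed.
            # This is a simplification: a DOI that legitimately ends in
            # one of these characters would be truncated.
            url = match.group(0).rstrip(".,;:)]")
            if url not in found:
                found.append(url)
    return found


if __name__ == "__main__":
    sample = (
        "See https://doi.org/10.1038/s41586-021-03500-8. "
        "Also http://dx.doi.org/10.1000/xyz123, and "
        "https://www.biorxiv.org/content/10.1101/2020.01.01.123456v1 (preprint)."
    )
    for ref in extract_references(sample):
        print(ref)
```

Keeping the two patterns separate makes it easy to add another source later without touching the DOI expression.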
1 parent 7db4da5 commit e95779d
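
The commit message also lists date-handling changes: normalizing various formats to YYYY-MM-DD, skipping values already in that format, and fixing an off-by-one-day error. The cause of that error is not stated, and the repository's `normalizeDate` function is not shown here. The Python sketch below only illustrates the general approach; the function name `normalize_date` and the list of accepted input formats are assumptions.

```python
import re
from datetime import datetime
from typing import Optional

ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Assumed input formats, chosen for illustration only.
KNOWN_FORMATS = ("%m/%d/%Y", "%Y/%m/%d", "%B %d, %Y", "%d %B %Y")


def normalize_date(value: str) -> Optional[str]:
    """Return value as YYYY-MM-DD, or None if no known format matches."""
    value = value.strip()
    # Skip processing when the value is already in the target format.
    if ISO_DATE.match(value):
        return value
    for fmt in KNOWN_FORMATS:
        try:
            # Parsing to a naive calendar date avoids any time zone
            # conversion, one common source of off-by-one-day results.
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    for raw in ("2021-03-05", "03/05/2021", "March 5, 2021", "not a date"):
        print(raw, "->", normalize_date(raw))
```

Returning `None` for unrecognized input, rather than raising, lets a caller decide whether to log the value or leave the field empty.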

83 files changed, with 4150 additions and 1945 deletions.

docs/ds-graph/Creator.md

Lines changed: 0 additions & 40 deletions
This file was deleted.

docs/ds-graph/Dataset.md

Lines changed: 0 additions & 59 deletions
This file was deleted.

docs/ds-graph/Donor.md

Lines changed: 0 additions & 71 deletions
This file was deleted.

docs/er-diagrams/index-1.svg

Lines changed: 1 addition & 1 deletion

docs/er-diagrams/index-10.svg

Lines changed: 1 addition & 1 deletion

docs/er-diagrams/index-11.svg

Lines changed: 1 addition & 1 deletion

docs/er-diagrams/index-12.svg

Lines changed: 1 addition & 1 deletion

docs/er-diagrams/index-13.svg

Lines changed: 1 addition & 1 deletion

docs/er-diagrams/index-14.svg

Lines changed: 1 addition & 1 deletion

docs/er-diagrams/index-15.svg

Lines changed: 1 addition & 1 deletion
