Should the metric definition/ID evolve from common to specific based on data type (short/long) and assembly (GRCh37/38)? #108
Replies: 3 comments
-
Guideline for Metric Definition/ID Granularity
1. Base Principle
2. When to Keep a Metric Common
Keep a single ID when:
Example:
3. When to Make a Metric Specific
Create separate IDs or definitions when:
Example: Mapping quality (
4. Metadata Requirements
Every metric entry must include at least:
5. Edge Case Handling
6. Recommended Sources for Definitions
-
WG consensus: Extend the metric definition template with attributes:
Schema Flexibility:
Common ID + metadata: when the metric meaning is stable
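To make the "common ID + metadata" option concrete, here is a minimal Python sketch of one metric ID reported in two contexts. The field names (`metric_id`, `data_type`, `assembly`, `regions`), the ID string, and the values are all hypothetical, chosen for illustration; they are not the attributes agreed by the WG or taken from the actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MetricEntry:
    """One reported metric: a common ID plus the context it was computed in."""
    metric_id: str   # hypothetical common ID, shared across data types
    value: float     # made-up value, for illustration only
    data_type: str   # hypothetical attribute: "short_paired_end" or "long"
    assembly: str    # hypothetical attribute: "GRCh37" or "GRCh38"
    regions: str     # hypothetical attribute: where the metric was computed


def to_json(entry: MetricEntry) -> str:
    """Serialize an entry to JSON with a stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)


# The ID stays the same; only the metadata distinguishes the two contexts.
short_read = MetricEntry("mean_coverage", 30.0, "short_paired_end",
                         "GRCh38", "autosomes_non_gap")
long_read = MetricEntry("mean_coverage", 28.5, "long",
                        "GRCh38", "autosomes_non_gap")

print(to_json(short_read))
print(to_json(long_read))
```

With this shape, consumers compare values only when the metadata matches, so the ID does not need to encode data type or assembly.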
-
Metadata Requirements
-
Do we need to refine the metric definition/ID from a common form to more specific ones, depending on data type (short paired-end / long read), genome assembly (GRCh37 / GRCh38), and the JSON schema?
The current version specifically supports short paired-end data, the GRCh38 assembly, and only the autosomal non-gap regions of GRCh38.
Metrics common to both short- and long-read sequence data are yield BP, percent reads mapped, mean coverage, cross contamination, count SNV, and ti/tv rate.
eg.

The GRCh38 assembly refers to the 1000genome-dragen-3.7.6 reference.
Autosomes non gap regions
Autosomes non gap regions refers to chromosomes 1 to 22 only; chromosomes X, Y, and M and alternative contigs are excluded. Gap regions as defined by UCSC are also excluded. See the NPM-sample-qc documentation.
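The region selection described above can be sketched in Python as follows. This is an illustration, not the NPM-sample-qc implementation: the `chr`-prefixed contig names, the half-open `(start, end)` interval convention, and the toy sizes and gaps are all assumptions.

```python
# Assumed naming: "chr1" .. "chr22". X, Y, M and alternative contigs are
# excluded simply by not being in this list.
AUTOSOMES = [f"chr{i}" for i in range(1, 23)]


def non_gap_regions(chrom_sizes, gaps):
    """Return {chrom: [(start, end), ...]} for autosomes with gaps removed.

    chrom_sizes: {chrom: length}
    gaps: {chrom: [(start, end), ...]}, half-open, possibly overlapping
    """
    result = {}
    for chrom in AUTOSOMES:
        if chrom not in chrom_sizes:
            continue
        pos = 0    # start of the next candidate non-gap stretch
        kept = []
        for start, end in sorted(gaps.get(chrom, [])):
            if start > pos:
                kept.append((pos, start))
            pos = max(pos, end)  # also handles overlapping gaps
        if pos < chrom_sizes[chrom]:
            kept.append((pos, chrom_sizes[chrom]))
        result[chrom] = kept
    return result


# Toy input (made-up sizes and gaps, not real GRCh38 coordinates).
toy_sizes = {"chr1": 100, "chrX": 50, "chrM": 10, "chr1_alt_example": 20}
toy_gaps = {"chr1": [(0, 10), (40, 50)], "chrX": [(0, 5)]}
regions = non_gap_regions(toy_sizes, toy_gaps)
print(regions)
```

In practice the gap intervals would come from the UCSC gap track for the chosen assembly, which is one reason the assembly needs to be recorded alongside the metric.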