Fix MHC typing vs HLA typing naming inconsistency #801
|
@jonasscheid Can you also review https://github.com/bigbio/proteomics-sample-metadata/tree/dev/sdrf-proteomics/templates/immunopeptidomics for consistency, along with the given template: https://github.com/bigbio/sdrf-templates/blob/main/immunopeptidomics/1.0.0-dev/immunopeptidomics.yaml |
|
@jonasscheid Should we also clearly explain how to write the non-cleavage enzymes: https://www.ebi.ac.uk/ols4/ontologies/ms/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMS_1001956?lang=en |
|
Theoretically there is no need to have it, since all immunopeptidomics experiments use unspecific cleavage. Or how do you normally handle such "default" metadata? |
|
I think it is not needed. However, we still have one issue that I haven't quite solved. When we combine two templates, say DDA + immunopeptidomics, we will have conflicting rules: in DDA we require the enzyme, and in immunopeptidomics we don't. @noatgnu, I have to make sure we have a mechanism to represent the case where a property is required in one template and optional/not required in the other: what to do and, more importantly, how to detect this and make it clear for the users. @jpfeuffer @timosachsenberg Can you give us an opinion about this as well? |
|
Currently we do have a mechanism that was written for merging templates, but we need more test cases for it. You can see it in the sdrf-pipelines here You will find a utility for composing a template from different schemas into a new schema with all the rules. For this purpose, the merge strategy would be using In this We might need some updates to have it write out the correct original schema references that the composite schema was created from, including their names and versions. |
|
@noatgnu The problem with this is: a) how is highest level defined |
|
But to be honest, immunopeptidomics could simply require an enzyme column and, even more strictly, allow only one very specific enzyme, called "No enzyme", which should be part of the ontology.
@noatgnu I would expect mandatory > optional > custom for merging templates, right? I agree with @jpfeuffer: it will most likely be combined with the proteomics-ms template, and therefore the enzyme specification will be required. I would still make it optional in the immunopeptidomics template, since it's not a concrete batch effect of the data compared to MHC type or enrichment method. But then we have it documented if other templates require the enzyme. The same should then be done for other templates, e.g. metaproteomics. |
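A minimal sketch of the "mandatory > optional > custom" precedence, assuming each template can be reduced to a mapping from column name to requirement level. The `Requirement` enum, the `merge_requirements` helper, and the MHC column name are hypothetical illustrations, not the sdrf-pipelines API:

```python
from enum import IntEnum


class Requirement(IntEnum):
    # Hypothetical levels; a higher value means a stricter requirement.
    CUSTOM = 0
    OPTIONAL = 1
    REQUIRED = 2


def merge_requirements(templates):
    """Merge column requirements, keeping the strictest level per column."""
    merged = {}
    for template in templates:
        for column, level in template.items():
            # max() makes the result independent of template order.
            merged[column] = max(merged.get(column, level), level)
    return merged


# Illustrative templates (column names are assumptions for this sketch).
ms_template = {"comment[cleavage agent details]": Requirement.REQUIRED}
immuno_template = {
    "comment[cleavage agent details]": Requirement.OPTIONAL,
    "characteristics[MHC typing]": Requirement.REQUIRED,
}

merged = merge_requirements([ms_template, immuno_template])
```

Under this rule the enzyme column stays required after merging, whichever template is listed first, so marking it optional in the immunopeptidomics template alone cannot relax it.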
|
I would suggest the name MHC class, as the example in the template for it is "HLA-A*02:01, HLA-B*07:02, HLA-C*07:02, H-2Kb, H-2Db", which is the type of MHC molecule presenting peptides. MHC typing, by contrast, would be the laboratory method used to determine which HLA alleles a donor has. |
|
Yes, but even with optional, this would not be enough to overwrite the requiredness of this column that comes from the MS template (with the reasonable order that you mentioned). Therefore you either have to define overwriting strictly by combination order (i.e. MS + immuno is different from immuno + MS), or, as I said, make immuno even stricter about enzymes and make it a required AND term-restricted column. |
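As a rough illustration of the "required AND term-restricted" option, the sketch below checks a column against a fixed set of allowed values. The `validate_column` helper and the exact cell format are assumptions made for this example; only the accession MS:1001956 (unspecific cleavage) comes from this thread:

```python
def validate_column(values, required, allowed_terms=None):
    """Return a list of error messages for one column (hypothetical helper)."""
    errors = []
    if required and not values:
        errors.append("column is required but has no values")
    if allowed_terms is not None:
        for value in values:
            if value not in allowed_terms:
                errors.append(f"value not allowed: {value!r}")
    return errors


# Assumed cell format for this sketch: name and accession as key=value pairs.
ALLOWED_ENZYMES = {"NT=unspecific cleavage;AC=MS:1001956"}

ok = validate_column(
    ["NT=unspecific cleavage;AC=MS:1001956"], required=True,
    allowed_terms=ALLOWED_ENZYMES,
)
missing = validate_column([], required=True, allowed_terms=ALLOWED_ENZYMES)
wrong = validate_column(
    ["NT=Trypsin;AC=MS:1001251"], required=True,
    allowed_terms=ALLOWED_ENZYMES,
)
```

With such a rule the immunopeptidomics template would agree with the MS template on requiredness and only narrow the accepted terms, so the merge order would no longer matter for this column.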
|
What about going for a default enzyme value in the YAML for immunopeptidomics, so that if this field is required by another template, it will just be populated? I guess this behaviour could be relevant for other templates as well. |
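The default-value idea could look roughly like the sketch below, assuming a template may declare per-column defaults that are applied only when another template requires the column. `apply_defaults` and the column and value strings are hypothetical, not an existing sdrf-pipelines feature:

```python
def apply_defaults(row, required_columns, defaults):
    """Fill required columns that are empty or missing from template defaults."""
    filled = dict(row)  # copy so the caller's row is left untouched
    for column in required_columns:
        if not filled.get(column) and column in defaults:
            filled[column] = defaults[column]
    return filled


# Assumed default declared by the immunopeptidomics template.
immuno_defaults = {
    "comment[cleavage agent details]": "NT=unspecific cleavage;AC=MS:1001956",
}

row = {"source name": "sample 1"}
filled = apply_defaults(
    row,
    required_columns=["comment[cleavage agent details]"],
    defaults=immuno_defaults,
)
```

Values the user supplies explicitly are left as they are; only empty or absent required cells receive the default.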
|
A couple of points: Immunopeptidomics can be combined with both DDA and DIA, not only DDA. I agree with @jpfeuffer that we can reference the enzyme using MS:1001956 (unspecific cleavage). This discussion highlights a more general issue around combining templates. What happens when one template defines a column as REQUIRED, while another defines it as OPTIONAL or does not define it at all — especially when the column may not even make sense in that context? For example, the immunopeptidomics template may not define an enzyme column, while the DDA template defines enzyme as REQUIRED. The real issue may be how we designed DDA/DIA: we marked enzyme as REQUIRED in DDA without considering valid DDA experiments (e.g., immunopeptidomics or top-down) where no enzyme applies. The key question is whether column requirements (REQUIRED / RECOMMENDED / OPTIONAL) should be context-dependent rather than globally enforced at the acquisition level. |
|
Yes, @jonasscheid @ypriverol both your options are also valid! We have to decide which one is the most maintainable or least complex. |
Summary
- Rename `HLA typing`/`HLA typing method` to `MHC typing`/`MHC typing method` in `sdrf-terms.tsv` to match the immunopeptidomics template README, which already uses the species-agnostic `MHC typing` terminology
- Add `inferred from mass spectrometry` as a valid MHC typing method

Fixes #794
Test plan
- `sdrf-terms.tsv` column names match immunopeptidomics `README.adoc` column names
- No `HLA typing` references used as column names