
Fix MHC typing vs HLA typing naming inconsistency #801

Merged
ypriverol merged 1 commit into bigbio:dev from jonasscheid:fix/mhc-allele-typing-inconsistency
Feb 26, 2026

@jonasscheid
Contributor

Summary

  • Renames HLA typing / HLA typing method to MHC typing / MHC typing method in sdrf-terms.tsv to match the immunopeptidomics template README which already uses the species-agnostic MHC typing terminology
  • Updates all cross-references in quickstart guides, main README, site HTML, and llms.txt
  • Aligns descriptions and allowed values to be species-agnostic (e.g., supports both human HLA and mouse H-2 nomenclature)
  • Adds inferred from mass spectrometry as a valid MHC typing method

Fixes #794

Test plan

  • Verify sdrf-terms.tsv column names match immunopeptidomics README.adoc column names
  • Verify site HTML renders correctly with updated terms
  • Confirm no remaining HLA typing references used as column names
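The last check in the test plan could be scripted roughly as follows. This is only a sketch: `find_legacy_columns` is a hypothetical helper, not part of sdrf-pipelines, and the `characteristics[...]` header format in the example is an assumption about how the terms appear as column names.

```python
import re

# Legacy terms that should no longer appear as column names (lower-cased).
LEGACY_TERMS = {"hla typing", "hla typing method"}


def find_legacy_columns(header_line: str) -> list[str]:
    """Return header cells whose term (bracketed or bare) is a legacy HLA name."""
    hits = []
    for cell in header_line.rstrip("\n").split("\t"):
        # Assumed layout: 'characteristics[term]' or 'comment[term]'.
        match = re.search(r"\[([^\]]+)\]", cell)
        term = (match.group(1) if match else cell).strip().lower()
        if term in LEGACY_TERMS:
            hits.append(cell)
    return hits


# Illustrative header only; real templates may use other columns.
header = (
    "source name\tcharacteristics[organism]\t"
    "characteristics[HLA typing]\tcharacteristics[MHC typing method]"
)
legacy = find_legacy_columns(header)
print(legacy)
```

Matching on the exact term rather than on the substring "HLA" keeps prose mentions of HLA nomenclature in descriptions from being flagged.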

@coderabbitai
Contributor

coderabbitai bot commented Feb 23, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@ypriverol
Member

@jonasscheid should we also clearly explain how to annotate experiments without a cleavage enzyme? See https://www.ebi.ac.uk/ols4/ontologies/ms/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMS_1001956?lang=en

@jonasscheid
Contributor Author

jonasscheid commented Feb 24, 2026

In theory there is no need for it, since all immunopeptidomics experiments use unspecific cleavage. Or how do you normally handle this kind of "default" metadata?

@ypriverol
Member

I don't think it is needed. However, there is still one issue that I haven't solved well. When we combine two templates, say DDA + immunopeptidomics, we end up with conflicting rules: the DDA template requires the enzyme and the immunopeptidomics template does not. @noatgnu, I have to make sure we have a mechanism to represent a property that is required in one template and optional or not required in the other, to decide what to do in that case, and, more importantly, to detect it and make it clear to users. @jpfeuffer @timosachsenberg, can you also give us your opinion on this?

@noatgnu
Collaborator

noatgnu commented Feb 24, 2026

We currently have a mechanism for merging templates, but we need more test cases for it. You can see it in sdrf-pipelines here: https://github.com/bigbio/sdrf-pipelines/blob/dev/src/sdrf_pipelines/sdrf/schemas/utils.py

There you will find a utility for composing templates from different schemas into a new schema with all the rules. For this purpose, the merge strategy would be _merge_fields_combine_strategy: if two or more templates have the same column and a requirement was defined for that column, only the requirement at the highest level will persist.

utils.py can also write out a TSV file with all the template column headers in the correct order, using the schema_to_tsv function, which takes a schema.

We might need an update so that it writes out correct references to the original schemas that the composite schema was created from, including their names and versions.

@jpfeuffer

@noatgnu The problem with this is:

a) how is highest level defined
b) the immunopeptidomics template will by default say nothing about enzyme, so it won't overwrite any field whatsoever

@jpfeuffer

But to be honest, immunopeptidomics can just require an enzyme column, and even more strictly require one very specific enzyme only, called "No enzyme", which should be part of the ontology.

@jonasscheid
Contributor Author

> a) how is highest level defined

@noatgnu I would expect mandatory > optional > custom when merging templates, right?

Agree with @jpfeuffer: it will most likely be combined with the proteomics-ms template, and therefore the enzyme specification will be required. I would still make it optional in the immunopeptidomics template, since it's not a concrete batch effect of the data, unlike MHC type or enrichment method. But then we have it documented in case other templates require enzyme.

It should then be done the same way for other templates, e.g. metaproteomics.
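The precedence order suggested in this comment can be sketched as below. This is a minimal illustration under stated assumptions, not the `_merge_fields_combine_strategy` implementation in sdrf-pipelines: `Column`, `PRECEDENCE`, and `merge_templates` are hypothetical names, "required" stands in for "mandatory", and the column names and requirement levels are illustrative only.

```python
from dataclasses import dataclass

# Assumed ordering from the comment above: mandatory > optional > custom.
PRECEDENCE = {"custom": 0, "optional": 1, "required": 2}


@dataclass(frozen=True)
class Column:
    name: str
    requirement: str  # one of the PRECEDENCE keys


def merge_templates(*templates: list[Column]) -> dict[str, Column]:
    """Combine templates, keeping the strictest requirement per column."""
    merged: dict[str, Column] = {}
    for template in templates:
        for column in template:
            current = merged.get(column.name)
            if (
                current is None
                or PRECEDENCE[column.requirement] > PRECEDENCE[current.requirement]
            ):
                merged[column.name] = column
    return merged


# Illustrative templates only; not the real template definitions.
dda = [Column("comment[cleavage agent details]", "required")]
immuno = [
    Column("comment[cleavage agent details]", "optional"),
    Column("characteristics[MHC typing]", "required"),
]

merged = merge_templates(dda, immuno)
for column in merged.values():
    print(column.name, "->", column.requirement)
```

With this ordering the result does not depend on the order in which templates are combined, and a column marked optional in one template cannot relax a column marked required in another.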

@nithujohn
Collaborator

I would suggest naming it MHC class, since the example given for it in the template is "HLA-A*02:01, HLA-B*07:02, HLA-C*07:02, H-2Kb, H-2Db", which is the type of MHC molecule presenting peptides. MHC typing, by contrast, would be the laboratory method used to determine which HLA alleles a donor has.

@jpfeuffer

Yes, but even with optional, this would not be enough to overwrite the requiredness of this column that comes from the MS template (with the reasonable order that you mentioned).

Therefore you either have to define overwriting strictly by combination order (i.e. MS + immuno is different from immuno + MS), or, as I said, make immuno even stricter about enzymes and make it a required AND term-restricted column.
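The "required AND term-restricted" option could look roughly like this. It is a sketch only: `check_enzyme_column` and `parse_cell` are hypothetical helpers, the `NT=...;AC=...` cell format is assumed, and the allowed accession is taken from the MS_1001956 ontology link earlier in this thread.

```python
# Assumption: the only enzyme term allowed for immunopeptidomics.
ALLOWED_ACCESSION = "MS:1001956"


def parse_cell(cell: str) -> dict[str, str]:
    """Parse an 'NT=...;AC=...' cell into a dict with upper-cased keys."""
    pairs = (part.split("=", 1) for part in cell.split(";") if "=" in part)
    return {key.strip().upper(): value.strip() for key, value in pairs}


def check_enzyme_column(values: list[str]) -> list[str]:
    """Return error messages; an empty list means the column is valid."""
    errors = []
    if not values:
        # Required: a missing or empty column is itself an error.
        errors.append("enzyme column is required but missing")
    for row, cell in enumerate(values, start=1):
        # Term-restricted: every row must carry the allowed accession.
        if parse_cell(cell).get("AC") != ALLOWED_ACCESSION:
            errors.append(f"row {row}: expected AC={ALLOWED_ACCESSION}, got {cell!r}")
    return errors


errors = check_enzyme_column(
    ["NT=unspecific cleavage;AC=MS:1001956", "NT=Trypsin;AC=MS:1001251"]
)
print(errors)
```

Because the immunopeptidomics template would then be at least as strict as the MS template for this column, the merge result no longer depends on combination order.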

@jonasscheid
Contributor Author

What about a default enzyme value in the YAML for immunopeptidomics? If this field is required by another template, it would just be populated. I guess this behaviour could be relevant for other templates as well.
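A rough sketch of that default-value idea, with the defaults shown as a Python dict rather than YAML. `IMMUNO_DEFAULTS` and `apply_defaults` are hypothetical names, and the column names and cell format are assumptions made for illustration.

```python
# Hypothetical template-level defaults for immunopeptidomics.
IMMUNO_DEFAULTS = {
    "comment[cleavage agent details]": "NT=unspecific cleavage;AC=MS:1001956",
}


def apply_defaults(
    rows: list[dict[str, str]],
    required_columns: list[str],
    defaults: dict[str, str],
) -> tuple[list[dict[str, str]], list[str]]:
    """Fill required-but-empty columns from defaults.

    Returns the filled rows and the sorted names of required columns
    that are still missing because no default exists for them.
    """
    filled = []
    missing = set()
    for row in rows:
        new_row = dict(row)  # do not mutate the caller's rows
        for column in required_columns:
            if new_row.get(column):
                continue  # user-provided value wins over the default
            if column in defaults:
                new_row[column] = defaults[column]
            else:
                missing.add(column)
        filled.append(new_row)
    return filled, sorted(missing)


rows = [{"source name": "sample 1"}]
required = ["source name", "comment[cleavage agent details]", "comment[instrument]"]
filled, missing = apply_defaults(rows, required, IMMUNO_DEFAULTS)
print(filled)
print(missing)
```

Defaults only fill gaps here; a value the user supplied is never overwritten, and required columns without a default are still reported.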

@ypriverol
Member

A couple of points:

Immunopeptidomics can be combined with both DDA and DIA, not only DDA.

I agree with @jpfeuffer that we can reference the enzyme using MS:1001956 (unspecific cleavage).

This discussion highlights a more general issue around combining templates. What happens when one template defines a column as REQUIRED, while another defines it as OPTIONAL or does not define it at all — especially when the column may not even make sense in that context? For example, the immunopeptidomics template may not define an enzyme column, while the DDA template defines enzyme as REQUIRED.

The real issue may be how we designed DDA/DIA: we marked enzyme as REQUIRED in DDA without considering valid DDA experiments (e.g., immunopeptidomics or top-down) where no enzyme applies. The key question is whether column requirements (REQUIRED / RECOMMENDED / OPTIONAL) should be context-dependent rather than globally enforced at the acquisition level.

@jpfeuffer

Yes, @jonasscheid @ypriverol both your options are also valid! We have to decide which one is the most maintainable or least complex.

@ypriverol ypriverol merged commit dcf0f6c into bigbio:dev Feb 26, 2026
2 of 3 checks passed