Add comprehensive enum metadata and stewardship framework #32
This commit enhances the value sets project with comprehensive metadata and stewardship tracking for all 516 enums across 109 schema files.

## Key Changes

### 1. Custom Metamodel for Validation (src/valuesets/metamodel/)

Created `valuesets_meta.yaml` defining metaclasses for enum validation:

- `ValueSetEnumDefinition`: Base metaclass requiring title, description, status, and contributors
- `ValueSetEnumDefinitionWithStewardship`: Extends base with stewardship annotations (stewards, publishers, endorsers)
- `ValueSetEnumDefinitionWithSource`: For enums with external source
- `ValueSetEnumDefinitionWithConformance`: For enums conforming to standards
- `ReferenceEnumDefinition`: For reference/controlled vocabulary enums

### 2. Stewardship Value Set (src/valuesets/schema/stewardship.yaml)

Created enum of recognized steward organizations with ROR IDs:

- GA4GH, Gene Ontology Consortium, Genomic Standards Consortium
- Phenopackets, EMBL-EBI, NCBI, CDISC, HL7, ISO, W3C, etc.
- Supports tracking organizational responsibility for value sets

### 3. Batch Metadata Updates (All 516 Enums)

Updated all enums with required metadata fields:

- Added `title` (human-readable names)
- Ensured `description` present
- Set `status` (using StandardsMaturityLevel: DRAFT, STANDARD, etc.)
- Added default `contributors`: orcid:0000-0002-6601-2165 and github.com/anthropics/claude-code
- Added `instantiates` references to appropriate metaclasses

### 4. Stewardship Annotations (Conservative Application)

Applied stewardship metadata to well-established value sets:

**Gene Ontology (4 enums):**

- go_aspect.yaml, go_causality.yaml, go_evidence.yaml
- Stewards: https://geneontology.org
- Publishers: https://geneontology.org

**Phenopackets (11 enums in clinical/phenopackets.yaml):**

- KaryotypicSexEnum, PhenotypicSexEnum, AllelicStateEnum, ACMGPathogenicityEnum, etc.
- Stewards: https://phenopackets.org
- Publishers: https://ga4gh.org

**INSDC/GSC (2 enums):**

- insdc_geographic_locations.yaml, insdc_missing_values.yaml
- Stewards: https://gensc.org
- Publishers: https://www.insdc.org

Note: Used `annotations:` block since LinkML EnumDefinition doesn't support custom stewardship slots directly.

### 5. Governance Documentation (docs/governance.md)

Created comprehensive governance documentation covering:

- Source of Truth model (mirrored vs community value sets)
- Maturity levels and lifecycle
- Stewardship roles and responsibilities
- Contributing process and quality standards
- Working group structure (placeholder)

### 6. Bug Fix: Stale Virtual Environment

Fixed pre-existing issue where `.venv` had stale shebangs pointing to old project location (`common-value-sets`). Recreated venv, enabling:

- `just site` now succeeds
- `just test` passes (63 tests)
- All LinkML generators work correctly

## Impact

- All 516 enums now have complete, consistent metadata
- Stewardship tracking enables community governance
- Custom metamodel provides validation framework
- Foundation for future expansion of stewardship annotations

## Testing

- ✅ All 63 tests pass
- ✅ `just site` generates documentation successfully
- ✅ `just validate` runs without errors
- ✅ Schema validation passes
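The required-metadata rule described above (every enum carries `title`, `description`, `status`, and `contributors`) can be illustrated with a minimal sketch. This is not the project's validator: it assumes the schema has already been loaded into a plain Python dict, and the function name `missing_metadata` and the two example enum names are hypothetical.

```python
# Hypothetical sketch: report enums that lack the metadata fields the PR
# describes as required. Works on a schema already parsed into a dict;
# it does not call LinkML itself.
REQUIRED_FIELDS = ("title", "description", "status", "contributors")


def missing_metadata(schema):
    """Map each enum name to the required fields it lacks or leaves empty."""
    problems = {}
    for name, enum_def in (schema.get("enums") or {}).items():
        enum_def = enum_def or {}  # an enum may be declared with no body
        missing = [field for field in REQUIRED_FIELDS if not enum_def.get(field)]
        if missing:
            problems[name] = missing
    return problems


# Illustrative data only; these enum names are not taken from the repository.
example_schema = {
    "enums": {
        "ExampleCompleteEnum": {
            "title": "Example complete enum",
            "description": "An enum with all required metadata.",
            "status": "DRAFT",
            "contributors": ["orcid:0000-0002-6601-2165"],
        },
        "ExampleIncompleteEnum": {
            "title": "Example incomplete enum",
            "description": "An enum missing status and contributors.",
        },
    }
}

report = missing_metadata(example_schema)
print(report)
```

An empty report would correspond to the "complete, consistent metadata" state the PR claims for all 516 enums.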
Claude finished @cmungall's task

## PR Review: Add comprehensive enum metadata and stewardship framework

Review Checklist:

### 🔍 Review Summary

This is an excellent foundational PR that adds crucial governance infrastructure to the valuesets project. The implementation demonstrates thoughtful design and comprehensive coverage.

### ✅ Strengths

1. Well-Designed Metamodel Architecture

2. Comprehensive Metadata Coverage

3. Robust Stewardship Implementation

The stewardship.yaml design is particularly well done:

4. Excellent Governance Documentation

The governance.md provides:

### 🐛 Issues Found

1. Minor Schema Inconsistencies

2. Validation Gaps

3. Documentation Issues

### 🔒 Security Assessment

### ⚡ Performance Considerations

### 🧪 Test Coverage Assessment

Reviewed existing test suite:

### 📋 Recommendations

High Priority

Medium Priority

Low Priority

### 💡 Architecture Praise

The annotation-based stewardship approach is particularly clever:

```yaml
annotations:
  stewards: https://geneontology.org
  publishers: https://geneontology.org
```

This works around LinkML's current limitation where EnumDefinition doesn't support custom slots directly, while maintaining clean semantics and future extensibility.

### 🎯 Conclusion

This PR represents exceptional work that establishes crucial governance foundations. The metamodel design is sophisticated, the metadata coverage is comprehensive, and the stewardship model is well-architected. The minor issues identified are easily addressable and don't detract from the overall quality.

Recommendation: Approve with minor fixes

The identified namespace inconsistencies and placeholder URLs should be addressed, but the core contribution is excellent and ready for adoption.
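As a rough illustration of how the annotation-based stewardship metadata might be consumed, the sketch below reads the `stewards`, `publishers`, and `endorsers` keys from an enum's `annotations` block. It assumes the enum definition is a plain dict in the compact `key: value` annotation form shown in this thread; the helper name `stewardship_of` is hypothetical and the expanded tag/value annotation form is not handled.

```python
# Hypothetical sketch: pull stewardship keys out of an enum's annotations.
# Assumes the compact form, e.g. annotations: {stewards: <url>, publishers: <url>}.
STEWARDSHIP_KEYS = ("stewards", "publishers", "endorsers")


def stewardship_of(enum_def):
    """Return the stewardship annotations of one enum, with None for absent keys."""
    annotations = enum_def.get("annotations") or {}
    return {key: annotations.get(key) for key in STEWARDSHIP_KEYS}


# Mirrors the Gene Ontology annotation values quoted in the PR.
go_like_enum = {
    "annotations": {
        "stewards": "https://geneontology.org",
        "publishers": "https://geneontology.org",
    }
}

go_stewardship = stewardship_of(go_like_enum)
print(go_stewardship)
```

Enums without an `annotations` block simply yield `None` for every key, which matches the conservative application described in the PR, where most enums carry no stewardship metadata.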
Changes:

- Changed all enums to use `valuesets_meta:ValueSetEnumDefinition` instead of `ValueSetEnumDefinitionWithStewardship`
- Single base metaclass is simpler and cleaner
- Stewardship metadata remains via annotations block
- Removed Python 3.9 from CI test matrix (EOL)
- Updated 6 schema files: GO (3), INSDC (2), Phenopackets (1)
- Regenerated Python datamodels

All enums now consistently use the same base metaclass while maintaining stewardship annotations where appropriate.
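The consistency claim in this commit (every enum instantiates the same base metaclass) could be checked with a short sketch like the following. It assumes a schema loaded into a plain dict where `instantiates` is either a string or a list of strings; the function name `inconsistent_enums` and the example enum names are hypothetical.

```python
# Hypothetical sketch: list enums whose `instantiates` does not include the
# single base metaclass named in the commit message.
EXPECTED_METACLASS = "valuesets_meta:ValueSetEnumDefinition"


def inconsistent_enums(schema, expected=EXPECTED_METACLASS):
    """Return sorted names of enums that do not instantiate `expected`."""
    offenders = []
    for name, enum_def in (schema.get("enums") or {}).items():
        declared = (enum_def or {}).get("instantiates") or []
        if isinstance(declared, str):  # tolerate a single string value
            declared = [declared]
        if expected not in declared:
            offenders.append(name)
    return sorted(offenders)


# Illustrative data only; enum names are not taken from the repository.
example_schema = {
    "enums": {
        "UpdatedEnum": {"instantiates": ["valuesets_meta:ValueSetEnumDefinition"]},
        "OldStyleEnum": {
            "instantiates": ["valuesets_meta:ValueSetEnumDefinitionWithStewardship"]
        },
        "UnmarkedEnum": {},
    }
}

offenders = inconsistent_enums(example_schema)
print(offenders)
```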
Pull request overview
This PR enhances the valuesets project by adding comprehensive metadata and stewardship tracking for all 516 enums across 109 schema files. The changes establish a custom metamodel for validation, define stewardship organizations, and apply metadata to all enums in the repository.
Key changes:
- Created a custom metamodel (`valuesets_meta.yaml`) with metaclasses for enum validation
- Introduced `stewardship.yaml` defining 15 recognized organizations for value set governance
- Updated all 516 enums with title, description, status, contributors, and instantiates references
- Applied stewardship metadata to 17 well-established value sets (GO, Phenopackets, INSDC/GSC)
- Fixed pre-existing venv issue with stale shebangs
- Created governance documentation
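The stale-shebang problem listed among the key changes can be detected before recreating a virtual environment. The sketch below is an assumption-laden illustration, not part of the PR: it scans a directory of scripts (such as a venv's `bin` directory) and reports any whose shebang names an interpreter path that no longer exists. The function name `stale_shebangs` and the demo file names are hypothetical.

```python
# Hypothetical sketch: find scripts whose shebang points at a missing interpreter,
# which is what happens when a venv is moved or its project directory is renamed.
import tempfile
from pathlib import Path


def stale_shebangs(bin_dir):
    """Return (script name, interpreter path) pairs whose interpreter is missing."""
    stale = []
    for script in sorted(Path(bin_dir).iterdir()):
        if not script.is_file():
            continue
        with script.open("rb") as handle:
            first_line = handle.readline()
        if not first_line.startswith(b"#!"):
            continue  # not a script with a shebang
        parts = first_line[2:].decode("utf-8", "replace").split()
        if parts and not Path(parts[0]).exists():
            stale.append((script.name, parts[0]))
    return stale


# Self-contained demo in a temporary directory standing in for .venv/bin.
with tempfile.TemporaryDirectory() as tmp:
    fake_python = Path(tmp) / "fake-python"
    fake_python.write_text("placeholder interpreter\n")
    (Path(tmp) / "ok").write_text(f"#!{fake_python}\nprint('fine')\n")
    (Path(tmp) / "broken").write_text(
        "#!/nonexistent/common-value-sets/.venv/bin/python\nprint('stale')\n"
    )
    found = stale_shebangs(tmp)

print(found)
```

A non-empty result suggests the environment should be recreated rather than patched, which is the route the PR took.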
Reviewed changes
Copilot reviewed 132 out of 136 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/valuesets/metamodel/valuesets_meta.yaml | Defines metaclasses for enum validation and stewardship tracking |
| src/valuesets/schema/stewardship.yaml | New enum defining 15 stewardship organizations with ROR IDs |
| src/valuesets/schema/**/*.yaml | 109 schema files updated with enum metadata (title, status, contributors, instantiates) |
| src/valuesets/schema/clinical/phenopackets.yaml | Added stewardship annotations to 11 Phenopackets enums |
| docs/governance.md | New governance documentation |
| justfile | Added gen-slides to site target |
Summary
This PR enhances the value sets project with comprehensive metadata and stewardship tracking for all 516 enums across 109 schema files.
Key Features
🏛️ Custom Metamodel for Validation
Created `src/valuesets/metamodel/valuesets_meta.yaml` with metaclasses:
🌐 Stewardship Value Set
New `stewardship.yaml` with 15 recognized organizations:
📋 Complete Metadata Coverage
Updated all 516 enums with:
🎯 Conservative Stewardship Application
Applied stewardship metadata to well-established value sets:
Gene Ontology (4 enums):
Phenopackets (11 enums):
INSDC/GSC (2 enums):
📖 Governance Documentation
Created `docs/governance.md` covering:
🐛 Bug Fix
Fixed pre-existing venv issue with stale shebangs pointing to old project location:
- `just site` now succeeds
- `just test` passes (63 tests)
Technical Notes
Stewardship metadata uses an `annotations:` block because LinkML's EnumDefinition doesn't support custom slots directly:
Testing
- `just site` generates documentation
- `just validate` runs without errors
Impact
Next Steps
Future work could include:
🤖 Generated with Claude Code
Co-Authored-By: Claude [email protected]