WiP: Branch with auto converted linkml model #381
yarikoptic
left a comment
initial review pointers
| contributor: | ||
| name: contributor | ||
| notes: | ||
| - 'pydantic2linkml: Warning: The translation is incomplete. Tagged union types |
All 3 of those (present in 4 spots in the Pydantic model) should translate to LinkML's designates_type.
See https://claude.ai/share/66fe0f31-8cf1-40e6-b9fd-47c6719f2006
(Self)Note: This is an issue regarding translation of discriminated unions in Pydantic models.
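As a rough illustration of what a discriminated (tagged) union does, here is a minimal standard-library sketch. The class names, the `schemaKey` tag field, and the `parse_contributor` helper are hypothetical placeholders chosen for the example, not the actual dandischema definitions; in LinkML, a slot marked with `designates_type` would play the role of the tag.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    schemaKey: str = "Person"  # hypothetical discriminator (tag) field


@dataclass
class Organization:
    name: str
    schemaKey: str = "Organization"  # hypothetical discriminator (tag) field


# Map each tag value to the class it selects.
CONTRIBUTOR_TYPES = {"Person": Person, "Organization": Organization}


def parse_contributor(record: dict):
    """Pick the union member from the tag instead of trying each member in turn."""
    tag = record.get("schemaKey")
    if tag not in CONTRIBUTOR_TYPES:
        raise ValueError(f"unknown contributor type: {tag!r}")
    return CONTRIBUTOR_TYPES[tag](**record)
```

The point of the tag is that the target class is chosen by a single lookup, which is the behavior the translation would need to preserve.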
| name: identifier | ||
| range: string | ||
| required: true | ||
| pattern: ^(?:urn:uuid:)?[0-9a-fA-F]{8}-?[0-9a-fA-F]{4}-?4[0-9a-fA-F]{3}-?[89abAB][0-9a-fA-F]{3}-?[0-9a-fA-F]{12}$ |
per discussion with @candleindark and claude -- it seems we should be able to use types here. Here is an example excerpt from https://claude.ai/share/66fe0f31-8cf1-40e6-b9fd-47c6719f2006
types:
  ORCID:
    uri: xsd:string
    base: str
    pattern: "^https://orcid\\.org/\\d{4}-\\d{4}-\\d{4}-\\d{3}[0-9X]$"
    description: An ORCID identifier
  ISSN:
    uri: xsd:string
    base: str
    pattern: "^\\d{4}-\\d{3}[0-9X]$"
    description: An ISSN identifier
  UUID:
    uri: xsd:string
    base: str
    pattern: "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
    description: A UUID
slots:
  contributor_id:
    range: ORCID
  publication_id:
    range: ISSN
  asset_id:
    range: UUID
The definitions in dandischema.models don't attach pattern information to those types. The types are actually defined as aliases of str:
dandi-schema/dandischema/models.py
Lines 714 to 718 in 1deb319
Because of that, I don't think these type definitions should be part of the output of the auto translation; they should instead be added manually on top of that output.
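If such types are added by hand, the pattern they carry matters. The sketch below compares the `identifier` pattern from the generated schema quoted in this thread with the simplified `UUID` pattern from the example excerpt, using only Python's `re` module. The sample identifiers are made up for illustration, and the variable names are not part of any existing code.

```python
import re

# Pattern attached to the `identifier` slot in the auto-converted schema.
GENERATED = re.compile(
    r"^(?:urn:uuid:)?[0-9a-fA-F]{8}-?[0-9a-fA-F]{4}-?4[0-9a-fA-F]{3}"
    r"-?[89abAB][0-9a-fA-F]{3}-?[0-9a-fA-F]{12}$"
)
# Simplified pattern from the example `UUID` type in the excerpt.
EXAMPLE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

v4 = "f47ac10b-58cc-4372-a567-0e02b2c3d479"  # made-up version 4 UUID
v1 = "123e4567-e89b-12d3-a456-426614174000"  # made-up version 1 UUID

samples = [v4, "urn:uuid:" + v4, v4.upper(), v4.replace("-", ""), v1]
for s in samples:
    # Report which of the two patterns accepts each sample.
    print(s, bool(GENERATED.match(s)), bool(EXAMPLE.match(s)))
```

The two patterns disagree on prefixed, upper-case, hyphen-free, and non-version-4 values, so a manually added type would presumably need to reuse the pattern from the model rather than the one from the excerpt.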
| genotype: | ||
| name: genotype | ||
| notes: | ||
| - 'pydantic2linkml: Warning: The translation is incomplete. The union core schema |
Seems "obvious": a union of list[object] vs str. Should be doable now.
This kind of union can now be expressed in LinkML using any_of, as demonstrated in #257 (comment).
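To make the shape of such a translation concrete, here is a hedged sketch that maps a `Union[List[X], str]` annotation onto an `any_of`-style dictionary. It is not how pydantic2linkml works internally; `GenotypeInfo` is a placeholder class and `union_to_any_of` is a hypothetical helper. Whether LinkML tooling honors `multivalued` inside an `any_of` branch is an assumption that would need checking.

```python
from typing import List, Union, get_args, get_origin


class GenotypeInfo:
    """Placeholder for the object type on the list side of the union."""


def union_to_any_of(annotation) -> dict:
    """Translate Union[List[X], str] into an any_of-style slot fragment."""
    if get_origin(annotation) is not Union:
        raise TypeError("expected a Union annotation")
    branches = []
    for member in get_args(annotation):
        if get_origin(member) is list:
            (item,) = get_args(member)  # the list's element type
            branches.append({"range": item.__name__, "multivalued": True})
        elif member is str:
            branches.append({"range": "string"})
        else:
            raise TypeError(f"unsupported union member: {member!r}")
    return {"any_of": branches}


fragment = union_to_any_of(Union[List[GenotypeInfo], str])
```

Each union member becomes one branch, so the list side and the string side stay distinguishable in the resulting fragment.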
dandischema/models.yaml
| notes: | ||
| - 'pydantic2linkml: Unable to translate the logic contained in the wrap validation | ||
| function, <function _BaseUrl.__get_pydantic_core_schema__.<locals>.wrap_val | ||
| at 0x7f8c96c36f80>.' |
This is for List[AnyHttpUrl], and @candleindark says that there is a custom validator attached. Maybe we could just provide a custom type for validation here.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #381 +/- ##
=======================================
Coverage 97.92% 97.92%
=======================================
Files 18 18
Lines 2405 2405
=======================================
Hits 2355 2355
Misses 50 50
fa28737 to c78f9d3
Specify Hatch-managed env for auto converting `dandischema.models` to LinkML schema and back to Pydantic models
Provide script to translate `dandischema.models` into a LinkML schema and overlay it with definitions provided by an overlay file.
Provide script to translate `dandischema/models.yaml` back to Pydantic models and store them in `dandischema/models.py`
=== Do not change lines below ===
{
"chain": [],
"cmd": "hatch run linkml-auto-converted:2linkml",
"exit": 0,
"extra_inputs": [],
"inputs": [],
"outputs": [],
"pwd": "."
}
^^^ Do not change lines above ^^^
28c5b44 to bd38142
These prefixes are copied from https://github.com/dandi/schema/blob/master/releases/0.7.0/context.json
bd38142 to 6bb4b52
…tions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There is no prefix defined as `dandi_default`. The intended default prefix is `dandi`
… models.yaml The previous BRE pattern used `\+` (GNU sed extension) which silently fails on macOS BSD sed. Switch to `-E` (extended regex) with POSIX character class `[^[:space:]]` instead of `\S` (also unsupported by BSD sed), making the normalization work on both macOS and Linux. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
to not be merged
The plan here
- git mv dandischema/{,orig_}models.py
- hatch ... TODO models.py into dandischema/models.yaml and overlaid with a [dandischema/models_overlay.yaml] overlay file.
- model_instances.yaml (or alike) which would define pre-populated records such as standards (bids, nwb, ...). Aim for potentially multiple classes there.
- Where possible, keep additional checks in some extra file which would tune the auto-converted model.
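The overlay step in this plan could amount to a recursive merge of two mappings loaded from YAML. The sketch below assumes that semantics (nested mappings are merged, everything else is replaced by the overlay value); the actual script in this branch may merge differently. The `overlay` function and the sample data are hypothetical.

```python
from copy import deepcopy


def overlay(base: dict, patch: dict) -> dict:
    """Return a copy of `base` with `patch` applied recursively on top."""
    result = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are mappings: merge them key by key.
            result[key] = overlay(result[key], value)
        else:
            # Scalars, lists, and new keys: the overlay value wins.
            result[key] = deepcopy(value)
    return result


auto_converted = {"slots": {"identifier": {"range": "string", "required": True}}}
manual_overlay = {
    "types": {"UUID": {"base": "str"}},
    "slots": {"identifier": {"range": "UUID"}},
}
merged = overlay(auto_converted, manual_overlay)
```

Keeping manual additions in a separate overlay mapping means the auto-converted output can be regenerated at any time without losing them.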