Skip to content

Conversation

benjeffery
Copy link
Contributor

Fixes #379

@coveralls
Copy link
Collaborator

coveralls commented May 13, 2025

Coverage Status

coverage: 98.179% (+0.01%) from 98.166%
when pulling 7049588 on benjeffery:variant_length
into 7942838 on sgkit-dev:main.

@benjeffery benjeffery marked this pull request as ready for review May 13, 2025 10:36
Copy link
Contributor

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - we can probably refine the API a bit, but good step in the right direction.

@tomwhite does this look right to you in terms of computing variant length? Should we do this, storing a "length" field everywhere, or have some fallback in the index computation code which computes the default when it's missing?

@jeromekelleher jeromekelleher requested a review from tomwhite May 13, 2025 11:50
@tomwhite
Copy link
Contributor

@tomwhite does this look right to you in terms of computing variant length? Should we do this, storing a "length" field everywhere, or have some fallback in the index computation code which computes the default when it's missing?

This looks good to me. It's not obvious to me that there is a general way of computing it so having the formats provide is safest.

bio2zarr/vcf.py Outdated
@@ -1145,14 +1156,12 @@ def fixed_field_spec(name, dtype, source=None, dimensions=("variants",)):
dtype="bool",
),
]
name_map = {field.full_name: field for field in self.metadata.fields}

# Only three of the fixed fields have a direct one-to-one mapping.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Only three of the fixed fields have a direct one-to-one mapping.
# Only two of the fixed fields have a direct one-to-one mapping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed

@benjeffery benjeffery force-pushed the variant_length branch 3 times, most recently from a28f22d to 3829dbb Compare May 14, 2025 09:04
@benjeffery
Copy link
Contributor Author

Rebased and ready to go

@jeromekelleher
Copy link
Contributor

Looks like there's a real coverage drop here, with some warnings that used to be tested no longer covered. We should address these (in general, coverage drops aren't something we want)

@benjeffery
Copy link
Contributor Author

It's the index-when-arrays-missing error case has been uncovered (it was accidentally covered before). I'll add a specific test for it.

@benjeffery
Copy link
Contributor Author

Test added and rebased. Coveralls now happy.

@jeromekelleher jeromekelleher added this pull request to the merge queue May 14, 2025
Merged via the queue into sgkit-dev:main with commit 21d0224 May 14, 2025
15 checks passed
@benjeffery benjeffery deleted the variant_length branch May 14, 2025 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tskit to zarr conversion fails to create index due to missing variant length
4 participants