Top-level Python package structure proposal #349

@jeromekelleher

We do want to start providing a Python API soon, and I think @benjeffery's excellent refactoring work in #339 has brought this pretty close. Here's a proposal for how we could provide a long-term stable API that allows the package to grow new (potentially optional) modules for formats, and that also supports writing things other than VCZ.

  • bio2zarr.vcz: All code for writing VCZ
  • bio2zarr.plink: All code for working with plink format, defining conversion methods to VCZ
  • bio2zarr.vcf: All code for working with VCF format, defining conversion methods to VCZ
  • bio2zarr.tskit: All code for working with tskit format, defining conversion methods to VCZ
  • bio2zarr.bgen...

This feels like an easy thing for users to remember, and it gives a nice clean separation between the multi-format reading code and the writing-to-Zarr code. It should then be straightforward to have optional dependencies, if we don't want to bundle (e.g.) plink and bgen into the default install, which would keep things simpler for users when dependencies misbehave.
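
To make the optional-dependency idea concrete, here is a minimal sketch of one common pattern: each format module imports its third-party dependency lazily and raises a helpful error if it is missing. The helper name `require_optional` and the `bio2zarr[plink]`-style extras are hypothetical and only for illustration; nothing here is existing bio2zarr code.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional dependency, or raise an ImportError that
    tells the user which (hypothetical) extra to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{module_name!r} is needed for this format; "
            f"try: pip install 'bio2zarr[{extra}]'"  # extras name is an assumption
        ) from e
```

A format module could then call something like `require_optional("some_plink_reader", "plink")` inside its conversion function rather than at import time, so that importing the top-level package never fails just because one format's dependency is broken or absent.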
