@JerryJi0730
Contributor

This PR adds a few features following up on #238. We now support passing in gzip-compressed GAF files and processing files from the command line (CLI). There are two main changes in this PR.

  1. Integrating flate2 decompression into make_pangenotype_matrix so that it automatically handles both gzip files and plain-text files.

  2. Adding a new make_pangenotype_matrix_from_stream, which reads pangenotype data from a stream such as standard input.
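
One common way to handle gzip and plain-text input transparently is to peek at the first two bytes of the input and check for the gzip magic number (0x1f 0x8b). The Python sketch below only illustrates that idea and is not the code in this PR: the actual change lives in the Rust core and uses flate2, and whether it detects gzip by magic bytes or by some other means is an assumption here. The helper names open_maybe_gzip and read_gaf_lines are hypothetical.

```python
import gzip
import io
from typing import BinaryIO, Iterator

# First two bytes of every gzip member (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzip(raw: BinaryIO) -> BinaryIO:
    """Hypothetical helper: wrap `raw` in a gzip reader only if it looks like gzip."""
    # peek() needs a buffered reader; wrap the stream if it is not one already.
    buffered = raw if isinstance(raw, io.BufferedReader) else io.BufferedReader(raw)
    # peek() does not consume bytes, so the chosen reader still sees the full input.
    head = buffered.peek(2)[:2]
    if head == GZIP_MAGIC:
        return gzip.GzipFile(fileobj=buffered, mode="rb")
    return buffered


def read_gaf_lines(raw: BinaryIO) -> Iterator[str]:
    """Hypothetical helper: yield non-empty text lines from a gzip or plain-text stream."""
    reader = open_maybe_gzip(raw)
    for line in io.TextIOWrapper(reader, encoding="utf-8"):
        line = line.rstrip("\r\n")
        if line:
            yield line
```

Because the helper takes any binary stream, the same function would serve both a file opened with open(path, "rb") and a stream such as sys.stdin.buffer, which mirrors the split between the file-based and stream-based entry points described above.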

Currently, users can pass gzip files through the Python bindings, or construct the matrix from the CLI with gzip -dc Chr3.1.gaf.gz | fgfa -I Chr3.gfa matrix --stdin. Users can also run fgfa -I Chr3.gfa matrix -f Chr3.1.gaf.gz for gzip input, or fgfa -I Chr3.gfa matrix -f Chr3.1.gaf for plain-text input.

Hang Ji and others added 14 commits May 14, 2025 01:19
We weren't actually using flate2 anywhere yet, so I have removed the
dependency.
Trying to keep the focus on the current version...
Instead of putting this directly in the Python bindings, now this is
available in the core library. This way, we can also add a command-line
way to do the same thing, for instance.

3 participants