- Log messages are written to `stderr` rather than `stdout`.
- cli: `fq lint --record-definition-separator <string>` now only accepts a single ASCII character (#51).

  This previously accepted any nonempty string and dropped all characters after the first.
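
The single-character rule can be sketched as a small validation function. This is only an illustration of the constraint described above, not fq's actual parser; the name `parse_separator` and the error messages are hypothetical.

```rust
/// Hypothetical helper (not part of fq): accepts exactly one ASCII character
/// and returns it as a byte, rejecting empty, multi-character, and non-ASCII input.
fn parse_separator(s: &str) -> Result<u8, String> {
    let mut chars = s.chars();

    // Look at the first two characters: a valid separator has one and only one.
    match (chars.next(), chars.next()) {
        (Some(c), None) if c.is_ascii() => Ok(c as u8),
        (Some(_), None) => Err(format!("separator must be ASCII: {:?}", s)),
        (None, _) => Err(String::from("separator must not be empty")),
        (Some(_), Some(_)) => Err(format!("separator must be a single character: {:?}", s)),
    }
}
```

Rejecting longer strings outright, rather than truncating them, makes a mistyped separator visible instead of silently changing how names are split.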
- commands/filter: Require source count to match destination count (#52).
- commands/filter: Normalized names input (#49).

  The record names to filter now follow the same normalization rules as FASTQ record names, i.e., removal of the `@` prefix and description.
- commands/filter: Require a filter condition.

  Previously, both the `names` and `sequence-pattern` options were allowed to be missing, which would pass the input through unchanged. This is a nonsensical use case; a plain copy achieves the same result.
- commands/lint: Re-enable names validator when duplicate name validator is used for paired inputs (#48).

  When the inputs are paired, the duplicate name validator (S007) depends on the names validator (P001). If P001 is disabled, it will now get re-enabled.
- commands/lint: Support the duplicate name validator for single inputs (#47).
- fastq/record: Split name from definition on first separator.

  This previously searched for the separator from the end of the definition, so the extracted name could include part of the description if the separator appeared multiple times. It now searches from the beginning of the definition.

  This also affects how the name is extracted in the `filter` command.
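
A minimal sketch of the name extraction described above, assuming a single separator character is passed in. The function name `record_name` is hypothetical and the code is not taken from fq.

```rust
/// Hypothetical helper (not fq's API): returns the record name from a
/// definition line by dropping the `@` prefix and the description.
fn record_name(definition: &str, separator: char) -> &str {
    // Drop the leading `@`, if present.
    let definition = definition.strip_prefix('@').unwrap_or(definition);

    // Search from the beginning: everything after the first separator is
    // treated as description. Searching from the end (`rfind`) would leave
    // part of the description in the name when the separator repeats.
    match definition.find(separator) {
        Some(i) => &definition[..i],
        None => definition,
    }
}
```

For a definition such as `@fqlib:1/1 extra/2` with `/` as the separator, searching from the start yields `fqlib:1`, whereas searching from the end would have yielded `fqlib:1/1 extra`.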
- Remove `--verbose` flag.

  Logging is always enabled. This flag was previously deprecated in 0.8.0.
- commands: Remove `generate` command.

  The `generate` command created completely random paired reads. The descriptors tended to overlap when N was large, so while the outputs were parsable, they weren't practically useful for anything.
- commands/lint: Add `--record-definition-separator` option (#34).

  This allows a custom separator to be used to strip the description from a record name. When unset, the defaults remain '/' and ' '.
- commands/lint: Return a nonzero exit code if an error is logged.

  When the lint mode is set to `log`, the `lint` command will now exit with a nonzero status if there are any validation errors.
- commands/filter: Add filter by sequence pattern (#27).

  Records can be filtered by their sequence using a regular expression: `fq filter --sequence-pattern <regex> --dsts <dst> <src>`. It cannot be combined with name filtering.
- commands/filter: Support multiple segments (#30).

  The `filter` command now supports multiple segments. Each source is paired with a destination (i.e., the output is no longer written to stdout by default), which is filtered by whether the record in the first segment is matched.
- commands/subsample: Disallow 0% and 100% as probabilities.

  At these extremes, use `touch` and `cp`, respectively, instead.
- commands/subsample: Count the lines from the decompressed data if the input is gzipped.

  Used in the exact sampler, this previously counted "lines" from the compressed input.
- commands/subsample: Clamp the destination record count to the range of the source record count.

  An out-of-range count would otherwise cause the filter to never finish building.
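
The changelog does not describe how the filter is built. One way such a hang can arise, sketched below under that caveat, is a loop that keeps drawing record indices until it holds the requested number of distinct ones: if more are requested than exist, the loop never ends. `Lcg` and `pick_indices` are hypothetical names, and the generator is a toy used only to keep the example self-contained.

```rust
use std::collections::HashSet;

/// Toy linear congruential generator, for illustration only.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the higher-quality upper bits.
        self.0 >> 33
    }
}

/// Hypothetical sketch: picks `requested` distinct record indices out of `total`.
fn pick_indices(total: u64, requested: u64, seed: u64) -> HashSet<u64> {
    // Clamp first: a set of distinct indices below `total` can never hold
    // more than `total` entries, so an unclamped count would loop forever.
    let n = requested.min(total);

    let mut rng = Lcg(seed);
    let mut picked = HashSet::new();

    while (picked.len() as u64) < n {
        picked.insert(rng.next_u64() % total);
    }

    picked
}
```

With the clamp in place, asking for 25 records from a 10-record source simply selects all 10.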
- commands/subsample: Add exact sampler.

  This writes an exact number of samples to the output. Set the `-n/--record-count` option to use the exact sampler.
- Update argument parser to clap 3.
- Rename project to fq.
- commands/generate: Add `-s` short option for `--seed`.
- commands: Add `subsample` command.

  `subsample` outputs a proportional subset of records from single or paired FASTQ files.
- Deprecate `--verbose` flag.

  Logging is now always enabled.
- main: Show global version in subcommands (#20).

  This allows subcommands to show the global version, e.g., `fq lint --version`.
- generate: Added `--read-length` option to set the number of bases to generate in each record's sequence.
- The FASTQ reader handles files with CRLF (Windows) newlines and no final newline.
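
Tolerating both line-ending styles and a missing final newline can be sketched as follows. This is an assumed approach for illustration, not fq's reader; `trim_line_ending` and `split_lines` are hypothetical names.

```rust
/// Hypothetical helper: strips one trailing "\r\n" or "\n" from a raw line.
fn trim_line_ending(line: &[u8]) -> &[u8] {
    match line {
        [rest @ .., b'\r', b'\n'] => rest,
        [rest @ .., b'\n'] => rest,
        // No terminator at all, e.g. the last line of a file without a final newline.
        _ => line,
    }
}

/// Hypothetical helper: splits a buffer into lines without their terminators.
fn split_lines(data: &[u8]) -> Vec<&[u8]> {
    data.split_inclusive(|&b| b == b'\n')
        .map(trim_line_ending)
        .collect()
}
```

Splitting on `\n` and then trimming an optional `\r` treats Unix and Windows files the same, and the final chunk is returned whether or not it ends in a newline.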
- [BREAKING] generate: Renamed `--n-records` to `--record-count`.
- generate: `--record-count` is parsed as a `u64` rather than an `i32`. The argument parser never allowed negative numbers, so this change still includes the entire previous input set.
- The `generate` command adds a `--seed <u64>` option to seed the random number generator. This is useful to regenerate the same outputs.
- The FASTQ generator now uses the Sanger/Illumina 1.8+ range of quality scores ([0, 41]). It samples scores on a normal distribution (μ = 20.5, σ = 2.61).
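
As a rough illustration of the score range above, the sketch below turns two uniform samples into one quality character. The changelog does not say how the generator draws or bounds its samples, so the Box-Muller transform, the rounding, and the clamping are all assumptions, and `quality_char` is a hypothetical name. The Phred+33 offset is the standard encoding for Sanger/Illumina 1.8+ scores.

```rust
use std::f64::consts::PI;

const MIN_SCORE: f64 = 0.0;
const MAX_SCORE: f64 = 41.0;
/// Phred+33 offset used by the Sanger/Illumina 1.8+ encoding.
const OFFSET: u8 = 33;

/// Hypothetical sketch: maps two uniform samples (`u1` in (0, 1], `u2` in [0, 1))
/// to a quality character drawn from a normal distribution.
fn quality_char(u1: f64, u2: f64, mean: f64, std_dev: f64) -> char {
    // Box-Muller transform: a standard normal sample from two uniform samples.
    let z = (-2.0 * u1.ln()).sqrt() * (2.0 * PI * u2).cos();

    // Scale to the requested distribution, then round and clamp to [0, 41]
    // (assumed handling of out-of-range samples).
    let score = (mean + std_dev * z).round().clamp(MIN_SCORE, MAX_SCORE);

    char::from(OFFSET + score as u8)
}
```

Under these assumptions every output falls between `!` (score 0) and `J` (score 41), regardless of how extreme the sample is.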
- Updated dependency `bloom` --> `bbloom` to reflect a name change in the library.
- New `filter` command. This accepts an allowlist of record read names to keep in the output FASTQ.
- Add `Dockerfile` to build a self-contained image for `fq`. Build with `docker build --tag fqlib .`.
- Show git commit ID and date in display version, e.g., when using `--version`. This makes it easier to know the exact build of fqlib being used.
- [BREAKING] generate: Renamed `--num-blocks` to `--n-records`.
- For paired end reads, `fq lint` exits with unexpected EOF if both streams do not finish together.
- Multistream gzip files can be used as inputs. Written files still use a single stream.
- `fq lint` can take a single FASTQ file as input, in which case only single read validation is performed.
- A single binary `fq` with subcommands replaces `fqgen` and `fqlint`. Update usages to `fq generate` and `fq lint`, respectively.
- Metadata from CASAVA 1.8 read names is truncated. This is handled the same as interleaves.
- Fix line offset in error messages, which was previously off by 4.
- Initial release