Add a mechanism to allow integer to specify orders of magnitude suffixes like k, m, g, t? #165

@SGSSGene

The original question is: Should sharg provide a datatype that can also be specified with additional magnitude suffixes like k, m, g, t?

Motivation: I am currently trying to integrate automatic CWL generation into sharg. For this, sharg needs to map an app-dev-provided variable/type to a CWL type. (This is not only needed for CWL; the man and help page generators use this mapping as well.)
As my first working block I am using raptor as a reference and checking how to map its parameter to CWL.
There are a few candidates that report improper types.
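To make the mapping problem concrete, here is a minimal sketch of such a type-to-CWL mapping. The function name `cwl_type_name` is hypothetical and not part of sharg; the CWL type names are the standard ones, and mapping `std::filesystem::path` to `File` is an assumption, since it could just as well be a `Directory`.

```cpp
#include <cstdint>
#include <filesystem>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper, NOT part of sharg: coarse mapping from a C++ option type to a CWL type name.
template <typename T>
constexpr std::string_view cwl_type_name()
{
    if constexpr (std::is_same_v<T, bool>)
        return "boolean";
    else if constexpr (std::is_integral_v<T>)
        // CWL "int" is 32-bit signed, "long" is 64-bit signed (uint64_t does not fully fit either).
        return (sizeof(T) < 4 || (sizeof(T) == 4 && std::is_signed_v<T>)) ? "int" : "long";
    else if constexpr (std::is_same_v<T, float>)
        return "float";
    else if constexpr (std::is_floating_point_v<T>)
        return "double";
    else if constexpr (std::is_same_v<T, std::filesystem::path>)
        return "File"; // Assumption: could also be "Directory"; the type alone cannot tell.
    else
        return "string"; // Fallback: strong types and std::string end up here.
}
```

With a mapping like this, any option stored as a `std::string` or as a custom strong type falls through to `string`, which is exactly why the arguments listed below report improper types.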

Troubling arguments
For ./raptor build -hh these are:

--window (raptor::window) // Note: Solved, see below.
          The window size. Default: k-mer size. Value must be a positive integer.
--shape (std::string)
          The shape to use for k-mers. Mutually exclusive with --kmer. Default: . Value must match the pattern
          '[01]+'.
--size (std::string)
          The size in bytes of the resulting index. Default: 1k. Must be an integer followed by [k,m,g,t] (case
          insensitive).
--output (std::filesystem::path)
          Provide an output filepath or an output directory if --compute-minimiser is used.

For ./raptor search -hh:

 --pattern (raptor::pattern_size) // Note: Solved, see below.
          The pattern size. Default: Median of sequence lengths in query file.
 --output (std::filesystem::path)
          Provide a path to the output.

Details:

  1. ✔️ --output For CWL I need to differentiate between input-file, output-file, input-directory, output-directory.
    To decide if something is an input or an output, we can just take a peek at the validators. Problem solved.
    For ./raptor build specifically, we also need to decide whether the output is a directory or a file. This will be solved by moving --compute-minimiser into a new command. Considered solved.
  2. ✔️ --window and --pattern. These arguments are strong types. This was done to override the default message with a custom one. This has been solved by [FEATURE] Add default_message #109, which allows specifying a custom default message, so the strong type is not required anymore. Considered solved.
  3. ✔️ --shape: I have no clue what to do with this one. But it is very special and a string should suffice. Considered solved for now.
  4. --size: This is what this discussion is about!

This option seems to be something that maybe a lot of other apps need. So should sharg provide a way of doing this out of the box?

  • If we answer no: There is nothing left to do.
  • If we say yes: How do we want to do this?

What are our goals?

  • Easy use for app-dev
  • Minimally invasive
  • Maintainability
  • Extensibility
  • Readability
  • Encapsulation

API suggestions
Here are different API suggestions for what this could look like for app-devs. They do not go into detail about what the implementation looks like:

// Idea 1: wrap the variable when adding the option
uint64_t size{30};
myparser.add_option(sharg::enable_length_suffix{size}, sharg::config{.long_id = "size"});
// Idea 2: keep a string and annotate the underlying type in the config
std::string size{"30"};
myparser.add_option(size, sharg::config{.long_id = "size", .underlying_type = size_t{}}); // Maybe we could also encode that in the validator?
// Idea 3: declare the variable itself as a wrapper type
sharg::enable_length_suffix size{30};
myparser.add_option(size, sharg::config{.long_id = "size"});

Exploring

  • Could we use this mechanism also for input/output-files/directories?
    Answer: yes, this works with Idea 1 and Idea 3. We just add a sharg::output_file or sharg::output_directory type.

  • Could we use this mechanism instead of validators?
    Answer: yes, parsing a string and converting it to a type always includes an implicit validation.
    Maybe a validator is the wrong abstraction. Alternatively, we could annotate a parser. This would allow many more possibilities, but removes the possibility of chaining validators (does anyone use this functionality?):

// Idea 4
size_t size{30};
myparser.add_option(size, sharg::config{.long_id = "size", .parser = sharg::large_number_parser});
