Skip to content

Conversation

@Erotemic
Copy link
Contributor

What does this PR do?

This PR is an initial attempt at #244

This allows a user to define a scriptconfig object that gives them fine-grained control over the arguments and metadata on the CLI, without requiring the variables to be typed more than once. In other words, the signature of a function is maintained in a class, and it is assumed that the keyword arguments given to that class will be used to create an instance of the scriptconfig object.

Here is what the MWE looks like:

import scriptconfig as scfg


class MyClassConfig(scfg.DataConfig):
    key1 = scfg.Value(1, alias=['key_one'], help='description1')
    key2 = scfg.Value(None, type=str, help='description1')
    key3 = scfg.Value(False, isflag=True, help='description1')
    key4 = 123
    key5 = '123'


class MyClass:
    __scriptconfig__ = MyClassConfig

    def __init__(self, regular_param1, **kwargs):
        self.regular_param1 = regular_param1
        self.config = MyClassConfig(**kwargs)


def main():
    import jsonargparse
    import shlex
    import ubelt as ub
    import rich
    from rich.markup import escape
    parser = jsonargparse.ArgumentParser()
    parser.add_class_arguments(MyClass, nested_key='my_class', fail_untyped=False, sub_configs=True)
    parser.add_argument('--foo', default='bar')
    parser.add_argument('-b', '--baz', '--buzz', default='bar')
    print('Parse Args')
    cases = [
        '',
        '--my_class.key1 123',
        '--my_class.key_one 123ab',
        '--my_class.key4 strings-are-ok',
    ]
    for case_idx, case in enumerate(cases):
        print('--- Case {case_idx} ---')
        print(f'case={case}')
        args = shlex.split(case)
        config = parser.parse_args(args)
        instances = parser.instantiate_classes(config)

        my_class = instances.my_class
        rich.print(f'config = {escape(ub.urepr(config, nl=2))}')
        rich.print(f'my_class.config = {escape(ub.urepr(my_class.config, nl=2))}')
        print('---')


if __name__ == '__main__':
    """
    CommandLine:
        cd ~/code/geowatch/dev/mwe/
        python jsonargparse_scriptconfig_integration_test.py
    """
    main()

You'll note that it looks similar to dataclass / pydantic / attrs object. However, a major difference is that it doesn't rely on type annotations to store metadata. It uses a special "Value" class, which can be augmented with things like help, aliases, and other information useful to building both a CLI and a function / class signature.

The main scriptconfig page is here for more information: https://gitlab.kitware.com/utils/scriptconfig

I've been using a monkey-patched version of jsonargparse that allows me to work with scriptconfig objects for over a year now, and in late 2023 there was some change to jsonargparse that broke me. This force me to pin jsonargparse and pytorch_lightning to older versions, and I've finally had time to look into it. However, I'd really really really like is there was something codified into jsonargparse that would either directly support scriptconfig or expose some way for users to do custom things when running add_class_arguments based on a property in the class itself (which will let it integrate with lightning much easier). I don't particularly care if scriptconfig itself is supported, what I need is something that won't get deprecated or refactored that I can hook into.

So far this PR just adds basic support for scriptconfig itself. This involves some extra logic in _add_scriptconfig_arguments, and in order to support the alias feature of scriptconfig, I updated ParamData so it is now equipped with a short_aliases and a long_aliases attribute. I've also added a method to it so it can construct the args that will ultimately be passed to argparse. This makes modifications to _add_signature_parameter a bit cleaner, and also sets the stage to allow other argument sources to specify aliases.

Before submitting

  • Did you read the contributing guideline?
  • Did you update the documentation? (readme and public docstrings)
  • Did you write unit tests such that there is 100% coverage on related code? (required for bug fixes and new features)
  • Did you verify that new and existing tests pass locally?
  • Did you make sure that all changes preserve backward compatibility?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

WRT to this checklist, I don't want to invest too much time into this before having a discussion with the authors.

I'm looking for feedback on the level of support the maintainers would be willing to provide for something like this. I think the simple thing to do would be to "add scriptconfig support" and be done with it, in which case I could clean up the existing code. The alternative is to allow allow classes to have some attribute (e.g. __customize_signature__) that lets the user return a list of additional ParamData objects that should be recognized by the CLI. However, that involves making ParamData a public class, so that has its own set of tradeoffs.

@codecov
Copy link

codecov bot commented Jun 20, 2024

Codecov Report

Attention: Patch coverage is 58.62069% with 12 lines in your changes missing coverage. Please review.

Project coverage is 99.81%. Comparing base (6c89858) to head (f574e3f).
Report is 29 commits behind head on main.

Files with missing lines Patch % Lines
jsonargparse/_signatures.py 20.00% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##              main     #532      +/-   ##
===========================================
- Coverage   100.00%   99.81%   -0.19%     
===========================================
  Files           22       22              
  Lines         6375     6402      +27     
===========================================
+ Hits          6375     6390      +15     
- Misses           0       12      +12     
Flag Coverage Δ
py3.10 85.51% <58.62%> (-0.13%) ⬇️
py3.10_all 98.50% <58.62%> (-0.19%) ⬇️
py3.10_pydantic1 48.59% <55.17%> (+0.01%) ⬆️
py3.10_pydantic2 48.37% <55.17%> (+0.01%) ⬆️
py3.10_types 98.51% <58.62%> (-0.19%) ⬇️
py3.11 85.50% <58.62%> (-0.13%) ⬇️
py3.11_all 98.50% <58.62%> (-0.19%) ⬇️
py3.11_types 98.51% <58.62%> (-0.19%) ⬇️
py3.12 85.64% <58.62%> (-0.13%) ⬇️
py3.12_all 98.50% <58.62%> (-0.19%) ⬇️
py3.12_types 98.51% <58.62%> (-0.19%) ⬇️
py3.7 85.99% <58.62%> (-0.14%) ⬇️
py3.7_all 99.05% <58.62%> (-0.19%) ⬇️
py3.7_types 99.08% <58.62%> (-0.19%) ⬇️
py3.8 86.18% <58.62%> (-0.13%) ⬇️
py3.8_all 99.21% <58.62%> (-0.19%) ⬇️
py3.8_types 99.23% <58.62%> (-0.19%) ⬇️
py3.9 86.04% <58.62%> (-0.13%) ⬇️
py3.9_all 99.10% <58.62%> (-0.19%) ⬇️
py3.9_types 99.12% <58.62%> (-0.19%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mauvilsa
Copy link
Member

mauvilsa commented Jul 1, 2024

@Erotemic thank you for the pull request. Scriptconfig support like proposed here most likely won't happen. But I will have this in mind to add some public interface that would allow you to do this. Note that this is not a simple topic. New features should behave well with all other existing features and not prevent or make more difficult other potential new features. For example what is done here for aliases could conflict with #301. Please just be patient since this will take time.

For the time being we can use this pull request to test that your integration works with the latest changes.

@Erotemic
Copy link
Contributor Author

Erotemic commented Jul 1, 2024

Thanks for taking the time to look at / review this. Avoiding integration with a particular library makes a lot of sense, but I'm glad you're open to some public API to make something like this possible.

My thought is that ParamData is the obvious thing to expose to the user. If there was something like:

class MyClass:
    __jsonargparse_hook__ = ...

import jsonargparse
parser = jsonargparse.ArgumentParser()
parser.add_class_arguments(MyClass, nested_key='my_class', fail_untyped=False, sub_configs=True)

Where __jsonargparse_hook__ (or a better name) was a callback that users could use to return a custom list of ParamData objects, then that would allow me to write something specific for scriptconfig to integrate with jsonargparse without jsonargparse having direct knowledge of specific consumer packages. Perhaps the signature looks like:

def __jsonargparse_hook__(params: List[ParamData]) -> List[ParamData]:
    """
    Input is the initial list of default parameters that would be added. Modify / add / remove these 
    and return the modified list to control details of which arguments are exposed to jsonargparse. 
    """

However, ParamData may need to be modified (or given an initial restricted public form that is added to as the default develops). I think the first step is to add support for aliases. It looks like I made a PR to add alias last year that I forgot about: #255 but it is outdated and it looks like things have changed.

I don't have as much time to work on FOSS code in the next few months, but jsonargparse is a critical enough part of my workflow that I can prioritize it, but I want to have a solid plan before I commit to anything.

Let me know what you think.

@mauvilsa
Copy link
Member

mauvilsa commented Jul 3, 2024

Thank you for the proposal. Though, I don't like it much. I see several issues with it.

  • Seems to mostly consider the scriptconfig integration as implemented in this pull request, instead of targeting a more general scope. If a public API is added, it must be clear that several cases could benefit from it, instead of just a single library.
  • Only works for classes, leaving out functions and methods.
  • Not clear how this works when implementing subclasses.
  • It goes against a principle that jsonargparse has been following: have a design that promotes some good practices. Among these are separation of concerns and low coupling. For example note how cli.py is highly decoupled from the rest of lightning or user implemented module classes. The idea is, it is better that all the logic of a CLI be implemented in a single place, and not requiring a bunch of changes throughout a code base. That is, I don't like this __jsonargparse_hook__ in classes. I know that scriptconfig works like this. But that is no reason to do it similar in jsonargparse.

In my previous comment I was asking for patience because right now I am not sure how the integration API would be. And it will take some time for this, since I am focused on other stuff right now. But I will propose something here when ready.

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Fix test
@Erotemic Erotemic force-pushed the dev/scriptconfig_integration branch from 9039ad5 to f574e3f Compare September 10, 2024 18:18
@sonarqubecloud
Copy link

@Erotemic
Copy link
Contributor Author

Closing as this is not planned to be integrated, although I will continue to maintain this branch as my use-cases require some sort of way to define parameters without relying on the python annotation system or docstrings.

@Erotemic Erotemic closed this Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants