diff --git a/README.md b/README.md index ea0834f..bf9b83b 100644 --- a/README.md +++ b/README.md @@ -20,6 +20,8 @@ Its design is heavily inspired by the awesome [ESLint](https://eslint.org/). different dataset conventions. - Project-specific configurations including configuration of individual rules and file-specific settings. +- Works with dataset files in the local filesystem or any of the remote + filesystems supported by xarray. ## Inbuilt Rules diff --git a/docs/about.md b/docs/about.md index f3d1e44..5c9df78 100644 --- a/docs/about.md +++ b/docs/about.md @@ -33,7 +33,7 @@ To install the XRLint development environment into an existing Python environmen pip install .[dev,doc] ``` -or create a new environment using +or create a new environment using `conda` or `mamba` ```bash mamba env create diff --git a/docs/api.md b/docs/api.md index cfc9fa6..40ec151 100644 --- a/docs/api.md +++ b/docs/api.md @@ -17,19 +17,19 @@ This chapter provides a plain reference for the XRLint Python API. configuration information and provide related functionality: [Config][xrlint.config.Config] and [ConfigList][xrlint.config.ConfigList]. - The `rule` module provides rule related classes and functions: - [Rule][xrlint.rule.Rule] comprising rule metadata - [RuleMeta][xrlint.rule.RuleMeta] and the rule operation + [Rule][xrlint.rule.Rule] comprising rule metadata, + [RuleMeta][xrlint.rule.RuleMeta], the rule validation operations in [RuleOp][xrlint.rule.RuleOp], as well as related to the latter [RuleContext][xrlint.rule.RuleContext] and [RuleExit][xrlint.rule.RuleExit]. Decorator [define_rule][xrlint.rule.define_rule] allows defining rules. 
-- The `node` module defines the nodes passed to [xrlint.rule.RuleOp]: - base classes [None][xrlint.node.Node], [XarrayNode][xrlint.node.XarrayNode] +- The `node` module defines the nodes passed to [RuleOp][xrlint.rule.RuleOp]: + base classes [Node][xrlint.node.Node], [XarrayNode][xrlint.node.XarrayNode], and the specific [DatasetNode][xrlint.node.DatasetNode], [DataArray][xrlint.node.DataArrayNode], [AttrsNode][xrlint.node.AttrsNode], and [AttrNode][xrlint.node.AttrNode] nodes. - The `processor` module provides processor related classes and functions: [Processor][xrlint.processor.Processor] comprising processor metadata - [ProcessorMeta][xrlint.processor.ProcessorMeta] + [ProcessorMeta][xrlint.processor.ProcessorMeta], and the processor operation [ProcessorOp][xrlint.processor.ProcessorOp]. Decorator [define_processor][xrlint.processor.define_processor] allows defining processors. @@ -41,7 +41,8 @@ This chapter provides a plain reference for the XRLint Python API. [RuleTester][xrlint.testing.RuleTester] that is made up of [RuleTest][xrlint.testing.RuleTest]s. -Note: the `xrlint.all` convenience module exports all of the above from a +Note: + the `xrlint.all` convenience module exports all of the above from a single module. ::: xrlint.cli.engine.XRLint diff --git a/docs/cli.md b/docs/cli.md index 6fcb212..ccf06c6 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -8,15 +8,19 @@ Usage: xrlint [OPTIONS] [FILES]... Validate the given dataset FILES. - Reads configuration from `./xrlint_config.*` if such file exists and unless - `--no_config_lookup` is set or `--config` is provided. Then validates each - dataset in FILES against the configuration. The default dataset patters are - `**/*.zarr` and `**/.nc`. FILES may comprise also directories. If a - directory is not matched by any file pattern, it will be traversed - recursively. The validation result is dumped to standard output if not - otherwise stated by `--output-file`. The output format is `simple` by - default. 
Other inbuilt formats are `json` and `html` which you can specify - using the `--format` option. + Reads configuration from './xrlint_config.*' if such a file exists and unless + '--no-config-lookup' is set or '--config' is provided. It then validates + each dataset in FILES against the configuration. The default dataset patterns + are '**/*.zarr' and '**/*.nc'. FILES may also comprise directories or URLs. + The supported URL protocols are the ones supported by xarray. Using remote + protocols may require installing additional packages such as S3Fs + (https://s3fs.readthedocs.io/) for the 's3' protocol. + + If a directory is provided that is not matched by any file pattern, it will be + traversed recursively. The validation result is dumped to standard output if + not otherwise stated by '--output-file'. The output format is 'simple' by + default. Other inbuilt formats are 'json' and 'html' which you can specify + using the '--format' option. Options: --no-config-lookup Disable use of default configuration from @@ -36,5 +40,4 @@ Options: --init Write initial configuration file and exit. --version Show the version and exit. --help Show this message and exit. - ``` diff --git a/docs/config.md b/docs/config.md index 770af50..c3819fb 100644 --- a/docs/config.md +++ b/docs/config.md @@ -66,27 +66,78 @@ these properties: * `name` - A name for the configuration object. This is used in error messages and config inspector to help identify which configuration object is being used. -* `files` - A list of glob patterns indicating the files that the +* `files` - A list of glob patterns indicating the files or URLs that the configuration object should apply to. If not specified, the configuration object applies to all files matched by any other configuration object. + See section [File and Ignore Patterns](#file-and-ignore-patterns) below. 
+* `ignores` - A list of glob patterns indicating the files and URLs that the configuration object should not apply to. If not specified, the configuration - object applies to all files matched by files. If ignores is used without any + object applies to all files matched by `files`. If ignores is used without any other keys in the configuration object, then the patterns act as _global ignores_. + See section [File and Ignore Patterns](#file-and-ignore-patterns) below. * `opener_options` - A dictionary specifying keyword-arguments that are passed directly to the `xarray.open_dataset()` function. The available options are dependent on the xarray backend selected by the `engine` option. + See section [Opener Options](#opener-options) below. * `linter_options` - A dictionary containing settings related to the linting process. (Currently not used.) -* `processor` - A string indicating the name of a processor inside of a plugin, - i.e., `"/"`. In Python configurations - it can also be an object of type `ProcessorOp` containing - `preprocess()` and `postprocess()` methods. + See section [Linter Options](#linter-options) below. +* `settings` - An object containing name-value pairs of information that should + be available to all rules. * `plugins` - A dictionary containing a name-value mapping of plugin names to either plugin module names or `Plugin` objects. When `files` is specified, these plugins are only available to the matching files. + See sections [Configuring Plugins](#configuring-plugins) + and [Custom Plugins](#custom-plugins) below. * `rules` - An object containing the configured rules. When `files` or `ignores` are specified, these rule configurations are only available to the matching files. -* `settings` - An object containing name-value pairs of information that should - be available to all rules. + See sections [Configuring Rules](#configuring-rules) + and [Custom Rules](#custom-rules) below. 
+* `processor` - A string indicating the name of a processor inside of a plugin, + i.e., `"/"`. In Python configurations + it can also be an object of type `ProcessorOp` containing + `preprocess()` and `postprocess()` methods. + See sections [Configuring Processors](#configuring-processors) + and [Custom Processors](#custom-processors) below. + +## File and Ignore Patterns + +_Coming soon_ + +## Opener Options + +_Coming soon_ + +## Linter Options + +_Coming soon_ + +## Configuring Plugins + +_Coming soon_ + +## Configuring Rules + +_Coming soon_ + +## Configuring Processors + +_Coming soon_ + +## Predefined Configuration Objects + +_Coming soon_ + +## Custom Plugins + +_Coming soon_ + +## Custom Rules + +_Coming soon_ + +## Custom Processors + +_Coming soon_ + diff --git a/docs/index.md b/docs/index.md index 65d3f5a..914f91e 100644 --- a/docs/index.md +++ b/docs/index.md @@ -14,6 +14,8 @@ Its design is heavily inspired by the awesome [ESLint](https://eslint.org/). different dataset conventions. - Project-specific configurations including configuration of individual rules and file-specific settings. +- Works with dataset files in the local filesystem or any of the remote + filesystems supported by xarray. ## Inbuilt Rules diff --git a/xrlint/cli/engine.py b/xrlint/cli/engine.py index c7bc7f3..26e2a14 100644 --- a/xrlint/cli/engine.py +++ b/xrlint/cli/engine.py @@ -73,9 +73,10 @@ def result_stats(self) -> ResultStats: def load_config_list(self) -> None: """Load configuration list. - The function considers any `plugin` and `rule` - options, the default configuration file names or a specified - configuration file. + The function will load the configuration list from a specified + configuration file, if any. + Otherwise it will search for the default configuration files + in the current working directory. 
""" plugins = {} for plugin_spec in self.plugin_specs: @@ -121,7 +122,7 @@ def get_config_for_file(self, file_path: str) -> Config | None: """Compute configuration for the given file. Args: - file_path: A file path. + file_path: A file path or URL. Returns: A configuration object or `None` if no item @@ -133,14 +134,14 @@ def print_config_for_file(self, file_path: str) -> None: """Print computed configuration for the given file. Args: - file_path: A file path. + file_path: A file path or URL. """ config = self.get_config_for_file(file_path) config_json_obj = config.to_json() if config is not None else None click.echo(json.dumps(config_json_obj, indent=2)) def verify_datasets(self, files: Iterable[str]) -> Iterator[Result]: - """Verify given files. + """Verify given files or directories which may also be given as URLs. The function produces a validation result for each file. Args: diff --git a/xrlint/cli/main.py b/xrlint/cli/main.py index fa76911..d5bc730 100644 --- a/xrlint/cli/main.py +++ b/xrlint/cli/main.py @@ -112,17 +112,22 @@ def main( ): """Validate the given dataset FILES. - Reads configuration from `./xrlint_config.*` if such file - exists and unless `--no_config_lookup` is set or `--config` is + Reads configuration from './xrlint_config.*' if such file + exists and unless '--no_config_lookup' is set or '--config' is provided. - Then validates each dataset in FILES against the configuration. - The default dataset patters are `**/*.zarr` and `**/.nc`. - FILES may comprise also directories. If a directory is not matched - by any file pattern, it will be traversed recursively. + It then validates each dataset in FILES against the configuration. + The default dataset patters are '**/*.zarr' and '**/.nc'. + FILES may comprise also directories or URLs. The supported URL + protocols are the ones supported by xarray. Using remote + protocols may require installing additional packages such as + S3Fs (https://s3fs.readthedocs.io/) for the 's3' protocol. 
+ + If a directory is provided that is not matched by any file pattern, + it will be traversed recursively. The validation result is dumped to standard output if not otherwise - stated by `--output-file`. The output format is `simple` by default. - Other inbuilt formats are `json` and `html` which you can specify - using the `--format` option. + stated by '--output-file'. The output format is 'simple' by default. + Other inbuilt formats are 'json' and 'html' which you can specify + using the '--format' option. """ from xrlint.cli.engine import XRLint
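
Reviewer note: the new "File and Ignore Patterns" section is still a placeholder, so here is a minimal sketch of how `files` and `ignores` patterns could select local paths and URLs. It is an illustration only, not XRLint's matcher: it uses `fnmatch.fnmatchcase` from the standard library as a stand-in, and the configuration names, the `s3://` bucket, and the helper `matching_configs` are hypothetical. Only the keys `name`, `files`, `ignores`, and `opener_options` come from `docs/config.md`. For simplicity the sketch also skips objects that have no `files` key, which differs from the documented behavior.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration objects; only the key names are taken from
# docs/config.md. Names, patterns, and the bucket are made up.
CONFIGS = [
    {
        "name": "remote-cubes",
        "files": ["s3://*/*.zarr"],
        "opener_options": {"engine": "zarr"},
    },
    {
        "name": "local-netcdf",
        "files": ["**/*.nc"],
        "ignores": ["**/scratch/*"],
    },
]


def matching_configs(path_or_url: str) -> list[str]:
    """Return the names of the config objects that select the given path.

    Assumption: fnmatch-style matching stands in for XRLint's own glob
    matching, whose exact semantics may differ.
    """
    names = []
    for config in CONFIGS:
        ignored = any(
            fnmatchcase(path_or_url, pattern)
            for pattern in config.get("ignores", [])
        )
        selected = any(
            fnmatchcase(path_or_url, pattern)
            for pattern in config.get("files", [])
        )
        if selected and not ignored:
            names.append(config["name"])
    return names


print(matching_configs("s3://my-bucket/cube.zarr"))  # ['remote-cubes']
print(matching_configs("data/obs.nc"))               # ['local-netcdf']
print(matching_configs("data/scratch/tmp.nc"))       # []
```

If the real matcher treats `**` differently from `*`, the patterns above may need adjusting. The point is only that URLs and local paths can go through the same pattern check.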