You can run astminer via its command line interface (CLI).
The CLI allows you to run the tool on any implemented parser with specified options for filtering, label extraction, and storage of the results.
You can build and run the CLI with any version of astminer:
- Check out the relevant version of astminer sources (for example, the master branch).
- Build a shadow jar for astminer:
    gradle shadowJar
- (Optional) Pull a Docker image with all parser dependencies installed:
    docker pull voudy/astminer
- Run astminer with a specified configuration file:
    ./cli.sh <path-to-yaml-config>

The CLI of astminer is fully configured via a YAML file.
The config should contain the following values:
- inputDir: path to the directory with input data
- outputDir: path to the output directory
- parser: parser name and list of target languages
- filters: list of filters and parameters
- label: label extraction strategy
- storage: storage format
We prepared several YAML config examples so that you can use them as a reference. Possible parameter values for the respective entities can be found in the docs or definitions of the config classes.
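
As a rough illustration of how the six top-level keys fit together, a config might look like the sketch below. Only the top-level key names come from the list above. The nested keys (name, languages, maxTreeSize) and all of the values are assumptions made for illustration, so check them against the bundled example configs and the config class definitions before relying on them.

```yaml
# Illustrative sketch only: nested keys and values are assumptions,
# not a verified astminer configuration.
inputDir: path/to/input      # directory with input data (placeholder path)
outputDir: path/to/output    # directory for the results (placeholder path)

parser:
  name: antlr                # assumed parser name
  languages: [java]          # assumed list of target languages

filters:
  - name: by tree size       # assumed filter name
    maxTreeSize: 1000        # assumed filter parameter

label:
  name: file name            # assumed label extraction strategy

storage:
  name: json                 # assumed storage format
```

A file like this would be passed to the CLI as the <path-to-yaml-config> argument of ./cli.sh.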
Some parsers have non-trivial environment requirements. For example, g++ must be installed for the Fuzzy parser (see parsers).
To simplify dealing with such cases, we provide a Docker image with all parser dependencies. This image can be pulled from DockerHub:
docker pull voudy/astminer