| 1 | +.. _extractor-options: |
| 2 | +
| 3 | +Extractor options |
| 4 | +================= |
| 5 | + |
| 6 | +The CodeQL CLI uses special programs, called extractors, to extract information from the source code of a |
| 7 | +software system into a database that can be queried. You can customize the behavior of extractors by |
| 8 | +setting extractor configuration options through the CodeQL CLI. |
| 9 | + |
| 10 | +About extractor options |
| 11 | +----------------------- |
| 12 | + |
| 13 | +Each extractor defines its own set of configuration options. To find out which options are available for a particular extractor, run ``codeql resolve languages`` or ``codeql resolve extractor`` with the ``--format=betterjson`` option. The ``betterjson`` output format provides the root paths of extractors and additional information. The output of ``codeql resolve extractor --format=betterjson`` typically looks like the following example:: |
| 14 | + |
| 15 | +    { |
| 16 | +        "extractor_root" : "/home/user/codeql/java", |
| 17 | +        "extractor_options" : { |
| 18 | +            "option1" : { |
| 19 | +                "title" : "Java extractor option 1", |
| 20 | +                "description" : "An example string option for the Java extractor.", |
| 21 | +                "type" : "string", |
| 22 | +                "pattern" : "[a-z]+" |
| 23 | +            }, |
| 24 | +            "group1" : { |
| 25 | +                "title" : "Java extractor group 1", |
| 26 | +                "description" : "An example option group for the Java extractor.", |
| 27 | +                "type" : "object", |
| 28 | +                "properties" : { |
| 29 | +                    "option2" : { |
| 30 | +                        "title" : "Java extractor option 2", |
| 31 | +                        "description" : "An example array option for the Java extractor.", |
| 32 | +                        "type" : "array", |
| 33 | +                        "pattern" : "[1-9][0-9]*" |
| 34 | +                    } |
| 35 | +                } |
| 36 | +            } |
| 37 | +        } |
| 38 | +    } |
| 39 | + |
| 40 | +The extractor option names and descriptions are listed under ``extractor_options``. Each option may contain the following fields: |
| 41 | + |
| 42 | +* ``title`` (required): The title of the option |
| 43 | +* ``description`` (required): The description of the option |
| 44 | +* ``type`` (required): The type of the option, which can be |
| 45 | + |
| 46 | + * ``string``: indicating that the option can have a single string value |
| 47 | + * ``array``: indicating that the option can have a sequence of string values |
| 48 | + * ``object``: indicating that it is not an option itself, but a grouping that may contain other options and option groups |
| 49 | + |
| 50 | +* ``pattern`` (optional): The regular expression pattern that all values of the option must match. Note that the extractor may impose additional constraints on option values that are not, or cannot be, expressed in this regular expression pattern. Such constraints, if they exist, are explained in the ``description`` field. |
| 51 | +* ``properties`` (optional): A map from extractor option names in the option group to the corresponding extractor option descriptions. This field can only be present for option groups, that is, options of ``object`` type. |
| 52 | + |
| 53 | +In the example above, the extractor declares two options: |
| 54 | + |
| 55 | +* ``option1`` is a ``string`` option with value matching ``[a-z]+`` |
| 56 | +* ``group1.option2`` is an ``array`` option with values matching ``[1-9][0-9]*`` |
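The way nested groups map to dotted option names can be illustrated with a short Python sketch. It is not part of the CodeQL CLI. The ``SAMPLE`` dictionary mirrors the example output above rather than real extractor output, and the helper name ``list_options`` is hypothetical.

```python
# Hypothetical sample in the shape of the betterjson example above.
SAMPLE = {
    "extractor_root": "/home/user/codeql/java",
    "extractor_options": {
        "option1": {
            "title": "Java extractor option 1",
            "description": "An example string option for the Java extractor.",
            "type": "string",
            "pattern": "[a-z]+",
        },
        "group1": {
            "title": "Java extractor group 1",
            "description": "An example option group for the Java extractor.",
            "type": "object",
            "properties": {
                "option2": {
                    "title": "Java extractor option 2",
                    "description": "An example array option for the Java extractor.",
                    "type": "array",
                    "pattern": "[1-9][0-9]*",
                }
            },
        },
    },
}


def list_options(options, prefix=""):
    """Yield (dotted_name, type, pattern) for every option that is not a group."""
    for name, spec in options.items():
        dotted = prefix + name
        if spec.get("type") == "object":
            # An option group: descend into its nested options and groups.
            yield from list_options(spec.get("properties", {}), dotted + ".")
        else:
            # "pattern" is optional, so it may be None.
            yield dotted, spec.get("type"), spec.get("pattern")


for dotted, kind, pattern in list_options(SAMPLE["extractor_options"]):
    print(dotted, kind, pattern)
```

For the sample data, the loop prints one line for ``option1`` and one for ``group1.option2``, matching the two options listed above.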
| 57 | + |
| 58 | +Setting extractor options with the CodeQL CLI |
| 59 | +--------------------------------------------- |
| 60 | + |
| 61 | +The CodeQL CLI supports setting extractor options in subcommands that directly or indirectly invoke extractors. These commands are: |
| 62 | + |
| 63 | +* ``codeql database create`` |
| 64 | +* ``codeql database start-tracing`` |
| 65 | +* ``codeql database trace-command`` |
| 66 | +* ``codeql database index-files`` |
| 67 | + |
| 68 | +When running these subcommands, you can set extractor options with the ``--extractor-option`` CLI option. For example: |
| 69 | + |
| 70 | +* ``codeql database create --extractor-option java.option1=abc ...`` |
| 71 | +* ``codeql database start-tracing --extractor-option java.group1.option2=102 ...`` |
| 72 | + |
| 73 | +``--extractor-option`` requires exactly one argument of the form ``extractor_option_name=extractor_option_value``. ``extractor_option_name`` is the name of the extractor (in this example, ``java``) followed by a period and then the name of the extractor option (in this example, either ``option1`` or ``group1.option2``). ``extractor_option_value`` is the value being assigned to the extractor option. The value must match the regular expression pattern of the extractor option (if it exists), and it must not contain newline characters. |
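The constraints on the argument can be sketched in Python as follows. This is an illustration only, not the CLI's implementation. It assumes that the argument is split at the first ``=``, that the whole value must match the pattern, and that Python's ``re`` dialect is close enough to the one the CLI uses. The function name ``parse_extractor_option`` is hypothetical.

```python
import re


def parse_extractor_option(argument, pattern=None):
    """Split 'name=value' and check the value against the stated constraints."""
    # Assumption: the name ends at the first '='.
    name, sep, value = argument.partition("=")
    if not sep or not name:
        raise ValueError("expected extractor_option_name=extractor_option_value")
    if "\n" in value:
        raise ValueError("option values must not contain newline characters")
    # Assumption: the pattern, when declared, must match the entire value.
    if pattern is not None and re.fullmatch(pattern, value) is None:
        raise ValueError(f"value {value!r} does not match pattern {pattern!r}")
    return name, value


print(parse_extractor_option("java.option1=abc", "[a-z]+"))
print(parse_extractor_option("java.group1.option2=102", "[1-9][0-9]*"))
```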
| 74 | + |
| 75 | +Using ``--extractor-option`` to assign an extractor option that does not exist is an error. |
| 76 | + |
| 77 | +The CodeQL CLI accepts multiple ``--extractor-option`` options in the same invocation. If you set a ``string`` extractor option multiple times, the last option value overwrites all previous ones. If you set an ``array`` extractor option multiple times, all option values are concatenated in order. |
| 78 | + |
| 79 | +You can also specify extractor option names without the extractor name. For example: |
| 80 | + |
| 81 | +* ``codeql database create --extractor-option option1=abc ...`` |
| 82 | +* ``codeql database start-tracing --extractor-option group1.option2=102 ...`` |
| 83 | + |
| 84 | +If you do not specify an extractor name, the extractor option setting applies to all extractors that declare an option with the given name. In the above example, the first command would set the extractor option ``option1`` to ``abc`` for the ``java`` extractor and for every other extractor that declares an option named ``option1`` (for example, the ``cpp`` extractor, if it declares such an option). |
| 85 | + |
| 86 | +Setting extractor options from files |
| 87 | +------------------------------------ |
| 88 | + |
| 89 | +You can also set extractor options through a file. The CodeQL CLI subcommands that accept ``--extractor-option`` also accept ``--extractor-options-file``, which has a required argument of the path to a YAML file (with extension ``.yaml`` or ``.yml``) or a JSON file (with extension ``.json``). For example: |
| 90 | + |
| 91 | +* ``codeql database create --extractor-options-file options.yml ...`` |
| 92 | +* ``codeql database start-tracing --extractor-options-file options.json ...`` |
| 93 | + |
| 94 | +Each option file contains a tree structure of nested maps. At the root is an ``extractor`` map key, and beneath it are map keys that correspond to extractor names. Starting at the third level, there are extractor options and option groups. |
| 95 | + |
| 96 | +In JSON:: |
| 97 | + |
| 98 | +    { |
| 99 | +        "extractor" : { |
| 100 | +            "java": { |
| 101 | +                "option1" : "abc", |
| 102 | +                "group1" : { |
| 103 | +                    "option2" : [ 102 ] |
| 104 | +                } |
| 105 | +            } |
| 106 | +        } |
| 107 | +    } |
| 108 | + |
| 109 | + |
| 110 | +In YAML:: |
| 111 | + |
| 112 | +    extractor: |
| 113 | +        java: |
| 114 | +            option1: "abc" |
| 115 | +            group1: |
| 116 | +                option2: [ 102 ] |
| 117 | + |
| 118 | +The value for a ``string`` extractor option must be a string or a number (which will be converted to a string before further processing). |
| 119 | + |
| 120 | +The value for an ``array`` extractor option must be an array of strings or numbers. |
| 121 | + |
| 122 | +The value for an option group (of type ``object``) must be a map, which may contain nested extractor options and option groups. |
| 123 | + |
| 124 | +Each extractor option value must match the regular expression pattern of the extractor option (if it exists), and it must not contain newline characters. |
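To make the mapping between an option file and dotted option names concrete, the following Python sketch flattens the JSON example above into ``(name, value)`` assignments. It is a sketch of the data model described here, not the CLI's own parser. The helper name ``flatten`` is hypothetical, and converting numbers with ``str`` is an assumption about how the conversion to strings behaves.

```python
import json


def flatten(node, prefix=""):
    """Yield (dotted_name, value) pairs from a nested option map."""
    for key, value in node.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # An option group (or an extractor name): descend one level.
            yield from flatten(value, dotted)
        elif isinstance(value, list):
            # An array option: one assignment per element, in order.
            for item in value:
                yield dotted, str(item)
        else:
            # A string option: numbers are converted to strings.
            yield dotted, str(value)


text = '{"extractor": {"java": {"option1": "abc", "group1": {"option2": [102]}}}}'
assignments = list(flatten(json.loads(text)["extractor"]))
print(assignments)
```

Under these assumptions the file is equivalent to passing ``--extractor-option java.option1=abc`` and ``--extractor-option java.group1.option2=102`` on the command line.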
| 125 | + |
| 126 | +Assigning an extractor option that does not exist is an error. You can make the CodeQL CLI ignore unknown extractor options by using a special ``__allow_unknown_properties`` Boolean field. For example, the following option file asks the CodeQL CLI to ignore all unknown extractor options and option groups under ``group1``:: |
| 127 | + |
| 128 | +    extractor: |
| 129 | +        java: |
| 130 | +            option1: "abc" |
| 131 | +            group1: |
| 132 | +                __allow_unknown_properties: true |
| 133 | +                option2: [ 102 ] |
| 134 | + |
| 135 | +You can specify ``--extractor-options-file`` multiple times. The extractor option assignments are processed in the following order: |
| 136 | + |
| 137 | +1. All extractor option files specified by ``--extractor-options-file`` are processed in the order they appear on the command line. |
| 138 | +2. Then, all extractor option assignments specified by ``--extractor-option`` are processed in the order they appear on the command line. |
| 139 | + |
| 140 | +The same rules govern what happens when the same extractor option is set multiple times, regardless of whether the assignments are done using ``--extractor-option``, using ``--extractor-options-file``, or some combination of the two. If you set a ``string`` extractor option multiple times, the last option value overwrites all previous values. If you set an ``array`` extractor option multiple times, all option values are concatenated in order. |
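The overwrite and concatenation rules can be modelled with a short Python sketch. It only illustrates the semantics described above. The function name ``apply_assignments``, the ``types`` table, and the second set of values (``xyz`` and ``7``) are hypothetical.

```python
def apply_assignments(option_types, assignments):
    """Combine (name, value) assignments given in processing order.

    option_types maps each dotted option name to "string" or "array".
    """
    result = {}
    for name, value in assignments:
        if name not in option_types:
            raise KeyError(f"unknown extractor option: {name}")
        if option_types[name] == "array":
            # Array options: values are concatenated in order.
            result.setdefault(name, []).append(value)
        else:
            # String options: the last value overwrites earlier ones.
            result[name] = value
    return result


types = {"java.option1": "string", "java.group1.option2": "array"}
# Assignments from option files are processed first, then those
# from --extractor-option.
from_files = [("java.option1", "abc"), ("java.group1.option2", "102")]
from_cli = [("java.option1", "xyz"), ("java.group1.option2", "7")]
merged = apply_assignments(types, from_files + from_cli)
print(merged)
```

Here ``java.option1`` ends up as ``xyz`` because the command-line assignment is processed last, while ``java.group1.option2`` collects both values in order.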