|
15 | 15 |
|
16 | 16 | """ Splunk search command library |
17 | 17 |
|
18 | | - #Design Notes |
| 18 | +#Design Notes |
19 | 19 |
|
20 | | - 1. Command lines are constrained to this grammar (expressed informally): |
| 20 | +1. Command lines are constrained to this ABNF grammar:: |
21 | 21 |
|
22 | | - command-line = `*<command>* [*<option>***=**<value>**]… [*<field>*]…` |
23 | | - command = `\w+` |
24 | | - option = `[_a-zA-Z][_a-zA-Z0-9]+` |
25 | | - value = `([^\s"]+|"(?:[^"]+|""|\\")*")` |
26 | | - field = `"[_a-zA-Z][._a-zA-Z0-9-]+|"[_a-zA-Z][._a-zA-Z0-9-]"` |
| 22 | + command = command-name *[wsp option] *[wsp [dquote] field-name [dquote]] |
| 23 | + command-name = alpha *( alpha / digit ) |
| 24 | + option = option-name [wsp] "=" [wsp] option-value |
| 25 | + option-name = alpha *( alpha / digit / "_" ) |
| 26 | + option-value = word / quoted-string |
| 27 | + word = 1*( %01-%08 / %0B / %0C / %0E-%1F / %21 / %23-%FF ) ; Any character but DQUOTE and WSP |
| 28 | + quoted-string = dquote *( word / wsp / "\" dquote ) dquote |
| 29 | + field-name = ( "_" / alpha ) *( alpha / digit / "_" / "." / "-" ) |
27 | 30 |
|
28 | | - Note that this grammar does not indicate that *<field>* values may be |
29 | | - comma-separated. This is because Splunk strips commas from the command |
30 | | - line. You never see them. |
| 31 | + **Note:** |
31 | 32 |
|
32 | | - 2. Commands support dynamic probing for settings. |
33 | | - Splunk probes for settings when `supports_getinfo=true`. |
| 33 | + This grammar is constrained to an 8-bit character set. |
34 | 34 |
|
35 | | - 3. Commands do not support static probing for settings. |
36 | | - This class expects that commands are statically configured as follows: |
| 35 | + **Note:** |
37 | 36 |
|
38 | | - ``` |
39 | | - [*<command>*] |
40 | | - filename = *<command>*.py |
| 37 | + This grammar does not show that `field-name` values may be comma-separated, |
| 38 | + although they may be. This is because Splunk strips commas from the |
| 39 | + command line. A custom search command will never see them. |
| 40 | +
|
| 41 | +2. Commands support dynamic probing for settings. |
| 42 | +
|
| 43 | + Splunk probes for settings dynamically when `supports_getinfo=true`. |
| 44 | +
|
| 45 | +3. Commands do not support static probing for settings. |
| 46 | +
|
| 47 | + This module expects that commands are statically configured as follows:: |
| 48 | +
|
| 49 | + [<command-name>] |
| 50 | + filename = <command-name>.py |
41 | 51 | supports_getinfo = true |
42 | | - ``` |
43 | | -
|
44 | | - No other static configuration is required or expected and may interfere |
45 | | - with command execution. |
46 | | -
|
47 | | - 4. Commands do not support parsed arguments on the command line. |
48 | | - Splunk parses arguments when `supports_rawargs=false`. This class sets |
49 | | - `supports_rawargs=true` unconditionally. |
50 | | -
|
51 | | - **Rationale:* |
52 | | - Splunk parses arguments by stripping quotes, nothing more. This |
53 | | - may be useful in some cases, but doesn't work well with our chosen |
54 | | - grammar. |
55 | | -
|
56 | | - 5. Commands consume input headers. |
57 | | - An input header is provided by Splunk when `enableheader=true`. This |
58 | | - class sets this value unconditionally. |
59 | | -
|
60 | | - 6. Commands produce an output messages header. |
61 | | - Splunk expects a command to produce an output messages header when |
62 | | - `outputheader=true`. This class sets this value unconditionally. |
63 | | -
|
64 | | - 7. Commands support multi-value fields. |
65 | | - Multi-value fields are provided and consumed by Splunk when |
66 | | - `supports_multivalue=true`. This class sets this value unconditionally. |
67 | | -
|
68 | | - 8. Commands represent all fields on the output stream as multi-value |
69 | | - fields. |
70 | | - Splunk represents multi-value fields with a pair of fields: |
71 | | -
|
72 | | - + `<field-name>` |
73 | | - Contains the text from which the multi-value field was derived. |
74 | | -
|
75 | | - + `__mv_<field-name>` |
76 | | - Contains an encoded list. Values in the list are wrapped in dollar |
77 | | - signs ('$') and separated by semi-colons (';). Dollar signs ('$') |
78 | | - within a value are represented by a pair of dollar signs ('$$'). |
79 | | - Empty lists are represented by the empty string. Single-value lists |
80 | | - are represented by the single value. |
81 | | -
|
82 | | - On input this class processes and hides all **__mv_** fields. On output |
83 | | - this class produces backing **__mv_** fields for all fields, thereby |
84 | | - enabling a command to reduce its memory footprint by using streaming |
85 | | - I/O. This is done at the cost of one extra byte of data per field per |
86 | | - record on the output stream and extra processing time by the next |
87 | | - processor in the pipeline. |
88 | | -
|
89 | | - 9. A ReportingCommand must implement both its map (a.k.a, streaming preop) |
90 | | - and reduce (a.k.a., reporting) operations. Map/reduce command lines are |
91 | | - distinguished as exemplified below: |
92 | | -
|
93 | | - **Command:** |
94 | | - ``` |
95 | | - ...| sum total=total_date_hour date_hour |
96 | | - ``` |
97 | 52 |
|
98 | | - **Reduce command line:** |
99 | | - ``` |
100 | | - sum __GETINFO__ total=total_date_hour date_hour |
101 | | - sum __EXECUTE__ total=total_date_hour date_hour |
102 | | - ``` |
| 53 | + No other static configuration is required or expected; any additional |
| 54 | + configuration may interfere with command execution. |
103 | 55 |
|
104 | | - **Map command line:** |
105 | | - ``` |
106 | | - sum __GETINFO__ __map__ total=total_date_hour date_hour |
107 | | - sum __EXECUTE__ __map__ total=total_date_hour date_hour |
108 | | - ``` |
| 56 | +4. Commands do not support parsed arguments on the command line. |
109 | 57 |
|
110 | | - The `__map__`` argument is introduced by the `ReportingCommand._execute` |
111 | | - method. ReportingCommand authors cannot influence the contents of the |
112 | | - command line in this release. |
| 58 | + Splunk parses arguments when `supports_rawargs=false`. The ``SearchCommand`` |
| 59 | + class sets `supports_rawargs=true` unconditionally. You cannot override it. |
113 | 60 |
|
114 | | - #References |
| 61 | + **Rationale:** |
115 | 62 |
|
116 | | - 1. [Commands.conf.spec](http://docs.splunk.com/Documentation/Splunk/5.0.5/Admin/Commandsconf) |
117 | | - 2. [Search command style guide](http://docs.splunk.com/Documentation/Splunk/6.0/Search/Searchcommandstyleguide) |
| 63 | + Splunk parses arguments by stripping quotes, nothing more. This may be useful |
| 64 | + in some cases, but doesn't work well with our chosen grammar. |
118 | 65 |
|
119 | | - #Implementation notes |
| 66 | +5. Commands consume input headers. |
120 | 67 |
|
121 | | - 1. `# BFR` comments denote issues that should either be resolved before |
122 | | - formal review or turned into TODO comments. |
| 68 | + An input header is provided by Splunk when `enableheader=true`. The |
| 69 | + ``SearchCommand`` class sets this value unconditionally. You cannot override |
| 70 | + it. |
123 | 71 |
|
124 | | - 2. `# TODO` comments denote issues that should be eliminated or reported as |
125 | | - issues to be addressed in a later draft following formal review. |
| 72 | +6. Commands produce an output messages header. |
126 | 73 |
|
127 | | -""" |
| 74 | + Splunk expects a command to produce an output messages header when |
| 75 | + `outputheader=true`. The ``SearchCommand`` class sets this value |
| 76 | + unconditionally. You cannot override it. |
| 77 | +
|
| 78 | +7. Commands support multi-value fields. |
| 79 | +
|
| 80 | + Multi-value fields are provided and consumed by Splunk when |
| 81 | + `supports_multivalue=true`. The ``SearchCommand`` class sets this value |
| 82 | + unconditionally. You cannot override it. |
| 83 | +
|
| 84 | +8. Commands represent all fields on the output stream as multi-value fields. |
| 85 | +
|
| 86 | + Splunk represents multi-value fields with a pair of fields: |
| 87 | +
|
| 88 | + + `<field-name>` |
| 89 | +
|
| 90 | + Contains the text from which the multi-value field was derived. |
| 91 | +
|
| 92 | + + `__mv_<field-name>` |
128 | 93 |
|
129 | | -# TODO: . |
130 | | -# For examples inside the Python example files, we should do an entire search |
131 | | -# string, not just the specific invocation. |
132 | | - |
133 | | -# TODO: . |
134 | | -# Is changing commands.conf [default] a good or bad idea? Follow-up with Anirban |
135 | | -# on this. |
136 | | - |
137 | | -# TODO: . |
138 | | -# Meet the bar for a __repr__ implementation: format value as a Python |
139 | | -# expression, if you can provide an exact representation. We have more than |
140 | | -# one __repr__ implementation. Ensure they meet the bar. |
141 | | - |
142 | | -# TODO: . |
143 | | -# Doc comment sweep |
144 | | - |
145 | | -# TODO: . |
146 | | -# Q&A |
147 | | -# Does Splunk redundantly store the value of a single-value list in the |
148 | | -# shadowing **__mv_** field? |
149 | | -# Does Splunk gives us an input header on __GETINFO__? |
150 | | -# What does it really mean for a generating command to be streamable? |
151 | | - |
152 | | -# TODO: .csv |
153 | | -# Optimize because there's too much data copying, especially in writerows. |
154 | | -# Consider replacing csv.DictReader/Writer with csv.reader/writer. We're not |
155 | | -# getting much use out of the higher level variants. |
156 | | - |
157 | | -# TODO: .csv, .validators |
158 | | -# Data conversion from Splunk data types to Python data types: |
159 | | -# + Enumerate the set of data types native to Splunk |
160 | | -# + Map them to Python data types |
161 | | -# + Ensure input/output conversions (we do it for bool and list types today) |
162 | | - |
163 | | -# TODO: .logging |
164 | | -# Logging configuration files should be loaded once and only once. Does the |
165 | | -# Python logging system ensure this? Is it possible for us to check so that we |
166 | | -# can skip some of the work of logging.configure on repeated calls? |
167 | | - |
168 | | -# TODO: .search_command.SearchCommand.ConfigurationSettingsType |
169 | | -# Configuration matrix (an Excel spreadsheet) for each of the three command |
170 | | -# types: Generating, reporting, and streaming. |
171 | | - |
172 | | -# TODO: .search_command_internals.SearchCommandParser |
173 | | -# Eliminate field name checking or make it an option for search command |
174 | | -# developers. At present, we're not using it because we can't distinguish |
175 | | -# between fields that must be present in the input stream (we use this list |
176 | | -# to validate fields at present), fields that are created by the command, or |
177 | | -# fields that are unused by parts of a command (e.g., ReportingCommand.map |
178 | | -# may use and/or create some fields and ReportingCommand.reduce may use and/or |
179 | | -# create others.) |
180 | | - |
181 | | -# TODO: .search_command_internals.SearchCommandParser |
182 | | -# Consider an alternative to raising one error at a time. It would be nice to |
183 | | -# get all ValueErrors (e.g., illegal values, missing options,...) in one shot. |
184 | | - |
185 | | -# TODO: .search_command_internals.SearchCommandParser |
186 | | -# Finish BNF and ensure that regular expressions agree with it. One known point |
187 | | -# of departure: regular expressions and <name>, in the BNF that's presented in |
188 | | -# the source. |
189 | | - |
190 | | -# TODO: .search_command_internals.ConfigurationSettingsType |
191 | | -# Validate setting values. Today we verify that settings provided by a search |
192 | | -# command are settable (Unmanaged settings are not settable; managed settings |
193 | | -# are not. Managed settings are those without a backing class field. There are |
194 | | -# two types of managed settings: fixed and computed. Managed settings include |
195 | | -# computed: required_fields, streaming_preop and fixed: enableheaders,...) |
196 | | - |
197 | | -# TODO: .search_command_internals.InputHeaders |
198 | | -# Consider providing access to the contents of the file located at |
199 | | -# `self.input_headers['infoPath']`. This header is provided by Splunk when |
200 | | -# `requires_srinfo = True` |
201 | | - |
202 | | -# TODO: .search_command_internals.MessagesHeader |
203 | | -# Consider improving the interface and replacing its data structure borrowed |
204 | | -# from Intersplunk. The data structure is unsatisfying in that it doesn't retain |
205 | | -# the full temporal order of messages. For example, you can see the order in |
206 | | -# which `info_message` level messages arrive, but you cannot see how they |
207 | | -# interleaved with `warn_message` and `error_message` level messages. |
208 | | - |
209 | | -# TODO: .reporting_command.ReportingCommand.ConfigurationSettings |
210 | | -# Unless a ReportingCommand overrides the map method these settings should be |
211 | | -# fixed: |
212 | | -# + requires_preop = False |
213 | | -# + streaming_preop = '' |
214 | | -# Pay special attention to ReportingCommand.ConfigurationSettings.fix_up |
215 | | - |
216 | | -# TODO: .validators.Boolean |
217 | | -# Consider using the Splunk normalizeBoolean function |
| 94 | + Contains an encoded list. Values in the list are wrapped in dollar |
| 95 | + signs ('$') and separated by semi-colons (';'). Dollar signs ('$') |
| 96 | + within a value are represented by a pair of dollar signs ('$$'). |
| 97 | + Empty lists are represented by the empty string. Single-value lists |
| 98 | + are represented by the single value. |
218 | 99 |
|
| 100 | + On input this class processes and hides all **__mv_** fields. On output |
| 101 | + this class produces backing **__mv_** fields for all fields, thereby |
| 102 | + enabling a command to reduce its memory footprint by using streaming |
| 103 | + I/O. This is done at the cost of one extra byte of data per field per |
| 104 | + record on the output stream and extra processing time by the next |
| 105 | + processor in the pipeline. |
219 | 106 |
|
| 107 | +9. A ReportingCommand may implement a `map` method (a.k.a., a streaming preop) |
| 108 | + and must implement a `reduce` operation (a.k.a., a reporting operation). |
| 109 | +
|
| 110 | + Map/reduce command lines are distinguished by this module as exemplified |
| 111 | + here: |
| 112 | +
|
| 113 | + **Command**:: |
| 114 | +
|
| 115 | + ...| sum total=total_date_hour date_hour |
| 116 | +
|
| 117 | + **Reduce command line**:: |
| 118 | +
|
| 119 | + sum __GETINFO__ total=total_date_hour date_hour |
| 120 | + sum __EXECUTE__ total=total_date_hour date_hour |
| 121 | +
|
| 122 | + **Map command line**:: |
| 123 | +
|
| 124 | + sum __GETINFO__ __map__ total=total_date_hour date_hour |
| 125 | + sum __EXECUTE__ __map__ total=total_date_hour date_hour |
| 126 | +
|
| 127 | + The `__map__` argument is introduced by the `ReportingCommand._execute` |
| 128 | + method. ReportingCommand authors cannot influence the contents of the |
| 129 | + command line in this release. |
| 130 | +
|
| 131 | +#References |
| 132 | +
|
| 133 | +1. [Commands.conf.spec](http://docs.splunk.com/Documentation/Splunk/5.0.5/Admin/Commandsconf) |
| 134 | +2. [Search command style guide](http://docs.splunk.com/Documentation/Splunk/6.0/Search/Searchcommandstyleguide) |
| 135 | +
|
| 136 | +""" |
220 | 137 | from __future__ import absolute_import |
221 | 138 |
|
222 | 139 | from .decorators import * |
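
The `__mv_<field-name>` encoding described in design note 8 can be illustrated with a small sketch. The function name `encode_mv` is hypothetical and not part of this module. The handling of single-value lists (returned unwrapped) is one reading of "represented by the single value" and should be treated as an assumption.

```python
def encode_mv(values):
    """Encode a list of strings the way design note 8 describes.

    Hypothetical helper for illustration only; not the module's own code.
    """
    if not values:
        # Empty lists are represented by the empty string.
        return ''
    if len(values) == 1:
        # Assumption: a single-value list is stored as the bare value.
        return values[0]
    # Wrap each value in '$', double any embedded '$', and join with ';'.
    return ';'.join('$' + v.replace('$', '$$') + '$' for v in values)


print(encode_mv(['a', 'b']))      # two plain values
print(encode_mv(['x$y', 'z']))    # embedded dollar sign is doubled
```

Doubling the dollar sign keeps the delimiters unambiguous for a reader that scans the string left to right.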
|
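
To make the ABNF grammar in design note 1 concrete, here is a minimal sketch of a parser built from regular expressions that approximate its rules. The name `parse_command_line` and the three patterns are assumptions made for illustration; they are not the module's actual parser, and `\s` is a slightly broader notion of whitespace than the grammar's `wsp`.

```python
import re

# Patterns transcribed loosely from the grammar (illustrative approximation).
_NAME = re.compile(r'[A-Za-z][A-Za-z0-9]*')                      # command-name
_OPTION = re.compile(
    r'\s+([A-Za-z][A-Za-z0-9_]*)\s*=\s*("(?:\\"|[^"])*"|[^\s"]+)')  # option
_FIELD = re.compile(r'\s+"?([_A-Za-z][A-Za-z0-9_.\-]*)"?')        # field-name


def parse_command_line(line):
    """Split a command line into (command-name, options, field-names)."""
    match = _NAME.match(line)
    if match is None:
        raise ValueError('expected a command name')
    name, pos = match.group(0), match.end()

    options = {}
    while True:
        match = _OPTION.match(line, pos)
        if match is None:
            break
        value = match.group(2)
        if value.startswith('"'):
            # Drop the surrounding quotes and unescape \" sequences.
            value = value[1:-1].replace('\\"', '"')
        options[match.group(1)] = value
        pos = match.end()

    fields = []
    while True:
        match = _FIELD.match(line, pos)
        if match is None:
            break
        fields.append(match.group(1))
        pos = match.end()

    if line[pos:].strip():
        raise ValueError('unexpected text: ' + line[pos:])
    return name, options, fields


print(parse_command_line('sum total=total_date_hour date_hour'))
```

Options are consumed before field names, mirroring the order in the `command` rule.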