Commit e752a65

author
David Noble
committed
Updated documentation and create Jira issues for all TODOs
1 parent a7c3c0f commit e752a65

2 files changed

+136
-211
lines changed

splunklib/searchcommands/__init__.py

Lines changed: 100 additions & 183 deletions
@@ -15,208 +15,125 @@
 
 """ Splunk search command library
 
-#Design Notes
+#Design Notes
 
-1. Command lines are constrained to this grammar (expressed informally):
+1. Command lines are constrained to this ABNF grammar::
 
-command-line = `*<command>* [*<option>***=**<value>**]… [*<field>*]…`
-command = `\w+`
-option = `[_a-zA-Z][_a-zA-Z0-9]+`
-value = `([^\s"]+|"(?:[^"]+|""|\\")*")`
-field = `"[_a-zA-Z][._a-zA-Z0-9-]+|"[_a-zA-Z][._a-zA-Z0-9-]"`
+command = command-name *[wsp option] *[wsp [dquote] field-name [dquote]]
+command-name = alpha *( alpha / digit )
+option = option-name [wsp] "=" [wsp] option-value
+option-name = alpha *( alpha / digit / "_" )
+option-value = word / quoted-string
+word = 1*( %01-%08 / %0B / %0C / %0E-1F / %21 / %23-%FF ) ; Any character but DQUOTE and WSP
+quoted-string = dquote *( word / wsp / "\" dquote ) dquote
+field-name = ( "_" / alpha ) *( alpha / digit / "_" / "." / "-" )
 
-Note that this grammar does not indicate that *<field>* values may be
-comma-separated. This is because Splunk strips commas from the command
-line. You never see them.
+**Note:**
 
-2. Commands support dynamic probing for settings.
-Splunk probes for settings when `supports_getinfo=true`.
+This grammar is constrained to an 8-bit character set.
 
-3. Commands do not support static probing for settings.
-This class expects that commands are statically configured as follows:
+**Note:**
 
-```
-[*<command>*]
-filename = *<command>*.py
+This grammar does not show that `field-name` values may be comma-separated
+when in fact they may be. This is because Splunk strips commas from the
+command line. A custom search command will never see them.
+
+2. Commands support dynamic probing for settings.
+
+Splunk probes for settings dynamically when `supports_getinfo=true`.
+
+3. Commands do not support static probing for settings.
+
+This module expects that commands are statically configured as follows::
+
+[<command-name>]
+filename = <command-name>.py
 supports_getinfo = true
-```
-
-No other static configuration is required or expected and may interfere
-with command execution.
-
-4. Commands do not support parsed arguments on the command line.
-Splunk parses arguments when `supports_rawargs=false`. This class sets
-`supports_rawargs=true` unconditionally.
-
-**Rationale:*
-Splunk parses arguments by stripping quotes, nothing more. This
-may be useful in some cases, but doesn't work well with our chosen
-grammar.
-
-5. Commands consume input headers.
-An input header is provided by Splunk when `enableheader=true`. This
-class sets this value unconditionally.
-
-6. Commands produce an output messages header.
-Splunk expects a command to produce an output messages header when
-`outputheader=true`. This class sets this value unconditionally.
-
-7. Commands support multi-value fields.
-Multi-value fields are provided and consumed by Splunk when
-`supports_multivalue=true`. This class sets this value unconditionally.
-
-8. Commands represent all fields on the output stream as multi-value
-fields.
-Splunk represents multi-value fields with a pair of fields:
-
-+ `<field-name>`
-Contains the text from which the multi-value field was derived.
-
-+ `__mv_<field-name>`
-Contains an encoded list. Values in the list are wrapped in dollar
-signs ('$') and separated by semi-colons (';). Dollar signs ('$')
-within a value are represented by a pair of dollar signs ('$$').
-Empty lists are represented by the empty string. Single-value lists
-are represented by the single value.
-
-On input this class processes and hides all **__mv_** fields. On output
-this class produces backing **__mv_** fields for all fields, thereby
-enabling a command to reduce its memory footprint by using streaming
-I/O. This is done at the cost of one extra byte of data per field per
-record on the output stream and extra processing time by the next
-processor in the pipeline.
-
-9. A ReportingCommand must implement both its map (a.k.a, streaming preop)
-and reduce (a.k.a., reporting) operations. Map/reduce command lines are
-distinguished as exemplified below:
-
-**Command:**
-```
-...| sum total=total_date_hour date_hour
-```
 
-**Reduce command line:**
-```
-sum __GETINFO__ total=total_date_hour date_hour
-sum __EXECUTE__ total=total_date_hour date_hour
-```
+No other static configuration is required or expected and may interfere
+with command execution.
 
-**Map command line:**
-```
-sum __GETINFO__ __map__ total=total_date_hour date_hour
-sum __EXECUTE__ __map__ total=total_date_hour date_hour
-```
+4. Commands do not support parsed arguments on the command line.
 
-The `__map__`` argument is introduced by the `ReportingCommand._execute`
-method. ReportingCommand authors cannot influence the contents of the
-command line in this release.
+Splunk parses arguments when `supports_rawargs=false`. This ``SearchCommand``
+class sets this value unconditionally. You cannot override it.
 
-#References
+**Rationale:**
 
-1. [Commands.conf.spec](http://docs.splunk.com/Documentation/Splunk/5.0.5/Admin/Commandsconf)
-2. [Search command style guide](http://docs.splunk.com/Documentation/Splunk/6.0/Search/Searchcommandstyleguide)
+Splunk parses arguments by stripping quotes, nothing more. This may be useful
+in some cases, but doesn't work well with our chosen grammar.
 
-#Implementation notes
+5. Commands consume input headers.
 
-1. `# BFR` comments denote issues that should either be resolved before
-formal review or turned into TODO comments.
+An input header is provided by Splunk when `enableheader=true`. The
+``SearchCommand`` class sets this value unconditionally. You cannot override
+it.
 
-2. `# TODO` comments denote issues that should be eliminated or reported as
-issues to be addressed in a later draft following formal review.
+6. Commands produce an output messages header.
 
-"""
+Splunk expects a command to produce an output messages header when
+`outputheader=true`. The ``SearchCommand`` class sets this value
+unconditionally. You cannot override it.
+
+7. Commands support multi-value fields.
+
+Multi-value fields are provided and consumed by Splunk when
+`supports_multivalue=true`. The ``SearchCommand`` class sets this value
+unconditionally. You cannot override it.
+
+8. Commands represent all fields on the output stream as multi-value fields.
+
+Splunk represents multi-value fields with a pair of fields:
+
++ `<field-name>`
+
+Contains the text from which the multi-value field was derived.
+
++ `__mv_<field-name>`
 
-# TODO: .
-# For examples inside the Python example files, we should do an entire search
-# string, not just the specific invocation.
-
-# TODO: .
-# Is changing commands.conf [default] a good or bad idea? Follow-up with Anirban
-# on this.
-
-# TODO: .
-# Meet the bar for a __repr__ implementation: format value as a Python
-# expression, if you can provide an exact representation. We have more than
-# one __repr__ implementation. Ensure they meet the bar.
-
-# TODO: .
-# Doc comment sweep
-
-# TODO: .
-# Q&A
-# Does Splunk redundantly store the value of a single-value list in the
-# shadowing **__mv_** field?
-# Does Splunk gives us an input header on __GETINFO__?
-# What does it really mean for a generating command to be streamable?
-
-# TODO: .csv
-# Optimize because there's too much data copying, especially in writerows.
-# Consider replacing csv.DictReader/Writer with csv.reader/writer. We're not
-# getting much use out of the higher level variants.
-
-# TODO: .csv, .validators
-# Data conversion from Splunk data types to Python data types:
-# + Enumerate the set of data types native to Splunk
-# + Map them to Python data types
-# + Ensure input/output conversions (we do it for bool and list types today)
-
-# TODO: .logging
-# Logging configuration files should be loaded once and only once. Does the
-# Python logging system ensure this? Is it possible for us to check so that we
-# can skip some of the work of logging.configure on repeated calls?
-
-# TODO: .search_command.SearchCommand.ConfigurationSettingsType
-# Configuration matrix (an Excel spreadsheet) for each of the three command
-# types: Generating, reporting, and streaming.
-
-# TODO: .search_command_internals.SearchCommandParser
-# Eliminate field name checking or make it an option for search command
-# developers. At present, we're not using it because we can't distinguish
-# between fields that must be present in the input stream (we use this list
-# to validate fields at present), fields that are created by the command, or
-# fields that are unused by parts of a command (e.g., ReportingCommand.map
-# may use and/or create some fields and ReportingCommand.reduce may use and/or
-# create others.)
-
-# TODO: .search_command_internals.SearchCommandParser
-# Consider an alternative to raising one error at a time. It would be nice to
-# get all ValueErrors (e.g., illegal values, missing options,...) in one shot.
-
-# TODO: .search_command_internals.SearchCommandParser
-# Finish BNF and ensure that regular expressions agree with it. One known point
-# of departure: regular expressions and <name>, in the BNF that's presented in
-# the source.
-
-# TODO: .search_command_internals.ConfigurationSettingsType
-# Validate setting values. Today we verify that settings provided by a search
-# command are settable (Unmanaged settings are not settable; managed settings
-# are not. Managed settings are those without a backing class field. There are
-# two types of managed settings: fixed and computed. Managed settings include
-# computed: required_fields, streaming_preop and fixed: enableheaders,...)
-
-# TODO: .search_command_internals.InputHeaders
-# Consider providing access to the contents of the file located at
-# `self.input_headers['infoPath']`. This header is provided by Splunk when
-# `requires_srinfo = True`
-
-# TODO: .search_command_internals.MessagesHeader
-# Consider improving the interface and replacing its data structure borrowed
-# from Intersplunk. The data structure is unsatisfying in that it doesn't retain
-# the full temporal order of messages. For example, you can see the order in
-# which `info_message` level messages arrive, but you cannot see how they
-# interleaved with `warn_message` and `error_message` level messages.
-
-# TODO: .reporting_command.ReportingCommand.ConfigurationSettings
-# Unless a ReportingCommand overrides the map method these settings should be
-# fixed:
-# + requires_preop = False
-# + streaming_preop = ''
-# Pay special attention to ReportingCommand.ConfigurationSettings.fix_up
-
-# TODO: .validators.Boolean
-# Consider using the Splunk normalizeBoolean function
+Contains an encoded list. Values in the list are wrapped in dollar
+signs ('$') and separated by semi-colons (';). Dollar signs ('$')
+within a value are represented by a pair of dollar signs ('$$').
+Empty lists are represented by the empty string. Single-value lists
+are represented by the single value.
 
+On input this class processes and hides all **__mv_** fields. On output
+this class produces backing **__mv_** fields for all fields, thereby
+enabling a command to reduce its memory footprint by using streaming
+I/O. This is done at the cost of one extra byte of data per field per
+record on the output stream and extra processing time by the next
+processor in the pipeline.
 
+9. A ReportingCommand may implement a `map` method (a.k.a, a streaming preop)
+and must implement a `reduce` operation (a.k.a., a reporting operation).
+
+Map/reduce command lines are distinguished by this module as exemplified
+here:
+
+**Command**::
+
+...| sum total=total_date_hour date_hour
+
+**Reduce command line**::
+
+sum __GETINFO__ total=total_date_hour date_hour
+sum __EXECUTE__ total=total_date_hour date_hour
+
+**Map command line**::
+
+sum __GETINFO__ __map__ total=total_date_hour date_hour
+sum __EXECUTE__ __map__ total=total_date_hour date_hour
+
+The `__map__` argument is introduced by the `ReportingCommand._execute`
+method. ReportingCommand authors cannot influence the contents of the
+command line in this release.
+
+#References
+
+1. [Commands.conf.spec](http://docs.splunk.com/Documentation/Splunk/5.0.5/Admin/Commandsconf)
+2. [Search command style guide](http://docs.splunk.com/Documentation/Splunk/6.0/Search/Searchcommandstyleguide)
+
+"""
 from __future__ import absolute_import
 
 from .decorators import *
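
Design note 8 in the docstring above describes how the `__mv_<field-name>` value is encoded. The following is a minimal sketch of that encoding as the docstring words it, not the library's implementation. The name `encode_mv` is hypothetical, and returning a single value unwrapped is only one reading of "Single-value lists are represented by the single value".

```python
def encode_mv(values):
    """Encode a list of values per design note 8 (hypothetical helper, not part of splunklib)."""
    items = [str(value) for value in values]
    if not items:
        # Empty lists are represented by the empty string.
        return ""
    if len(items) == 1:
        # Assumption: a single-value list is emitted as the bare value.
        return items[0]
    # Wrap each value in '$', double any embedded '$', and join with ';'.
    return ";".join("$" + item.replace("$", "$$") + "$" for item in items)


if __name__ == "__main__":
    print(encode_mv(["a", "b$c"]))
```

Doubling the dollar sign keeps an embedded `$` from being mistaken for a value delimiter.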

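
Design note 9 distinguishes map and reduce command lines by the `__map__` argument that follows `__GETINFO__` or `__EXECUTE__`. A rough sketch of that distinction, assuming the command line has already been split into a list of arguments, might look like this. The function `classify_command_line` is hypothetical and is not taken from the module.

```python
def classify_command_line(argv):
    """Return (action, phase, remaining_args) for an argument list (hypothetical helper)."""
    if len(argv) < 2 or argv[1] not in ("__GETINFO__", "__EXECUTE__"):
        raise ValueError("expected __GETINFO__ or __EXECUTE__ as the second argument")
    action = argv[1]
    # A '__map__' marker right after the action selects the map phase.
    if len(argv) > 2 and argv[2] == "__map__":
        return action, "map", argv[3:]
    # Without the marker, the command line addresses the reduce phase.
    return action, "reduce", argv[2:]


if __name__ == "__main__":
    line = "sum __EXECUTE__ __map__ total=total_date_hour date_hour"
    print(classify_command_line(line.split()))
```

The sketch splits on whitespace only for brevity; quoted option values would need a tokenizer that follows the ABNF grammar in design note 1.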