Commit 055030a

update rex limitations
Signed-off-by: Ritvi Bhatt <[email protected]>
1 parent 657f38e commit 055030a

1 file changed: +16 -16 lines changed

docs/user/ppl/cmd/rex.rst

Lines changed: 16 additions & 16 deletions
@@ -20,13 +20,13 @@ rex [mode=<mode>] field=<field> <pattern> [max_match=<int>] [offset_field=<strin
 * field: mandatory. The field must be a string field to extract data from.
 * pattern: mandatory string. The regular expression pattern with named capture groups used to extract new fields. Pattern must contain at least one named capture group using ``(?<name>pattern)`` syntax.
 * mode: optional. Either ``extract`` or ``sed``. **Default:** extract
-* **extract mode** (default): Creates new fields from regular expression named capture groups. This is the standard field extraction behavior.
-* **sed mode**: Performs text substitution on the field using sed-style patterns:
-* ``s/pattern/replacement/`` - Replace first occurrence
-* ``s/pattern/replacement/g`` - Replace all occurrences (global)
-* ``s/pattern/replacement/n`` - Replace only the nth occurrence (where n is a number)
-* ``y/from_chars/to_chars/`` - Character-by-character transliteration
-* Backreferences: ``\1``, ``\2``, etc. reference captured groups in replacement
+* **extract mode** (default): Creates new fields from regular expression named capture groups. This is the standard field extraction behavior.
+* **sed mode**: Performs text substitution on the field using sed-style patterns:
+* ``s/pattern/replacement/`` - Replace first occurrence
+* ``s/pattern/replacement/g`` - Replace all occurrences (global)
+* ``s/pattern/replacement/n`` - Replace only the nth occurrence (where n is a number)
+* ``y/from_chars/to_chars/`` - Character-by-character transliteration
+* Backreferences: ``\1``, ``\2``, etc. reference captured groups in replacement
 
 * max_match: optional integer (default=1). Maximum number of matches to extract. If greater than 1, extracted fields become arrays. The value 0 means unlimited matches, but is automatically capped to the configured limit (default: 10, configurable via ``plugins.ppl.rex.max_match.limit``).
 * offset_field: optional string. Field name to store the character offset positions of matches. Only available in extract mode.
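
Reviewer note: the sed-mode forms documented in this hunk can be approximated outside PPL to check what each one is expected to do. The Python sketch below is illustrative only and is not the plugin's implementation. `sed_substitute` and `sed_transliterate` are hypothetical helper names, and Python's `re` dialect differs in places from the Java regex engine that the rex documentation points to.

```python
import re


def sed_substitute(text, pattern, replacement, flag=""):
    """Approximate s/pattern/replacement/<flag> on one string.

    flag ""  -> replace the first occurrence
    flag "g" -> replace all occurrences
    flag "n" -> replace only the nth occurrence (n is a positive integer)
    """
    if flag == "":
        return re.sub(pattern, replacement, text, count=1)
    if flag == "g":
        return re.sub(pattern, replacement, text)

    nth = int(flag)  # raises ValueError for anything that is not a number
    if nth < 1:
        raise ValueError("occurrence number must be at least 1")

    seen = 0

    def replace_nth(match):
        nonlocal seen
        seen += 1
        # expand() resolves backreferences such as \1 in the replacement
        return match.expand(replacement) if seen == nth else match.group(0)

    return re.sub(pattern, replace_nth, text)


def sed_transliterate(text, from_chars, to_chars):
    """Approximate y/from_chars/to_chars/ as a one-to-one character mapping."""
    if len(from_chars) != len(to_chars):
        raise ValueError("from_chars and to_chars must have the same length")
    return text.translate(str.maketrans(from_chars, to_chars))
```

For example, `sed_substitute("a-b-c", "-", "+", "2")` touches only the second hyphen, and `sed_substitute("john smith", r"(\w+) (\w+)", r"\2 \1")` swaps the two captured words.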
@@ -217,17 +217,17 @@ Limitations
 ===========
 **Named Capture Group Naming:**
 
-- Group names must start with a letter and contain only letters and digits
-- For detailed Java regex pattern syntax and usage, refer to the `official Java Pattern documentation <https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html>`_
+* Group names must start with a letter and contain only letters and digits
+* For detailed Java regex pattern syntax and usage, refer to the `official Java Pattern documentation <https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html>`_
 
 **Pattern Requirements:**
 
-- Pattern must contain at least one named capture group
-- Regular capture groups ``(...)`` without names are not allowed
+* Pattern must contain at least one named capture group
+* Regular capture groups ``(...)`` without names are not allowed
 
 **Max Match Limit:**
-
-- The ``max_match`` parameter is subject to a configurable system limit to prevent memory exhaustion
-- When ``max_match=0`` (unlimited) is specified, it is automatically capped at the configured limit (default: 10)
-- User-specified values exceeding the configured limit will result in an error
-- Users can adjust the limit via the ``plugins.ppl.rex.max_match.limit`` cluster setting. Setting this limit to a large value is not recommended as it can lead to excessive memory consumption, especially with patterns that match empty strings (e.g., ``\d*``, ``\w*``)
+
+* The ``max_match`` parameter is subject to a configurable system limit to prevent memory exhaustion
+* When ``max_match=0`` (unlimited) is specified, it is automatically capped at the configured limit (default: 10)
+* User-specified values exceeding the configured limit will result in an error
+* Users can adjust the limit via the ``plugins.ppl.rex.max_match.limit`` cluster setting. Setting this limit to a large value is not recommended as it can lead to excessive memory consumption, especially with patterns that match empty strings (e.g., ``\d*``, ``\w*``)
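
Reviewer note: the naming rule and the ``max_match`` limit described in the Limitations section can be restated as a small Python sketch. This is a reading of the documented behavior, not the plugin's code. `is_valid_group_name` and `effective_max_match` are hypothetical names, "letters" is assumed to mean ASCII letters (as in Java named groups), and rejecting negative values is an assumption because the documentation does not cover that case.

```python
import re

# Documented default of the plugins.ppl.rex.max_match.limit cluster setting.
DEFAULT_MAX_MATCH_LIMIT = 10

# A letter first, then only letters and digits (ASCII assumed).
GROUP_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_group_name(name):
    """Return True if name satisfies the documented group naming rule."""
    return GROUP_NAME.fullmatch(name) is not None


def effective_max_match(requested=1, limit=DEFAULT_MAX_MATCH_LIMIT):
    """Resolve the max_match value that would actually apply."""
    if requested < 0:
        # Assumption: negative values are not described in the docs.
        raise ValueError("max_match must not be negative")
    if requested == 0:
        # 0 means "unlimited", which is capped at the configured limit.
        return limit
    if requested > limit:
        # Explicit values above the limit are rejected, not silently capped.
        raise ValueError(f"max_match={requested} exceeds the limit of {limit}")
    return requested
```

The asymmetry is the point worth noticing when reading the docs: ``max_match=0`` is quietly capped, while an explicit value above the limit is an error.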
