Commit fed7045

resolve comments
1 parent e750458 commit fed7045

1 file changed: +26 -23 lines changed
presto-docs/src/main/sphinx/connector/clp.rst

Lines changed: 26 additions & 23 deletions
@@ -306,12 +306,13 @@ Each JSON log maps to this unified ``ROW`` type, with absent fields represented
306306
``status``, ``thread_num``, ``backtrace``) become fields within the ``ROW``, clearly reflecting the nested and varying
307307
structures of the original JSON logs.
308308

309+
*************
309310
CLP Functions
310-
-------------
311+
*************
311312

312-
In semi-structured logs, the number of potential keys can grow significantly, resulting in extremely wide Presto tables
313-
with many columns. To manage this complexity, the metadata provider may expose only a subset of the full schema,
314-
typically the static fields or those most relevant to expected queries.
313+
Semi-structured logs can have many potential keys, which can lead to very wide Presto tables. To keep table metadata
314+
concise and still preserve access to dynamic fields, the connector provides two sets of functions that are specific to
315+
the CLP connector. These functions are not part of standard Presto SQL.
315316

316317
To enable access to dynamic or less common fields not present in the exposed schema, CLP provides two sets of functions
317318
to help users query flexible log schemas while keeping the table metadata definition concise. These functions are only
@@ -320,33 +321,33 @@ available in the CLP connector and are not part of standard Presto SQL.
320321
- JSON path functions (e.g., ``CLP_GET_STRING``)
321322
- Wildcard column matching functions for use in filter predicates (e.g., ``CLP_WILDCARD_STRING_COLUMN``)
322323

323-
There is **no performance penalty** for using these functions. During query optimization, they are rewritten into
324-
references to actual schema-backed columns or valid symbols in KQL queries. This avoids additional parsing overhead and
325-
delivers performance comparable to querying standard columns.
324+
There is **no performance penalty** when using these functions. During query optimization, the connector rewrites these
325+
functions to references to concrete schema-backed columns or valid symbols in KQL queries. This avoids additional
326+
parsing overhead and delivers performance comparable to querying standard columns.
326327

327328
Path-Based Functions
328-
^^^^^^^^^^^^^^^^^^^^
329+
====================
329330

330331
.. function:: CLP_GET_STRING(varchar) -> varchar
331332

332-
Returns the string value of the given JSON path, where the column type is one of: ``ClpString``, ``VarString``, or
333+
Returns the string value at the given JSON path, where the column type is one of: ``ClpString``, ``VarString``, or
333334
``DateString``. Returns a Presto ``VARCHAR``.
334335

335336
.. function:: CLP_GET_BIGINT(varchar) -> bigint
336337

337-
Returns the integer value of the given JSON path, where the column type is ``Integer``, Returns a Presto ``BIGINT``.
338+
Returns the integer value at the given JSON path, where the column type is ``Integer``. Returns a Presto ``BIGINT``.
338339

339340
.. function:: CLP_GET_DOUBLE(varchar) -> double
340341

341-
Returns the double value of the given JSON path, where the column type is ``Float``. Returns a Presto ``DOUBLE``.
342+
Returns the double value at the given JSON path, where the column type is ``Float``. Returns a Presto ``DOUBLE``.
342343

343344
.. function:: CLP_GET_BOOL(varchar) -> boolean
344345

345-
Returns the double value of the given JSON path, where the column type is ``Boolean``. Returns a Presto ``BOOLEAN``.
346+
Returns the boolean value at the given JSON path, where the column type is ``Boolean``. Returns a Presto ``BOOLEAN``.
346347

347348
.. function:: CLP_GET_STRING_ARRAY(varchar) -> array(varchar)
348349

349-
Returns the array value of the given JSON path, where the column type is ``UnstructuredArray`` and converts each
350+
Returns the array value at the given JSON path, where the column type is ``UnstructuredArray`` and converts each
350351
element into a string. Returns a Presto ``ARRAY(VARCHAR)``.
351352

352353
.. note::
@@ -355,7 +356,8 @@ Path-Based Functions
355356
- Wildcards (e.g., ``msg.*.ts``) are **not supported**.
356357
- If a path is invalid or missing, the function returns ``NULL`` rather than raising an error.
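
As a rough illustration of how the path-based functions might be combined in one query, the sketch below projects a string field and filters on an integer field. The table name ``clp.default.table_1`` and the JSON paths ``msg.author`` and ``msg.timestamp`` are hypothetical placeholders, not names taken from this commit.

.. code-block:: sql

    -- Hypothetical table and JSON paths, for illustration only.
    SELECT CLP_GET_STRING('msg.author') AS author
    FROM clp.default.table_1
    WHERE CLP_GET_BIGINT('msg.timestamp') > 1620000000;

Because a missing path yields ``NULL`` rather than an error, records without ``msg.timestamp`` would be filtered out by the comparison instead of failing the query.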
357358

358-
Examples:
359+
Examples
360+
--------
359361

360362
.. code-block:: sql
361363
@@ -369,33 +371,33 @@ Examples:
369371
370372
371373
Wildcard Column Functions
372-
^^^^^^^^^^^^^^^^^^^^^^^^^
374+
=========================
373375

374376
These functions are used to apply filter predicates across all columns of a certain type. They are useful for searching
375377
across unknown or dynamic schemas without specifying exact column names. Similar to the path-based functions, these
376378
functions are rewritten during query optimization to a KQL query that matches the appropriate columns.
377379

378380
.. function:: CLP_WILDCARD_STRING_COLUMN() -> varchar
379381

380-
Represents all columns of CLP types: ``ClpString``, ``VarString``, and ``DateString``.
382+
Represents all columns whose CLP types are ``ClpString``, ``VarString``, or ``DateString``.
381383

382384
.. function:: CLP_WILDCARD_INT_COLUMN() -> bigint
383385

384-
Represents all columns of CLP type: ``Integer``.
386+
Represents all columns whose CLP type is ``Integer``.
385387

386388
.. function:: CLP_WILDCARD_FLOAT_COLUMN() -> double
387389

388-
Represents all columns of CLP type: ``Float``.
390+
Represents all columns whose CLP type is ``Float``.
389391

390392
.. function:: CLP_WILDCARD_BOOL_COLUMN() -> boolean
391393

392-
Represents all columns of CLP type: ``Boolean``.
394+
Represents all columns whose CLP type is ``Boolean``.
393395

394396
.. note::
395397

396-
- They must appear **only in filter conditions** (`WHERE` clause). They cannot be selected or passed as arguments
397-
to other functions.
398-
- Supported operators includes:
398+
- Wildcard functions must appear **only in filter conditions** (``WHERE`` clause). They cannot be selected and cannot
399+
be passed as arguments to other functions.
400+
- Supported operators include:
399401

400402
::
401403

@@ -412,7 +414,8 @@ functions are rewritten during query optimization to a KQL query that matches th
412414
Use of other operators (e.g., arithmetic or function calls) with wildcard functions is not allowed and will result
413415
in a query error.
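
A minimal sketch of a wildcard filter follows. It assumes that ``=`` is among the supported operators (the operator list is not visible in this hunk), and it uses a hypothetical table name and search value.

.. code-block:: sql

    -- Hypothetical table and value, for illustration only.
    -- The wildcard function stands in for the string-typed columns
    -- (ClpString, VarString, DateString) and is used only in WHERE.
    SELECT *
    FROM clp.default.table_1
    WHERE CLP_WILDCARD_STRING_COLUMN() = 'Beijing';

The wildcard function appears only in the filter condition here. Selecting it or passing it to another function would violate the restriction stated in the note.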
414416

415-
Examples:
417+
Examples
418+
--------
416419

417420
.. code-block:: sql
418421
