Commit 6a99917

GEOMESA-3549 Docs - Improve converter docs (#3480)
1 parent 662b04a commit 6a99917

40 files changed

+2246
-2094
lines changed

docs/_static/css/theme_custom.css

Lines changed: 4 additions & 0 deletions
@@ -294,6 +294,10 @@ div.admonition.warning, div.admonition.note {
 margin-bottom: 12px;
 }

+.wy-table-responsive table td .line-block:last-of-type {
+margin-bottom: 0;
+}
+
 .wy-table-responsive tr.row-even td {
 background-color: #e7f3fc !important;
 }

docs/user/accumulo/examples.rst

Lines changed: 1 addition & 1 deletion
@@ -238,7 +238,7 @@ The config file needs to have a ``SimpleFeatureType`` defined along with a
 converter that specifies instructions on how to turn the raw data file into
 that simple feature type. See :doc:`/user/convert/index` for a more details
 on converters, including a full list of the transformation functions available
-(:doc:`/user/convert/function_overview`).
+(:doc:`/user/convert/functions`).

 This example uses the ``date()`` function to tell the parser what date column
 is in. The ``stringToDouble()`` and ``::double`` functions give two different

docs/user/cli/export.rst

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,8 @@ Query and Export Commands

 These commands are used to query and export simple features. Required parameters are indicated with a ``*``.

+.. _cli_convert:
+
 ``convert``
 -----------

docs/user/convert/avro.rst

Lines changed: 39 additions & 29 deletions
@@ -3,39 +3,39 @@
 Avro Converter
 ==============

-The Avro converter handles data written by `Apache Avro <https://avro.apache.org/>`__. To use the Avro converter,
-specify ``type = "avro"`` in your converter definition.
+The Avro converter handles data written by `Apache Avro <https://avro.apache.org/>`__.

 Configuration
 -------------

-The Avro converter supports parsing whole Avro files, with the schema embedded, or Avro IPC messages with
-the schema omitted. For an embedded schema, set ``schema = "embedded"`` in your converter definition.
-For IPC messages, specify the schema in one of two ways: to use an inline schema string, set
-``schema = "<schema string>"``; to use a schema defined in a separate file, set ``schema-file = "<path to file>"``.
-
-The Avro record being parsed is available to field transforms as ``$1``.
-
-Avro Paths
-----------
-
-Avro paths are defined similarly to JSONPath or XPath, and allow you to extract specific fields out of an
-Avro record. An Avro path consists of forward-slash delimited strings. Each part of the path defines
-a field name with an optional predicate:
+The Avro converter supports the following configuration keys:

-* ``$type=<typename>`` - match the Avro schema type name on the selected element
-* ``[$<field>=<value>]`` - match elements with a field named "field" and a value equal to "value"
+=============== ======== ======= ==========================================================================================
+Key             Required Type    Description
+=============== ======== ======= ==========================================================================================
+``type``        yes      String  Must be the string ``avro``.
+``schema``      yes      String  The Avro schema used for parsing (may be omitted if using ``schema-file``).
+``schema-file`` yes      String  A pointer to an Avro schema on the classpath (may be omitted if using ``schema``).
+=============== ======== ======= ==========================================================================================

-For example, ``/foo$type=bar/baz[$qux=quux]``. See `Example Usage`, below, for a concrete example.
+``schema``/``schema-file``
+^^^^^^^^^^^^^^^^^^^^^^^^^^

-Avro paths are available through the ``avroPath`` transform function, as described below.
+The Avro converter supports parsing whole Avro files, with the schema embedded, or Avro IPC messages with
+the schema omitted. For an embedded schema, set ``schema = "embedded"`` in your converter definition.
+For IPC messages, specify the schema in one of two ways: to use an inline schema string, set
+``schema = "<schema string>"``; or to use a schema defined in a separate file, set ``schema-file = "<path to file>"``
+(the schema file must be available on the classpath).

 .. _avro_converter_functions:

-Avro Transform Functions
-------------------------
+Transform Functions
+-------------------
+
+The current Avro record being parsed is available to field transforms as ``$1``. The original message bytes are available
+as ``$0``, which may be useful for generating consistent feature IDs.

-GeoMesa defines several Avro-specific transform functions.
+In addition to the standard :ref:`converter_functions`, the Avro converter provides the following Avro-specific functions:

 avroPath
 ^^^^^^^^
@@ -45,7 +45,16 @@ Description: Extract values from nested Avro structures.
 Usage: ``avroPath($ref, $pathString)``

 * ``$ref`` - a reference object (avro root or extracted object)
-* ``pathString`` - forward-slash delimited path strings. See `Avro Paths`, above
+* ``pathString`` - forward-slash delimited path strings
+
+Avro paths are defined similarly to JSONPath or XPath, and allow you to extract specific fields out of an
+Avro record. An Avro path consists of forward-slash delimited strings. Each part of the path defines
+a field name with an optional predicate:
+
+* ``$type=<typename>`` - match the Avro schema type name on the selected element
+* ``[$<field>=<value>]`` - match elements with a field named "field" and a value equal to "value"
+
+For example, ``/foo$type=bar/baz[$qux=quux]``. See the example below for a concrete example.

 avroToJson
 ^^^^^^^^^^
@@ -89,7 +98,7 @@ Usage: ``avroBinaryUuid($ref)``
 Example Usage
 -------------

-For this example we'll use the following Avro schema in a file named ``/tmp/schema.avsc``:
+For this example we'll use the following Avro schema in a classpath file named ``schema.avsc``:

 ::

@@ -98,7 +107,8 @@ For this example we'll use the following Avro schema in a file named ``/tmp/sche
 "type": "record",
 "name": "CompositeMessage",
 "fields": [
-{ "name": "content",
+{
+"name": "content",
 "type": [
 {
 "name": "DataObj",
@@ -126,7 +136,7 @@ For this example we'll use the following Avro schema in a file named ``/tmp/sche
 "fields": [{ "name": "id", "type": "int"}]
 }
 ]
-}
+}
 ]
 }

@@ -135,9 +145,9 @@ which has a nested object which is either of type ``DataObj`` or
 ``OtherObject``. As an exercise, we can use avro tools to generate some
 test data and view it::

-java -jar /tmp/avro-tools-1.7.7.jar random --schema-file /tmp/schema -count 5 /tmp/avro
+java -jar avro-tools-1.11.4.jar random --schema-file schema.avsc -count 5 /tmp/avro

-$ java -jar /tmp/avro-tools-1.7.7.jar tojson /tmp/avro
+$ java -jar /tmp/avro-tools-1.11.4.jar tojson /tmp/avro
 {"content":{"org.locationtech.DataObj":{"kvmap":[{"k":"thhxhumkykubls","v":{"double":0.8793488185997134}},{"k":"mlungpiegrlof","v":{"double":0.45718223406586045}},{"k":"mtslijkjdt","v":null}]}}}
 {"content":{"org.locationtech.OtherObject":{"id":-86025408}}}
 {"content":{"org.locationtech.DataObj":{"kvmap":[]}}}
@@ -186,7 +196,7 @@ The following converter config would be sufficient to parse the Avro::

 {
 type = "avro"
-schema-file = "/tmp/schema.avsc"
+schema-file = "schema.avsc"
 id-field = "uuid()"
 fields = [
 { name = "tobj", transform = "avroPath($1, '/content$type=DataObj')" },
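
Note on the Avro path syntax that this file's diff moves under ``avroPath``: a path is a series of forward-slash delimited parts, each a field name with an optional ``$type=<typename>`` or ``[$<field>=<value>]`` predicate. The following is a minimal Python sketch of how such a string breaks down into parts. It is not GeoMesa's implementation; the names ``PathSegment`` and ``parse_avro_path`` are hypothetical, and the sketch assumes that field names, type names and predicate values never contain ``/``, ``[`` or ``]``.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class PathSegment(NamedTuple):
    """One slash-delimited part of an Avro path (hypothetical helper type)."""
    field: str
    type_name: Optional[str] = None               # from ``$type=<typename>``
    predicate: Optional[Tuple[str, str]] = None   # from ``[$<field>=<value>]``


# field name, then an optional type predicate, then an optional field predicate
_SEGMENT = re.compile(
    r"^(?P<field>[^$\[\]/]+)"
    r"(?:\$type=(?P<type>[^\[\]/]+))?"
    r"(?:\[\$(?P<key>[^=\]]+)=(?P<value>[^\]]*)\])?$"
)


def parse_avro_path(path: str) -> List[PathSegment]:
    """Split a path such as '/foo$type=bar/baz[$qux=quux]' into segments."""
    if not path.startswith("/"):
        raise ValueError("an Avro path is expected to start with '/'")
    segments = []
    for part in path[1:].split("/"):
        match = _SEGMENT.match(part)
        if match is None:
            raise ValueError("unrecognized path segment: %r" % part)
        predicate = None
        if match["key"] is not None:
            predicate = (match["key"], match["value"])
        segments.append(PathSegment(match["field"], match["type"], predicate))
    return segments


if __name__ == "__main__":
    for segment in parse_avro_path("/foo$type=bar/baz[$qux=quux]"):
        print(segment)
```

Under these assumptions, the path used in the converter config above, ``/content$type=DataObj``, parses to a single segment that selects the ``content`` field when its schema type name is ``DataObj``.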

docs/user/convert/avro_schema_registry.rst

Lines changed: 21 additions & 16 deletions
@@ -3,25 +3,31 @@
 Avro Schema Registry Converter
 ==============================

-The Avro Schema Registry converter handles data written by `Apache Avro <https://avro.apache.org/>`__
-using a Confluent Schema Registry. The schema registry is a centralized store of versioned Avro schemas.
-
-To use the Avro converter, specify ``type = "avro-schema-registry"`` in your converter definition.
-
-Note that Confluent requires Avro 1.8 and the Confluent client JARs, which are not bundled with GeoMesa.
+The Avro schema registry converter handles data written by `Apache Avro <https://avro.apache.org/>`__
+using a Confluent schema registry. The schema registry is a centralized store of versioned Avro schemas.

+Note that the schema registry converter requires Confluent client JARs, which are not bundled by default with GeoMesa.

 Configuration
 -------------

-The Avro Schema Registry converter supports parsing Avro data using a Confluent schema registry.
-To configure the schema registry set ``schema-registry = "<URL of schema registry>"`` in your converter definition.
+The Avro schema registry converter supports the following configuration keys:
+
+===================== ======== ======= ==========================================================================================
+Key                   Required Type    Description
+===================== ======== ======= ==========================================================================================
+``type``              yes      String  Must be the string ``avro-schema-registry``.
+``schema-registry``   yes      String  URL of the schema registry.
+===================== ======== ======= ==========================================================================================

-The Avro record being parsed is available to field transforms as ``$1``.
+Transform Functions
+-------------------

-The Avro Schema Registry Converter is an extension of the :ref:`avro_converter`, therefore the :ref:`avro_converter_functions`
-can be used to extract fields out of the parsed Avro record.
+The current Avro record being parsed is available to field transforms as ``$1``. The original message bytes are available
+as ``$0``, which may be useful for generating consistent feature IDs.

+The Avro schema registry converter is an extension of the :ref:`avro_converter`, therefore both the standard
+:ref:`converter_functions` and the :ref:`avro_converter_functions` can be used to extract fields out of the parsed Avro record.

 Example Usage
 -------------
@@ -85,8 +91,8 @@ Here's a sample Avro record encoded using schema version 2: ::
 "extra": "Extra Test Field"
 }

-Let's say we want to convert our Avro records into simple
-features. We notice that between the two schema versions there are 3 attributes:
+Let's say we want to convert our Avro records into simple features. We notice that between the two schema versions there are
+3 attributes:

 - lat
 - lon
@@ -96,10 +102,9 @@ The following converter config would be sufficient to parse the Avro records tha
 using multiple schema version defined in the schema registry::

 {
-type = "avro-schema-registry"
+type = "avro-schema-registry"
 schema-registry = "http://localhost:8080"
-sft = "testsft"
-id-field = "uuid()"
+id-field = "uuid()"
 fields = [
 { name = "lat", transform = "avroPath($1, '/lat')" },
 { name = "lon", transform = "avroPath($1, '/lon')" },
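
Note on the ``$0`` sentence added in this file: the new text says the original message bytes "may be useful for generating consistent feature IDs", while the example config keeps ``id-field = "uuid()"``. The following Python sketch only illustrates the difference between the two approaches. It is not GeoMesa code; ``hashed_feature_id`` and ``random_feature_id`` are hypothetical helper names, and the choice of MD5 as the hash is an assumption made for illustration.

```python
import hashlib
import uuid


def hashed_feature_id(message: bytes) -> str:
    """Derive an ID from the message bytes: the same bytes always give the same ID."""
    return hashlib.md5(message).hexdigest()


def random_feature_id() -> str:
    """Generate a random ID: re-ingesting the same message gives a new ID each time."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    message = b"example avro message bytes"  # stand-in for the raw record bytes
    print(hashed_feature_id(message) == hashed_feature_id(message))  # stable across runs
    print(random_feature_id() == random_feature_id())                # differs on every call
```

An ID derived from the bytes means that ingesting the same message twice yields the same feature ID, whereas a random UUID gives the same record a different ID on every ingest.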
