33Avro Converter
44==============
55
6- The Avro converter handles data written by `Apache Avro <https://avro.apache.org/>`__. To use the Avro converter,
7- specify ``type = "avro"`` in your converter definition.
6+ The Avro converter handles data written by `Apache Avro <https://avro.apache.org/>`__.
87
98Configuration
109-------------
1110
12- The Avro converter supports parsing whole Avro files, with the schema embedded, or Avro IPC messages with
13- the schema omitted. For an embedded schema, set ``schema = "embedded"`` in your converter definition.
14- For IPC messages, specify the schema in one of two ways: to use an inline schema string, set
15- ``schema = "<schema string>"``; to use a schema defined in a separate file, set ``schema-file = "<path to file>"``.
16-
17- The Avro record being parsed is available to field transforms as ``$1``.
18-
19- Avro Paths
20- ----------
21-
22- Avro paths are defined similarly to JSONPath or XPath, and allow you to extract specific fields out of an
23- Avro record. An Avro path consists of forward-slash delimited strings. Each part of the path defines
24- a field name with an optional predicate:
11+ The Avro converter supports the following configuration keys:
2512
26- * ``$type=<typename>`` - match the Avro schema type name on the selected element
27- * ``[$<field>=<value>]`` - match elements with a field named "field" and a value equal to "value"
13+ =============== ======== ======= ==========================================================================================
14+ Key             Required Type    Description
15+ =============== ======== ======= ==========================================================================================
16+ ``type``        yes      String  Must be the string ``avro``.
17+ ``schema``      yes      String  The Avro schema used for parsing (may be omitted if using ``schema-file``).
18+ ``schema-file`` yes      String  A pointer to an Avro schema on the classpath (may be omitted if using ``schema``).
19+ =============== ======== ======= ==========================================================================================
2820
29- For example, ``/foo$type=bar/baz[$qux=quux]``. See `Example Usage`, below, for a concrete example.
21+ ``schema``/``schema-file``
22+ ^^^^^^^^^^^^^^^^^^^^^^^^^^
3023
31- Avro paths are available through the ``avroPath`` transform function, as described below.
24+ The Avro converter supports parsing whole Avro files, with the schema embedded, or Avro IPC messages with
25+ the schema omitted. For an embedded schema, set ``schema = "embedded"`` in your converter definition.
26+ For IPC messages, specify the schema in one of two ways: to use an inline schema string, set
27+ ``schema = "<schema string>"``; or to use a schema defined in a separate file, set ``schema-file = "<path to file>"``
28+ (the schema file must be available on the classpath).
3229
3330.. _avro_converter_functions:
3431
35- Avro Transform Functions
36- ------------------------
32+ Transform Functions
33+ -------------------
34+
35+ The current Avro record being parsed is available to field transforms as ``$1``. The original message bytes are available
36+ as ``$0``, which may be useful for generating consistent feature IDs.
3737
38- GeoMesa defines several Avro-specific transform functions.
38+ In addition to the standard :ref:`converter_functions`, the Avro converter provides the following Avro-specific functions:
3939
4040avroPath
4141^^^^^^^^
@@ -45,7 +45,16 @@ Description: Extract values from nested Avro structures.
4545Usage: ``avroPath($ref, $pathString)``
4646
4747* ``$ref`` - a reference object (avro root or extracted object)
48- * ``pathString`` - forward-slash delimited path strings. See `Avro Paths`, above
48+ * ``pathString`` - forward-slash delimited path strings
49+
50+ Avro paths are defined similarly to JSONPath or XPath, and allow you to extract specific fields out of an
51+ Avro record. An Avro path consists of forward-slash delimited strings. Each part of the path defines
52+ a field name with an optional predicate:
53+
54+ * ``$type=<typename>`` - match the Avro schema type name on the selected element
55+ * ``[$<field>=<value>]`` - match elements with a field named "field" and a value equal to "value"
56+
57+ For example, ``/foo$type=bar/baz[$qux=quux]``. See the example below for a concrete example.
4958
5059avroToJson
5160^^^^^^^^^^
@@ -89,7 +98,7 @@ Usage: ``avroBinaryUuid($ref)``
8998Example Usage
9099-------------
91100
92- For this example we'll use the following Avro schema in a file named ``/tmp/schema.avsc``:
101+ For this example we'll use the following Avro schema in a classpath file named ``schema.avsc``:
93102
94103::
95104
@@ -98,7 +107,8 @@ For this example we'll use the following Avro schema in a file named ``/tmp/sche
98107 "type": "record",
99108 "name": "CompositeMessage",
100109 "fields": [
101- { "name": "content",
110+ {
111+ "name": "content",
102112 "type": [
103113 {
104114 "name": "DataObj",
@@ -126,7 +136,7 @@ For this example we'll use the following Avro schema in a file named ``/tmp/sche
126136 "fields": [{ "name": "id", "type": "int"}]
127137 }
128138 ]
129- }
139+ }
130140 ]
131141 }
132142
@@ -135,9 +145,9 @@ which has a nested object which is either of type ``DataObj`` or
135145``OtherObject``. As an exercise, we can use avro tools to generate some
136146test data and view it::
137147
138- java -jar /tmp/avro-tools-1.7.7.jar random --schema-file /tmp/schema -count 5 /tmp/avro
148+ java -jar avro-tools-1.11.4.jar random --schema-file schema.avsc -count 5 /tmp/avro
139149
140- $ java -jar /tmp/avro-tools-1.7.7.jar tojson /tmp/avro
150+ $ java -jar /tmp/avro-tools-1.11.4.jar tojson /tmp/avro
141151 {"content":{"org.locationtech.DataObj":{"kvmap":[{"k":"thhxhumkykubls","v":{"double":0.8793488185997134}},{"k":"mlungpiegrlof","v":{"double":0.45718223406586045}},{"k":"mtslijkjdt","v":null}]}}}
142152 {"content":{"org.locationtech.OtherObject":{"id":-86025408}}}
143153 {"content":{"org.locationtech.DataObj":{"kvmap":[]}}}
@@ -186,7 +196,7 @@ The following converter config would be sufficient to parse the Avro::
186196
187197 {
188198 type = "avro"
189- schema-file = "/tmp/schema.avsc"
199+ schema-file = "schema.avsc"
190200 id-field = "uuid()"
191201 fields = [
192202 { name = "tobj", transform = "avroPath($1, '/content$type=DataObj')" },
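As a reading aid for the Avro path syntax described under ``avroPath`` above, the following is a minimal Python sketch of how a path such as ``/foo$type=bar/baz[$qux=quux]`` breaks down into segments. It is not GeoMesa's implementation. The names ``PathSegment`` and ``parse_avro_path`` are hypothetical, and the sketch assumes that field names, type names, and predicate values contain no ``/``, ``$``, or bracket characters.

```python
import re
from typing import List, NamedTuple, Optional


class PathSegment(NamedTuple):
    """One slash-delimited part of an Avro path (hypothetical helper type)."""
    field: str
    type_name: Optional[str] = None    # from a ``$type=<typename>`` predicate
    match_field: Optional[str] = None  # from a ``[$<field>=<value>]`` predicate
    match_value: Optional[str] = None


# A field name, optionally followed by a type predicate and/or a field predicate.
# Assumption: names and values contain none of the characters / $ [ ]
_SEGMENT = re.compile(
    r"^(?P<field>[^$\[\]/]+)"
    r"(?:\$type=(?P<type>[^$\[\]/]+))?"
    r"(?:\[\$(?P<mfield>[^=\]]+)=(?P<mvalue>[^\]]*)\])?$"
)


def parse_avro_path(path: str) -> List[PathSegment]:
    """Split an Avro path string into its segments and predicates."""
    if not path.startswith("/"):
        raise ValueError("Avro path must start with '/': %r" % path)
    segments = []
    for part in path.strip("/").split("/"):
        match = _SEGMENT.match(part)
        if match is None:
            raise ValueError("Unparseable path segment: %r" % part)
        segments.append(
            PathSegment(match["field"], match["type"], match["mfield"], match["mvalue"])
        )
    return segments


if __name__ == "__main__":
    for segment in parse_avro_path("/foo$type=bar/baz[$qux=quux]"):
        print(segment)
```

Read this way, the ``/content$type=DataObj`` path used in the converter config selects the ``content`` field only when its value has the Avro schema type name ``DataObj``.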