Commit e66dbef

Merge branch 'sedlak-clickhouse-schema-helper' into 'master'
ClickHouse schema helper

See merge request monitoring/ipfixcol2!26
2 parents 4fcaf73 + d094dd8 commit e66dbef

12 files changed, +951 -4 lines changed

README.rst

Lines changed: 2 additions & 0 deletions
@@ -60,6 +60,8 @@ network interface and a port. Multiple instances of these plugins can run concur
   it on standard output
 - `Time Check <src/plugins/output/timecheck>`_ - flow timestamp check
 - `Dummy <src/plugins/output/dummy>`_ - simple output module example
+- `Overview <src/plugins/output/overview>`_ - get a quick overview of the IPFIX
+  fields the collector is receiving from the probe
 - `lnfstore <extra_plugins/output/lnfstore>`_ (*) - store all flows in nfdump compatible
   format for long-term preservation
 - `UniRec <extra_plugins/output/unirec>`_ (*) - send flow records in UniRec format

extra_plugins/output/clickhouse/README.rst

Lines changed: 51 additions & 4 deletions
@@ -32,7 +32,7 @@ To build the plugin, IPFIXcol2 (and its header files) must be installed on your
 
 Finally, compile and install the plugin:
 
-.. code-block:: sh
+::
 
     $ mkdir build && cd build && cmake ..
     $ make
@@ -47,6 +47,9 @@ and schema of the table is checked after initiating connection to the database
 and an error is displayed if there is a mismatch. The table is not
 automatically created.
 
+To aid in initial setup, a script `schema-helper.py` is included. For more information,
+see `Schema helper script <#schema-helper>`_.
+
 To run the example configuration below, you can create the ClickHouse table
 using the following SQL query:
 
@@ -307,9 +310,53 @@ performance at the cost of higher memory usage:
 
 .. code-block:: xml
 
-    <inserterThreads>16</inserterThreads>
-    <blocks>128</blocks>
-    <blockInsertThreshold>500000</blockInsertThreshold>
+    <inserterThreads>16</inserterThreads>
+    <blocks>128</blocks>
+    <blockInsertThreshold>500000</blockInsertThreshold>
 
 You can further experiment with the values based on your input characteristics
 and your machine specifications.
+
+Schema helper
+--------------
+
+To aid in initial setup, a script `schema-helper.py` is included. The script
+runs ipfixcol2 for a brief period of time to observe the structure of the flow data
+that is being received, and then generates a ClickHouse schema SQL file and an
+ipfixcol2 config XML file. The generated files can be used as an easier starting
+point than doing everything manually.
+
+::
+
+    usage: schema-helper.py [-h] [-a ADDRESS] [-p PORT] [-i INTERVAL] [-t {tcp,udp}] [-s SCHEMA_FILE] [-c CONFIG_FILE] [-o OVERWRITE]
+
+    optional arguments:
+      -h, --help            show this help message and exit
+      -a ADDRESS, --address ADDRESS
+                            the local IP address, i.e. interface address; empty = all interfaces (default: )
+      -p PORT, --port PORT  the local port (default: 4739)
+      -i INTERVAL, --interval INTERVAL
+                            the collection interval in seconds (default: 60)
+      -t {tcp,udp}, --type {tcp,udp}
+                            the input protocol type (default: udp)
+      -s SCHEMA_FILE, --schema-file SCHEMA_FILE
+                            the output schema file (default: schema.sql)
+      -c CONFIG_FILE, --config-file CONFIG_FILE
+                            the output config file (default: config.xml)
+      -o OVERWRITE, --overwrite OVERWRITE
+                            overwrite the output files without asking if they already exist (default: False)
+
+**Note:**
+If UDP is used, you might need to increase the interval to several
+minutes for the collector to gather and decode enough data. If you are using
+UDP and are getting no results, try running the script with the interval set to 300
+or even 600.
+
+
+**Example usage:**
+
+::
+
+    $ schema-helper.py -i 60 -p 4739 -t tcp
+
+Collect data for 60 seconds, listen on port 4739 using TCP.
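
For readers who want to see how the options listed in the usage text fit together, the following is a minimal sketch of an ``argparse`` parser that accepts the same flags and defaults. It is not taken from ``schema-helper.py`` itself: the function ``build_parser`` and the helper ``str_to_bool`` are hypothetical names, and treating ``-o`` as a textual boolean is an assumption based only on the usage line showing that the flag takes a value.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: map common textual booleans to bool."""
    lowered = value.strip().lower()
    if lowered in ("1", "true", "yes", "y"):
        return True
    if lowered in ("0", "false", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(
        prog="schema-helper.py",
        # Appends "(default: ...)" to each help string, as in the usage text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-a", "--address", default="",
                        help="the local IP address, i.e. interface address; "
                             "empty = all interfaces")
    parser.add_argument("-p", "--port", type=int, default=4739,
                        help="the local port")
    parser.add_argument("-i", "--interval", type=int, default=60,
                        help="the collection interval in seconds")
    parser.add_argument("-t", "--type", choices=["tcp", "udp"], default="udp",
                        help="the input protocol type")
    parser.add_argument("-s", "--schema-file", default="schema.sql",
                        help="the output schema file")
    parser.add_argument("-c", "--config-file", default="config.xml",
                        help="the output config file")
    # Assumption: the value is parsed as a textual boolean.
    parser.add_argument("-o", "--overwrite", type=str_to_bool, default=False,
                        help="overwrite the output files without asking "
                             "if they already exist")
    return parser


# Parse the same arguments as the example invocation above.
args = build_parser().parse_args(["-i", "60", "-p", "4739", "-t", "tcp"])
print(args.interval, args.port, args.type, args.schema_file, args.config_file)
```

Options that are not given on the command line fall back to the documented defaults, so the example invocation still writes to ``schema.sql`` and ``config.xml``.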
