| 1 | +clickhouse (output plugin) |
| 2 | +=========================== |
| 3 | + |
| 4 | +The plugin converts IPFIX flow records and stores them in a Clickhouse |
| 5 | +database. |
| 6 | + |
| 7 | +How to build |
| 8 | +------------ |
| 9 | + |
| 10 | +By default, the plugin is not distributed with IPFIXcol2 due to extra dependencies. |
| 11 | +To build the plugin, IPFIXcol2 (and its header files) must be installed on your system. |
| 12 | + |
| 13 | +Then compile and install the plugin: |
| 14 | + |
| 15 | +.. code-block:: sh |
| 16 | + |
| 17 | + $ mkdir build && cd build && cmake .. |
| 18 | + $ make |
| 19 | + # make install |
| 20 | +
|
| 21 | +Usage |
| 22 | +------ |
| 23 | + |
| 24 | +The plugin expects the Clickhouse database to already contain a table whose |
| 25 | +schema corresponds to the configured columns. The existence and schema of |
| 26 | +the table are checked after the connection to the database is established, |
| 27 | +and an error is displayed if there is a mismatch. The table is not |
| 28 | +automatically created. |
| 29 | + |
| 30 | +To run the example configuration below, you can create the Clickhouse table |
| 31 | +using the following SQL query: |
| 32 | + |
| 33 | +.. code-block:: sql |
| 34 | + |
| 35 | + CREATE TABLE ipfixcol2.flows ( |
| 36 | + odid UInt32, |
| 37 | + srcip IPv6, |
| 38 | + dstip IPv6, |
| 39 | + flowstart DateTime64(9), |
| 40 | + flowend DateTime64(9), |
| 41 | + srcport UInt16, |
| 42 | + dstport UInt16, |
| 43 | + protocolIdentifier UInt8, |
| 44 | + octetDeltaCount UInt64, |
| 45 | + packetDeltaCount UInt64, |
| 46 | + INDEX srcipindex srcip TYPE bloom_filter GRANULARITY 16, |
| 47 | + INDEX dstipindex dstip TYPE bloom_filter GRANULARITY 16 |
| 48 | + ) |
| 49 | + ENGINE = MergeTree |
| 50 | + PARTITION BY toStartOfInterval(flowstart, INTERVAL 1 HOUR) |
| 51 | + ORDER BY flowstart |
| 52 | + |
| 53 | +IPFIX element types are expected to be stored in the following Clickhouse column types: |
| 54 | + |
| 55 | +.. list-table:: Mapping of IPFIX types to Clickhouse column types |
| 56 | + |
| 57 | + * - **IPFIX Abstract Data Type** |
| 58 | + - **Clickhouse Column Type** |
| 59 | + * - unsigned8 |
| 60 | + - UInt8 |
| 61 | + * - unsigned16 |
| 62 | + - UInt16 |
| 63 | + * - unsigned32 |
| 64 | + - UInt32 |
| 65 | + * - unsigned64 |
| 66 | + - UInt64 |
| 67 | + * - signed8 |
| 68 | + - Int8 |
| 69 | + * - signed16 |
| 70 | + - Int16 |
| 71 | + * - signed32 |
| 72 | + - Int32 |
| 73 | + * - signed64 |
| 74 | + - Int64 |
| 75 | + * - ipv4Address |
| 76 | + - IPv4 |
| 77 | + * - ipv6Address |
| 78 | + - IPv6 |
| 79 | + * - dateTimeNanoseconds |
| 80 | + - DateTime64(9) |
| 81 | + * - dateTimeMicroseconds |
| 82 | + - DateTime64(6) |
| 83 | + * - dateTimeMilliseconds |
| 84 | + - DateTime64(3) |
| 85 | + * - dateTimeSeconds |
| 86 | + - DateTime |
| 87 | + * - string |
| 88 | + - String |
| 89 | + |
| 90 | +If a field is an alias that maps to multiple IPFIX elements of compatible |
| 91 | +types, the resulting type is unified to the type with higher precision, i.e. |
| 92 | +the type that can hold both values without data loss. To unify IPv4 and |
| 93 | +IPv6 addresses as one type, an IPv4 address is stored as an IPv4-mapped |
| 94 | +IPv6 address. |
| 95 | + |
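The IPv4-to-IPv6 unification described above can be illustrated with a short Python sketch. It only demonstrates the standard IPv4-mapped address layout (``::ffff:a.b.c.d``) and is not taken from the plugin's source; the helper name ``to_ipv4_mapped`` is hypothetical.

```python
import ipaddress


def to_ipv4_mapped(addr: str) -> ipaddress.IPv6Address:
    """Return addr as an IPv6 address, mapping IPv4 into ::ffff:0:0/96.

    Hypothetical helper for illustration only; not part of the plugin.
    """
    ip = ipaddress.ip_address(addr)
    if isinstance(ip, ipaddress.IPv4Address):
        # Put the 32-bit IPv4 value into the low bits behind the ::ffff: prefix.
        return ipaddress.IPv6Address((0xFFFF << 32) | int(ip))
    return ip  # already IPv6, stored unchanged


mapped = to_ipv4_mapped("192.0.2.1")
print(mapped)              # the IPv6 form that an IPv6 column would hold
print(mapped.ipv4_mapped)  # the original IPv4 address, recovered from the mapping
```

Because the mapping is reversible, the original IPv4 address can be recovered from a value read back from an ``IPv6`` column.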
| 96 | + |
| 97 | +Example configuration |
| 98 | +--------------------- |
| 99 | + |
| 100 | +.. code-block:: xml |
| 101 | + |
| 102 | + <output> |
| 103 | + <name>Clickhouse output</name> |
| 104 | + <plugin>clickhouse</plugin> |
| 105 | + <params> |
| 106 | + <connection> |
| 107 | + <endpoints> |
| 108 | + <!-- One or more ClickHouse databases (endpoints) --> |
| 109 | + <endpoint> |
| 110 | + <host>clickhouse.example.com</host> |
| 111 | + <port>9000</port> |
| 112 | + </endpoint> |
| 113 | + </endpoints> |
| 114 | + <user>ipfixcol2</user> |
| 115 | + <password>ipfixcol2</password> |
| 116 | + <database>ipfixcol2</database> |
| 117 | + <table>flows</table> |
| 118 | + </connection> |
| 119 | + <inserterThreads>32</inserterThreads> |
| 120 | + <blocks>1024</blocks> |
| 121 | + <blockInsertThreshold>100000</blockInsertThreshold> |
| 122 | + <splitBiflow>true</splitBiflow> |
| 123 | + <columns> |
| 124 | + <column> |
| 125 | + <!-- Special field representing the ODID the flow originated from. --> |
| 126 | + <name>odid</name> |
| 127 | + </column> |
| 128 | + <column> |
| 129 | + <!-- IPFIX field(s) identified by an alias. Maps to sourceIPv4Address or sourceIPv6Address, whichever exists. --> |
| 130 | + <name>srcip</name> |
| 131 | + </column> |
| 132 | + <column> |
| 133 | + <name>dstip</name> |
| 134 | + </column> |
| 135 | + <column> |
| 136 | + <name>flowstart</name> |
| 137 | + </column> |
| 138 | + <column> |
| 139 | + <name>flowend</name> |
| 140 | + </column> |
| 141 | + <column> |
| 142 | + <!-- IPFIX field identified by its IANA name stored to a column named "srcport" --> |
| 143 | + <name>srcport</name> |
| 144 | + <source>sourceTransportPort</source> |
| 145 | + </column> |
| 146 | + <column> |
| 147 | + <name>dstport</name> |
| 148 | + <source>destinationTransportPort</source> |
| 149 | + </column> |
| 150 | + <column> |
| 151 | + <!-- IPFIX field identified by its IANA name stored to a column with the same name --> |
| 152 | + <name>protocolIdentifier</name> |
| 153 | + </column> |
| 154 | + <column> |
| 155 | + <name>octetDeltaCount</name> |
| 156 | + </column> |
| 157 | + <column> |
| 158 | + <name>packetDeltaCount</name> |
| 159 | + </column> |
| 160 | + </columns> |
| 161 | + </params> |
| 162 | + </output> |
| 163 | + |
| 164 | +**Warning**: The database and the table with the appropriate schema must already exist. |
| 165 | +They are not created automatically. |
| 166 | + |
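As a rough illustration of how the ``<column>`` entries in the example resolve, the following Python sketch pairs each column name with the IPFIX field or alias it is filled from, falling back to the column name when ``<source>`` is absent. The XML excerpt and the function ``resolve_columns`` are assumptions made for this example; the plugin's own configuration parser is not shown here.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the <columns> element from the example configuration.
COLUMNS_XML = """
<columns>
    <column><name>odid</name></column>
    <column><name>srcport</name><source>sourceTransportPort</source></column>
    <column><name>protocolIdentifier</name></column>
</columns>
"""


def resolve_columns(xml_text: str) -> list[tuple[str, str]]:
    """Return (column name, source field) pairs; source defaults to name.

    Hypothetical helper for illustration only; not part of the plugin.
    """
    pairs = []
    for column in ET.fromstring(xml_text.strip()).findall("column"):
        name = column.findtext("name")
        # <source> is optional; without it the column name is the source.
        source = column.findtext("source", default=name)
        pairs.append((name, source))
    return pairs


for name, source in resolve_columns(COLUMNS_XML):
    print(f"{name} <- {source}")
```

The column names produced this way are the ones the database table is expected to contain.
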
| 167 | +Parameters |
| 168 | +---------- |
| 169 | + |
| 170 | +:``connection``: |
| 171 | + The database connection parameters. |
| 172 | + |
| 173 | + :``endpoints``: |
| 174 | + The endpoints that data can be sent to, i.e. all the replicas of a |
| 175 | + particular shard. If one endpoint is unreachable, another one is used. |
| 176 | + |
| 177 | + :``endpoint``: |
| 178 | + Connection parameters of one endpoint. |
| 179 | + |
| 180 | + :``host``: |
| 181 | + The Clickhouse database host as a domain name or an IP address. |
| 182 | + |
| 183 | + :``port``: |
| 184 | + The port of the Clickhouse database. [default: 9000] |
| 185 | + |
| 186 | + :``user``: |
| 187 | + The database username. |
| 188 | + |
| 189 | + :``password``: |
| 190 | + The database password. |
| 191 | + |
| 192 | + :``database``: |
| 193 | + The database name where the specified table is present. |
| 194 | + |
| 195 | + :``table``: |
| 196 | + The name of the table to insert the data into. |
| 197 | + |
| 198 | +:``splitBiflow``: |
| 199 | + When true, biflow records are split into two uniflow records. [default: true] |
| 200 | + |
| 201 | +:``biflowEmptyAutoignore``: |
| 202 | + When true and ``splitBiflow`` is active, the uniflow records resulting from |
| 203 | + the split are also checked for emptiness and are omitted if empty. A flow |
| 204 | + is considered empty when ``octetDeltaCount = 0`` or ``packetDeltaCount = 0``. |
| 205 | + This exists because some IPFIX probes may export uniflow records as biflow |
| 206 | + with the reverse direction always empty, resulting in a large amount of |
| 207 | + empty flow records. |
| 208 | + [default: true] |
| 209 | + |
| 210 | +:``blocks``: |
| 211 | + Number of data blocks in circulation. Each block is effectively a memory |
| 212 | + buffer that rows are written to before being sent to the Clickhouse |
| 213 | + database. [default: 1024] |
| 214 | + |
| 215 | +:``inserterThreads``: |
| 216 | + Number of threads used for data insertion to Clickhouse. In other words, |
| 217 | + the number of Clickhouse connections that are concurrently used. [default: 32] |
| 218 | + |
| 219 | +:``blockInsertThreshold``: |
| 220 | + Number of rows to be buffered into a block before the block is sent out to |
| 221 | + be inserted into the database. [default: 100000] |
| 222 | + |
| 223 | +:``blockInsertMaxDelaySecs``: |
| 224 | + Maximum number of seconds to wait before a block gets sent out to be |
| 225 | + inserted into the database even if the threshold has not been reached yet. |
| 226 | + [default: 10] |
| 227 | + |
| 228 | +:``columns``: |
| 229 | + The fields that each row will consist of. |
| 230 | + |
| 231 | + :``column``: |
| 232 | + |
| 233 | + :``name``: |
| 234 | + Name of the column in the database. Also the source field if source |
| 235 | + is not explicitly defined. |
| 236 | + |
| 237 | + :``nullable``: |
| 238 | + Whether a missing value is stored as null. If false, the zero value of |
| 239 | + the corresponding data type is stored instead. Turning this option on |
| 240 | + might negatively affect performance. [default: false] |
| 241 | + |
| 242 | + :``source``: |
| 243 | + An IPFIX element name or an alias. If not present, name is used. |
| 244 | + [default: same as name] |
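
The interaction between ``blockInsertThreshold`` and ``blockInsertMaxDelaySecs`` can be sketched as a simple flush decision. This is a minimal Python model under stated assumptions, not the plugin's implementation: the ``Block`` class and ``should_flush`` function are hypothetical, the delay is assumed to be measured from the moment the block started filling, and empty blocks are assumed never to be sent.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical in-memory buffer of rows awaiting insertion."""
    rows: list = field(default_factory=list)
    started_at: float = field(default_factory=time.monotonic)


def should_flush(block: Block, now: float,
                 insert_threshold: int = 100_000,
                 max_delay_secs: float = 10.0) -> bool:
    """Decide whether a block should be sent out for insertion."""
    if not block.rows:
        return False  # assumption: an empty block is never sent
    if len(block.rows) >= insert_threshold:
        return True   # blockInsertThreshold reached
    # blockInsertMaxDelaySecs elapsed (assumed: since the block started filling)
    return now - block.started_at >= max_delay_secs


block = Block(rows=["row"] * 3, started_at=0.0)
print(should_flush(block, now=5.0))   # below the threshold, delay not reached
print(should_flush(block, now=10.0))  # delay reached, sent despite few rows
```

In this model a low-traffic collector still delivers rows within the configured delay, while a busy one sends blocks as soon as they reach the row threshold.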