Commit b78162d

Add plugin documentation for Core and Enterprise
1 parent c0e3f49 commit b78162d

File tree

6 files changed

+980
-673
lines changed

.ci/remark-lint/yarn.lock

Lines changed: 333 additions & 365 deletions
Large diffs are not rendered by default.

api-docs/yarn.lock

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,12 +4,12 @@
44

55
argparse@^2.0.1:
66
version "2.0.1"
7-
resolved "https://registry.yarnpkg.com/argparse/-/argparse-2.0.1.tgz#246f50f3ca78a3240f6c997e8a9bd1eac49e4b38"
7+
resolved "https://registry.npmjs.org/argparse/-/argparse-2.0.1.tgz"
88
integrity sha512-8+9WqebbFzpX9OR+Wa6O29asIogeRMzcGtAINdpMHHyAg10f05aSFVBbcEqGf/PXw1EjAZ+q2/bEBg3DvurK3Q==
99

1010
js-yaml@^4.1.0:
1111
version "4.1.0"
12-
resolved "https://registry.yarnpkg.com/js-yaml/-/js-yaml-4.1.0.tgz#c1fb65f8f5017901cdd2c951864ba18458a10602"
12+
resolved "https://registry.npmjs.org/js-yaml/-/js-yaml-4.1.0.tgz"
1313
integrity sha512-wpxZs9NoxZaJESJGIZTyDEaYpl0FKSA+FB9aJiyemKhMwkxQg63h4T1KJgUGHpTqPDNRcmmYLugrRjJlBtWvRA==
1414
dependencies:
1515
argparse "^2.0.1"

content/influxdb3/core/plugins.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
---
2+
title: Python Plugins and Processing Engine
3+
description: Instructions for using the Python processing engine in InfluxDB 3
4+
menu:
5+
  influxdb3_core:
6+
    name: Processing Engine and Python Plugins
7+
    weight: 2
8+
influxdb3/core/tags: []
9+
source: /shared/v3-core-plugins/_index.md
10+
---
11+
12+
<!--
13+
The content of this page is at /shared/v3-core-plugins/_index.md
14+
-->
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
---
2+
title: Python Plugins and Processing Engine
3+
description: Instructions for using the Python processing engine in InfluxDB 3
4+
menu:
5+
  influxdb3_enterprise:
6+
    name: Processing Engine and Python Plugins
7+
    weight: 2
8+
influxdb3/enterprise/tags: []
9+
source: /shared/v3-core-plugins/_index.md
10+
---
11+
12+
<!--
13+
The content of this page is at /shared/v3-core-plugins/_index.md
14+
-->
Lines changed: 297 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,297 @@
1+
> [!Important]
2+
> #### Processing engine only works with Docker
3+
>
4+
> The Processing engine is currently supported only in Docker x86 environments. Non-Docker support is coming soon. The engine, API, and developer experience are actively evolving and may change. Join our [Discord](https://discord.gg/9zaNCW2PRT) for updates and feedback.
5+
6+
InfluxDB 3 has an embedded Python VM for dynamically loading plugins that can execute code in the database. There are four types of plugins that can be triggered by the following events in the database:
7+
8+
* **WAL flush**: Triggered when the write-ahead log (WAL) is flushed to object store (once a second by default)
9+
* **Parquet persistence (coming soon)**: Triggered when data is persisted to object store in Parquet format
10+
* **Scheduled tasks**: Triggered by a schedule, specified in cron syntax
11+
* **On Request**: Bind to a specific endpoint under `/api/v3/engine` and trigger when GET or POST requests are made
12+
13+
Each plugin type has a different trigger configuration, which will be described in the section on each plugin type.
14+
15+
## Installing Plugins from the repo
16+
The repository at [https://github.com/influxdata/influxdb3_plugins](https://github.com/influxdata/influxdb3_plugins) contains example plugins and contributions from the community. To install a plugin from the repository, use the `influxdb3` CLI.
17+
18+
Use `gh:<path>` as the plugin filename to install a plugin from the repository. For example, the following commands install the `system_metrics` and `wal_plugin` examples:
19+
20+
```shell
21+
influxdb3 create plugin -d=mydb --filename "gh:examples/schedule/system_metrics" --plugin-type=scheduled system_metrics
22+
influxdb3 create plugin -d=mydb --filename "gh:examples/wal_plugin" --plugin-type=wal_rows wal_plugin_example
23+
```
24+
25+
You will then need to create a trigger to activate the plugin. Details on triggers for each type of plugin are provided below.
26+
27+
## Shared API
28+
29+
Within any of the plugin types, a shared API is available to interact with the database. The shared API provides access to the following:
30+
* `LineBuilder` to create Line Protocol lines that can be written to the database
31+
* `query` to query data from any database
32+
* `info`, `warn`, and `error` to log messages to the database log, which will be output in the server logs and captured in system tables queryable by SQL
33+
34+
### Line Builder
35+
36+
The `LineBuilder` is a simple API for building lines of Line Protocol to write into the database. Writes are buffered while the plugin runs and are flushed when the plugin completes. The `LineBuilder` API is available in all plugin types. Here are some examples of using the `LineBuilder` API:
37+
38+
```python
39+
line = LineBuilder("weather")\
40+
    .tag("location", "us-midwest")\
41+
    .float64_field("temperature", 82.5)\
42+
    .time_ns(1627680000000000000)
43+
influxdb3_local.write(line)
44+
45+
# to output it as a string: "weather,location=us-midwest temperature=82.5 1627680000000000000"
46+
line_str = line.build()
47+
48+
# or build incrementally
49+
line = LineBuilder("weather")
50+
line.tag("location", "us-midwest")
51+
line.float64_field("temperature", 82.5)
52+
line.time_ns(1627680000000000000)
53+
influxdb3_local.write(line)
54+
```
55+
56+
Here is the Python implementation of the `LineBuilder` API:
57+
58+
```python
59+
from typing import Optional
60+
from collections import OrderedDict
61+
62+
class InfluxDBError(Exception):
63+
    """Base exception for InfluxDB-related errors"""
64+
    pass
65+
66+
class InvalidMeasurementError(InfluxDBError):
67+
    """Raised when measurement name is invalid"""
68+
    pass
69+
70+
class InvalidKeyError(InfluxDBError):
71+
    """Raised when a tag or field key is invalid"""
72+
    pass
73+
74+
class InvalidLineError(InfluxDBError):
75+
    """Raised when a line protocol string is invalid"""
76+
    pass
77+
78+
class LineBuilder:
79+
    def __init__(self, measurement: str):
80+
        if ' ' in measurement:
81+
            raise InvalidMeasurementError("Measurement name cannot contain spaces")
82+
        self.measurement = measurement
83+
        self.tags: OrderedDict[str, str] = OrderedDict()
84+
        self.fields: OrderedDict[str, str] = OrderedDict()
85+
        self._timestamp_ns: Optional[int] = None
86+
87+
    def _validate_key(self, key: str, key_type: str) -> None:
88+
        """Validate that a key does not contain spaces, commas, or equals signs."""
89+
        if not key:
90+
            raise InvalidKeyError(f"{key_type} key cannot be empty")
91+
        if ' ' in key:
92+
            raise InvalidKeyError(f"{key_type} key '{key}' cannot contain spaces")
93+
        if ',' in key:
94+
            raise InvalidKeyError(f"{key_type} key '{key}' cannot contain commas")
95+
        if '=' in key:
96+
            raise InvalidKeyError(f"{key_type} key '{key}' cannot contain equals signs")
97+
98+
    def tag(self, key: str, value: str) -> 'LineBuilder':
99+
        """Add a tag to the line protocol."""
100+
        self._validate_key(key, "tag")
101+
        self.tags[key] = str(value)
102+
        return self
103+
104+
    def uint64_field(self, key: str, value: int) -> 'LineBuilder':
105+
        """Add an unsigned integer field to the line protocol."""
106+
        self._validate_key(key, "field")
107+
        if value < 0:
108+
            raise ValueError(f"uint64 field '{key}' cannot be negative")
109+
        self.fields[key] = f"{value}u"
110+
        return self
111+
112+
    def int64_field(self, key: str, value: int) -> 'LineBuilder':
113+
        """Add an integer field to the line protocol."""
114+
        self._validate_key(key, "field")
115+
        self.fields[key] = f"{value}i"
116+
        return self
117+
118+
    def float64_field(self, key: str, value: float) -> 'LineBuilder':
119+
        """Add a float field to the line protocol."""
120+
        self._validate_key(key, "field")
121+
        # Check if value has no decimal component
122+
        self.fields[key] = f"{int(value)}.0" if value % 1 == 0 else str(value)
123+
        return self
124+
125+
    def string_field(self, key: str, value: str) -> 'LineBuilder':
126+
        """Add a string field to the line protocol."""
127+
        self._validate_key(key, "field")
128+
        # Escape backslashes first, then quotes, so the added backslashes are not doubled
129+
        escaped_value = value.replace('\\', '\\\\').replace('"', '\\"')
130+
        self.fields[key] = f'"{escaped_value}"'
131+
        return self
132+
133+
    def bool_field(self, key: str, value: bool) -> 'LineBuilder':
134+
        """Add a boolean field to the line protocol."""
135+
        self._validate_key(key, "field")
136+
        self.fields[key] = 't' if value else 'f'
137+
        return self
138+
139+
    def time_ns(self, timestamp_ns: int) -> 'LineBuilder':
140+
        """Set the timestamp in nanoseconds."""
141+
        self._timestamp_ns = timestamp_ns
142+
        return self
143+
144+
    def build(self) -> str:
145+
        """Build the line protocol string."""
146+
        # Start with measurement name (escape commas only)
147+
        line = self.measurement.replace(',', '\\,')
148+
149+
        # Add tags if present
150+
        if self.tags:
151+
            tags_str = ','.join(
152+
                f"{k}={v}" for k, v in self.tags.items()
153+
            )
154+
            line += f",{tags_str}"
155+
156+
        # Add fields (required)
157+
        if not self.fields:
158+
            raise InvalidLineError(f"At least one field is required: {line}")
159+
160+
        fields_str = ','.join(
161+
            f"{k}={v}" for k, v in self.fields.items()
162+
        )
163+
        line += f" {fields_str}"
164+
165+
        # Add timestamp if present
166+
        if self._timestamp_ns is not None:
167+
            line += f" {self._timestamp_ns}"
168+
169+
        return line
170+
```
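When escaping string field values for line protocol, order matters: backslashes should be escaped before double quotes, otherwise the backslash added for each quote is escaped a second time. A minimal sketch, using a hypothetical standalone helper `escape_string_field` that is not part of the plugin API:

```python
def escape_string_field(value: str) -> str:
    """Return `value` as a quoted line protocol string field value.

    Hypothetical helper for illustration; it is not part of the plugin API.
    """
    # Escape backslashes first so the backslashes introduced for quotes
    # in the next step are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


print(escape_string_field('say "hi"'))
print(escape_string_field("C:\\temp"))
```

Reversing the two `replace` calls would turn each escaped quote into a doubled backslash followed by a bare quote, which ends the string early.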
171+
172+
### Query
173+
The `query` function on the API executes a SQL query with optional parameters (through a parameterized query) and returns the results as a `List[Dict[str, Any]]`, where each key is a column name and each value is the value for that column. The `query` function is available in all plugin types.
174+
175+
Some examples:
176+
177+
```python
178+
influxdb3_local.query("SELECT * from foo where bar = 'baz' and time > now() - interval '1 hour'")
179+
180+
# or using parameterized queries
181+
args = {"bar": "baz"}
182+
influxdb3_local.query("SELECT * from foo where bar = $bar and time > now() - interval '1 hour'", args)
183+
```
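Because `query` returns a list of dictionaries keyed by column name, the results can be handled with ordinary Python. The sketch below averages one column over rows shaped like a query result. The helper name `average_column` and the sample rows are hypothetical, and no database call is made.

```python
from typing import Any, Dict, List, Optional


def average_column(rows: List[Dict[str, Any]], column: str) -> Optional[float]:
    """Average a numeric column, skipping rows where it is missing or None."""
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return None
    return sum(values) / len(values)


# Sample rows in the shape described above (assumed data, not real output).
sample_rows = [
    {"bar": "baz", "temperature": 80.0},
    {"bar": "baz", "temperature": 85.0},
    {"bar": "baz", "temperature": None},
]
print(average_column(sample_rows, "temperature"))
```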
184+
185+
### Logging
186+
The `info`, `warn`, and `error` functions on the API will log messages to the database log, which will be output in the server logs and captured in system tables queryable by SQL. The `info`, `warn`, and `error` functions are available in all plugin types. The functions take an arbitrary number of arguments and will convert them to strings and join them into a single message separated by a space. Examples:
187+
188+
```python
189+
influxdb3_local.info("This is an info message")
190+
influxdb3_local.warn("This is a warning message")
191+
influxdb3_local.error("This is an error message")
192+
193+
obj_to_log = {"hello": "world"}
194+
influxdb3_local.info("This is an info message with an object", obj_to_log)
195+
```
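The argument handling described above behaves roughly like the following sketch. `format_log_message` is a hypothetical stand-in, not the engine's actual implementation. It only illustrates converting each argument to a string and joining the results with spaces.

```python
def format_log_message(*args: object) -> str:
    """Convert each argument to a string and join them with single spaces."""
    return " ".join(str(arg) for arg in args)


obj_to_log = {"hello": "world"}
print(format_log_message("This is an info message with an object", obj_to_log))
```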
196+
197+
### Trigger arguments
198+
Every plugin type can receive arguments from the trigger configuration. This is useful for passing settings to the plugin, such as what to monitor or the connection details for third-party services the plugin interacts with. The arguments are passed as a `Dict[str, str]`, where the key is the argument name and the value is the argument value. Here's an example of how to use arguments in a plugin:
199+
200+
```python
201+
def process_writes(influxdb3_local, table_batches, args=None):
202+
    if args and "threshold" in args:
203+
        threshold = int(args["threshold"])
204+
        influxdb3_local.info(f"Threshold is {threshold}")
205+
    else:
206+
        influxdb3_local.warn("No threshold provided")
207+
```
208+
209+
The `args` parameter is optional and can be omitted from the trigger definitions if the plugin does not need to use arguments.
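Since argument values always arrive as strings, a plugin typically converts them before use. A minimal sketch of one way to do that with a fallback default follows. `int_arg` is a hypothetical helper, not part of the plugin API.

```python
from typing import Dict, Optional


def int_arg(args: Optional[Dict[str, str]], name: str, default: int) -> int:
    """Read an integer trigger argument, falling back to `default`.

    The default is used when `args` is None, the key is missing,
    or the value is not a valid integer.
    """
    if not args or name not in args:
        return default
    try:
        return int(args[name])
    except ValueError:
        return default


print(int_arg({"threshold": "10"}, "threshold", 5))
```

Falling back silently is a design choice; a plugin that should fail loudly on bad configuration could log an error instead.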
210+
211+
## Imports
212+
The Python plugins run using the system Python in the Docker container. Pip is installed in the container and can be used to install any dependencies.
213+
To use packages from a virtual environment, start the server with `PYTHONPATH` set to that environment's site-packages directory. For example: `PYTHONPATH=myenv/lib/python3.13/site-packages`
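Before relying on a third-party package inside a plugin, it can help to confirm that the interpreter can find it on its import path. A minimal sketch using only the standard library follows; `module_available` is a hypothetical helper name.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(name) is not None


print(module_available("json"))
```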
214+
215+
## WAL Flush Plugin
216+
When a WAL flush plugin is triggered, the plugin receives a list of `table_batches` that have matched the plugin trigger (either all tables in the database or a specific table). Here's an example of a simple WAL flush plugin:
217+
218+
```python
219+
def process_writes(influxdb3_local, table_batches, args=None):
220+
    for table_batch in table_batches:
221+
        # Skip if table_name is write_reports
222+
        if table_batch["table_name"] == "write_reports":
223+
            continue
224+
225+
        row_count = len(table_batch["rows"])
226+
227+
        # Double row count if table name matches args table_name
228+
        if args and "double_count_table" in args and table_batch["table_name"] == args["double_count_table"]:
229+
            row_count *= 2
230+
231+
        line = LineBuilder("write_reports")\
232+
            .tag("table_name", table_batch["table_name"])\
233+
            .int64_field("row_count", row_count)
234+
        influxdb3_local.write(line)
235+
236+
    influxdb3_local.info("wal_plugin.py done")
237+
```
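To exercise a `process_writes`-style function outside the server, one option is to call it with a stand-in for `influxdb3_local`. The sketch below rests on several assumptions: `StubLocal` is hypothetical, the entries in `rows` are assumed to be dictionaries, and the plugin here only logs row counts so that it does not depend on `LineBuilder`.

```python
from typing import Any, Dict, List, Optional


class StubLocal:
    """Hypothetical stand-in for influxdb3_local that records log messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def info(self, *args: Any) -> None:
        self.messages.append(" ".join(str(a) for a in args))


def process_writes(influxdb3_local, table_batches, args: Optional[Dict[str, str]] = None):
    """Log the number of rows received for each table."""
    for table_batch in table_batches:
        row_count = len(table_batch["rows"])
        influxdb3_local.info(f"{table_batch['table_name']}: {row_count} rows")


# Assumed batch shape: one dict per table with "table_name" and "rows".
batches = [
    {"table_name": "cpu", "rows": [{"usage": 0.5}, {"usage": 0.7}]},
    {"table_name": "mem", "rows": [{"used": 1024}]},
]
local = StubLocal()
process_writes(local, batches)
print(local.messages)
```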
238+
239+
### WAL Flush Trigger Configuration
240+
241+
Every trigger is associated with a specific database. The best reference for the arguments for trigger definition can be accessed through the CLI help:
242+
243+
```shell
244+
influxdb3 create trigger help
245+
```
246+
247+
For the WAL plugin, the `trigger-spec` can be either `all-tables`, which triggers on any write to the associated database, or `table:<table_name>`, which calls the `process_writes` function only with the writes for the given table. The `args` parameter can be used to pass configuration to the plugin.
248+
249+
## Schedule Plugin
250+
Schedule plugins run on a schedule specified in cron syntax. The plugin will receive the local API, the time of the trigger, and any arguments passed in the trigger definition. Here's an example of a simple schedule plugin:
251+
252+
```python
253+
# see if a table has been written to in the last 5 minutes
254+
def process_scheduled_call(influxdb3_local, time, args=None):
255+
    if args and "table_name" in args:
256+
        table_name = args["table_name"]
257+
        result = influxdb3_local.query(f"SELECT * FROM {table_name} WHERE time > now() - interval '5 minutes'")
258+
        # write an error log if the result is empty
259+
        if not result:
260+
            influxdb3_local.error(f"No data in {table_name} in the last 5 minutes")
261+
    else:
262+
        influxdb3_local.error("No table_name provided for schedule plugin")
263+
```
264+
265+
### Schedule Trigger Configuration
266+
Schedule plugins are set with a `trigger-spec` of `schedule:<cron_expression>`. The `args` parameter can be used to pass configuration to the plugin. For example, to run the above check every five minutes, use `schedule:*/5 * * * *` as the `trigger-spec`.
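As an illustration of what a step value such as `*/5` in the minute position means, here is a deliberately simplified sketch. `minute_matches` is a hypothetical helper that handles only `*`, `*/N`, and a single number. It is not how the server parses cron expressions.

```python
def minute_matches(field: str, minute: int) -> bool:
    """Return True if `minute` (0-59) matches a simplified cron minute field."""
    if field == "*":
        return True
    if field.startswith("*/"):
        step = int(field[2:])
        return minute % step == 0
    return int(field) == minute


# Minutes of the hour at which "*/5" matches.
print([m for m in range(60) if minute_matches("*/5", m)])
```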
267+
268+
## On Request Plugin
269+
On Request plugins are triggered by a request to a specific endpoint under `/api/v3/engine`. The plugin will receive the local API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, request body (as bytes), and any arguments passed in the trigger definition. Here's an example of a simple On Request plugin:
270+
271+
```python
272+
import json
273+
274+
def process_request(influxdb3_local, query_parameters, request_headers, request_body, args=None):
275+
    for k, v in query_parameters.items():
276+
        influxdb3_local.info(f"query_parameters: {k}={v}")
277+
    for k, v in request_headers.items():
278+
        influxdb3_local.info(f"request_headers: {k}={v}")
279+
280+
    request_data = json.loads(request_body)
281+
282+
    influxdb3_local.info("parsed JSON request body:", request_data)
283+
284+
    # write the data to the database
285+
    line = LineBuilder("request_data").tag("tag1", "tag1_value").int64_field("field1", 1)
286+
    # get a string of the line to return as the body
287+
    line_str = line.build()
288+
289+
    influxdb3_local.write(line)
290+
291+
    return 200, {"Content-Type": "application/json"}, json.dumps({"status": "ok", "line": line_str})
292+
```
293+
294+
### On Request Trigger Configuration
295+
On Request plugins are set with a `trigger-spec` of `request:<endpoint>`. The `args` parameter can be used to pass configuration to the plugin. For example, if we wanted the above plugin to run on the endpoint `/api/v3/engine/my_plugin`, we would use `request:my_plugin` as the `trigger-spec`.
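With a `request:my_plugin` trigger spec, a client could call the plugin over HTTP. The sketch below only constructs the request and does not send it. The base URL `http://localhost:8181` and the helper `build_engine_request` are assumptions for illustration, so adjust them for your deployment.

```python
import json
import urllib.request


def build_engine_request(base_url: str, endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for an On Request plugin endpoint."""
    url = f"{base_url.rstrip('/')}/api/v3/engine/{endpoint}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_engine_request("http://localhost:8181", "my_plugin", {"hello": "world"})
print(request.get_method(), request.full_url)
# To send it against a running server: urllib.request.urlopen(request)
```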
296+
297+
Because every On Request plugin is served under the same `/api/v3/engine` path, trigger specs must be unique across all configured plugins, regardless of which database they are tied to.
