docs/docs/custom_ops/custom_targets.mdx
Lines changed: 34 additions & 20 deletions
Original file line number
Diff line number
Diff line change
@@ -8,12 +8,29 @@ import Tabs from '@theme/Tabs';
8
8
import TabItem from '@theme/TabItem';
9
9
10
10
A custom target allows you to export data to any destination you want, such as databases, cloud storage, file systems, APIs, or other external systems.
11
+
You can either continuously update the destination to keep it in sync with the latest exported data, or simply publish the changes as a changelog.
12
+
13
+
## Overview
11
14
12
15
Custom targets are defined by two components:
13
16
14
17
* A **target spec** that configures the behavior and connection parameters for the target.
15
18
* A **target connector** that handles the actual data export operations.
16
19
20
+
When you define a flow in CocoIndex, you define how data is transformed, collected, and exported, without worrying about how to handle data changes (inserts, updates, deletes). CocoIndex handles them for you.
21
+
However, a target connects a CocoIndex flow to external systems, so it needs to synchronize data changes from the flow to the outside.
22
+
A target connector implementation needs to handle two kinds of changes:
23
+
24
+
- **Setup changes**.
25
+
These set up the infrastructure that backs a target (e.g. a table or a directory), independent of any specific data.
26
+
When users add a new target, delete an existing target, or make changes to the target spec,
27
+
the framework will trigger the connector to apply these setup changes by calling `apply_setup_change()`.
28
+
The connector needs to apply the corresponding setup changes to the external system, e.g. create/delete a table, update/delete a directory, etc.
29
+
30
+
- **Data changes**.
31
+
These are changes to the specific data exported to the target.
32
+
While the flow is running, when new rows to export appear, or existing ones are updated or deleted in the CocoIndex flow, the framework triggers the connector to apply these data changes by calling `mutate()`, e.g. insert/update/delete a row in a table, write/delete a file, etc.
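
As a rough illustration of the two kinds of changes, here is a plain-Python sketch that does not use the CocoIndex API. `TableSpec`, `FAKE_DB`, and the simplified signatures of `apply_setup_change()` and `mutate()` are assumptions made for this example only; the real connector methods may take different arguments.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TableSpec:
    """Hypothetical target spec: names a table in a fake in-memory store."""
    table_name: str


# Stand-in for an external system: table name -> {row key -> row value}.
FAKE_DB: dict[str, dict[str, str]] = {}


def apply_setup_change(previous: TableSpec | None, current: TableSpec | None) -> None:
    """Setup change: create, drop, or replace the table itself. No row data is involved."""
    if previous is not None and (current is None or current.table_name != previous.table_name):
        FAKE_DB.pop(previous.table_name, None)  # target removed, or spec now names another table
    if current is not None:
        FAKE_DB.setdefault(current.table_name, {})  # target added or kept


def mutate(spec: TableSpec, mutations: dict[str, str | None]) -> None:
    """Data change: upsert rows, or delete a row when its value is None."""
    table = FAKE_DB[spec.table_name]
    for key, value in mutations.items():
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
```

In this sketch, adding a target corresponds to `apply_setup_change(None, spec)`, removing it to `apply_setup_change(spec, None)`, and row-level inserts, updates, and deletes all go through `mutate()`.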
33
+
17
34
## Target Spec
18
35
19
36
The target spec defines the configuration parameters for your custom target. When you use this target in a flow (typically by calling [`export()`](/docs/core/flow_def#export)), you instantiate this target spec with specific parameter values.
@@ -44,7 +61,7 @@ Notes:
44
61
45
62
A target connector handles the actual data export operations for your custom target. It defines how data should be written to your target destination.
46
63
47
-
Target connectors implement two categories of methods: **setup methods** for managing target infrastructure (similar to DDL operations in databases), and **data methods** for handling specific data operations (similar to DML operations).
64
+
Target connectors implement two categories of methods: setup methods to deal with setup changes, and data methods to deal with data changes.
If not provided, the original spec will be passed directly to `mutate`.
160
177
161
-
### Complete Example
178
+
## Best Practices
179
+
180
+
### Idempotency of Methods with Side Effects
181
+
182
+
`apply_setup_change()` and `mutate()` are the two methods that are expected to produce side effects.
183
+
We expect them to be idempotent, i.e. calling them multiple times with the same arguments should have the same effect as calling them once.
184
+
185
+
For example,
186
+
- For `apply_setup_change()`, if the target is a directory, it should be a no-op if we try to create it (`previous` is `None`) when the directory already exists, and also a no-op if we try to delete it (`current` is `None`) when the directory does not exist.
187
+
- For `mutate()`, if a mutation is a deletion, it should be a no-op if the row does not exist.
188
+
189
+
This ensures that when the system is left in an intermediate state, e.g. interrupted after a change is made but before CocoIndex records that the change is complete, the targets can still be gracefully rolled forward to the desired states after the system resumes.
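
The idempotency rules above can be sketched for a directory-backed target as follows. This is a minimal illustration in plain Python, not the CocoIndex connector API: using bare directory paths for `previous` and `current`, and a `dict` mapping file names to content (or `None` for deletion) for the mutations, are assumptions made for this example.

```python
from __future__ import annotations

import os
import shutil


def apply_setup_change(previous: str | None, current: str | None) -> None:
    """Create or delete the target directory; safe to call repeatedly."""
    if previous is not None and previous != current:
        # No-op if the directory is already gone.
        shutil.rmtree(previous, ignore_errors=True)
    if current is not None:
        # No-op if the directory already exists.
        os.makedirs(current, exist_ok=True)


def mutate(directory: str, mutations: dict[str, str | None]) -> None:
    """Write or delete files; a value of None means the file should be deleted."""
    for filename, content in mutations.items():
        path = os.path.join(directory, filename)
        if content is None:
            try:
                os.remove(path)
            except FileNotFoundError:
                pass  # already deleted: treat as success
        else:
            # Overwriting with the same content leaves the same end state.
            with open(path, "w", encoding="utf-8") as f:
                f.write(content)
```

Because every branch tolerates the "already done" case, replaying a call after an interruption leaves the directory in the same state as a single successful call.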
190
+
191
+
192
+
## Example
162
193
163
194
In this example, we define a custom target that accepts data with the following fields:
164
195
- `filename` (key field)
@@ -247,21 +278,4 @@ For simplicity, the type hints can be omitted and a `dict` will be created inste
247
278
</TabItem>
248
279
</Tabs>
249
280
250
-
## Best Practices
251
-
252
-
### Idempotency of Methods with Side Effects
253
-
254
-
`apply_setup_change()` and `mutate()` are the two methods that are expected to produce side effects.
255
-
We expect them to be idempotent, i.e. when calling them with the same arguments multiple times, the effect should remain the same.
256
-
257
-
For example,
258
-
- For `apply_setup_change()`, if the target is a directory, it should be a no-op if we try to create it (`previous` is `None`) when the directory already exists, and also a no-op if we try to delete it (`current` is `None`) when the directory does not exist.
259
-
- For `mutate()`, if a mutation is a deletion, it should be a no-op if the row does not exist.
260
-
261
-
This is to make sure when the system if left in an intermediate state, e.g. interrupted in the middle between a change is made and CocoIndex notes down the change is completed, the targets can still be gracefully rolled forward to the desired states after the system is resumed.
262
-
263
-
## Examples
264
-
265
-
The cocoindex repository contains the following examples of custom targets:
266
-
267
-
* In the [custom_output_files](https://github.com/cocoindex-io/cocoindex/blob/main/examples/custom_output_files/main.py) example, `LocalFileTarget` exports data to local HTML files.
281
+
See [custom_output_files](https://github.com/cocoindex-io/cocoindex/blob/main/examples/custom_output_files/main.py) for an end-to-end example.