Commit e8bd0e2

docs(custom-target): add an overview section for custom target (#882)
1 parent 055c071 commit e8bd0e2

1 file changed

+34
-20
lines changed

docs/docs/custom_ops/custom_targets.mdx

Lines changed: 34 additions & 20 deletions
@@ -8,12 +8,29 @@ import Tabs from '@theme/Tabs';
88
import TabItem from '@theme/TabItem';
99

1010
A custom target allows you to export data to any destination you want, such as databases, cloud storage, file systems, APIs, or other external systems.
11+
You can either continuously update the destination to keep it in sync with the latest exported data, or simply publish the changes as a changelog somewhere.
12+
13+
## Overview
1114

1215
Custom targets are defined by two components:
1316

1417
* A **target spec** that configures the behavior and connection parameters for the target.
1518
* A **target connector** that handles the actual data export operations.
1619

20+
When you define a flow in CocoIndex, you define how data is transformed, collected, and exported, without worrying about how to handle data changes (inserts, updates, deletes). CocoIndex handles them for you.
21+
A target, however, connects a CocoIndex flow to an external system, so it needs to propagate data changes from the flow to that system.
22+
A target connector implementation therefore needs to handle two kinds of changes:
23+
24+
- **Setup changes**.
25+
These set up the infrastructure behind a target (e.g. a table or a directory), independent of any specific data.
26+
When users add a new target, delete an existing target, or make changes to the target spec,
27+
the framework will trigger the connector to apply these setup changes by calling `apply_setup_change()`.
28+
The connector needs to apply the corresponding setup changes to the external system, e.g. create/delete a table, update/delete a directory, etc.
29+
30+
- **Data changes**.
31+
These are changes to the specific data exported to the target.
32+
While the flow is running, when new rows to export appear, or existing ones are updated or deleted in the CocoIndex flow, the framework triggers the connector to apply these data changes by calling `mutate()`, e.g. to insert/update/delete a row in a table or to write/delete a file.
33+
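The two kinds of changes described above can be sketched in plain Python. This is only an illustration under stated assumptions, not CocoIndex's actual connector API: `TableSpec`, the in-memory `external_db`, and the simplified signatures of `apply_setup_change()` and `mutate()` are hypothetical. The sketch also assumes that a deletion is represented by a `None` value.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TableSpec:
    """Hypothetical target spec: just the name of a table."""
    table_name: str


# Stand-in for the external system: table name -> {key: row}.
external_db: Dict[str, Dict[str, dict]] = {}


def apply_setup_change(previous: Optional[TableSpec],
                       current: Optional[TableSpec]) -> None:
    """Setup change: manage the table itself, not the rows in it."""
    if previous is not None and previous != current:
        # The old table is no longer wanted (target deleted or spec changed).
        external_db.pop(previous.table_name, None)
    if current is not None:
        # Make sure the table for the current spec exists.
        external_db.setdefault(current.table_name, {})


def mutate(spec: TableSpec, mutations: Dict[str, Optional[dict]]) -> None:
    """Data change: upsert or delete individual rows (None means delete)."""
    table = external_db[spec.table_name]
    for key, row in mutations.items():
        if row is None:
            table.pop(key, None)
        else:
            table[key] = row


# A possible sequence of calls over the lifetime of a target.
spec = TableSpec("docs")
apply_setup_change(None, spec)                 # target added: create table
mutate(spec, {"a.md": {"html": "<p>A</p>"}})   # new row to export
mutate(spec, {"a.md": None})                   # row deleted in the flow
apply_setup_change(spec, None)                 # target removed: drop table
```

The point of the split is that setup changes depend only on the spec, while data changes depend on the rows flowing through an already set-up target.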
1734
## Target Spec
1835

1936
The target spec defines the configuration parameters for your custom target. When you use this target in a flow (typically by calling [`export()`](/docs/core/flow_def#export)), you instantiate this target spec with specific parameter values.
@@ -44,7 +61,7 @@ Notes:
4461

4562
A target connector handles the actual data export operations for your custom target. It defines how data should be written to your target destination.
4663

47-
Target connectors implement two categories of methods: **setup methods** for managing target infrastructure (similar to DDL operations in databases), and **data methods** for handling specific data operations (similar to DML operations).
64+
Target connectors implement two categories of methods: setup methods to deal with setup changes, and data methods to deal with data changes.
4865

4966
<Tabs>
5067
<TabItem value="python" label="Python" default>
@@ -158,7 +175,21 @@ def prepare(spec: CustomTarget) -> PreparedCustomTarget:
158175

159176
If not provided, the original spec will be passed directly to `mutate`.
160177

161-
### Complete Example
178+
## Best Practices
179+
180+
### Idempotency of Methods with Side Effects
181+
182+
`apply_setup_change()` and `mutate()` are the two methods that are expected to produce side effects.
183+
We expect them to be idempotent, i.e. calling them multiple times with the same arguments should have the same effect as calling them once.
184+
185+
For example,
186+
- For `apply_setup_change()`, if the target is a directory, it should be a no-op if we try to create it (`previous` is `None`) when the directory already exists, and also a no-op if we try to delete it (`current` is `None`) when the directory does not exist.
187+
- For `mutate()`, if a mutation is a deletion, it should be a no-op if the row does not exist.
188+
189+
This ensures that when the system is left in an intermediate state, e.g. interrupted after a change is made but before CocoIndex records that the change is completed, the targets can still be gracefully rolled forward to the desired state after the system resumes.
190+
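As a rough sketch of what idempotency can look like for a directory-based target, the standalone functions below use only the Python standard library. Their signatures are simplified assumptions for illustration (the target is identified by a `Path`, and a `None` content is assumed to mean deletion); they are not the connector interface itself.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, Optional


def apply_setup_change(previous: Optional[Path],
                       current: Optional[Path]) -> None:
    """Create or remove the target directory; safe to call repeatedly."""
    if previous is not None and previous != current:
        # No-op if the directory is already gone.
        shutil.rmtree(previous, ignore_errors=True)
    if current is not None:
        # No-op if the directory already exists.
        current.mkdir(parents=True, exist_ok=True)


def mutate(directory: Path, mutations: Dict[str, Optional[str]]) -> None:
    """Write or delete files; replaying the same mutations changes nothing."""
    for filename, content in mutations.items():
        path = directory / filename
        if content is None:
            # Deleting a missing file is a no-op (requires Python 3.8+).
            path.unlink(missing_ok=True)
        else:
            # Overwriting with the same content yields the same state.
            path.write_text(content)


# Replay every call twice, as could happen after an interrupted run.
with tempfile.TemporaryDirectory() as tmp:
    target_dir = Path(tmp) / "out"
    for _ in range(2):
        apply_setup_change(None, target_dir)
    for _ in range(2):
        mutate(target_dir, {"a.html": "<p>A</p>"})
    for _ in range(2):
        mutate(target_dir, {"a.html": None})
    for _ in range(2):
        apply_setup_change(target_dir, None)
```

Note that `ignore_errors=True` also hides unrelated failures such as permission errors; a real connector would likely want to tolerate only the "already absent" case.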
191+
192+
## Example
162193

163194
In this example, we define a custom target that accepts data with the following fields:
164195
- `filename` (key field)
@@ -247,21 +278,4 @@ For simplicity, the type hints can be omitted and a `dict` will be created inste
247278
</TabItem>
248279
</Tabs>
249280

250-
## Best Practices
251-
252-
### Idempotency of Methods with Side Effects
253-
254-
`apply_setup_change()` and `mutate()` are the two methods that are expected to produce side effects.
255-
We expect them to be idempotent, i.e. when calling them with the same arguments multiple times, the effect should remain the same.
256-
257-
For example,
258-
- For `apply_setup_change()`, if the target is a directory, it should be a no-op if we try to create it (`previous` is `None`) when the directory already exists, and also a no-op if we try to delete it (`current` is `None`) when the directory does not exist.
259-
- For `mutate()`, if a mutation is a deletion, it should be a no-op if the row does not exist.
260-
261-
This ensures that when the system is left in an intermediate state, e.g. interrupted after a change is made but before CocoIndex records that the change is completed, the targets can still be gracefully rolled forward to the desired state after the system resumes.
262-
263-
## Examples
264-
265-
The cocoindex repository contains the following examples of custom targets:
266-
267-
* In the [custom_output_files](https://github.com/cocoindex-io/cocoindex/blob/main/examples/custom_output_files/main.py) example, `LocalFileTarget` exports data to local HTML files.
281+
See [custom_output_files](https://github.com/cocoindex-io/cocoindex/blob/main/examples/custom_output_files/main.py) for an end-to-end example.
