Merged
30 commits
c4a6601
upgrade docusaurus version
badmonster0 Aug 21, 2025
369ac26
initial checkin
badmonster0 Aug 21, 2025
3b609d9
example documentation for custom targets
badmonster0 Aug 21, 2025
f44135f
Update custom_targets.md
badmonster0 Aug 21, 2025
7b045be
paper indexing
badmonster0 Aug 21, 2025
fda59b1
Update academic_papers_index.md
badmonster0 Aug 21, 2025
98eaa05
add example for knowledge graphs
badmonster0 Aug 21, 2025
7707e41
add examples for photo search / knowledge graph
badmonster0 Aug 21, 2025
b74a1ed
Create multi_format_index.md
badmonster0 Aug 21, 2025
2ddb232
Update multi_format_index.md
badmonster0 Aug 21, 2025
f18b84d
product recommendation example
badmonster0 Aug 21, 2025
145f488
Merge branch 'main' into examples
badmonster0 Aug 21, 2025
84a553a
Create manual_extraction.md
badmonster0 Aug 21, 2025
0ceda04
Create simple_text_embedding.md
badmonster0 Aug 21, 2025
57a61e2
Delete code_index.md
badmonster0 Aug 21, 2025
70e74a2
patient intake form
badmonster0 Aug 21, 2025
ed847f4
Create image_search.md
badmonster0 Aug 21, 2025
8ccf086
visual & images for examples
badmonster0 Aug 22, 2025
b72a49d
Merge branch 'main' into examples
badmonster0 Aug 22, 2025
e483a71
update example for semantic search 101
badmonster0 Aug 22, 2025
9eefa87
compress image
badmonster0 Aug 22, 2025
8966c05
Merge branch 'main' into examples
badmonster0 Aug 22, 2025
c6542bb
tags & images
badmonster0 Aug 22, 2025
b689d9e
Merge branch 'main' into examples
badmonster0 Aug 26, 2025
23b8130
polish codebase example docs
badmonster0 Aug 26, 2025
83a58b7
add flow overview to codebase example
badmonster0 Aug 26, 2025
2600706
add image to illustrate chunks
badmonster0 Aug 26, 2025
6c99025
Merge branch 'main' into examples
badmonster0 Aug 26, 2025
2d76b05
docs: custom target example
badmonster0 Aug 26, 2025
2c9a3ab
Merge branch 'main' into examples
badmonster0 Aug 26, 2025
36 changes: 16 additions & 20 deletions docs/docs/examples/examples/custom_targets.md
@@ -9,19 +9,13 @@ sidebar_custom_props:
tags: [custom-building-blocks]
tags: [custom-building-blocks]
---
import { GitHubButton, YouTubeButton } from '../../../src/components/GitHubButton';
import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/components/GitHubButton';

<GitHubButton url="https://github.com/cocoindex-io/cocoindex/tree/main/examples/custom_output_files"/>

## Overview

Let’s walk through a simple example—exporting `.md` files as `.html` using a custom file-based target. This project monitors folder changes and continuously converts markdown to HTML incrementally.
Check out the full [source code](https://github.com/cocoindex-io/cocoindex/tree/main/examples/custom_output_files).

The overall flow is simple:
This example focuses on
- how to configure your custom target
- how the flow effortlessly picks up changes in the source, recomputes only what has changed, and exports to the target
Let’s walk through a simple example—exporting `.md` files as `.html` using a custom file-based target. This project monitors folder changes and continuously converts markdown to HTML incrementally. The overall flow is simple and primarily focuses on how to configure your custom target.


## Ingest files
@@ -33,28 +27,26 @@ Ingest a list of markdown files:
def custom_output_files(
    flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
) -> None:
    """
    Define an example flow that exports markdown files to HTML files.
    """
    data_scope["documents"] = flow_builder.add_source(
        cocoindex.sources.LocalFile(path="data", included_patterns=["*.md"]),
        refresh_interval=timedelta(seconds=5),
    )
```
This ingestion creates a table with `filename` and `content` fields.
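
To make the shape of that table concrete, here is a minimal standalone sketch in plain Python. It does not use CocoIndex: the `load_rows` helper is hypothetical and only approximates what a `LocalFile` source with `included_patterns=["*.md"]` conceptually yields, namely one row per matching file with `filename` and `content` fields. Treating `filename` as a path relative to the source directory is an assumption.

```python
import fnmatch
import tempfile
from pathlib import Path


def load_rows(root: str, included_patterns: list[str]) -> list[dict[str, str]]:
    """Hypothetical helper: one row per file whose name matches a pattern."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and any(
            fnmatch.fnmatch(path.name, pattern) for pattern in included_patterns
        ):
            rows.append(
                {
                    # Assumption: filename is relative to the source directory.
                    "filename": path.relative_to(root).as_posix(),
                    "content": path.read_text(encoding="utf-8"),
                }
            )
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        Path(data_dir, "intro.md").write_text("# Hello\n", encoding="utf-8")
        Path(data_dir, "notes.txt").write_text("ignored\n", encoding="utf-8")
        for row in load_rows(data_dir, ["*.md"]):
            print(row["filename"], "->", repr(row["content"]))
```

Each row in the sketch corresponds to what the flow later binds as `doc` inside `data_scope["documents"].row()`.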

<DocumentationButton href="https://cocoindex.io/docs/ops/sources" text="Sources" />

## Process each file and collect

Define a custom function that converts markdown to HTML:

```python
@cocoindex.op.function()

def markdown_to_html(text: str) -> str:
    return _markdown_it.render(text)
```

<DocumentationButton href="https://cocoindex.io/docs/custom_ops/custom_functions" text="Custom Function" margin="0 0 16px 0" />

Define a data collector and transform each document to HTML.

```python
@@ -63,21 +55,27 @@ with data_scope["documents"].row() as doc:
    doc["html"] = doc["content"].transform(markdown_to_html)
    output_html.collect(filename=doc["filename"], html=doc["html"])
```
![Convert markdown to html](/img/examples/custom_targets/convert.png)


## Define the custom target

### Define the target spec

<DocumentationButton href="https://cocoindex.io/docs/custom_ops/custom_targets#target-spec" text="Target Spec" margin="0 0 16px 0" />

The target spec contains a directory for output files:

```python
class LocalFileTarget(cocoindex.op.TargetSpec):
    directory: str
```


### Implement the connector

<DocumentationButton href="https://cocoindex.io/docs/custom_ops/custom_targets#target-connector" text="Target Connector" margin="0 0 16px 0" />

`get_persistent_key()` defines the persistent key,
which uniquely identifies the target for change tracking and incremental updates. Here, we simply use the target directory as the key (e.g., `./data/output`).
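
As a rough illustration of why a stable key matters, the standalone sketch below uses a plain dataclass in place of `cocoindex.op.TargetSpec` and a hypothetical `persistent_key` function in place of the connector method (the real method signature is not shown in this excerpt). It only demonstrates the idea that two specs naming the same directory resolve to the same key, so state tracked under that key carries over between runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalFileTargetSketch:
    """Stand-in for the target spec above (not a cocoindex class)."""

    directory: str


def persistent_key(spec: LocalFileTargetSketch) -> str:
    """Hypothetical stand-in: the output directory identifies the target."""
    return spec.directory


# State tracked per target, keyed by the persistent key.
tracked: dict[str, set[str]] = {}

first_run = LocalFileTargetSketch(directory="./data/output")
tracked.setdefault(persistent_key(first_run), set()).add("intro.md")

# A later run with an equal spec resolves to the same tracked state.
second_run = LocalFileTargetSketch(directory="./data/output")
same_target = persistent_key(second_run) in tracked

# A different directory is a different target with no prior state.
other = LocalFileTargetSketch(directory="./data/other")
is_new_target = persistent_key(other) not in tracked
```

If the key changed between runs for the same directory, previously written files could no longer be matched to the target, which is why the directory itself is a reasonable choice here.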

Expand Down Expand Up @@ -180,17 +178,15 @@ def mutate(
### Use it in the Flow

```python
output_html.export(
    "OutputHtml",
    LocalFileTarget(directory="output_html"),
    primary_key_fields=["filename"],
)
output_html.export(
    "OutputHtml",
    LocalFileTarget(directory="output_html"),
    primary_key_fields=["filename"],
)
```
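
With `primary_key_fields=["filename"]`, each collected row is identified by its filename, so a change to one markdown file only needs to touch one output file. The sketch below is a standalone approximation of that idea, not the connector's actual `mutate()` (whose body is outside this excerpt). The hypothetical `apply_mutations` takes a mapping from filename to new HTML, where `None` is assumed to mean the source file was deleted; the `.html` suffix is also an assumption.

```python
import os
import tempfile
from typing import Optional


def apply_mutations(directory: str, mutations: dict[str, Optional[str]]) -> None:
    """Hypothetical helper: upsert or delete one output file per key."""
    os.makedirs(directory, exist_ok=True)
    for filename, html in mutations.items():
        full_path = os.path.join(directory, filename) + ".html"
        if html is None:
            # Assumed convention: None marks a deleted source row.
            try:
                os.remove(full_path)
            except FileNotFoundError:
                pass  # Already absent, so deletion is idempotent.
        else:
            with open(full_path, "w", encoding="utf-8") as f:
                f.write(html)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        # First pass: two new documents.
        apply_mutations(out_dir, {"a.md": "<h1>A</h1>", "b.md": "<h1>B</h1>"})
        # Second pass: one document changed, one deleted.
        apply_mutations(out_dir, {"a.md": "<h1>A2</h1>", "b.md": None})
        print(sorted(os.listdir(out_dir)))
```

Only keys present in the mapping are touched, which is what keeps each update proportional to the number of changed files rather than the size of the folder.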

## Run the example

Once your pipeline is set up, keeping your knowledge graph updated is simple:

```bash
pip install -e .
cocoindex update --setup main.py
Binary file modified docs/static/img/examples/codebase_index/chunk.png
3 changes: 1 addition & 2 deletions examples/custom_output_files/README.md
@@ -1,5 +1,4 @@
# Build text embedding and semantic search 🔍
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cocoindex-io/cocoindex/blob/main/examples/text_embedding/Text_Embedding.ipynb)
# Export markdown files to local HTML with Custom Targets
[![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)

In this example, we will build an index flow that loads markdown files from a local directory, converts them to HTML, and saves the output to another local directory, powered by [CocoIndex Custom Targets](https://cocoindex.io/docs/custom_ops/custom_targets).