cocoindex-io · badmonster0 · Mar 13, 2025 · Mar 13, 2025
diff --git a/README.md b/README.md
@@ -62,8 +62,8 @@ def text_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind
     with data_scope["documents"].row() as doc:
         # Split the document into chunks, put into `chunks` field
         doc["chunks"] = doc["content"].transform(
-            cocoindex.functions.SplitRecursively(
-                language="markdown", chunk_size=300, chunk_overlap=100))
+            cocoindex.functions.SplitRecursively(),
+            language="markdown", chunk_size=300, chunk_overlap=100)
 
         # Transform data of each chunk
         with doc["chunks"].row() as chunk:

diff --git a/docs/docs/core/flow_def.mdx b/docs/docs/core/flow_def.mdx
@@ -122,14 +122,20 @@ A data slice has a certain data type, and it's the input for most operations.
 `transform()` method transforms the data slice by a function, which creates another data slice.
 A *function spec* needs to be provided for any transform operation, to describe the function and parameters related to the function.
 
+The function takes one or more data arguments.
+The first argument is the data slice being transformed, i.e. the one `transform()` is called on.
+Other arguments can be passed as positional or keyword arguments, after the function spec.
+
 <Tabs>
 <TabItem value="python" label="Python" default>
 
 ```python
 @cocoindex.flow_def(name="DemoFlow")
 def demo_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
   ...
-  data_scope["field1"] = data_scope["documents"].transform(DemoFunctionSpec(...))
+  data_scope["field2"] = data_scope["field1"].transform(
+                             DemoFunctionSpec(...),
+                             arg1, arg2, ..., key0=kwarg0, key1=kwarg1, ...)
   ...
 ```
 

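The calling convention introduced in this change (function spec first, then the remaining data arguments) can be illustrated with a small stand-alone sketch. It does not use CocoIndex and is not how the library is implemented: `SplitSpec`, `split_executor`, `EXECUTORS`, and this toy `DataSlice` are hypothetical names made up for illustration, and the splitter is a naive fixed-width one measured in characters rather than the recursive splitter described in the docs.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SplitSpec:
    """Hypothetical function spec: identifies the function, carries no per-call data."""


def split_executor(text: str, *, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Toy fixed-width splitter standing in for a real function executor."""
    step = chunk_size - chunk_overlap
    if step <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Hypothetical registry mapping each spec type to the code that executes it.
EXECUTORS: dict[type, Callable[..., Any]] = {SplitSpec: split_executor}


class DataSlice:
    """Toy data slice: wraps one value and exposes transform()."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def transform(self, spec: Any, *args: Any, **kwargs: Any) -> "DataSlice":
        # The slice itself becomes the first data argument; everything after
        # the spec is forwarded as additional positional/keyword arguments.
        executor = EXECUTORS[type(spec)]
        return DataSlice(executor(self.value, *args, **kwargs))


content = DataSlice("abcdefgh")
chunks = content.transform(SplitSpec(), chunk_size=4, chunk_overlap=0)
print(chunks.value)  # ['abcd', 'efgh']
```

In this sketch the spec is constructed without arguments and the per-call values travel through `transform()`, which mirrors the `SplitRecursively()` call sites updated in this diff.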
diff --git a/docs/docs/getting_started/quickstart.md b/docs/docs/getting_started/quickstart.md
@@ -78,8 +78,8 @@ def text_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind
     with data_scope["documents"].row() as doc:
         # Split the document into chunks, put into `chunks` field
         doc["chunks"] = doc["content"].transform(
-            cocoindex.functions.SplitRecursively(
-                language="markdown", chunk_size=300, chunk_overlap=100))
+            cocoindex.functions.SplitRecursively(),
+            language="markdown", chunk_size=300, chunk_overlap=100)
 
         # Transform data of each chunk
         with doc["chunks"].row() as chunk:

diff --git a/docs/docs/ops/functions.md b/docs/docs/ops/functions.md
@@ -11,15 +11,12 @@ description: CocoIndex Built-in Functions
 It tries to split at higher-level boundaries. If each chunk is still too large, it tries at the next level of boundaries.
 For example, for a Markdown file, it identifies boundaries in this order: level-1 sections, level-2 sections, level-3 sections, paragraphs, sentences, etc.
 
-The spec takes the following fields:
-
-*   `chunk_size` (type: `int`, required): The maximum size of each chunk, in bytes.
-*   `chunk_overlap` (type: `int`, required): The maximum overlap size between adjacent chunks, in bytes.
-*   `language` (type: `str`, optional): The language of the document. Currently it supports `markdown`, `python` and  `javascript`. If unspecified, will treat it as plain text.
-
 Input data:
 
 *   `text` (type: `str`, required): The text to split.
+*   `chunk_size` (type: `int`, required): The maximum size of each chunk, in bytes.
+*   `chunk_overlap` (type: `int`, optional): The maximum overlap size between adjacent chunks, in bytes.
+*   `language` (type: `str`, optional): The language of the document. Currently supported values are `markdown`, `python` and `javascript`. If unspecified, the input is treated as plain text.
 
 Return type: `Table`, each row represents a chunk, with the following sub fields: