Commit 0bae439

docs: clarify nullability for builtin function argument types (#843)
1 parent d31fd29 commit 0bae439

2 files changed: 12 additions & 11 deletions

docs/docs/core/data_types.mdx

Lines changed: 3 additions & 2 deletions
@@ -204,9 +204,10 @@ Currently, the following types are key types

 CocoIndex supports *Null* values. A *Null* value represents the absence of data or an unknown value, distinct from empty strings, zero numbers, or false boolean values.

-### Nullable Data
+### Nullable Type

-For any data (e.g. a field of a *Struct*, an argument or return value of a CocoIndex function), if it is nullable, it means its value can be *Null*.
+For any data (e.g. a field of a *Struct*, an argument or return value of a CocoIndex function), if it is nullable, it means its value can be *Null*.
+We use a `?` suffix to indicate a nullable type, e.g. *Str?*, *Person?*.

 In Python, *Null* is represented as `None`, so a nullable type can be represented by `T | None` or `typing.Optional[T]`.

docs/docs/ops/functions.md

Lines changed: 9 additions & 9 deletions
@@ -9,12 +9,12 @@ description: CocoIndex Built-in Functions

 `ParseJson` parses a given text to JSON.

-The spec takes the following fields:
+Input data:

-* `text` (`str`): The source text to parse.
-* `language` (`str`, optional): The language of the source text. Only `json` is supported now. Default to `json`.
+* `text` (*Str*): The source text to parse.
+* `language` (*Str?*, default: `"json"`): The language of the source text. Only `json` is supported now.

-Return: *Json*
+Return: *Json*, the parsed JSON object.

 ## SplitRecursively

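To make the `ParseJson` argument types in the hunk above concrete, here is a minimal stand-in built on Python's standard `json` module. `parse_json_sketch` is a hypothetical helper, not the CocoIndex implementation, and treating a *Null* `language` the same as the default `"json"` is an assumption of this sketch.

```python
import json
from typing import Any, Optional


def parse_json_sketch(text: str, language: Optional[str] = "json") -> Any:
    # Assumption: Null (None) falls back to the documented default, "json".
    effective = "json" if language is None else language
    if effective != "json":
        # Only `json` is documented as supported.
        raise ValueError(f"unsupported language: {language!r}")
    # Return the parsed JSON value (dict, list, str, number, bool or None).
    return json.loads(text)
```
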
@@ -37,7 +37,7 @@ Input data:

 * `text` (*Str*): The text to split.
 * `chunk_size` (*Int64*): The maximum size of each chunk, in bytes.
-* `min_chunk_size` (*Int64*, optional): The minimum size of each chunk, in bytes. If not provided, default to `chunk_size / 2`.
+* `min_chunk_size` (*Int64*, default: `chunk_size / 2`): The minimum size of each chunk, in bytes.

 :::note

@@ -48,8 +48,8 @@ Input data:

 :::

-* `chunk_overlap` (*Int64*, optional): The maximum overlap size between adjacent chunks, in bytes.
-* `language` (*Str*, optional): The language of the document.
+* `chunk_overlap` (*Int64?*, default: *Null*): The maximum overlap size between adjacent chunks, in bytes.
+* `language` (*Str*, default: `""`): The language of the document.
 Can be a language name (e.g. `Python`, `Javascript`, `Markdown`) or a file extension (e.g. `.py`, `.js`, `.md`).


@@ -61,7 +61,7 @@ Input data:
 * `custom_languages` in the spec, against the `language_name` or `aliases` field of each entry.
 * Builtin languages (see [Supported Languages](#supported-languages) section below), against the language, aliases or file extensions of each entry.

-All matches are in a case-insensitive manner. If the value of `language` is null, it'll be treated as empty string.
+All matches are in a case-insensitive manner.

 * If no match is found, the input will be treated as plain text.

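The `SplitRecursively` defaults and the case-insensitive language lookup described in the hunks above can be sketched as follows. Everything here is illustrative: `resolve_split_args`, `match_language` and the tiny `BUILTIN_LANGUAGES` table are hypothetical, integer division for `chunk_size / 2` is an assumption, and lookup against `custom_languages` is left out.

```python
from typing import Optional

# Hypothetical, abbreviated table: canonical name -> file extensions.
BUILTIN_LANGUAGES = {
    "Python": [".py"],
    "Javascript": [".js"],
    "Markdown": [".md"],
}


def match_language(language: str = "") -> Optional[str]:
    """Return the canonical language name, or None to mean plain text."""
    wanted = language.lower()  # all matches are case-insensitive
    for name, extensions in BUILTIN_LANGUAGES.items():
        if wanted == name.lower() or wanted in (ext.lower() for ext in extensions):
            return name
    return None  # no match: treat the input as plain text


def resolve_split_args(
    chunk_size: int,
    min_chunk_size: Optional[int] = None,  # None here only means "not provided"
    chunk_overlap: Optional[int] = None,   # Int64?, default Null
    language: str = "",                    # Str, default ""
) -> dict:
    if min_chunk_size is None:
        min_chunk_size = chunk_size // 2  # assumption: integer division
    return {
        "chunk_size": chunk_size,
        "min_chunk_size": min_chunk_size,
        "chunk_overlap": chunk_overlap,
        "language": match_language(language),
    }
```

With these definitions, `resolve_split_args(2000)` fills in `min_chunk_size` as 1000, leaves `chunk_overlap` as `None`, and resolves the empty default `language` to plain text.
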
@@ -185,7 +185,7 @@ Not all LLM APIs support text embedding. See the [LLM API Types table](/docs/ai/

 Input data:

-* `text` (*Str*, required): The text to embed.
+* `text` (*Str*): The text to embed.

 Return: *Vector[Float32, N]*, where *N* is the dimension of the embedding vector determined by the model.
