docs/docs/core/data_types.mdx
+17 -9 lines changed: 17 additions & 9 deletions
Original file line number
Diff line number
Diff line change
@@ -21,12 +21,13 @@ All you need to do is to make sure the data passed to functions and targets are
21
21
Each type in the CocoIndex type system is mapped to one or more types in Python.
22
22
When you define a [custom function](/docs/core/custom_function), you need to annotate the data types of arguments and return values.
23
23
24
-
* When you pass a Python value to the engine (e.g. return values of a custom function), type annotation is required,
25
-
as it provides the ground truth of the data type in the flow.
24
+
* When you pass a Python value to the engine (e.g. return values of a custom function), a specific type annotation is required.
25
+
The type annotation needs to be specific in describing the target data type, as it provides the ground truth of the data type in the flow.
26
26
27
27
* When you use a Python variable to bind to an engine value (e.g. arguments of a custom function),
28
-
we use the type annotation as a guidance to construct the Python value.
29
-
Type annotation is optional for basic types and struct types, and required for table types.
28
+
the engine already knows the specific data type, so a specific type annotation is not required: you can omit the annotation, or use `Any` at any level.
29
+
When a specific type annotation is provided, it's still used as guidance to construct a Python value of a compatible type.
30
+
Otherwise, we will bind to a default Python type.
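The two rules above can be sketched in plain Python. This is only an illustration of which annotations are specific and which are left open. The function name `summarize` and its parameters are hypothetical, and the step that registers it as a CocoIndex custom function is deliberately left out.

```python
from typing import Any, get_type_hints


def summarize(text, metadata: Any) -> str:
    """Arguments may be unannotated or `Any`; the return type is specific."""
    return f"{len(text)} characters"


hints = get_type_hints(summarize)

# The return annotation is the part that must pin down a concrete type.
return_type = hints["return"]

# `text` has no annotation, so it does not appear in the hints at all;
# `metadata` is annotated with `Any`, which is not a specific type either.
argument_hints = {name: tp for name, tp in hints.items() if name != "return"}
```

Here only the return value carries a specific annotation (`str`); both arguments would fall back to whatever default Python type the engine binds.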
30
31
31
32
### Basic Types
32
33
@@ -54,7 +55,7 @@ This is the list of all primitive types supported by CocoIndex:
54
55
Notes:
55
56
56
57
* For some CocoIndex types, we support multiple Python types. You can annotate with any of these Python types.
57
-
The first one is the default type, i.e. CocoIndex will create a value with this type when the type annotation is not provided (e.g. for arguments of a custom function).
58
+
The first one is the default type, i.e. CocoIndex will create a value with this type when a specific type annotation is not provided (e.g. for arguments of a custom function).
58
59
59
60
* All Python types starting with `cocoindex.` are type aliases exported by CocoIndex. They're annotated types based on certain Python types:
60
61
@@ -136,7 +137,7 @@ Both `Person` and `PersonTuple` are valid Struct types in CocoIndex, with identi
136
137
Choose `dataclass` for mutable objects or when you need additional methods, and `NamedTuple` for immutable, lightweight structures.
137
138
138
139
Besides, for arguments of custom functions, CocoIndex also supports using dictionaries (`dict[str, Any]`) to represent a *Struct* type.
139
-
It's the default Python type if you don't annotate the function argument.
140
+
It's the default Python type if you don't annotate the function argument with a specific type.
140
141
141
142
### Table Types
142
143
@@ -152,11 +153,16 @@ The row order of a *KTable* is not preserved.
152
153
Type of the first column (key column) must be a [key type](#key-types).
153
154
154
155
In Python, a *KTable* type is represented by `dict[K, V]`.
155
-
The `V` should be a *Struct* type, either a `dataclass` or `NamedTuple`, representing the value fields of each row.
156
+
The `K` should be the type binding to a key type,
157
+
and the `V` should be the type binding to a *Struct* type representing the value fields of each row.
158
+
When the specific type annotation is not provided,
159
+
the key type is bound to a tuple of its key parts when it's a *Struct* type, and the value type is bound to `dict[str, Any]`.
160
+
161
+
156
162
For example, you can use `dict[str, Person]` or `dict[str, PersonTuple]` to represent a *KTable*, with 4 columns: key (*Str*), `first_name` (*Str*), `last_name` (*Str*), `dob` (*Date*).
163
+
It's bound to `dict[str, dict[str, Any]]` if you don't annotate the function argument with a specific type.
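As a rough illustration of the two shapes, the sketch below builds the same *KTable* rows once as `dict[str, Person]` and once as `dict[str, dict[str, Any]]`. It uses only the standard library; the sample row and the `asdict` conversion are assumptions made to show the shapes, not how the engine constructs the values.

```python
from dataclasses import asdict, dataclass
from datetime import date
from typing import Any


@dataclass
class Person:
    first_name: str
    last_name: str
    dob: date


# Shape received when the argument is annotated as dict[str, Person].
typed_rows: dict[str, Person] = {
    "p1": Person("Ada", "Lovelace", date(1815, 12, 10)),
}

# Shape received without a specific annotation: each row is a plain dict
# keyed by field name.
untyped_rows: dict[str, dict[str, Any]] = {
    key: asdict(person) for key, person in typed_rows.items()
}
```

With the specific annotation, fields are read as attributes (`typed_rows["p1"].dob`); with the default binding, they are read by name (`untyped_rows["p1"]["dob"]`).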
157
164
158
165
Note that if you want to use a *Struct* as the key, you need to ensure its value in Python is immutable. For `dataclass`, annotate it with `@dataclass(frozen=True)`. For `NamedTuple`, immutability is built-in. For example:
159
-
For example:
160
166
161
167
```python
162
168
@dataclass(frozen=True)
@@ -170,14 +176,16 @@ class PersonKeyTuple(NamedTuple):
170
176
```
171
177
172
178
Then you can use `dict[PersonKey, Person]` or `dict[PersonKeyTuple, PersonTuple]` to represent a KTable keyed by `PersonKey` or `PersonKeyTuple`.
179
+
The *KTable* is bound to `dict[tuple[str, str], dict[str, Any]]` if you don't annotate the function argument with a specific type.
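The immutability requirement comes down to hashability of dictionary keys, which the following standard-library sketch demonstrates. `ExampleKey`, `MutableKey`, and their fields `kind` and `value` are hypothetical stand-ins, since the fields of `PersonKey` are not shown in this diff.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ExampleKey:
    # Hypothetical two-part key; frozen=True makes instances hashable.
    kind: str
    value: str


@dataclass
class MutableKey:
    # Same fields, but not frozen: Python leaves it unhashable.
    kind: str
    value: str


# A frozen dataclass instance works as a dict key.
by_struct = {ExampleKey("passport", "X123"): {"first_name": "Ada"}}

# The tuple-keyed shape: each key becomes a tuple of its key parts.
by_tuple = {astuple(key): row for key, row in by_struct.items()}

# A non-frozen dataclass cannot be hashed, so it cannot be a dict key.
try:
    hash(MutableKey("passport", "X123"))
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
```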
173
180
174
181
175
182
#### LTable
176
183
177
184
*LTable* is a *Table* type whose row order is preserved. *LTable* has no key column.
178
185
179
-
In Python, a *LTable* type is represented by `list[R]`, where `R` is a dataclass representing a row.
186
+
In Python, an *LTable* type is represented by `list[R]`, where `R` is the type binding to the *Struct* type representing the value fields of each row.
180
187
For example, you can use `list[Person]` to represent an *LTable* with 3 columns: `first_name` (*Str*), `last_name` (*Str*), `dob` (*Date*).
188
+
It's bound to `list[dict[str, Any]]` if you don't annotate the function argument with a specific type.
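A minimal sketch of the two *LTable* shapes, this time using a `NamedTuple` as the row type. The sample rows are made up, and `_asdict()` is used here only to show what the `list[dict[str, Any]]` form looks like.

```python
from datetime import date
from typing import Any, NamedTuple


class PersonTuple(NamedTuple):
    first_name: str
    last_name: str
    dob: date


# Shape received when the argument is annotated as list[PersonTuple].
typed_rows: list[PersonTuple] = [
    PersonTuple("Grace", "Hopper", date(1906, 12, 9)),
    PersonTuple("Ada", "Lovelace", date(1815, 12, 10)),
]

# Shape received without a specific annotation: a list of plain dicts,
# in the same row order.
untyped_rows: list[dict[str, Any]] = [row._asdict() for row in typed_rows]
```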