This repository was archived by the owner on Oct 10, 2025. It is now read-only.

Commit 448d55d

acquamarin and ray6080 authored
Add doc for duckdb/sqlite/postgres's type conversion (#348)
* Add doc for duckdb's type conversion
* Update rdbms.mdx
* Update rdbms.mdx
* Update rdbms.mdx
* Update rdbms.mdx
* Update src/content/docs/extensions/attach/rdbms.mdx
  Co-authored-by: Guodong Jin <guod.jin@gmail.com>
* Update src/content/docs/extensions/attach/rdbms.mdx
  Co-authored-by: Guodong Jin <guod.jin@gmail.com>
* Update src/content/docs/extensions/attach/rdbms.mdx
  Co-authored-by: Guodong Jin <guod.jin@gmail.com>
* Update src/content/docs/extensions/attach/rdbms.mdx
  Co-authored-by: Guodong Jin <guod.jin@gmail.com>
* Update rdbms.mdx

---------

Co-authored-by: Guodong Jin <guod.jin@gmail.com>
1 parent 3d331c0 commit 448d55d

1 file changed: +123 -18 lines changed
src/content/docs/extensions/attach/rdbms.mdx

Lines changed: 123 additions & 18 deletions
@@ -109,8 +109,42 @@ Result:
 └──────────────┘
 ```

-
-#### 3. Scan from DuckDB tables
+#### 3. Data type mapping from DuckDB to Kùzu
+
+The table below shows the mapping from DuckDB types to Kùzu types:
+| Data type in DuckDB | Corresponding data type in Kùzu |
+|-----------------------------|----------------------------------|
+| BIGINT | INT64 |
+| BIT | UNSUPPORTED |
+| BLOB | BLOB |
+| BOOLEAN | BOOL |
+| DATE | DATE |
+| DECIMAL(prec, scale) | DECIMAL(prec, scale) |
+| DOUBLE | DOUBLE |
+| FLOAT | FLOAT |
+| HUGEINT | INT128 |
+| INTEGER | INT32 |
+| INTERVAL | INTERVAL |
+| SMALLINT | INT16 |
+| TIME | UNSUPPORTED |
+| TIMESTAMP WITH TIME ZONE | UNSUPPORTED |
+| TIMESTAMP | TIMESTAMP |
+| TINYINT | INT8 |
+| UBIGINT | UINT64 |
+| UHUGEINT | UNSUPPORTED |
+| UINTEGER | UINT32 |
+| USMALLINT | UINT16 |
+| UTINYINT | UINT8 |
+| UUID | UUID |
+| VARCHAR | STRING |
+| ENUM | UNSUPPORTED |
+| ARRAY | ARRAY |
+| LIST | LIST |
+| MAP | MAP |
+| STRUCT | STRUCT |
+| UNION | UNION |
+
+#### 4. Scan from DuckDB tables

 Finally, we can utilize the `LOAD FROM` statement to scan the `person` table. Note that you need to prefix the
 external `person` table with the database alias (in our example `uw`). See the `USE` statement which allows you to
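As an illustration of how the DuckDB mapping added in the hunk above might be used, the following Python sketch transcribes the table into a dictionary and flags columns whose DuckDB type has no Kùzu counterpart. The names `DUCKDB_TO_KUZU`, `base_type`, `unsupported_columns`, and the example schema are hypothetical and not part of Kùzu or DuckDB. Parameterized types such as `DECIMAL(18, 3)` are matched on their base name only, and types missing from the table are reported as unsupported as well.

```python
# Mapping transcribed from the DuckDB-to-Kùzu table; None marks UNSUPPORTED.
DUCKDB_TO_KUZU = {
    "BIGINT": "INT64", "BIT": None, "BLOB": "BLOB", "BOOLEAN": "BOOL",
    "DATE": "DATE", "DECIMAL": "DECIMAL", "DOUBLE": "DOUBLE", "FLOAT": "FLOAT",
    "HUGEINT": "INT128", "INTEGER": "INT32", "INTERVAL": "INTERVAL",
    "SMALLINT": "INT16", "TIME": None, "TIMESTAMP WITH TIME ZONE": None,
    "TIMESTAMP": "TIMESTAMP", "TINYINT": "INT8", "UBIGINT": "UINT64",
    "UHUGEINT": None, "UINTEGER": "UINT32", "USMALLINT": "UINT16",
    "UTINYINT": "UINT8", "UUID": "UUID", "VARCHAR": "STRING", "ENUM": None,
    "ARRAY": "ARRAY", "LIST": "LIST", "MAP": "MAP", "STRUCT": "STRUCT",
    "UNION": "UNION",
}


def base_type(duckdb_type: str) -> str:
    """Strip type parameters, e.g. 'decimal(18, 3)' becomes 'DECIMAL'."""
    return duckdb_type.split("(", 1)[0].strip().upper()


def unsupported_columns(schema: dict) -> list:
    """Return column names whose DuckDB type is unsupported or not in the table."""
    return [
        name for name, duckdb_type in schema.items()
        if DUCKDB_TO_KUZU.get(base_type(duckdb_type)) is None
    ]


# Hypothetical schema of a DuckDB table, for illustration only.
example_schema = {
    "name": "VARCHAR",
    "age": "INTEGER",
    "balance": "DECIMAL(18, 3)",
    "lunch_break": "TIME",
}
print(unsupported_columns(example_schema))
```

A check like this could run before attaching a database, so that columns of an unsupported type are noticed up front instead of during a scan.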
@@ -137,7 +171,7 @@ Result:
 ---------------
 ```

-#### 4. USE: Reference database without alias
+#### 5. USE: Reference database without alias

 You can use the `USE` statement for attached databases to use a default database name for future operations.
 This can be used when reading from an attached database to avoid specifying the full database name
@@ -164,7 +198,7 @@ LOAD FROM person
 RETURN *
 ```

-#### 5. Copy data from DuckDB tables
+#### 6. Copy data from DuckDB tables

 One important use case of the external RDBMS extensions is to facilitate seamless data transfer from the external RDBMS to Kùzu.
 In this example, we continue using the `university.db` database created in the last step, but this time,
@@ -187,7 +221,7 @@ If the schemas are not the same, e.g., `Person` contains only `name` property wh
 COPY Person FROM (LOAD FROM uw.person RETURN name);
 ```

-#### 6. Query the data in Kùzu
+#### 7. Query the data in Kùzu

 Finally, we can verify the data in the `Person` table in Kùzu.

@@ -210,7 +244,7 @@ Result:
 ------------------
 ```

-#### 7. Clear attached database schema cache
+#### 8. Clear attached database schema cache

 To avoid redundantly retrieving schema information from attached databases, Kùzu maintains a schema cache
 including table names and their respective columns and types. Should modifications occur in the schema
@@ -319,7 +353,56 @@ The below table lists some common connection string parameters:
 | `password` | Postgres password | [empty] |
 | `port` | Port number | 5432 |

-#### 3. Scan from PostgreSQL tables
+#### 3. Data type mapping from PostgreSQL to Kùzu
+
+The table below shows the mapping from PostgreSQL types to Kùzu types:
+| PostgreSQL Data Type | Corresponding Data Type in Kùzu |
+|-------------------------------------------|----------------------------------|
+| bigint (int8) | INT64 |
+| bigserial (serial8) | INT64 |
+| bit [ (n) ] | STRING |
+| bit varying [ (n) ] (varbit [ (n) ]) | STRING |
+| boolean (bool) | BOOL |
+| box | DOUBLE[] |
+| bytea | BLOB |
+| character [ (n) ] (char [ (n) ]) | STRING |
+| character varying [ (n) ] (varchar [ (n) ]) | STRING |
+| cidr | STRING |
+| circle | DOUBLE[] |
+| date | DATE |
+| double precision (float8) | DOUBLE |
+| inet | STRING |
+| integer (int, int4) | INT32 |
+| interval [ fields ] [ (p) ] | INTERVAL |
+| json | JSON |
+| line | DOUBLE[] |
+| lseg | DOUBLE[] |
+| macaddr | STRING |
+| macaddr8 | STRING |
+| money | STRING |
+| numeric [ (p, s) ] (decimal [ (p, s) ]) | DECIMAL |
+| path | DOUBLE[] |
+| pg_lsn | STRING |
+| pg_snapshot | STRING |
+| point | STRUCT(x DOUBLE, y DOUBLE) |
+| polygon | DOUBLE[] |
+| real (float4) | FLOAT |
+| smallint (int2) | INT16 |
+| smallserial (serial2) | INT16 |
+| serial (serial4) | INT32 |
+| text | STRING |
+| time [ (p) ] [ without time zone ] | UNSUPPORTED |
+| time [ (p) ] with time zone (timetz) | UNSUPPORTED |
+| timestamp [ (p) ] [ without time zone ] | TIMESTAMP |
+| timestamp [ (p) ] with time zone (timestamptz) | UNSUPPORTED |
+| tsquery | STRING |
+| tsvector | STRING |
+| txid_snapshot | STRING |
+| uuid | UUID |
+| xml | STRING |
+
+
+#### 4. Scan from PostgreSQL tables

 Finally, we can utilize the `LOAD FROM` statement to scan the `Person` table.

@@ -344,7 +427,7 @@ Result:
 ---------------
 ```

-#### 4. USE: Reference database without alias
+#### 5. USE: Reference database without alias

 You can use the `USE` statement for attached databases to use a default database name for future operations.
 This can be used when reading from an attached database to avoid specifying the full database name
@@ -371,7 +454,7 @@ LOAD FROM person
 RETURN *
 ```

-#### 5. Copy data from PostgreSQL tables
+#### 6. Copy data from PostgreSQL tables

 One important use case of the external RDBMS extensions is to facilitate seamless data transfer from the external RDBMS to Kùzu.
 In this example, we continue using the `university.db` database created in the last step, but this time,
@@ -394,7 +477,7 @@ If the schemas are not the same, e.g., `Person` contains only `name` property wh
 COPY Person FROM (LOAD FROM uw.person RETURN name);
 ```

-#### 6. Query the data in Kùzu
+#### 7. Query the data in Kùzu

 Finally, we can verify the data in the `Person` table in Kùzu.

@@ -417,7 +500,7 @@ Result:
 ------------------
 ```

-#### 7. Clear attached database schema cache
+#### 8. Clear attached database schema cache

 To avoid redundantly retrieving schema information from attached databases, Kùzu maintains a schema cache
 including table names and their respective columns and types. Should modifications occur in the schema
@@ -431,7 +514,7 @@ CALL clear_attached_db_cache()
 Note: If you have attached to databases from different
 RDBMSs, say Postgres, DuckDB, and SQLite, this call will clear the cache for all of them.

-#### 8. Detach database
+#### 9. Detach database

 To detach a database, use `DETACH [ALIAS]` as follows:

@@ -489,7 +572,29 @@ the alias `uw`:
 ATTACH 'university.db' AS uw (dbtype sqlite);
 ```

-#### 3. Scan from SQLite tables
+#### 3. Data type mapping from SQLite to Kùzu
+
+The table below shows the mapping from SQLite types to Kùzu types:
+| SQLite Storage Class / Datatype | Corresponding Data Type in Kùzu |
+|--------------------------------------------|----------------------------------|
+| NULL | BLOB |
+| INTEGER | INT64 |
+| REAL | DOUBLE |
+| TEXT | STRING |
+| BLOB | BLOB |
+| BOOLEAN | INT64 |
+| DATE | DATE |
+| TIME | TIMESTAMP |
+
+Note: SQLite uses a [dynamic type system](https://www.sqlite.org/datatype3.html), so a single SQLite column can store values of different types. The `sqlite_all_varchar_option` option is provided for scanning such columns in Kùzu.
+Usage:
+```
+CALL sqlite_all_varchar_option=<OPTION>
+```
+If `sqlite_all_varchar_option` is set to true, all SQLite columns are treated and scanned as `STRING` columns.
+If `sqlite_all_varchar_option` is set to false, scanning a column that contains values incompatible with its declared data type results in a runtime exception.
+
+#### 4. Scan from SQLite tables

 Finally, we can utilize the `LOAD FROM` statement to scan the `person` table.

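The dynamic typing described in the SQLite note in the hunk above can be observed with Python's standard `sqlite3` module. This is a minimal sketch on a hypothetical in-memory table `t` whose column is declared `INTEGER`; it only demonstrates SQLite's own behavior and does not involve Kùzu.

```python
import sqlite3

# Hypothetical in-memory database; the table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val INTEGER)")

# SQLite accepts all three values even though the column is declared INTEGER.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), ("hello",), (2.5,)])

# typeof() reports the storage class actually used for each stored value.
rows = conn.execute("SELECT val, typeof(val) FROM t ORDER BY rowid").fetchall()
for value, storage_class in rows:
    print(repr(value), storage_class)

conn.close()
```

The three rows come back with the storage classes `integer`, `text`, and `real`. Going by the note above, scanning a column like this with `sqlite_all_varchar_option` set to false would hit the incompatible `'hello'` value and raise a runtime exception, while setting the option to true would scan all three values as strings.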
@@ -514,7 +619,7 @@ Result:
 ---------------
 ```

-#### 4. USE: Reference database without alias
+#### 5. USE: Reference database without alias

 You can use the `USE` statement for attached databases to use a default database name for future operations.
 This can be used when reading from an attached database to avoid specifying the full database name
@@ -541,7 +646,7 @@ LOAD FROM person
 RETURN *
 ```

-#### 5. Copy data from SQLite tables
+#### 6. Copy data from SQLite tables

 One important use case of the external RDBMS extensions is to facilitate seamless data transfer from the external RDBMS to Kùzu.
 In this example, we continue using the `university.db` database created in the last step, but this time,
@@ -564,7 +669,7 @@ If the schemas are not the same, e.g., `Person` contains only `name` property wh
 COPY Person FROM (LOAD FROM uw.person RETURN name);
 ```

-#### 6. Query the data in Kùzu
+#### 7. Query the data in Kùzu

 Finally, we can verify the data in the `Person` table in Kùzu.

@@ -587,7 +692,7 @@ Result:
 ------------------
 ```

-#### 7. Clear attached database schema cache
+#### 8. Clear attached database schema cache

 To avoid redundantly retrieving schema information from attached databases, Kùzu maintains a schema cache
 including table names and their respective columns and types. Should modifications occur in the schema
@@ -601,7 +706,7 @@ CALL clear_attached_db_cache()
 Note: If you have attached to databases from different
 RDBMSs, say Postgres, DuckDB, and SQLite, this call will clear the cache for all of them.

-#### 8. Detach database
+#### 9. Detach database

 To detach a database, use `DETACH [ALIAS]` as follows:
