This repository was archived by the owner on Oct 10, 2025. It is now read-only.

Commit b347ae5

prrao87 and acquamarin authored
Add doc for file-format option (#342) (#343)
* Add doc for file-format
* Update index.mdx
* Apply suggestions from code review

---------

Co-authored-by: ziyi chen <chenziyi990424@gmail.com>
1 parent 789dd4c commit b347ae5

2 files changed: +23 -1 lines changed

src/content/docs/cypher/query-clauses/load-from.md

Lines changed: 14 additions & 1 deletion
@@ -144,7 +144,20 @@ You can also see the details of any warnings generated by the skipped lines usin
 See the [ignoring erroneous rows section of `COPY FROM`](import#ignore-erroneous-rows) for more details.
 
 ## Scan Data Formats
-Load from can scan several raw or in-memory file formats, such as CSV, Parquet, Pandas, Polars, Arrow tables, and JSON.
+`LOAD FROM` can scan several raw or in-memory file formats, such as CSV, Parquet, Pandas, Polars, Arrow tables, and JSON.
+
+### File format detection
+`Load from` determines the file format based on the file extension if the `file_format` option is not given. For instance, files with a `.csv` extension are automatically recognized as CSV format.
+
+If the file format cannot be inferred from the extension, or if you need to override the default sniffing behaviour, the `file_format` option can be used.
+
+For example, to load a CSV file that has a `.tsv` extension (for tab-separated data), you must explicitly specify the file format using the `file_format` option, as shown below:
+```
+LOAD FROM 'data.tsv' (file_format='csv')
+RETURN *
+```
+
+
 Below we give examples of using `LOAD FROM` to scan data from each of these formats. We assume `WITH HEADERS`
 is not used in the examples below, so we discuss how Kùzu infers the variable names and data types of
 that bind to the scanned tuples.
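
Reviewer note: the rule this hunk documents (an explicit `file_format` wins; otherwise the extension decides) can be sketched in a few lines of Python. This is an illustrative model only, not Kùzu's implementation: the function name `resolve_file_format` and the `EXTENSION_FORMATS` table are hypothetical, and the set of extensions Kùzu actually recognizes may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension-to-format table, for illustration only;
# this is NOT the list of extensions Kùzu recognizes.
EXTENSION_FORMATS = {".csv": "csv", ".parquet": "parquet", ".json": "json"}


def resolve_file_format(path: str, file_format: Optional[str] = None) -> str:
    """Return the explicit file_format if given, else infer it from the extension."""
    if file_format is not None:
        # An explicit option overrides whatever the extension suggests.
        return file_format.lower()
    suffix = Path(path).suffix.lower()
    if suffix not in EXTENSION_FORMATS:
        # Mirrors the documented case where the format cannot be inferred.
        raise ValueError(f"cannot infer file format from {path!r}; pass file_format")
    return EXTENSION_FORMATS[suffix]
```

Under these assumptions, `resolve_file_format("data.tsv", file_format="csv")` resolves to `"csv"`, matching the `LOAD FROM 'data.tsv' (file_format='csv')` example, while calling it on `"data.tsv"` alone raises an error because the sketch's table has no `.tsv` entry.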

src/content/docs/import/index.mdx

Lines changed: 9 additions & 0 deletions
@@ -52,6 +52,15 @@ The following sections show how to bulk import data using `COPY FROM` on various
 
 </CardGrid>
 
+:::caution[Note]
+Similar to the [LOAD FROM](/cypher/query-clauses/load-from.md) clause, the `COPY FROM` clause determines the file format based on the file extension if the `file_format` option is not provided. Alternatively, the `file_format` option can be used in the `COPY FROM` clause to explicitly specify the file format.
+
+Example: To copy from a file ending with an arbitrary extension like `.dsv`, use the `file_format = 'csv'` option to explicitly tell Kùzu to treat the file as a `CSV` file.
+```
+COPY person FROM 'person.dsv' (file_format = 'csv')
+```
+:::
+
 ## `COPY FROM` a partial subset
 
 In certain cases, you may only want to partially fill your Kùzu table using the data from your input
