This repository was archived by the owner on Oct 10, 2025. It is now read-only.
Merged
15 changes: 14 additions & 1 deletion src/content/docs/cypher/query-clauses/load-from.md
@@ -144,7 +144,20 @@ You can also see the details of any warnings generated by the skipped lines using the
See the [ignoring erroneous rows section of `COPY FROM`](import#ignore-erroneous-rows) for more details.

## Scan Data Formats
Load from can scan several raw or in-memory file formats, such as CSV, Parquet, Pandas, Polars, Arrow tables, and JSON.
`LOAD FROM` can scan several raw or in-memory file formats, such as CSV, Parquet, Pandas, Polars, Arrow tables, and JSON.

### File format detection
`LOAD FROM` infers the file format from the file extension when the `file_format` option is not given. For instance, a file with a `.csv` extension is automatically treated as CSV.

If the file format cannot be inferred from the extension, or if you need to override the extension-based detection, use the `file_format` option.

For example, to load a file with a `.tsv` extension (tab-separated data) as CSV, specify the format explicitly with the `file_format` option:
```
LOAD FROM 'data.tsv' (file_format='csv')
RETURN *
```
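
The precedence described above (an explicit `file_format` wins, otherwise the extension decides) can be sketched in a few lines of Python. This is only an illustration of the rule, not Kùzu's implementation: the function name `resolve_file_format`, the `EXTENSION_FORMATS` table, and the choice to raise an error for an unknown extension are all assumptions made for this example.

```python
from pathlib import PurePath
from typing import Optional

# Hypothetical extension table for illustration only; not Kùzu's actual list.
EXTENSION_FORMATS = {".csv": "csv", ".parquet": "parquet", ".json": "json"}


def resolve_file_format(path: str, file_format: Optional[str] = None) -> str:
    """Pick a format: an explicit option wins, else fall back to the extension."""
    if file_format is not None:
        return file_format.lower()
    suffix = PurePath(path).suffix.lower()
    if suffix in EXTENSION_FORMATS:
        return EXTENSION_FORMATS[suffix]
    # Assumed behaviour for this sketch: refuse to guess an unknown extension.
    raise ValueError(f"cannot infer format of {path!r}; pass file_format")
```

Under these assumptions, `resolve_file_format('data.tsv', 'csv')` takes the explicit option, while `resolve_file_format('data.tsv')` raises because `.tsv` is not in the table.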


Below we give examples of using `LOAD FROM` to scan data from each of these formats. We assume `WITH HEADERS`
is not used in the examples below, so we discuss how Kùzu infers the names and data types of
the variables that bind to the scanned tuples.
9 changes: 9 additions & 0 deletions src/content/docs/import/index.mdx
@@ -52,6 +52,15 @@ The following sections show how to bulk import data using `COPY FROM` on various

</CardGrid>

:::caution[Note]
Like the [`LOAD FROM`](/cypher/query-clauses/load-from.md) clause, the `COPY FROM` clause infers the file format from the file extension when the `file_format` option is not provided. To override this, specify the format explicitly with the `file_format` option.

Example: To copy from a file ending with an arbitrary extension like `.dsv`, use the `file_format = 'csv'` option to explicitly tell Kùzu to treat the file as a `CSV` file.
```
COPY person FROM 'person.dsv' (file_format = 'csv')
```
:::

## `COPY FROM` a partial subset

In certain cases, you may only want to partially fill your Kùzu table using the data from your input