36 changes: 19 additions & 17 deletions docs/preview/data/csv/auto_detection.md
@@ -109,34 +109,36 @@ The type detection works by attempting to convert the values in each column to t

<div class="monospace_table"></div>

| Types |
|-----------|
| BOOLEAN |
| BIGINT |
| DOUBLE |
| TIME |
| DATE |
| TIMESTAMP |
| VARCHAR |

Everything can be cast to `VARCHAR`, therefore, this type has the lowest priority meaning that all columns are converted to `VARCHAR` if they cannot be cast to anything else.
| Types |
|-------------|
| NULL |
| BOOLEAN |
| TIME |
| DATE |
| TIMESTAMP |
| TIMESTAMPTZ |
| BIGINT |
| DOUBLE |
| VARCHAR |

Everything can be cast to `VARCHAR`, therefore, this type has the lowest priority meaning that all columns are converted to `VARCHAR` as a fallback if they cannot be cast to anything else.
In [`flights.csv`](/data/flights.csv) the `FlightDate` column will be cast to a `DATE`, while the other columns will be cast to `VARCHAR`.

The set of candidate types that should be considered by the CSV reader can be explicitly specified using the [`auto_type_candidates`]({% link docs/preview/data/csv/overview.md %}#auto_type_candidates-details) option.
The set of candidate types that should be considered by the CSV reader can be specified explicitly using the [`auto_type_candidates`]({% link docs/preview/data/csv/overview.md %}#auto_type_candidates-details) option. `VARCHAR` as the fallback type will always be considered as a candidate type whether you specify it or not.

In addition to the default set of candidate types, other types that may be specified using the `auto_type_candidates` options are:
Here are all additional candidate types that may be specified using the `auto_type_candidates` option, in order of priority:

<div class="monospace_table"></div>

| Types |
|-----------|
| TINYINT |
| SMALLINT |
| INTEGER |
| DECIMAL |
| FLOAT |
| INTEGER |
| SMALLINT |
| TINYINT |

Even though the set of data types that can be automatically detected may appear quite limited, the CSV reader can configured to read arbitrarily complex types by using the `types`-option described in the next section.
Even though the set of data types that can be automatically detected may appear quite limited, the CSV reader can be configured to read arbitrarily complex types by using the `types`-option described in the next section.

Type detection can be entirely disabled by using the `all_varchar` option. If this is set all columns will remain as `VARCHAR` (as they originally occur in the CSV file).
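
The priority-based detection described in this hunk can be illustrated with a small sketch. This is not DuckDB's implementation: the function `detect_type` and the helper parsers are hypothetical, only a subset of the candidate types from the new table is covered, and the parsing rules are deliberately simplified (Python's own `int`, `float` and `date.fromisoformat` stand in for the real casts). It only shows the idea of trying candidates in priority order and falling back to `VARCHAR`.

```python
from datetime import date

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def _is_boolean(value: str) -> bool:
    # Simplified: only the literals "true" and "false" are accepted.
    return value.lower() in {"true", "false"}


def _is_date(value: str) -> bool:
    # Simplified: only ISO 8601 dates are accepted.
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def _is_bigint(value: str) -> bool:
    # Accept integers that fit into a signed 64-bit range.
    try:
        return INT64_MIN <= int(value) <= INT64_MAX
    except ValueError:
        return False


def _is_double(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False


# Subset of the candidate types, highest priority first (assumed to follow
# the table order). VARCHAR accepts everything and is the fallback.
CANDIDATES = [
    ("BOOLEAN", _is_boolean),
    ("DATE", _is_date),
    ("BIGINT", _is_bigint),
    ("DOUBLE", _is_double),
    ("VARCHAR", lambda value: True),
]


def detect_type(values: list[str]) -> str:
    """Return the first candidate type that accepts every value in the column."""
    for type_name, accepts in CANDIDATES:
        if all(accepts(value) for value in values):
            return type_name
    return "VARCHAR"
```

For example, `detect_type(["1", "2.5"])` rejects `BIGINT` because of `2.5` and settles on `DOUBLE`, while a column that mixes text and numbers ends up as `VARCHAR`.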

2 changes: 1 addition & 1 deletion docs/preview/data/csv/overview.md
@@ -130,7 +130,7 @@ Usage example:
SELECT * FROM read_csv('csv_file.csv', auto_type_candidates = ['BIGINT', 'DATE']);
```

The default value for the `auto_type_candidates` option is `['SQLNULL', 'BOOLEAN', 'BIGINT', 'DOUBLE', 'TIME', 'DATE', 'TIMESTAMP', 'VARCHAR']`.
The default value for the `auto_type_candidates` option is `['NULL', 'BOOLEAN', 'TIME', 'DATE', 'TIMESTAMP', 'TIMESTAMPTZ', 'BIGINT', 'DOUBLE', 'VARCHAR']`.

Contributor comment:

The order doesn't really matter here, since it's internally sorted when the option is set. Still, this might make it more readable.
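
A rough sketch of what "internally sorted" could mean for the default candidates: the user-supplied list is reordered by priority, and `VARCHAR` is always included as the fallback. The function name `normalize_candidates` is hypothetical, and the sort order is assumed to match the order of the new default list in this diff; the actual internal ordering may differ.

```python
# Default candidates from the updated documentation, assumed to be listed
# from highest to lowest priority.
DEFAULT_PRIORITY = [
    "NULL",
    "BOOLEAN",
    "TIME",
    "DATE",
    "TIMESTAMP",
    "TIMESTAMPTZ",
    "BIGINT",
    "DOUBLE",
    "VARCHAR",
]


def normalize_candidates(candidates: list[str]) -> list[str]:
    """Sort the requested candidate types by priority and always add VARCHAR."""
    requested = {name.upper() for name in candidates} | {"VARCHAR"}
    unknown = requested - set(DEFAULT_PRIORITY)
    if unknown:
        # This sketch only knows the default candidate types.
        raise ValueError(f"unsupported candidate types: {sorted(unknown)}")
    return [name for name in DEFAULT_PRIORITY if name in requested]
```

Under these assumptions, `['BIGINT', 'DATE']` and `['DATE', 'BIGINT']` normalize to the same list, which is why the order in which the option is written does not change the result.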


## CSV Functions
