Commit d99d766

docs: add schema inference note for read_csv, read_parquet, read_avro, and COPY FROM (FIR-50448) (#1102)
1 parent ff4de4c commit d99d766

4 files changed: 9 additions & 1 deletion

docs-mdx/reference-sql/commands/data-management/copy-from.mdx

Lines changed: 2 additions & 0 deletions
@@ -255,6 +255,8 @@ You can use the automatic schema discovery feature in `COPY FROM` to handle even
 * **Parquet files** - Firebolt automatically reads metadata in Parquet files to create corresponding target tables.
 * **CSV files** - Firebolt infers column types based on the data content itself, which can streamline the initial data loading process significantly. Use `WITH HEADER=TRUE` if your CSV file contains column names in the first line.
 
+When loading multiple files, the schema is inferred from the most recently modified file.
+
 Automatic schema discovery operates on a "best effort" basis, and attempts to balance accuracy with practical usability, but it may not always be error-free.
 
 The following query reads `levels.csv`, a sample dataset from the fictional “Ultra Fast Gaming Inc.” company. The example implicitly uses automatic schema creation with `AUTO_CREATE=TRUE`, which defaults to `TRUE`, and also triggers automatic table creation:

docs-mdx/reference-sql/functions-reference/table-valued/read_avro.mdx

Lines changed: 3 additions & 1 deletion
@@ -53,6 +53,8 @@ READ_AVRO (
 
 The result is a table with data from the Avro files. Columns are read and parsed using their inferred data types based on the Avro schema. All data types are inferred as nullable.
 
+When reading multiple files, the schema is inferred from the most recently modified file.
+
 ## Avro Data Type Mapping
 
 `READ_AVRO` supports all Avro data types with the following mappings:
@@ -224,4 +226,4 @@ The pattern will recursively match files in all subdirectories. For example:
 ```sql
 SELECT * FROM READ_AVRO('s3://firebolt-publishing-public/help_center_assets/firebolt_sample_avro/*.avro')
 ```
-will read all Avro files in the `firebolt_sample_avro` directory and all of its subdirectories.
+will read all Avro files in the `firebolt_sample_avro` directory and all of its subdirectories.

docs-mdx/reference-sql/functions-reference/table-valued/read_csv.mdx

Lines changed: 2 additions & 0 deletions
@@ -72,6 +72,8 @@ READ_CSV (
 
 The result is a table with the data from the CSV file. Each cell is read as a `TEXT`.
 
+When reading multiple files, the schema is inferred from the most recently modified file.
+
 ## Examples
 
 ### Using LOCATION object

docs-mdx/reference-sql/functions-reference/table-valued/read_parquet.mdx

Lines changed: 2 additions & 0 deletions
@@ -58,6 +58,8 @@ READ_PARQUET (
 
 The result is a table with data from the Parquet files. Columns are read and parsed using their inferred data types.
 
+When reading multiple files, the schema is inferred from the most recently modified file.
+
 ## Best practice
 
 Firebolt recommends using a `LOCATION` object to store credentials for authentication.
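
Reviewer note: the sentence added in all four files states a selection rule (the most recently modified file supplies the schema). The sketch below is only a toy model of that rule, not Firebolt's implementation. `FileInfo`, `schema_source`, the file paths, and the column names are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileInfo:
    # Hypothetical stand-in for one entry of an object-store listing.
    path: str
    last_modified: datetime
    columns: tuple  # column names found in this file


def schema_source(files):
    """Return the file whose schema would be used under the documented rule."""
    if not files:
        raise ValueError("no files matched")
    # The rule from the docs: the most recently modified file wins.
    return max(files, key=lambda f: f.last_modified)


# Hypothetical listing: the newer file carries an extra column.
files = [
    FileInfo("data/2024-01.csv", datetime(2024, 1, 31), ("id", "name")),
    FileInfo("data/2024-02.csv", datetime(2024, 2, 29), ("id", "name", "level")),
]
chosen = schema_source(files)
print(chosen.path, chosen.columns)
```

In this toy model the inferred schema comes entirely from `data/2024-02.csv`; how the real engine treats files whose columns differ from that schema is not covered by the commit.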
