Commit 55bdc83

docs: add schema inference note block with type compatibility warning (FIR-50448) (#1104)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: [email protected] <[email protected]>
1 parent d99d766 commit 55bdc83

File tree

4 files changed: 13 additions, 5 deletions

docs-mdx/reference-sql/commands/data-management/copy-from.mdx

Lines changed: 3 additions & 1 deletion
@@ -255,7 +255,9 @@ You can use the automatic schema discovery feature in `COPY FROM` to handle even
 * **Parquet files** - Firebolt automatically reads metadata in Parquet files to create corresponding target tables.
 * **CSV files** - Firebolt infers column types based on the data content itself, which can streamline the initial data loading process significantly. Use `WITH HEADER=TRUE` if your CSV file contains column names in the first line.

-When loading multiple files, the schema is inferred from the most recently modified file.
+<Note>
+**When loading multiple files, Firebolt infers the schema from the most recently modified file.** The remaining files must have compatible data types. If types vary between files (e.g., a column contains integers in one file but doubles in another, or is numeric in one file but text in another), the inferred schema may not match all files and thus cause data type errors or query failures. In such cases, we recommend defining an explicit schema using either [external tables](/reference-sql/commands/data-definition/create-external-table) or [`COPY FROM`](/reference-sql/commands/data-management/copy-from) into existing tables.
+</Note>

 Automatic schema discovery operates on a "best effort" basis, and attempts to balance accuracy with practical usability, but it may not always be error-free.

docs-mdx/reference-sql/functions-reference/table-valued/read_avro.mdx

Lines changed: 4 additions & 2 deletions
@@ -53,7 +53,9 @@ READ_AVRO (

 The result is a table with data from the Avro files. Columns are read and parsed using their inferred data types based on the Avro schema. All data types are inferred as nullable.

-When reading multiple files, the schema is inferred from the most recently modified file.
+<Note>
+**When loading multiple files, Firebolt infers the schema from the most recently modified file.** The remaining files must have compatible data types. If types vary between files (e.g., a column contains integers in one file but doubles in another, or is numeric in one file but text in another), the inferred schema may not match all files and thus cause data type errors or query failures. In such cases, we recommend defining an explicit schema using either [external tables](/reference-sql/commands/data-definition/create-external-table) or [`COPY FROM`](/reference-sql/commands/data-management/copy-from) into existing tables.
+</Note>

 ## Avro Data Type Mapping

@@ -226,4 +228,4 @@ The pattern will recursively match files in all subdirectories. For example:
 ```sql
 SELECT * FROM READ_AVRO('s3://firebolt-publishing-public/help_center_assets/firebolt_sample_avro/*.avro')
 ```
-will read all Avro files in the `firebolt_sample_avro` directory and all of its subdirectories.
+will read all Avro files in the `firebolt_sample_avro` directory and all of its subdirectories.

docs-mdx/reference-sql/functions-reference/table-valued/read_csv.mdx

Lines changed: 3 additions & 1 deletion
@@ -72,7 +72,9 @@ READ_CSV (

 The result is a table with the data from the CSV file. Each cell is read as a `TEXT`.

-When reading multiple files, the schema is inferred from the most recently modified file.
+<Note>
+**When loading multiple files, Firebolt infers the schema from the most recently modified file.** The remaining files must have compatible data types. If types vary between files (e.g., a column contains integers in one file but doubles in another, or is numeric in one file but text in another), the inferred schema may not match all files and thus cause data type errors or query failures. In such cases, we recommend defining an explicit schema using either [external tables](/reference-sql/commands/data-definition/create-external-table) or [`COPY FROM`](/reference-sql/commands/data-management/copy-from) into existing tables.
+</Note>

 ## Examples

docs-mdx/reference-sql/functions-reference/table-valued/read_parquet.mdx

Lines changed: 3 additions & 1 deletion
@@ -58,7 +58,9 @@ READ_PARQUET (

 The result is a table with data from the Parquet files. Columns are read and parsed using their inferred data types.

-When reading multiple files, the schema is inferred from the most recently modified file.
+<Note>
+**When loading multiple files, Firebolt infers the schema from the most recently modified file.** The remaining files must have compatible data types. If types vary between files (e.g., a column contains integers in one file but doubles in another, or is numeric in one file but text in another), the inferred schema may not match all files and thus cause data type errors or query failures. In such cases, we recommend defining an explicit schema using either [external tables](/reference-sql/commands/data-definition/create-external-table) or [`COPY FROM`](/reference-sql/commands/data-management/copy-from) into existing tables.
+</Note>

 ## Best practice
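
Reviewer note: the `<Note>` added to all four files describes one failure mode, where the schema comes from the most recently modified file and older files may disagree with it. The Python sketch below only illustrates that rule; it is not Firebolt's implementation. `FileInfo`, `infer_schema`, `find_mismatches`, the file names, and the type labels are all hypothetical, and the sketch assumes that any difference in type counts as incompatible, which may be stricter than what the engine actually enforces.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a staged file: its modification time and the
# column types a reader would infer from it. Not a Firebolt structure.
@dataclass
class FileInfo:
    name: str
    modified: int        # e.g. a Unix timestamp
    column_types: dict   # column name -> inferred type label


def infer_schema(files):
    """Return the column types of the most recently modified file."""
    newest = max(files, key=lambda f: f.modified)
    return newest.column_types


def find_mismatches(files):
    """List (file, column, file_type, schema_type) where the types differ.

    Assumption: any difference is treated as a mismatch; a real engine
    may accept some implicit conversions.
    """
    schema = infer_schema(files)
    mismatches = []
    for f in files:
        for column, type_name in f.column_types.items():
            expected = schema.get(column)
            if expected is not None and expected != type_name:
                mismatches.append((f.name, column, type_name, expected))
    return mismatches


# The older file holds doubles in "price"; the newer one holds integers.
files = [
    FileInfo("a.csv", modified=100, column_types={"id": "integer", "price": "double"}),
    FileInfo("b.csv", modified=200, column_types={"id": "integer", "price": "integer"}),
]
mismatches = find_mismatches(files)
```

Here the schema is taken from `b.csv`, so `price` is treated as an integer column and `a.csv` is flagged. This is the situation in which the note recommends an explicit schema through an external table or `COPY FROM` into an existing table.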
