Commit 406a6bb

Merge pull request #230198 from ilijazagorac/patch-1
Schema inference causes missing columns
2 parents e1ac83d + 10ebe6b commit 406a6bb

1 file changed: +2 -0 lines changed

articles/synapse-analytics/sql/develop-openrowset.md

Lines changed: 2 additions & 0 deletions
@@ -310,6 +310,8 @@ Parquet files contain column metadata, which will be read, type mappings can be
For CSV files, column names can be read from the header row. You can specify whether a header row exists using the HEADER_ROW argument. If HEADER_ROW = FALSE, generic column names will be used: C1, C2, ... Cn, where n is the number of columns in the file. Data types will be inferred from the first 100 data rows. Check [reading CSV files without specifying schema](#read-csv-files-without-specifying-schema) for samples.
+Keep in mind that if you read multiple files at once, the schema is inferred from the first file the service gets from storage. As a result, some expected columns might be omitted because the file used to define the schema did not contain them. In that case, use the OPENROWSET WITH clause to specify the schema explicitly.
+
> [!IMPORTANT]
> There are cases when the appropriate data type cannot be inferred due to lack of information, and a larger data type is used instead. This brings performance overhead and is particularly important for character columns, which will be inferred as varchar(8000). For optimal performance, please [check inferred data types](./best-practices-serverless-sql-pool.md#check-inferred-data-types) and [use appropriate data types](./best-practices-serverless-sql-pool.md#use-appropriate-data-types).
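The WITH clause recommended in the added paragraph could look roughly like the query below. This is only a sketch, not taken from the changed article: the storage URL is a placeholder, and the column names and types (`customer_id`, `customer_name`, `signup_date`) are hypothetical. It assumes a folder of Parquet files in which some files lack one of the listed columns.

```sql
-- Sketch only: placeholder URL and hypothetical column names.
-- Declaring the schema in WITH means the result set does not depend on
-- whichever file the service happens to read first.
SELECT
    r.customer_id,
    r.customer_name,
    r.signup_date
FROM OPENROWSET(
        BULK 'https://<storage-account>.dfs.core.windows.net/<container>/<folder>/*.parquet',
        FORMAT = 'PARQUET'
    )
    WITH (
        customer_id   INT,
        customer_name VARCHAR(100),
        signup_date   DATE
    ) AS r;
```

With an explicit column list, the query returns the same set of columns regardless of file order. Choosing tight types such as `VARCHAR(100)` also avoids the `varchar(8000)` fallback described in the IMPORTANT note. How files that lack a declared column are handled is worth verifying against your own data.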
