
Commit 8593cc3

Merge pull request #288509 from whhender/public-updates
Public updates
2 parents 6ee72a7 + cffe3a5 · commit 8593cc3

File tree

1 file changed: +11 -11 lines changed
  • articles/synapse-analytics/metadata


articles/synapse-analytics/metadata/table.md

Lines changed: 11 additions & 11 deletions
@@ -27,7 +27,7 @@ Since the tables are synchronized to serverless SQL pool asynchronously, there w

Use Spark to manage Spark created databases. For example, delete it through a serverless Apache Spark pool job, and create tables in it from Spark.

-Objects in synchronized databases cannot be modified from serverless SQL pool.
+Objects in synchronized databases can't be modified from serverless SQL pool.
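(Not part of the diff.) A minimal Spark SQL sketch of what managing a Spark-created database from Spark looks like; the database name is hypothetical:

```sql
-- Run from a Spark pool, not from serverless SQL pool (which can't modify these objects).
-- Drops the hypothetical Spark-created database and its tables; its synchronized
-- representation should disappear from serverless SQL pool after the next sync.
DROP DATABASE IF EXISTS mysparkdb CASCADE;
```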
## Expose a Spark table in SQL

@@ -43,7 +43,7 @@ Spark provides two types of tables that Azure Synapse exposes in SQL automatical

Spark also provides ways to create external tables over existing data, either by providing the `LOCATION` option or using the Hive format. Such external tables can be over a variety of data formats, including Parquet.

-Azure Synapse currently only shares managed and external Spark tables that store their data in Parquet, DELTA, or CSV format with the SQL engines. Tables backed by other formats are not automatically synced. You may be able to sync such tables explicitly yourself as an external table in your own SQL database if the SQL engine supports the table's underlying format.
+Azure Synapse currently only shares managed and external Spark tables that store their data in Parquet, DELTA, or CSV format with the SQL engines. Tables backed by other formats aren't automatically synced. You can sync such tables explicitly yourself as an external table in your own SQL database if the SQL engine supports the table's underlying format.

> [!NOTE]
> Currently, only Parquet and CSV formats are fully supported in serverless SQL pool. Spark Delta tables are also available in the serverless SQL pool, but this feature is in **public preview**. External tables created in Spark are not available in dedicated SQL pool databases.
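(Not part of the diff.) A short Spark SQL sketch of the distinction the changed paragraph describes; the database and table names are hypothetical:

```sql
-- Stored as Parquet, so it is eligible for automatic sharing with the SQL engines.
CREATE TABLE mysparkdb.sales_parquet (id INT, amount DECIMAL(10, 2)) USING PARQUET;

-- Stored as ORC, so it isn't synced automatically; it would have to be exposed
-- manually as an external table on the SQL side, if the SQL engine supports the format.
CREATE TABLE mysparkdb.sales_orc (id INT, amount DECIMAL(10, 2)) USING ORC;
```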
@@ -65,24 +65,24 @@ Spark tables provide different data types than the Synapse SQL engines. The foll
| `LongType`, `long`, `bigint` | `bigint` | **Spark**: *LongType* represents 8-byte signed integer numbers.<BR>**SQL**: See [int, bigint, smallint, and tinyint](/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql).|
| `BooleanType`, `boolean` | `bit` (Parquet), `varchar(6)` (CSV) | **Spark**: Boolean.<BR>**SQL**: See [bit](/sql/t-sql/data-types/bit-transact-sql).|
| `DecimalType`, `decimal`, `dec`, `numeric` | `decimal` | **Spark**: *DecimalType* represents arbitrary-precision signed decimal numbers. Backed internally by java.math.BigDecimal. A BigDecimal consists of an arbitrary precision integer unscaled value and a 32-bit integer scale. <br> **SQL**: Fixed precision and scale numbers. When maximum precision is used, valid values are from - 10^38 +1 through 10^38 - 1. The ISO synonyms for decimal are dec and dec(p, s). numeric is functionally identical to decimal. See [decimal and numeric](/sql/t-sql/data-types/decimal-and-numeric-transact-sql). |
-| `IntegerType`, `Integer`, `int` | `int` | **Spark** *IntegerType* represents 4-byte signed integer numbers. <BR>**SQL**: See [int, bigint, smallint, and tinyint](/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql).|
-| `ByteType`, `Byte`, `tinyint` | `smallint` | **Spark**: *ByteType* represents 1-byte signed integer numbers [-128 to 127] and ShortType represents 2-byte signed integer numbers [-32768 to 32767]. <br> **SQL**: Tinyint represents 1-byte signed integer numbers [0, 255] and smallint represents 2-byte signed integer numbers [-32768, 32767]. See [int, bigint, smallint, and tinyint](/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql).|
+| `IntegerType`, `Integer`, `int` | `int` | **Spark** *IntegerType* represents 4 byte signed integer numbers. <BR>**SQL**: See [int, bigint, smallint, and tinyint](/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql).|
+| `ByteType`, `Byte`, `tinyint` | `smallint` | **Spark**: *ByteType* represents 1 byte signed integer numbers [-128 to 127] and ShortType represents 2 byte signed integer numbers [-32768 to 32767]. <br> **SQL**: Tinyint represents 1 byte signed integer numbers [0, 255] and smallint represents 2 byte signed integer numbers [-32768, 32767]. See [int, bigint, smallint, and tinyint](/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql).|
| `ShortType`, `Short`, `smallint` | `smallint` | Same as above. |
| `DoubleType`, `Double` | `float` | **Spark**: *DoubleType* represents 8-byte double-precision floating point numbers. **SQL**: See [float and real](/sql/t-sql/data-types/float-and-real-transact-sql).|
| `FloatType`, `float`, `real` | `real` | **Spark**: *FloatType* represents 4-byte single-precision floating point numbers. **SQL**: See [float and real](/sql/t-sql/data-types/float-and-real-transact-sql).|
-| `DateType`, `date` | `date` | **Spark**: *DateType* represents values comprising values of fields year, month and day, without a time-zone.<BR>**SQL**: See [date](/sql/t-sql/data-types/date-transact-sql).|
+| `DateType`, `date` | `date` | **Spark**: *DateType* represents values comprising values of fields year, month, and day, without a time-zone.<BR>**SQL**: See [date](/sql/t-sql/data-types/date-transact-sql).|
| `TimestampType`, `timestamp` | `datetime2` | **Spark**: *TimestampType* represents values comprising values of fields year, month, day, hour, minute, and second, with the session local time-zone. The timestamp value represents an absolute point in time.<BR>**SQL**: See [datetime2](/sql/t-sql/data-types/datetime2-transact-sql). |
| `char` | `char` |
-| `StringType`, `String`, `varchar` | `Varchar(n)` | **Spark**: *StringType* represents character string values. *VarcharType(n)* is a variant of StringType which has a length limitation. Data writing will fail if the input string exceeds the length limitation. This type can only be used in table schema, not functions/operators.<br> *CharType(n)* is a variant of *VarcharType(n)* which is fixed length. Reading column of type *CharType(n)* always returns string values of length n. *CharType(n)* column comparison will pad the short one to the longer length. <br> **SQL**: If there's a length provided from Spark, n in *varchar(n)* will be set to that length. If it is partitioned column, n can be max 2048. Otherwise, it will be *varchar(max)*. See [char and varchar](/sql/t-sql/data-types/char-and-varchar-transact-sql).<br> Use it with collation `Latin1_General_100_BIN2_UTF8`. |
-| `BinaryType`, `binary` | `varbinary(n)` | **SQL**: If there's a length provided from Spark, `n` in *Varbinary(n)* will be set to that length. If it is partitioned column, n can be max 2048. Otherwise, it will be *Varbinary(max)*. See [binary and varbinary](/sql/t-sql/data-types/binary-and-varbinary-transact-sql).|
+| `StringType`, `String`, `varchar` | `Varchar(n)` | **Spark**: *StringType* represents character string values. *VarcharType(n)* is a variant of StringType which has a length limitation. Data writing will fail if the input string exceeds the length limitation. This type can only be used in table schema, not functions/operators.<br> *CharType(n)* is a variant of *VarcharType(n)* which is fixed length. Reading column of type *CharType(n)* always returns string values of length n. *CharType(n)* column comparison will pad the short one to the longer length. <br> **SQL**: If there's a length provided from Spark, n in *varchar(n)* will be set to that length. If it's partitioned column, n can be max 2048. Otherwise, it will be *varchar(max)*. See [char and varchar](/sql/t-sql/data-types/char-and-varchar-transact-sql).<br> Use it with collation `Latin1_General_100_BIN2_UTF8`. |
+| `BinaryType`, `binary` | `varbinary(n)` | **SQL**: If there's a length provided from Spark, `n` in *Varbinary(n)* will be set to that length. If it's partitioned column, n can be max 2048. Otherwise, it will be *Varbinary(max)*. See [binary and varbinary](/sql/t-sql/data-types/binary-and-varbinary-transact-sql).|
| `array`, `map`, `struct` | `varchar(max)` | **SQL**: Serializes into JSON with collation `Latin1_General_100_BIN2_UTF8`. See [JSON Data](/sql/relational-databases/json/json-data-sql-server).|

>[!NOTE]
> Database level collation is `Latin1_General_100_CI_AS_SC_UTF8`.

## Security model

-The Spark databases and tables, as well as their synchronized representations in the SQL engine will be secured at the underlying storage level. Since they do not currently have permissions on the objects themselves, the objects can be seen in the object explorer.
+The Spark databases and tables, and their synchronized representations in the SQL engine will be secured at the underlying storage level. Since they don't currently have permissions on the objects themselves, the objects can be seen in the object explorer.

The security principal who creates a managed table is considered the owner of that table and has all the rights to the table as well as the underlying folders and files. In addition, the owner of the database will automatically become co-owner of the table.
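(Not part of the diff.) As a rough illustration of the mapping table above, a hypothetical Spark table like the following would be expected to surface in serverless SQL pool with the T-SQL types noted in the comments:

```sql
-- Hypothetical Spark SQL table; expected serverless SQL pool types per the table above.
CREATE TABLE mysparkdb.typedemo (
    id      BIGINT,         -- bigint
    price   DECIMAL(10, 2), -- decimal
    is_paid BOOLEAN,        -- bit (Parquet-backed table)
    sold_on DATE,           -- date
    sold_at TIMESTAMP,      -- datetime2
    label   STRING          -- varchar(max), collation Latin1_General_100_BIN2_UTF8
) USING PARQUET;
```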

@@ -140,7 +140,7 @@ df.Write().Mode(SaveMode.Append).InsertInto("mytestdb.myparquettable");
Now you can read the data from your serverless SQL pool as follows:

```sql
-SELECT * FROM mytestdb.dbo.myparquettable WHERE name = 'Alice';
+SELECT * FROM mytestdb.myparquettable WHERE name = 'Alice';
```

You should get the following row as result:
@@ -153,7 +153,7 @@ id | name | birthdate

### Create an external table in Spark and query from serverless SQL pool

-In this example, we will create an external Spark table over the Parquet data files that got created in the previous example for the managed table.
+In this example, we'll create an external Spark table over the Parquet data files that got created in the previous example for the managed table.

For example, with SparkSQL run:

@@ -163,7 +163,7 @@ CREATE TABLE mytestdb.myexternalparquettable
LOCATION "abfss://<storage-name>.dfs.core.windows.net/<fs>/synapse/workspaces/<synapse_ws>/warehouse/mytestdb.db/myparquettable/"
```

-Replace the placeholder `<storage-name>` with the ADLS Gen2 storage account name that you are using, `<fs>` with the file system name you're using and the placeholder `<synapse_ws>` with the name of the Azure Synapse workspace you're using to run this example.
+Replace the placeholder `<storage-name>` with the ADLS Gen2 storage account name that you're using, `<fs>` with the file system name you're using and the placeholder `<synapse_ws>` with the name of the Azure Synapse workspace you're using to run this example.

The previous example creates the table `myexternalparquettable` in the database `mytestdb`. After a short delay, you can see the table in your serverless SQL pool. For example, run the following statement from your serverless SQL pool.
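(Not part of the diff.) For readers unsure how the placeholders compose, a filled-in sketch with hypothetical names, assuming the full statement uses Spark SQL's `USING Parquet` clause as in the article's example:

```sql
-- Hypothetical names: storage account "contosostorage", file system "contosofs",
-- workspace "contoso-ws". Substitute the values from your own environment.
CREATE TABLE mytestdb.myexternalparquettable
    USING Parquet
    LOCATION "abfss://contosostorage.dfs.core.windows.net/contosofs/synapse/workspaces/contoso-ws/warehouse/mytestdb.db/myparquettable/"
```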
