Commit 1cdc279

Improved Acrolinx Score
1 parent 60e9748 commit 1cdc279

1 file changed: +8 -8 lines changed

articles/hdinsight-aks/trino/trino-sharded-sql-connector.md

Lines changed: 8 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -12,14 +12,14 @@ The sharded SQL connector allows queries to be executed over data distributed ac
1212

1313
## Prerequisites
1414

15-
To connect to sharded SQL servers, you need the following:
15+
To connect to sharded SQL servers, you need:
1616

1717
- SQL Server 2012 or higher, or Azure SQL Database.
1818
- Network access from the Trino coordinator and workers to SQL Server. Port 1433 is the default port.
1919

2020
### General configuration
2121

22-
The connector can query multiple SQL servers as a single data source. Create a catalog properties file and use `connector.name=sharded-sql` to use sharded SQL connector .
22+
The connector can query multiple SQL servers as a single data source. Create a catalog properties file and use `connector.name=sharded-sql` to use sharded SQL connector.
2323

2424
Configuration example:
2525

@@ -34,7 +34,7 @@ shard-config-location=<path-to-sharding-schema>
3434

3535
|Property|Description|
3636
|--------|-----------|
37-
|connector.name| Name of the connector For sharded SQL which should be `sharded_sqlserver`|
37+
|connector.name| Name of the connector For sharded SQL, which should be `sharded_sqlserver`|
3838
|connection-user| User name in SQL server|
3939
|connection-password| Password for the user in SQL server|
4040
|sharded-cluster| Required to be set to `TRUE` for sharded-sql connector|
@@ -44,7 +44,7 @@ shard-config-location=<path-to-sharding-schema>
4444

4545
The connector uses user-password authentication to query SQL servers. The same user specified in the configuration is expected to authenticate against all the SQL servers.
4646

47-
## Schema Definition
47+
## Schema definition
4848

4949
Connector assumes a 2D partition/bucketed layout of the physical data across SQL servers. Schema definition describes this layout.
5050
Currently, only file based sharding schema definition is supported.
@@ -57,13 +57,13 @@ The following JSON file describes the configuration for a Trino sharded SQL conn
5757
- **tables**: An array of objects, each representing a table in the database. Each table object contains:
5858
- **schema**: The schema name of the table, which corresponds to the database in the SQL server.
5959
- **name**: The name of the table.
60-
- **sharding_schema**: The name of the sharding schema associated with the table, this acts as a reference to the `sharding_schema` described in the next steps.
60+
- **sharding_schema**: The name of the sharding schema associated with the table, which acts as a reference to the `sharding_schema` described in the next steps.
6161

6262
- **sharding_schema**: An array of objects, each representing a sharding schema. Each sharding schema object contains:
6363
- **name**: The name of the sharding schema.
6464
- **partitioned_by**: An array containing one or more columns by which the sharding schema is partitioned.
6565
- **bucket_count(optional)**: An integer representing the total number of buckets the table is distributed, which defaults to 1.
66-
- **bucketed_by(optional)**: An array containing one or more columns by which the data is bucketed, note the partitioning and bucketing are hierarchical, i.e each partition is bucketed.
66+
- **bucketed_by(optional)**: An array containing one or more columns by which the data is bucketed, note the partitioning and bucketing are hierarchical, which means each partition is bucketed.
6767
- **partition_map**: An array of objects, each representing a partition within the sharding schema. Each partition object contains:
6868
- **partition**: The partition value specified in the form `partition-key=partitionvalue`
6969
- **shards**: An array of objects, each representing a shard within the partition, each element of the array represents a replica, trino queries any one of them at random to fetch data for a partition/buckets. Each shard object contains:
@@ -137,13 +137,13 @@ This example describes:
137137
- Shards are an array of `connectionUrl`. Each member of the array represents a replicaSet. During query execution, Trino selects a shard randomly from the array to query data.
138138

139139

140-
### Partition and Bucket Pruning
140+
### Partition and bucket pruning
141141

142142
Connector evaluates the query constraints during the planning and performs based on the provided query predicates. This helps speed-up query performance, and allows connector to query large amounts of data.
143143

144144
Bucketing formula to determine assignments using murmur hash function implementation described [here](https://commons.apache.org/proper/commons-codec/apidocs/src-html/org/apache/commons/codec/digest/MurmurHash3.html#line.388).
145145

146-
### Type Mapping
146+
### Type mapping
147147

148148
Sharded SQL connector supports the same type mappings as SQL server connector [type mappings](https://trino.io/docs/current/connector/sqlserver.html#type-mapping).
149149

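The schema definition fields and the pruning behavior described in the diff can be illustrated with a small sketch. This is not the connector's implementation: the table, column, and host names are hypothetical, the `connectionUrl` key inside each shard object is assumed from the "Shards are an array of `connectionUrl`" line, and the helper functions (`find_sharding_schema`, `prune_partitions`, `pick_replica`) are invented for illustration. Bucket assignment is left out because the diff does not show the exact formula.

```python
import json
import random

# Hypothetical sharding schema definition built from the field descriptions
# in the diff. Every name, value, and URL below is made up for illustration.
schema_definition = {
    "tables": [
        {"schema": "dbo", "name": "orders", "sharding_schema": "by_region"},
    ],
    "sharding_schema": [
        {
            "name": "by_region",
            "partitioned_by": ["region"],
            "bucket_count": 1,  # optional, defaults to 1 per the description
            "partition_map": [
                {
                    "partition": "region=east",
                    # Each element is a replica (assumed key: connectionUrl).
                    "shards": [
                        {"connectionUrl": "jdbc:sqlserver://east-a.example:1433"},
                        {"connectionUrl": "jdbc:sqlserver://east-b.example:1433"},
                    ],
                },
                {
                    "partition": "region=west",
                    "shards": [
                        {"connectionUrl": "jdbc:sqlserver://west-a.example:1433"},
                    ],
                },
            ],
        },
    ],
}


def find_sharding_schema(definition, table_schema, table_name):
    """Resolve a table's `sharding_schema` reference to the schema object."""
    table = next(
        t for t in definition["tables"]
        if t["schema"] == table_schema and t["name"] == table_name
    )
    return next(
        s for s in definition["sharding_schema"]
        if s["name"] == table["sharding_schema"]
    )


def prune_partitions(sharding_schema, equality_predicates):
    """Keep only partitions that can match the given column=value predicates."""
    kept = []
    for entry in sharding_schema["partition_map"]:
        key, _, value = entry["partition"].partition("=")
        # A partition survives if the query does not constrain its key,
        # or if the constraint equals the partition value.
        if key not in equality_predicates or equality_predicates[key] == value:
            kept.append(entry)
    return kept


def pick_replica(partition_entry, rng=random):
    """Choose one replica at random, as the description says Trino does."""
    return rng.choice(partition_entry["shards"])["connectionUrl"]


if __name__ == "__main__":
    sharding = find_sharding_schema(schema_definition, "dbo", "orders")
    for entry in prune_partitions(sharding, {"region": "east"}):
        print(entry["partition"], "->", pick_replica(entry))
    # The same structure serializes to the JSON file form.
    print(json.dumps(schema_definition, indent=2)[:60], "...")
```

With a predicate such as `region = 'east'`, only the `region=east` partition remains in this sketch, so a single server would be contacted instead of every shard.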