Commit 99b3b65

Merge pull request #33413 from WilliamDAssafMSFT/20250307-query-hints-dw
20250307 hints maintenance, add Fabric DW applicability
2 parents e53f39b + bb3b155 commit 99b3b65

4 files changed

+162
-39
lines changed

docs/t-sql/queries/from-transact-sql.md

Lines changed: 15 additions & 26 deletions
Original file line number | Diff line number | Diff line change
@@ -4,7 +4,7 @@ description: FROM clause plus JOIN, APPLY, PIVOT (Transact-SQL)
44
author: VanMSFT
55
ms.author: vanto
66
ms.reviewer: randolphwest
7-
ms.date: 09/25/2024
7+
ms.date: 03/07/2025
88
ms.service: sql
99
ms.subservice: t-sql
1010
ms.topic: reference
@@ -59,7 +59,7 @@ This article also discusses the following keywords that can be used on the FROM
5959

6060
## Syntax
6161

62-
Syntax for SQL Server, Azure SQL Database, and Fabric SQL database:
62+
Syntax for SQL Server, Azure SQL Database, and SQL database in Fabric:
6363

6464
```syntaxsql
6565
[ FROM { <table_source> } [ , ...n ] ]
@@ -168,8 +168,7 @@ FROM { <table_source> [ , ...n ] }
168168
| REDISTRIBUTE
169169
```
170170

171-
Syntax for Microsoft Fabric:
172-
171+
Syntax for Microsoft Fabric Data Warehouse:
173172

174173
```syntaxsql
175174
FROM { <table_source> [ , ...n ] }
@@ -334,19 +333,9 @@ Specifies all rows from the right table not meeting the join condition are inclu
334333

335334
For [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] and [!INCLUDE[ssSDS](../../includes/sssds-md.md)], specifies that the [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] query optimizer uses one join hint, or execution algorithm, per join specified in the query FROM clause. For more information, see [Join Hints (Transact-SQL)](../queries/hints-transact-sql-join.md).
336335

337-
For [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)] and [!INCLUDE[ssPDW](../../includes/sspdw-md.md)], these join hints apply to INNER joins on two distribution incompatible columns. They can improve query performance by restricting the amount of data movement that occurs during query processing. The allowable join hints for [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)] and [!INCLUDE[ssPDW](../../includes/sspdw-md.md)] are as follows:
338-
339-
#### REDUCE
340-
341-
Reduces the number of rows to be moved for the table on the right side of the join in order to make two distribution incompatible tables compatible. The REDUCE hint is also called a semi-join hint.
342-
343-
#### REPLICATE
344-
345-
Causes the values in the joining column from the table on the right side of the join to be replicated to all nodes. The table on the left is joined to the replicated version of those columns.
346-
347-
#### REDISTRIBUTE
336+
For [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)], [!INCLUDE[ssPDW](../../includes/sspdw-md.md)], and [!INCLUDE [fabric](../../includes/fabric.md)] Data Warehouse, these join hints apply to `INNER` joins on two distribution incompatible columns. They can improve query performance by restricting the amount of data movement that occurs during query processing.
348337

349-
Forces two data sources to be distributed on columns specified in the JOIN clause. For a distributed table, [!INCLUDE[ssPDW](../../includes/sspdw-md.md)] performs a shuffle move. For a replicated table, [!INCLUDE[ssPDW](../../includes/sspdw-md.md)] performs a trim move. To understand these move types, see the "DMS Query Plan Operations" section in the "Understanding Query Plans" article in the [!INCLUDE[pdw-product-documentation](../../includes/pdw-product-documentation-md.md)]. This hint can improve performance when the query plan is using a broadcast move to resolve a distribution incompatible join.
338+
For more information on `REDUCE`, `REPLICATE`, and `REDISTRIBUTE`, see [Join hints (Transact-SQL)](hints-transact-sql-join.md).
350339

351340
#### JOIN
352341

@@ -681,7 +670,7 @@ The following example assumes that the following tables and table-valued functio
681670
| Employees | EmpID, EmpLastName, EmpFirstName, EmpSalary |
682671
| GetReports(MgrID) | EmpID, EmpLastName, EmpSalary |
683672

684-
The `GetReports` table-valued function, returns the list of all employees that report directly or indirectly to the specified `MgrID`.
673+
The `GetReports` table-valued function returns the list of all employees that report directly or indirectly to the specified `MgrID`.
685674

686675
The example uses `APPLY` to return all departments and all employees in that department. If a particular department doesn't have any employees, there won't be any rows returned for that department.
687676

@@ -729,7 +718,7 @@ GO
729718

730719
**Applies to**: [!INCLUDE[sssql16-md](../../includes/sssql16-md.md)] and later versions, and [!INCLUDE[sssds](../../includes/sssds-md.md)].
731720

732-
The following example uses the FOR SYSTEM_TIME AS OF *date_time_literal_or_variable* argument to return table rows that were actual (current) as of January 1, 2014.
721+
The following example uses the `FOR SYSTEM_TIME AS OF *date_time_literal_or_variable*` argument to return table rows that were actual (current) as of January 1, 2014.
733722

734723
```sql
735724
SELECT DepartmentNumber,
@@ -741,7 +730,7 @@ FOR SYSTEM_TIME AS OF '2014-01-01'
741730
WHERE ManagerID = 5;
742731
```
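
As a variation on the query above, the point in time can also be supplied through a variable rather than a literal, as the *date_time_literal_or_variable* placeholder suggests. The following is a minimal sketch only; the table name `dbo.Department` and the variable name `@AsOf` are assumptions for illustration and aren't taken from this article.

```sql
-- Assumed names: dbo.Department (a system-versioned temporal table) and @AsOf.
DECLARE @AsOf DATETIME2 = '2014-01-01';

SELECT DepartmentNumber,
       ManagerID
FROM dbo.Department
FOR SYSTEM_TIME AS OF @AsOf
WHERE ManagerID = 5;
```

Using a variable keeps the query text stable when the same point-in-time lookup is repeated for different dates.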
743732

744-
The following example uses the FOR SYSTEM_TIME FROM *date_time_literal_or_variable* TO *date_time_literal_or_variable* argument to return all rows that were active during the period defined as starting with January 1, 2013 and ending with January 1, 2014, exclusive of the upper boundary.
733+
The following example uses the `FOR SYSTEM_TIME FROM *date_time_literal_or_variable* TO *date_time_literal_or_variable*` argument to return all rows that were active during the period defined as starting with January 1, 2013 and ending with January 1, 2014, exclusive of the upper boundary.
745734

746735
```sql
747736
SELECT DepartmentNumber,
@@ -753,7 +742,7 @@ FOR SYSTEM_TIME FROM '2013-01-01' TO '2014-01-01'
753742
WHERE ManagerID = 5;
754743
```
755744

756-
The following example uses the FOR SYSTEM_TIME BETWEEN *date_time_literal_or_variable* AND *date_time_literal_or_variable* argument to return all rows that were active during the period defined as starting with January 1, 2013 and ending with January 1, 2014, inclusive of the upper boundary.
745+
The following example uses the `FOR SYSTEM_TIME BETWEEN *date_time_literal_or_variable* AND *date_time_literal_or_variable*` argument to return all rows that were active during the period defined as starting with January 1, 2013 and ending with January 1, 2014, inclusive of the upper boundary.
757746

758747
```sql
759748
SELECT DepartmentNumber,
@@ -765,7 +754,7 @@ FOR SYSTEM_TIME BETWEEN '2013-01-01' AND '2014-01-01'
765754
WHERE ManagerID = 5;
766755
```
767756

768-
The following example uses the FOR SYSTEM_TIME CONTAINED IN (*date_time_literal_or_variable*, *date_time_literal_or_variable*) argument to return all rows that were opened and closed during the period defined as starting with January 1, 2013 and ending with January 1, 2014.
757+
The following example uses the `FOR SYSTEM_TIME CONTAINED IN (*date_time_literal_or_variable*, *date_time_literal_or_variable*)` argument to return all rows that were opened and closed during the period defined as starting with January 1, 2013 and ending with January 1, 2014.
769758

770759
```sql
771760
SELECT DepartmentNumber,
@@ -797,7 +786,7 @@ WHERE ManagerID = 5;
797786

798787
### N. Use the INNER JOIN syntax
799788

800-
The following example returns the `SalesOrderNumber`, `ProductKey`, and `EnglishProductName` columns from the `FactInternetSales` and `DimProduct` tables where the join key, `ProductKey`, matches in both tables. The `SalesOrderNumber` and `EnglishProductName` columns each exist in one of the tables only, so it isn't necessary to specify the table alias with these columns, as is shown; these aliases are included for readability. The word **AS** before an alias name isn't required but is recommended for readability and to conform to the ANSI standard.
789+
The following example returns the `SalesOrderNumber`, `ProductKey`, and `EnglishProductName` columns from the `FactInternetSales` and `DimProduct` tables where the join key, `ProductKey`, matches in both tables. The `SalesOrderNumber` and `EnglishProductName` columns each exist in one of the tables only, so it isn't necessary to specify the table alias with these columns, as is shown; these aliases are included for readability. The keyword `AS` before an alias name isn't required but is recommended for readability and to conform to the ANSI standard.
801790

802791
```sql
803792
-- Uses AdventureWorks
@@ -868,7 +857,7 @@ RIGHT OUTER JOIN FactInternetSales AS fis
868857
ON dp.ProductKey = fis.ProductKey;
869858
```
870859

871-
The following query uses the `DimSalesTerritory` table as the left table in a left outer join. It retrieves the `SalesOrderNumber` values from the `FactInternetSales` table. If there are no orders for a particular `SalesTerritoryKey`, the query returns a NULL for the `SalesOrderNumber` for that row. This query is ordered by the `SalesOrderNumber` column, so that any NULLs in this column appear at the top of the results.
860+
The following query uses the `DimSalesTerritory` table as the left table in a left outer join. It retrieves the `SalesOrderNumber` values from the `FactInternetSales` table. If there are no orders for a particular `SalesTerritoryKey`, the query returns a `NULL` for the `SalesOrderNumber` for that row. This query is ordered by the `SalesOrderNumber` column, so that any `NULL`s in this column appear at the top of the results.
872861

873862
```sql
874863
-- Uses AdventureWorks
@@ -898,7 +887,7 @@ ORDER BY fis.SalesOrderNumber;
898887

899888
### P. Use the FULL OUTER JOIN syntax
900889

901-
The following example demonstrates a full outer join, which returns all rows from both joined tables but returns NULL for values that don't match from the other table.
890+
The following example demonstrates a full outer join, which returns all rows from both joined tables but returns `NULL` for values that don't match from the other table.
902891

903892
```sql
904893
-- Uses AdventureWorks
@@ -998,9 +987,9 @@ ORDER BY SalesOrderNumber;
998987

999988
### U. Use the REDISTRIBUTE hint to guarantee a Shuffle move for a distribution incompatible join
1000989

1001-
The following query uses the REDISTRIBUTE query hint on a distribution incompatible join. This guarantees the query optimizer uses a Shuffle move in the query plan. This also guarantees the query plan won't use a Broadcast move, which moves a distributed table to a replicated table.
990+
The following query uses the `REDISTRIBUTE` query hint on a distribution incompatible join. This guarantees the query optimizer uses a Shuffle move in the query plan. This also guarantees the query plan won't use a Broadcast move, which moves a distributed table to a replicated table.
1002991

1003-
In the following example, the REDISTRIBUTE hint forces a Shuffle move on the FactInternetSales table because ProductKey is the distribution column for DimProduct, and isn't the distribution column for FactInternetSales.
992+
In the following example, the `REDISTRIBUTE` hint forces a Shuffle move on the `FactInternetSales` table because `ProductKey` is the distribution column for `DimProduct`, and isn't the distribution column for `FactInternetSales`.
1004993

1005994
```sql
1006995
-- Uses AdventureWorks

docs/t-sql/queries/hints-transact-sql-join.md

Lines changed: 103 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3,8 +3,8 @@ title: "Join hints (Transact-SQL)"
33
description: Join hints specify that the query optimizer enforce a join strategy between two tables in SQL Server.
44
author: VanMSFT
55
ms.author: vanto
6-
ms.reviewer: randolphwest
7-
ms.date: 06/07/2024
6+
ms.reviewer: randolphwest, wiassaf
7+
ms.date: 03/20/2025
88
ms.service: sql
99
ms.subservice: t-sql
1010
ms.topic: reference
@@ -26,7 +26,7 @@ monikerRange: "=azuresqldb-current || >=sql-server-2016 || >=sql-server-linux-20
2626
---
2727
# Join hints (Transact-SQL)
2828

29-
[!INCLUDE [SQL Server Azure SQL Database Azure SQL Managed Instance FabricSQLDB](../../includes/applies-to-version/sql-asdb-asdbmi-fabricsqldb.md)]
29+
[!INCLUDE [sql-asdb-asdbmi-fabricse-fabricdw-fabricsqldb](../../includes/applies-to-version/sql-asdb-asdbmi-fabricse-fabricdw-fabricsqldb.md)]
3030

3131
Join hints specify that the query optimizer enforce a join strategy between two tables in [!INCLUDE [ssnoversion](../../includes/ssnoversion-md.md)]. For general information about joins and join syntax, see [FROM clause plus JOIN, APPLY, PIVOT](from-transact-sql.md).
3232

@@ -45,17 +45,22 @@ Join hints specify that the query optimizer enforce a join strategy between two
4545

4646
```syntaxsql
4747
<join_hint> ::=
48-
{ LOOP | HASH | MERGE | REMOTE }
48+
{ LOOP | HASH | MERGE | REMOTE | REDUCE | REPLICATE | REDISTRIBUTE [ ( columns_count ) ] }
4949
```
5050

5151
## Arguments
5252

5353
#### { LOOP | HASH | MERGE }
5454

55+
***Applies to:*** [!INCLUDE [Azure SQL Database](../../includes/ssazure-sqldb.md)], [!INCLUDE [SQL Managed Instance](../../includes/ssazuremi-md.md)], [!INCLUDE [Fabric SQL analytics endpoint](../../includes/fabric-se.md)], [!INCLUDE [fabric-sqldb](../../includes/fabric-sqldb.md)], [!INCLUDE [fabric](../../includes/fabric.md)] [!INCLUDE [Fabric Warehouse](../../includes/fabric-dw.md)]
56+
5557
Specifies that the join in the query should use looping, hashing, or merging. Using `LOOP`, `HASH`, or `MERGE JOIN` enforces a particular join between two tables. `LOOP` can't be specified together with `RIGHT` or `FULL` as a join type. For more information, see [Joins](../../relational-databases/performance/joins.md).
5658

5759
#### REMOTE
5860

61+
***Applies to:*** [!INCLUDE [Azure SQL Database](../../includes/ssazure-sqldb.md)], [!INCLUDE [SQL Managed Instance](../../includes/ssazuremi-md.md)], [!INCLUDE [Fabric SQL analytics endpoint](../../includes/fabric-se.md)], [!INCLUDE [fabric-sqldb](../../includes/fabric-sqldb.md)]
62+
63+
5964
Specifies that the join operation is performed on the site of the right table. This is useful when the left table is a local table and the right table is a remote table. `REMOTE` should be used only when the left table has fewer rows than the right table.
6065

6166
If the right table is local, the join is performed locally. If both tables are remote but from different data sources, `REMOTE` causes the join to be performed on the site of the right table. If both tables are remote tables from the same data source, `REMOTE` isn't required.
@@ -64,6 +69,35 @@ If the right table is local, the join is performed locally. If both tables are r
6469

6570
`REMOTE` can be used only for `INNER JOIN` operations.
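
The `REMOTE` behavior described above could be sketched as follows. This is an illustration under stated assumptions, not an example from the product documentation: `RemoteSrv` is a hypothetical linked server, and the database, table, and column names are invented for the sketch.

```sql
-- Assumed names: RemoteSrv (linked server), SalesDB, and both tables are hypothetical.
SELECT c.CustomerID,
       o.OrderID,
       o.OrderDate
FROM dbo.PriorityCustomers AS c                      -- small local table on the left
INNER REMOTE JOIN RemoteSrv.SalesDB.dbo.Orders AS o  -- larger remote table on the right
    ON c.CustomerID = o.CustomerID;
```

Because the left table is local and has fewer rows than the remote right table, the hint asks for the join to be performed on the site of the right table.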
6671

72+
#### REDUCE
73+
74+
***Applies to:*** [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)] and [!INCLUDE[ssPDW](../../includes/sspdw-md.md)]
75+
76+
Reduces the number of rows to be moved for the table on the right side of the join in order to make two distribution incompatible tables compatible. The `REDUCE` hint is also called a semi-join hint.
77+
78+
#### REPLICATE
79+
80+
***Applies to:*** [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)], [!INCLUDE[ssPDW](../../includes/sspdw-md.md)], [!INCLUDE [fabric](../../includes/fabric.md)] [!INCLUDE [Fabric Warehouse](../../includes/fabric-dw.md)]
81+
82+
Causes a broadcast move operation, in which a specific table is replicated across all distribution nodes.
83+
84+
- When using `REPLICATE` with an `INNER` or `LEFT` join, the broadcast move operation replicates the right side of the join to all nodes.
85+
- Similarly, when using `REPLICATE` with a `RIGHT` join, the broadcast move operation replicates the left side of the join to all nodes.
86+
- When using `REPLICATE` with a `FULL` join, an estimated plan can't be created.
87+
88+
#### REDISTRIBUTE [(columns_count)]
89+
90+
***Applies to:*** [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)] and [!INCLUDE[ssPDW](../../includes/sspdw-md.md)]
91+
92+
Forces two data sources to be distributed on columns specified in the JOIN clause. For a distributed table, [!INCLUDE[ssPDW](../../includes/sspdw-md.md)] performs a shuffle move on the first column of both tables. For a replicated table, [!INCLUDE[ssPDW](../../includes/sspdw-md.md)] performs a trim move. To understand these move types, see the "DMS Query Plan Operations" section in the "Understanding Query Plans" article in the [!INCLUDE[pdw-product-documentation](../../includes/pdw-product-documentation-md.md)]. This hint can improve performance when the query plan is using a broadcast move to resolve a distribution incompatible join.
93+
94+
***Applies to:*** [!INCLUDE [fabric](../../includes/fabric.md)] [!INCLUDE [Fabric Warehouse](../../includes/fabric-dw.md)]
95+
96+
The `REDISTRIBUTE` hint ensures two data sources are distributed based on `JOIN` clause columns. It handles multiple join conditions, specified by the first *n* columns in both tables, where *n* is the `columns_count` argument. Redistributing data optimizes query performance by evenly spreading data across nodes during intermediate steps of execution.
97+
98+
The `(columns_count)` argument is only supported in [!INCLUDE [fabric](../../includes/fabric.md)] [!INCLUDE [Fabric Warehouse](../../includes/fabric-dw.md)].
99+
100+
67101
## Remarks
68102

69103
Join hints are specified in the `FROM` clause of a query. Join hints enforce a join strategy between two tables. If a join hint is specified for any two tables, the query optimizer automatically enforces the join order for all joined tables in the query, based on the position of the `ON` keywords. When a `CROSS JOIN` is used without the `ON` clause, parentheses can be used to indicate the join order.
@@ -115,6 +149,71 @@ INNER MERGE JOIN Purchasing.PurchaseOrderDetail AS pod
115149
GO
116150
```
117151

152+
### D. REDUCE join hint example
153+
154+
The following example uses the `REDUCE` join hint to alter the processing of the derived table within the query. When using the `REDUCE` join hint in this query, the `fis.ProductKey` is projected, replicated and made distinct, and then joined to `DimProduct` during the shuffle of `DimProduct` on `ProductKey`. The resulting derived table is distributed on `fis.ProductKey`.
155+
156+
```sql
157+
-- Uses AdventureWorks
158+
159+
SELECT SalesOrderNumber
160+
FROM (
161+
SELECT fis.SalesOrderNumber,
162+
dp.ProductKey,
163+
dp.EnglishProductName
164+
FROM DimProduct AS dp
165+
INNER REDUCE JOIN FactInternetSales AS fis
166+
ON dp.ProductKey = fis.ProductKey
167+
) AS dTable
168+
ORDER BY SalesOrderNumber;
169+
```
170+
171+
### E. REPLICATE join hint example
172+
173+
This next example shows the same query as the previous example, except that a `REPLICATE` join hint is used instead of the `REDUCE` join hint. Use of the `REPLICATE` hint causes the values in the `ProductKey` (joining) column from the `FactInternetSales` table to be replicated to all nodes. The `DimProduct` table is joined to the replicated version of those values.
174+
175+
```sql
176+
-- Uses AdventureWorks
177+
178+
SELECT SalesOrderNumber
179+
FROM (
180+
SELECT fis.SalesOrderNumber,
181+
dp.ProductKey,
182+
dp.EnglishProductName
183+
FROM DimProduct AS dp
184+
INNER REPLICATE JOIN FactInternetSales AS fis
185+
ON dp.ProductKey = fis.ProductKey
186+
) AS dTable
187+
ORDER BY SalesOrderNumber;
188+
```
189+
190+
### F. Use the REDISTRIBUTE hint to guarantee a Shuffle move for a distribution incompatible join
191+
192+
The following query uses the `REDISTRIBUTE` query hint on a distribution incompatible join. This guarantees the query optimizer uses a Shuffle move in the query plan. This also guarantees the query plan won't use a Broadcast move, which moves a distributed table to a replicated table.
193+
194+
In the following example, the `REDISTRIBUTE` hint forces a Shuffle move on the `FactInternetSales` table because `ProductKey` is the distribution column for `DimProduct`, and isn't the distribution column for `FactInternetSales`.
195+
196+
```sql
197+
-- Uses AdventureWorks
198+
199+
SELECT dp.ProductKey,
200+
fis.SalesOrderNumber,
201+
fis.TotalProductCost
202+
FROM DimProduct AS dp
203+
INNER REDISTRIBUTE JOIN FactInternetSales AS fis
204+
ON dp.ProductKey = fis.ProductKey;
205+
```
206+
207+
### G. Use the columns count argument with the REDISTRIBUTE hint
208+
209+
The following query uses the `REDISTRIBUTE` join hint with the `(columns_count)` argument set to 4, so the shuffle takes place across the first four columns of each table in the join.
210+
211+
```sql
212+
SELECT * FROM DA
213+
INNER REDISTRIBUTE (4) JOIN DB
214+
ON DA.a1 = DB.b1;
215+
```
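
As a variation on the query above, the `(columns_count)` argument can be paired with a matching number of join conditions. The following is a hedged sketch for Fabric Data Warehouse only; it assumes that `a1` and `a2` are the first two columns of `DA` and that `b1` and `b2` are the first two columns of `DB`, which this article doesn't state.

```sql
-- Assumption: a1, a2 are the first two columns of DA; b1, b2 are the first two of DB.
SELECT *
FROM DA
INNER REDISTRIBUTE (2) JOIN DB
    ON DA.a1 = DB.b1
   AND DA.a2 = DB.b2;
```

With a value of 2, the shuffle would be expected to take place across the first two columns of each table, in line with the argument description in this article.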
216+
118217
## Related content
119218

120219
- [Hints (Transact-SQL)](hints-transact-sql.md)
