Commit 141ec36

acrolinx
1 parent 959e733 commit 141ec36

1 file changed, +7 -7 lines changed

articles/data-factory/data-flow-lookup.md

Lines changed: 7 additions & 7 deletions
@@ -20,17 +20,17 @@ A lookup transformation is similar to a left outer join. All rows from the prima

 ![Lookup Transformation](media/data-flow/lookup1.png "Lookup")

-**Primary stream:** The incoming stream of data. This is equivalent to the left side of a join.
+**Primary stream:** The incoming stream of data. This stream is equivalent to the left side of a join.

-**Lookup stream:** The data which is appended to the primary stream. Which data is appended is determined by the lookup conditions. This is equivalent to the right side of a join.
+**Lookup stream:** The data that is appended to the primary stream. Which data is added is determined by the lookup conditions. This stream is equivalent to the right side of a join.

 **Match multiple rows:** If enabled, a row with multiple matches in the primary stream will return multiple rows. Otherwise, only a single row will be returned based upon the 'Match on' condition.

-**Match on:** Only visible if 'Match multiple rows' is enabled. Choose whether to match on any row, the first match, or the last match. Any row is recommended as it executes the fastest. If first row or last row are selected, you'll be required to specify sort conditions.
+**Match on:** Only visible if 'Match multiple rows' is enabled. Choose whether to match on any row, the first match, or the last match. Any row is recommended as it executes the fastest. If first row or last row is selected, you'll be required to specify sort conditions.

 **Lookup conditions:** Choose which columns to match on. If the equality condition is met, then the rows will be considered a match. Hover and select 'Computed column' to extract a value using the [data flow expression language](data-flow-expression-functions).

-The lookup transformation only supports equality matches. To customize the lookup expression to include other operators such as greater than, it's recommended to use a [cross join in the join transformation](data-flow-join.md#custom-cross-join). This will avoid any possible cartesian product errors on execution.
+The lookup transformation only supports equality matches. To customize the lookup expression to include other operators such as greater than, it's recommended to use a [cross join in the join transformation](data-flow-join.md#custom-cross-join). A cross join will avoid any possible cartesian product errors on execution.

 All columns from both streams are included in the output data. To drop duplicate or unwanted columns, add a [select transformation](data-flow-select.md) after your lookup transformation. Columns can also be dropped or renamed in a sink transformation.

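As a rough illustration of the semantics the hunk above documents, the sketch below models the lookup as a left outer join on an equality key in plain Python. It is not Data Factory code: the function `lookup`, its parameters, and the `lkp_` column prefix are hypothetical names chosen for this example. It also assumes, following the 'Match multiple rows' description, that the 'Match on' choice applies when multiple matching is off.

```python
from typing import Any, Callable, Optional

Row = dict[str, Any]


def lookup(
    primary: list[Row],
    lookup_rows: list[Row],
    key: str,
    match_multiple: bool = False,
    match_on: str = "any",  # "any", "first", or "last" (hypothetical values)
    sort_key: Optional[Callable[[Row], Any]] = None,
) -> list[Row]:
    """Append lookup columns to primary rows, keeping unmatched primary rows."""
    # Index the lookup stream by the equality key (only equality is supported).
    index: dict[Any, list[Row]] = {}
    for row in lookup_rows:
        index.setdefault(row[key], []).append(row)

    output: list[Row] = []
    for p in primary:
        matches = index.get(p[key], [])
        if not matches:
            # Left-outer behaviour: the primary row survives without lookup columns.
            output.append(dict(p))
            continue
        if match_multiple:
            chosen = matches
        elif match_on in ("first", "last"):
            # First/last only make sense once an ordering is specified.
            if sort_key is None:
                raise ValueError("first/last matching needs a sort condition")
            ordered = sorted(matches, key=sort_key)
            chosen = [ordered[0] if match_on == "first" else ordered[-1]]
        else:
            # "any": take whichever match is cheapest to reach.
            chosen = [matches[0]]
        for m in chosen:
            # Prefix lookup columns so columns from both streams are kept.
            appended = {f"lkp_{name}": value for name, value in m.items()}
            output.append({**p, **appended})
    return output
```

The prefix stands in for the select transformation the text recommends for dropping or renaming duplicate columns; a real data flow handles column naming differently.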
@@ -40,19 +40,19 @@ After your lookup transformation, the function `isMatch()` can be used to see if

 ![Lookup pattern](media/data-flow/lookup111.png "Lookup pattern")

-An example of this is using the conditional split transformation to split on the `isMatch()` function. In the example above, matching rows go through the top stream and non-matching rows flow through the ```NoMatch``` stream.
+An example of this pattern is using the conditional split transformation to split on the `isMatch()` function. In the example above, matching rows go through the top stream and non-matching rows flow through the ```NoMatch``` stream.

 ## Testing lookup conditions

 When testing the lookup transformation with data preview in debug mode, use a small set of known data. When sampling rows from a large dataset, you can't predict which rows and keys will be read for testing. The result is non-deterministic, meaning that your join conditions may not return any matches.

 ## Broadcast optimization

-In Azure Data Factory mapping data flows execute in scaled-out Spark environments. If your dataset can fit into worker node memory space, your lookup performance can be optimized by enabling broadcasitng.
+In Azure Data Factory mapping data flows execute in scaled-out Spark environments. If your dataset can fit into worker node memory space, your lookup performance can be optimized by enabling broadcasting.

 ![Broadcast Join](media/data-flow/broadcast.png "Broadcast Join")

-Enabling broadcasting pushes the entire dataset into memory. For smaller datasets containing only a few thousand rows, this can greatly improve your lookup performance. For large datasets, this can lead to an out of memory exception.
+Enabling broadcasting pushes the entire dataset into memory. For smaller datasets containing only a few thousand rows, broadcasting can greatly improve your lookup performance. For large datasets, this option can lead to an out of memory exception.

 ## Data flow script

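The `isMatch()` split pattern and the broadcast note in the hunk above can be sketched together in plain Python. This is an analogy under stated assumptions, not the data flow script itself: `lookup_with_flag`, `conditional_split`, and the `is_match` column are hypothetical names. The dictionary built from the lookup stream stands in for a broadcast dataset and therefore assumes the lookup side fits in memory, which is the condition the text gives for enabling broadcasting.

```python
from typing import Any

Row = dict[str, Any]


def lookup_with_flag(primary: list[Row], lookup_rows: list[Row], key: str) -> list[Row]:
    """Join each primary row to at most one lookup row and record whether it matched."""
    # Hold the whole lookup stream in memory, keyed for constant-time probes.
    # If a key repeats, the last lookup row seen wins in this sketch.
    index = {row[key]: row for row in lookup_rows}

    output: list[Row] = []
    for p in primary:
        m = index.get(p[key])
        # Hypothetical stand-in for isMatch(): True only when a lookup row was found.
        output.append({**(m or {}), **p, "is_match": m is not None})
    return output


def conditional_split(rows: list[Row]) -> tuple[list[Row], list[Row]]:
    """Route matched rows to the first stream and the rest to a NoMatch stream."""
    matched = [r for r in rows if r["is_match"]]
    no_match = [r for r in rows if not r["is_match"]]
    return matched, no_match
```

Running the two functions on a handful of hand-written rows also mirrors the testing advice in the hunk: with a small, known dataset, which rows land in each stream is predictable.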