Commit 628dc15

docs: add example for selecting columns after deduplication in joins
1 parent 3dc96f3 commit 628dc15

File tree

1 file changed, +8 -0 lines changed
  • docs/source/user-guide/common-operations

docs/source/user-guide/common-operations/joins.rst

Lines changed: 8 additions & 0 deletions
@@ -137,3 +137,11 @@ DataFusion uses the ``__right_<col>`` naming convention for conflicting columns
 
     left.join(right, on="id", deduplicate=True)
 
+After deduplication, you can select the join column (which comes from the left DataFrame) and other columns as usual:
+
+.. ipython:: python
+
+    # Select the id column and other columns from both DataFrames
+    joined_dedup = left.join(right, on="id", deduplicate=True)
+    joined_dedup.select("id", "customer", "name")
+
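
For readers who want to see what the added example implies, the following is a rough conceptual sketch in plain Python, not DataFusion code. It models rows as dicts and assumes, as the added text states, that the single join column kept after deduplication carries the left side's value. The helpers `join_dedup` and `select` and the sample data are hypothetical and exist only for illustration.

```python
# Conceptual model only: plain dicts stand in for DataFrame rows.
# This does not use DataFusion and does not reflect its implementation.


def join_dedup(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on `key`, keeping one key column.

    The key value is taken from the left row; the right row's copy is dropped.
    """
    # Index the right rows by key value so lookups are a simple hash join.
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left_rows:
        for rrow in right_index.get(lrow[key], []):
            merged = dict(lrow)  # left columns first, including the key
            merged.update({c: v for c, v in rrow.items() if c != key})
            joined.append(merged)
    return joined


def select(rows, *columns):
    """Project each row onto the requested columns, in the given order."""
    return [{c: row[c] for c in columns} for row in rows]


# Hypothetical sample data; the column names mirror the example above.
left = [{"id": 1, "customer": "Alice"}, {"id": 2, "customer": "Bob"}]
right = [{"id": 1, "name": "Widget"}, {"id": 3, "name": "Gadget"}]

joined_dedup = join_dedup(left, right, "id")
result = select(joined_dedup, "id", "customer", "name")
print(result)  # [{'id': 1, 'customer': 'Alice', 'name': 'Widget'}]
```

Because only one `id` column survives in this model, selecting `"id"` is unambiguous, which is the point the added documentation makes.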
