Commit 7b81e3b
fix: handle case where join keys are different for sort-merge multi-partition join (#6243)
## Changes Made
The current sort-merge join (multi-partition) implementation does not
correctly handle the case where the join keys differ between the left
and right DataFrames. This PR fixes the issue by:
- aliasing the right keys when generating the samples used to determine
boundaries
- renaming the materialized `boundaries` to the right keys when applying
the boundaries to create range partition tasks
- adding a regression test to ensure the fix works
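
The two renaming steps above can be illustrated with a small, self-contained sketch in plain Python. This is not Daft's implementation (the actual change lives in the Rust code under `src/daft-distributed`); the helper names `sample_keys`, `compute_boundaries`, `rename_boundaries`, and `range_partition` are hypothetical, as are the column names `id` and `user_id`. It assumes the key mismatch is one of column names: samples from both sides are aliased to a common name so a single set of boundaries can be computed, and the boundaries are then renamed back to the right key before partitioning the right side.

```python
from bisect import bisect_right

def sample_keys(rows, key, alias):
    """Project the join key out of each row, renaming it to `alias`."""
    return [{alias: row[key]} for row in rows]

def compute_boundaries(samples, key, num_partitions):
    """Pick num_partitions - 1 split points at even quantiles of the samples."""
    values = sorted(s[key] for s in samples)
    step = len(values) / num_partitions
    return [{key: values[int(step * i)]} for i in range(1, num_partitions)]

def rename_boundaries(boundaries, old, new):
    """Rename the boundary column so it matches the other side's key name."""
    return [{new: b[old]} for b in boundaries]

def range_partition(rows, key, boundaries):
    """Assign each row to the partition whose range contains its key."""
    cuts = [b[key] for b in boundaries]
    parts = [[] for _ in range(len(cuts) + 1)]
    for row in rows:
        parts[bisect_right(cuts, row[key])].append(row)
    return parts

# Left and right tables use different join key names: "id" vs "user_id".
left = [{"id": i, "l": i * 10} for i in range(8)]
right = [{"user_id": i, "r": i * 100} for i in range(0, 8, 2)]

# Step 1: alias the right key to the left key name when sampling,
# so samples from both sides share one column.
samples = sample_keys(left, "id", "id") + sample_keys(right, "user_id", "id")
boundaries = compute_boundaries(samples, "id", 2)

# Step 2: rename the boundaries to the right key name before
# range-partitioning the right side.
left_parts = range_partition(left, "id", boundaries)
right_parts = range_partition(
    right, "user_id", rename_boundaries(boundaries, "id", "user_id")
)
```

Because both sides are split on the same boundary values, rows with equal keys land in partitions with the same index, which is what a per-partition merge step relies on.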
---------
Co-authored-by: gmweaver <gmweaver.usc@gmail.com>
1 parent 37b352a commit 7b81e3b
File tree
2 files changed, +78 -6 lines changed
- src/daft-distributed/src/pipeline_node/join
- tests/dataframe

Lines changed: 43 additions & 6 deletions
Diff content for the first changed file was not captured. Per the line-number gutter, it adds new lines 8, 178-194, 199, 206, 217-218, 233, 240-258, and 266 (43 additions) and removes original lines 181, 188, 199-200, 215, and 229 (6 deletions).
Diff content for the second changed file was not captured. Per the line-number gutter, it adds new lines 1276-1310 (35 additions, no deletions).