Commit 196b9a9
Add option to use gather to select embeddings in EC (#3479)
Summary:
Pull Request resolved: #3479
Because the backward pass of torch.index_select uses atomic adds, it is sometimes slower than the backward pass of torch.gather. This diff adds an option that gives users control over the indexing operator, so they can select the one that suits their case.
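A minimal sketch of the two indexing paths, to show that they return the same rows and the same gradients and differ only in which operator runs. The helper `select_rows` and its `use_gather` flag are hypothetical names for illustration, not the option this diff adds to EC.

```python
import torch


def select_rows(weight: torch.Tensor, indices: torch.Tensor, use_gather: bool = False) -> torch.Tensor:
    """Pick rows of a 2D `weight` along dim 0 (hypothetical helper)."""
    if use_gather:
        # gather needs an index with the same number of dims as the input,
        # so each row index is broadcast across the embedding dimension.
        expanded = indices.unsqueeze(1).expand(-1, weight.size(1))
        return torch.gather(weight, 0, expanded)
    return torch.index_select(weight, 0, indices)


weight_a = torch.randn(10, 4, requires_grad=True)
weight_b = weight_a.detach().clone().requires_grad_(True)
indices = torch.tensor([1, 3, 3, 7, 1])  # rows 1 and 3 are repeated

out_a = select_rows(weight_a, indices, use_gather=False)
out_b = select_rows(weight_b, indices, use_gather=True)
out_a.sum().backward()
out_b.sum().backward()

# Both paths agree on the forward output and on the accumulated gradient.
same_forward = torch.equal(out_a, out_b)
same_backward = torch.allclose(weight_a.grad, weight_b.grad)
```

Repeated indices are the interesting case: each repeated row accumulates one gradient contribution per occurrence, and that accumulation is where the backward cost of the two operators can diverge.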
Perf comparison on pure operators (forward + backward):

2D Embedding, No Repetition
Config: shape=(1000000, 256), dim=0, indices=100000, unique=95300 (95.3%)
Method              Time (s)  Speedup  Status
torch.gather        0.9439    1.00x    🏆
torch.index_select  1.0509    0.90x

2D Embedding, Low Repetition
Config: shape=(1000000, 256), dim=0, indices=100000, unique=48732 (48.7%)
Method              Time (s)  Speedup  Status
torch.gather        0.9076    1.00x    🏆
torch.index_select  1.0415    0.87x

2D Embedding, High Repetition
Config: shape=(1000000, 256), dim=0, indices=250000, unique=9957 (4.0%)
Method              Time (s)  Speedup  Status
torch.gather        1.2385    1.00x    🏆
torch.index_select  1.6225    0.76x

Small Vocab, Low Repetition
Config: shape=(1000, 256), dim=0, indices=2000, unique=635 (31.8%)
Method              Time (s)  Speedup  Status
torch.gather        0.1502    1.00x    🏆
torch.index_select  0.1763    0.85x

Small Vocab, Very High Repetition
Config: shape=(1000, 256), dim=0, indices=100000, unique=625 (0.6%)
Method              Time (s)  Speedup  Status
torch.gather        0.2626    1.00x    🏆
torch.index_select  0.4126    0.64x

Large Vocab, No Repetition
Config: shape=(10000000, 256), dim=0, indices=10000, unique=9996 (100.0%)
Method              Time (s)  Speedup  Status
torch.gather        5.8014    1.00x    🏆
torch.index_select  5.8184    1.00x

Large Vocab, Low Repetition
Config: shape=(10000000, 256), dim=0, indices=10000, unique=5000 (50.0%)
Method              Time (s)  Speedup  Status
torch.gather        5.7912    1.00x    🏆
torch.index_select  5.8137    1.00x

Large Vocab, High Repetition
Config: shape=(10000000, 256), dim=0, indices=10000, unique=400 (4.0%)
Method              Time (s)  Speedup  Status
torch.gather        5.7784    1.00x    🏆
torch.index_select  5.8100    0.99x
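A rough sketch of how a forward + backward comparison like the one above could be timed. This is not the script behind the reported numbers: the helpers `make_indices` and `time_op`, the iteration count, and the scaled-down shapes are all assumptions chosen so the example runs quickly on CPU.

```python
import time

import torch


def make_indices(vocab: int, count: int, pool: int, seed: int = 0) -> torch.Tensor:
    """Draw `count` indices from a random pool of at most `pool` distinct rows."""
    gen = torch.Generator().manual_seed(seed)
    candidates = torch.randperm(vocab, generator=gen)[:pool]
    picks = torch.randint(0, pool, (count,), generator=gen)
    return candidates[picks]


def time_op(use_gather: bool, weight: torch.Tensor, indices: torch.Tensor, iters: int = 5) -> float:
    """Total wall time in seconds for `iters` forward + backward passes."""
    dim = weight.size(1)
    if weight.is_cuda:
        torch.cuda.synchronize()  # drain pending kernels before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        weight.grad = None
        if use_gather:
            out = torch.gather(weight, 0, indices.unsqueeze(1).expand(-1, dim))
        else:
            out = torch.index_select(weight, 0, indices)
        out.sum().backward()
    if weight.is_cuda:
        torch.cuda.synchronize()  # wait for queued work before stopping the clock
    return time.perf_counter() - start


# Scaled-down stand-in for the "Small Vocab, Low Repetition" configuration.
weight = torch.randn(1000, 16, requires_grad=True)
indices = make_indices(vocab=1000, count=2000, pool=635)
unique_ratio = indices.unique().numel() / indices.numel()

gather_time = time_op(True, weight, indices)
index_select_time = time_op(False, weight, indices)
print(f"unique={unique_ratio:.1%} gather={gather_time:.4f}s index_select={index_select_time:.4f}s")
```

Timings from a sketch like this depend heavily on device, shapes, and repetition rate, so they should not be expected to reproduce the figures above.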
Mast Job Test:
baseline: fire-jingchang-f816557933
torch.index_select backward takes ~37ms
{F1982939713}
exp: fire-jingchang-f816355728
torch.gather backward takes ~10ms
{F1982939742}
Reviewed By: TroyGarden
Differential Revision: D85309309
fbshipit-source-id: c0c5352542ad5bf66d833382057f601c3c181cef
1 parent 7ddc21d, commit 196b9a9
File tree:
- torchrec
- distributed
- modules
3 files changed, +26 -3
Diff content is not shown; only hunk positions survive:
File 1: single-line additions at new lines 341, 352, 394, 535, 1570, 1620
File 2: additions at new lines 411, 419, and 546-552
File 3: addition at new line 248; original lines 253-255 replaced by new lines 254-263