[RMP] T4R fixes: MultiGPU data parallel training for next-item prediction and fixed serving #522

@karlhigley

Problem:

We have customers who would like to use multi-GPU Transformers4Rec but are blocked by issues with our existing support for session-based models.

Goal:

  • Unblock customer use cases so that customers can try out T4R and give us feedback

Constraints:

  • We don't yet have TorchScript support (which is out of scope for this issue)

Starting Point:

Note: Multi-GPU training for the session-based binary classification / regression use cases is addressed by RMP #708.
