This is exactly what I'm looking for to extend my existing cluster, which has plenty of CPU/RAM but no GPUs.
Can you give some insight into whether the workers can run on low-CPU/RAM systems, such as a series of Raspberry Pi 5 boards, each with an RTX 4090 over a PCIe x1 link, while the master handles checkpoint reallocation using its high CPU/RAM capacity?
Also, is a gigabit cluster network sufficient to relay MQ messages between workers?