MinerX scheduler questions #7945
Replies: 6 comments
-
So documenting a couple of the variables that I understand the meaning/use of:
-
I have a test sealing worker that I run with one and with two GPUs. The enhanced scheduler only seems to work perfectly for the one-GPU setup:
1 GPU worker: I set variables like
2 GPU worker: I do the same, but it does not work as I was expecting. I guess this is because the actual execution of the job is done inside the lotus-worker, while the enhanced scheduler is just the one assigning jobs. So for me, using these variables does not really seem to get me all the way. I would still have to run one worker per physical GPU to ensure it is taking full advantage.
Secondly, I noticed that a lotus-worker that runs 2 C2 jobs still only provides one CPU thread per GPU while feeding data. This might be a bottleneck. If I run the old trick with 2 workers for one 3090, then each process will run 1 thread at 100%. I guess lotus-worker should assign one thread per job to load data to the GPU.
So these issues might not be related to the enhanced scheduler directly, but a lot of us would like to use 0.5 GPU per job, and it seems like this would need some more tweaking in the lotus-worker to actually work. Thanks!
-
Configuration Expected When GPU(s) not in use
When GPU(s) in use by other threads
Actual
-
With env C2_32G_GPU_UTILIZATION=0.5 and PC2_32G_GPU_UTILIZATION=0.5, I can confirm that 2 PC2s or 2 C2s can run on a single 3090 with a single worker process.
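To make the arithmetic behind that observation concrete, here is a minimal sketch of fractional GPU accounting. It is not the Lotus scheduler's code: the function name `max_concurrent_tasks` and the rule that a task cannot span GPUs are assumptions made for illustration only.

```python
from fractions import Fraction


def max_concurrent_tasks(gpu_count: int, utilization: str) -> int:
    """Hypothetical model: each task reserves `utilization` of one GPU.

    Assumes a task never spans two GPUs, so capacity is counted per GPU.
    The utilization is passed as a string (as it would arrive from an
    environment variable) and parsed exactly to avoid float rounding.
    """
    share = Fraction(utilization)
    if share <= 0:
        raise ValueError("utilization must be positive")
    per_gpu = int(1 // share)  # whole tasks that fit on a single GPU
    return gpu_count * per_gpu


# One 3090 with a 0.5 share per task, as reported above.
print(max_concurrent_tasks(1, "0.5"))  # 2
```

Under this model a worker with two GPUs would be expected to accept four such tasks, which is the behaviour the earlier comment reports not seeing in practice.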
-
@magik6k Can you lay out what we expect the scheduler to do, given the feedback above?
-
I wondered why my system starts 2 PC2 jobs at the same time, yet one is always done before the other. Well, it turns out that the lotus-worker runs the jobs' GPU phases one by one: first base tree_c for both of them, and then base tree_r afterwards. So it makes sense now that one job is done as soon as its base tree_r computation finishes, and only then does the second job go through the same step. Unfortunately, this is very inefficient. We want both jobs hitting the GPU at the same time, just like we run multiple PC1 jobs against the CPU! Logs/example here:
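The ordering described above can be illustrated with a small timeline sketch. This is a toy model, not lotus-worker code: the function name `serialized_finish_times` and the phase durations (10 s and 5 s) are hypothetical values chosen only to show why one job always finishes first.

```python
def serialized_finish_times(jobs, phases):
    """Toy model: the GPU runs one phase of one job at a time.

    `phases` is a list of (name, seconds). Each phase is executed for
    every job in turn before the next phase starts, as described in the
    comment above. Returns the time at which each job's last phase ends.
    """
    clock = 0.0
    finish = {}
    for _name, seconds in phases:
        for job in jobs:
            clock += seconds
            finish[job] = clock  # overwritten until the job's final phase
    return finish


# Hypothetical durations: tree_c takes 10 s, tree_r takes 5 s per job.
times = serialized_finish_times(
    ["job-1", "job-2"], [("tree_c", 10.0), ("tree_r", 5.0)]
)
print(times)  # {'job-1': 25.0, 'job-2': 30.0}
```

In this model the second job always trails the first by exactly one tree_r duration, and neither job finishes sooner than it would if the two had simply been run back to back.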
-
The MinerX team has requested details on the new scheduler process. The main question is what behaviour to expect with multiple workers on the same node with multiple GPUs. I am opening this discussion so the MinerX team can raise their questions and get further clarification.
MinerX, please post your questions and comments in this discussion for review.