Closed
Labels
enhancement (feature requests and improvements) · perf / speed (performance: speed) · scaling (scaling, serving and parallelizing spaCy) · training (training and updating models) · π nightly (discussion and contributions related to nightly builds)
Description
Hello, great feature! I'm currently running some experiments on my specific use cases.
However, I've noticed that pretraining is considerably slow: one epoch took almost two days on a 1B-word corpus, at an average of 4,800 words/second.
So I checked the task's resource usage. I'm training on a single machine with dual 12-core Xeon CPUs (24 cores total) and no GPU, and I noticed that only one core is used at a time.
Would it be possible to add an option for the desired number of workers on this task? Then we could use all available cores to parallelize the pretraining, which could substantially reduce processing time. A sketch of the kind of interim workaround I've been experimenting with is below.
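For reference, here's a minimal sketch of that interim workaround, not spaCy's official API: it assumes the pretraining bottleneck is in BLAS-backed matrix multiplications and tries to raise the linear-algebra thread count via environment variables before launching the real `python -m spacy pretrain` CLI (spaCy v3 nightly). The env-var names apply to OpenMP/OpenBLAS/BLIS builds and may be no-ops if the backend was compiled single-threaded; the config and output paths are placeholders for your own.

```python
# Sketch of an interim workaround, NOT an official spaCy option:
# raise the linear-algebra backend's thread count via environment
# variables before the pretraining process starts. This only helps
# if the numeric backend supports threading; it may have no effect
# on single-threaded builds.
import os
import subprocess

n_threads = "24"  # placeholder for this dual 12-core Xeon box

env = dict(
    os.environ,
    OMP_NUM_THREADS=n_threads,       # OpenMP-based backends
    OPENBLAS_NUM_THREADS=n_threads,  # OpenBLAS builds of numpy
    BLIS_NUM_THREADS=n_threads,      # BLIS, as used by thinc
)

# spaCy v3 (nightly) pretraining CLI; config.cfg and the output
# directory are placeholders for your own paths.
subprocess.run(
    ["python", "-m", "spacy", "pretrain", "config.cfg", "./pretrain_output"],
    env=env,
    check=True,
)
```

Even if this helps, it only scales the matrix maths, not the data loading or the update loop, so a proper worker option would still be very welcome.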
Best regards