
Commit 6cceeb3

docs: tweak wording around multi-threaded
1 parent e857351 commit 6cceeb3

File tree

2 files changed: +33 -68 lines changed


docs/src/antora.yml

Lines changed: 0 additions & 25 deletions
This file was deleted.

docs/src/modules/ROOT/pages/enterprise-edition/enterprise-edition.adoc

Lines changed: 33 additions & 43 deletions
@@ -406,7 +406,7 @@ In this section, we will focus on multi-threaded incremental solving and partiti
 
 [NOTE]
 ====
-A xref:using-timefold-solver/running-the-solver.adoc#logging[logging level] of `debug` or `trace` might cause congestion multi-threaded solving
+A xref:using-timefold-solver/running-the-solver.adoc#logging[logging level] of `debug` or `trace` might cause congestion
 and slow down the xref:constraints-and-score/performance.adoc#scoreCalculationSpeed[score calculation speed].
 ====

@@ -416,40 +416,31 @@ and slow down the xref:constraints-and-score/performance.adoc#scoreCalculationSp
 
 With this feature, the solver can run significantly faster,
 getting you the right solution earlier.
-It is especially useful for large datasets,
-where score calculation speed is the bottleneck.
-
-The following table shows the observed score calculation speeds
-of the Vehicle Routing Problem and the Maintenance Scheduling Problem,
-as the number of threads increases:
-
-|===
-|Number of Threads |Vehicle Routing |Maintenance Scheduling
-
-|1
-|~ 22,000
-|~ 6,000
-
-|2
-|~ 40,000
-|~ 11,000
-
-|4
-|~ 70,000
-|~ 19,000
-|===
-
-As we can see, the speed increases with the number of threads,
-but the scaling is not exactly linear due to the overhead of managing communication between multiple threads.
-Above 4 move threads,
-this overhead tends to dominate and therefore we do not recommend scaling over that threshold.
+It has been designed to speed up the solver in cases where score calculation is the bottleneck.
+This typically happens when the constraints are computationally expensive,
+or when the dataset is large.
+
+- The sweet spot for this feature is when the score calculation speed is up to 10 thousand per second.
+In this case, we have observed the algorithm to scale linearly with the number of move threads.
+Every additional move thread will bring a speedup,
+albeit with diminishing returns.
+- For score calculation speeds on the order of 100 thousand per second,
+the algorithm no longer scales linearly,
+but using 4 to 8 move threads may still be beneficial.
+- For even higher score calculation speeds,
+the feature does not bring any benefit.
+At these speeds, score calculation is no longer the bottleneck.
+If the solver continues to underperform,
+perhaps you're suffering from xref:constraints-and-score/performance.adoc#scoreTrap[score traps]
+or you may benefit from xref:optimization-algorithms/optimization-algorithms.adoc#customMoves[custom moves]
+to help the solver escape local optima.
 
 [NOTE]
 ====
-These numbers are strongly dependent on move selector configuration,
+These guidelines are strongly dependent on move selector configuration,
 size of the dataset and performance of individual constraints.
-We believe they are indicative of the speedups you can expect from this feature,
-but your mileage may vary significantly.
+We recommend you benchmark your use case
+to determine the optimal number of move threads for your problem.
 ====
 
 ===== Enabling multi-threaded incremental solving
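
As an aside to the benchmarking advice in the hunk above (not part of the commit itself): a rough way to follow it is to run the same solver configuration with different `moveThreadCount` values and compare the best score reached within the same termination limit. In the sketch below, `MySolution`, `loadDataset()` and the `solverConfig.xml` path are hypothetical placeholders, and the config file is assumed to already define a termination limit.

[source,java]
----
import ai.timefold.solver.core.api.solver.Solver;
import ai.timefold.solver.core.api.solver.SolverFactory;
import ai.timefold.solver.core.config.solver.SolverConfig;

// Illustrative sketch only: MySolution and loadDataset() stand in for
// your planning solution class and dataset loader.
public class MoveThreadCountComparison {

    public static void main(String[] args) {
        for (String moveThreadCount : new String[] { "NONE", "2", "4", "8" }) {
            SolverConfig solverConfig = SolverConfig.createFromXmlResource("solverConfig.xml");
            solverConfig.setMoveThreadCount(moveThreadCount);
            Solver<MySolution> solver = SolverFactory.<MySolution> create(solverConfig).buildSolver();
            MySolution best = solver.solve(loadDataset());
            // Each run uses the same termination limit, so the best scores are comparable.
            System.out.println(moveThreadCount + " move thread(s) -> " + best.getScore());
        }
    }

    private static MySolution loadDataset() {
        throw new UnsupportedOperationException("Load your planning problem here.");
    }
}
----

For more rigorous comparisons, Timefold's dedicated benchmarker module produces full reports across such configurations.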
@@ -525,8 +516,10 @@ The following ``moveThreadCount``s are supported:
 * ``AUTO``: Let Timefold Solver decide how many move threads to run in parallel.
 On machines or containers with little or no CPUs, this falls back to the single threaded code.
 * Static number: The number of move threads to run in parallel.
-This can be `1` to enforce running the multi-threaded code with only 1 move thread
-(which is less efficient than `NONE`).
+
+It is counter-effective to set a `moveThreadCount`
+that is higher than the number of available CPU cores,
+as that will slow down the score calculation speed.
 
 [IMPORTANT]
 ====
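
The new wording above warns against exceeding the available CPU cores. Purely as an illustration (not part of this commit), a static `moveThreadCount` could be capped programmatically; this sketch assumes `SolverConfig` exposes the `moveThreadCount` property as a setter, mirroring the `<moveThreadCount>` XML element, and the config file path is a placeholder.

[source,java]
----
import ai.timefold.solver.core.config.solver.SolverConfig;

// Illustrative sketch: cap the move thread count at the CPU cores visible to the JVM.
public class MoveThreadCountChooser {

    public static SolverConfig buildConfig() {
        int availableCores = Runtime.getRuntime().availableProcessors();
        int desiredMoveThreads = 8; // hypothetical target
        int moveThreads = Math.min(desiredMoveThreads, availableCores);
        SolverConfig solverConfig = SolverConfig.createFromXmlResource("solverConfig.xml");
        solverConfig.setMoveThreadCount(Integer.toString(moveThreads));
        return solverConfig;
    }
}
----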
@@ -537,11 +530,6 @@ and therefore you may end up paying more for the same result,
 even though the actual compute time needed will be less.
 ====
 
-It is counter-effective to set a `moveThreadCount`
-that is higher than the number of available CPU cores,
-as that will slow down the score calculation speed.
-One good reason to do it anyway, is to reproduce a bug of a high-end production machine.
-
 [NOTE]
 ====
 Multi-threaded solving is _still reproducible_, as long as the resolved `moveThreadCount` is stable.
@@ -558,16 +546,11 @@ There are additional parameters you can supply to your `solverConfig.xml`:
 <solver xmlns="https://timefold.ai/xsd/solver" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="https://timefold.ai/xsd/solver https://timefold.ai/xsd/solver/solver.xsd">
   <moveThreadCount>4</moveThreadCount>
-  <moveThreadBufferSize>10</moveThreadBufferSize>
   <threadFactoryClass>...MyAppServerThreadFactory</threadFactoryClass>
   ...
 </solver>
 ----
 
-The `moveThreadBufferSize` power tweaks the number of moves that are selected but won't be foraged.
-Setting it too low reduces performance, but setting it too high too.
-Unless you're deeply familiar with the inner workings of multi-threaded solving, don't configure this parameter.
-
 To run in an environment that doesn't like arbitrary thread creation,
 use `threadFactoryClass` to plug in a <<customThreadFactory,custom thread factory>>.
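
The `threadFactoryClass` element above points at a custom thread factory. Below is a minimal sketch of what such a factory could look like; the class name mirrors the `...MyAppServerThreadFactory` placeholder in the config snippet and everything else is illustrative, not taken from the commit.

[source,java]
----
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a custom thread factory for environments that restrict
// arbitrary thread creation; adapt it to your container's thread management.
public class MyAppServerThreadFactory implements ThreadFactory {

    private final AtomicInteger threadIndex = new AtomicInteger();

    @Override
    public Thread newThread(Runnable runnable) {
        Thread thread = new Thread(runnable, "MySolverThread-" + threadIndex.getAndIncrement());
        // Daemon threads never block application shutdown.
        thread.setDaemon(true);
        return thread;
    }
}
----

In a Jakarta EE container, such a factory would typically delegate to a `ManagedThreadFactory` rather than calling `new Thread(...)` directly.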

@@ -1034,3 +1017,10 @@ unless it was already delivered before.
 - If your consumer throws an exception, we will still count the event as delivered.
 - If the system is too occupied to start and execute new threads,
 event delivery will be delayed until a thread can be started.
+
+[NOTE]
+====
+If you are using the `ThrottlingBestSolutionConsumer` for intermediate best solutions
+together with a final best solution consumer,
+both these consumers will receive the final best solution.
+====
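
For context on the note added in this last hunk (illustration only, not part of the commit): both an intermediate best solution consumer and a final best solution consumer can be registered on the same solver job. The sketch below assumes the `SolverManager.solveBuilder()` fluent API and uses plain lambdas in place of the throttling consumer described above; `MySolution` and the store methods are hypothetical placeholders.

[source,java]
----
import ai.timefold.solver.core.api.solver.SolverJob;
import ai.timefold.solver.core.api.solver.SolverManager;

// Rough sketch: both consumers are registered on one job; per the note above,
// the intermediate consumer also receives the final best solution.
public class BestSolutionConsumers {

    public SolverJob<MySolution, Long> solve(SolverManager<MySolution, Long> solverManager,
            MySolution problem) {
        return solverManager.solveBuilder()
                .withProblemId(1L)
                .withProblem(problem)
                .withBestSolutionConsumer(this::storeIntermediateBestSolution)
                .withFinalBestSolutionConsumer(this::storeFinalBestSolution)
                .run();
    }

    private void storeIntermediateBestSolution(MySolution solution) {
        // Persist or publish intermediate best solutions here.
    }

    private void storeFinalBestSolution(MySolution solution) {
        // Persist or publish the final best solution here.
    }
}
----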
