
Workload Assignment

Andrew Grant edited this page Jan 30, 2024 · 11 revisions

There are four critical components at play when a worker requests a workload.

  1. Priority: Tests with a higher priority are generally completed before tests of a lower priority. However, this does not strictly mean that the highest priority test overall will be returned to the worker. Rather, the highest priority test that the worker is capable of completing will be assigned. A worker must have the proper Operating System, CPU Flags, Syzygy requirements, Thread counts, Compiler Versions (public engines), and Fine-grained Tokens (private engines) to be able to accept a workload. The list of all such highest priority tests that a worker can complete comprises the candidate assignments.

  2. Throughput: Every test has a throughput value associated with it. By default, tests are created with a throughput of 1000. Suppose there is exactly one test running, with Throughput=1000. If we want to create another test, and have it get twice as many resources, we would create one with Throughput=2000. This means that ~2/3rds of all resources will go to the new test. If we then wanted a third test, which got half of all resources, we would need to create one with Throughput=3000. This is because the total throughput would now be 6000, and the test itself would have 3000. When summing up the throughput values, we only consider tests in the candidate assignments.

  3. Engine Balancing: The balance_engine_throughputs option in the main configuration, if enabled, will scale the throughput of all tests for an engine, by dividing by the number of tests that the engine has at once in the candidate assignments. For example, if Ethereal has two tests at Priority=1, and three tests at Priority=0, then the throughput will be scaled down by a factor of 2, not by a factor of 5, since only the Priority=1 tests are candidates.

  4. Focus Enabled Workers: The client has an option, --focus ENGINE1 [ENGINE2 ...], which allows the user to specify a list of engines to which they would prefer their machine contribute games. If a machine is given an assignment whose dev engine appears in its focus list, we call it a focus-assigned machine. If a focused engine appears in the candidate assignments, then all non-focused engines will be excluded. Furthermore, the balancing algorithm will ignore machines assigned to one of the focused engines when assigning workloads to non-focus-assigned workers. This makes it such that connecting your own machines for your own engine will be a strict increase in the resources your tests get.
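The throughput arithmetic above can be sketched in a few lines. This is a minimal illustration of the sharing math, not code from the server; the function and test names are hypothetical:

```python
def resource_shares(throughputs):
    """Return each test's fraction of total resources, given Throughput values.

    Only tests in the candidate assignments should be passed in, since
    only those are considered when summing throughput.
    """
    total = sum(throughputs.values())
    return {name: tp / total for name, tp in throughputs.items()}

# One test at 1000, a second at 2000: the new test gets ~2/3 of all resources.
two_tests = resource_shares({"test_a": 1000, "test_b": 2000})

# Adding a third test at 3000 gives it half of the (now 6000) total throughput.
three_tests = resource_shares({"test_a": 1000, "test_b": 2000, "test_c": 3000})
```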


Test Selection Algorithm:

  1. Identify all tests which the machine is capable of completing. Filter that list down such that only the highest priority tests remain. If the machine was run with --focus, and some of the focused engines still appear in the list, then remove all tests whose engines are not focused. These are the candidate assignments.

  2. Determine how many threads are currently assigned to each of the candidate tests. If the machine is not going to be focus-assigned, then ignore the thread contributions to the candidate tests that come from focus-assigned machines. This conditional ignore is needed in order to still balance amongst the focus-assigned machines themselves.

  3. Compute the effective-throughput for each test. If balance_engine_throughputs is not enabled, then the throughput is the effective-throughput. Otherwise, divide each individual test's throughput, by the number of tests from the test's dev engine which appear in the candidate assignments.

  4. Compute the resource ratios for each test. The ratio for a test is the number of assigned threads, divided by the effective-throughput. For example: Test #1 has 1000 effective-throughput and 32 threads. Test #2 has 2000 effective-throughput and 16 threads. The ratios for Tests #1 and #2 would be 0.032 and 0.008 respectively. This tells us that Test #1 is getting 4x the resources of Test #2. A 2x factor comes from the 1000 vs 2000 throughput, and another 2x factor comes from the 32 vs 16 threads.

  5. Compute the fair resource ratio. This is the total number of threads on the candidate tests, divided by the sum of the effective throughputs. In our example, this would be 0.016. From this, we see that Test #1 is getting twice as many resources as would be fair, and Test #2 is getting only half the resources that would be fair.

  6. For efficiency, we would like the machines to repeat the same workload multiple times, to reduce overhead on downloading and building engines. If the machine's most recent workload is in our candidates, and no candidate test is receiving less than 75% of its fair amount of resources, then repeat the same test.

  7. If any of the candidate tests have a ratio of 0, select from all tests with a ratio of 0, using the throughput of each test for random weighting.

  8. Otherwise, return the test which has the lowest ratio. If multiple tests share the lowest ratio, select from them at random, without throughput weighting.
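Steps 3 through 8 above can be sketched as follows. This is an illustrative Python rendering of the algorithm, not the server's actual code; all names (select_test, engine_of, and so on) are hypothetical:

```python
import random

def select_test(candidates, threads_assigned, throughputs, engine_of,
                balance_engines, last_test):
    """Pick a test from the candidate assignments, per steps 3-8."""
    # Step 3: effective-throughput, optionally scaled per dev engine.
    if balance_engines:
        counts = {}
        for test in candidates:
            counts[engine_of[test]] = counts.get(engine_of[test], 0) + 1
        eff = {t: throughputs[t] / counts[engine_of[t]] for t in candidates}
    else:
        eff = {t: throughputs[t] for t in candidates}

    # Step 4: resource ratio = assigned threads / effective-throughput.
    ratios = {t: threads_assigned[t] / eff[t] for t in candidates}

    # Step 5: fair ratio = total threads / total effective-throughput.
    fair = sum(threads_assigned[t] for t in candidates) / sum(eff.values())

    # Step 6: repeat the last workload if it is a candidate and no
    # candidate test is below 75% of its fair share.
    if last_test in ratios and all(r >= 0.75 * fair for r in ratios.values()):
        return last_test

    # Step 7: among untouched tests (ratio 0), pick at random,
    # weighted by each test's throughput.
    zeros = [t for t in candidates if ratios[t] == 0]
    if zeros:
        return random.choices(zeros, weights=[throughputs[t] for t in zeros])[0]

    # Step 8: otherwise the lowest ratio wins; ties broken uniformly at random.
    lowest = min(ratios.values())
    return random.choice([t for t in candidates if ratios[t] == lowest])
```

Running this on the worked example from steps 4 and 5 (Test #1: 1000 throughput, 32 threads; Test #2: 2000 throughput, 16 threads) selects Test #2, the test furthest below its fair share.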


Concurrency Settings

The server provides three critical values in the workload JSON response. These are cutechess-count, concurrency-per, and games-per-cutechess. These values are a function of the number of threads and sockets, as well as the nature of the test or tune. They are explained below.

  1. cutechess-count indicates the number of cutechess copies that should be running at one time. For a typical workload, where each engine is playing with one thread, cutechess-count will be equal to the number of sockets on the worker, as provided via --nsockets or -N when starting the Client. If the workload is an SPSA tune, using the MULTIPLE method of distributing SPSA-points, then cutechess-count will be the maximum number of concurrent games divided by two. Finally, if the previous condition is not true, and the workload uses more than 1 thread for either engine, then cutechess-count will be set to 1.

  2. concurrency-per indicates the number of concurrent games that will be played, for any particular cutechess copy that is running. If the workload is an SPSA tune, using the MULTIPLE method of distributing SPSA-points, then this value will be 2. Otherwise, it will be the maximum number of concurrent games, which is defined as (threads // cutechess-count) // max(dev_threads, base_threads).

  3. games-per-cutechess is the total number of games to play on each particular cutechess copy that is running. Once again, SPSA tunes using the MULTIPLE method are a special case, and will play 2 * workload_size games, i.e. a game-pair for each workload_size. The general case will instead play 2 * workload_size * concurrency-per, i.e. a game-pair for each workload_size for each possible concurrent game.
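The three values can be sketched as one function of the worker's threads and sockets. This is an illustrative reconstruction from the descriptions above, not the server's code; the function and argument names are hypothetical, and the assumption that the SPSA MULTIPLE case derives its maximum concurrent games the same way as the general case is mine:

```python
def concurrency_settings(threads, sockets, dev_threads, base_threads,
                         workload_size, spsa_multiple=False):
    """Return (cutechess-count, concurrency-per, games-per-cutechess)."""
    # Maximum number of concurrent games the worker can host.
    max_games = threads // max(dev_threads, base_threads)

    if spsa_multiple:
        # SPSA tunes using the MULTIPLE method run many 2-game copies.
        cutechess_count = max_games // 2
        concurrency_per = 2
        games_per = 2 * workload_size
    else:
        # One cutechess copy per socket for 1-thread engines; otherwise one total.
        cutechess_count = sockets if max(dev_threads, base_threads) == 1 else 1
        concurrency_per = (threads // cutechess_count) // max(dev_threads,
                                                              base_threads)
        games_per = 2 * workload_size * concurrency_per

    return cutechess_count, concurrency_per, games_per
```

For example, a 16-thread, 2-socket worker on a 1-thread-per-engine test with workload_size=32 would run 2 cutechess copies, each with 8 concurrent games and 512 games in total.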
