Move evaluation speed should not count non-foraged moves in multi-threaded solving

### What is **move evaluation**?

A) The **scoring** of a move
B) The **scoring + foraging** of a move

For single-threaded solving, A and B are equal: every scored move is also foraged.
For multi-threaded solving, there are "wasted moves": moves that got scored but don't get foraged.

### Why do users care about move evaluation speed?

1) In **constraint development**, to assess whether adding a new constraint halves that number.
2) When **comparing two runs**, to validate that the hardware is apples to apples.
3) When **comparing two benchmarks**, to see if it's "twice as fast", which needs a number that grows linearly. It's a proxy for performance.
4) During a **model review**, to quickly assess if "the constraints are slow" (below 10'000).

The first three care about the relative difference only. The last one only cares about the order of magnitude of the absolute number.

### Why is move speed used as a proxy to performance quality?

Because there's nothing better.

Ideally, it would use the score: better performance leads to a better score. And a better score is better.

But the score doesn't increase linearly. It even flatlines. So if a lot of work makes the code 500% faster, the score might only be 3% better. Or even the same. Looking at the score on that dataset would not value the impact of that work correctly. Score is a bad way to measure performance quality.

### Should move evaluation speed count wasted moves in multi-threading?

Only if we define it as A) the scoring of a move.
But we can also define it as B).

**No, because it corrupts "comparing two benchmarks"** between benchmarks with a different number of threads.

For example, according to A:
- benchmark X has 2 threads with a speed of 20'000
- benchmark Y has 4 threads with a speed of 25'000

This will lead to the conclusion that **Y is better than X, which is false**: Y has a worse score, because 10'000 of its 25'000 moves are wasted.

According to B:
- benchmark X has 2 threads with a speed of 20'000
- benchmark Y has 4 threads with a speed of 15'000

This will lead to the right conclusion that X is better than Y.
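The arithmetic behind both rankings can be sketched as follows. This is only an illustration, not solver code: the `BenchmarkRun` class, its field names, the 10-second duration and the per-second unit are assumptions made for the example. The move counts are chosen to reproduce the speeds listed above, where X wastes no moves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRun:
    """Hypothetical record of one benchmark run (not a real solver type)."""
    name: str
    thread_count: int
    scored_moves: int   # moves whose score was calculated
    foraged_moves: int  # scored moves that were also foraged
    seconds: float      # assumed run duration

    @property
    def speed_a(self) -> float:
        """Definition A: scored moves per second, wasted moves included."""
        return self.scored_moves / self.seconds

    @property
    def speed_b(self) -> float:
        """Definition B: scored and foraged moves per second."""
        return self.foraged_moves / self.seconds

    @property
    def wasted_per_second(self) -> float:
        """Moves that got scored but never foraged, per second."""
        return (self.scored_moves - self.foraged_moves) / self.seconds


# Counts picked so that a 10-second run yields the speeds from the example.
x = BenchmarkRun("X", thread_count=2, scored_moves=200_000,
                 foraged_moves=200_000, seconds=10.0)
y = BenchmarkRun("Y", thread_count=4, scored_moves=250_000,
                 foraged_moves=150_000, seconds=10.0)

best_by_a = max((x, y), key=lambda run: run.speed_a)
best_by_b = max((x, y), key=lambda run: run.speed_b)

for run in (x, y):
    print(f"{run.name}: A={run.speed_a:.0f} B={run.speed_b:.0f} "
          f"wasted={run.wasted_per_second:.0f}")
print(f"best by A: {best_by_a.name}, best by B: {best_by_b.name}")
```

Ranking by `speed_a` picks Y, ranking by `speed_b` picks X. The gap between the two numbers for Y is exactly its wasted moves.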

### C) Should we not show both: scoring speed and scoring+foraging speed?

No. Users don't care about A) in any of the cases 1) through 4).
Only our solver engineers do. Of course it can be an internal metric.

For users, the only question is "is it faster?" (as in "do I get more?").
The log line and platform UI should stick to one number ("move speed"), the one that matters. Less is more.
