There was some discussion recently on Discord about possible improvements to the implementation of "Skill Level" in Stockfish. @vondele suggested trying eval randomization, specifically interpolating the NNUE eval with noise drawn from N(0, RookValueEg), controlled by a single parameter. I've implemented it and tested it briefly, but more work is needed to assess the quality of the games played and to calibrate the Elo rating. The initial results suggest that this might be a good direction.
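One way to read the interpolation described above, written out as a formula. This is only a sketch: it assumes the parameter (called `RandomEvalPerturb` in the command further down) is a percentage $p \in [0, 100]$, which matches the range of values tested, and that a fresh noise sample $r$ is drawn on every evaluation call. The exact expression, rounding, and clamping in the linked commit may differ.

$$
\mathrm{eval}' = \left(1 - \frac{p}{100}\right)\,\mathrm{eval}_{\mathrm{NNUE}} + \frac{p}{100}\,r,
\qquad r \sim \mathcal{N}(0,\ \mathrm{RookValueEg})
$$

Under this reading, $p = 0$ leaves the NNUE eval untouched and $p = 100$ replaces it entirely with noise, which fits the results below, where the `100` configuration scores close to 0%.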
The experiment used pure NNUE evaluation at fixed node counts while varying the interpolation parameter. 46 configurations played a round-robin tournament with 50 games per pairing. The following c-chess-cli command was used:
#!/bin/bash
c-chess-cli \
-concurrency 16 \
-rounds 1 \
-games 50 \
-openings file=/home/sopel/nnue/c-chess-cli/noob_3moves.epd order=random \
-repeat \
-resign 3 700 \
-draw 8 10 \
-pgn tournament.pgn 2 \
-each tc=1000+1 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_100k option.RandomEvalPerturb=0 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_90k option.RandomEvalPerturb=0 nodes=90000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_80k option.RandomEvalPerturb=0 nodes=80000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_70k option.RandomEvalPerturb=0 nodes=70000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_60k option.RandomEvalPerturb=0 nodes=60000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_50k option.RandomEvalPerturb=0 nodes=50000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_40k option.RandomEvalPerturb=0 nodes=40000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_30k option.RandomEvalPerturb=0 nodes=30000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_20k option.RandomEvalPerturb=0 nodes=20000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_10k option.RandomEvalPerturb=0 nodes=10000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_5k option.RandomEvalPerturb=0 nodes=5000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_4k option.RandomEvalPerturb=0 nodes=4000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_3k option.RandomEvalPerturb=0 nodes=3000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_2k option.RandomEvalPerturb=0 nodes=2000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_1k option.RandomEvalPerturb=0 nodes=1000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_1_100k option.RandomEvalPerturb=1 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_2_100k option.RandomEvalPerturb=2 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_3_100k option.RandomEvalPerturb=3 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_4_100k option.RandomEvalPerturb=4 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_5_100k option.RandomEvalPerturb=5 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_6_100k option.RandomEvalPerturb=6 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_7_100k option.RandomEvalPerturb=7 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_8_100k option.RandomEvalPerturb=8 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_9_100k option.RandomEvalPerturb=9 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_10_100k option.RandomEvalPerturb=10 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_12_100k option.RandomEvalPerturb=12 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_14_100k option.RandomEvalPerturb=14 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_16_100k option.RandomEvalPerturb=16 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_18_100k option.RandomEvalPerturb=18 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_20_100k option.RandomEvalPerturb=20 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_25_100k option.RandomEvalPerturb=25 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_30_100k option.RandomEvalPerturb=30 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_35_100k option.RandomEvalPerturb=35 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_40_100k option.RandomEvalPerturb=40 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_45_100k option.RandomEvalPerturb=45 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_50_100k option.RandomEvalPerturb=50 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_55_100k option.RandomEvalPerturb=55 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_60_100k option.RandomEvalPerturb=60 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_65_100k option.RandomEvalPerturb=65 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_70_100k option.RandomEvalPerturb=70 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_75_100k option.RandomEvalPerturb=75 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_80_100k option.RandomEvalPerturb=80 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_85_100k option.RandomEvalPerturb=85 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_90_100k option.RandomEvalPerturb=90 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_95_100k option.RandomEvalPerturb=95 nodes=100000 \
-engine cmd=./engines/stockfish/stockfish name=stockfish_pure_100_100k option.RandomEvalPerturb=100 nodes=100000
Naming: stockfish_pure_{RandomEvalPerturb}_{nodes}
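For anyone wanting to rerun or extend the tournament, the engine list does not need to be written out by hand. The following is a minimal sketch that prints the same 46 `-engine` groups following the naming scheme above. The helper names `engine_args` and `all_engine_args` are hypothetical and were not part of the original script; the engine path, option name, and node counts are copied from the command above.

```shell
#!/bin/bash
# Sketch: generate the 46 "-engine" argument groups used in the tournament.
# engine_args and all_engine_args are illustrative helper names (assumptions).

engine=./engines/stockfish/stockfish

# Print one "-engine ..." group for a given perturbation value and node count.
engine_args() {
    local perturb=$1 nodes=$2
    local label="$(( nodes / 1000 ))k"
    echo "-engine cmd=$engine name=stockfish_pure_${perturb}_${label} option.RandomEvalPerturb=$perturb nodes=$nodes"
}

all_engine_args() {
    local n p
    # Unperturbed baselines at decreasing node counts (15 engines).
    for n in 100000 90000 80000 70000 60000 50000 40000 30000 20000 10000 \
             5000 4000 3000 2000 1000; do
        engine_args 0 "$n"
    done
    # Perturbed engines, all at 100k nodes (31 engines).
    for p in 1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 $(seq 25 5 100); do
        engine_args "$p" 100000
    done
}

all_engine_args
```

Because none of the generated arguments contain spaces, the output could be appended to the c-chess-cli invocation with an unquoted command substitution such as `$(all_engine_args)`.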
Code: Sopel97@56a8a4f
Experiment results: https://drive.google.com/drive/folders/14SZEV6TICedYtNZ2Ym2sFCQynoTBMeQ0?usp=sharing (includes a .pgn with moves saved).
Ordo:
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(%)
1 stockfish_pure_1_100k : 9.8 19.7 1826.0 2250 81 84
2 stockfish_pure_0_100k : 0.0 ---- 1810.5 2250 80 79
3 stockfish_pure_2_100k : -8.1 19.6 1797.5 2250 80 51
4 stockfish_pure_4_100k : -8.4 20.0 1797.0 2250 80 68
5 stockfish_pure_0_90k : -13.1 19.7 1789.5 2250 80 65
6 stockfish_pure_3_100k : -16.8 19.0 1783.5 2250 79 75
7 stockfish_pure_5_100k : -23.2 19.7 1773.0 2250 79 93
8 stockfish_pure_6_100k : -37.8 19.1 1749.0 2250 78 72
9 stockfish_pure_0_80k : -43.2 19.0 1740.0 2250 77 56
10 stockfish_pure_8_100k : -44.7 19.2 1737.5 2250 77 86
11 stockfish_pure_7_100k : -54.8 19.0 1720.5 2250 76 85
12 stockfish_pure_0_70k : -65.2 19.4 1703.0 2250 76 67
13 stockfish_pure_9_100k : -69.6 19.0 1695.5 2250 75 55
14 stockfish_pure_10_100k : -70.8 20.1 1693.5 2250 75 98
15 stockfish_pure_0_60k : -89.6 19.2 1661.5 2250 74 63
16 stockfish_pure_12_100k : -92.8 19.9 1656.0 2250 74 100
17 stockfish_pure_0_50k : -126.1 19.1 1599.0 2250 71 71
18 stockfish_pure_14_100k : -131.6 19.2 1589.5 2250 71 100
19 stockfish_pure_16_100k : -174.4 19.8 1517.0 2250 67 59
20 stockfish_pure_0_40k : -176.5 19.4 1513.5 2250 67 100
21 stockfish_pure_18_100k : -210.8 19.9 1457.0 2250 65 100
22 stockfish_pure_0_30k : -246.6 19.5 1400.0 2250 62 64
23 stockfish_pure_20_100k : -250.2 20.2 1394.5 2250 62 100
24 stockfish_pure_0_20k : -352.1 21.2 1248.5 2250 55 87
25 stockfish_pure_25_100k : -365.2 20.6 1231.5 2250 55 100
26 stockfish_pure_30_100k : -513.1 24.1 1067.0 2250 47 100
27 stockfish_pure_0_10k : -587.6 25.5 999.5 2250 44 100
28 stockfish_pure_35_100k : -669.2 27.1 933.5 2250 41 100
29 stockfish_pure_40_100k : -831.1 30.0 816.5 2250 36 63
30 stockfish_pure_0_5k : -836.2 29.1 813.0 2250 36 100
31 stockfish_pure_0_4k : -927.1 31.5 752.0 2250 33 99
32 stockfish_pure_45_100k : -966.2 31.4 726.5 2250 32 100
33 stockfish_pure_0_3k : -1031.4 33.1 685.0 2250 30 100
34 stockfish_pure_50_100k : -1161.1 34.6 607.0 2250 27 100
35 stockfish_pure_0_2k : -1209.6 36.0 579.5 2250 26 100
36 stockfish_pure_55_100k : -1333.4 38.0 514.0 2250 23 100
37 stockfish_pure_0_1k : -1426.8 41.5 469.5 2250 21 97
38 stockfish_pure_60_100k : -1464.0 42.1 453.0 2250 20 100
39 stockfish_pure_65_100k : -1690.6 49.2 368.5 2250 16 100
40 stockfish_pure_70_100k : -1936.4 62.2 298.5 2250 13 100
41 stockfish_pure_75_100k : -2076.9 68.5 263.5 2250 12 100
42 stockfish_pure_80_100k : -2360.9 84.0 201.0 2250 9 100
43 stockfish_pure_85_100k : -2581.0 93.3 158.5 2250 7 100
44 stockfish_pure_90_100k : -2913.0 115.3 106.0 2250 5 100
45 stockfish_pure_95_100k : -3417.6 175.1 44.0 2250 2 100
46 stockfish_pure_100_100k : -3677.9 186.7 10.0 2250 0 ---
White advantage = 8.06 +/- 1.85
Draw rate (equal opponents) = 55.79 % +/- 0.45
