
[RFC] Considerations for "Skill Level" #3635

@Sopel97

There was some recent discussion on Discord about possible improvements to the implementation of "Skill Level" in Stockfish. @vondele suggested trying eval randomization, specifically interpolating the NNUE eval with N(0, RookValueEg) using some parameter. I've implemented it and tested it briefly, but more work is needed to assess the quality of the games played and to calibrate the Elo rating. The initial results suggest that it might be a good direction.
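To make the idea concrete, here is a minimal Python sketch of one way such an interpolation could look. It is not the code from the linked commit: the function name `perturbed_eval`, the linear blend with weight `perturb / 100`, the reading of RookValueEg as a standard deviation, and the value 1380 for `ROOK_VALUE_EG` are all assumptions made for illustration.

```python
import random

# Assumed value of RookValueEg in internal units; check types.h for the
# actual constant in the Stockfish version being used.
ROOK_VALUE_EG = 1380


def perturbed_eval(nnue_eval: int, perturb: int, rng: random.Random) -> int:
    """Blend the NNUE eval with Gaussian noise (hypothetical helper).

    perturb is a percentage in [0, 100]: 0 returns the eval unchanged,
    100 returns pure noise drawn from N(0, ROOK_VALUE_EG), with
    ROOK_VALUE_EG treated as the standard deviation (an assumption).
    """
    if not 0 <= perturb <= 100:
        raise ValueError("perturb must be in [0, 100]")
    noise = rng.gauss(0.0, ROOK_VALUE_EG)
    return int(((100 - perturb) * nnue_eval + perturb * noise) / 100)


if __name__ == "__main__":
    rng = random.Random(42)
    for p in (0, 10, 50, 100):
        print(p, perturbed_eval(150, p, rng))
```

With this kind of blend, a single integer parameter moves the engine smoothly from full strength to effectively random evaluation, which is the range the tournament below explores.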

In the experiment I used pure NNUE evaluation at fixed node counts and varied the interpolation parameter. 46 configurations played a round-robin tournament with 50 games per pairing. The following c-chess-cli command was used:

#!/bin/bash
 
c-chess-cli \
    -concurrency 16 \
    -rounds 1 \
    -games 50 \
    -openings file=/home/sopel/nnue/c-chess-cli/noob_3moves.epd order=random -repeat -resign 3 700 -draw 8 10 \
    -pgn tournament.pgn 2 \
    -each tc=1000+1 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_100k option.RandomEvalPerturb=0 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_90k option.RandomEvalPerturb=0 nodes=90000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_80k option.RandomEvalPerturb=0 nodes=80000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_70k option.RandomEvalPerturb=0 nodes=70000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_60k option.RandomEvalPerturb=0 nodes=60000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_50k option.RandomEvalPerturb=0 nodes=50000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_40k option.RandomEvalPerturb=0 nodes=40000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_30k option.RandomEvalPerturb=0 nodes=30000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_20k option.RandomEvalPerturb=0 nodes=20000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_10k option.RandomEvalPerturb=0 nodes=10000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_5k option.RandomEvalPerturb=0 nodes=5000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_4k option.RandomEvalPerturb=0 nodes=4000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_3k option.RandomEvalPerturb=0 nodes=3000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_2k option.RandomEvalPerturb=0 nodes=2000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_0_1k option.RandomEvalPerturb=0 nodes=1000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_1_100k option.RandomEvalPerturb=1 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_2_100k option.RandomEvalPerturb=2 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_3_100k option.RandomEvalPerturb=3 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_4_100k option.RandomEvalPerturb=4 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_5_100k option.RandomEvalPerturb=5 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_6_100k option.RandomEvalPerturb=6 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_7_100k option.RandomEvalPerturb=7 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_8_100k option.RandomEvalPerturb=8 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_9_100k option.RandomEvalPerturb=9 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_10_100k option.RandomEvalPerturb=10 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_12_100k option.RandomEvalPerturb=12 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_14_100k option.RandomEvalPerturb=14 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_16_100k option.RandomEvalPerturb=16 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_18_100k option.RandomEvalPerturb=18 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_20_100k option.RandomEvalPerturb=20 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_25_100k option.RandomEvalPerturb=25 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_30_100k option.RandomEvalPerturb=30 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_35_100k option.RandomEvalPerturb=35 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_40_100k option.RandomEvalPerturb=40 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_45_100k option.RandomEvalPerturb=45 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_50_100k option.RandomEvalPerturb=50 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_55_100k option.RandomEvalPerturb=55 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_60_100k option.RandomEvalPerturb=60 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_65_100k option.RandomEvalPerturb=65 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_70_100k option.RandomEvalPerturb=70 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_75_100k option.RandomEvalPerturb=75 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_80_100k option.RandomEvalPerturb=80 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_85_100k option.RandomEvalPerturb=85 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_90_100k option.RandomEvalPerturb=90 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_95_100k option.RandomEvalPerturb=95 nodes=100000 \
    -engine cmd=./engines/stockfish/stockfish name=stockfish_pure_100_100k option.RandomEvalPerturb=100 nodes=100000
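The 46 `-engine` lines above follow a regular pattern, so they could also be generated rather than written by hand. The Python sketch below reproduces the same list; the helper name `engine_args` is hypothetical, and the output is meant to be pasted into (or joined onto) the c-chess-cli invocation shown above.

```python
def engine_args(cmd: str = "./engines/stockfish/stockfish") -> list[str]:
    """Build the -engine arguments used in the tournament (hypothetical helper)."""
    # Unperturbed engines at decreasing node counts: 100k..10k, then 5k..1k.
    node_counts = list(range(100_000, 0, -10_000)) + [5000, 4000, 3000, 2000, 1000]
    # Perturbed engines, all at 100k nodes: 1..10, 12..20 step 2, 25..100 step 5.
    perturbs = list(range(1, 11)) + list(range(12, 21, 2)) + list(range(25, 101, 5))

    configs = [(0, n) for n in node_counts] + [(p, 100_000) for p in perturbs]
    args = []
    for perturb, nodes in configs:
        name = f"stockfish_pure_{perturb}_{nodes // 1000}k"
        args.append(
            f"-engine cmd={cmd} name={name} "
            f"option.RandomEvalPerturb={perturb} nodes={nodes}"
        )
    return args


if __name__ == "__main__":
    # Join with line continuations so the output can be pasted into a script.
    print(" \\\n".join(engine_args()))
```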

Naming: stockfish_pure_{RandomEvalPerturb}_{nodes}
Code: Sopel97@56a8a4f
Experiment results: https://drive.google.com/drive/folders/14SZEV6TICedYtNZ2Ym2sFCQynoTBMeQ0?usp=sharing (includes a .pgn with moves saved).
Ordo:

   # PLAYER                     :   RATING  ERROR  POINTS  PLAYED   (%)  CFS(%)
   1 stockfish_pure_1_100k      :      9.8   19.7  1826.0    2250    81      84
   2 stockfish_pure_0_100k      :      0.0   ----  1810.5    2250    80      79
   3 stockfish_pure_2_100k      :     -8.1   19.6  1797.5    2250    80      51
   4 stockfish_pure_4_100k      :     -8.4   20.0  1797.0    2250    80      68
   5 stockfish_pure_0_90k       :    -13.1   19.7  1789.5    2250    80      65
   6 stockfish_pure_3_100k      :    -16.8   19.0  1783.5    2250    79      75
   7 stockfish_pure_5_100k      :    -23.2   19.7  1773.0    2250    79      93
   8 stockfish_pure_6_100k      :    -37.8   19.1  1749.0    2250    78      72
   9 stockfish_pure_0_80k       :    -43.2   19.0  1740.0    2250    77      56
  10 stockfish_pure_8_100k      :    -44.7   19.2  1737.5    2250    77      86
  11 stockfish_pure_7_100k      :    -54.8   19.0  1720.5    2250    76      85
  12 stockfish_pure_0_70k       :    -65.2   19.4  1703.0    2250    76      67
  13 stockfish_pure_9_100k      :    -69.6   19.0  1695.5    2250    75      55
  14 stockfish_pure_10_100k     :    -70.8   20.1  1693.5    2250    75      98
  15 stockfish_pure_0_60k       :    -89.6   19.2  1661.5    2250    74      63
  16 stockfish_pure_12_100k     :    -92.8   19.9  1656.0    2250    74     100
  17 stockfish_pure_0_50k       :   -126.1   19.1  1599.0    2250    71      71
  18 stockfish_pure_14_100k     :   -131.6   19.2  1589.5    2250    71     100
  19 stockfish_pure_16_100k     :   -174.4   19.8  1517.0    2250    67      59
  20 stockfish_pure_0_40k       :   -176.5   19.4  1513.5    2250    67     100
  21 stockfish_pure_18_100k     :   -210.8   19.9  1457.0    2250    65     100
  22 stockfish_pure_0_30k       :   -246.6   19.5  1400.0    2250    62      64
  23 stockfish_pure_20_100k     :   -250.2   20.2  1394.5    2250    62     100
  24 stockfish_pure_0_20k       :   -352.1   21.2  1248.5    2250    55      87
  25 stockfish_pure_25_100k     :   -365.2   20.6  1231.5    2250    55     100
  26 stockfish_pure_30_100k     :   -513.1   24.1  1067.0    2250    47     100
  27 stockfish_pure_0_10k       :   -587.6   25.5   999.5    2250    44     100
  28 stockfish_pure_35_100k     :   -669.2   27.1   933.5    2250    41     100
  29 stockfish_pure_40_100k     :   -831.1   30.0   816.5    2250    36      63
  30 stockfish_pure_0_5k        :   -836.2   29.1   813.0    2250    36     100
  31 stockfish_pure_0_4k        :   -927.1   31.5   752.0    2250    33      99
  32 stockfish_pure_45_100k     :   -966.2   31.4   726.5    2250    32     100
  33 stockfish_pure_0_3k        :  -1031.4   33.1   685.0    2250    30     100
  34 stockfish_pure_50_100k     :  -1161.1   34.6   607.0    2250    27     100
  35 stockfish_pure_0_2k        :  -1209.6   36.0   579.5    2250    26     100
  36 stockfish_pure_55_100k     :  -1333.4   38.0   514.0    2250    23     100
  37 stockfish_pure_0_1k        :  -1426.8   41.5   469.5    2250    21      97
  38 stockfish_pure_60_100k     :  -1464.0   42.1   453.0    2250    20     100
  39 stockfish_pure_65_100k     :  -1690.6   49.2   368.5    2250    16     100
  40 stockfish_pure_70_100k     :  -1936.4   62.2   298.5    2250    13     100
  41 stockfish_pure_75_100k     :  -2076.9   68.5   263.5    2250    12     100
  42 stockfish_pure_80_100k     :  -2360.9   84.0   201.0    2250     9     100
  43 stockfish_pure_85_100k     :  -2581.0   93.3   158.5    2250     7     100
  44 stockfish_pure_90_100k     :  -2913.0  115.3   106.0    2250     5     100
  45 stockfish_pure_95_100k     :  -3417.6  175.1    44.0    2250     2     100
  46 stockfish_pure_100_100k    :  -3677.9  186.7    10.0    2250     0     ---
 
White advantage = 8.06 +/- 1.85
Draw rate (equal opponents) = 55.79 % +/- 0.45
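As a rough illustration of how these numbers might feed into calibration, the Python sketch below linearly interpolates between the measured ratings of the 100k-node runs to estimate which RandomEvalPerturb value corresponds to a target rating. The name `perturb_for_rating` is hypothetical, only every tenth perturb value from the table is used (the low-perturb entries are within the error bars of each other and not monotonic), and the ratings are relative to this pool at 100k nodes, not absolute Elo.

```python
# (RandomEvalPerturb, Ordo rating) pairs copied from the table above,
# 100k-node runs only, ordered from strongest to weakest.
MEASURED = [
    (0, 0.0), (10, -70.8), (20, -250.2), (30, -513.1), (40, -831.1),
    (50, -1161.1), (60, -1464.0), (70, -1936.4), (80, -2360.9),
    (90, -2913.0), (100, -3677.9),
]


def perturb_for_rating(target: float) -> float:
    """Estimate the perturb value for a target rating (hypothetical helper).

    Uses piecewise-linear interpolation between measured points and clamps
    to the measured range; it ignores the error bars reported by Ordo.
    """
    if target >= MEASURED[0][1]:
        return float(MEASURED[0][0])
    if target <= MEASURED[-1][1]:
        return float(MEASURED[-1][0])
    for (p_hi, r_hi), (p_lo, r_lo) in zip(MEASURED, MEASURED[1:]):
        if r_lo <= target <= r_hi:
            frac = (r_hi - target) / (r_hi - r_lo)
            return p_hi + frac * (p_lo - p_hi)
    raise AssertionError("unreachable: target lies inside the measured range")


if __name__ == "__main__":
    for rating in (-100, -500, -1000, -2000):
        print(rating, round(perturb_for_rating(rating), 1))
```

A real calibration would need more games per point and an anchor to an external rating scale; this only shows the shape of the lookup.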

Plots:
