Releases: bctboi23/CeeChess
CeeChess v2.2
This release comes with minor updates to the move ordering and reductions, providing a reasonable boost in playing strength (about +48 Elo at STC and +44 Elo at LTC in self-play against 2.1). I am currently rewriting the move ordering entirely, so this will be overhauled in the next update.
STC
Results of 2.2 vs 2.1 (10+0.1, NULL, 16MB, chess.epd):
Elo: 48.08 +/- 7.85, nElo: 66.78 +/- 10.77
LOS: 100.00 %, DrawRatio: 37.80 %, PairsRatio: 1.91
Games: 4000, Wins: 1421, Losses: 871, Draws: 1708, Points: 2275.0 (56.88 %)
Ptnml(0-2): [79, 348, 756, 578, 239], WL/DD Ratio: 0.93
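As a rough sanity check, the headline numbers above can be recomputed from the pentanomial counts. This is a minimal sketch, not the testing tool's own code: the function names are hypothetical, the Elo figure uses the standard logistic model, and the DrawRatio and PairsRatio definitions are inferred from the fact that they reproduce the reported values.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def pentanomial_summary(ptnml):
    """Summarise game-pair counts for pair scores of 0, 0.5, 1, 1.5, 2 points."""
    pairs = sum(ptnml)
    points = sum(0.5 * i * n for i, n in enumerate(ptnml))
    score = points / (2 * pairs)  # each pair consists of two games
    return {
        "points": points,
        "score": score,
        "elo": elo_from_score(score),
        # Assumed definition: share of pairs that ended 1-1.
        "draw_ratio": ptnml[2] / pairs,
        # Assumed definition: pairs won divided by pairs lost.
        "pairs_ratio": (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1]),
    }

stc = pentanomial_summary([79, 348, 756, 578, 239])
print(f"{stc['elo']:.2f} {stc['draw_ratio']:.2%} {stc['pairs_ratio']:.2f}")
```

With the STC counts this gives 2275.0 points and matches the reported Elo, DrawRatio, and PairsRatio to the printed precision. The error bars and nElo need the full variance calculation and are not covered here.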
LTC
Results of 2.2 vs 2.1 (60+0.6, NULL, 64MB, chess.epd):
Elo: 43.83 +/- 10.19, nElo: 66.19 +/- 15.23
LOS: 100.00 %, DrawRatio: 39.00 %, PairsRatio: 1.93
Games: 2000, Wins: 626, Losses: 375, Draws: 999, Points: 1125.5 (56.27 %)
Ptnml(0-2): [25, 183, 390, 320, 82], WL/DD Ratio: 0.57
CeeChess v2.1
This minor release removes WinBoard support from the engine. There should be no difference in playing strength, but this way the autodetect features of the various GUIs (Arena, LucasChess, etc.) will find UCI only, preventing any old issues with WinBoard.
CeeChess v2.0
This release comes with a complete rewrite of the tuning mechanism: I hand-wrote SPSA with a newish optimizer seen here: https://arxiv.org/abs/2010.07468. It also comes with a rewrite of the move generator and the board representation, as CeeChess is now a bitboard engine. It uses magic bitboards to generate sliding moves, and the evaluation is now much more robust, including mobility, king safety, and threat detection. The self play results are below:
STC:
Results of 2.0 vs 1.4 (10+0.1, NULL, 16MB, chess.epd):
Elo: 161.81 +/- 9.65, nElo: 207.92 +/- 10.77
LOS: 100.00 %, DrawRatio: 27.30 %, PairsRatio: 7.03
Games: 4000, Wins: 2371, Losses: 632, Draws: 997, Points: 2869.5 (71.74 %)
Ptnml(0-2): [38, 143, 546, 588, 685], WL/DD Ratio: 3.11
LTC:
Results of 2.0 vs 1.4 (60+0.6, NULL, 16MB, chess.epd):
Elo: 174.55 +/- 13.35, nElo: 234.55 +/- 15.23
LOS: 100.00 %, DrawRatio: 23.80 %, PairsRatio: 8.65
Games: 2000, Wins: 1188, Losses: 260, Draws: 552, Points: 1464.0 (73.20 %)
Ptnml(0-2): [12, 67, 238, 347, 336], WL/DD Ratio: 2.45
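For readers unfamiliar with SPSA, the tuning approach mentioned in this release can be sketched roughly as follows. This is not the CeeChess tuner: it uses a plain fixed-step gradient update rather than the optimizer from the linked paper, and `spsa_minimize`, `toy_loss`, and the step sizes `a` and `c` are all hypothetical names and values chosen for illustration.

```python
import random

def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Plain SPSA with a fixed-step update (illustrative assumption only)."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        # Random +/-1 perturbation applied to every parameter at once.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        # Two loss evaluations yield a gradient estimate for all parameters.
        diff = loss(plus) - loss(minus)
        grad = [diff / (2.0 * c * d) for d in delta]
        theta = [t - a * g for t, g in zip(theta, grad)]
    return theta

def toy_loss(params):
    """Stand-in objective with its minimum at (3, -1)."""
    return (params[0] - 3.0) ** 2 + (params[1] + 1.0) ** 2

tuned = spsa_minimize(toy_loss, [0.0, 0.0])
print(tuned)
```

The appeal for engine tuning is that each iteration costs two objective evaluations regardless of how many parameters are being tuned.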
The coming updates (up to 3.0) will involve the following:
- A refactor of the board representation, move generation, and other internal representations (currently bitboards work and pass all tests, but a lot of indices are kind of hacky and not clear or efficient)
- A complete rewrite of the search, including new move ordering, pruning, and other search improvements (SEE is planned, much better history heuristic is planned, and so on)
- Other QoL things, including switching to UCI only, implementing multiPV, and potentially LazySMP or some form of multithreading
After 3.0 I will focus on writing a from-scratch NNUE implementation, but I would like to make it to 2800 CCRL (or potentially 3000) without using it.
CeeChess 1.4 - The tuning update
+150 ELO self-play (1s / move)
- Increased Futility and Reverse Futility Pruning Depths
- Tweaked LMR
- Cleaner, easier to understand code
- Searches with a null window first before re-searching with the full window
- Added Second Set of Killer Moves
- Added King Safety in the form of King Tropism
  - Extra bonus to diagonals in line with the king
  - Extra bonus to attack if enemy king is near semi-open files
  - Weighted by attacker's material
- Evaluation tuned using a logistic regression over a custom constructed dataset, similar to the Texel Method
  - Black-box tuning was done using simulated annealing + local search, with a Pseudo-Huber loss
    - Pseudo-Huber loss was used here since the dataset likely contains outliers that would unfavorably skew the relatively simple evaluation function. This was a choice I made based on what I understood about the dataset, and it gave a marginal improvement in evaluation quality over the traditional MSE loss (+10ish Elo from 2000 games). If my evaluation were more complex, I might be more tempted to stay with MSE loss, as long as on-board checkmates are removed from the dataset
- Fixed some timeout bugs
- Increased hash table stability
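To illustrate the loss choice described in the tuning bullets, here is a minimal sketch of a Texel-style objective with a Pseudo-Huber loss next to squared error. The function names, the scaling constant `k`, and `delta=0.25` are assumptions for illustration, not the values used in CeeChess.

```python
import math

def predicted_score(eval_cp, k=1.0):
    """Map a centipawn evaluation to an expected game result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * eval_cp / 400.0))

def pseudo_huber(residual, delta=0.25):
    """Roughly quadratic near zero, close to linear for large residuals."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

def squared_error(residual):
    return residual ** 2

def dataset_loss(samples, loss_fn):
    """Mean loss over (eval_cp, result) pairs, with result in {0, 0.5, 1}."""
    total = sum(loss_fn(result - predicted_score(ev)) for ev, result in samples)
    return total / len(samples)

# Hypothetical samples; the last one is an outlier (large plus score, game lost).
samples = [(50, 1.0), (-30, 0.5), (10, 0.5), (600, 0.0)]
print(dataset_loss(samples, squared_error), dataset_loss(samples, pseudo_huber))
```

Because the Pseudo-Huber penalty grows close to linearly for large residuals, a badly mispredicted position contributes relatively less to the total than it does under squared error.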
Gauntlet run for test ratings (1 min, 0.5sec inc), with elo centered around the v1.4 release (ratings from bayeselo):
| Rank | Name | Elo | + | - | Games | Score | Oppo. | Draws |
|---|---|---|---|---|---|---|---|---|
| 1 | Barbarossa-0.6.0 | 38 | 34 | 33 | 240 | 55% | 95 | 23% |
| 2 | CeeChess-v1.4 | 0 | 13 | 13 | 1664 | 65% | -13 | 26% |
| 3 | Barbarossa-0.5.0-win10-64 | -34 | 33 | 33 | 240 | 45% | 95 | 28% |
| 4 | Kingfisher.v1.1.1 | -107 | 32 | 33 | 240 | 34% | 95 | 36% |
| 5 | gopher_check | -146 | 34 | 35 | 238 | 29% | 95 | 26% |
| 6 | CeeChess 1.3.2 | -149 | 34 | 36 | 238 | 29% | 95 | 25% |
| ... |
Since CCRL ratings were adjusted down recently (Stockfish went from 3900 CCRL to ~3630 afaik), this release no longer breaks the CCRL 2400 barrier. However, comparing the results here to the old ratings of Barbarossa-0.6.0 (2468), Barbarossa-0.5.0 (~2375ish, I believe) and the others suggests that it would have broken that barrier under the old ratings. I now expect the engine to land in the range of 2300-2350, given that Barbarossa-0.6.0 has a new rating of 2355.
More aggressive pruning
(+15 ELO)
- Increased Aggressiveness of LMR
- Search fewer moves fully
- Increased Reverse Futility Depth
Search tuning is over for now. I will be moving on to write a king safety evaluation, to hopefully make my engine play much more aggressive and sacrificial chess (while gaining Elo, of course).
Futility pruning + code cleanup
+5 ELO
- Reduced Razoring to depth 2
- Added Futility Pruning (to depth 6)
  - If a node is marked futile, only search checks, captures, and promotions
- Rewrote LMR
  - Cleaner, easier to understand code
  - Precomputes LMR tables for speed (idea from Ethereal)
Elo was tested with 15"+0.3" games; expect a little more Elo at longer time controls.
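The precomputed LMR table mentioned in this release could look roughly like the sketch below. The logarithmic formula and the `base` and `divisor` constants are a common shape in open-source engines and are assumptions here, not the values CeeChess uses. All names are hypothetical.

```python
import math

MAX_DEPTH = 64
MAX_MOVES = 64

def build_lmr_table(base=0.75, divisor=2.25):
    """Precompute a reduction for every (depth, move number) pair once."""
    table = [[0] * MAX_MOVES for _ in range(MAX_DEPTH)]
    for depth in range(1, MAX_DEPTH):
        for move_number in range(1, MAX_MOVES):
            value = base + math.log(depth) * math.log(move_number) / divisor
            table[depth][move_number] = int(value)
    return table

LMR_TABLE = build_lmr_table()

def reduction(depth, move_number):
    """Table lookup used inside the search instead of recomputing logs."""
    d = min(depth, MAX_DEPTH - 1)
    m = min(move_number, MAX_MOVES - 1)
    return LMR_TABLE[d][m]

print(reduction(8, 16))
```

The point of the table is that the search only does an array lookup per move, while the logarithms are evaluated once at startup.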
CeeChess v1.3
(+100 ELO)
- added tapered evaluation and game phase interpolation (using a maximum game phase value of 24)
- added new piece square tables for midgame and endgame (Lyudmil's tables in the PSQT challenge)
- changed values for passed pawns in the midgame and endgame
- value the bishop pair more in the endgame
- more aggressive static null move pruning
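The tapered evaluation from the list above can be sketched as a linear blend between midgame and endgame scores. The release only mentions the value 24; the per-piece weights below are the common convention that sums to 24 and are an assumption here, as are the function names.

```python
# Assumed phase weights: minor pieces 1, rooks 2, queens 4.
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24  # 4 knights + 4 bishops + 4 rooks * 2 + 2 queens * 4

def game_phase(piece_counts):
    """Phase from total piece counts of both sides, capped at MAX_PHASE."""
    phase = sum(PHASE_WEIGHTS.get(piece, 0) * count
                for piece, count in piece_counts.items())
    return min(phase, MAX_PHASE)  # promotions could otherwise exceed 24

def tapered_eval(mg_score, eg_score, phase):
    """Blend midgame and endgame scores by how much material remains."""
    return (mg_score * phase + eg_score * (MAX_PHASE - phase)) // MAX_PHASE

start = game_phase({"N": 4, "B": 4, "R": 4, "Q": 2, "P": 16})
print(start, tapered_eval(100, 20, start), tapered_eval(100, 20, 12))
```

At phase 24 the result is the pure midgame score, at phase 0 the pure endgame score, and positions in between are interpolated.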
I changed the name from SeeChess to CeeChess this release to remove the ambiguity of names, as there is already an engine named SEEchess. This release performs at roughly 2310 Elo according to self play, and I am currently running tests against a new suite of engines to confirm this rating increase.
Check extension fix (2200!)
(+20 ELO)
- Extends checks before qSearch (I would expect a smaller Elo gain if I had a king safety evaluation).
- Fixed Null Move Pruning constraints to play better in the endgame
- More comments (to help me and others to understand the magnitude of improvements certain code added)
This should bring us over the 2200 hump! Therefore, I am making the version number v1.2 (instead of v1.1.x, as would be normal with this size of change)
More pruning!
(+15 ELO)
- added static null move pruning (reverse futility pruning)
- cleaned up evaluation constants (added eval.h)
- corrected LMR constraints
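The static null move pruning (reverse futility pruning) added here can be sketched as a simple precondition check. The depth limit, the margin, and the function name are assumptions for illustration and are not taken from CeeChess.

```python
def can_prune_static_null_move(static_eval, beta, depth, in_check, is_pv_node,
                               max_depth=3, margin_per_ply=120):
    """Return True if the node can fail high on the static evaluation alone.

    max_depth and margin_per_ply (centipawns) are hypothetical values.
    """
    if in_check or is_pv_node or depth > max_depth:
        return False
    # If the position still beats beta after giving up a depth-scaled margin,
    # assume a real search would fail high as well.
    return static_eval - margin_per_ply * depth >= beta

print(can_prune_static_null_move(500, 100, 2, False, False))
print(can_prune_static_null_move(200, 100, 2, False, False))
```

In a search, a `True` result would typically lead to returning early from the node instead of searching its moves.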
I am preparing to move on to rewriting the evaluation. I will begin working on tapered evaluation, with new PSQTs to hopefully play better when transitioning to the endgame. After that, the next plans are mobility and king safety. Once the evaluation function is much more robust, I will feel better about adding more pruning techniques and increasing the aggressiveness of the pruning techniques I already use.
Razoring Fix (Final)
(+15 ELO)
New Razoring Scheme:
- More tactically effective
- Resolved bugs from old scheme
- Increased razoring depth to 3 (with new margins)