Commit d3df5a4

committed
fixed until quiescence search and added profiling
1 parent dc567a0 commit d3df5a4

File tree

6 files changed

+100
-64
lines changed

docs/AlphaDeepChess/Capitulos/AnalysisOfImprovements.tex

Lines changed: 60 additions & 24 deletions
@@ -1,45 +1,81 @@
11
\chapter{Analysis and evaluation}\label{cap:analysis}
22

3-
This chapter documents the implementation of the following techniques used to improve the chess engine:
3+
This chapter analyzes the performance of the chess engine through profiling, identifying its most computationally intensive components. We then evaluate the effectiveness of the improvement techniques described in~\cref{cap:ImprovementTechniques}. Finally, we compare the performance of \textit{AlphaDeepChess} against \textit{Stockfish} and examine its position within the Elo rating distribution on \textit{Lichess.org}.
44

5-
\begin{itemize}[itemsep=1pt]
6-
\item Transposition tables with zobrist hashing.
7-
\item Move generator with magic bitboards and PEXT instructions.
8-
\item Evaluation with king safety and piece mobility parameters.
9-
\item Multithread search.
10-
\item Search with Late move Reductions.
11-
\end{itemize}
5+
\newpage
126

7+
\section{Profiling}
8+
To analyze the performance of our chess engine and identify the bottlenecks where the code consumes the most execution time, we used the \texttt{perf} tool available on Linux systems. \texttt{perf} provides robust profiling capabilities by recording CPU events, sampling function execution, and collecting stack traces~\cite{PerfLinux}.
139

14-
\newpage
10+
\vspace{1em}
1511

16-
\noindent Achieving this level of efficiency and quality requires a well-structured development process. For this reason, we adopted a systematic methodology to guide the implementation and continuous improvement of our engine.
12+
\noindent We ran the engine under \texttt{perf} using the following commands:
1713

18-
\section{Methodology}
14+
\begin{lstlisting}[language=bash, caption={Profiling \textit{AlphaDeepChess} with perf}, frame=single, breaklines=true]
15+
# Record performance data with function stack traces
16+
sudo perf record -g ./build/release/AlphaDeepChess
1917

20-
Once the basic foundations are established with an initial version of the essential components or modules (which will be described later), our workflow follows an iterative process: first, we search for existing information on each topic, analyze it, implement a solution, and then profile the implementation to identify bottlenecks. After locating performance issues, we optimize the relevant parts, and finally, compare the new version with the previous one to assess improvements.
18+
# Display interactive report
19+
sudo perf report -g --no-children
20+
\end{lstlisting}
2121

22-
\vspace{1em}
22+
\noindent After recording, \texttt{perf report} opens an interactive terminal interface where functions are sorted by CPU overhead, allowing us to easily identify performance-critical regions.
23+
24+
\vspace{2em}
2325

24-
\noindent Then, at a given moment, we can decide to take action and try to determine the strength of the engine with the last functional version.
26+
\noindent First, we profile the basic architecture of the engine implemented in~\cref{cap:descripcionTrabajo}, and then evaluate it again after applying the optimizations described in~\cref{cap:ImprovementTechniques}.
2527

26-
\subsection*{Profiler}
28+
\subsection*{Profiling of basic engine architecture}
2729

28-
First, in order to analyze the performance of our chess engine and identify potential bottlenecks, we used the \texttt{perf} tool available on Linux systems. \texttt{perf} provides robust profiling capabilities by recording CPU events, sampling function execution, and collecting stack traces.
30+
\noindent As shown in~\cref{tab:profilingBasic}, the profiling results indicate that the majority of the total execution time is spent in legal move generation. Specifically, the functions \texttt{generate\_legal\_moves}, \texttt{calculate\_moves\_in\_dir}, and \texttt{update\_danger\_in\_dir} together account for about 71.6\% of the total overhead. Therefore, optimizing this component is expected to yield significant performance improvements.
31+
32+
\begin{table}[H]
33+
\centering
34+
\begin{tabular}{|l|r|}
35+
\hline
36+
\textit{Symbol} & \textit{Overhead} \\
37+
\hline
38+
\texttt{generate\_legal\_moves} & 36.07\% \\
39+
\texttt{calculate\_moves\_in\_dir} & 19.30\% \\
40+
\texttt{evaluate\_position} & 16.63\% \\
41+
\texttt{update\_danger\_in\_dir} & 16.23\% \\
42+
\texttt{calculate\_king\_moves} & 1.24\% \\
43+
\texttt{quiescence\_search} & 0.96\% \\
44+
\texttt{...} & ... \\
45+
\hline
46+
\end{tabular}
47+
\caption{Profiling results of the basic engine implementation.}
48+
\label{tab:profilingBasic}
49+
\end{table}
2950

3051
\vspace{1em}
3152

32-
\noindent Our profiling goal is to identify which parts of the code consume the most execution time. We run the engine under \texttt{perf} using the following commands:
53+
\subsection*{Profiling with improvement techniques}
3354

34-
\begin{lstlisting}[language=bash, caption={Profiling \textit{AlphaDeepChess} with perf}, frame=single, breaklines=true]
35-
# Record performance data with function stack traces
36-
sudo perf record -g ./build/release/AlphaDeepChess
55+
\noindent As shown in~\cref{tab:profilingImprovements}, the updated profiling results show that move generation now takes a smaller share of the execution time: the overhead of \texttt{generate\_legal\_moves} drops from 36.07\% to 22.71\%. The execution time is more evenly distributed across modules, with position evaluation emerging as the new primary performance bottleneck. This shift is consistent with the implemented optimization techniques being effective.
3756

38-
# Display interactive report
39-
sudo perf report -g --no-children
40-
\end{lstlisting}
57+
\begin{table}[H]
58+
\centering
59+
\begin{tabular}{|l|r|}
60+
\hline
61+
\textit{Symbol} & \textit{Overhead} \\
62+
\hline
63+
\texttt{evaluate\_position} & 31.90\% \\
64+
\texttt{generate\_legal\_moves} & 22.71\% \\
65+
\texttt{update\_attacks\_bb} & 22.62\% \\
66+
\texttt{order\_moves} & 3.95\% \\
67+
\texttt{make\_move} & 3.83\% \\
68+
\texttt{alpha\_beta\_search} & 1.66\% \\
69+
\texttt{...} & ... \\
70+
\hline
71+
\end{tabular}
72+
\caption{Profiling results after applying optimization techniques.}
73+
\label{tab:profilingImprovements}
74+
\end{table}
75+
76+
\newpage
4177

42-
\noindent After recording, \texttt{perf report} opens an interactive terminal interface where functions are sorted by CPU overhead. This allows us to prioritize which functions to optimize.
78+
\section{cutechess}
4379

4480
\noindent The most common way to measure the strength of a chess engine is by playing games against other engines and analyzing the results. To quantify this strength, the Elo rating system is used. Elo is a statistical rating system originally developed for chess, which assigns a numerical value to each player (or engine) based on their game results against opponents of known strength. When an engine wins games against higher-rated opponents, its Elo increases; if it loses, its Elo decreases. This allows for an objective comparison of playing strength between different engines.
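
\noindent As a brief illustration, the standard Elo model (stated here from the general definition of the rating system, not from measurements in this project) gives the expected score of player $A$ against player $B$ as
\[
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}},
\]
where $R_A$ and $R_B$ are the current ratings. After a game with result $S_A \in \{0, 0.5, 1\}$, the rating is updated as
\[
R_A' = R_A + K\,(S_A - E_A),
\]
where $K$ is an update factor chosen by the rating body. For example, a 400-point rating advantage gives $E_A = 1/(1 + 10^{-1}) \approx 0.91$, so winning such a game raises the rating only slightly, whereas losing it lowers the rating considerably.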
4581

docs/AlphaDeepChess/Capitulos/DescripcionTrabajo.tex

Lines changed: 24 additions & 21 deletions
@@ -1,6 +1,6 @@
11
\chapter{Basic engine architecture}\label{cap:descripcionTrabajo}
22

3-
This chapter documents the development process of the chess engine. The project is organized into the following modules:
3+
This chapter documents the basic architecture of the chess engine. The project is organized into the following modules:
44

55
\begin{itemize}[itemsep=1pt]
66
\item \textit{Board}: Data structures to represent the chess board.
@@ -11,12 +11,14 @@ \chapter{Basic engine architecture}\label{cap:descripcionTrabajo}
1111
\item \textit{UCI}: Universal Chess Interface implementation.
1212
\end{itemize}
1313

14-
\noindent First, we describe the implementation of the basic parts of the chess engine, then we introduce and explain in detail the algorithmic techniques developed to improve the engine's performance.
14+
\noindent First, we describe the implementation of the basic parts of the chess engine; then, in~\cref{cap:ImprovementTechniques}, we introduce and explain in detail the algorithmic techniques developed to improve the engine's performance.
1515

1616
\vspace{1em}
1717

1818
\noindent We begin by examining the fundamental data structure used for chess position representation.
1919

20+
\newpage
21+
2022
\section{Chessboard representation: bitboards}
2123

2224
\noindent The chessboard is represented using a list of \textit{bitboards}. A bitboard is a 64-bit variable in which each bit corresponds to a square on the board. A bit is set to \texttt{1} if a piece occupies the corresponding square and \texttt{0} otherwise. The least significant bit (LSB) represents the \texttt{a1} square, while the most significant bit (MSB) corresponds to \texttt{h8}~\cite{Bitboards}.
@@ -31,36 +33,39 @@ \section{Chessboard representation: bitboards}
3133

3234
\vspace{1em}
3335

34-
\begin{figure}
35-
\centering
36-
\newchessgame
37-
\chessboard[
38-
showmover=false,
39-
setfen=7k/8/5p2/2p1p1p1/P2p3p/1P1P1P1P/2P1P1P1/R2K3R w KQ - 0 1
40-
]
41-
42-
\vspace{1.0em}
36+
\begin{figure}[H]
4337

44-
\begin{minipage}[c]{0.30\textwidth}
45-
\includegraphics[width=\textwidth]{Imagenes/bitboard_white_pawns.png}
46-
\caption*{Bitboard of white pawns.}
38+
\begin{minipage}[c]{0.35\textwidth}
39+
\newchessgame
40+
\chessboard[
41+
showmover=false,
42+
setfen=7k/8/5p2/2p1p1p1/P2p3p/1P1P1P1P/2P1P1P1/R2K3R w KQ - 0 1
43+
]
4744
\end{minipage}
4845
\hfill
49-
\begin{minipage}[c]{0.30\textwidth}
46+
\begin{minipage}[c]{0.36\textwidth}
5047
\includegraphics[width=\textwidth]{Imagenes/bitboard_black_pawns.png}
5148
\caption*{Bitboard of black pawns.}
5249
\end{minipage}
50+
51+
\vspace{1.1em}
52+
\hspace*{0.03\textwidth}
53+
\begin{minipage}[c]{0.36\textwidth}
54+
\includegraphics[width=\textwidth]{Imagenes/bitboard_white_pawns.png}
55+
\caption*{Bitboard of white pawns.}
56+
\end{minipage}
5357
\hfill
54-
\begin{minipage}[c]{0.30\textwidth}
58+
\begin{minipage}[c]{0.36\textwidth}
5559
\includegraphics[width=\textwidth]{Imagenes/bitboard_white_rooks.png}
5660
\caption*{Bitboard of white rooks.}
5761
\end{minipage}
62+
5863
\caption{List of bitboards data structure example.}\label{fig:bitboardPositionExample}
5964
\end{figure}
6065

6166
\noindent The main advantage of bitboards is that we can operate on multiple squares simultaneously using bitwise operations. For example, we can determine if there are any black pawns on the fifth rank by performing a bitwise AND operation with the corresponding mask. \Cref{fig:bitboardMaskOperation} illustrates this concept.
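
\noindent The rank test described above can be sketched in C++ as follows. This is a minimal illustration, not the engine's actual code: the names \texttt{Bitboard}, \texttt{square\_bb}, \texttt{rank\_mask}, \texttt{any\_on\_rank} and \texttt{example\_black\_pawns} are hypothetical, and the sketch assumes that squares are numbered rank by rank (\texttt{a1} = bit 0, \texttt{b1} = bit 1, \ldots, \texttt{h8} = bit 63).

\begin{lstlisting}[language=C++, caption={Sketch of a rank test with a bitboard mask (illustrative names)}, frame=single, breaklines=true]
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumed square numbering: a1 = bit 0, b1 = bit 1, ..., h8 = bit 63.
// file: 0 ('a') .. 7 ('h'); rank: 0 (first rank) .. 7 (eighth rank).
inline int square_index(int file, int rank) {
    assert(file >= 0 && file < 8 && rank >= 0 && rank < 8);
    return rank * 8 + file;
}

// Bitboard with only the given square set.
inline Bitboard square_bb(int file, int rank) {
    return Bitboard{1} << square_index(file, rank);
}

// Mask with the eight squares of one rank set.
inline Bitboard rank_mask(int rank) {
    assert(rank >= 0 && rank < 8);
    return Bitboard{0xFF} << (8 * rank);
}

// Black pawns of the example position: f6, c5, e5, g5, d4, h4.
inline Bitboard example_black_pawns() {
    return square_bb(5, 5) | square_bb(2, 4) | square_bb(4, 4) |
           square_bb(6, 4) | square_bb(3, 3) | square_bb(7, 3);
}

// A single AND tests all eight squares of the rank at once.
inline bool any_on_rank(Bitboard pieces, int rank) {
    return (pieces & rank_mask(rank)) != 0;
}
\end{lstlisting}

\noindent With these definitions, \texttt{any\_on\_rank(example\_black\_pawns(), 4)} tests the fifth rank (index 4) without looping over individual squares.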
6267

63-
\begin{figure}
68+
\begin{figure}[H]
6469
\centering
6570
\begin{minipage}[c]{0.30\textwidth}
6671
\centering
@@ -170,14 +175,11 @@ \subsection*{Horizon effect problem, quiescence search}\label{sec:horizon-effect
170175

171176
\vspace{1em}
172177

173-
% TODO: MEJOR EXPLICAD LAS DIFERENCIAS QUE REPETIR EL TEXTO
174-
\noindent The following events occur in a quiescence node:
178+
\noindent A quiescence node goes through the same steps as a regular search node, with the following key differences:
175179

176180
\begin{enumerate}
177-
\item \textit{Terminal node verification}: Check for game termination conditions due to checkmate, threefold repetition, the fifty-move rule or reaching a maximum ply.
178181
\item \textit{Standing pat evaluation}: Also known as static evaluation, this step assigns a preliminary score to the position. This score can serve as a lower bound and is immediately used to determine whether alpha-beta pruning can be applied.
179182
\item \textit{Selective legal move generation}: Generate only the legal moves that are captures.
180-
\item \textit{Move ordering}: Sort capture moves by estimated quality (best to worst).
181183
\item \textit{Move exploration}: Iterate through the legal capture moves in order, updating the position evaluation and the values of alpha and beta, and checking whether pruning can be applied.
182184
\end{enumerate}
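
\noindent The steps listed above can be sketched as follows. This is a simplified negamax-style illustration over a toy tree, not the engine's implementation: the \texttt{Node} structure is hypothetical and stands in for a real position, its \texttt{captures} field plays the role of selective move generation, and terminal-node checks are omitted.

\begin{lstlisting}[language=C++, caption={Sketch of a quiescence node on a toy capture tree (illustrative names)}, frame=single, breaklines=true]
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy position: a static score from the point of view of the
// side to move, plus the positions reachable through capture moves only.
struct Node {
    int static_eval;
    std::vector<Node> captures;
};

// Negamax-style quiescence search restricted to capture moves.
int quiescence(const Node& node, int alpha, int beta) {
    assert(alpha < beta);

    // Standing pat: the static evaluation acts as a lower bound, because
    // the side to move may decline every capture.
    int best = node.static_eval;
    if (best >= beta) {
        return best;  // cutoff before exploring any move
    }
    alpha = std::max(alpha, best);

    // Move exploration: only capture successors are visited.
    for (const Node& child : node.captures) {
        int score = -quiescence(child, -beta, -alpha);
        best = std::max(best, score);
        alpha = std::max(alpha, score);
        if (alpha >= beta) {
            break;  // alpha-beta pruning
        }
    }
    return best;
}
\end{lstlisting}

\noindent In this sketch, a capture that loses material after the opponent's recapture does not lower the result below the standing pat score, which is how the search avoids being misled by the horizon effect.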
183185

@@ -404,6 +406,7 @@ \subsection*{Tapered evaluation}
404406
\end{figure}
405407

406408
\section{Move generator}
409+
\label{sec:moveGenerator}
407410

408411
Calculating the legal moves in a chess position is a more difficult and tedious task
409412
than it might seem, mainly due to the unintuitive rules of \textit{en passant} and castling,

docs/AlphaDeepChess/Capitulos/ImprovementTechniques.tex

Lines changed: 5 additions & 19 deletions
@@ -145,17 +145,7 @@ \subsection*{Collisions}
145145

146146
\section{Move generator with magic bitboards and pext instructions}
147147

148-
To identify potential performance bottlenecks, we performed profiling on the engine, as shown in~\cref{fig:profiling}.
149-
150-
\vspace{1em}
151-
152-
\begin{figure}
153-
\centering
154-
\includegraphics[width=1.0\textwidth]{Imagenes/basic_move_generator_profiling.png}
155-
\caption{Profiling results.}\label{fig:profiling}
156-
\end{figure}
157-
158-
\noindent The profiling results indicate that the majority of the total execution time is spent in the legal move generation function. Therefore, optimizing this component is expected to yield significant performance improvements.
148+
As previously discussed (see~\cref{sec:moveGenerator}), computing the legal moves for sliding pieces is computationally expensive, as it requires identifying which pieces block their paths. In this section, we present a technique that precomputes all possible moves for rooks and bishops, so that they can be looked up in constant time, $O(1)$. Queen moves are obtained as the union of the rook and bishop moves.
159149

160150
\subsection*{Magic bitboards}
161151

@@ -171,12 +161,8 @@ \subsection*{Magic bitboards}
171161

172162
\begin{itemize}[itemsep=1pt]
173163
\item Preserves relevant blocker information:
174-
The nearest blockers along a piece's movement direction are preserved.
175-
\textit{Example:} Consider a rook with two pawns in its path:
176-
\begin{center}
177-
Rook $\rightarrow \rightarrow \rightarrow$ [Pawn1][Pawn2]
178-
\end{center}
179-
In this case, only `Pawn1` blocks the rook's movement, while `Pawn2` is irrelevant.
164+
Only the nearest blockers along a sliding piece's movement direction are important. For example, in~\cref{fig:magics_position}, the pawn on d6 is a relevant blocker because it directly restricts the rook's movement. In contrast, the pawn on d7 is irrelevant, as it lies beyond the first blocker and does not influence the final set of legal moves.
165+
180166
\item Compresses the blocker bitboard, pushing the important bits near the most significant bit.
181167
\item The final multiplication must produce a unique index for each possible blocker configuration. Uniqueness is verified by brute-force testing of candidate magic numbers.
182168
\end{itemize}
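
\noindent The precomputation idea can be illustrated with the following sketch for a single rook square. It is not the engine's code: all names are hypothetical, squares are assumed to be numbered \texttt{a1} = bit 0 to \texttt{h8} = bit 63, and the index is computed with a portable software stand-in for the PEXT instruction instead of a magic multiplication. A magic-based version would replace \texttt{pext\_soft} with a multiply-and-shift using a magic number found by brute force.

\begin{lstlisting}[language=C++, caption={Sketch of a precomputed rook attack table indexed by relevant blockers (illustrative names)}, frame=single, breaklines=true]
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bitboard = std::uint64_t;

// Assumed square numbering: a1 = bit 0, b1 = bit 1, ..., h8 = bit 63.
inline Bitboard bit(int file, int rank) {
    return Bitboard{1} << (rank * 8 + file);
}

// Slow reference: walk the four rook rays and stop at the first blocker
// (inclusive). With skip_edges set, the last square of each ray is dropped,
// which yields the relevant-blocker mask when occupied == 0.
Bitboard rook_rays(int file, int rank, Bitboard occupied, bool skip_edges) {
    static const int df[4] = {1, -1, 0, 0};
    static const int dr[4] = {0, 0, 1, -1};
    Bitboard result = 0;
    for (int d = 0; d < 4; ++d) {
        int f = file + df[d], r = rank + dr[d];
        while (f >= 0 && f < 8 && r >= 0 && r < 8) {
            int nf = f + df[d], nr = r + dr[d];
            bool last = nf < 0 || nf > 7 || nr < 0 || nr > 7;
            if (skip_edges && last) break;
            result |= bit(f, r);
            if (occupied & bit(f, r)) break;
            f = nf;
            r = nr;
        }
    }
    return result;
}

// Portable stand-in for PEXT: gathers the bits of value selected by mask
// into the low bits of the result.
Bitboard pext_soft(Bitboard value, Bitboard mask) {
    Bitboard result = 0;
    for (Bitboard out = 1; mask != 0; out <<= 1) {
        Bitboard lowest = mask & (~mask + 1);  // lowest set bit of mask
        if (value & lowest) result |= out;
        mask &= mask - 1;                      // clear that bit
    }
    return result;
}

struct RookTable {
    Bitboard mask;                  // relevant blocker squares
    std::vector<Bitboard> attacks;  // one entry per blocker configuration
};

RookTable build_rook_table(int file, int rank) {
    assert(file >= 0 && file < 8 && rank >= 0 && rank < 8);
    RookTable t;
    t.mask = rook_rays(file, rank, 0, true);
    int bits = 0;
    for (Bitboard m = t.mask; m != 0; m &= m - 1) ++bits;
    t.attacks.resize(std::size_t{1} << bits);
    // Enumerate every subset of the mask (carry-rippler trick).
    Bitboard subset = 0;
    do {
        t.attacks[pext_soft(subset, t.mask)] =
            rook_rays(file, rank, subset, false);
        subset = (subset - t.mask) & t.mask;
    } while (subset != 0);
    return t;
}

// Lookup: mask the occupancy, compress it to an index, read the table.
Bitboard rook_attacks(const RookTable& t, Bitboard occupied) {
    return t.attacks[pext_soft(occupied & t.mask, t.mask)];
}
\end{lstlisting}

\noindent For a rook on d4 the mask contains 10 squares, so the table holds $2^{10} = 1024$ entries. A blocker behind another blocker, such as a pawn on d7 behind one on d6, maps to a different index but to the same attack set.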
@@ -193,12 +179,12 @@ \subsection*{Magic bitboards}
193179
\chessboard[
194180
showmover=false,
195181
setfen=n1bk3r/3p4/1p1p2p1/8/3R1p2/8/3p4/7n w - - 0 1,
196-
markstyle=circle,
182+
markstyle=border,
197183
color=red, markfields={d6,f4,d2},
198184
color=green, markfields={c4,b4,a4,e4,d5,d3}
199185
]
200186
\end{minipage}
201-
\caption{Initial chess position with white rook and blockers}\label{fig:magics_position}
187+
\caption{Chess position with white rook legal moves in green and blockers in red.}\label{fig:magics_position}
202188
\end{figure}
203189

204190
\subsection*{Magic number generation}

docs/AlphaDeepChess/TFGTeXiS.pdf

3.65 KB
Binary file not shown.

docs/AlphaDeepChess/biblio.bib

Lines changed: 11 additions & 0 deletions
@@ -364,3 +364,14 @@ @book{Russell2021artificial
364364
publisher={Pearson Education},
365365
lastaccess = {June, 2024}
366366
}
367+
368+
@misc{PerfLinux,
369+
author = {kernel.org},
370+
title = {perf: Linux profiling with performance counters},
371+
howpublished = {online},
372+
year = {2024},
373+
url={https://perfwiki.github.io/main},
374+
lastaccess = {May, 2025}
375+
}
376+
377+
