Commit 30ebfa1

Merge pull request #10 from sgrvinod/0.3.0
0.3.0
2 parents: 1dcf110 + 34d648f

42 files changed: +72744 -107 lines


CHANGELOG.md

Lines changed: 22 additions & 1 deletion
@@ -1,5 +1,26 @@
 # Change Log
 
+## v0.3.0
+
+### Added
+
+* There are 3 new datasets: [ML23c](https://github.com/sgrvinod/chess-transformers#ml23c), [GC22c](https://github.com/sgrvinod/chess-transformers#gc22c), and [ML23d](https://github.com/sgrvinod/chess-transformers#ml23d).
+* A new naming convention for datasets is used. Datasets are now named in the format "[*PGN Fileset*][*Filters*]". For example, *LE1222* is now called [*LE22ct*](https://github.com/sgrvinod/chess-transformers#le22ct), where *LE22* is the name of the PGN fileset from which this dataset was derived, and "*c*" and "*t*" are filters for games that ended in checkmate and games that used a specific time control, respectively.
+* [*CT-EFT-85*](https://github.com/sgrvinod/chess-transformers#ct-eft-85) is a new trained model with about 85 million parameters.
+* **`chess_transformers.train.utils.get_lr()`** now accepts new arguments, `schedule` and `decay`, to accommodate a new learning rate schedule: exponential decay after warmup.
+* **`chess_transformers.data.prepare_data()`** now handles errors where there is a mismatch between the number of moves and the number of FENs, or where the result recorded in the PGN file is incorrect. Such games are now reported and excluded during dataset creation.
+
+### Changed
+
+* The *LE1222* and *LE1222x* datasets have been renamed to [*LE22ct*](https://github.com/sgrvinod/chess-transformers#le22ct) and [*LE22c*](https://github.com/sgrvinod/chess-transformers#le22c), respectively.
+* All calls to **`chess_transformers.train.utils.get_lr()`** now use the `schedule` and `decay` arguments, even where a user-defined decay is not required.
+* **`chess_transformers.train.datasets.ChessDatasetFT`** has been optimized for large datasets: a list of indices for the data split is no longer maintained or indexed in the dataset.
+* Dependencies in [**`setup.py`**](https://github.com/sgrvinod/chess-transformers/blob/main/setup.py) have been updated to newer versions.
+* Fixed an error in **`chess_transformers.play.model_v_model()`** where a move would be attempted by the model playing black even after white had won the game with a checkmate.
+* Fixed the `EVAL_GAMES_FOLDER` parameter in the model configuration files, which pointed to the incorrect folder name **`chess_transformers/eval`** instead of **`chess_transformers/evaluate`**.
+* Fixed an error in **`chess_transformers.evaluate.metrics.elo_delta_margin()`** where the upper limit of the win rate for the confidence interval was not capped at 1.
+* All calls to `torch.load()` now use `weights_only=True`, in compliance with its updated API.
+
 ## v0.2.1
 
 ### Added
@@ -13,7 +34,7 @@
 * **`ChessTransformerEncoderFT`** is an encoder-only transformer that predicts source (*From*) and destination (*To*) squares for the next half-move, instead of the half-move in UCI notation.
 * [*CT-EFT-20*](https://github.com/sgrvinod/chess-transformers#ct-eft-20) is a new trained model of this type with about 20 million parameters.
 * **`ChessDatasetFT`** is a PyTorch dataset class for this model type.
-* [**`chess_transformer.data.levels`**](https://github.com/sgrvinod/chess-transformers/blob/main/chess_transformers/data/levels.py) provides a standardized vocabulary (with indices) for oft-used categorical variables. All models and datasets will hereon use this standard vocabulary instead of a dataset-specific vocabulary.
+* [**`chess_transformers.data.levels`**](https://github.com/sgrvinod/chess-transformers/blob/main/chess_transformers/data/levels.py) provides a standardized vocabulary (with indices) for oft-used categorical variables. All models and datasets will hereon use this standard vocabulary instead of a dataset-specific vocabulary.
 
 ### Changed

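The `elo_delta_margin()` fix above is about clamping a win-rate confidence bound to the valid range. A minimal sketch of the idea, with illustrative helper names rather than the repo's actual implementation:

```python
import math

def elo_delta(win_rate):
    # Standard Elo difference implied by an overall score rate; shown
    # only to illustrate why the bound matters (it diverges near 1).
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

def winrate_interval(win_rate, margin):
    # The fix described in the changelog: cap the upper limit of the
    # win rate for the confidence interval at 1 (and the lower at 0),
    # since a rate outside [0, 1] is meaningless.
    lower = max(win_rate - margin, 0.0)
    upper = min(win_rate + margin, 1.0)
    return lower, upper

print(elo_delta(0.5))                   # 0.0 (equal strength)
print(winrate_interval(0.97, 0.05)[1])  # 1.0 (capped)
```

Without the cap, a nominal upper bound like 1.02 would be fed into a formula that is undefined outside [0, 1].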
README.md

Lines changed: 165 additions & 30 deletions
Large diffs are not rendered by default.
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+import os
+
+###############################
+############ Name #############
+###############################
+
+NAME = "GC22c"  # name and identifier for this configuration
+
+###############################
+############ Data #############
+###############################
+
+DATA_FOLDER = (
+    os.path.join(os.environ.get("CT_DATA_FOLDER"), NAME)
+    if os.environ.get("CT_DATA_FOLDER")
+    else None
+)  # folder containing all data files
+H5_FILE = NAME + ".h5"  # H5 file containing data
+MAX_MOVE_SEQUENCE_LENGTH = 10  # expected maximum length of move sequences
+EXPECTED_ROWS = 27000000  # expected number of rows, approximately, in the data
+VAL_SPLIT_FRACTION = 0.95  # marker (% into the data) where the validation split begins

chess_transformers/configs/data/LE1222x.py renamed to chess_transformers/configs/data/LE22c.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 ############ Name #############
 ###############################
 
-NAME = "LE1222x"  # name and identifier for this configuration
+NAME = "LE22c"  # name and identifier for this configuration
 
 ###############################
 ############ Data #############
Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 ############ Name #############
 ###############################
 
-NAME = "LE1222"  # name and identifier for this configuration
+NAME = "LE22ct"  # name and identifier for this configuration
 
 ###############################
 ############ Data #############
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+import os
+
+###############################
+############ Name #############
+###############################
+
+NAME = "ML23c"  # name and identifier for this configuration
+
+###############################
+############ Data #############
+###############################
+
+DATA_FOLDER = (
+    os.path.join(os.environ.get("CT_DATA_FOLDER"), NAME)
+    if os.environ.get("CT_DATA_FOLDER")
+    else None
+)  # folder containing all data files
+H5_FILE = NAME + ".h5"  # H5 file containing data
+MAX_MOVE_SEQUENCE_LENGTH = 10  # expected maximum length of move sequences
+EXPECTED_ROWS = 11000000  # expected number of rows, approximately, in the data
+VAL_SPLIT_FRACTION = 0.925  # marker (% into the data) where the validation split begins
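`VAL_SPLIT_FRACTION` marks how far into the data the validation split begins. Under that reading (an assumption based on the config comment), the split sizes for the ML23c numbers work out as:

```python
EXPECTED_ROWS = 11000000
VAL_SPLIT_FRACTION = 0.925

# Row index where the validation split begins: the first 92.5% of rows
# are training data, the remainder validation. round() avoids a
# one-off error from floating-point truncation.
val_start = round(EXPECTED_ROWS * VAL_SPLIT_FRACTION)
n_train = val_start
n_val = EXPECTED_ROWS - val_start

print(n_train, n_val)  # 10175000 825000
```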
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+import os
+
+###############################
+############ Name #############
+###############################
+
+NAME = "ML23d"  # name and identifier for this configuration
+
+###############################
+############ Data #############
+###############################
+
+DATA_FOLDER = (
+    os.path.join(os.environ.get("CT_DATA_FOLDER"), NAME)
+    if os.environ.get("CT_DATA_FOLDER")
+    else None
+)  # folder containing all data files
+H5_FILE = NAME + ".h5"  # H5 file containing data
+MAX_MOVE_SEQUENCE_LENGTH = 10  # expected maximum length of move sequences
+EXPECTED_ROWS = 170000000  # expected number of rows, approximately, in the data
+VAL_SPLIT_FRACTION = 0.98  # marker (% into the data) where the validation split begins
+ADD_LOSS_TOKEN = False
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__all__ = ["LE1222", "LE1222x"]
+__all__ = ["LE22c", "LE22ct", "ML23c", "ML23d", "GC22c"]

chess_transformers/configs/models/CT-E-20.py

Lines changed: 11 additions & 5 deletions
@@ -2,7 +2,7 @@
 import pathlib
 
 from chess_transformers.train.utils import get_lr
-from chess_transformers.configs.data.LE1222 import *
+from chess_transformers.configs.data.LE22ct import *
 from chess_transformers.configs.other.stockfish import *
 from chess_transformers.train.datasets import ChessDataset
 from chess_transformers.configs.other.fairy_stockfish import *
@@ -64,10 +64,16 @@
 PRINT_FREQUENCY = 1  # print status once every so many steps
 N_STEPS = 100000  # number of training steps
 WARMUP_STEPS = 8000  # number of warmup steps where learning rate is increased linearly; twice the value in the paper, as in the official transformer repo.
-STEP = 1  # the step number, start from 1 to prevent math error in the next line
+STEP = 1  # the step number, start from 1 to prevent math error in the 'LR' line
+LR_SCHEDULE = "vaswani"  # the learning rate schedule; see utils.py for learning rate schedule
+LR_DECAY = None  # the decay rate for 'exp_decay' schedule
 LR = get_lr(
-    step=STEP, d_model=D_MODEL, warmup_steps=WARMUP_STEPS
-)  # see utils.py for learning rate schedule; twice the schedule in the paper, as in the official transformer repo.
+    step=STEP,
+    d_model=D_MODEL,
+    warmup_steps=WARMUP_STEPS,
+    schedule=LR_SCHEDULE,
+    decay=LR_DECAY,
+)  # see utils.py for learning rate schedule
 START_EPOCH = 0  # start at this epoch
 BETAS = (0.9, 0.98)  # beta coefficients in the Adam optimizer
 EPSILON = 1e-9  # epsilon term in the Adam optimizer
@@ -105,5 +111,5 @@
 ################################
 
 EVAL_GAMES_FOLDER = str(
-    pathlib.Path(__file__).parent.parent.parent.resolve() / "eval" / "games" / NAME
+    pathlib.Path(__file__).parent.parent.parent.resolve() / "evaluate" / "games" / NAME
 )  # folder where evaluation games are saved in PGN files
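For context on the new `schedule` and `decay` arguments in the `get_lr()` call above, here is a hedged sketch of what the two schedules plausibly look like; the exact constants and branch names live in `chess_transformers/train/utils.py` and may differ:

```python
def get_lr(step, d_model, warmup_steps, schedule="vaswani", decay=None):
    # Sketch only: "vaswani" is the Transformer schedule (linear warmup,
    # then inverse-square-root decay), doubled as in the official repo;
    # "exp_decay" is assumed to shrink the post-warmup peak by a factor
    # of `decay` per step, matching "exponential decay after warmup".
    if schedule == "vaswani":
        return 2.0 * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
    if schedule == "exp_decay":
        peak = 2.0 * d_model ** -0.5 * warmup_steps ** -0.5
        if step <= warmup_steps:
            return peak * step / warmup_steps  # linear warmup
        return peak * decay ** (step - warmup_steps)
    raise ValueError(f"unknown schedule: {schedule}")

# Warmup raises the rate toward the peak; both schedules then decay it.
print(get_lr(100, 512, 8000) < get_lr(8000, 512, 8000))    # True
print(get_lr(16000, 512, 8000) < get_lr(8000, 512, 8000))  # True
```

Starting `STEP` at 1 matters because `step ** -0.5` is a division by zero at step 0.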

chess_transformers/configs/models/CT-ED-45.py

Lines changed: 11 additions & 5 deletions
@@ -2,7 +2,7 @@
 import pathlib
 
 from chess_transformers.train.utils import get_lr
-from chess_transformers.configs.data.LE1222 import *
+from chess_transformers.configs.data.LE22ct import *
 from chess_transformers.configs.other.stockfish import *
 from chess_transformers.train.datasets import ChessDataset
 from chess_transformers.configs.other.fairy_stockfish import *
@@ -64,10 +64,16 @@
 PRINT_FREQUENCY = 1  # print status once every so many steps
 N_STEPS = 100000  # number of training steps
 WARMUP_STEPS = 8000  # number of warmup steps where learning rate is increased linearly; twice the value in the paper, as in the official transformer repo.
-STEP = 1  # the step number, start from 1 to prevent math error in the next line
+STEP = 1  # the step number, start from 1 to prevent math error in the 'LR' line
+LR_SCHEDULE = "vaswani"  # the learning rate schedule; see utils.py for learning rate schedule
+LR_DECAY = None  # the decay rate for 'exp_decay' schedule
 LR = get_lr(
-    step=STEP, d_model=D_MODEL, warmup_steps=WARMUP_STEPS
-)  # see utils.py for learning rate schedule; twice the schedule in the paper, as in the official transformer repo.
+    step=STEP,
+    d_model=D_MODEL,
+    warmup_steps=WARMUP_STEPS,
+    schedule=LR_SCHEDULE,
+    decay=LR_DECAY,
+)  # see utils.py for learning rate schedule
 START_EPOCH = 0  # start at this epoch
 BETAS = (0.9, 0.98)  # beta coefficients in the Adam optimizer
 EPSILON = 1e-9  # epsilon term in the Adam optimizer
@@ -105,5 +111,5 @@
 ################################
 
 EVAL_GAMES_FOLDER = str(
-    pathlib.Path(__file__).parent.parent.parent.resolve() / "eval" / "games" / NAME
+    pathlib.Path(__file__).parent.parent.parent.resolve() / "evaluate" / "games" / NAME
 )  # folder where evaluation games are saved in PGN files
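The `EVAL_GAMES_FOLDER` fix in both model configs is pure path arithmetic: three `.parent` hops climb from the config file to the package root, and the path then descends into the renamed `evaluate` folder. A platform-independent sketch with a stand-in file location and model name:

```python
import pathlib

# Stand-in location for a model config file (illustrative path only).
config_file = pathlib.PurePosixPath(
    "/repo/chess_transformers/configs/models/CT-ED-45.py"
)
NAME = "CT-ED-45"  # stand-in for the NAME imported from the data config

# configs/models/<file> -> configs/models -> configs -> chess_transformers
package_root = config_file.parent.parent.parent
eval_games_folder = str(package_root / "evaluate" / "games" / NAME)

print(eval_games_folder)  # /repo/chess_transformers/evaluate/games/CT-ED-45
```

The real config uses `pathlib.Path(__file__)...resolve()`; `PurePosixPath` is used here only to keep the sketch deterministic across platforms.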
