You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+8-9Lines changed: 8 additions & 9 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -8,12 +8,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
8
8
## Unreleased
9
9
10
10
### Added
11
-
-`leave_one_out_mask` function (`rectools.models.nn.transformers.utils.leave_one_out_mask`) for applying leave-one-out validation during transformer models training ([#292](https://github.com/MobileTeleSystems/RecTools/pull/292))
12
-
11
+
- HSTU Model from "Actions Speak Louder then Words..." implemented in the class `HSTUModel` ([#290](https://github.com/MobileTeleSystems/RecTools/pull/290))
12
+
-`leave_one_out_mask` function (`rectools.models.nn.transformers.utils.leave_one_out_mask`) for applying leave-one-out validation during transformer models training.([#292](https://github.com/MobileTeleSystems/RecTools/pull/292))
13
+
-`logits_t` argument to `TransformerLightningModuleBase`. It is used to scale logits when computing the loss. ([#290](https://github.com/MobileTeleSystems/RecTools/pull/290))
14
+
-`use_scale_factor` argument to `LearnableInversePositionalEncoding`. It scales embeddings by the square root of their dimension — following the original approach from the "Attention Is All You Need" ([#290](https://github.com/MobileTeleSystems/RecTools/pull/290))
15
+
- Optional `context` argument to `recommend` method of models and `get_context` function to `rectools.dataset.context.py` ([#290](https://github.com/MobileTeleSystems/RecTools/pull/290))
13
16
### Fixed
17
+
-[Breaking] Corrected computation of `cosine` distance in `DistanceSimilarityModule`([#290](https://github.com/MobileTeleSystems/RecTools/pull/290))
14
18
- Installation issue with `cupy` extra on macOS ([#293](https://github.com/MobileTeleSystems/RecTools/pull/293))
15
19
-`torch.dtype object has no attribute 'kind'` error in `TorchRanker` ([#293](https://github.com/MobileTeleSystems/RecTools/pull/293))
16
-
20
+
### Removed
21
+
-[Breaking]`Dropout` module from `IdEmbeddingsItemNet`. This changes model behaviour during training, so model results starting from this release might slightly differ from previous RecTools versions even when the random seed is fixed.([#290](https://github.com/MobileTeleSystems/RecTools/pull/290))
17
22
18
23
## [0.15.0] - 17.07.2025
19
24
@@ -24,7 +29,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
24
29
### Fixed
25
30
-[Breaking] Now `LastNSplitter` guarantees taking the last ordered interaction in dataframe in case of identical timestamps ([#288](https://github.com/MobileTeleSystems/RecTools/pull/288))
26
31
27
-
28
32
## [0.14.0] - 16.05.2025
29
33
30
34
### Added
@@ -33,7 +37,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
33
37
-`map_location` and `model_params_update` arguments for the function `load_from_checkpoint` for Transformer-based models. Use `map_location` to explicitly specify the computing new device and `model_params_update` to update original model parameters (e.g. remove training-specific parameters that are not needed anymore) ([#281](https://github.com/MobileTeleSystems/RecTools/pull/281))
34
38
-`get_val_mask_func_kwargs` and `get_trainer_func_kwargs` arguments for Transformer-based models to allow keyword arguments in custom functions used for model training. ([#280](https://github.com/MobileTeleSystems/RecTools/pull/280))
35
39
36
-
37
40
## [0.13.0] - 10.04.2025
38
41
39
42
### Added
@@ -53,7 +56,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
53
56
### Changed
54
57
- Interactions extra columns are not dropped in `Dataset.filter_interactions` method [#267](https://github.com/MobileTeleSystems/RecTools/pull/267)
55
58
56
-
57
59
## [0.11.0] - 17.02.2025
58
60
59
61
### Added
@@ -68,14 +70,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
68
70
69
71
-`ImplicitRanker``rank` method compatible with `Ranker` protocol. `use_gpu` and `num_threads` params moved from `rank` method to `__init__`. [#251](https://github.com/MobileTeleSystems/RecTools/pull/251)
70
72
71
-
72
73
## [0.10.0] - 16.01.2025
73
74
74
75
### Added
75
76
-`ImplicitBPRWrapperModel` model with algorithm description in extended baselines tutorial ([#232](https://github.com/MobileTeleSystems/RecTools/pull/232), [#239](https://github.com/MobileTeleSystems/RecTools/pull/239))
76
77
- All vector models and `EASEModel` support for enabling ranking on GPU and selecting number of threads for CPU ranking. Added `recommend_n_threads` and `recommend_use_gpu_ranking` parameters to `EASEModel`, `ImplicitALSWrapperModel`, `ImplicitBPRWrapperModel`, `PureSVDModel` and `DSSMModel`. Added `recommend_use_gpu_ranking` to `LightFMWrapperModel`. GPU and CPU ranking may provide different ordering of items with identical scores in recommendation table, so this could change ordering items in recommendations since GPU ranking is now used as a default one. ([#218](https://github.com/MobileTeleSystems/RecTools/pull/218))
77
78
78
-
79
79
## [0.9.0] - 11.12.2024
80
80
81
81
### Added
@@ -115,7 +115,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
115
115
### Removed
116
116
-[Breaking]`assume_external_ids` parameter in `recommend` and `recommend_to_items` model methods ([#177](https://github.com/MobileTeleSystems/RecTools/pull/177))
Copy file name to clipboardExpand all lines: README.md
+24-25Lines changed: 24 additions & 25 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -24,17 +24,16 @@
24
24
RecTools is an easy-to-use Python library which makes the process of building recommender systems easier and
25
25
faster than ever before.
26
26
27
-
## ✨ Highlights: Transformer models released! ✨
27
+
## ✨ Highlights: HSTU model released! ✨
28
28
29
-
**BERT4Rec and SASRec are now available in RecTools:**
29
+
**HSTU arhictecture from ["Actions speak louder then words..."](https://arxiv.org/abs/2402.17152) is now available in RecTools as `HSTUModel`:**
30
30
- Fully compatible with our `fit` / `recommend` paradigm and require NO special data processing
31
-
- Explicitly described in our [Transformers Theory & Practice Tutorial](examples/tutorials/transformers_tutorial.ipynb): loss options, item embedding options, category features utilization and more!
31
+
- Supports context-aware recommendations in case Relative Time Bias is enabled
32
+
- Supports all loss options, item embedding options, category features utilization and other common modular functionality of RecTools transformer models
33
+
- In [HSTU tutorial](examples/tutorials/transformers_HSTU_tutorial.ipynb) we show that original metrics reported for HSTU on public Movielens datasets may actually be **underestimated**
32
34
- Configurable, customizable, callback-friendly, checkpoints-included, logs-out-of-the-box, custom-validation-ready, multi-gpu-compatible! See [Transformers Advanced Training User Guide](examples/tutorials/transformers_advanced_training_guide.ipynb) and [Transformers Customization Guide](examples/tutorials/transformers_customization_guide.ipynb)
33
-
- Public benchmarks which compare RecTools models to other open-source implementations following BERT4Rec replicability paper show that RecTools implementations achieve highest scores on multiple datasets: [Performance on public transformers benchmarks](https://github.com/blondered/bert4rec_repro?tab=readme-ov-file#rectools-transformers-benchmark-results)
34
-
35
-
36
-
37
35
36
+
Plase note that we always compare the quality of our implementations to academic papers results. [Public benchmarks for transformer models SASRec and BERT4Rec](https://github.com/blondered/bert4rec_repro?tab=readme-ov-file#rectools-transformers-benchmark-results) show that RecTools implementations achieve highest scores on multiple datasets compared to other published results.
38
37
39
38
40
39
## Get started
@@ -48,11 +47,10 @@ unzip ml-1m.zip
48
47
49
48
```python
50
49
import pandas as pd
51
-
from implicit.nearest_neighbours import TFIDFRecommender
52
50
53
51
from rectools import Columns
54
52
from rectools.dataset import Dataset
55
-
from rectools.models importImplicitItemKNNWrapperModel
53
+
from rectools.models importSASRecModel
56
54
57
55
# Read the data
58
56
ratings = pd.read_csv(
@@ -67,7 +65,7 @@ ratings = pd.read_csv(
67
65
dataset = Dataset.construct(ratings)
68
66
69
67
# Fit model
70
-
model =ImplicitItemKNNWrapperModel(TFIDFRecommender(K=10))
68
+
model =SASRecModel(n_factors=64, epochs=100, loss="sampled_softmax")
71
69
model.fit(dataset)
72
70
73
71
# Make recommendations
@@ -105,22 +103,22 @@ pip install rectools[all]
105
103
106
104
## Recommender Models
107
105
The table below lists recommender models that are available in RecTools.
108
-
See [recommender baselines extended tutorial](https://github.com/MobileTeleSystems/RecTools/blob/main/examples/tutorials/baselines_extended_tutorial.ipynb) for deep dive into theory & practice of our supported models.
109
-
110
-
| Model | Type | Description (🎏 for user/item features, 🔆 for warm inference, ❄️ for cold inference support) | Tutorials & Benchmarks |
111
-
|----|----|---------|--------|
112
-
| SASRec | Neural Network |`rectools.models.SASRecModel` - Transformer-based sequential model with unidirectional attention mechanism and "Shifted Sequence" training objective <br>🎏| 📕 [Transformers Theory & Practice](examples/tutorials/transformers_tutorial.ipynb)<br> 📗 [Advanced training guide](examples/tutorials/transformers_advanced_training_guide.ipynb) <br> 📘 [Customization guide](examples/tutorials/transformers_customization_guide.ipynb) <br> 🚀 [Top performance on public benchmarks](https://github.com/blondered/bert4rec_repro?tab=readme-ov-file#rectools-transformers-benchmark-results)|
113
-
| BERT4Rec | Neural Network |`rectools.models.BERT4RecModel` - Transformer-based sequential model with bidirectional attention mechanism and "MLM" (masked item) training objective <br>🎏| 📕 [Transformers Theory & Practice](examples/tutorials/transformers_tutorial.ipynb)<br> 📗 [Advanced training guide](examples/tutorials/transformers_advanced_training_guide.ipynb) <br> 📘 [Customization guide](examples/tutorials/transformers_customization_guide.ipynb) <br> 🚀 [Top performance on public benchmarks](https://github.com/blondered/bert4rec_repro?tab=readme-ov-file#rectools-transformers-benchmark-results)|
114
-
|[implicit](https://github.com/benfred/implicit) ALS Wrapper | Matrix Factorization |`rectools.models.ImplicitALSWrapperModel` - Alternating Least Squares Matrix Factorizattion algorithm for implicit feedback. <br>🎏| 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#Implicit-ALS)<br> 🚀 [50% boost to metrics with user & item features](examples/5_benchmark_iALS_with_features.ipynb)|
| HSTU | Neural Network | `rectools.models.HSTUModel` - Sequential model with unidirectional pointwise aggregated attention mechanism, incorporating relative attention bias from positional and temporal information, introduced in ["Actions speak louder then words..."](https://arxiv.org/pdf/2402.17152), combined with "Shifted Sequence" training objective as in original public benchmarks<br>🎏 | 📓 [HSTU Theory & Practice](examples/tutorials/transformers_HSTU_tutorial.ipynb) <br> 📕 [Transformers Theory & Practice](examples/tutorials/transformers_tutorial.ipynb)<br> 📗 [Advanced training guide](examples/tutorials/transformers_advanced_training_guide.ipynb) <br> 🚀 [Top performance on public datasets](examples/tutorials/transformers_HSTU_tutorial.ipynb)
110
+
| SASRec | Neural Network |`rectools.models.SASRecModel` - Transformer-based sequential model with unidirectional attention mechanism and "Shifted Sequence" training objective <br>🎏| 📕 [Transformers Theory & Practice](examples/tutorials/transformers_tutorial.ipynb)<br> 📗 [Advanced training guide](examples/tutorials/transformers_advanced_training_guide.ipynb) <br> 📘 [Customization guide](examples/tutorials/transformers_customization_guide.ipynb) <br> 🚀 [Top performance on public benchmarks](https://github.com/blondered/bert4rec_repro?tab=readme-ov-file#rectools-transformers-benchmark-results)|
111
+
| BERT4Rec | Neural Network |`rectools.models.BERT4RecModel` - Transformer-based sequential model with bidirectional attention mechanism and "MLM" (masked item) training objective <br>🎏| 📕 [Transformers Theory & Practice](examples/tutorials/transformers_tutorial.ipynb)<br> 📗 [Advanced training guide](examples/tutorials/transformers_advanced_training_guide.ipynb) <br> 📘 [Customization guide](examples/tutorials/transformers_customization_guide.ipynb) <br> 🚀 [Top performance on public benchmarks](https://github.com/blondered/bert4rec_repro?tab=readme-ov-file#rectools-transformers-benchmark-results)|
112
+
|[implicit](https://github.com/benfred/implicit) ALS Wrapper | Matrix Factorization |`rectools.models.ImplicitALSWrapperModel` - Alternating Least Squares Matrix Factorizattion algorithm for implicit feedback. <br>🎏| 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#Implicit-ALS)<br> 🚀 [50% boost to metrics with user & item features](examples/5_benchmark_iALS_with_features.ipynb)|
|[implicit](https://github.com/benfred/implicit) ItemKNN Wrapper | Nearest Neighbours |`rectools.models.ImplicitItemKNNWrapperModel` - Algorithm that calculates item-item similarity matrix using distances between item vectors in user-item interactions matrix | 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#ItemKNN)|
117
-
|[LightFM](https://github.com/lyst/lightfm) Wrapper | Matrix Factorization |`rectools.models.LightFMWrapperModel` - Hybrid matrix factorization algorithm which utilises user and item features and supports a variety of losses.<br>🎏 🔆 ❄️| 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#LightFM)<br>🚀 [10-25 times faster inference with RecTools](examples/6_benchmark_lightfm_inference.ipynb)|
| PureSVD | Matrix Factorization |`rectools.models.PureSVDModel` - Truncated Singular Value Decomposition of user-item interactions matrix | 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#PureSVD)|
120
-
| DSSM | Neural Network |`rectools.models.DSSMModel` - Two-tower Neural model that learns user and item embeddings utilising their explicit features and learning on triplet loss.<br>🎏 🔆 | - |
121
-
| Popular | Heuristic |`rectools.models.PopularModel` - Classic baseline which computes popularity of items and also accepts params like time window and type of popularity computation.<br>❄️| - |
122
-
| Popular in Category | Heuristic |`rectools.models.PopularInCategoryModel` - Model that computes poularity within category and applies mixing strategy to increase Diversity.<br>❄️| - |
123
-
| Random | Heuristic |`rectools.models.RandomModel` - Simple random algorithm useful to benchmark Novelty, Coverage, etc.<br>❄️| - |
115
+
|[LightFM](https://github.com/lyst/lightfm) Wrapper | Matrix Factorization |`rectools.models.LightFMWrapperModel` - Hybrid matrix factorization algorithm which utilises user and item features and supports a variety of losses.<br>🎏 🔆 ❄️| 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#LightFM)<br>🚀 [10-25 times faster inference with RecTools](examples/6_benchmark_lightfm_inference.ipynb)|
| PureSVD | Matrix Factorization |`rectools.models.PureSVDModel` - Truncated Singular Value Decomposition of user-item interactions matrix| 📙 [Theory & Practice](https://rectools.readthedocs.io/en/latest/examples/tutorials/baselines_extended_tutorial.html#PureSVD)|
118
+
| DSSM | Neural Network |`rectools.models.DSSMModel` - Two-tower Neural model that learns user and item embeddings utilising their explicit features and learning on triplet loss.<br>🎏 🔆| - |
119
+
| Popular | Heuristic |`rectools.models.PopularModel` - Classic baseline which computes popularity of items and also accepts params like time window and type of popularity computation.<br>❄️| - |
120
+
| Popular in Category | Heuristic |`rectools.models.PopularInCategoryModel` - Model that computes poularity within category and applies mixing strategy to increase Diversity.<br>❄️| - |
121
+
| Random | Heuristic |`rectools.models.RandomModel` - Simple random algorithm useful to benchmark Novelty, Coverage, etc.<br>❄️| - |
124
122
125
123
- All of the models follow the same interface. **No exceptions**
126
124
- No need for manual creation of sparse matrixes, torch dataloaders or mapping ids. Preparing data for models is as simple as `dataset = Dataset.construct(interactions_df)`
0 commit comments