Releases: dholzmueller/pytabkit
v1.7.2
v1.7.0
- added xRFM (D, HPO)
- added new 'tabarena-new' search space for RealMLP-HPO, including per-fold ensembling (more expensive) and tuning two more categorical hyperparameters (with better results)
- reduced RealMLP pickle size by not storing the dataset (#33)
- fixed gradient clipping for TabM (it did nothing previously, see #34). To ensure backward compatibility, it is set to None in the HPO search spaces now (it was already None in the default parameters).
- removed debug print in TabM training loop
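For readers unsure what the difference between None and a numeric clipping value means, here is a minimal sketch of global-norm gradient clipping in plain Python. It illustrates the general technique only; the function name `clip_grad_norm` and the parameter `max_norm` are hypothetical, and the sketch is not taken from the TabM code in pytabkit.

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale a flat list of gradients so that its L2 norm is at most max_norm.

    max_norm=None disables clipping and returns the gradients unchanged
    (assumed here to mirror what a None setting means in a search space).
    """
    if max_norm is None:
        return list(grads)
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]

# The gradient [3, 4] has norm 5; clipping to norm 1 keeps its direction.
clipped = clip_grad_norm([3.0, 4.0], max_norm=1.0)
unclipped = clip_grad_norm([3.0, 4.0], max_norm=None)
```

With clipping enabled, large gradient steps are shrunk, so turning a previously ineffective clipping value into an effective one can change training results. That is presumably why the search spaces now use None.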
v1.6.1
- For n_ens>1, changed the default behavior for classification to averaging probabilities instead of logits. This can be reverted by setting ens_av_before_softmax=True.
- Implemented time limit for HPO/ensemble methods through time_limit_s parameter.
- Support torch>=2.6 and Python 3.13.
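The two ensembling modes for classification can give noticeably different probabilities. The following is a minimal sketch in plain Python, assuming two ensemble members and made-up logits; the helper names are hypothetical and the code does not call pytabkit.

```python
import math

def softmax(logits):
    """Numerically stable softmax for one sample."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_columns(rows):
    """Element-wise mean over a list of equally long lists."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

# Illustrative logits of two ensemble members for one sample with two classes.
member_logits = [[4.0, 0.0], [-1.0, 0.0]]

# Averaging probabilities: softmax per member, then average.
probs_avg = mean_columns([softmax(z) for z in member_logits])

# Averaging logits: average first, then softmax
# (the behavior that ens_av_before_softmax=True reverts to, per the note above).
logit_avg = softmax(mean_columns(member_logits))
```

In this example the confident first member dominates the logit average, so `logit_avg` assigns a higher probability to class 0 than `probs_avg` does.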
v1.5.2
- v1.5.2: fixed more device bugs for HPO and ensembling
v1.5.1
- v1.5.1: fixed a device bug in TabM for GPU
- v1.5.0:
  - added n_repeats parameter to scikit-learn interfaces for repeated cross-validation
  - HPO sklearn interfaces (the ones using random search) can now do weighted ensembling instead by setting use_caruana_ensembling=True. Removed the RealMLP_Ensemble_Classifier and RealMLP_Ensemble_Regressor from v1.4.2 since they are now redundant through this feature.
  - renamed space parameter of GBDT HPO interface to hpo_space_name so now it also works with non-TPE versions.
  - Added new TabArena search spaces for boosted trees (not TPE), which should be almost equivalent to the ones from TabArena except for the early stopping logic.
  - TabM now supports val_metric_name for early stopping on different metrics.
  - fixed issues #20 and #21 regarding HPO
  - small updates for the "Rethinking Early Stopping" paper
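As background on the weighted ensembling enabled by use_caruana_ensembling=True, here is a minimal sketch of greedy ensemble selection with replacement in the style of Caruana et al. (2004). It is a generic illustration, not pytabkit's implementation: the function `caruana_weights`, the MSE criterion, and the toy data are all assumptions.

```python
def caruana_weights(val_preds, y_val, n_rounds=10):
    """Greedy ensemble selection with replacement.

    val_preds: one list of validation predictions per model.
    Returns one weight per model; the weights sum to 1.
    """
    n_models, n_samples = len(val_preds), len(y_val)
    counts = [0] * n_models
    running_sum = [0.0] * n_samples  # summed predictions of the selected models

    def mse_if_added(m, n_selected):
        # Validation MSE of the ensemble average if model m were added next.
        return sum(
            ((running_sum[i] + val_preds[m][i]) / (n_selected + 1) - y_val[i]) ** 2
            for i in range(n_samples)
        ) / n_samples

    for step in range(n_rounds):
        best = min(range(n_models), key=lambda m: mse_if_added(m, step))
        counts[best] += 1
        running_sum = [s + p for s, p in zip(running_sum, val_preds[best])]
    return [c / n_rounds for c in counts]

y_val = [1.0, 2.0, 3.0]
val_preds = [
    [1.5, 2.5, 3.5],  # model 0: biased upward
    [0.5, 1.5, 2.5],  # model 1: biased downward
    [5.0, 5.0, 5.0],  # model 2: poor fit
]
weights = caruana_weights(val_preds, y_val, n_rounds=10)
```

In this toy case the two oppositely biased models end up with equal weight and the poor model is never selected, which shows how a weighted ensemble can beat picking the single best configuration.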
v1.4.2
- fixed handling of custom val_metric_name in HPO models and Ensemble_TD_Regressor.
- If tmp_folder is specified in HPO models, save each model to disk immediately instead of holding all of them in memory. This can considerably reduce RAM/VRAM usage. In this case, pickled HPO models will still rely on the models stored in the tmp_folder.
- We now provide RealMLP_Ensemble_Classifier and RealMLP_Ensemble_Regressor, which will use weighted ensembling and usually perform better than HPO (but have slower inference time). We recommend using the new hpo_space_name='tabarena' for best results.
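The memory saving from tmp_folder, and the reason pickled HPO models still depend on that folder, can be pictured with the following sketch. It assumes a simple "fit, pickle, keep only the path" pattern; `fit_and_spill`, `load_model`, and `toy_fit` are hypothetical names and do not reflect pytabkit's internal code.

```python
import pickle
import tempfile
from pathlib import Path

def fit_and_spill(configs, fit_fn, tmp_folder):
    """Fit one model per config and pickle it immediately, keeping only paths."""
    tmp_folder = Path(tmp_folder)
    tmp_folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, cfg in enumerate(configs):
        model = fit_fn(cfg)
        path = tmp_folder / f"model_{i}.pkl"
        with open(path, "wb") as f:
            pickle.dump(model, f)
        paths.append(path)
        del model  # at most one fitted model is held in memory at a time
    return paths

def load_model(path):
    """Reload a single model from disk when it is needed for prediction."""
    with open(path, "rb") as f:
        return pickle.load(f)

def toy_fit(cfg):
    # Stand-in for real training; returns a small picklable object.
    return {"lr": cfg["lr"], "trained": True}

with tempfile.TemporaryDirectory() as tmp:
    paths = fit_and_spill([{"lr": 0.1}, {"lr": 0.01}], toy_fit, tmp)
    reloaded = load_model(paths[1])
```

Because the outer object holds only file paths, deleting or moving the folder would make the stored models unavailable.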
v1.4.0
What's Changed
- moved some imports to the new models optional dependencies to have a more light-weight RealMLP installation
- Added GPU support for CatBoost with help from Maximilian Schambach in #16 (not guaranteed to produce exactly the same results).
- Ensembling now saves models after training if a path is supplied, to reduce memory usage
- Added more search spaces
- fixed error in multiquantile output when the passed y was one-dimensional instead of having shape (n_samples, 1)
- Added some examples to the documentation
v1.3.0
- Added multiquantile regression for RealMLP: see the documentation
- More hyperparameters for RealMLP
- Added TabICL wrapper
- Small fixes
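As background for the multiquantile regression entry above, the standard objective for quantile regression is the pinball loss, averaged over the requested quantiles. The sketch below is in plain Python; whether RealMLP uses exactly this form is an assumption, and both function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a single quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)

def multiquantile_loss(y_true, y_pred_rows, quantiles):
    """Average pinball loss over several quantiles.

    y_pred_rows[i][j] is the prediction for sample i and quantiles[j].
    """
    per_quantile = [
        pinball_loss(y_true, [row[j] for row in y_pred_rows], q)
        for j, q in enumerate(quantiles)
    ]
    return sum(per_quantile) / len(per_quantile)

under = pinball_loss([1.0], [0.0], q=0.9)  # prediction too low for the 0.9 quantile
over = pinball_loss([1.0], [2.0], q=0.9)   # prediction too high
both = multiquantile_loss([1.0, 1.0], [[0.0, 2.0], [0.0, 2.0]], [0.9, 0.9])
```

For a high quantile such as 0.9, predicting too low is penalized more heavily than predicting too high, which pushes the prediction toward the upper part of the target distribution.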