Releases · dholzmueller/pytabkit · GitHub

17 Dec 12:08

dholzmueller

v1.7.2 Latest

Added scikit-learn 1.8 compatibility.
Removed debug print in RealMLP.
Fixed a device memory estimation error in the scheduler when CUDA_VISIBLE_DEVICES was used.
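
For context on the CUDA_VISIBLE_DEVICES fix: when this variable is set, CUDA renumbers the visible devices starting from 0, so the device index a process sees can differ from the physical GPU index. The release note does not say what caused the estimation error; the sketch below only illustrates the renumbering. The helper `logical_to_physical` is hypothetical (not part of pytabkit) and assumes the variable holds comma-separated integer indices rather than GPU UUIDs.

```python
import os


def logical_to_physical(logical_idx, env=None):
    """Map a process-local CUDA device index to the physical GPU index.

    Hypothetical helper for illustration only (not pytabkit API).
    Assumes CUDA_VISIBLE_DEVICES contains comma-separated integers.
    """
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        # No restriction: logical and physical indices coincide.
        return logical_idx
    ids = [int(tok) for tok in visible.split(",") if tok.strip()]
    return ids[logical_idx]


# With CUDA_VISIBLE_DEVICES=2,3 the process-local device 0 is physical GPU 2.
mapping_example = logical_to_physical(0, {"CUDA_VISIBLE_DEVICES": "2,3"})
print(mapping_example)  # 2
```

Code that queries memory by physical index (for example via an external monitoring tool) while addressing devices by logical index would need a mapping of this kind.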

05 Nov 11:53

dholzmueller

v1.7.0

added xRFM (D, HPO)
added new 'tabarena-new' search space for RealMLP-HPO, including per-fold ensembling (more expensive) and tuning two more categorical hyperparameters (with better results)
reduced RealMLP pickle size by not storing the dataset (#33)
fixed gradient clipping for TabM (it did nothing previously, see #34). To ensure backward compatibility, it is set to None in the HPO search spaces now (it was already None in the default parameters).
removed debug print in TabM training loop
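
To illustrate what gradient clipping does and why a value of None keeps the old behavior, here is a minimal standard-library sketch. It assumes clipping by global L2 norm; the release note does not say which clipping variant TabM uses, and the function name `clip_grad_norm` is hypothetical rather than pytabkit API.

```python
import math


def clip_grad_norm(grads, max_norm):
    """Rescale gradient values so that their L2 norm is at most max_norm.

    Illustrative sketch only. max_norm=None disables clipping, which
    matches the behavior of a clipping option that has no effect.
    """
    if max_norm is None:
        return list(grads)
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm == 0.0 or total_norm <= max_norm:
        # Already within the allowed norm: leave gradients unchanged.
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# The gradient (3, 4) has norm 5 and is scaled down to norm 1.
clipped = clip_grad_norm([3.0, 4.0], max_norm=1.0)
# With None the gradient passes through unchanged.
unclipped = clip_grad_norm([3.0, 4.0], max_norm=None)
```

Since clipping previously had no effect, leaving the value at None in the search spaces reproduces the earlier training behavior.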

14 Aug 11:01

dholzmueller

v1.6.1

For n_ens>1, changed the default behavior for classification to averaging probabilities instead of logits. This can be reverted by setting ens_av_before_softmax=True.
Implemented time limit for HPO/ensemble methods through time_limit_s parameter.
Support torch>=2.6 and Python 3.13.
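
The difference between the two averaging modes for n_ens>1 can be shown with a small standard-library sketch. This is not pytabkit's implementation; the function `ensemble_predict` and its `av_before_softmax` argument are hypothetical names that mirror the ens_av_before_softmax option for a single sample.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def ensemble_predict(member_logits, av_before_softmax=False):
    """Combine per-member logit vectors for one sample into probabilities.

    av_before_softmax=True averages logits and then applies softmax;
    False applies softmax per member and averages the probabilities.
    """
    n = len(member_logits)
    n_classes = len(member_logits[0])
    if av_before_softmax:
        mean_logits = [sum(m[c] for m in member_logits) / n
                       for c in range(n_classes)]
        return softmax(mean_logits)
    probs = [softmax(m) for m in member_logits]
    return [sum(p[c] for p in probs) / n for c in range(n_classes)]


# Two ensemble members that disagree on a binary problem.
member_logits = [[4.0, 0.0], [-1.0, 0.0]]
p_prob = ensemble_predict(member_logits, av_before_softmax=False)
p_logit = ensemble_predict(member_logits, av_before_softmax=True)
```

In this example the very confident first member dominates the logit average more strongly than the probability average, so `p_logit[0]` is larger than `p_prob[0]`.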

26 Jun 09:18

dholzmueller

v1.5.2

v1.5.2: fixed more device bugs for HPO and ensembling

25 Jun 21:24

dholzmueller

v1.5.1

v1.5.1: fixed a device bug in TabM for GPU
v1.5.0:
- added n_repeats parameter to scikit-learn interfaces for repeated cross-validation
- HPO sklearn interfaces (the ones using random search)
  can now do weighted ensembling instead by setting use_caruana_ensembling=True.
  Removed the RealMLP_Ensemble_Classifier and RealMLP_Ensemble_Regressor from v1.4.2
  since they are now redundant through this feature.
- renamed space parameter of GBDT HPO interface
  to hpo_space_name so now it also works with non-TPE versions.
- Added new TabArena search spaces for boosted trees (not TPE),
  which should be almost equivalent to the ones from TabArena
  except for the early stopping logic.
- TabM now supports val_metric_name for early stopping on different metrics.
- fixed issues #20 and #21 regarding HPO
- small updates for the "Rethinking Early Stopping" paper
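
As background for use_caruana_ensembling, the following is a minimal sketch of greedy ensemble selection with replacement in the style of Caruana et al., written for regression with mean squared error. It is an illustration under assumptions, not pytabkit's code: the function names, the number of rounds, and the choice of metric are all hypothetical.

```python
def mse(pred, y):
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def caruana_weights(val_preds, y_val, n_rounds=20):
    """Greedy ensemble selection with replacement.

    val_preds: list of per-model validation prediction lists.
    Returns one weight per model; the weights sum to 1.
    """
    n_models = len(val_preds)
    counts = [0] * n_models
    running_sum = [0.0] * len(y_val)  # sum of predictions selected so far
    for round_idx in range(1, n_rounds + 1):
        best_idx, best_err = None, float("inf")
        for i in range(n_models):
            # Validation error of the ensemble if model i were added next.
            candidate = [(s + p) / round_idx
                         for s, p in zip(running_sum, val_preds[i])]
            err = mse(candidate, y_val)
            if err < best_err:
                best_idx, best_err = i, err
        counts[best_idx] += 1
        running_sum = [s + p for s, p in zip(running_sum, val_preds[best_idx])]
    return [c / n_rounds for c in counts]


y_val = [0.0, 1.0, 2.0]
val_preds = [
    [0.5, 1.5, 2.5],    # biased upwards by 0.5
    [-0.5, 0.5, 1.5],   # biased downwards by 0.5
    [3.0, 3.0, 3.0],    # poor constant model
]
weights = caruana_weights(val_preds, y_val)
```

In this toy setup the two oppositely biased models cancel each other out and share the weight equally, while the poor constant model is never selected.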

16 Jun 14:56

dholzmueller

v1.4.2

Fixed handling of custom val_metric_name in HPO models and Ensemble_TD_Regressor.
If tmp_folder is specified in HPO models, save each model to disk immediately instead of holding all of them in memory. This can considerably reduce RAM/VRAM usage. In this case, pickled HPO models will still rely on the models stored in the tmp_folder.
We now provide RealMLP_Ensemble_Classifier and RealMLP_Ensemble_Regressor, which will use weighted ensembling and usually perform better than HPO (but have slower inference time). We recommend using the new hpo_space_name='tabarena' for best results.
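
The memory-saving idea behind tmp_folder can be sketched with the standard library alone. This is an assumption-laden illustration, not pytabkit's implementation: `fit_candidates`, `load_best`, and `toy_fit` are hypothetical names, and the "models" are plain dictionaries.

```python
import pickle
import tempfile
from pathlib import Path


def fit_candidates(configs, fit_fn, tmp_folder):
    """Fit one model per config and pickle each one to tmp_folder right away.

    Only (score, path) pairs stay in memory. fit_fn is a hypothetical
    callable returning (model, validation_score); lower scores are better.
    """
    tmp_folder = Path(tmp_folder)
    tmp_folder.mkdir(parents=True, exist_ok=True)
    results = []
    for i, config in enumerate(configs):
        model, score = fit_fn(config)
        path = tmp_folder / f"model_{i}.pkl"
        with open(path, "wb") as f:
            pickle.dump(model, f)
        del model  # drop the in-memory copy before fitting the next one
        results.append((score, path))
    return results


def load_best(results):
    """Load the model with the lowest validation score back from disk."""
    _, best_path = min(results, key=lambda r: r[0])
    with open(best_path, "rb") as f:
        return pickle.load(f)


def toy_fit(config):
    # Stand-in for training: the "model" is a dict, the score is |lr - 0.1|.
    return {"lr": config["lr"]}, abs(config["lr"] - 0.1)


with tempfile.TemporaryDirectory() as folder:
    results = fit_candidates([{"lr": 1.0}, {"lr": 0.1}, {"lr": 0.01}],
                             toy_fit, folder)
    best_model = load_best(results)
```

As in the note above, anything that refers to the stored files only works while the folder still exists; here the models must be loaded before the temporary directory is cleaned up.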

24 May 18:59

dholzmueller

v1.4.0

What's Changed

Moved some imports to the new models optional dependencies to allow a more lightweight RealMLP installation
Added GPU support for CatBoost with help from Maximilian Schambach in #16 (not guaranteed to produce exactly the same results).
Ensembling now saves models after training if a path is supplied, to reduce memory usage
Added more search spaces
fixed error in multiquantile output when the passed y was one-dimensional instead of having shape (n_samples, 1)
Added some examples to the documentation

12 Mar 21:44

dholzmueller

v1.3.0

Added multiquantile regression for RealMLP: see the documentation
More hyperparameters for RealMLP
Added TabICL wrapper
Small fixes
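
For readers unfamiliar with multiquantile regression, the sketch below computes the standard pinball (quantile) loss for several quantiles at once. It assumes this standard loss for illustration; the release notes do not state the exact training objective used by RealMLP, and `pinball_loss` is a hypothetical helper, not pytabkit API.

```python
def pinball_loss(y_true, y_pred, quantiles):
    """Mean pinball loss, averaged over samples and quantiles.

    y_true: list of floats with one target per sample.
    y_pred: list of lists, y_pred[i][j] is the prediction for
            sample i and quantile quantiles[j].
    """
    total = 0.0
    for y, preds in zip(y_true, y_pred):
        for q, p in zip(quantiles, preds):
            diff = y - p
            # Under-prediction is weighted by q, over-prediction by (1 - q).
            total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / (len(y_true) * len(quantiles))


quantiles = [0.1, 0.5, 0.9]
y_true = [1.0, 2.0]
y_pred = [[0.0, 1.0, 2.0], [2.0, 2.0, 2.0]]
loss = pinball_loss(y_true, y_pred, quantiles)
```

Each quantile level gets its own prediction column, so the model output for n samples and k quantiles has shape (n_samples, k) in this sketch.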
