Releases: dholzmueller/pytabkit

v1.7.2

17 Dec 12:08

  • Added scikit-learn 1.8 compatibility.
  • Removed debug print in RealMLP.
  • Fixed a device memory estimation error in the scheduler that occurred when CUDA_VISIBLE_DEVICES was set.

v1.7.0

05 Nov 11:53

  • added xRFM (D, HPO)
  • added new 'tabarena-new' search space for RealMLP-HPO (with better results),
    which includes per-fold ensembling (more expensive)
    and tunes two more categorical hyperparameters
  • reduced RealMLP pickle size by not storing the dataset (#33)
  • fixed gradient clipping for TabM
    (it did nothing previously, see #34).
    To ensure backward compatibility, it is set to None in the HPO search spaces now
    (it was already None in the default parameters).
  • removed debug print in TabM training loop
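
The gradient clipping note above can be illustrated with a minimal, library-free sketch of clipping by L2 norm. The function name `clip_grad_norm` and the `max_norm` argument are hypothetical and are not pytabkit's or TabM's actual code. The sketch only shows why a threshold of None reproduces the earlier behavior, in which clipping had no effect.

```python
import math

def clip_grad_norm(grads, max_norm=None):
    """Rescale grads so that their L2 norm is at most max_norm.

    max_norm=None disables clipping (hypothetical convention for this sketch).
    """
    if max_norm is None:
        return list(grads)
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        # Already within the limit: leave the gradient untouched.
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

grads = [3.0, 4.0]  # L2 norm is 5
unclipped = clip_grad_norm(grads, max_norm=None)
clipped = clip_grad_norm(grads, max_norm=1.0)
```

With clipping disabled the gradient is returned unchanged, which matches runs made before the fix. With a threshold of 1.0 the direction is kept and the norm is scaled down to 1.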

v1.6.1

14 Aug 11:01

  • For n_ens>1, changed the default behavior for classification to averaging probabilities instead of logits. This can be reverted by setting ens_av_before_softmax=True.
  • Implemented time limit for HPO/ensemble methods through time_limit_s parameter.
  • Support torch>=2.6 and Python 3.13.
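
The difference between averaging probabilities and averaging logits can be seen in a small plain-Python sketch. The helper names and the `av_before_softmax` argument are illustrative assumptions that loosely mirror ens_av_before_softmax. This is not the library's implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_columns(rows):
    """Element-wise mean over a list of equal-length lists."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def ensemble_predict(member_logits, av_before_softmax=False):
    """Combine per-member logits for one sample into class probabilities."""
    if av_before_softmax:
        # Average the logits first, then apply softmax once.
        return softmax(mean_columns(member_logits))
    # Apply softmax per member, then average the probabilities.
    return mean_columns([softmax(l) for l in member_logits])

# Two hypothetical ensemble members, two classes, one sample.
member_logits = [[4.0, 0.0], [-1.0, 0.0]]
probs_avg = ensemble_predict(member_logits)
logits_avg = ensemble_predict(member_logits, av_before_softmax=True)
```

Both variants return a valid probability vector, but they generally disagree. In this example the confident first member dominates more strongly when logits are averaged than when probabilities are averaged.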

v1.5.2

26 Jun 09:18

  • v1.5.2: fixed more device bugs for HPO and ensembling

v1.5.1

25 Jun 21:24

  • v1.5.1: fixed a device bug in TabM for GPU
  • v1.5.0:
    • added n_repeats parameter to scikit-learn interfaces for repeated cross-validation
    • HPO sklearn interfaces (the ones using random search)
      can now do weighted ensembling instead by setting use_caruana_ensembling=True.
      Removed the RealMLP_Ensemble_Classifier and RealMLP_Ensemble_Regressor from v1.4.2
      since they are now redundant through this feature.
    • renamed space parameter of GBDT HPO interface
      to hpo_space_name so now it also works with non-TPE versions.
    • Added new TabArena search spaces for boosted trees (not TPE),
      which should be almost equivalent to the ones from TabArena
      except for the early stopping logic.
    • TabM now supports val_metric_name for early stopping on different metrics.
    • fixed issues #20 and #21 regarding HPO
    • small updates for the "Rethinking Early Stopping" paper
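
As background for the weighted ensembling option above, here is a minimal sketch of greedy ensemble selection with replacement in the style of Caruana et al. It is an assumption that use_caruana_ensembling works along these lines. The function names, the MSE metric, and the toy data are purely illustrative and not taken from pytabkit.

```python
def mse(pred, y):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

def caruana_weights(val_preds, y_val, n_rounds=20):
    """Greedy forward selection with replacement.

    val_preds: one list of validation predictions per candidate model.
    Returns one weight per model; the weights sum to 1.
    """
    counts = [0] * len(val_preds)
    running_sum = [0.0] * len(y_val)
    for round_idx in range(1, n_rounds + 1):
        best_idx, best_loss = None, float("inf")
        for i, preds in enumerate(val_preds):
            # Validation loss of the ensemble if model i were added once more.
            candidate = [(s + p) / round_idx for s, p in zip(running_sum, preds)]
            loss = mse(candidate, y_val)
            if loss < best_loss:
                best_idx, best_loss = i, loss
        counts[best_idx] += 1
        running_sum = [s + p for s, p in zip(running_sum, val_preds[best_idx])]
    return [c / n_rounds for c in counts]

y_val = [0.0, 1.0, 2.0, 3.0]
val_preds = [
    [0.5, 1.5, 2.5, 3.5],    # model A: biased upwards by 0.5
    [-0.5, 0.5, 1.5, 2.5],   # model B: biased downwards by 0.5
    [3.0, 3.0, 3.0, 3.0],    # model C: constant prediction
]
weights = caruana_weights(val_preds, y_val, n_rounds=10)
```

In this toy setup the two complementary models end up with equal weight and the constant model is never selected. Because every selected configuration contributes to the final prediction, inference is slower than using only the single best configuration.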

v1.4.2

16 Jun 14:56

  • fixed handling of custom val_metric_name in HPO models and Ensemble_TD_Regressor.
  • If tmp_folder is specified in HPO models, save each model to disk immediately instead of holding all of them in memory. This can considerably reduce RAM/VRAM usage. In this case, pickled HPO models will still rely on the models stored in the tmp_folder.
  • We now provide RealMLP_Ensemble_Classifier and RealMLP_Ensemble_Regressor, which will use weighted ensembling and usually perform better than HPO (but have slower inference time). We recommend using the new hpo_space_name='tabarena' for best results.
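
The tmp_folder behavior described above follows a common pattern: write each fitted model to disk right away and keep only its path. The sketch below uses only the standard library. `fit_and_offload`, `load_model`, and the dict stand-in for a model are hypothetical and are not pytabkit internals.

```python
import pickle
import tempfile
from pathlib import Path

def fit_and_offload(configs, fit_fn, tmp_folder):
    """Fit one model per config, pickle it immediately, and return only the file paths."""
    tmp_folder = Path(tmp_folder)
    tmp_folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, config in enumerate(configs):
        model = fit_fn(config)
        path = tmp_folder / f"model_{i}.pkl"
        with open(path, "wb") as f:
            pickle.dump(model, f)
        paths.append(path)
        del model  # drop the in-memory reference before fitting the next model
    return paths

def load_model(path):
    """Load a single model back from disk when it is needed for prediction."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Stand-in "model": a dict that records its configuration.
with tempfile.TemporaryDirectory() as folder:
    paths = fit_and_offload([{"lr": 0.1}, {"lr": 0.01}], lambda c: {"config": c}, folder)
    reloaded = [load_model(p) for p in paths]
```

An object that holds only such paths stays small when pickled, but it stops working once the folder is deleted. This is the trade-off behind the caveat that pickled HPO models still rely on the files in tmp_folder.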

v1.4.0

24 May 18:59

What's Changed

  • moved some imports to the new 'models' optional dependencies to make the base RealMLP installation more lightweight
  • Added GPU support for CatBoost with help from Maximilian Schambach in #16 (not guaranteed to produce exactly the same results).
  • Ensembling now saves models after training if a path is supplied, to reduce memory usage
  • Added more search spaces
  • fixed error in multiquantile output when the passed y was one-dimensional instead of having shape (n_samples, 1)
  • Added some examples to the documentation

v1.3.0

12 Mar 21:44

  • Added multiquantile regression for RealMLP: see the documentation
  • More hyperparameters for RealMLP
  • Added TabICL wrapper
  • Small fixes
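
For readers unfamiliar with multiquantile regression, the sketch below evaluates predictions for several quantiles at once with the standard pinball (quantile) loss. Whether RealMLP trains with exactly this objective is an assumption here. The function names and the toy data are illustrative, and the library documentation remains the reference for the actual interface.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a single quantile level q in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)

def multiquantile_loss(y_true, y_pred_rows, quantiles):
    """Average pinball loss across quantiles.

    y_pred_rows[i][j] is the prediction for sample i at quantiles[j].
    """
    per_quantile = [
        pinball_loss(y_true, [row[j] for row in y_pred_rows], q)
        for j, q in enumerate(quantiles)
    ]
    return sum(per_quantile) / len(quantiles)

y_true = [1.0, 2.0, 3.0]
quantiles = [0.25, 0.5, 0.75]
# One row per sample, one column per quantile.
y_pred_rows = [[0.5, 1.0, 1.5], [1.5, 2.0, 2.5], [2.5, 3.0, 3.5]]
loss = multiquantile_loss(y_true, y_pred_rows, quantiles)
```

The asymmetric weighting is what pushes each output column towards its own quantile. For q = 0.9, predicting too low costs nine times as much as predicting too high by the same amount.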