Commit f5da089

update news
1 parent 368011e commit f5da089

1 file changed

+27
-0
lines changed

NEWS.md

Lines changed: 27 additions & 0 deletions
@@ -1,3 +1,30 @@
-------
stemflow version 1.1.6
-------
**Oct, 2025**

Fixed several issues: a prediction bug and a lazy-loading bug; updated the plotting function and the docs. #82. Also fixed an earlier bug: after an attribute of a `LazyLoadingEstimator` object was accessed, the model was not automatically dumped again.
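To illustrate the behavior described above, here is a minimal sketch of a lazy-loading wrapper that reloads a model from disk when an attribute is requested and dumps it again right afterwards. `LazyEstimatorSketch`, its methods, and the pickle-based storage are assumptions made for illustration; this is not stemflow's `LazyLoadingEstimator` implementation.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace


class LazyEstimatorSketch:
    """Hypothetical stand-in, NOT stemflow's LazyLoadingEstimator."""

    def __init__(self, estimator, path):
        self._estimator = estimator  # wrapped model, or None once dumped
        self._path = path            # where the model is stored on disk

    def dump(self):
        """Write the wrapped model to disk and release it from memory."""
        if self._estimator is not None:
            with open(self._path, "wb") as f:
                pickle.dump(self._estimator, f)
            self._estimator = None

    def load(self):
        """Read the model back from disk if it is not in memory."""
        if self._estimator is None:
            with open(self._path, "rb") as f:
                self._estimator = pickle.load(f)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for names
        # that belong to the wrapped model rather than to the wrapper.
        if name.startswith("_"):
            raise AttributeError(name)
        self.load()
        try:
            return getattr(self._estimator, name)
        finally:
            # Auto-dump after every attribute access so the model
            # does not stay in memory.
            self.dump()


# A SimpleNamespace plays the role of a fitted model here.
with tempfile.TemporaryDirectory() as tmp:
    lazy = LazyEstimatorSketch(
        SimpleNamespace(coef_=[0.5, 1.5]), os.path.join(tmp, "model.pkl")
    )
    lazy.dump()
    coef = lazy.coef_
    in_memory_after_access = lazy._estimator is not None
```

In this sketch the model is re-pickled after each access, which is simple but only suitable for read-only attributes; a method that mutates the model would need to finish before the dump.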

-------
stemflow version 1.1.5
-------
**Oct, 2025**

This is a large update.

Features:
1. The major change is that the `AdaSTEM` class now supports a `duckdb` or `parquet` file path as input. This allows the user to pass in a large dataset without duplicating the pandas dataframe across processes when running parallel computing with `n_jobs` > 1. See the new Jupyter notebooks for details. #76
2. Lazy loading is no longer implemented by the `LazyLoadingEnsemble` class. Instead, it is implemented by `LazyLoadingEstimator`. This allows each model to be dumped as soon as its training or prediction is finished, so models (and hence memory) no longer accumulate until training is finished for the whole ensemble. This greatly reduces memory use. See the new Jupyter notebooks for details. #77
3. `n_jobs` > `ensemble_folds` is no longer supported, for user-facing clarity. Jobs are parallelized by ensemble fold, so `n_jobs` > `ensemble_folds` is meaningless. We do not want to mislead users into thinking that a 10-ensemble model will train faster with `n_jobs=20` than with `n_jobs=10`.
4. These features will not be available in `SphereAdaSTEM` because of its small user base and the negligible advantages. #75
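The pattern behind features 1 and 3 can be sketched roughly as follows: each fold-level worker receives only a file path and reads the rows it needs, and the number of workers is capped at the number of folds. This is an illustration under stated assumptions, not stemflow's code: the standard-library `sqlite3` stands in for `duckdb`/`parquet`, threads stand in for separate processes, and `fold_mean`, `run_folds`, and the `obs` table are hypothetical names.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def fold_mean(db_path, fold):
    """Worker: open the file by path and read only this fold's rows."""
    con = sqlite3.connect(db_path)
    try:
        (mean,) = con.execute(
            "SELECT AVG(y) FROM obs WHERE fold = ?", (fold,)
        ).fetchone()
    finally:
        con.close()
    return fold, mean


def run_folds(db_path, ensemble_folds, n_jobs):
    # Work is split per fold, so workers beyond the fold count would idle.
    if n_jobs > ensemble_folds:
        raise ValueError("n_jobs must not exceed ensemble_folds")
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Only the path string is handed to each worker, not the data.
        results = pool.map(lambda k: fold_mean(db_path, k), range(ensemble_folds))
        return dict(results)


# Build a tiny on-disk table (sqlite3 as a stand-in for duckdb/parquet).
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "obs.db")
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE obs (fold INTEGER, y REAL)")
    con.executemany(
        "INSERT INTO obs VALUES (?, ?)", [(0, 1.0), (0, 3.0), (1, 10.0)]
    )
    con.commit()
    con.close()
    fold_means = run_folds(db_path, ensemble_folds=2, n_jobs=2)
```

Because the workers share a path rather than an in-memory dataframe, the data is never copied once per worker; each worker pulls only its own subset from disk.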

Major bugs fixed:
1. Previously, the models were stored in `self.model_dict` dynamically during parallel ensemble training, which means the dictionary was being altered during that process. However, `self` is passed as an input argument to the ensemble-level training function and is therefore serialized. This is not ideal, since an object being serialized should not be changing. This is fixed by assigning `model_dict` to `self` only after all training is finished.
2. Also fixed #74.
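The fix in item 1 follows a collect-then-assign pattern, which might look like the sketch below. `EnsembleSketch` and `_train_one_fold` are hypothetical names, the "models" are placeholder integers, and threads are used only to keep the example self-contained; with a process-based backend, `self` would be pickled for each task, which is where a changing `model_dict` becomes a problem.

```python
from concurrent.futures import ThreadPoolExecutor


class EnsembleSketch:
    """Hypothetical illustration, NOT stemflow's AdaSTEM."""

    def __init__(self, ensemble_folds):
        self.ensemble_folds = ensemble_folds
        self.model_dict = {}

    def _train_one_fold(self, fold):
        # Return this fold's models instead of writing into
        # self.model_dict while other folds are still running.
        return {f"fold{fold}_model": fold * 10}

    def fit(self, n_jobs=2):
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            fold_results = list(
                pool.map(self._train_one_fold, range(self.ensemble_folds))
            )
        merged = {}
        for part in fold_results:
            merged.update(part)
        # Single assignment, made only after every fold has finished.
        self.model_dict = merged
        return self


ens = EnsembleSketch(ensemble_folds=3).fit()
```

During training `self` stays unchanged, so whatever copy of it a worker receives is the same regardless of when it is serialized.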

-------
stemflow version 1.1.3
-------
