
Transformers, new features, transfer learning#424

Open
stewarthe6 wants to merge 44 commits into 1.8.0 from feat_scaled_rdkit_mordred
stewarthe6 (Collaborator) commented Feb 25, 2026

This is a large pull request with three new features and one removal.

  • Added functionality for transfer learning, or for using a previously trained AMPL model as a feature encoder.
  • Two new feature sets that scale RDKit and Mordred features.
  • A new option to fit transforms on larger or unlabeled datasets and then use them.
  • Removed the deprecated UMAP feature transformer.
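
To illustrate the third item, here is a minimal sketch of the general idea of fitting a transform on a larger unlabeled dataset and reusing it on a smaller labeled one. It does not use AMPL's API. The function names `fit_robust_scaler` and `apply_robust_scaler` are hypothetical, the scaling is a hand-rolled median/IQR stand-in for a robust scaler, and infilling non-finite values with the fitted median is an assumption made for the example, not necessarily what this PR does.

```python
import math
import statistics


def fit_robust_scaler(column):
    """Learn the median and interquartile range of one feature column.

    Non-finite entries (NaN, inf) are ignored when fitting.
    """
    finite = [x for x in column if math.isfinite(x)]
    median = statistics.median(finite)
    q1, _, q3 = statistics.quantiles(finite, n=4, method="inclusive")
    iqr = q3 - q1
    # Guard against constant columns, where the IQR is zero.
    return {"median": median, "iqr": iqr if iqr > 0 else 1.0}


def apply_robust_scaler(column, params):
    """Scale a column with previously fitted parameters.

    Assumption for this sketch: non-finite entries are infilled with the
    fitted median, so they map to 0 after scaling.
    """
    scaled = []
    for x in column:
        if not math.isfinite(x):
            x = params["median"]
        scaled.append((x - params["median"]) / params["iqr"])
    return scaled


# Larger unlabeled pool of one hypothetical descriptor, with one bad value.
unlabeled = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, float("nan")]
# Smaller labeled training set for the same descriptor.
labeled = [5.0, 9.0, float("inf")]

params = fit_robust_scaler(unlabeled)           # fit on the larger dataset
scaled = apply_robust_scaler(labeled, params)   # reuse on the labeled data
print(params, scaled)
```

The point of fitting on the larger pool is that the median and IQR are estimated from many more compounds than the labeled set contains, so the scaling should be less sensitive to the few labeled examples.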

stewarthe6 and others added 30 commits January 21, 2025 10:10
Ipc should not be changed to AvgIpc like this because it would break all rdkit_raw models.
…th RobustScaler and PowerTransformer. Updated documentation in related sections. Added functions to ModelFileReader to read out transformer specific parameters. Changed models that test RobustScaler and PowerTransformer to use RF to speed up the training
… it more generalizable. Fixed tests. Fixed bug where the imputer_strategy parameter was not used
…ndicator' flag because that changed the number of features and crashed.
…model, if transformers are saved and loaded correctly, and if transform_dataset_key_config is saved correctly
…r want to set that manually. Instead added a check when saving metadata to see if the parameters object has that attribute
…well as infill nan or extremely large values
codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##            1.8.0     #424      +/-   ##
==========================================
+ Coverage   40.51%   41.65%   +1.14%     
==========================================
  Files          50       51       +1     
  Lines       13518    13729     +211     
==========================================
+ Hits         5477     5719     +242     
+ Misses       8041     8010      -31     
Flag Coverage Δ
unittests 41.65% <100.00%> (+1.14%) ⬆️


Files with missing lines                          Coverage Δ
atomsci/ddm/pipeline/compare_models.py            41.08% <ø> (ø)
atomsci/ddm/pipeline/featurization.py             67.59% <100.00%> (+3.11%) ⬆️
atomsci/ddm/pipeline/model_datasets.py            68.35% <100.00%> (+0.26%) ⬆️
atomsci/ddm/pipeline/model_tracker.py             17.45% <ø> (ø)
atomsci/ddm/pipeline/model_wrapper.py             66.70% <100.00%> (+0.41%) ⬆️
atomsci/ddm/pipeline/parameter_parser.py          91.54% <100.00%> (+0.09%) ⬆️
atomsci/ddm/pipeline/transformations.py           70.69% <100.00%> (+12.82%) ⬆️
atomsci/ddm/utils/generate_transformers.py        100.00% <100.00%> (ø)
atomsci/ddm/utils/hyperparam_search_wrapper.py    0.00% <ø> (ø)
atomsci/ddm/utils/model_file_reader.py            70.27% <100.00%> (+4.11%) ⬆️

... and 2 files with indirect coverage changes

…re_transformers when transformers is None. This does not test the pipeline with no transformers, just that the function returns correctly
- Tests that the heavyatom_col parameter is used correctly, including cases where there is no heavyatom_col.
- Tests that NotImplementedError is raised when there is no feature count or no way to featurize data.
- Tests that the Identity feature transforms are returned correctly and that an error is raised if an unrecognized feature transform is used.