Description
Feature type
- Add new functionality
- Change existing functionality
General description of the proposed functionality
Some features measured during morphological profiling depend on the position or rotation of the microscope. Simple examples are centroid and orientation measurements. Other examples include measurements on bounding boxes: the image below shows how the bounding box area of a cell changes when the microscope is rotated.
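The bounding box effect can also be reproduced numerically. The snippet below is only a minimal sketch using NumPy: the "cell" is a made-up ellipse outline, and `bounding_box_area` and `rotate` are hypothetical helpers written for this illustration, not CellProfiler or pycytominer functions.

```python
import numpy as np


def bounding_box_area(points):
    """Area of the axis-aligned bounding box around an (N, 2) array of points."""
    extent = points.max(axis=0) - points.min(axis=0)
    return float(extent[0] * extent[1])


def rotate(points, degrees):
    """Rotate (N, 2) points about the origin by the given angle in degrees."""
    theta = np.deg2rad(degrees)
    rotation = np.array([
        [np.cos(theta), -np.sin(theta)],
        [np.sin(theta), np.cos(theta)],
    ])
    return points @ rotation.T


# Made-up cell outline: an ellipse with semi-axes 4 and 1.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
cell = np.column_stack([4.0 * np.cos(t), np.sin(t)])

# The same object gives a different bounding box area at each rotation.
for angle in (0, 45, 90):
    print(angle, round(bounding_box_area(rotate(cell, angle)), 2))
```

The shape itself never changes here, only its orientation relative to the image axes, so any difference in the printed areas is purely technical.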
CellProfiler, for example, produces several of these measurements. When used for machine learning or statistical analysis, they introduce technical noise and can contribute to batch effects and data leakage.
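As a rough illustration of which CellProfiler columns fall into this category, the sketch below flags them by name with plain pandas. The pattern list, the `position_dependent_features` helper, and the example column names are assumptions made for this illustration; they are not a vetted list and not part of pycytominer.

```python
import re

import pandas as pd

# Hypothetical, hand-picked patterns for position- or rotation-dependent
# CellProfiler-style feature names. Illustrative only, not exhaustive.
POSITION_DEPENDENT_PATTERNS = [
    r"_Location_",
    r"_AreaShape_Center_",
    r"_AreaShape_Orientation$",
    r"_AreaShape_BoundingBox",
]


def position_dependent_features(columns, patterns=POSITION_DEPENDENT_PATTERNS):
    """Return the column names that match any of the given regex patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        column
        for column in columns
        if any(pattern.search(column) for pattern in compiled)
    ]


# Toy profile table with made-up values.
profiles = pd.DataFrame({
    "Metadata_Well": ["A01", "A02"],
    "Cells_AreaShape_Area": [410.0, 385.0],
    "Cells_AreaShape_Center_X": [12.5, 230.1],
    "Cells_AreaShape_Orientation": [-35.2, 80.4],
    "Cells_AreaShape_BoundingBoxArea": [600.0, 540.0],
    "Nuclei_Location_Center_Y": [14.0, 228.7],
})

flagged = position_dependent_features(profiles.columns)
cleaned = profiles.drop(columns=flagged)
print(flagged)
print(list(cleaned.columns))
```

Matching on names keeps the check independent of the data values, but it relies on the naming convention of the software that produced the measurements.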
Feature example
I have a trial solution for this. It requires the user to specify which software generated their measurements, then iterates over the feature names and matches them against manually identified patterns of variant features. My solution extends feature_select like this:
from pycytominer import feature_select

non_variant = feature_select(
    normalized_df,
    operation="drop_non_bio_variant",
    drop_non_bio_variant_data_source="cellprofiler",
)
Alternative Solutions
No response
Additional information
No response