Have you considered using aggregate profiles (e.g., at the well level) instead of single-cell data? The high variance of single-cell data may be degrading the ranking. Scores below 0.50 are hard to interpret because we can't tell whether the replicates are poor or the treatment effect is simply small. Cellular heterogeneity complicates this further, since different subsets of cells may respond differently. Have you visualized the single-cell populations with UMAP to check for heterogeneity between the control and treatment groups?
Originally posted by @axiomcura in #74 (comment)
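
For reference, a minimal sketch of what well-level aggregation could look like, assuming each single-cell row is a dict holding a well identifier plus numeric features. The column name `Metadata_Well`, the feature names, and the helper `aggregate_by_well` are hypothetical and not taken from this repository. The median is used here as one common choice for collapsing cells into a well profile; it is not necessarily what the pipeline should use.

```python
from collections import defaultdict
from statistics import median


def aggregate_by_well(cells, well_key="Metadata_Well"):
    """Collapse single-cell rows into one median profile per well.

    Assumes every column other than `well_key` is a numeric feature.
    """
    # well -> feature name -> list of per-cell values
    grouped = defaultdict(lambda: defaultdict(list))
    for cell in cells:
        well = cell[well_key]
        for name, value in cell.items():
            if name != well_key:
                grouped[well][name].append(value)

    # Take the per-feature median across all cells in each well
    return {
        well: {name: median(values) for name, values in features.items()}
        for well, features in grouped.items()
    }


# Toy single-cell data (made up for illustration only)
cells = [
    {"Metadata_Well": "A01", "feature_1": 1.0, "feature_2": 10.0},
    {"Metadata_Well": "A01", "feature_1": 3.0, "feature_2": 14.0},
    {"Metadata_Well": "A01", "feature_1": 2.0, "feature_2": 12.0},
    {"Metadata_Well": "B01", "feature_1": 5.0, "feature_2": 20.0},
    {"Metadata_Well": "B01", "feature_1": 7.0, "feature_2": 22.0},
]

profiles = aggregate_by_well(cells)
for well, profile in profiles.items():
    print(well, profile)
```

If the profiles already live in a pandas DataFrame, the same idea is roughly `df.groupby("Metadata_Well").median(numeric_only=True)`. Scoring the aggregated profiles alongside the single-cell ones would show whether the low scores come from per-cell noise or from a weak treatment effect.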