Labels
Area – API, Area – Differential Expression, good first issue
Description
What kind of feature would you like to request?
Additional function parameters / changed functionality / changed defaults?
Please describe your wishes
The current `aggregate` function does not report how many observations were combined into each aggregated row, e.g., cells into metacells or replicates into conditions.
Offhand, since `by` can be a list, I think we want at least two `obs` columns added:
1. `n_obs_aggregated` represents the total number of observations that have been aggregated into a given row of the returned `AnnData` object.
2. `f"n_{by[i]}_aggregated"` for each entry of `by` counts how many of a given subgroup are present if `by` is a list. So if you only `aggregate` by cell type, i.e., `by="cell_type"`, this column would not be present because it would not make sense: that value is simply `n_obs_aggregated`. But with `by=["patient", "cell_type"]`, `n_obs_aggregated` would be the number of cells present in each patient-celltype row, `n_cell_type_aggregated` the number of cells of that cell type present in the row, and `n_patient_aggregated` the number of patients.
If we do point 2., we had better settle on the naming convention up front, because altering it down the line would effectively be a breaking change (until scanpy 2.0).
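To make the proposal concrete, here is a rough sketch of how the two kinds of columns could be computed from `obs` with plain pandas. The helper name `aggregation_counts` is hypothetical and not part of scanpy. The per-key columns are interpreted here as marginal totals (all observations sharing that key's value, across the other keys), which is only one possible reading of point 2. and would need to be agreed on.

```python
import pandas as pd


def aggregation_counts(obs: pd.DataFrame, by) -> pd.DataFrame:
    """Hypothetical helper: count observations per aggregated row.

    Not a scanpy function; a sketch of the columns proposed above.
    """
    keys = [by] if isinstance(by, str) else list(by)

    # n_obs_aggregated: number of observations behind each aggregated row.
    counts = (
        obs.groupby(keys, observed=True)
        .size()
        .rename("n_obs_aggregated")
        .reset_index()
    )

    # Per-key columns only make sense when grouping by more than one key.
    # ASSUMPTION: each is the marginal count of observations for that key value.
    if len(keys) > 1:
        for key in keys:
            marginal = obs[key].value_counts()
            counts[f"n_{key}_aggregated"] = counts[key].map(marginal).astype(int)

    return counts


# Toy obs table: five cells from two patients and two cell types.
obs = pd.DataFrame(
    {
        "patient": ["p1", "p1", "p1", "p2", "p2"],
        "cell_type": ["T", "T", "B", "T", "B"],
    }
)
single = aggregation_counts(obs, "cell_type")
multi = aggregation_counts(obs, ["patient", "cell_type"])
print(multi)
```

With a single key, only `n_obs_aggregated` is produced, matching the case described in point 2. where the per-key column would be redundant.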