Commit e4f0b7c

Metrics: using batches for ARI and NMI (clustering_overlap) (#68)
* working ari and nmi batch
* consistent naming
* add changelog
1 parent c6bb27a commit e4f0b7c

3 files changed: 55 additions, 2 deletions
CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -5,6 +5,8 @@
 * Added `metrics/kbet_pg` and `metrics/kbet_pg_label` components (PR #52).
 * Added `method/drvi` component (PR #61).
 
+* Added `ARI_batch` and `NMI_batch` to `metrics/clustering_overlap` (PR #68).
+
 ## Minor changes
 
 * Un-pin the scPRINT version and update parameters (PR #51)

src/metrics/clustering_overlap/config.vsh.yaml

Lines changed: 45 additions & 0 deletions
@@ -49,6 +49,51 @@ info:
       min: 0
       max: 1
       maximize: true
+    - name: ari_batch
+      label: ARI_batch
+      summary: This version of Adjusted Rand Index compares clustering overlap, correcting for batches
+        and considering correct overlaps and disagreements.
+      description: |
+        The Adjusted Rand Index (ARI) compares the overlap of two clusterings;
+        it considers both correct clustering overlaps while also counting correct
+        disagreements between two clusterings. We compared the batches with the
+        NMI-optimized Louvain clustering computed on the integrated dataset.
+        The adjustment of the Rand index corrects for randomly correct labels.
+        An ARI_batch of 0 or 1 corresponds to no batch correction or well corrected batches,
+        respectively.
+      references:
+        doi:
+          - 10.1038/s41592-021-01336-8
+          - 10.1007/bf01908075
+      links:
+        homepage: https://scib.readthedocs.io/en/latest/
+        documentation: https://scib.readthedocs.io/en/latest/api/scib.metrics.silhouette_batch.html
+        repository: https://github.com/theislab/scib
+      min: 0
+      max: 1
+      maximize: true
+    - name: nmi_batch
+      label: NMI_batch
+      summary: This version of NMI compares overlap by scaling using mean entropy terms and optimizing
+        Louvain clustering to obtain the best outcome of batch correction.
+      description: |
+        Normalized Mutual Information (NMI) compares the overlap of two clusterings.
+        We used NMI to compare the batches with Louvain clusters computed on
+        the integrated dataset. The overlap was scaled using the mean of the entropy terms
+        for cell-type and cluster labels, then subtracted from 1. Thus, NMI_batch scores of 0 or 1 correspond to no batch correction
+        or well corrected batches, respectively. We performed optimized Louvain clustering
+        for this metric to obtain the best outcome of batch correction.
+      references:
+        doi:
+          - 10.1145/2808797.2809344
+          - 10.1038/s41592-021-01336-8
+      links:
+        homepage: https://scib.readthedocs.io/en/latest/
+        documentation: https://scib.readthedocs.io/en/latest/api/scib.metrics.silhouette_batch.html
+        repository: https://github.com/theislab/scib
+      min: 0
+      max: 1
+      maximize: true
 arguments:
   - name: --resolutions
     type: double
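
The `nmi_batch` description can be read as the formula below. This is an interpretation of the wording ("scaled using the mean of the entropy terms", then subtracted from 1), not a formula stated in the commit. The symbols are assumptions introduced here: $B$ is the batch labels (the script passes `label_key="batch"`), $C$ is the Louvain cluster labels, $I$ is mutual information and $H$ is entropy.

```latex
\mathrm{NMI}(B, C) = \frac{I(B; C)}{\tfrac{1}{2}\bigl[H(B) + H(C)\bigr]},
\qquad
\mathrm{NMI\_batch} = 1 - \mathrm{NMI}(B, C)
```

Under this reading, clusters that reproduce the batches exactly give $\mathrm{NMI} = 1$ and so $\mathrm{NMI\_batch} = 0$, while clusters that are statistically independent of the batches give $\mathrm{NMI} = 0$ and $\mathrm{NMI\_batch} = 1$.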

src/metrics/clustering_overlap/script.py

Lines changed: 8 additions & 2 deletions
@@ -47,14 +47,20 @@
 print('Compute NMI score', flush=True)
 nmi_score = nmi(adata, cluster_key=cluster_key, label_key="cell_type")
 
+print('Compute ARI score with batches', flush=True)
+ari_batch_score = 1 - ari(adata, cluster_key=cluster_key, label_key="batch")
+
+print('Compute NMI score with batches', flush=True)
+nmi_batch_score = 1 - nmi(adata, cluster_key=cluster_key, label_key="batch")
+
 print("Create output AnnData object", flush=True)
 output = ad.AnnData(
     uns={
         "dataset_id": adata.uns['dataset_id'],
         'normalization_id': adata.uns['normalization_id'],
         "method_id": adata.uns['method_id'],
-        "metric_ids": [ "ari", "nmi" ],
-        "metric_values": [ ari_score, nmi_score ]
+        "metric_ids": [ "ari", "nmi", "ari_batch", "nmi_batch" ],
+        "metric_values": [ ari_score, nmi_score, ari_batch_score, nmi_batch_score ]
     }
 )
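
To illustrate why the script reports `1 - ari(...)` and `1 - nmi(...)` against the batch labels, here is a minimal standard-library sketch. It is not the `scib` implementation used in `script.py`. The helper names (`adjusted_rand_index`, `normalized_mutual_info`, `batch_scores`) and the toy labels are hypothetical, and the arithmetic-mean normalization is assumed from the metric description.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two labelings of the same items."""
    pairs = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = pairs_a * pairs_b / comb(len(a), 2)
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (pairs - expected) / (maximum - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(a, b):
    """Mutual information divided by the arithmetic mean of both entropies."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    mi = sum(
        c / n * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in Counter(zip(a, b)).items()
    )
    mean_entropy = (entropy(a) + entropy(b)) / 2
    return mi / mean_entropy if mean_entropy > 0 else 1.0


def batch_scores(batch_labels, cluster_labels):
    """Invert the overlap so that higher means better batch mixing."""
    ari_batch = 1 - adjusted_rand_index(batch_labels, cluster_labels)
    nmi_batch = 1 - normalized_mutual_info(batch_labels, cluster_labels)
    return ari_batch, nmi_batch


# Hypothetical toy data: two batches of four cells each.
batches = ["b1"] * 4 + ["b2"] * 4

# Clusters that reproduce the batches: no batch correction.
unmixed_clusters = [0, 0, 0, 0, 1, 1, 1, 1]
ari_unmixed, nmi_unmixed = batch_scores(batches, unmixed_clusters)

# Clusters that split each batch evenly: batches are well mixed.
mixed_clusters = [0, 0, 1, 1, 0, 0, 1, 1]
ari_mixed, nmi_mixed = batch_scores(batches, mixed_clusters)

print(ari_unmixed, nmi_unmixed)
print(ari_mixed, nmi_mixed)
```

In this sketch, clusters that coincide with the batches score 0 on both metrics, and evenly mixed clusters score 1 on `nmi_batch`. Because the ARI can be negative when two labelings agree less than chance, the raw `1 - ARI` value can exceed 1 for such inputs, which is worth keeping in mind given the declared `max: 1`.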
