@amalia-k510 (Contributor) commented Apr 25, 2025

This adds two functions that compute the modularity score of a clustering (for example, from Leiden or Louvain) on a given graph. The goal is to make it easier to compare different community detection methods using an external metric. To my knowledge, there is currently no built-in way to compare clustering results or to calculate a modularity score.

Fixes #2908

Functions:

  • modularity(): accepts a connectivity matrix and a label array, builds an igraph graph from them, and returns the modularity score
  • modularity_adata(): AnnData wrapper that pulls the graph and the clustering labels from an AnnData object
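
For readers unfamiliar with the metric, here is a minimal pure-Python sketch of the quantity these functions return, assuming an undirected graph given as a dense, symmetric adjacency matrix. It is not the implementation in this PR, which delegates to igraph; the name `modularity_dense` and the toy graph are made up for illustration.

```python
def modularity_dense(adjacency, labels):
    """Newman modularity Q of a labelling on an undirected graph.

    Q = 1/(2m) * sum_ij (A_ij - k_i * k_j / (2m)) * [labels_i == labels_j]

    `adjacency` is a symmetric list of lists (weights allowed),
    `labels` holds one community label per node.
    Illustrative only: O(n^2), dense input, no igraph involved.
    """
    n = len(adjacency)
    if len(labels) != n:
        raise ValueError("labels must have one entry per node")
    degrees = [sum(row) for row in adjacency]  # k_i
    two_m = sum(degrees)                       # 2m = total degree
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adjacency[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adjacency = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adjacency[a][b] = adjacency[b][a] = 1

q_split = modularity_dense(adjacency, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity_dense(adjacency, [0, 0, 0, 0, 0, 0])  # everything in one community
```

Worked by hand, the split labelling gives Q = 2 · (3/7 − (7/14)²) = 5/14 ≈ 0.357, while putting every node in one community gives Q = 0, so the split is the better partition under this metric.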

@amalia-k510 marked this pull request as ready for review April 25, 2025 11:24
@flying-sheep added this to the 1.12.0 milestone Apr 25, 2025
@flying-sheep (Member) left a comment

Hi! Apart from the issue with igraph being an optional dependency, this looks good!

I think we have get_igraph_from_adjacency which might be useful, but maybe not.

Could you please add tests? We have many examples of how to do that. The best approach would probably be:

  1. for the direct variant, manually create very small graphs to run this on so you can be sure the results are correct
  2. for the anndata version, use neighbors to create the connectivity matrix.

Please add @needs.igraph so the test only runs when igraph is installed.

If you’re unsure about anything, please search the code for examples or ask me!

If you end up implementing a non-igraph flavor for this, please test using parametrization, e.g.: @pytest.mark.parametrize("directed", [True, False], ids=["directed", "undirected"])

@flying-sheep (Member) left a comment

Hi, this is shaping up! The tests are looking particularly great!

Please address these previous comments:

  1. Please follow the doc style as described here: #3613 (comment)

     Also I mentioned in there “But here a is_directed: bool parameter would be better anyway, there will never be more than two options.”

     What do you think?

  2. Don’t densify (in modularity_adata or anywhere): #3613 (comment)
  3. Why the np.asarray? #3613 (comment)

@flying-sheep (Member)

Oh, I just saw that we have get_igraph_from_adjacency, can you use that?

@flying-sheep (Member) left a comment

Looks good! A few points:

  • please add a release note (see the other PR, please adapt the PR number in the command I mention there when running it for this PR)
  • please move the correct change to unique_labels (to_numpy) from the other PR into this one:
    unique_labels = pd.unique(np.concatenate((orig.to_numpy(), new.to_numpy())))
  • please change mode to is_directed: bool (no default value) for modularity

Once your other PR (the neighbors one) is merged, let’s finish this one up:

  • we should make neighbors store an is_directed param in .uns[key_added or "neighbors"]["params"]
  • then we change the modularity_adata parameter obsp: str to key: str = "neighbors"
  • then we use adata.uns[key]["connectivities_key"] and adata.uns[key]["params"]["is_directed"] in modularity_adata to call modularity.
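
The lookup described in these three steps could be sketched roughly as follows. This is only an illustration: `resolve_neighbors_graph` is a hypothetical helper name, a `SimpleNamespace` stands in for an AnnData object so the snippet runs without scanpy, and the `is_directed` entry under `params` is the one proposed above, so it is written here by hand rather than assumed to be stored by neighbors already.

```python
from types import SimpleNamespace


def resolve_neighbors_graph(adata, key="neighbors"):
    """Return (connectivities, is_directed) for the neighbors run stored under `key`.

    Hypothetical helper: `adata` only needs `.uns` and `.obsp` mappings.
    """
    if key not in adata.uns:
        raise KeyError(f"No neighbors entry {key!r} found in .uns")
    neighbors = adata.uns[key]
    # Name of the .obsp slot that holds the connectivity matrix
    connectivities = adata.obsp[neighbors["connectivities_key"]]
    params = neighbors["params"]
    if "is_directed" not in params:
        # Proposed parameter; data written before the change would lack it
        raise KeyError(f"'is_directed' is missing from .uns[{key!r}]['params']")
    return connectivities, params["is_directed"]


# Stand-in for an AnnData object after a neighbors call (values made up)
adata = SimpleNamespace(
    uns={
        "neighbors": {
            "connectivities_key": "connectivities",
            "params": {"is_directed": False},
        }
    },
    obsp={"connectivities": [[0, 1], [1, 0]]},
)

graph, is_directed = resolve_neighbors_graph(adata)
```

The returned pair is what modularity_adata would then pass on to modularity. Raising on a missing `is_directed` entry, rather than guessing a default, mirrors the "no default value" request for modularity above, but that choice is an assumption of this sketch.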

@flying-sheep changed the title from "Draft PR: Add modularity and modularity_adata functions to scanpy.metrics" to "feat: add modularity and modularity_adata to scanpy.metrics" Dec 5, 2025
@codecov (bot) commented Jan 13, 2026

❌ 1 Tests Failed:

Tests completed | Failed | Passed | Skipped
1626            | 1      | 1625   | 1111
View the top 1 failed test(s) by shortest run time
tests/test_plotting.py::test_highest_expr_genes[layer_name-None]
Stack Traces | 0.422s run time
image_comparer = <function image_comparer.<locals>.save_and_compare at 0x7f0f5a683110>
col = None, layer = 'layer_name'

    @pytest.mark.parametrize("col", [None, "symb"])
    @pytest.mark.parametrize("layer", [None, "layer_name"])
    def test_highest_expr_genes(image_comparer, col, layer):
        save_and_compare_images = partial(image_comparer, ROOT, tol=5)

        adata = pbmc3k()
        if layer is not None:
            adata.layers[layer] = adata.X
            del adata.X
        # check that only existing categories are shown
        adata.var["symb"] = adata.var_names.astype("category")

        sc.pl.highest_expr_genes(adata, 20, gene_symbols=col, layer=layer, show=False)

>       save_and_compare_images("highest_expr_genes")

tests/test_plotting.py:67:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

expected = '.../_images/highest_expr_genes/expected.png'
actual = '.../_images/highest_expr_genes/actual.png'
tol = 5, in_decorator = True

    def compare_images(expected, actual, tol, in_decorator=False):
        """
        Compare two "image" files checking differences within a tolerance.

        The two given filenames may point to files which are convertible to
        PNG via the `.converter` dictionary. The underlying RMS is calculated
        with the `.calculate_rms` function.

        Parameters
        ----------
        expected : str
            The filename of the expected image.
        actual : str
            The filename of the actual image.
        tol : float
            The tolerance (a color value difference, where 255 is the
            maximal difference).  The test fails if the average pixel
            difference is greater than this value.
        in_decorator : bool
            Determines the output format. If called from image_comparison
            decorator, this should be True. (default=False)

        Returns
        -------
        None or dict or str
            Return *None* if the images are equal within the given tolerance.

            If the images differ, the return value depends on  *in_decorator*.
            If *in_decorator* is true, a dict with the following entries is
            returned:

            - *rms*: The RMS of the image difference.
            - *expected*: The filename of the expected image.
            - *actual*: The filename of the actual image.
            - *diff_image*: The filename of the difference image.
            - *tol*: The comparison tolerance.

            Otherwise, a human-readable multi-line string representation of this
            information is returned.

        Examples
        --------
        ::

            img1 = "./baseline/plot.png"
            img2 = "./output/plot.png"
            compare_images(img1, img2, 0.001)

        """
        actual = os.fspath(actual)
        if not os.path.exists(actual):
            raise Exception(f"Output image {actual} does not exist.")
        if os.stat(actual).st_size == 0:
>           raise Exception(f"Output image file {actual} is empty.")
E           Exception: Output image file .../_images/highest_expr_genes/actual.png is empty.

../../../..../scanpy/B9PcT7QG/hatch-test.few-extras/lib/python3.14.../matplotlib/testing/compare.py:446: Exception


@flying-sheep enabled auto-merge (squash) January 13, 2026 12:01
@flying-sheep disabled auto-merge January 13, 2026 12:02
@flying-sheep changed the title from "feat: add modularity and modularity_adata to scanpy.metrics" to "feat: add modularity to scanpy.metrics" Jan 19, 2026
Development

Successfully merging this pull request may close these issues.

Get modularity score after community detection in leiden/louvain

2 participants