Louvain Step 2 appears to drop isolated nodes #617

@sfc-gh-sozer

Description

Louvain Step 2 appears to construct the community graph from only those edges that connect different communities. See lines 146-155 in source.

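This means that any isolated nodes (or disconnected communities) in the graph are eliminated from the results, because a community with no edge to another community contributes no edges to the community graph. Instead, they should be kept as their own communities: in Step 2, isolated nodes should be represented as self-edges.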
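To illustrate the idea, here is a minimal sketch in plain Scala (no Spark). It is not the project's code: `CommunityGraphSketch`, `compress`, and `communityOf` are hypothetical names, and counting each intra-community edge once in the self-edge weight is an assumption (some Louvain implementations use a different convention).

```scala
// Hypothetical sketch, not the library's implementation.
object CommunityGraphSketch {
  // (source vertex, destination vertex, weight)
  type Edge = (Long, Long, Double)

  // Aggregate vertex-level edges into community-level edges.
  // Edges inside one community are kept as a self-edge (c, c) rather than
  // being filtered out, so a community with no outside links still appears.
  def compress(edges: Seq[Edge], communityOf: Map[Long, Long]): Map[(Long, Long), Double] =
    edges
      .map { case (src, dst, w) =>
        val a = communityOf(src)
        val b = communityOf(dst)
        // Normalise the pair so (a, b) and (b, a) are summed together.
        ((math.min(a, b), math.max(a, b)), w)
      }
      .groupBy(_._1)
      .map { case (pair, ws) => pair -> ws.map(_._2).sum }
}
```

With edges `(0, 1)`, `(1, 2)` and `(20, 21)` and vertices 20 and 21 both assigned to community 20, the result contains the key `(20, 20)`. Filtering on `a != b` before aggregating would drop that entry, which matches the behavior described above.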

Steps to reproduce the behavior:

Consider the practice problem at https://www.nebula-graph.io/posts/practice-graphx-nebula-graph-algorithm. Add a pair of vertices that is disconnected from the rest of that graph (an edge between nodes 20 and 21), using an edge set like this:

// mySess is assumed to be an existing SparkSession
val nebSample = mySess.createDataFrame(Seq(
    (0, 2, 1),
    (0, 3, 1),
    (0, 4, 1),
    (0, 5, 1),
    (1, 4, 1),
    (1, 2, 1),
    (1, 7, 1),
    (2, 4, 1),
    (2, 6, 1),
    (2, 5, 1),
    (3, 7, 1),
    (4, 10, 1),
    (5, 7, 1),
    (5, 11, 1),
    (6, 7, 1),
    (6, 11, 1),
    (8, 9, 1),
    (8, 10, 1),
    (8, 11, 1),
    (8, 14, 1),
    (8, 15, 1),
    (9, 14, 1),
    (9, 12, 1),
    (10, 12, 1),
    (10, 13, 1),
    (10, 14, 1),
    (11, 13, 1),
    (20, 21, 1)
  ) 
).toDF("v1", "v2", "weight")

Expected behavior

Community 20, with the pair of vertices (20, 21) in its INNERVERTICES array, should appear in the results.
Instead, community 20 is excluded from the results.

Additional context

I have designed an approach to Phase 2 that fixes this problem, if you are interested...
