Description
Step 2 of Louvain appears to build the community graph from only those edges that connect different communities (see lines 146-155 in source).
As a result, any isolated nodes (or disconnected communities) in the graph are dropped from the results. They should instead appear as their own communities: in Step 2, they should be kept in the community graph as self-edges.
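To illustrate the difference, here is a minimal sketch in plain Scala collections (no Spark or GraphX), not the library's actual code. The names `interCommunityOnly`, `withSelfEdges`, and `communities`, the toy edge list, and the community assignment are all hypothetical. It also assumes that intra-community weight is simply summed onto a self-edge; the real implementation may use a different weight convention.

```scala
object CommunityGraphSketch {
  // (src, dst, weight) edge of the original graph.
  type Edge = (Long, Long, Double)

  // Variant A (the behaviour described above): keep only edges whose
  // endpoints fall in different communities.
  def interCommunityOnly(edges: Seq[Edge], community: Map[Long, Long]): Map[(Long, Long), Double] =
    edges
      .map { case (s, d, w) => ((community(s), community(d)), w) }
      .filter { case ((cs, cd), _) => cs != cd }
      .groupBy(_._1)
      .map { case (k, ws) => k -> ws.map(_._2).sum }

  // Variant B (assumed fix): keep intra-community edges as self-edges,
  // so a community with no outside neighbours still has an entry.
  def withSelfEdges(edges: Seq[Edge], community: Map[Long, Long]): Map[(Long, Long), Double] =
    edges
      .map { case (s, d, w) => ((community(s), community(d)), w) }
      .groupBy(_._1)
      .map { case (k, ws) => k -> ws.map(_._2).sum }

  // Community ids that survive into the community graph.
  def communities(g: Map[(Long, Long), Double]): Set[Long] =
    g.keys.flatMap { case (a, b) => Seq(a, b) }.toSet

  // Hypothetical toy input: two connected communities (0 and 2)
  // plus a disconnected pair (20, 21) assigned to community 20.
  val toyEdges: Seq[Edge] = Seq((0L, 1L, 1.0), (2L, 3L, 1.0), (1L, 2L, 1.0), (20L, 21L, 1.0))
  val toyCommunity: Map[Long, Long] =
    Map(0L -> 0L, 1L -> 0L, 2L -> 2L, 3L -> 2L, 20L -> 20L, 21L -> 20L)
}
```

With the toy input, `communities(interCommunityOnly(toyEdges, toyCommunity))` contains only communities 0 and 2, while `communities(withSelfEdges(toyEdges, toyCommunity))` also contains community 20.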
Steps to reproduce the behavior:
Take the practice problem at https://www.nebula-graph.io/posts/practice-graphx-nebula-graph-algorithm and add a pair of vertices that is disconnected from the rest of the graph (an edge between nodes 20 and 21), using an edge set like this:
val nebSample = mySess.createDataFrame(Seq(
(0, 2, 1),
(0, 3, 1),
(0, 4, 1),
(0, 5, 1),
(1, 4, 1),
(1, 2, 1),
(1, 7, 1),
(2, 4, 1),
(2, 6, 1),
(2, 5, 1),
(3, 7, 1),
(4, 10, 1),
(5, 7, 1),
(5, 11, 1),
(6, 7, 1),
(6, 11, 1),
(8, 9, 1),
(8, 10, 1),
(8, 11, 1),
(8, 14, 1),
(8, 15, 1),
(9, 14, 1),
(9, 12, 1),
(10, 12, 1),
(10, 13, 1),
(10, 14, 1),
(11, 13, 1),
(20, 21, 1)
)
).toDF("v1", "v2", "weight")
Expected behavior
Community 20, with the vertex pair (20, 21) in its INNERVERTICES array, should appear in the results.
Instead, community 20 is missing from the results.
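A quick way to spot dropped vertices is to compare the vertices in the edge list with those listed across all communities in the result. The sketch below is a plain-Scala illustration. The helper `missingVertices` is hypothetical, as is the `members` map (community id to its inner vertices), which stands in for however the result rows are collected. The values used are made-up toy data, not output from the library.

```scala
object CoverageCheck {
  // Vertices that occur in the edge list but in no community's member list.
  def missingVertices(edges: Seq[(Int, Int, Int)], members: Map[Int, Seq[Int]]): Set[Int] = {
    val all = edges.flatMap { case (a, b, _) => Seq(a, b) }.toSet
    val covered = members.values.flatten.toSet
    all -- covered
  }

  // Toy data for illustration only.
  val toyEdges: Seq[(Int, Int, Int)] = Seq((0, 1, 1), (20, 21, 1))
  val toyMembers: Map[Int, Seq[Int]] = Map(0 -> Seq(0, 1))
}
```

For the toy data, `missingVertices(toyEdges, toyMembers)` returns the set containing 20 and 21. An empty set would indicate that every vertex was assigned to some community.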
Additional context
I have designed an approach to Phase 2 that fixes this problem if you are interested...