|
| 1 | +--- |
| 2 | +title: leiden_community_detection |
| 3 | +description: Explore Memgraph's Leiden community detection capabilities and learn how to analyze the structure of complex networks. Access tutorials and comprehensive documentation to enhance your understanding of the Leiden community detection algorithm. |
| 4 | +--- |
| 5 | + |
| 6 | +import { Steps } from 'nextra/components' |
| 7 | +import { Callout } from 'nextra/components' |
| 8 | +import { Card, Cards } from 'nextra/components' |
| 9 | +import GitHub from '/components/icons/GitHub' |
| 10 | + |
| 11 | +# leiden_community_detection |
| 12 | + |
| 13 | +Communities in graphs mirror real-world communities, such as social circles. In a |
| 14 | +graph, communities are sets of nodes. M. Girvan and M. E. J. Newman note that |
| 15 | +nodes in a community connect more densely with each other than with outside |
| 16 | +nodes. |
| 17 | + |
| 18 | +This module employs the [Leiden |
| 19 | +algorithm](https://en.wikipedia.org/wiki/Leiden_algorithm) for community |
| 20 | +detection, based on the paper [*From Louvain to Leiden: guaranteeing well-connected |
| 21 | +communities*](https://arxiv.org/abs/1810.08473). The Leiden algorithm is a |
| 22 | +hierarchical clustering algorithm that greedily optimizes modularity, merges each |
| 23 | +community into a single node, and then repeats the process on the condensed |
| 24 | +graph. It improves on the Louvain algorithm by addressing a known limitation: |
| 25 | +some of the communities Louvain identifies are not well-connected. Leiden |
| 26 | +avoids this by periodically subdividing communities into smaller, |
| 27 | +well-connected groups. With an $\mathcal{O}(Lm)$ runtime for $m$ edges and $L$ |
| 28 | +iterations, it suits large graphs. The space complexity is |
| 29 | +$\mathcal{O}(VE)$ for $V$ nodes and $E$ edges. |
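| | + |
| | +For orientation, the objective being optimized can be sketched with the |
| | +standard resolution-weighted modularity. This form is not taken from the |
| | +module's source, so treat it as an assumption about what is being maximized. |
| | +The symbols $L_c$ and $K_c$ are introduced here for illustration; $m$ is the |
| | +number of edges and $\gamma$ corresponds to the `gamma` input: |
| | + |
| | +$$ |
| | +Q = \sum_{c} \left[ \frac{L_c}{m} - \gamma \left( \frac{K_c}{2m} \right)^2 \right] |
| | +$$ |
| | + |
| | +Here $L_c$ is the number of edges inside community $c$ and $K_c$ is the sum of |
| | +the degrees of its nodes. For weighted graphs, edge counts are replaced by sums |
| | +of edge weights. A larger $\gamma$ increases the penalty on communities with a |
| | +high total degree, which favors more, smaller communities. |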
| 30 | + |
| 31 | +<Cards> |
| 32 | + <Card |
| 33 | + icon={<GitHub />} |
| 34 | + title="Source code" |
| 35 | + href="https://github.com/memgraph/mage/blob/main/cpp/leiden_community_detection_module/leiden_community_detection_module.cpp" |
| 36 | + /> |
| 37 | +</Cards> |
| 38 | + |
| 39 | +| Trait | Value | |
| 40 | +| ------------------------ | --------------------- | |
| 41 | +| **Module type** | algorithm | |
| 42 | +| **Implementation** | C++ | |
| 43 | +| **Graph direction** | undirected | |
| 44 | +| **Relationship weights** | weighted / unweighted | |
| 45 | +| **Parallelism** | parallel | |
| 46 | + |
| 47 | +## Procedures |
| 48 | + |
| 49 | +<Callout type="info"> |
| 50 | +You can execute this algorithm on [graph projections, subgraphs or portions of |
| 51 | +the graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph). |
| 52 | +</Callout> |
| 53 | + |
| 54 | +### `get()` |
| 55 | + |
| 56 | +Computes graph communities using the Leiden algorithm. For more information on |
| 57 | +specific algorithm parameters and their use, refer to the paper [*From Louvain |
| 58 | +to Leiden: guaranteeing well-connected |
| 59 | +communities*](https://arxiv.org/abs/1810.08473). |
| 60 | + |
| 61 | +{<h4> Input: </h4>} |
| 62 | + |
| 63 | +- `subgraph: Graph` (**OPTIONAL**) ➡ A specific subgraph, which is an [object |
| 64 | + of type |
| 65 | + Graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph) |
| 66 | + returned by the `project()` function, on which the algorithm is run. If |
| 67 | + subgraph is not specified, the algorithm is computed on the entire graph by |
| 68 | + default. |
| 69 | +- `weight: string (default=null)` ➡ Specifies the name of the property |
| 70 | + containing the edge weight. Users can set their own weight property; if this |
| 71 | + property is not specified, the algorithm uses the `weight` edge attribute by |
| 72 | + default. If neither is set, each edge's weight defaults to `1`. To utilize a |
| 73 | + custom weight property, the user must set the |
| 74 | + `--storage-properties-on-edges=true` flag. |
| 75 | +- `gamma: double (default=1.0)` ➡ Resolution parameter used when computing the |
| 76 | + modularity. Higher resolutions lead to more, smaller communities, while lower |
| 77 | + resolutions lead to fewer, larger communities. |
| 78 | +- `theta: double (default=0.01)` ➡ Controls the randomness while breaking a |
| 79 | + community into smaller ones. |
| 80 | +- `resolution_parameter: double (default=0.01)` ➡ Minimum change in modularity |
| 81 | + that must be achieved when merging nodes within the same community. |
| 82 | +- `max_iterations: int (default=inf)` ➡ Maximum number of iterations the |
| 83 | + algorithm will perform. If set to infinity, the algorithm will run until |
| 84 | + convergence is reached. |
| 85 | + |
| 86 | +{<h4> Output: </h4>} |
| 87 | + |
| 88 | +- `node: Vertex` ➡ A graph node on which the algorithm was run, returned as |
| 89 | + part of the results. |
| 90 | +- `community_id: integer` ➡ Community ID. Defaults to $-1$ if the node does not |
| 91 | + belong to any community. |
| 92 | +- `communities: list` ➡ List representing the hierarchy of communities that a |
| 93 | + node has belonged to across iterations. |
| 94 | + |
| 95 | +{<h4> Usage: </h4>} |
| 96 | + |
| 97 | +Use the following query to detect communities: |
| 98 | + |
| 99 | +```cypher |
| 100 | +CALL leiden_community_detection.get() |
| 101 | +YIELD node, community_id, communities; |
| 102 | +``` |
| 103 | + |
| 104 | +<Callout type="info"> |
| 105 | +The algorithm throws an exception if no communities are detected. This can happen if in the first iteration |
| 106 | +all nodes merge into a single community or if each node forms its own. If this occurs, try adjusting the algorithm's `gamma` parameter. |
| 107 | +</Callout> |
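| | + |
| | +As a rough sketch of adjusting `gamma`, the query below passes a lower |
| | +resolution and reports community sizes. It assumes that arguments are |
| | +positional and follow the order of the input list above (with the optional |
| | +`subgraph` omitted), and that trailing arguments can be left at their |
| | +defaults; check the procedure signature in your Memgraph version before |
| | +relying on it. |
| | + |
| | +```cypher |
| | +// Assumption: positional order is weight, gamma, theta, |
| | +// resolution_parameter, max_iterations. |
| | +// Here: weight property name "weight", gamma lowered from 1.0 to 0.5. |
| | +CALL leiden_community_detection.get("weight", 0.5) |
| | +YIELD node, community_id |
| | +RETURN community_id, count(node) AS size |
| | +ORDER BY size DESC; |
| | +``` |
| | + |
| | +If every node ends up in its own community, a lower `gamma` is the direction |
| | +to try; if all nodes collapse into one community, try a higher value. |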
| 108 | + |
| 109 | +### `get_subgraph()` |
| 110 | + |
| 111 | +Computes graph communities over a subgraph using the Leiden method. |
| 112 | + |
| 113 | +{<h4> Input: </h4>} |
| 114 | + |
| 115 | +- `subgraph: Graph` (**OPTIONAL**) ➡ A specific subgraph, which is an [object |
| 116 | + of type |
| 117 | + Graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph) |
| 118 | + returned by the `project()` function, on which the algorithm is run. If |
| 119 | + subgraph is not specified, the algorithm is computed on the entire graph by |
| 120 | + default. |
| 121 | +- `subgraph_nodes: List[Node]` ➡ List of nodes in the subgraph. |
| 122 | +- `subgraph_relationships: List[Relationship]` ➡ List of relationships in the |
| 123 | + subgraph. |
| 124 | +- `weight: string (default=null)` ➡ Specifies the name of the property |
| 125 | + containing the edge weight. Users can set their own weight property; if this |
| 126 | + property is not specified, the algorithm uses the `weight` edge attribute by |
| 127 | + default. If neither is set, each edge's weight defaults to `1`. To utilize a |
| 128 | + custom weight property, the user must set the |
| 129 | + `--storage-properties-on-edges=true` flag. |
| 130 | +- `gamma: double (default=1.0)` ➡ Resolution parameter used when computing the |
| 131 | + modularity. Higher resolutions lead to more, smaller communities, while lower |
| 132 | + resolutions lead to fewer, larger communities. |
| 133 | +- `theta: double (default=0.01)` ➡ Controls the randomness while breaking a |
| 134 | + community into smaller ones. |
| 135 | +- `resolution_parameter: double (default=0.01)` ➡ Minimum change in modularity |
| 136 | + that must be achieved when merging nodes within the same community. |
| 137 | +- `max_iterations: int (default=inf)` ➡ Maximum number of iterations the |
| 138 | + algorithm will perform. If set to infinity, the algorithm will run until |
| 139 | + convergence is reached. |
| 140 | + |
| 141 | +{<h4> Output: </h4>} |
| 142 | + |
| 143 | +- `node: Vertex` ➡ A graph node on which the algorithm was run, returned as |
| 144 | + part of the results. |
| 145 | +- `community_id: integer` ➡ Community ID. Defaults to $-1$ if the node does not |
| 146 | + belong to any community. |
| 147 | +- `communities: list` ➡ List representing the hierarchy of communities that a |
| 148 | + node has belonged to across iterations. |
| 149 | + |
| 150 | +{<h4> Usage: </h4>} |
| 151 | + |
| 152 | +Use the following query to compute communities in a subgraph: |
| 153 | + |
| 154 | +```cypher |
| 155 | +MATCH (a)-[e]-(b) |
| 156 | +WITH COLLECT(a) AS nodes, COLLECT(e) AS relationships |
| 157 | +CALL leiden_community_detection.get_subgraph(nodes, relationships) |
| 158 | +YIELD node, community_id, communities; |
| 159 | +``` |
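| | + |
| | +To restrict the computation to part of the graph, the same pattern can be |
| | +filtered before collecting. The sketch below is illustrative only: the `Node` |
| | +label and `RELATION` type are placeholders for whatever your data uses, and |
| | +`DISTINCT` is added so that each node and relationship is collected once even |
| | +though the undirected pattern matches every relationship in both directions. |
| | + |
| | +```cypher |
| | +// Placeholder label and relationship type; replace with your own. |
| | +MATCH (a:Node)-[e:RELATION]-(b:Node) |
| | +WITH COLLECT(DISTINCT a) AS nodes, COLLECT(DISTINCT e) AS relationships |
| | +CALL leiden_community_detection.get_subgraph(nodes, relationships) |
| | +YIELD node, community_id, communities |
| | +RETURN node, community_id, communities; |
| | +``` |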
| 160 | + |
| 161 | +## Single Iteration Example |
| 162 | + |
| 163 | +<Steps> |
| 164 | + |
| 165 | +{<h3> Database state </h3>} |
| 166 | + |
| 167 | +The database contains the following data: |
| 168 | + |
| 169 | + |
| 170 | + |
| 171 | +Created with the following Cypher queries: |
| 172 | + |
| 173 | +```cypher |
| 174 | +MERGE (a: Node {id: 0}) MERGE (b: Node {id: 1}) CREATE (a)-[r: Relation]->(b); |
| 175 | +MERGE (a: Node {id: 0}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b); |
| 176 | +MERGE (a: Node {id: 1}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b); |
| 177 | +MERGE (a: Node {id: 2}) MERGE (b: Node {id: 3}) CREATE (a)-[r: Relation]->(b); |
| 178 | +MERGE (a: Node {id: 3}) MERGE (b: Node {id: 4}) CREATE (a)-[r: Relation]->(b); |
| 179 | +MERGE (a: Node {id: 3}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b); |
| 180 | +MERGE (a: Node {id: 4}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b); |
| 181 | +``` |
| 182 | + |
| 183 | +{<h3> Detect communities </h3>} |
| 184 | + |
| 185 | +Get communities using the following query: |
| 186 | + |
| 187 | +```cypher |
| 188 | +CALL leiden_community_detection.get() |
| 189 | +YIELD node, community_id, communities |
| 190 | +RETURN node.id AS node_id, community_id, communities |
| 191 | +ORDER BY node_id; |
| 192 | +``` |
| 193 | + |
| 194 | +Results show which nodes belong to community 0, and which to community 1. |
| 195 | + |
| 196 | +```plaintext |
| 197 | ++--------------+--------------+--------------+ |
| 198 | +| node_id | community_id | communities | |
| 199 | ++--------------+--------------+--------------+ |
| 200 | +| 0 | 0 | [0] | |
| 201 | +| 1 | 0 | [0] | |
| 202 | +| 2 | 0 | [0] | |
| 203 | +| 3 | 1 | [1] | |
| 204 | +| 4 | 1 | [1] | |
| 205 | +| 5 | 1 | [1] | |
| 206 | ++--------------+--------------+--------------+ |
| 207 | +``` |
| 208 | + |
| 209 | +</Steps> |
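| | + |
| | +To read the result per community rather than per node, the yielded rows can be |
| | +aggregated. This is a small sketch using only standard Cypher aggregation on |
| | +top of the same procedure call: |
| | + |
| | +```cypher |
| | +// Group node ids by the community they were assigned to. |
| | +CALL leiden_community_detection.get() |
| | +YIELD node, community_id |
| | +RETURN community_id, collect(node.id) AS members, count(node) AS size |
| | +ORDER BY community_id; |
| | +``` |
| | + |
| | +For the result table above, this should place ids 0, 1 and 2 in community 0 |
| | +and ids 3, 4 and 5 in community 1; the order of ids inside each `members` list |
| | +is not guaranteed. |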
| 210 | + |
| 211 | +## Multiple Iterations Example |
| 212 | + |
| 213 | +<Steps> |
| 214 | + |
| 215 | +{<h3> Database state </h3>} |
| 216 | + |
| 217 | +The database contains the following data: |
| 218 | + |
| 219 | + |
| 220 | + |
| 221 | +Created with the following Cypher queries: |
| 222 | + |
| 223 | +```cypher |
| 224 | +MERGE (a:Node {id: 1}) MERGE (b:Node {id: 0}) CREATE (a)-[:RELATION]->(b); |
| 225 | +MERGE (a:Node {id: 1}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 226 | +MERGE (a:Node {id: 2}) MERGE (b:Node {id: 1}) CREATE (a)-[:RELATION]->(b); |
| 227 | +MERGE (a:Node {id: 2}) MERGE (b:Node {id: 8}) CREATE (a)-[:RELATION]->(b); |
| 228 | +MERGE (a:Node {id: 2}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 229 | +MERGE (a:Node {id: 3}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 230 | +MERGE (a:Node {id: 4}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 231 | +MERGE (a:Node {id: 4}) MERGE (b:Node {id: 2}) CREATE (a)-[:RELATION]->(b); |
| 232 | +MERGE (a:Node {id: 5}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 233 | +MERGE (a:Node {id: 5}) MERGE (b:Node {id: 2}) CREATE (a)-[:RELATION]->(b); |
| 234 | +MERGE (a:Node {id: 6}) MERGE (b:Node {id: 2}) CREATE (a)-[:RELATION]->(b); |
| 235 | +MERGE (a:Node {id: 7}) MERGE (b:Node {id: 2}) CREATE (a)-[:RELATION]->(b); |
| 236 | +MERGE (a:Node {id: 8}) MERGE (b:Node {id: 2}) CREATE (a)-[:RELATION]->(b); |
| 237 | +MERGE (a:Node {id: 8}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 238 | +MERGE (a:Node {id: 9}) MERGE (b:Node {id: 10}) CREATE (a)-[:RELATION]->(b); |
| 239 | +MERGE (a:Node {id: 10}) MERGE (b:Node {id: 9}) CREATE (a)-[:RELATION]->(b); |
| 240 | +``` |
| 241 | + |
| 242 | +{<h3> Detect communities </h3>} |
| 243 | + |
| 244 | +Get communities using the following query: |
| 245 | + |
| 246 | +```cypher |
| 247 | +CALL leiden_community_detection.get() |
| 248 | +YIELD node, community_id, communities |
| 249 | +RETURN node.id AS node_id, community_id, communities |
| 250 | +ORDER BY node_id; |
| 251 | +``` |
| 252 | + |
| 253 | +The results show which nodes belong to community 0 and which to community 1, as well as how nodes changed communities across iterations. |
| 254 | + |
| 255 | +```plaintext |
| 256 | ++--------------+--------------+--------------+ |
| 257 | +| node_id | community_id | communities | |
| 258 | ++--------------+--------------+--------------+ |
| 259 | +| 0 | 1 | [0, 1] | |
| 260 | +| 1 | 1 | [0, 1] | |
| 261 | +| 2 | 0 | [1, 0] | |
| 262 | +| 3 | 1 | [2, 1] | |
| 263 | +| 4 | 1 | [3, 1] | |
| 264 | +| 5 | 0 | [4, 0] | |
| 265 | +| 6 | 0 | [1, 0] | |
| 266 | +| 7 | 0 | [1, 0] | |
| 267 | +| 8 | 0 | [1, 0] | |
| 268 | +| 9 | 1 | [2, 1] | |
| 269 | +| 10 | 1 | [2, 1] | |
| 270 | ++--------------+--------------+--------------+ |
| 271 | +``` |
| 272 | + |
| 273 | +</Steps> |
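| | + |
| | +The `communities` list can be unpacked with ordinary list indexing. The sketch |
| | +below returns the first and last entries of the hierarchy for each node; the |
| | +column names `first_level` and `last_level` are chosen here for illustration. |
| | + |
| | +```cypher |
| | +// Compare a node's first recorded community with its last one. |
| | +CALL leiden_community_detection.get() |
| | +YIELD node, communities |
| | +RETURN node.id AS node_id, |
| | + communities[0] AS first_level, |
| | + communities[size(communities) - 1] AS last_level |
| | +ORDER BY node_id; |
| | +``` |
| | + |
| | +In the result table above, the last element of each `communities` list matches |
| | +the node's `community_id`, so `last_level` is expected to equal `community_id` |
| | +for this dataset. |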