
Commit 97b7192

Update stochastic_variability.py
I have made the changes as per the review.
1 parent dc52eb3 commit 97b7192

1 file changed

+89
-48
lines changed

doc/examples_sphinx-gallery/stochastic_variability.py

Lines changed: 89 additions & 48 deletions
@@ -5,89 +5,130 @@
55
Stochastic Variability in Community Detection Algorithms
66
=========================================================
77
8-
This example demonstrates the variability of stochastic community detection methods by analyzing the consistency of multiple partitions using similarity measures (NMI, VI, RI) on both random and structured graphs.
8+
This example demonstrates the variability of stochastic community detection methods by analyzing the consistency of multiple partitions using three similarity measures, normalized mutual information (NMI), variation of information (VI), and the Rand index (RI), on both random and structured graphs.
99
1010
"""
1111
# %%
12-
# Import Libraries
12+
# Import libraries
1313
import igraph as ig
14-
import numpy as np
1514
import matplotlib.pyplot as plt
1615
import itertools
1716

1817
# %%
1918
# First, we generate a graph.
20-
# Generates a random Erdos-Renyi graph (no clear community structure)
21-
def generate_random_graph(n, p):
22-
    return ig.Graph.Erdos_Renyi(n=n, p=p)
19+
# Load the karate club network
20+
karate = ig.Graph.Famous("Zachary")
2321

2422
# %%
25-
# Generates a clustered graph with clear communities using the Stochastic Block Model (SBM)
26-
def generate_clustered_graph(n, clusters, intra_p, inter_p):
27-
    block_sizes = [n // clusters] * clusters
28-
    prob_matrix = [[intra_p if i == j else inter_p for j in range(clusters)] for i in range(clusters)]
29-
    return ig.Graph.SBM(sum(block_sizes), prob_matrix, block_sizes)
23+
# For the random graph, we use an Erdős-Rényi :math:`G(n, m)` model, where :math:`n` is the number of nodes
24+
# and :math:`m` is the number of edges. We set :math:`m` to the edge count of the empirical (Karate Club)
25+
# network so that both graphs have the same density, which makes the comparison meaningful.
26+
n_nodes = karate.vcount()
27+
n_edges = karate.ecount()
28+
# Generate an Erdős-Rényi graph with the same number of nodes and edges
29+
random_graph = ig.Graph.Erdos_Renyi(n=n_nodes, m=n_edges)
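To make the :math:`G(n, m)` model concrete, here is a minimal standard-library sketch of what such a generator does conceptually: it picks `m` distinct vertex pairs uniformly at random from all `n(n-1)/2` candidates. The helper name `sample_gnm` is hypothetical and is not part of igraph; the sizes 34 and 78 are the vertex and edge counts of the Zachary karate club network.

```python
import itertools
import random


def sample_gnm(n, m, seed=None):
    """Return m distinct undirected edges chosen uniformly among n vertices.

    Hypothetical helper for illustration only; not igraph's implementation.
    """
    rng = random.Random(seed)
    # Every unordered pair (u, v) with u < v is a candidate edge.
    all_pairs = list(itertools.combinations(range(n), 2))
    if m > len(all_pairs):
        raise ValueError("m exceeds the number of possible edges")
    # Sampling without replacement guarantees no multi-edges.
    return rng.sample(all_pairs, m)


# Same size as the karate club network: 34 vertices, 78 edges.
edges = sample_gnm(34, 78, seed=1)
density = len(edges) / (34 * 33 / 2)
```

Because the edge count is fixed rather than drawn from a probability, every sampled graph has exactly the same density as the empirical network.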
3030

3131
# %%
32-
# Computes pairwise similarity (NMI, VI, RI) between partitions
32+
# Now, let's plot both graphs to compare them visually.
33+
34+
# Create subplots
35+
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
36+
37+
# Karate Club Graph
38+
layout_karate = karate.layout("fr")
39+
ig.plot(
40+
    karate, layout=layout_karate, target=axes[0], vertex_size=30, vertex_color="lightblue", edge_width=1,
41+
    vertex_label=[str(v.index) for v in karate.vs], vertex_label_size=10
42+
)
43+
axes[0].set_title("Karate Club Network")
44+
45+
# Erdős-Rényi Graph
46+
layout_random = random_graph.layout("fr")
47+
ig.plot(
48+
    random_graph, layout=layout_random, target=axes[1], vertex_size=30, vertex_color="lightcoral", edge_width=1,
49+
    vertex_label=[str(v.index) for v in random_graph.vs], vertex_label_size=10
50+
)
51+
axes[1].set_title("Erdős-Rényi Random Graph")
52+
# %%
53+
# Function to compute similarity between partitions
3354
def compute_pairwise_similarity(partitions, method):
34-
    """Computes pairwise similarity measure between partitions."""
35-
    scores = []
55+
    similarities = []
56+
3657
    for p1, p2 in itertools.combinations(partitions, 2):
37-
        scores.append(ig.compare_communities(p1, p2, method=method))
38-
    return scores
58+
        similarity = ig.compare_communities(p1, p2, method=method)
59+
        similarities.append(similarity)
60+
61+
    return similarities
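As a rough illustration of what one pairwise comparison measures, the sketch below computes the Rand index of two membership lists in plain Python: the fraction of vertex pairs on which both partitions agree (grouped together in both, or separated in both). The function `rand_index` is a hypothetical stand-in written for this note, not the library routine. The last line counts how many comparisons 50 partitions produce.

```python
import itertools
import math


def rand_index(a, b):
    """Fraction of element pairs that two membership lists treat the same way.

    Illustrative sketch only; assumes a and b label the same elements.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two membership lists of equal length >= 2")
    agree = 0
    total = 0
    for i, j in itertools.combinations(range(len(a)), 2):
        same_a = a[i] == a[j]  # are i and j together in partition a?
        same_b = b[i] == b[j]  # are i and j together in partition b?
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total


# Relabelling communities does not change the partition, so the score is 1.
identical = rand_index([0, 0, 1, 1], [1, 1, 0, 0])

# 50 partitions yield C(50, 2) pairwise comparisons per measure.
n_pairs = math.comb(50, 2)
```

The number of comparisons grows quadratically with the number of runs, which is worth keeping in mind when raising `iterations`.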
3962

4063
# %%
41-
# Stochastic Community Detection
42-
# Runs Louvain's method iteratively to generate partitions
43-
# Computes similarity metrics:
64+
# We use the Louvain method, a stochastic community detection algorithm, running it repeatedly to generate partitions and computing similarity metrics to assess their stability.
65+
# The Louvain method is a modularity maximization approach for community detection.
66+
# Since exact modularity maximization is NP-hard, the algorithm employs a greedy heuristic that processes vertices in a random order.
67+
# This randomness leads to variations in the detected communities across different runs, which is why results may differ each time the method is applied.
4468
def run_experiment(graph, iterations=50):
45-
    """Runs the stochastic method multiple times and collects community partitions."""
4669
    partitions = [graph.community_multilevel().membership for _ in range(iterations)]
4770
    nmi_scores = compute_pairwise_similarity(partitions, method="nmi")
4871
    vi_scores = compute_pairwise_similarity(partitions, method="vi")
4972
    ri_scores = compute_pairwise_similarity(partitions, method="rand")
5073
    return nmi_scores, vi_scores, ri_scores
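The two information-theoretic measures can be sketched from entropies of the membership lists. The code below is a plain-Python illustration under two stated assumptions: natural logarithms are used, and NMI is normalized by the mean of the two entropies, :math:`2 I(X;Y) / (H(X) + H(Y))`, which is one common convention. The exact convention used by the library should be checked against its documentation. The names `entropy` and `vi_and_nmi` are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (in nats) of a partition given its block sizes."""
    return -sum((c / n) * math.log(c / n) for c in counts)


def vi_and_nmi(a, b):
    """Return (VI, NMI) for two membership lists over the same elements.

    Sketch only; assumes NMI = 2 * I / (H(a) + H(b)).
    """
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)  # joint entropy
    mutual_info = h_a + h_b - h_ab
    vi = h_a + h_b - 2 * mutual_info
    # Two trivial (single-block) partitions are identical: define NMI as 1.
    nmi = 1.0 if h_a + h_b == 0 else 2 * mutual_info / (h_a + h_b)
    return vi, nmi


# Identical partitions: VI is 0 and NMI is 1.
vi_same, nmi_same = vi_and_nmi([0, 0, 1, 1], [1, 1, 0, 0])
# Statistically independent partitions: NMI is 0 and VI is 2 * ln(2).
vi_indep, nmi_indep = vi_and_nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Under these assumptions VI is bounded above by :math:`\log n`, so for a 34-vertex graph it cannot exceed roughly 3.53 nats, whereas NMI and RI always lie in the unit interval.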
5174

52-
# %%
53-
# Parameters
54-
n_nodes = 100
55-
p_random = 0.05
56-
clusters = 4
57-
p_intra = 0.3 # High intra-cluster connection probability
58-
p_inter = 0.01 # Low inter-cluster connection probability
59-
60-
# %%
61-
# Generate graphs
62-
random_graph = generate_random_graph(n_nodes, p_random)
63-
clustered_graph = generate_clustered_graph(n_nodes, clusters, p_intra, p_inter)
64-
6575
# %%
6676
# Run experiments
77+
nmi_karate, vi_karate, ri_karate = run_experiment(karate)
6778
nmi_random, vi_random, ri_random = run_experiment(random_graph)
68-
nmi_clustered, vi_clustered, ri_clustered = run_experiment(clustered_graph)
6979

70-
# %%
71-
# Lets, plot the histograms
80+
# %%
81+
# Lastly, let's plot probability density histograms to understand the results.
7282
fig, axes = plt.subplots(3, 2, figsize=(12, 10))
73-
measures = [(nmi_random, nmi_clustered, "NMI"), (vi_random, vi_clustered, "VI"), (ri_random, ri_clustered, "RI")]
83+
measures = [
84+
    (nmi_karate, nmi_random, "NMI", 0, 1),  # Normalized Mutual Information (0-1, higher = more similar)
85+
    (vi_karate, vi_random, "VI", 0, None),  # Variation of Information (0+, lower = more similar)
86+
    (ri_karate, ri_random, "RI", 0, 1),  # Rand Index (0-1, higher = more similar)
87+
]
7488
colors = ["red", "blue", "green"]
7589

76-
for i, (random_scores, clustered_scores, measure) in enumerate(measures):
77-
    axes[i][0].hist(random_scores, bins=20, alpha=0.7, color=colors[i], edgecolor="black")
78-
    axes[i][0].set_title(f"Histogram of {measure} - Random Graph")
90+
for i, (karate_scores, random_scores, measure, lower, upper) in enumerate(measures):
91+
    # Karate Club histogram
92+
    axes[i][0].hist(
93+
        karate_scores, bins=20, alpha=0.7, color=colors[i], edgecolor="black",
94+
        density=True  # Probability density
95+
    )
96+
    axes[i][0].set_title(f"Probability Density of {measure} - Karate Club Network")
7997
    axes[i][0].set_xlabel(f"{measure} Score")
80-
    axes[i][0].set_ylabel("Frequency")
81-
82-
    axes[i][1].hist(clustered_scores, bins=20, alpha=0.7, color=colors[i], edgecolor="black")
83-
    axes[i][1].set_title(f"Histogram of {measure} - Clustered Graph")
98+
    axes[i][0].set_ylabel("Density")
99+
    axes[i][0].set_xlim(lower, upper)  # Set axis limits explicitly
100+
101+
    # Erdős-Rényi Graph histogram
102+
    axes[i][1].hist(
103+
        random_scores, bins=20, alpha=0.7, color=colors[i], edgecolor="black",
104+
        density=True
105+
    )
106+
    axes[i][1].set_title(f"Probability Density of {measure} - Erdős-Rényi Graph")
84107
    axes[i][1].set_xlabel(f"{measure} Score")
108+
    axes[i][1].set_xlim(lower, upper)  # Set axis limits explicitly
85109

86110
plt.tight_layout()
87111
plt.show()
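For readers unsure what a density-normalized histogram shows, the following standard-library sketch reproduces the normalization by hand: each bar height is the bin count divided by (number of samples × bin width), so the bar areas sum to 1. The helper `density_histogram`, its fixed `lower`/`upper` range, and the sample scores are all made up for illustration; this is not how the plotting library is implemented.

```python
def density_histogram(values, bins, lower, upper):
    """Return (bar heights, bin width) for a density-normalized histogram.

    Illustrative sketch; assumes every value lies in [lower, upper].
    """
    width = (upper - lower) / bins
    counts = [0] * bins
    for v in values:
        # A value equal to the upper edge is placed in the last bin.
        idx = min(int((v - lower) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    heights = [c / (n * width) for c in counts]
    return heights, width


# Made-up similarity scores, used only to exercise the helper.
scores = [0.80, 0.85, 0.90, 0.90, 0.95, 1.00]
heights, width = density_histogram(scores, bins=4, lower=0.8, upper=1.0)
area = sum(h * width for h in heights)
```

Note that individual bar heights can exceed 1 when the bins are narrow; only the total area is constrained, so the y-axis should be read as a density rather than a probability.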
88112

89113
# %%
90-
# The results are plotted as histograms for random vs. clustered graphs, highlighting differences in detected community structures.
91-
#The key reason for the inconsistency in random graphs and higher consistency in structured graphs is due to community structure strength:
92-
#Random Graphs: Lack clear communities, leading to unstable partitions. Stochastic algorithms detect different structures across runs, resulting in low NMI, high VI, and inconsistent RI.
93-
#Structured Graphs: Have well-defined communities, so detected partitions are more stable across multiple runs, leading to high NMI, low VI, and stable RI.
114+
# We have compared the probability density of NMI, VI, and RI for the Karate Club network (structured) and an Erdős-Rényi random graph.
115+
#
116+
# **NMI (Normalized Mutual Information):**
117+
#
118+
# - Karate Club Network: The distribution is concentrated near 1, indicating high similarity across multiple runs, suggesting stable community detection.
119+
# - Erdős-Rényi Graph: The values are more spread out and shifted toward lower NMI scores, showing inconsistent partitions due to the lack of clear community structure.
120+
#
121+
# **VI (Variation of Information):**
122+
#
123+
# - Karate Club Network: The values are low and clustered, indicating stable partitioning with minor variations across runs.
124+
# - Erdős-Rényi Graph: The distribution is broader and shifted toward higher VI values, meaning higher partition variability and less consistency.
125+
#
126+
# **RI (Rand Index):**
127+
#
128+
# - Karate Club Network: The RI values are high and concentrated near 1, suggesting consistent clustering results across multiple iterations.
129+
# - Erdős-Rényi Graph: The distribution is more spread out and shifted toward lower RI values, confirming unstable community detection.
130+
#
131+
# **Conclusion**
132+
#
133+
# The Karate Club Network exhibits strong, well-defined community structures, leading to consistent results across runs.
134+
# The Erdős-Rényi Graph, being random, lacks clear communities, causing high variability in detected partitions.
