SCC (Scalable Correlation Clustering) is a framework for clustering signed graphs, where edges carry positive (attraction) or negative (repulsion) weights. Part of the KaHIP organization.
| What it solves | Cluster signed graphs by grouping nodes with positive connections while separating nodes with negative connections |
| Objective | Minimize the sum of positive inter-cluster edges and negative intra-cluster edges |
| Algorithms | Multilevel approach with label propagation and local search; distributed memetic algorithm combining evolutionary search with multilevel refinement |
| Interfaces | CLI |
| Requires | C++11 compiler, CMake 3.10+, MPI (e.g. OpenMPI) |
brew install KaHIP/kahip/sccgit clone https://github.com/KaHIP/ScalableCorrelationClustering.git && cd ScalableCorrelationClustering
./compile_withcmake.sh# Multilevel clustering
scc network.graph --seed=0
# Distributed memetic algorithm (4 MPI processes, 120s time limit)
mpirun -n 4 scc_evolutionary network.graph --seed=0 --time_limit=120When building from source, binaries are in ./deploy/ (use ./deploy/scc etc.).
| Binary | Description |
|---|---|
scc |
Multilevel signed graph clustering |
scc_evolutionary |
Distributed memetic signed graph clustering (MPI) |
scc_evaluator |
Evaluate a clustering solution |
scc_graphchecker |
Check graph format validity |
scc <graph-file> [options]
mpirun -n <procs> scc_evolutionary <graph-file> [options]
| Option | Description | Default |
|---|---|---|
--seed=<int> |
Random seed for the PRNG | 0 |
--time_limit=<double> |
Time limit in seconds (0 = single run) | 0 |
--output_filename=<string> |
Output file for the clustering | clustering |
--help |
Print help |
SCC uses the standard METIS graph format with edge weights (format flag 1). Positive edge weights represent attraction, negative weights represent repulsion.
Header line:
n m 1
n= number of vertices,m= number of undirected edges,1= edge-weighted
Vertex lines (one per vertex):
Each of the following n lines lists pairs of neighbor weight:
v_1 w_1 v_2 w_2 ... v_k w_k
Vertices are 1-indexed. Positive weights indicate attraction (+), negative weights indicate repulsion (-).
Example (4 vertices, 5 edges):
4 5 1
2 1 3 1
1 1 3 -1 4 1
1 1 2 -1 4 -1
2 1 3 -1
Vertices 1-2 and 2-4 are attracted (+1), while edges 2-3 and 3-4 are repulsive (-1). The optimal clustering places {1, 2, 4} in one cluster and {3} in another.
- C++11 compiler (GCC 7+ or Clang 11+)
- CMake 3.10+
- MPI (e.g. OpenMPI)
./compile_withcmake.shOr using CMake directly:
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallelBinaries are placed in ./build/ (CMake) or ./deploy/ (compile script).
SCC provides two algorithms for signed graph clustering:
-
Multilevel algorithm (
scc): Coarsens the graph through label propagation, computes an initial clustering, then uncoarsens with FM-based local search refinement at each level. -
Distributed memetic algorithm (
scc_evolutionary): Maintains a population of clustering solutions across MPI processes. Individuals are evolved through mutation (re-clustering with different seeds) and crossover (combining two clusterings via the multilevel framework), with the best solutions surviving across generations.
Both algorithms automatically determine the number of clusters.
| Project | Description |
|---|---|
| KaHIP | Karlsruhe High Quality Graph Partitioning (flagship framework) |
| VieClus | Graph clustering optimizing modularity |
| CluStRE | Streaming graph clustering with multi-stage refinement |
SCC is free software provided under the MIT License. If you publish results using our algorithms, please acknowledge our work by citing our paper:
@InProceedings{HausbergerFFS25,
author = {Felix Hausberger and Marcelo Fonseca Faraj and Christian Schulz},
title = {{Scalable Multilevel and Memetic Signed Graph Clustering}},
booktitle = {Proceedings of the 27th Symposium on Algorithm Engineering and Experiments (ALENEX 2025)},
pages = {81--94},
publisher = {SIAM},
year = {2025},
doi = {10.1137/1.9781611978339.7}
}
