Commit 5ec7ce7

Merge pull request #61 from 4ndrelim/branch-DisjointSet
Branch disjoint set
2 parents 3de078e + c5b1f2e commit 5ec7ce7

12 files changed, +449 -161 lines changed

docs/assets/images/QuickFind.png

[Binary image files: 108 KB, 117 KB, 267 KB]

src/main/java/dataStructures/disjointSet/README.md

Lines changed: 24 additions & 7 deletions
@@ -3,11 +3,14 @@
 ## Background

 A disjoint-set structure also known as a union-find or merge-find set, is a data structure
-keeps track of a partition of a set into disjoint (non-overlapping) subsets. In CS2040s, this
+that keeps track of a partition of a set into disjoint (non-overlapping) subsets.
+
+In CS2040s, this
 is introduced in the context of checking for dynamic connectivity. For instance, Kruskal's algorithm
-in graph theory to find minimum spanning tree of the graph utilizes disjoint set to efficiently
-query if there exists a path between 2 nodes. <br>
-It supports 2 main operations:
+in graph theory to find the minimum spanning tree of a graph utilizes a disjoint set to efficiently
+query if there already exists a path between 2 nodes.
+
+Generally, there are 2 main operations:

 1. Union: Join two subsets into a single subset
 2. Find: Determine which subset a particular element is in. In practice, this is often done to check
@@ -17,12 +20,26 @@ The Disjoint Set structure is often introduced in 3 parts, with each iteration b
 previous either in time or space complexity (or both). More details can be found in the respective folders.
 Below is a brief overview:

-1. Quick Find - Elements are assigned a component identity.
+1. **Quick Find** - Elements are assigned a component identity.
 Querying for connectivity and updating usually tracked with an internal array.

-2. Quick Union - Component an element belongs to is now tracked with a tree structure. Nothing to enforce
+2. **Quick Union** - The component an element belongs to is now tracked with a tree structure. Nothing enforces
 a balanced tree and hence complexity does not necessarily improve
    - Note, this is not implemented but details can be found under weighted union folder.

-3. Weighted Union - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to improved
+3. **Weighted Union** - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to improved
 complexities. Can be further augmented with path compression.
+
+## Applications
+Because they are efficient and simple to implement, Disjoint Set structures are widely used in practice:
+1. As mentioned, it is often used as a helper structure for Kruskal's MST algorithm
+2. It can be used in the context of network connectivity
+   - Managing a network of computers
+   - Analysing social networks: finding communities and determining whether two users are connected through a chain
+3. It can be part of clustering algorithms to group data points based on similarity, which is useful for ML
+4. It can be used to detect cycles in dependency graphs, e.g., software dependency management systems
+5. It can be used for image processing, in labelling different connected components of an image
+
+## Notes
+Disjoint Set is a data structure designed to keep track of a set of elements partitioned into a number of
+non-overlapping subsets. It is not suited for handling duplicates and so our implementation ignores duplicates.
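
The README names Kruskal's MST algorithm as the main use case and mentions weighted union with path compression. As a rough illustration only, the following sketch shows how the two could fit together. The class `KruskalSketch`, the method `mstWeight`, and the `int[]{u, v, weight}` edge layout are hypothetical names chosen for this example and are not part of the repository; path halving is used here as a simple form of path compression.

```java
import java.util.Arrays;

// Hypothetical sketch: names and edge layout are assumptions, not code from this commit.
final class KruskalSketch {

    // Follow parent links to the root, halving the path along the way.
    private static int root(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving (a form of path compression)
            x = parent[x];
        }
        return x;
    }

    // Weighted union by size. Returns false if a and b were already connected.
    private static boolean union(int[] parent, int[] size, int a, int b) {
        int ra = root(parent, a);
        int rb = root(parent, b);
        if (ra == rb) {
            return false; // adding this edge would create a cycle
        }
        if (size[ra] < size[rb]) { // attach the smaller tree under the larger one
            int tmp = ra;
            ra = rb;
            rb = tmp;
        }
        parent[rb] = ra;
        size[ra] += size[rb];
        return true;
    }

    // Total weight of a minimum spanning forest over nodes 0..n-1.
    // Each edge is assumed to be {u, v, weight}.
    static int mstWeight(int n, int[][] edges) {
        int[] parent = new int[n];
        int[] size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
        int[][] sorted = edges.clone(); // leave the caller's edge order untouched
        Arrays.sort(sorted, (e1, e2) -> Integer.compare(e1[2], e2[2]));
        int total = 0;
        for (int[] e : sorted) {
            if (union(parent, size, e[0], e[1])) { // keep only edges joining two components
                total += e[2];
            }
        }
        return total;
    }
}
```

The disjoint set answers the only question Kruskal's algorithm asks per edge: are the two endpoints already connected? If they are, the edge is skipped.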
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+package dataStructures.disjointSet.quickFind;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/**
+ * Implementation of quick-find structure; Turns a list of objects into a data structure that supports union operations
+ *
+ * @param <T> generic type of object to be stored
+ */
+public class DisjointSet<T> {
+    private final Map<T, Integer> identifier;
+
+    /**
+     * Basic constructor to create the Disjoint Set data structure.
+     */
+    public DisjointSet() {
+        identifier = new HashMap<>();
+    }
+
+    /**
+     * Constructor to initialize Disjoint Set with a known list of objects.
+     * @param objects
+     */
+    public DisjointSet(List<T> objects) {
+        identifier = new HashMap<>();
+        int size = objects.size();
+        for (int i = 0; i < size; i++) {
+            // internally, component identity is tracked with integers
+            identifier.put(objects.get(i), identifier.size()); // each obj initialize with a unique identity using size;
+        }
+    }
+
+    public int size() {
+        return identifier.size();
+    }
+
+    /**
+     * Adds an object into the structure.
+     * @param obj
+     */
+    public void add(T obj) {
+        identifier.put(obj, identifier.size());
+    }
+
+    /**
+     * Checks if object a and object b are in the same component.
+     * @param a
+     * @param b
+     * @return a boolean value
+     */
+    public boolean find(T a, T b) {
+        if (!identifier.containsKey(a) || !identifier.containsKey(b)) { // key(s) does not even exist
+            return false;
+        }
+        return identifier.get(a).equals(identifier.get(b));
+    }
+
+    /**
+     * Merge the components of object a and object b.
+     * @param a
+     * @param b
+     */
+    public void union(T a, T b) {
+        if (!identifier.containsKey(a) || !identifier.containsKey(b)) { // key(s) does not even exist; do nothing
+            return;
+        }
+
+        if (identifier.get(a).equals(identifier.get(b))) { // already same; do nothing
+            return;
+        }
+
+        int compOfA = identifier.get(a);
+        int compOfB = identifier.get(b);
+        for (T obj : identifier.keySet()) {
+            if (identifier.get(obj).equals(compOfA)) {
+                identifier.put(obj, compOfB);
+            }
+        }
+    }
+
+    /**
+     * Retrieves all elements that are in the same component as the specified object. Not a typical operation
+     * but here to illustrate other use case.
+     * @param a
+     * @return a list of objects
+     */
+    public List<T> retrieveFromSameComponent(T a) {
+        List<T> ret = new ArrayList<>();
+        for (T obj : identifier.keySet()) {
+            if (find(a, obj)) {
+                ret.add(obj);
+            }
+        }
+        return ret;
+    }
+}
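
The README states that the implementation ignores duplicates, but `Map.put` overwrites the identity of a key that is already present. Re-adding an element after a union would therefore detach it from its component, and a repeated element in the constructor list can leave two distinct elements sharing one identity. One possible way to make duplicates a true no-op is `Map.putIfAbsent`, sketched below. The class name `DuplicateSafeQuickFind` is hypothetical, and this is an illustrative variant under that assumption, not the code in this commit.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical variant for illustration; not part of the commit.
final class DuplicateSafeQuickFind<T> {
    private final Map<T, Integer> identifier = new HashMap<>();

    DuplicateSafeQuickFind(List<T> objects) {
        for (T obj : objects) {
            add(obj);
        }
    }

    int size() {
        return identifier.size();
    }

    // Assigns a fresh identity only if obj is new; an existing element keeps its component.
    void add(T obj) {
        // Identities ever handed out are 0..size-1, so the current size is always unused.
        identifier.putIfAbsent(obj, identifier.size());
    }

    // True only if both elements exist and share a component identity.
    boolean find(T a, T b) {
        Integer idA = identifier.get(a);
        Integer idB = identifier.get(b);
        return idA != null && idA.equals(idB);
    }

    // Relabels every element of a's component with b's identity: O(n) per union.
    void union(T a, T b) {
        Integer compOfA = identifier.get(a);
        Integer compOfB = identifier.get(b);
        if (compOfA == null || compOfB == null || compOfA.equals(compOfB)) {
            return; // unknown element(s) or already in the same component
        }
        for (Map.Entry<T, Integer> entry : identifier.entrySet()) {
            if (entry.getValue().equals(compOfA)) {
                entry.setValue(compOfB); // replacing a value is not a structural modification
            }
        }
    }
}
```

With this variant, constructing from a list that contains the same element twice yields one entry for it, and calling `add` on an element that was already unioned leaves its component unchanged.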

src/main/java/dataStructures/disjointSet/quickFind/README.md

Lines changed: 9 additions & 2 deletions
@@ -2,11 +2,18 @@

 ## Background
 Every object will be assigned a component identity. The implementation of Quick Find often involves
-an underlying array that tracks the component identity of each object.
+an underlying array or hash map that tracks the component identity of each object.
+Our implementation uses a hash map (to easily handle the case when objects aren't integers).
+
+<div align="center">
+    <img src="../../../../../../docs/assets/images/QuickFind.png" width="50%">
+    <br>
+    Credits: CS2040s Lecture Slides
+</div>

 ### Union
 Between the two components, decide on the component d, to represent the combined set. Let the other
-component's identity be d'. Simply iterate over the component identifier array, and for any element with
+component's identity be d'. Simply iterate over the component identifier array / map, and for any element with
 identity d', assign it to d.

 ### Find
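
For comparison with the hash-map version, the Union step described in this README (pick identity d, then relabel every element carrying d' to d) can be sketched with a plain array when the elements are the integers 0..n-1. The class name `ArrayQuickFind` is hypothetical. This is a minimal sketch under that assumption, not the deleted `QuickFind.java` from this commit.

```java
// Hypothetical array-backed Quick Find over the integers 0..n-1; names are illustrative.
final class ArrayQuickFind {
    private final int[] id; // id[i] is the component identity of element i

    ArrayQuickFind(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i; // every element starts in its own component
        }
    }

    // O(1): two elements are connected exactly when their identities match.
    boolean find(int a, int b) {
        return id[a] == id[b];
    }

    // O(n): keep identity d and relabel every element that carries d'.
    void union(int a, int b) {
        int d = id[a];
        int dPrime = id[b];
        if (d == dPrime) {
            return; // already in the same component
        }
        for (int i = 0; i < id.length; i++) {
            if (id[i] == dPrime) {
                id[i] = d;
            }
        }
    }
}
```

The trade-off is visible in the loop: `find` is a constant-time lookup, while each `union` scans the whole array, which is the cost the tree-based variants aim to reduce.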

src/main/java/dataStructures/disjointSet/quickFind/generalised/QuickFind.java

Lines changed: 0 additions & 87 deletions
This file was deleted.

src/main/java/dataStructures/disjointSet/quickFind/simplified/QuickFind.java

Lines changed: 0 additions & 59 deletions
This file was deleted.
