## Background

A disjoint-set structure, also known as a union-find or merge-find set, is a data structure that
keeps track of a partition of a set into disjoint (non-overlapping) subsets.

In CS2040s, this is introduced in the context of checking for dynamic connectivity. For instance,
Kruskal's algorithm for finding the minimum spanning tree of a graph uses a disjoint set to efficiently
query whether a path already exists between 2 nodes.

Generally, there are 2 main operations:

1. Union: Join two subsets into a single subset
2. Find: Determine which subset a particular element is in. In practice, this is often done to check
[...]

The Disjoint Set structure is often introduced in 3 parts, with each iteration b[...]
previous either in time or space complexity (or both). More details can be found in the respective folders.
Below is a brief overview:

1. **Quick Find** - Elements are assigned a component identity.
Querying for connectivity and updating are usually done with an internal array.

2. **Quick Union** - The component an element belongs to is now tracked with a tree structure. Nothing enforces
a balanced tree, and hence complexity does not necessarily improve.
   - Note, this is not implemented, but details can be found under the weighted union folder.

3. **Weighted Union** - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to improved
complexities. Can be further augmented with path compression.

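To make the contrast between the variants concrete, here is a minimal sketch in Java. It is not the code from the respective folders: the class names `QuickFind` and `WeightedUnion`, the method names, and the choice to label elements with integers `0..n-1` are all assumptions made for illustration.

```java
// Quick Find sketch (hypothetical class): each element stores its component identity.
class QuickFind {
    private final int[] id; // id[i] is the component identity of element i

    QuickFind(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i; // every element starts in its own component
        }
    }

    // O(1): two elements are connected iff they share a component identity.
    boolean connected(int p, int q) {
        return id[p] == id[q];
    }

    // O(n): relabel every element of p's component with q's identity.
    void union(int p, int q) {
        int from = id[p];
        int to = id[q];
        if (from == to) {
            return;
        }
        for (int i = 0; i < id.length; i++) {
            if (id[i] == from) {
                id[i] = to;
            }
        }
    }
}

// Weighted Union sketch (hypothetical class): components are trees, and the
// smaller tree is always attached under the root of the larger one.
class WeightedUnion {
    private final int[] parent; // parent[i] == i means i is a root
    private final int[] size;   // size[r] is the number of elements under root r

    WeightedUnion(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    // Walk up to the root, then point every node on the path directly at it
    // (path compression).
    int find(int p) {
        int root = p;
        while (root != parent[root]) {
            root = parent[root];
        }
        while (p != root) {
            int next = parent[p];
            parent[p] = root;
            p = next;
        }
        return root;
    }

    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    void union(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) {
            return;
        }
        if (size[rootP] < size[rootQ]) { // make rootP the larger tree
            int tmp = rootP;
            rootP = rootQ;
            rootQ = tmp;
        }
        parent[rootQ] = rootP;
        size[rootP] += size[rootQ];
    }
}
```

In this sketch, Quick Find answers connectivity queries in constant time but pays a full array scan on every union, whereas Weighted Union keeps trees shallow so that both operations only walk a short path to the root.
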
## Applications
Because they are efficient and simple to implement, Disjoint Set structures are widely used in practice:
1. As mentioned, it is often used as a helper structure for Kruskal's MST algorithm
2. It can be used in the context of network connectivity
   - Managing a network of computers
   - Or even analysing social networks, finding communities and determining if two users are connected through a chain
3. Can be part of clustering algorithms to group data points based on similarity - useful for ML
4. It can be used to detect cycles in dependency graphs, e.g., software dependency management systems
5. It can be used for image processing, in labelling different connected components of an image

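As an illustration of the first application, the following is a rough sketch of Kruskal's algorithm using a disjoint set to skip edges whose endpoints are already connected. The class name `KruskalSketch`, the `{u, v, weight}` edge encoding, and the use of path halving inside `find` are assumptions for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class KruskalSketch {
    // Hypothetical edge encoding: each edge is {u, v, weight}, nodes are 0..n-1.
    static List<int[]> minimumSpanningTree(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // every node starts in its own component
        }

        // Consider edges in order of increasing weight (sort a shallow copy).
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, Comparator.comparingInt(e -> e[2]));

        List<int[]> mst = new ArrayList<>();
        for (int[] e : sorted) {
            int rootU = find(parent, e[0]);
            int rootV = find(parent, e[1]);
            if (rootU != rootV) {
                // No path between u and v yet, so this edge cannot form a cycle.
                parent[rootU] = rootV;
                mst.add(e);
            }
        }
        return mst;
    }

    // Find with path halving: each visited node is re-pointed to its grandparent.
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }
}
```

The union step here is unweighted to keep the sketch short; a weighted union could be swapped in without changing the overall structure.
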
## Notes
Disjoint Set is a data structure designed to keep track of a set of elements partitioned into a number of
non-overlapping subsets. It is not suited for handling duplicates and so our implementation ignores duplicates.
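
One way duplicates might be ignored is sketched below. This is not the implementation referred to above: the class `GenericDisjointSet`, its `add` method, and the decision to treat two objects as duplicates when `equals`/`hashCode` say so are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical generic disjoint set keyed by object, backed by hash maps.
class GenericDisjointSet<T> {
    private final Map<T, T> parent = new HashMap<>();
    private final Map<T, Integer> size = new HashMap<>();

    // Adds item as a new singleton subset. A duplicate is ignored:
    // nothing changes and false is returned.
    boolean add(T item) {
        if (parent.containsKey(item)) {
            return false;
        }
        parent.put(item, item);
        size.put(item, 1);
        return true;
    }

    // Returns the representative of item's subset, compressing the path.
    T find(T item) {
        if (!parent.containsKey(item)) {
            throw new IllegalArgumentException("unknown element: " + item);
        }
        T root = item;
        while (!root.equals(parent.get(root))) {
            root = parent.get(root);
        }
        T current = item;
        while (!current.equals(root)) {
            T next = parent.get(current);
            parent.put(current, root);
            current = next;
        }
        return root;
    }

    boolean connected(T a, T b) {
        return find(a).equals(find(b));
    }

    // Weighted union: attach the smaller subset under the larger one.
    void union(T a, T b) {
        T rootA = find(a);
        T rootB = find(b);
        if (rootA.equals(rootB)) {
            return;
        }
        if (size.get(rootA) < size.get(rootB)) {
            T tmp = rootA;
            rootA = rootB;
            rootB = tmp;
        }
        parent.put(rootB, rootA);
        size.put(rootA, size.get(rootA) + size.get(rootB));
    }

    // Number of distinct elements tracked.
    int count() {
        return parent.size();
    }
}
```

Because a repeated `add` leaves the maps untouched, re-adding an element does not detach it from the subset it has already been merged into.
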