Commit e218d40

Union-Find: add a new module in kernel library
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Xavier <[email protected]>
commit 93c8332
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/93c8332c.failed

This patch implements a union-find data structure in the kernel library,
which includes operations for allocating nodes, freeing nodes,
finding the root of a node, and merging two nodes.

Signed-off-by: Xavier <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
(cherry picked from commit 93c8332)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	MAINTAINERS
#	lib/Makefile
1 parent 741907a commit e218d40

1 file changed

+375
-0
lines changed

@@ -0,0 +1,375 @@
Union-Find: add a new module in kernel library

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Xavier <[email protected]>
commit 93c8332c8373fee415bd79f08d5ba4ba7ca5ad15
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/93c8332c.failed

This patch implements a union-find data structure in the kernel library,
which includes operations for allocating nodes, freeing nodes,
finding the root of a node, and merging two nodes.

Signed-off-by: Xavier <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
(cherry picked from commit 93c8332c8373fee415bd79f08d5ba4ba7ca5ad15)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	MAINTAINERS
#	lib/Makefile
diff --cc MAINTAINERS
index 2da3ca855dd2,82e3924816d2..000000000000
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@@ -19844,13 -23458,14 +19844,24 @@@ F: drivers/cdrom/cdrom.
  F: include/linux/cdrom.h
  F: include/uapi/linux/cdrom.h

++<<<<<<< HEAD
 +UNISYS S-PAR DRIVERS
 +M: David Kershner <[email protected]>
 +L: [email protected] (Unisys internal)
 +S: Supported
 +F: drivers/staging/unisys/
 +F: drivers/visorbus/
 +F: include/linux/visorbus.h
++=======
+ UNION-FIND
+ M: Xavier <[email protected]>

+ S: Maintained
+ F: Documentation/core-api/union_find.rst
+ F: Documentation/translations/zh_CN/core-api/union_find.rst
+ F: include/linux/union_find.h
+ F: lib/union_find.c
++>>>>>>> 93c8332c8373 (Union-Find: add a new module in kernel library)

  UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER
  R: Alim Akhtar <[email protected]>
diff --cc lib/Makefile
index 0b831df6ccf4,a5e3c1d5b6f9..000000000000
--- a/lib/Makefile
+++ b/lib/Makefile
@@@ -34,9 -34,10 +34,13 @@@ lib-y := ctype.o string.o vsprintf.o cm
  	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
  	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
  	 nmi_backtrace.o win_minmax.o memcat_p.o \
++<<<<<<< HEAD
 +	 buildid.o cpumask.o
++=======
+ 	 buildid.o objpool.o union_find.o
++>>>>>>> 93c8332c8373 (Union-Find: add a new module in kernel library)

  lib-$(CONFIG_PRINTK) += dump_stack.o
 -lib-$(CONFIG_SMP) += cpumask.o

  lib-y += kobject.o klist.o
  obj-y += lockref.o
diff --git a/Documentation/core-api/union_find.rst b/Documentation/core-api/union_find.rst
new file mode 100644
index 000000000000..2bf0290c9184
--- /dev/null
+++ b/Documentation/core-api/union_find.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Union-Find in Linux
+====================
+
+
+:Date: June 21, 2024
+:Author: Xavier <[email protected]>
+
+What is union-find, and what is it used for?
+------------------------------------------------
+
+Union-find is a data structure used to handle the merging and querying
+of disjoint sets. The primary operations supported by union-find are:
+
+	Initialization: Make each element its own single-element set, with
+		its parent pointer pointing to itself.
+	Find: Determine which set a particular element belongs to, usually by
+		returning a “representative element” of that set. This operation
+		is used to check if two elements are in the same set.
+	Union: Merge two sets into one.
+
+As a data structure used to maintain sets (groups), union-find is commonly
+utilized to solve problems related to offline queries, dynamic connectivity,
+and graph theory. It is also a key component in Kruskal's algorithm for
+computing the minimum spanning tree, which is crucial in scenarios like
+network routing. Consequently, union-find is widely used. Additionally,
+union-find has applications in symbolic computation, register allocation,
+and more.
+
+Space Complexity: O(n), where n is the number of nodes.
+
+Time Complexity: Path compression shortens the paths followed by the find
+operation, and union by rank keeps the trees shallow when sets are merged.
+Together, these optimizations reduce the amortized time complexity of
+each find and union operation to O(α(n)), where α(n) is the inverse
+Ackermann function. For practical purposes, this can be treated as
+constant time.
+
+This document covers use of the Linux union-find implementation. For more
+information on the nature and implementation of union-find, see:
+
+	Wikipedia entry on union-find
+	https://en.wikipedia.org/wiki/Disjoint-set_data_structure
+
+Linux implementation of union-find
+-----------------------------------
+
+Linux's union-find implementation resides in the file "lib/union_find.c".
+To use it, "#include <linux/union_find.h>".
+
+The union-find data structure is defined as follows::
+
+	struct uf_node {
+		struct uf_node *parent;
+		unsigned int rank;
+	};
+
+In this structure, parent points to the parent node of the current node; a
+root points to itself. The rank field is an upper bound on the height of the
+tree rooted at the node. During a union operation, the root with the smaller
+rank is attached under the root with the larger rank to maintain balance.
+
+Initializing union-find
+-----------------------
+
+A node can be initialized either statically, with the UF_INIT_NODE() macro, or
+at run time, with uf_node_init(). Both set the parent pointer to the node
+itself and the rank to 0.
+Example::
+
+	struct uf_node my_node = UF_INIT_NODE(my_node);
+or
+	uf_node_init(&my_node);
+
+Find the Root Node of union-find
+--------------------------------
+
+This operation is mainly used to determine whether two nodes belong to the same
+set: if they have the same root, they are in the same set. During the find
+operation, the path is compressed (each visited node is re-pointed to its
+grandparent), which speeds up subsequent find operations.
+Example::
+
+	int connected;
+	struct uf_node *root1 = uf_find(&node_1);
+	struct uf_node *root2 = uf_find(&node_2);
+	if (root1 == root2)
+		connected = 1;
+	else
+		connected = 0;
+
+Union Two Sets in union-find
+----------------------------
+
+uf_union() finds the roots of the two sets and links the root with the smaller
+rank under the root with the larger rank. If the ranks are equal, the second
+root is linked under the first, whose rank is then incremented.
+Example::
+
+	uf_union(&node_1, &node_2);
diff --git a/Documentation/translations/zh_CN/core-api/union_find.rst b/Documentation/translations/zh_CN/core-api/union_find.rst
new file mode 100644
index 000000000000..a56de57147e9
--- /dev/null
+++ b/Documentation/translations/zh_CN/core-api/union_find.rst
@@ -0,0 +1,87 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/core-api/union_find.rst
+
+===========================
+Linux中的并查集(Union-Find)
+===========================
+
+
+:日期: 2024年6月21日
+:作者: Xavier <[email protected]>
+
+何为并查集,它有什么用?
+---------------------
+
+并查集是一种数据结构,用于处理一些不交集的合并及查询问题。并查集支持的主要操作:
+	初始化:将每个元素初始化为单独的集合,每个集合的初始父节点指向自身
+	查询:查询某个元素属于哪个集合,通常是返回集合中的一个“代表元素”。这个操作是为
+		了判断两个元素是否在同一个集合之中。
+	合并:将两个集合合并为一个。
+
+并查集作为一种用于维护集合(组)的数据结构,它通常用于解决一些离线查询、动态连通性和
+图论等相关问题,同时也是用于计算最小生成树的克鲁斯克尔算法中的关键,由于最小生成树在
+网络路由等场景下十分重要,并查集也得到了广泛的引用。此外,并查集在符号计算,寄存器分
+配等方面也有应用。
+
+空间复杂度: O(n),n为节点数。
+
+时间复杂度:使用路径压缩可以减少查找操作的时间复杂度,使用按秩合并可以减少合并操作的
+时间复杂度,使得并查集每个查询和合并操作的平均时间复杂度仅为O(α(n)),其中α(n)是反阿
+克曼函数,可以粗略地认为并查集的操作有常数的时间复杂度。
+
+本文档涵盖了对Linux并查集实现的使用方法。更多关于并查集的性质和实现的信息,参见:
+
+	维基百科并查集词条
+	https://en.wikipedia.org/wiki/Disjoint-set_data_structure
+
+并查集的Linux实现
+----------------
+
+Linux的并查集实现在文件“lib/union_find.c”中。要使用它,需要
+“#include <linux/union_find.h>”。
+
+并查集的数据结构定义如下::
+
+	struct uf_node {
+		struct uf_node *parent;
+		unsigned int rank;
+	};
+其中parent为当前节点的父节点,rank为当前树的高度,在合并时将rank小的节点接到rank大
+的节点下面以增加平衡性。
+
+初始化并查集
+---------
+
+可以采用静态或初始化接口完成初始化操作。初始化时,parent 指针指向自身,rank 设置
+为 0。
+示例::
+
+	struct uf_node my_node = UF_INIT_NODE(my_node);
+或
+	uf_node_init(&my_node);
+
+查找并查集的根节点
+----------------
+
+主要用于判断两个并查集是否属于一个集合,如果根相同,那么他们就是一个集合。在查找过程中
+会对路径进行压缩,提高后续查找效率。
+示例::
+
+	int connected;
+	struct uf_node *root1 = uf_find(&node_1);
+	struct uf_node *root2 = uf_find(&node_2);
+	if (root1 == root2)
+		connected = 1;
+	else
+		connected = 0;
+
+合并两个并查集
+-------------
+
+对于两个相交的并查集进行合并,会首先查找它们各自的根节点,然后根据根节点秩大小,将小的
+节点连接到大的节点下面。
+示例::
+
+	uf_union(&node_1, &node_2);
* Unmerged path MAINTAINERS
diff --git a/include/linux/union_find.h b/include/linux/union_find.h
new file mode 100644
index 000000000000..cfd49263c138
--- /dev/null
+++ b/include/linux/union_find.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_UNION_FIND_H
+#define __LINUX_UNION_FIND_H
+/**
+ * union_find.h - union-find data structure implementation
+ *
+ * This header provides functions and structures to implement the union-find
+ * data structure. The union-find data structure is used to manage disjoint
+ * sets and supports efficient union and find operations.
+ *
+ * See Documentation/core-api/union_find.rst for documentation and samples.
+ */
+
+struct uf_node {
+	struct uf_node *parent;
+	unsigned int rank;
+};
+
+/* This macro is used for static initialization of a union-find node. */
+#define UF_INIT_NODE(node)	{.parent = &(node), .rank = 0}
+
+/**
+ * uf_node_init - Initialize a union-find node
+ * @node: pointer to the union-find node to be initialized
+ *
+ * This function sets the parent of the node to itself and
+ * initializes its rank to 0.
+ */
+static inline void uf_node_init(struct uf_node *node)
+{
+	node->parent = node;
+	node->rank = 0;
+}
+
+/* Find the root of the set containing a node */
+struct uf_node *uf_find(struct uf_node *node);
+
+/* Merge the sets containing two nodes */
+void uf_union(struct uf_node *node1, struct uf_node *node2);
+
+#endif /* __LINUX_UNION_FIND_H */
* Unmerged path lib/Makefile
diff --git a/lib/union_find.c b/lib/union_find.c
new file mode 100644
index 000000000000..413b0f8adf7a
--- /dev/null
+++ b/lib/union_find.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/union_find.h>
+
+/**
+ * uf_find - Find the root of a node and perform path compression
+ * @node: the node to find the root of
+ *
+ * Follow the parent pointers to the root. Each visited node is re-pointed to
+ * its grandparent (a one-pass form of path compression) to flatten the tree.
+ *
+ * Returns the root node of the set containing node.
+ */
+struct uf_node *uf_find(struct uf_node *node)
+{
+	struct uf_node *parent;
+
+	while (node->parent != node) {
+		parent = node->parent;
+		node->parent = parent->parent;
+		node = parent;
+	}
+	return node;
+}
+
+/**
+ * uf_union - Merge two sets, using union by rank
+ * @node1: the first node
+ * @node2: the second node
+ *
+ * This function merges the sets containing node1 and node2, attaching the
+ * root of lower rank under the root of higher rank to keep the tree balanced.
+ */
+void uf_union(struct uf_node *node1, struct uf_node *node2)
+{
+	struct uf_node *root1 = uf_find(node1);
+	struct uf_node *root2 = uf_find(node2);
+
+	if (root1 == root2)
+		return;
+
+	if (root1->rank < root2->rank) {
+		root1->parent = root2;
+	} else if (root1->rank > root2->rank) {
+		root2->parent = root1;
+	} else {
+		root2->parent = root1;
+		root1->rank++;
+	}
+}
