|
| 1 | +# AVL Trees |
| 2 | + |
| 3 | +## Background |
| 4 | +Is the fastest way to search for data to store it in an array, sort it and perform binary search? No. This |
| 5 | +incurs at least O(nlogn) sorting cost, and O(n) cost per insertion to maintain sorted order. |
| 6 | + |
| 7 | +We have seen binary search trees (BSTs), which always maintain data in sorted order. This allows us to avoid the |
| 8 | +overhead of sorting before we search. However, we also learnt that unbalanced BSTs can be incredibly inefficient for |
| 9 | +insertion, deletion and search operations, which are O(h) in time complexity, where h is the height of the tree (in a |
| 10 | +degenerate tree, h grows to n - 1, so operations can cost O(n)). |
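
To make the degenerate case concrete, here is a minimal sketch of a plain, non-balancing BST. The names `Node`, `bst_insert`, `height` and `build` are illustrative and not taken from this repository, and height is assumed to be counted in edges. Feeding it keys in sorted order produces a chain, while a friendlier insertion order of the same keys gives a shallow tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key into a plain BST with no rebalancing (iterative)."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def height(root):
    """Height in edges (assumed convention); the empty tree has height -1."""
    if root is None:
        return -1
    h, level = -1, [root]
    while level:  # walk the tree level by level
        h += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return h


def build(keys):
    """Insert the keys one by one, in the given order."""
    root = None
    for k in keys:
        root = bst_insert(root, k)
    return root


sorted_order = list(range(1, 16))
friendly_order = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
print(height(build(sorted_order)))    # 14: every key went to the right
print(height(build(friendly_order)))  # 3: a perfect tree on 15 keys
```

Both trees hold the same 15 keys; only the insertion order differs, which is exactly the sensitivity a self-balancing tree removes.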
| 11 | + |
| 12 | +Here we discuss a type of self-balancing BST, known as the AVL tree, that avoids the worst-case O(n) performance |
| 13 | +of these operations by carefully restructuring the tree whenever it changes |
| 14 | +(e.g. on insert or delete). |
| 15 | + |
| 16 | +### Definition of Balanced Trees |
| 17 | +Balanced trees are a special subset of trees with **height in the order of log(n)**, where n is the number of nodes. |
| 18 | +This choice is not an arbitrary one. It can be mathematically shown that a binary tree of n nodes has height of at least |
| 19 | +floor(log(n)) (attained by a complete binary tree, counting height in edges and taking log base 2). So, it makes |
| 20 | +intuitive sense to give trees whose heights are in the order of log(n) the desirable 'balanced' label. |
| 21 | + |
| 22 | +<div align="center"> |
| 23 | + <img src="../../../../../docs/assets/images/BalancedProof.png" width="40%"> |
| 24 | + <br> |
| 25 | + Credits: CS2040s Lecture 9 |
| 26 | +</div> |
| 27 | + |
| 28 | +### Height-Balanced Property of AVL Trees |
| 29 | +There are several ways to achieve a balanced tree. Red-black trees, B-trees, scapegoat trees and AVL trees ensure balance |
| 30 | +differently. Each of them relies on some underlying 'good' property to maintain balance - a careful colouring of nodes |
| 31 | +in the case of RB-trees and enforcing a depth constraint for B-trees. Go check them out in the other folders! <br> |
| 32 | +What is important is that this **'good' property holds even after every change** (insert/update/delete). |
| 33 | + |
| 34 | +The 'good' property in AVL trees is the **height-balanced** property. A node is height-balanced if the |
| 35 | +**heights of its left and right subtrees differ by at most 1**. <br> |
| 36 | +We say the tree is height-balanced if every node in the tree is height-balanced. Be careful not to conflate |
| 37 | +the concept of a "balanced tree" with the "height-balanced" property. They are not the same; the latter is used to |
| 38 | +achieve the former. |
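
The definition can be checked mechanically. The sketch below is illustrative only: `Node`, `check` and `is_height_balanced` are hypothetical names, and it assumes height is counted in edges with the empty tree at -1. It computes each subtree's height once and verifies the property at every node, not just at the root.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check(node):
    """Return (height, ok) for the subtree rooted at node.

    ok is True only if every node in the subtree is height-balanced.
    Height is in edges; the empty tree has height -1 (assumed convention).
    """
    if node is None:
        return -1, True
    left_h, left_ok = check(node.left)
    right_h, right_ok = check(node.right)
    ok = left_ok and right_ok and abs(left_h - right_h) <= 1
    return 1 + max(left_h, right_h), ok


def is_height_balanced(root):
    return check(root)[1]


small = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
# The root's two subtrees have equal height, but each subtree is itself a chain.
lopsided = Node(5,
                Node(3, Node(2, Node(1))),
                Node(7, None, Node(8, None, Node(9))))

print(is_height_balanced(small))     # True
print(is_height_balanced(chain))     # False
print(is_height_balanced(lopsided))  # False
```

The last example is why the property is stated for every node: comparing only the root's two subtrees would wrongly accept it.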
| 39 | + |
| 40 | +<details> |
| 41 | +<summary> <b>Ponder..</b> </summary> |
| 42 | +Consider any two nodes (need not have the same immediate parent node) in the tree. Is the difference in height |
| 43 | +between the two nodes <= 1 too? |
| 44 | +</details> |
| 45 | + |
| 46 | +It can be mathematically shown that a **height-balanced tree with n nodes has height at most 2log(n)**. Therefore, |
| 47 | +following the definition of a balanced tree, AVL trees are balanced. |
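
One common way to see this bound (it may or may not be the argument on the lecture slide) is to count the fewest nodes a height-balanced tree of height h can have. Such a tree needs a root, one subtree of height h - 1 and, at worst, another of height h - 2, giving N(h) = 1 + N(h - 1) + N(h - 2). The sketch below evaluates that recurrence and checks h <= 2log(N(h)) numerically; `min_nodes` is an illustrative name, height is assumed to be in edges, and log is taken base 2.

```python
import math


def min_nodes(h):
    """Fewest nodes in a height-balanced tree of height h (in edges).

    Uses N(-1) = 0, N(0) = 1 and N(h) = 1 + N(h - 1) + N(h - 2).
    """
    if h < 0:
        return 0
    prev, cur = 0, 1  # N(-1), N(0)
    for _ in range(h):
        prev, cur = cur, 1 + cur + prev
    return cur


print([min_nodes(h) for h in range(6)])  # [1, 2, 4, 7, 12, 20]

# Any height-balanced tree of height h has n >= N(h) nodes,
# so h <= 2*log2(N(h)) implies h <= 2*log2(n).
print(all(h <= 2 * math.log2(min_nodes(h)) for h in range(40)))  # True
```

Since N(h) > 2N(h - 2), the minimum node count at least doubles every two levels, which is where the factor of 2 in front of log(n) comes from.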
| 48 | + |
| 49 | +<div align="center"> |
| 50 | + <img src="../../../../../docs/assets/images/AvlTree.png" width="40%"> |
| 51 | + <br> |
| 52 | + Credits: CS2040s Lecture 9 |
| 53 | +</div> |
| 54 | + |
| 55 | +## Complexity Analysis |
| 56 | +**Time** (search, insertion, deletion, predecessor and successor queries): O(height) = O(log(n)) |
| 57 | + |
| 58 | +**Space**: O(n) <br> |
| 59 | +where n is the number of elements (whatever the structure, it must store at least n nodes) |
| 60 | + |
| 61 | +## Operations |
| 62 | +Minimally, an AVL tree implementation must support the standard **insert**, **delete**, and **search** operations. |
| 63 | +**Update** can be simulated by searching for the old key, deleting it, and then inserting a node with the new key. |
| 64 | + |
| 65 | +Naturally, with insertions and deletions, the structure of the tree will change, and it may no longer satisfy the |
| 66 | +"height-balanced" property of the AVL tree. Without this property, we may lose our O(log(n)) run-time guarantee. |
| 67 | +Hence, we need some re-balancing operations. To do so, tree rotation operations are introduced. Below is one example. |
| 68 | + |
| 69 | +<div align="center"> |
| 70 | + <img src="../../../../../docs/assets/images/TreeRotation.png" width="40%"> |
| 71 | + <br> |
| 72 | + Credits: CS2040s Lecture 10 |
| 73 | +</div> |
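
As a rough companion to the figure, here is a minimal sketch of the two rotations and of insertion with rebalancing. It is not this repository's implementation: all names are illustrative, each node is assumed to cache its own height (in edges), duplicate keys are ignored, and deletion is left out.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0  # cached height in edges; a leaf has height 0


def height(node):
    return node.height if node is not None else -1


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    """Left height minus right height; an AVL node keeps this in {-1, 0, 1}."""
    return height(node.left) - height(node.right)


def rotate_right(y):
    """Lift y's left child x above y and return x as the new subtree root."""
    x = y.left
    y.left = x.right  # x's right subtree sits between x and y in key order
    x.right = y
    update(y)         # y is now lower, so refresh it first
    update(x)
    return x


def rotate_left(x):
    """Mirror image of rotate_right."""
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rebalance(node):
    update(node)
    b = balance(node)
    if b > 1:                        # left-heavy
        if balance(node.left) < 0:   # left-right case: straighten first
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if b < -1:                       # right-heavy
        if balance(node.right) > 0:  # right-left case: straighten first
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def insert(node, key):
    """Insert key and return the (possibly new) root of this subtree."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicate key: ignored in this sketch (assumption)
    return rebalance(node)


root = None
for k in (1, 2, 3):
    root = insert(root, k)
print(root.key, root.left.key, root.right.key)  # 2 1 3
```

Inserting 1, 2, 3 in sorted order would give a chain in a plain BST; here the third insertion makes the root right-heavy and a single left rotation lifts 2 to the root. Note that a rotation only rewires a constant number of pointers, so the O(log(n)) cost of an insertion comes from walking the search path, not from the rotations themselves.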
| 74 | + |
| 75 | +Prof Seth explains it best! Go re-visit his slides (Lecture 10) for the operations :P <br> |
| 76 | +Here is a [link](https://www.youtube.com/watch?v=dS02_IuZPes&list=PLgpwqdiEMkHA0pU_uspC6N88RwMpt9rC8&index=9) |
| 77 | +for prof's lecture on trees. <br> |
| 78 | +_We may add a summary in the near future._ |
| 79 | + |
| 80 | +## Application |
| 81 | +While AVL trees offer excellent lookup, insertion, and deletion times due to their strict balancing, |
| 82 | +the overhead of maintaining this balance can make them less preferred for applications |
| 83 | +where insertions and deletions are significantly more frequent than lookups. As a result, AVL trees often find themselves |
| 84 | +overshadowed in practical use by counterparts like RB-trees, |
| 85 | +which tolerate looser balance and so rebalance less on updates, or B-trees, which are ideal for optimizing disk |
| 86 | +accesses in databases. |
| 87 | + |
| 88 | +That said, the AVL tree is conceptually simple and is often used as the base template for further augmentation to tackle |
| 89 | +niche problems. Orthogonal range searching and interval trees can be implemented with some minor augmentation to |
| 90 | +an existing AVL tree. |