|
| 1 | +<h1 align='center'>Implement - Trie - Prefix Tree</h1> |
| 2 | + |
| 3 | +## Problem Statement |
| 4 | + |
| 5 | +**Problem URL :** [Implement Trie (Prefix Tree)](https://leetcode.com/problems/implement-trie-prefix-tree/) |
| 6 | + |
| 7 | + |
| 8 | + |
| 9 | + |
| 10 | +## Problem Explanation |
| 11 | +A **Trie** (also known as a **Prefix Tree**) is a special tree-like data structure used for storing strings, where each node represents a character of the string. It is used to perform efficient retrieval operations for prefix-based searches. |
| 12 | + |
| 13 | +#### **Problem Overview** |
| 14 | +You need to implement a **Trie** with the following methods: |
| 15 | +1. **`insert(word)`** - Inserts a word into the Trie. |
| 16 | +2. **`search(word)`** - Returns `true` if the word exists in the Trie, otherwise returns `false`. |
| 17 | +3. **`startsWith(prefix)`** - Returns `true` if there is any word in the Trie that starts with the given prefix. |
| 18 | + |
| 19 | +#### **Example Explanation** |
| 20 | + |
| 21 | +Let's take an example to understand how the Trie works: |
| 22 | +- Insert the word `"apple"`. |
| 23 | +- Insert the word `"app"`. |
| 24 | +- Insert the word `"banana"`. |
| 25 | + |
| 26 | +Here is what happens step by step: |
| 27 | + |
| 28 | +1. **Inserting "apple"**: |
| 29 | + - Start with an empty Trie. |
| 30 | + - Add `'a'` as a child of the root. |
| 31 | + - Add `'p'` under `'a'`, `'p'` under the first `'p'`, `'l'` under `'p'`, and `'e'` under `'l'`. |
| 32 | + - Mark `'e'` as the terminal character. |
| 33 | + |
| 34 | + The Trie now looks like: |
| 35 | + ``` |
| 36 | + root -> a -> p -> p -> l -> e (isTerminal) |
| 37 | + ``` |
| 38 | + |
| 39 | +2. **Inserting "app"**: |
| 40 | + - Start at the root. |
| 41 | + - The path `'a' -> 'p' -> 'p'` already exists from `"apple"`, so no new nodes are created. |
| 42 | + - Mark the second `'p'` as terminal. |
| 43 | + |
| 44 | + The Trie now looks like: |
| 45 | + ``` |
| 46 | + root -> a -> p -> p (isTerminal) -> l -> e (isTerminal) |
| 47 | + ``` |
| 48 | + |
| 49 | +3. **Inserting "banana"**: |
| 50 | + - Add `'b'`, `'a'`, `'n'`, `'a'`, `'n'`, `'a'` following the same approach. |
| 51 | + |
| 52 | + The Trie now looks like: |
| 53 | + ``` |
| 54 | + root -> a -> p -> p (isTerminal) -> l -> e (isTerminal) |
| 55 | + -> b -> a -> n -> a -> n -> a (isTerminal) |
| 56 | + ``` |
| 57 | + |
| 58 | +4. **Search for "app"**: |
| 59 | + - Traverse the Trie from the root, checking each character in the word `"app"`. |
| 60 | + - The path `'a' -> 'p' -> 'p'` exists and its last node is marked terminal, so return `true`. |
| 61 | + |
| 62 | +5. **Search for "appl"**: |
| 63 | + - Traverse the Trie. The path `'a' -> 'p' -> 'p' -> 'l'` exists because `"apple"` was inserted, but the `'l'` node is not marked terminal, so return `false`. |
| 64 | + |
| 65 | +6. **startsWith("ban")**: |
| 66 | + - Traverse the Trie. The characters `'b'`, `'a'`, `'n'` are all found, so return `true`. Unlike `search`, the last node does not need to be terminal. |
| 67 | + |
| 68 | +#### **Approach to Solve the Problem** |
| 69 | + |
| 70 | +1. **Create the TrieNode class**: |
| 71 | + - Each node in the Trie stores: |
| 72 | + - A character `data`. |
| 73 | + - An array `children[26]` to represent 26 possible children (one for each lowercase English letter, which is the only input the problem allows). |
| 74 | + - A boolean `isTerminal` to indicate if the node marks the end of a word. |
| 75 | + |
| 76 | +2. **Create the Trie class**: |
| 77 | + - The Trie class has a `root` node, which is a TrieNode. |
| 78 | + - The `insert` method will insert characters into the Trie recursively. |
| 79 | + - The `search` method will check whether a word exists by traversing the Trie. |
| 80 | + - The `startsWith` method will check if a prefix exists by traversing the Trie. |
| 81 | + |
| 82 | +## Problem Solution |
| 83 | +```cpp |
| 84 | +class TrieNode{ |
| 85 | + public: |
| 86 | + char data; |
| 87 | + TrieNode* children[26]; |
| 88 | + bool isTerminal; |
| 89 | + |
| 90 | + TrieNode(char data){ |
| 91 | + this -> data = data; |
| 92 | + for(int i = 0; i < 26; i++) children[i] = NULL; |
| 93 | + isTerminal = false; |
| 94 | + } |
| 95 | +}; |
| 96 | +class Trie { |
| 97 | +public: |
| 98 | + TrieNode* root; |
| 99 | + |
| 100 | + Trie() { |
| 101 | + root = new TrieNode('\0'); |
| 102 | + } |
| 103 | + |
| 104 | + void insertUtil(TrieNode* root, const string& word, int index){ |
| 105 | + if(index == word.size()){ |
| 106 | + root -> isTerminal = true; |
| 107 | + return; |
| 108 | + } |
| 109 | + |
| 110 | + int charIndex = word[index] - 'a'; |
| 111 | + if(root -> children[charIndex] == NULL) root -> children[charIndex] = new TrieNode(word[index]); |
| 112 | + |
| 113 | + insertUtil(root -> children[charIndex], word, index+1); |
| 114 | + } |
| 115 | + void insert(string word){ |
| 116 | + insertUtil(root, word, 0); |
| 117 | + } |
| 118 | + |
| 119 | + bool searchUtil(TrieNode* root, const string& word, int index){ |
| 120 | + if(index == word.size()) return root -> isTerminal; |
| 121 | + |
| 122 | + int charIndex = word[index] - 'a'; |
| 123 | + if(root -> children[charIndex] == NULL) return false; |
| 124 | + |
| 125 | + return searchUtil(root -> children[charIndex], word, index+1); |
| 126 | + |
| 127 | + } |
| 128 | + bool search(string word) { |
| 129 | + return searchUtil(root, word, 0); |
| 130 | + } |
| 131 | + |
| 132 | + bool startsWithUtil(TrieNode* root, const string& prefix, int index){ |
| 133 | + if(index == prefix.size()) return true; |
| 134 | + |
| 135 | + int charIndex = prefix[index] - 'a'; |
| 136 | + if(root -> children[charIndex] == NULL) return false; |
| 137 | + |
| 138 | + return startsWithUtil(root -> children[charIndex], prefix, index+1); |
| 139 | + } |
| 140 | + |
| 141 | + bool startsWith(string prefix){ |
| 142 | + return startsWithUtil(root, prefix, 0); |
| 143 | + } |
| 144 | + |
| 145 | +}; |
| 146 | + |
| 147 | +/** |
| 148 | + * Your Trie object will be instantiated and called as such: |
| 149 | + * Trie* obj = new Trie(); |
| 150 | + * obj->insert(word); |
| 151 | + * bool param_2 = obj->search(word); |
| 152 | + * bool param_3 = obj->startsWith(prefix); |
| 153 | + */ |
| 154 | +``` |
| 155 | +
|
| 156 | +## Problem Solution Explanation |
| 157 | +Here’s a detailed explanation of the given **Trie** implementation in C++: |
| 158 | +
|
| 159 | +```cpp |
| 160 | +class TrieNode { |
| 161 | +public: |
| 162 | + char data; |
| 163 | + TrieNode* children[26]; // Array to hold 26 children (for each letter in the alphabet) |
| 164 | + bool isTerminal; // Marks if the node represents the end of a word |
| 165 | +
|
| 166 | + TrieNode(char data) { |
| 167 | + this->data = data; // Initialize the node with the given character |
| 168 | + for (int i = 0; i < 26; i++) { |
| 169 | + children[i] = NULL; // Initialize all children pointers to NULL (no child initially) |
| 170 | + } |
| 171 | + isTerminal = false; // Initially, the node is not terminal (it doesn't mark the end of a word) |
| 172 | + } |
| 173 | +}; |
| 174 | +``` |
| 175 | + |
| 176 | +#### **Explanation of TrieNode class**: |
| 177 | +1. **`char data`**: |
| 178 | + - This stores the character of the current node. Each node in the Trie represents a single character of the word. |
| 179 | + |
| 180 | +2. **`TrieNode* children[26]`**: |
| 181 | + - This is an array of pointers to child TrieNodes. Each index of this array corresponds to a letter from `a` to `z`. For example, `children[0]` points to the child node for 'a', `children[1]` to the child node for 'b', and so on. A `NULL` entry means no word continues with that letter. |
| 182 | + |
| 183 | +3. **`bool isTerminal`**: |
| 184 | + - This boolean flag indicates whether the current node is the end of a word. If `isTerminal` is `true`, it means the node marks the last character of a complete word. If it's `false`, it is just an intermediate node. |
| 185 | + |
| 186 | +4. **Constructor**: |
| 187 | + - The constructor initializes the node with a character (`data`), and all child pointers are set to `NULL`. It also sets `isTerminal` to `false`. |
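
The mapping from a letter to a slot in `children` is only valid for lowercase `a` to `z`. The problem constraints guarantee that, but for other inputs `word[index] - 'a'` would index outside the array. As a minimal sketch of a guard (the helper name `childIndex` is hypothetical and not part of the solution above):

```cpp
// Hypothetical helper, not part of the original solution.
// Maps 'a'..'z' to 0..25 and returns -1 for any other character,
// so callers can reject input before indexing children[].
int childIndex(char c) {
    if (c < 'a' || c > 'z') return -1;
    return c - 'a';
}
```

A caller would check for `-1` before touching `children[charIndex]`, for example by returning `false` from `search`.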
| 188 | + |
| 189 | +```cpp |
| 190 | +class Trie { |
| 191 | +public: |
| 192 | + TrieNode* root; // The root of the Trie (doesn't store any character) |
| 193 | +``` |
| 194 | +
|
| 195 | +#### **Explanation of the Trie class**: |
| 196 | +- **`TrieNode* root`**: |
| 197 | + - This is a pointer to the root node of the Trie. The root node doesn't store any character (`'\0'`), and it is just the starting point for all words inserted into the Trie. |
| 198 | +
|
| 199 | +```cpp |
| 200 | + Trie() { |
| 201 | + root = new TrieNode('\0'); // Initialize the root with a dummy character ('\0') |
| 202 | + } |
| 203 | +``` |
| 204 | + |
| 205 | +#### **Trie Constructor**: |
| 206 | +- The constructor creates the root node of the Trie. The root node is initialized with a null character (`'\0'`) since it does not represent any specific letter but serves as the starting point for all insertions. |
| 207 | + |
| 208 | +```cpp |
| 209 | + void insertUtil(TrieNode* root, const string& word, int index) { // by reference: no copy of word per call |
| 210 | + if (index == word.size()) { |
| 211 | + root->isTerminal = true; // If we've reached the end of the word, mark the node as terminal |
| 212 | + return; |
| 213 | + } |
| 214 | + |
| 215 | + int charIndex = word[index] - 'a'; // Calculate the index of the character (0 for 'a', 1 for 'b', etc.) |
| 216 | + |
| 217 | + // If the child node for the current character doesn't exist, create a new TrieNode |
| 218 | + if (root->children[charIndex] == NULL) { |
| 219 | + root->children[charIndex] = new TrieNode(word[index]); |
| 220 | + } |
| 221 | + |
| 222 | + // Recursively insert the remaining characters of the word |
| 223 | + insertUtil(root->children[charIndex], word, index + 1); |
| 224 | + } |
| 225 | +``` |
| 226 | +
|
| 227 | +#### **`insertUtil` function**: |
| 228 | +- **Base Case**: |
| 229 | + - The function checks if `index == word.size()`. If true, every character of the word has been consumed. The current node (represented by `root`) is marked as a terminal node (`root->isTerminal = true`), indicating that a word ends here. |
| 230 | + |
| 231 | +- **Calculate `charIndex`**: |
| 232 | + - The `charIndex` variable maps the character to the corresponding index in the `children` array. For example, if the character is `'a'`, then `charIndex = 0`; for `'b'`, `charIndex = 1`; and so on. |
| 233 | + |
| 234 | +- **Check for Existing Child Node**: |
| 235 | + - If the child node for the current character (`root->children[charIndex]`) does not exist, a new `TrieNode` is created and stored at that position in the `children` array. |
| 236 | + |
| 237 | +- **Recursive Insertion**: |
| 238 | + - The function is then called recursively on the child node of the current character (`root->children[charIndex]`) and proceeds with the next character in the word (`index + 1`). |
| 239 | +
|
| 240 | +```cpp |
| 241 | + void insert(string word) { |
| 242 | + insertUtil(root, word, 0); // Start the insertion from the root node |
| 243 | + } |
| 244 | +``` |
| 245 | + |
| 246 | +#### **`insert` function**: |
| 247 | +- This is the public function that calls the `insertUtil` function, passing the root node, the word to be inserted, and the starting index (`0`). |
| 248 | + |
| 249 | +```cpp |
| 250 | + bool searchUtil(TrieNode* root, const string& word, int index) { // by reference: no copy of word per call |
| 251 | + if (index == word.size()) { |
| 252 | + return root->isTerminal; // If we've reached the end of the word, return whether the node is terminal |
| 253 | + } |
| 254 | + |
| 255 | + int charIndex = word[index] - 'a'; // Calculate the index of the character |
| 256 | + |
| 257 | + // If the child node for the current character doesn't exist, return false |
| 258 | + if (root->children[charIndex] == NULL) { |
| 259 | + return false; |
| 260 | + } |
| 261 | + |
| 262 | + // Recursively search for the remaining characters of the word |
| 263 | + return searchUtil(root->children[charIndex], word, index + 1); |
| 264 | + } |
| 265 | +``` |
| 266 | +
|
| 267 | +#### **`searchUtil` function**: |
| 268 | +- **Base Case**: |
| 269 | + - When the `index` reaches the end of the word (`index == word.size()`), the function checks whether the current node (`root`) is terminal. If `root->isTerminal` is `true`, the word exists in the Trie and the function returns `true`. If not, the characters are only a prefix of some longer word and it returns `false`. |
| 270 | + |
| 271 | +- **Character Index Calculation**: |
| 272 | + - The index for the current character is calculated in the same way as in the `insertUtil` function. |
| 273 | + |
| 274 | +- **Child Node Check**: |
| 275 | + - If the child node for the current character (`root->children[charIndex]`) is `NULL`, the word doesn’t exist in the Trie, so the function returns `false`. |
| 276 | + |
| 277 | +- **Recursive Search**: |
| 278 | + - The function calls itself recursively for the next character in the word (`index + 1`) until the whole word is checked. |
| 279 | +
|
| 280 | +```cpp |
| 281 | + bool search(string word) { |
| 282 | + return searchUtil(root, word, 0); // Start the search from the root node |
| 283 | + } |
| 284 | +``` |
| 285 | + |
| 286 | +#### **`search` function**: |
| 287 | +- This is the public function that calls the `searchUtil` function, passing the root node, the word to search for, and the starting index (`0`). |
| 288 | + |
| 289 | +```cpp |
| 290 | + bool startsWithUtil(TrieNode* root, const string& prefix, int index) { // by reference: no copy per call |
| 291 | + if (index == prefix.size()) { |
| 292 | + return true; // If we've reached the end of the prefix, return true |
| 293 | + } |
| 294 | + |
| 295 | + int charIndex = prefix[index] - 'a'; // Calculate the index of the character |
| 296 | + |
| 297 | + // If the child node for the current character doesn't exist, return false |
| 298 | + if (root->children[charIndex] == NULL) { |
| 299 | + return false; |
| 300 | + } |
| 301 | + |
| 302 | + // Recursively check for the remaining characters of the prefix |
| 303 | + return startsWithUtil(root->children[charIndex], prefix, index + 1); |
| 304 | + } |
| 305 | +``` |
| 306 | +
|
| 307 | +#### **`startsWithUtil` function**: |
| 308 | +- **Base Case**: |
| 309 | + - If the `index` equals the size of the prefix (`index == prefix.size()`), the entire prefix has been found, so the function returns `true`. Unlike `searchUtil`, it does not look at `isTerminal`. |
| 310 | + |
| 311 | +- **Character Index Calculation**: |
| 312 | + - As in the other functions, `charIndex` maps the current character of the prefix to the correct position in the `children` array. |
| 313 | + |
| 314 | +- **Child Node Check**: |
| 315 | + - If the child node for the current character doesn’t exist, no word with the given prefix exists, so it returns `false`. |
| 316 | + |
| 317 | +- **Recursive Check**: |
| 318 | + - The function calls itself recursively to check for the next character in the prefix. |
| 319 | +
|
| 320 | +```cpp |
| 321 | + bool startsWith(string prefix) { |
| 322 | + return startsWithUtil(root, prefix, 0); // Start checking the prefix from the root node |
| 323 | + } |
| 324 | +}; |
| 325 | +``` |
| 326 | + |
| 327 | +#### **`startsWith` function**: |
| 328 | +- This public function calls `startsWithUtil`, passing the root node, the prefix, and the starting index (`0`). |
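
`searchUtil` and `startsWithUtil` perform the same traversal and differ only in what they return at the end. As a rough sketch of that observation, the three operations can be written as loops that share one traversal helper. The names `Node`, `IterativeTrie`, and `walk` are illustrative and not part of the solution above, and the sketch still assumes input restricted to lowercase `a` to `z`:

```cpp
#include <string>

// Same layout as TrieNode above, without the `data` field (it is never read).
struct Node {
    Node* children[26] = {};   // every slot starts as nullptr
    bool isTerminal = false;
};

class IterativeTrie {
    Node* root = new Node();   // as in the original, nodes are never freed

    // Follows `s` from the root. Returns the node reached, or nullptr
    // as soon as a character has no matching child.
    Node* walk(const std::string& s) const {
        Node* cur = root;
        for (char c : s) {
            cur = cur->children[c - 'a'];   // assumes 'a'..'z' only
            if (cur == nullptr) return nullptr;
        }
        return cur;
    }

public:
    void insert(const std::string& word) {
        Node* cur = root;
        for (char c : word) {
            Node*& next = cur->children[c - 'a'];
            if (next == nullptr) next = new Node();   // create missing child
            cur = next;
        }
        cur->isTerminal = true;   // a word ends at this node
    }

    // The whole path must exist AND end on a terminal node.
    bool search(const std::string& word) const {
        Node* end = walk(word);
        return end != nullptr && end->isTerminal;
    }

    // The path only has to exist.
    bool startsWith(const std::string& prefix) const {
        return walk(prefix) != nullptr;
    }
};
```

With `"apple"`, `"app"`, and `"banana"` inserted, `search("appl")` is `false` while `startsWith("appl")` is `true`, because the path exists but its last node is not terminal.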
| 329 | + |
| 330 | + |
| 331 | +### Step 3: Example Walkthrough |
| 332 | + |
| 333 | +**Example 1:** |
| 334 | + |
| 335 | +```cpp |
| 336 | +Trie* obj = new Trie(); |
| 337 | +obj->insert("apple"); |
| 338 | +obj->insert("app"); |
| 339 | + |
| 340 | +bool searchResult = obj->search("apple"); // Expected: true |
| 341 | +bool startsWithResult = obj->startsWith("app"); // Expected: true |
| 342 | +``` |
| 343 | +
|
| 344 | +1. **Inserting "apple"**: |
| 345 | + - `'a' -> 'p' -> 'p' -> 'l' -> 'e'` (Terminal at 'e') |
| 346 | +2. **Inserting "app"**: |
| 347 | + - `'a' -> 'p' -> 'p'` (Terminal at second 'p') |
| 348 | +3. **Searching "apple"**: |
| 349 | + - The word exists in the Trie, so it returns `true`. |
| 350 | +4. **startsWith "app"**: |
| 351 | + - The prefix "app" exists in the Trie, so it returns `true`. |
| 352 | +
|
| 353 | +**Example 2:** |
| 354 | +
|
| 355 | +```cpp |
| 356 | +bool searchResult2 = obj->search("banana"); // Expected: false |
| 357 | +``` |
| 358 | + |
| 359 | +- The word `"banana"` doesn't exist in the Trie, so it returns `false`. |
| 360 | + |
| 361 | +### Step 4: Time and Space Complexity |
| 362 | + |
| 363 | +#### **Time Complexity:** |
| 364 | + |
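| 365 | +- **insert(word)**: The time complexity is **O(m)** where `m` is the length of the word. Each character costs one lookup in the fixed-size `children` array, which is constant time. This assumes the word reaches the recursive helper by reference; if the helper takes `string` by value, the word is copied on every call and the total work grows to O(m²). |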
| 366 | +- **search(word)**: The time complexity is **O(m)** where `m` is the length of the word. |
| 367 | +- **startsWith(prefix)**: The time complexity is **O(k)** where `k` is the length of the prefix. |
| 368 | + |
| 369 | +#### **Space Complexity:** |
| 370 | + |
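| 371 | +- The space complexity is **O(n)** where `n` is the total number of characters across all inserted words, reached when no two words share a prefix. Each node represents one character, and words with a common prefix share the nodes for that prefix. Every node also carries a fixed array of 26 pointers, so the constant factor is large. |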
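
To make the effect of prefix sharing concrete, the following sketch counts how many nodes are allocated for a list of words. The names `CountNode`, `insertCounting`, `freeNodes`, and `countNodes` are hypothetical helpers written for this illustration, and the input is again assumed to be lowercase `a` to `z`:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct CountNode {
    CountNode* children[26] = {};
    bool isTerminal = false;
};

// Inserts `word` under `root` and returns how many new nodes it allocated.
std::size_t insertCounting(CountNode* root, const std::string& word) {
    std::size_t created = 0;
    CountNode* cur = root;
    for (char c : word) {
        CountNode*& next = cur->children[c - 'a'];
        if (next == nullptr) {
            next = new CountNode();
            ++created;
        }
        cur = next;
    }
    cur->isTerminal = true;
    return created;
}

// Recursively releases a subtree.
void freeNodes(CountNode* node) {
    if (node == nullptr) return;
    for (CountNode* child : node->children) freeNodes(child);
    delete node;
}

// Total number of nodes, root included, after inserting every word.
std::size_t countNodes(const std::vector<std::string>& words) {
    CountNode* root = new CountNode();
    std::size_t total = 1;   // the root itself
    for (const std::string& w : words) total += insertCounting(root, w);
    freeNodes(root);
    return total;
}
```

For `"apple"`, `"app"`, and `"banana"` this gives 12 nodes: the root, five for `"apple"`, none for `"app"` (its path already exists), and six for `"banana"`. The three words contain 14 characters in total, so sharing saves three nodes here.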
| 372 | + |
| 373 | +### Step 5: Additional Recommendations |
| 374 | + |
| 375 | +1. **Edge Cases**: |
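| 376 | + - Empty strings never enter the recursion: `insert("")` marks the root as terminal, `search("")` returns the root's `isTerminal` flag, and `startsWith("")` always returns `true`. |
| 377 | + - A word that is a prefix of another word (such as `"app"` and `"apple"`) is handled by `isTerminal`: both words share one path, and the flag records where each one ends. |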
| 378 | +2. **Optimizations**: |
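| 379 | + - The fixed `children[26]` array costs 26 pointers per node even when most of them are unused. Storing children in a hash map gives up the direct array index in exchange for lower memory use on sparse tries, and it also supports characters outside `a` to `z`. The recursive helpers can also be written as loops, which avoids one stack frame per character. |
| 380 | +3. **Memory Management**: |
| 381 | + - Every node is allocated with `new` and never freed, because the `Trie` class has no destructor. That is acceptable for a short-lived LeetCode submission, but longer-running code would need a destructor that deletes the nodes recursively, or smart pointers such as `std::unique_ptr`. |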
| 382 | + |
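
The recommendations above can be combined in one variant. The sketch below is an illustration under stated assumptions, not a replacement for the submitted solution: `MapTrie` is a hypothetical name, children live in a `std::unordered_map` so any `char` is accepted, and `std::unique_ptr` frees the nodes automatically when the trie is destroyed.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

class MapTrie {
    struct Node {
        // Only the children that actually exist take up space.
        std::unordered_map<char, std::unique_ptr<Node>> children;
        bool isTerminal = false;
    };

    Node root;   // owned by value; child nodes are owned by unique_ptr

    // Returns the node reached by following `s`, or nullptr if the path breaks.
    const Node* walk(const std::string& s) const {
        const Node* cur = &root;
        for (char c : s) {
            auto it = cur->children.find(c);
            if (it == cur->children.end()) return nullptr;
            cur = it->second.get();
        }
        return cur;
    }

public:
    void insert(const std::string& word) {
        Node* cur = &root;
        for (char c : word) {
            std::unique_ptr<Node>& next = cur->children[c];
            if (!next) next.reset(new Node());   // create missing child
            cur = next.get();
        }
        cur->isTerminal = true;
    }

    bool search(const std::string& word) const {
        const Node* end = walk(word);
        return end != nullptr && end->isTerminal;
    }

    bool startsWith(const std::string& prefix) const {
        return walk(prefix) != nullptr;
    }
};
```

The empty-string behaviour matches the array-based version: `startsWith("")` is always `true`, and `search("")` is `true` only after `insert("")`. Hash lookups are constant time on average rather than in the worst case, so this trades some speed for memory and a wider alphabet.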