feat(strings): longest self contained substring #105
Merged
4 commits
- 1f84e8d feat(strings, self-contained-substring): longest self contained subst… (BrianLusina)
- a3d3e16 updating DIRECTORY.md
- 61fada7 refactor(strings, longest_self_contained_substring): add type hints (BrianLusina)
- b80c428 test(strings, longest_self_contained_substring): doc tests (BrianLusina)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,112 @@ | ||
# Find Longest Self-Contained Substring

You are given a string, s, consisting of lowercase English letters. Your task is to find the length of the longest
self-contained substring of s.

A substring t of s is called self-contained if:
- t is not equal to the entire string s.
- No character in t appears anywhere else in s (outside of t).

In other words, every occurrence in s of each character of t lies inside t.
Return the length of the longest self-contained substring. If no such substring exists, return -1.

Constraints:

- 2 ≤ s.length ≤ 1000
- s consists only of lowercase English letters.

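The definition above can be checked directly for a single candidate substring. The following is a minimal sketch, not part of this pull request; the helper name `is_self_contained` and its inclusive `start`/`end` index convention are assumptions made for illustration.

```python
def is_self_contained(s: str, start: int, end: int) -> bool:
    """Return True if s[start:end + 1] is a self-contained substring of s.

    Hypothetical helper: `start` and `end` are inclusive indices with
    0 <= start <= end < len(s).
    """
    # The entire string is excluded by definition.
    if start == 0 and end == len(s) - 1:
        return False
    inside = set(s[start:end + 1])
    outside = s[:start] + s[end + 1:]
    # No character of the candidate may occur in the rest of the string.
    return all(ch not in inside for ch in outside)
```

For "xyyx", the candidate "yy" (indices 1 to 2) passes this check, while "xy" (indices 0 to 1) fails because both letters occur again later in the string.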
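## Examples

![Example 1](./images/longest_self_contained_substring_example_1.png)
![Example 2](./images/longest_self_contained_substring_example_2.png)
![Example 3](./images/longest_self_contained_substring_example_3.png)
![Example 4](./images/longest_self_contained_substring_example_4.png)
![Example 5](./images/longest_self_contained_substring_example_5.png)
![Example 6](./images/longest_self_contained_substring_example_6.png)
![Example 7](./images/longest_self_contained_substring_example_7.png)
![Example 8](./images/longest_self_contained_substring_example_8.png)

---
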
## Solution

We iterate through the string once to record where each character first appears and where it last appears. Two separate
hash maps store each character’s first and last occurrence indices. Each unique character serves as a potential starting
point, defining an initial window from its first to last occurrence. The window is adjusted based on the following
conditions:

- As we iterate within this window, if we encounter a character whose last occurrence is further to the right than the
  current window’s end, we expand the window to include it. This ensures that the substring remains self-contained,
  meaning all occurrences of each character within it are included. The process continues until no more characters
  extend the window’s boundary.

- Every character inside the window must have its first occurrence at or after the window’s start. If we encounter a
  character whose first occurrence index is before the window’s starting position, an earlier part of the string
  contains an instance of that character, violating the self-contained property. When this happens, the window cannot
  be extended any further from this start, and we stop scanning from it. Windows that already closed before this point
  remain valid candidates.

The maximum valid window length is tracked and updated accordingly, ensuring the longest valid substring is returned.

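The two window conditions can be sketched for a single starting character. This is an illustrative sketch only: the helper names `occurrence_maps` and `first_closed_window` are hypothetical and do not appear in this pull request, and the sketch deliberately stops at the first point where the window closes.

```python
from typing import Optional


def occurrence_maps(s: str) -> tuple[dict[str, int], dict[str, int]]:
    """Record the first and last index of every character in one pass."""
    first: dict[str, int] = {}
    last: dict[str, int] = {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)  # only the first sighting is stored
        last[ch] = i             # overwritten until the last sighting
    return first, last


def first_closed_window(s: str, c: str) -> Optional[tuple[int, int]]:
    """Grow the window that starts at c's first occurrence until it closes.

    Returns inclusive (start, end) indices, or None if a character inside
    the window already occurred before the window's start.
    """
    first, last = occurrence_maps(s)
    start, end = first[c], last[c]
    j = start
    while j <= end:
        ch = s[j]
        if first[ch] < start:
            return None               # condition 2: window is invalid
        end = max(end, last[ch])      # condition 1: expand to the right
        j += 1
    return start, end
```

Note that the returned window may span the entire string (for example, starting from "x" in "xyxy"); excluding that case is left to the caller in this sketch.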
The steps of the algorithm are as follows:

1. Create two hash maps, first and last, to store the first and last occurrence index of each character in the string.
2. Iterate through the string once to populate these hash maps with each character’s first and last occurrence.
3. Initialize max_len = -1 to keep track of the maximum length of a valid self-contained substring found.
4. For each unique character c1 (processed once), start from its first occurrence index:

   - Initialize the window by setting start to the character’s first occurrence and end to its last occurrence in
     the string.

   - Iterate through the string from start with index j, extending end whenever a character’s last occurrence is
     beyond the current end.

   - If a character c2 inside the window has its first occurrence before start, the window is invalid, so stop
     scanning from this start.

5. Validate the substring:

   - When the current index j reaches end, the window has closed. It is valid if its length is less than the total
     string length.

   - If the window is valid, update max_len with the maximum of its current value and the window’s length (end - start + 1).

6. After checking all potential starting characters, return the maximum valid length found. If no valid substring exists,
   return -1.

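When debugging, it can help to see which substring the steps above select rather than only its length. The variant below is a sketch under that assumption; the function name `longest_self_contained` is hypothetical and is not the function added in this pull request. On ties it keeps the first window found.

```python
from typing import Optional


def longest_self_contained(s: str) -> Optional[str]:
    """Return one longest self-contained substring of s, or None if none exists."""
    # Steps 1-2: first and last occurrence of every character.
    first: dict[str, int] = {}
    last: dict[str, int] = {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)
        last[ch] = i

    best: Optional[tuple[int, int]] = None  # step 3, stored as inclusive indices

    # Step 4: try each distinct character's first occurrence as the window start.
    for c1 in first:
        start, end = first[c1], last[c1]
        for j in range(start, len(s)):
            c2 = s[j]
            if first[c2] < start:
                break                      # c2 also occurs before the window
            end = max(end, last[c2])
            # Step 5: the window closes at j and must not be the whole string.
            if j == end and end - start + 1 < len(s):
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)

    # Step 6: report the substring itself instead of its length.
    return None if best is None else s[best[0]:best[1] + 1]
```

For example, `longest_self_contained("abacd")` yields `"abac"`, and `longest_self_contained("xyxy")` yields `None`.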
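Let’s look at the illustration below to better understand the solution.

![Solution 1](./images/solution/longest_self_contained_substring_solution_1.png)
![Solution 2](./images/solution/longest_self_contained_substring_solution_2.png)
![Solution 3](./images/solution/longest_self_contained_substring_solution_3.png)
![Solution 4](./images/solution/longest_self_contained_substring_solution_4.png)
![Solution 5](./images/solution/longest_self_contained_substring_solution_5.png)
![Solution 6](./images/solution/longest_self_contained_substring_solution_6.png)
![Solution 7](./images/solution/longest_self_contained_substring_solution_7.png)
![Solution 8](./images/solution/longest_self_contained_substring_solution_8.png)
![Solution 9](./images/solution/longest_self_contained_substring_solution_9.png)
![Solution 10](./images/solution/longest_self_contained_substring_solution_10.png)
![Solution 11](./images/solution/longest_self_contained_substring_solution_11.png)
![Solution 12](./images/solution/longest_self_contained_substring_solution_12.png)
![Solution 13](./images/solution/longest_self_contained_substring_solution_13.png)
![Solution 14](./images/solution/longest_self_contained_substring_solution_14.png)
![Solution 15](./images/solution/longest_self_contained_substring_solution_15.png)
![Solution 16](./images/solution/longest_self_contained_substring_solution_16.png)
![Solution 17](./images/solution/longest_self_contained_substring_solution_17.png)
![Solution 18](./images/solution/longest_self_contained_substring_solution_18.png)
![Solution 19](./images/solution/longest_self_contained_substring_solution_19.png)
![Solution 20](./images/solution/longest_self_contained_substring_solution_20.png)
![Solution 21](./images/solution/longest_self_contained_substring_solution_21.png)
![Solution 22](./images/solution/longest_self_contained_substring_solution_22.png)
![Solution 23](./images/solution/longest_self_contained_substring_solution_23.png)
![Solution 24](./images/solution/longest_self_contained_substring_solution_24.png)
![Solution 25](./images/solution/longest_self_contained_substring_solution_25.png)
![Solution 26](./images/solution/longest_self_contained_substring_solution_26.png)
![Solution 27](./images/solution/longest_self_contained_substring_solution_27.png)
![Solution 28](./images/solution/longest_self_contained_substring_solution_28.png)
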
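### Time Complexity

The time complexity of the above solution is O(n), where n is the number of characters in the string. The outer loop runs
once per distinct character (at most 26), and each pass scans at most n characters, so the total work is
O(26 · n) = O(n).
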
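The bound on the scanning work can be observed empirically. The sketch below counts how many inner-loop iterations the window-expansion scan performs; `count_scan_steps` is a hypothetical instrumentation helper, not code from this pull request. It relies on the fact that the scan from a given start stops only at the end of the string or at a character whose first occurrence precedes the start.

```python
def count_scan_steps(s: str) -> int:
    """Count inner-loop iterations of the window-expansion scan over s."""
    first: dict[str, int] = {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)

    steps = 0
    # One pass per distinct character, beginning at its first occurrence.
    for c1 in first:
        start = first[c1]
        for j in range(start, len(s)):
            steps += 1
            if first[s[j]] < start:
                break  # the scan from this start ends here
    return steps
```

For any input, the returned count is at most `len(set(s)) * len(s)`, which is at most 26 · n for lowercase English letters.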
### Space Complexity

The space complexity of the above solution is O(1) because the character set is fixed: each of the two hash maps holds
at most 26 entries, one per lowercase English letter.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,117 @@ | ||
```python
def longest_self_contained_substring(s: str) -> int:
    """
    Finds the longest self-contained substring in a given string.

    A self-contained substring is one where each character only appears within the substring itself,
    and which is not equal to the entire string.
    The function returns the length of the longest self-contained substring, and -1 if no such substring exists.

    Parameters:
        s (str): The input string.

    Returns:
        int: The length of the longest self-contained substring, or -1 if no such substring exists.

    Examples:
        >>> longest_self_contained_substring("xyyx")
        2
        >>> longest_self_contained_substring("xyxy")
        -1
        >>> longest_self_contained_substring("abacd")
        4

    Note:
        This implementation uses a brute-force approach with O(n³) time complexity.
        For better performance, consider using max_substring_length() which runs in O(n).
    """
    n = len(s)

    # First, find the first and last occurrence of each character
    # This helps us quickly check if a character appears outside a range
    first_occurrence = {}
    last_occurrence = {}

    for i, char in enumerate(s):
        if char not in first_occurrence:
            first_occurrence[char] = i
        last_occurrence[char] = i

    max_length = -1

    # Try all possible substrings (excluding the entire string)
    for start in range(n):
        for end in range(start, n):
            # Skip the entire string
            if start == 0 and end == n - 1:
                continue

            # Check if this substring is self-contained
            substring = s[start:end + 1]
            is_self_contained = True

            # For each character in the substring, verify it doesn't appear outside
            for char in set(substring):
                # If the character's first occurrence is before our start
                # or last occurrence is after our end, it appears outside
                if first_occurrence[char] < start or last_occurrence[char] > end:
                    is_self_contained = False
                    break

            # If self-contained, update our maximum
            if is_self_contained:
                max_length = max(max_length, end - start + 1)

    return max_length


def max_substring_length(s: str) -> int:
    """
    Finds the length of the longest substring of s that is self-contained.

    A self-contained substring is one in which all characters only appear within the substring,
    and which is not equal to the entire string.

    The function uses an optimized window expansion approach. For each unique character as a starting point,
    it defines an initial window from the character's first to last occurrence. The window is expanded to include
    all occurrences of characters within it, and is invalidated if any character's first occurrence lies before
    the window start.

    Parameters:
        s (str): The string to find the longest self-contained substring of

    Returns:
        int: The length of the longest self-contained substring of s, or -1 if no such substring exists

    Examples:
        >>> max_substring_length("xyyx")
        2
        >>> max_substring_length("xyxy")
        -1
        >>> max_substring_length("abacd")
        4

    Note:
        Time complexity: O(n), Space complexity: O(1) for fixed character set size.
    """
    first = {}
    last = {}
    for i, c in enumerate(s):
        if c not in first:
            first[c] = i
        last[c] = i

    max_len = -1

    for c1 in first:
        start = first[c1]
        end = last[c1]
        j = start

        while j < len(s):
            c2 = s[j]
            # c2 also occurs before the window start, so no window from here is valid
            if first[c2] < start:
                break
            end = max(end, last[c2])
            # The window closes at j; record it unless it spans the entire string
            if end == j and end - start + 1 != len(s):
                max_len = max(max_len, end - start + 1)
            j += 1

    return max_len
```
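Since the change ships both a brute-force and a window-expansion implementation, one way to gain confidence is to compare them on random inputs. The sketch below is an assumption-laden illustration, not part of this pull request: `brute_force` and `window_expansion` are compact stand-ins written for this example (mirroring the two functions above), and `cross_check` with its parameters is a hypothetical helper.

```python
import random


def brute_force(s: str) -> int:
    """Check every proper substring against the definition."""
    n, best = len(s), -1
    for start in range(n):
        for end in range(start, n):
            if start == 0 and end == n - 1:
                continue  # the entire string is excluded
            inside = set(s[start:end + 1])
            outside = set(s[:start]) | set(s[end + 1:])
            if inside.isdisjoint(outside):
                best = max(best, end - start + 1)
    return best


def window_expansion(s: str) -> int:
    """Stand-in for the window-expansion approach described above."""
    first: dict[str, int] = {}
    last: dict[str, int] = {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)
        last[ch] = i
    best = -1
    for c1 in first:
        start, end = first[c1], last[c1]
        for j in range(start, len(s)):
            if first[s[j]] < start:
                break
            end = max(end, last[s[j]])
            if j == end and end - start + 1 != len(s):
                best = max(best, end - start + 1)
    return best


def cross_check(trials: int = 200, max_len: int = 12, alphabet: str = "abcd", seed: int = 0) -> int:
    """Compare both implementations on random strings; return the number of trials run."""
    rng = random.Random(seed)  # seeded so that a failure is reproducible
    for _ in range(trials):
        s = "".join(rng.choice(alphabet) for _ in range(rng.randint(2, max_len)))
        assert brute_force(s) == window_expansion(s), s
    return trials
```

A small alphabet is used on purpose: it makes repeated characters, and therefore invalid windows, far more likely than uniformly random lowercase strings would.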
Binary files added:

- ..._self_contained_substring/images/longest_self_contained_substring_example_1.png (+77.3 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_2.png (+75.1 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_3.png (+85.6 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_4.png (+78.5 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_5.png (+75.2 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_6.png (+74.7 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_7.png (+81.3 KB)
- ..._self_contained_substring/images/longest_self_contained_substring_example_8.png (+92 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_1.png (+64.2 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_10.png (+81.5 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_11.png (+103 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_12.png (+87.4 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_13.png (+87.7 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_14.png (+102 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_15.png (+87.8 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_16.png (+83.4 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_17.png (+85.8 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_18.png (+81.2 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_19.png (+108 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_2.png (+60 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_20.png (+86.9 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_21.png (+88.8 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_22.png (+106 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_23.png (+87.7 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_24.png (+89.1 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_25.png (+110 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_26.png (+98.4 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_27.png (+89.5 KB)
- ...ined_substring/images/solution/longest_self_contained_substring_solution_28.png (+109 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_3.png (+79.3 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_4.png (+90 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_5.png (+86.4 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_6.png (+107 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_7.png (+86.6 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_8.png (+88.7 KB)
- ...ained_substring/images/solution/longest_self_contained_substring_solution_9.png (+88.6 KB)

pystrings/longest_self_contained_substring/test_longest_self_contained_substring.py
142 changes: 142 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,142 @@ | ||
```python
import unittest
from . import longest_self_contained_substring, max_substring_length


class LongestSelfContainedSubstringTestCase(unittest.TestCase):
    def test_1(self):
        s = "xyyx"
        expected = 2
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_2(self):
        s = "xyxy"
        expected = -1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_3(self):
        s = "abacd"
        expected = 4
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_4(self):
        s = "aabbcc"
        expected = 4
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_5(self):
        s = "xyzxy"
        expected = 1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_6(self):
        s = "abcde"
        expected = 4
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_7(self):
        s = "aaaa"
        expected = -1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_8(self):
        s = "aabccbdd"
        expected = 6
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_9(self):
        s = "abcdefghigklmnopqrstuvwxyz"
        expected = 25
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_10(self):
        s = "abcabcabc"
        expected = -1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_11(self):
        s = "aaabbbcccddd"
        expected = 9
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)


class MaxSelfContainedSubstringTestCase(unittest.TestCase):
    def test_1(self):
        s = "xyyx"
        expected = 2
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_2(self):
        s = "xyxy"
        expected = -1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_3(self):
        s = "abacd"
        expected = 4
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_4(self):
        s = "aabbcc"
        expected = 4
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_5(self):
        s = "xyzxy"
        expected = 1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_6(self):
        s = "abcde"
        expected = 4
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_7(self):
        s = "aaaa"
        expected = -1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_8(self):
        s = "aabccbdd"
        expected = 6
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_9(self):
        s = "abcdefghigklmnopqrstuvwxyz"
        expected = 25
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_10(self):
        s = "abcabcabc"
        expected = -1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_11(self):
        s = "aaabbbcccddd"
        expected = 9
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)


if __name__ == '__main__':
    unittest.main()
```
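The two test classes above repeat the same eleven inputs. One possible alternative, shown here only as a sketch, is a table-driven test using `unittest`'s `subTest`. The names `CASES`, `solve`, and `TableDrivenTestCase` are hypothetical, and `solve` is a small definition-based stand-in so that the example runs on its own; in the repository it would be replaced by the functions under test.

```python
import unittest

# (input, expected) pairs copied from the test methods above.
CASES = [
    ("xyyx", 2),
    ("xyxy", -1),
    ("abacd", 4),
    ("aabbcc", 4),
    ("xyzxy", 1),
    ("abcde", 4),
    ("aaaa", -1),
    ("aabccbdd", 6),
    ("abcdefghigklmnopqrstuvwxyz", 25),
    ("abcabcabc", -1),
    ("aaabbbcccddd", 9),
]


def solve(s: str) -> int:
    """Stand-in solver (assumption): checks every proper substring by definition."""
    n, best = len(s), -1
    for start in range(n):
        for end in range(start, n):
            if start == 0 and end == n - 1:
                continue  # the entire string is excluded
            inside = set(s[start:end + 1])
            if inside.isdisjoint(s[:start] + s[end + 1:]):
                best = max(best, end - start + 1)
    return best


class TableDrivenTestCase(unittest.TestCase):
    def test_cases(self):
        for s, expected in CASES:
            # subTest reports each failing input separately instead of
            # stopping at the first failure.
            with self.subTest(s=s):
                self.assertEqual(expected, solve(s))
```

With this layout, adding a case is a one-line change to `CASES`, and the same table can be reused for every implementation under test.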