Merged
2 changes: 2 additions & 0 deletions DIRECTORY.md
@@ -703,6 +703,8 @@
* [Test Is Unique](https://github.com/BrianLusina/PythonSnips/blob/master/pystrings/is_unique/test_is_unique.py)
* Issubsequence
* [Test Is Subsequence](https://github.com/BrianLusina/PythonSnips/blob/master/pystrings/issubsequence/test_is_subsequence.py)
* Longest Self Contained Substring
* [Test Longest Self Contained Substring](https://github.com/BrianLusina/PythonSnips/blob/master/pystrings/longest_self_contained_substring/test_longest_self_contained_substring.py)
* Look And Say Sequence
* [Test Look And Say Sequence](https://github.com/BrianLusina/PythonSnips/blob/master/pystrings/look_and_say_sequence/test_look_and_say_sequence.py)
* Max Vowels In Substring
2 changes: 1 addition & 1 deletion pystrings/length_of_longest_substring/README.md
@@ -1,4 +1,4 @@
-Longest Substring Without Repeating Characters
+# Longest Substring Without Repeating Characters

Given a string s, find the length of the longest substring without repeating characters.

112 changes: 112 additions & 0 deletions pystrings/longest_self_contained_substring/README.md
@@ -0,0 +1,112 @@
# Find Longest Self-Contained Substring

You are given a string, s, consisting of lowercase English letters. Your task is to find the length of the longest
self-contained substring of s.

A substring t of s is called self-contained if:
- t is not equal to the entire string s.
- No character in t appears anywhere else in s (outside of t).

In other words, all characters in t are completely unique to that substring within the string s.
Return the length of the longest self-contained substring. If no such substring exists, return -1.

Constraints:

- 2 ≤ s.length ≤ 1000
- s consists only of lowercase English letters.
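
The definition above can be checked directly for a single candidate substring. The following is a minimal sketch, assuming a hypothetical helper name `is_self_contained` and inclusive `start`/`end` indices (neither is part of the original problem statement):

```python
def is_self_contained(s: str, start: int, end: int) -> bool:
    """Return True if s[start:end + 1] is a self-contained substring of s.

    Hypothetical helper for illustration; indices are inclusive.
    """
    # The entire string is excluded by definition.
    if start == 0 and end == len(s) - 1:
        return False
    inside = set(s[start:end + 1])
    outside = set(s[:start]) | set(s[end + 1:])
    # Self-contained means no character is shared with the rest of s.
    return inside.isdisjoint(outside)
```

For instance, in "xyyx" the substring "yy" (indices 1 to 2) passes this check, while "xy" (indices 0 to 1) fails because both x and y also appear outside of it.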

## Examples

![Example 1](./images/longest_self_contained_substring_example_1.png)
![Example 2](./images/longest_self_contained_substring_example_2.png)
![Example 3](./images/longest_self_contained_substring_example_3.png)
![Example 4](./images/longest_self_contained_substring_example_4.png)
![Example 5](./images/longest_self_contained_substring_example_5.png)
![Example 6](./images/longest_self_contained_substring_example_6.png)
![Example 7](./images/longest_self_contained_substring_example_7.png)
![Example 8](./images/longest_self_contained_substring_example_8.png)

---

## Solution

We iterate through the string once to record where each character first appears and where it last appears. Two separate
hash maps store each character’s first and last occurrence indices. Each unique character serves as a potential starting
point, defining an initial window from its first to last occurrence. The window is adjusted based on the following
conditions:

- As we iterate within this window, if we encounter a character whose last occurrence is further to the right than the
current window’s end, we expand the window to include it. This ensures that the substring remains self-contained,
meaning all occurrences of each character within it are included. The process continues until no more characters
extend the window’s boundary.

- As we expand the window, every character inside it must have its first occurrence at or after the window’s start.
  If we encounter a character whose first occurrence index is before the window’s starting position, an earlier part
  of the string contains an instance of that character, violating the self-contained property. When this happens, no
  longer window beginning at this start can be valid, so we stop scanning from this starting point.

The maximum valid window length is tracked and updated accordingly, ensuring the longest valid substring is returned.
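
As a rough illustration of the first pass described above, the sketch below builds the two occurrence maps and shows the initial window each character defines. The function name `occurrence_bounds` is an assumption made for this example only:

```python
def occurrence_bounds(s: str):
    """Return (first, last): each character's first and last index in s.

    Hypothetical helper name, used only for this illustration.
    """
    first = {}
    last = {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)  # only the earliest index is kept
        last[ch] = i             # overwritten until the final occurrence
    return first, last


first, last = occurrence_bounds("abacd")
# Initial window (start, end) for each candidate starting character.
initial_windows = {ch: (first[ch], last[ch]) for ch in first}
```

For "abacd", the character a yields the initial window (0, 2), which already contains b; every other character occurs once, so its initial window covers a single index.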

The steps of the algorithm are as follows:

1. Create two hash maps, first and last, to store the first and last occurrence index of each character in the string.
2. Iterate through the string once to populate these hash maps with each character’s first and last occurrence.
3. Initialize max_len = -1 to keep track of the maximum length of a valid self-contained substring found.
4. For each unique character c1 (processed once), start from its first occurrence index:
   - Initialize the window by setting start to the character’s first occurrence and end to its last occurrence in the
     string.
   - Iterate through the string with an index j beginning at start, extending end whenever a character’s last
     occurrence is beyond the current end.
   - If a character c2 inside the window has its first occurrence before start, the window is invalid, so stop
     scanning from this start.
5. Validate the substring:
   - When the current index j reaches end, every character in the window has all of its occurrences inside it. Check
     that the window’s length is less than the total string length.
   - If so, update max_len with the maximum of its current value and the window’s length (end - start + 1).
6. After checking all potential starting characters, return max_len, which is still -1 if no valid substring exists.
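
The steps above can be sketched in code. The variant below is an illustration rather than the module's own function: it follows the same window expansion but returns the inclusive (start, end) indices of the longest window instead of its length, and the name `longest_self_contained_window` is hypothetical.

```python
from typing import Optional, Tuple


def longest_self_contained_window(s: str) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) of the longest self-contained substring, or None."""
    # Steps 1-2: record each character's first and last occurrence.
    first, last = {}, {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)
        last[ch] = i

    best = None  # Step 3: nothing found yet
    # Step 4: try each distinct character as the left edge of a window.
    for c1 in first:
        start, end = first[c1], last[c1]
        j = start
        while j < len(s):
            c2 = s[j]
            if first[c2] < start:
                break  # c2 also occurs before start, so stop here
            end = max(end, last[c2])
            # Step 5: the window is closed at j and is not the whole string.
            if j == end and end - start + 1 < len(s):
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
            j += 1
    return best  # Step 6
```

Returning indices makes it easy to inspect the winning substring: for "abacd" the result is (0, 3), which corresponds to "abac".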

Let’s look at the illustration below to better understand the solution.

![Solution 1](./images/solution/longest_self_contained_substring_solution_1.png)
![Solution 2](./images/solution/longest_self_contained_substring_solution_2.png)
![Solution 3](./images/solution/longest_self_contained_substring_solution_3.png)
![Solution 4](./images/solution/longest_self_contained_substring_solution_4.png)
![Solution 5](./images/solution/longest_self_contained_substring_solution_5.png)
![Solution 6](./images/solution/longest_self_contained_substring_solution_6.png)
![Solution 7](./images/solution/longest_self_contained_substring_solution_7.png)
![Solution 8](./images/solution/longest_self_contained_substring_solution_8.png)
![Solution 9](./images/solution/longest_self_contained_substring_solution_9.png)
![Solution 10](./images/solution/longest_self_contained_substring_solution_10.png)
![Solution 11](./images/solution/longest_self_contained_substring_solution_11.png)
![Solution 12](./images/solution/longest_self_contained_substring_solution_12.png)
![Solution 13](./images/solution/longest_self_contained_substring_solution_13.png)
![Solution 14](./images/solution/longest_self_contained_substring_solution_14.png)
![Solution 15](./images/solution/longest_self_contained_substring_solution_15.png)
![Solution 16](./images/solution/longest_self_contained_substring_solution_16.png)
![Solution 17](./images/solution/longest_self_contained_substring_solution_17.png)
![Solution 18](./images/solution/longest_self_contained_substring_solution_18.png)
![Solution 19](./images/solution/longest_self_contained_substring_solution_19.png)
![Solution 20](./images/solution/longest_self_contained_substring_solution_20.png)
![Solution 21](./images/solution/longest_self_contained_substring_solution_21.png)
![Solution 22](./images/solution/longest_self_contained_substring_solution_22.png)
![Solution 23](./images/solution/longest_self_contained_substring_solution_23.png)
![Solution 24](./images/solution/longest_self_contained_substring_solution_24.png)
![Solution 25](./images/solution/longest_self_contained_substring_solution_25.png)
![Solution 26](./images/solution/longest_self_contained_substring_solution_26.png)
![Solution 27](./images/solution/longest_self_contained_substring_solution_27.png)
![Solution 28](./images/solution/longest_self_contained_substring_solution_28.png)

### Time Complexity

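The time complexity of the above solution is O(n), where n is the number of characters in the string. The outer loop
runs once per distinct character (at most 26), and each inner scan visits at most n positions, so the total work is
O(26 · n) = O(n).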
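
To make the bound concrete, the sketch below counts how many inner-loop iterations the window scan performs. It is an instrumented illustration with a hypothetical name, `count_inner_steps`, and it omits the window-end bookkeeping because that does not affect the number of iterations.

```python
def count_inner_steps(s: str) -> int:
    """Count the inner-loop iterations of the window scan (illustrative only)."""
    first = {}
    for i, ch in enumerate(s):
        first.setdefault(ch, i)

    steps = 0
    for c1 in first:  # at most 26 distinct starting characters
        start = first[c1]
        for j in range(start, len(s)):
            steps += 1
            if first[s[j]] < start:
                break  # same early exit as the window scan
    return steps
```

The count never exceeds the number of distinct characters multiplied by the string length, which is at most 26 · n for lowercase English input.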

### Space Complexity

The space complexity of the above solution is O(1) because of the fixed character set size, 26 lowercase English letters.
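
One way to see the constant-space claim is to replace the hash maps with two fixed arrays of 26 slots. This is a sketch under the stated constraint that s contains only lowercase English letters; the name `occurrence_arrays` and the use of -1 for "not seen" are assumptions made for the example.

```python
def occurrence_arrays(s: str):
    """Return two 26-slot lists of first and last indices (-1 if a letter is absent)."""
    first = [-1] * 26
    last = [-1] * 26
    for i, ch in enumerate(s):
        k = ord(ch) - ord("a")  # assumes ch is in 'a'..'z'
        if first[k] == -1:
            first[k] = i
        last[k] = i
    return first, last
```

The two lists have the same size regardless of the input length, which is what the O(1) bound refers to.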
117 changes: 117 additions & 0 deletions pystrings/longest_self_contained_substring/__init__.py
@@ -0,0 +1,117 @@
def longest_self_contained_substring(s: str) -> int:
    """
    Finds the longest self-contained substring in a given string.

    A self-contained substring is one where each character only appears within the substring itself.
    The function returns the length of the longest self-contained substring, and -1 if no such substring exists.

    Parameters:
        s (str): The input string.

    Returns:
        int: The length of the longest self-contained substring, or -1 if no such substring exists.

    Examples:
        >>> longest_self_contained_substring("xyyx")
        2
        >>> longest_self_contained_substring("xyxy")
        -1
        >>> longest_self_contained_substring("abacd")
        4

    Note:
        This implementation uses a brute-force approach with O(n³) time complexity.
        For better performance, consider using max_substring_length() which runs in O(n).
    """
    n = len(s)

    # First, find the first and last occurrence of each character
    # This helps us quickly check if a character appears outside a range
    first_occurrence = {}
    last_occurrence = {}

    for i, char in enumerate(s):
        if char not in first_occurrence:
            first_occurrence[char] = i
        last_occurrence[char] = i

    max_length = -1

    # Try all possible substrings (excluding the entire string)
    for start in range(n):
        for end in range(start, n):
            # Skip the entire string
            if start == 0 and end == n - 1:
                continue

            # Check if this substring is self-contained
            substring = s[start:end + 1]
            is_self_contained = True

            # For each character in the substring, verify it doesn't appear outside
            for char in set(substring):
                # If the character's first occurrence is before our start
                # or last occurrence is after our end, it appears outside
                if first_occurrence[char] < start or last_occurrence[char] > end:
                    is_self_contained = False
                    break

            # If self-contained, update our maximum
            if is_self_contained:
                max_length = max(max_length, end - start + 1)

    return max_length


def max_substring_length(s: str) -> int:
    """
    Finds the length of the longest substring of s that is self-contained.

    A self-contained substring is one in which all characters only appear within the substring,
    and which is not equal to the entire string.

    The function uses an optimized window expansion approach. For each unique character as a starting point,
    it defines an initial window from the character's first to last occurrence. The window is expanded to include
    all occurrences of characters within it, and is invalidated if any character's first occurrence lies before
    the window start.

    Parameters:
        s (str): The string to find the longest self-contained substring of

    Returns:
        int: The length of the longest self-contained substring of s, or -1 if no such substring exists

    Examples:
        >>> max_substring_length("xyyx")
        2
        >>> max_substring_length("xyxy")
        -1
        >>> max_substring_length("abacd")
        4

    Note:
        Time complexity: O(n), Space complexity: O(1) for fixed character set size.
    """
    # Record the first and last index of every character in a single pass
    first = {}
    last = {}
    for i, c in enumerate(s):
        if c not in first:
            first[c] = i
        last[c] = i

    max_len = -1

    # Each distinct character's first occurrence is a candidate window start
    for c1 in first:
        start = first[c1]
        end = last[c1]
        j = start

        while j < len(s):
            c2 = s[j]
            # c2 also occurs before start, so no window from this start is valid beyond here
            if first[c2] < start:
                break
            # Extend the window to cover every occurrence of c2
            end = max(end, last[c2])
            # The window is closed at j; record it unless it spans the whole string
            if end == j and end - start + 1 != len(s):
                max_len = max(max_len, end - start + 1)
            j += 1

    return max_len
@@ -0,0 +1,142 @@
import unittest
from . import longest_self_contained_substring, max_substring_length


class LongestSelfContainedSubstringTestCase(unittest.TestCase):
    def test_1(self):
        s = "xyyx"
        expected = 2
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_2(self):
        s = "xyxy"
        expected = -1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_3(self):
        s = "abacd"
        expected = 4
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_4(self):
        s = "aabbcc"
        expected = 4
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_5(self):
        s = "xyzxy"
        expected = 1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_6(self):
        s = "abcde"
        expected = 4
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_7(self):
        s = "aaaa"
        expected = -1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_8(self):
        s = "aabccbdd"
        expected = 6
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_9(self):
        s = "abcdefghigklmnopqrstuvwxyz"
        expected = 25
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_10(self):
        s = "abcabcabc"
        expected = -1
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)

    def test_11(self):
        s = "aaabbbcccddd"
        expected = 9
        actual = longest_self_contained_substring(s)
        self.assertEqual(expected, actual)


class MaxSelfContainedSubstringTestCase(unittest.TestCase):
    def test_1(self):
        s = "xyyx"
        expected = 2
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_2(self):
        s = "xyxy"
        expected = -1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_3(self):
        s = "abacd"
        expected = 4
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_4(self):
        s = "aabbcc"
        expected = 4
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_5(self):
        s = "xyzxy"
        expected = 1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_6(self):
        s = "abcde"
        expected = 4
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_7(self):
        s = "aaaa"
        expected = -1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_8(self):
        s = "aabccbdd"
        expected = 6
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_9(self):
        s = "abcdefghigklmnopqrstuvwxyz"
        expected = 25
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_10(self):
        s = "abcabcabc"
        expected = -1
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)

    def test_11(self):
        s = "aaabbbcccddd"
        expected = 9
        actual = max_substring_length(s)
        self.assertEqual(expected, actual)


if __name__ == '__main__':
    unittest.main()