Post

Created by @oliverrichards
 at October 27th 2023, 7:59:15 am.

Technical Interview Question

You have been given a string s and an integer k. Write a function longest_substring to find the length of the longest substring in s that contains at most k distinct characters.

Implement the function longest_substring(s: str, k: int) -> int:

Input

  • A string s consisting of lowercase English letters.
  • An integer k representing the maximum number of distinct characters allowed in the substring.

Output

  • An integer representing the length of the longest substring with at most k distinct characters.

Example

assert longest_substring("eceba", 2) == 3
assert longest_substring("aa", 1) == 2

Explaination

  • In the first example, the longest substring with at most 2 distinct characters is "ece" which has a length of 3.
  • In the second example, the longest substring with at most 1 distinct character is "aa", which has a length of 2, as we are allowed only a single distinct character.

Answer

To solve this problem, we can use the sliding window technique. We will maintain a sliding window that expands and contracts as we iterate through the string. The window contains only the characters that satisfy the given constraint (at most k distinct characters).

  1. Initialize max_length variable to store the maximum length of the substring found so far.
  2. Initialize an empty dictionary char_count to keep track of the count of each character in the current window.
  3. Initialize start and end indices of the window, both initially pointing to the first character of the string.
  4. Iterate through the string using the end index:
    • Increment the count of s[end] character in char_count dictionary.
    • If the number of distinct characters in char_count exceeds k, we need to shrink the window from the left side.
      • Decrement the count of s[start] character in char_count dictionary.
      • If the count becomes 0, remove that character from char_count dictionary.
      • Increment the start index to shrink the window.
    • Update max_length with the maximum length found so far, end - start + 1.
  5. Return the max_length.

Let's implement this in code.

def longest_substring(s: str, k: int) -> int:
    max_length = 0
    char_count = {}
    start = 0

    for end in range(len(s)):
        char_count[s[end]] = char_count.get(s[end], 0) + 1

        while len(char_count) > k:
            char_count[s[start]] -= 1
            if char_count[s[start]] == 0:
                del char_count[s[start]]
            start += 1

        max_length = max(max_length, end - start + 1)

    return max_length

The time complexity of this implementation is O(N), where N is the length of the input string s. This is because we iterate once through the string using the two-pointer approach. The space complexity is O(k), where k is the maximum number of distinct characters allowed.