Problem: Minimum Window Substring

Solve the hardest variation of the sliding window pattern: finding the smallest substring that contains all characters of another string.

Key Takeaways

  • **Time Complexity**: $O(n + m)$ where $n$ is length of `s` and $m$ is length of `t`.
  • **Space Complexity**: $O(m)$ to store target frequencies.
  • [Problem: Analyze Nested Loops](/blog/problem-nested-loops-complexity/)
Recommended Prerequisites
Problem: Longest Substring with K Distinct Characters

Problem Statement

Given two strings s and t, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "".
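To make the statement concrete, here is a deliberately naive reference sketch. It is not the lesson's approach: it simply tries every window from shortest to longest and returns the first one that covers `t`, duplicates included. The class name `MinWindowBruteForce` and the helper `covers` are hypothetical names chosen for illustration, and returning "" for an empty `t` is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class MinWindowBruteForce {

    // True if window holds every character of t at least as many times as t does.
    static boolean covers(String window, String t) {
        Map<Character, Integer> need = new HashMap<>();
        for (char c : t.toCharArray()) need.merge(c, 1, Integer::sum);
        for (char c : window.toCharArray()) {
            if (need.containsKey(c)) need.merge(c, -1, Integer::sum);
        }
        for (int remaining : need.values()) {
            if (remaining > 0) return false; // some character of t is still missing
        }
        return true;
    }

    // Tries window lengths in increasing order, so the first hit is a minimum window.
    static String minWindow(String s, String t) {
        if (t.isEmpty()) return ""; // assumption: treat an empty t as "no window"
        for (int len = t.length(); len <= s.length(); len++) {
            for (int start = 0; start + len <= s.length(); start++) {
                String window = s.substring(start, start + len);
                if (covers(window, t)) return window;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(minWindow("ADOBECODEBANC", "ABC")); // BANC
        System.out.println(minWindow("a", "aa").isEmpty());    // true: one 'a' cannot cover "aa"
    }
}
```

The second call shows why duplicates matter: `s` contains the letter `a`, but only once, so no window covers `t = "aa"`. This reference is far slower than a sliding window and is only useful for checking small cases by hand.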

Approach: Variable-Size Sliding Window

This is a "Shortest Window" problem. We want to find the smallest window that satisfies the condition.

  1. Count frequencies: Use a map to store characters needed from t.
  2. Expand: Move right pointer until the window contains all characters from t.
  3. Shrink: Once the condition is met, move left pointer to find the smallest valid window.
  4. Repeat: Keep expanding and shrinking until the end of s.

Java Implementation

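import java.util.HashMap;
import java.util.Map;

public String minWindow(String s, String t) {
    // An empty t has no required characters (treated here as "no window");
    // a t longer than s can never fit inside it.
    if (t.isEmpty() || s.length() < t.length()) return "";

    // For each character of t: how many more copies the window still needs.
    // A negative value means the window holds surplus copies.
    Map<Character, Integer> targetFreq = new HashMap<>();
    for (char c : t.toCharArray()) targetFreq.put(c, targetFreq.getOrDefault(c, 0) + 1);

    // matched counts characters of t (with multiplicity) covered by the window.
    int left = 0, minLen = Integer.MAX_VALUE, start = 0, matched = 0;

    for (int right = 0; right < s.length(); right++) {
        char r = s.charAt(right);
        if (targetFreq.containsKey(r)) {
            targetFreq.put(r, targetFreq.get(r) - 1);
            // Only a copy that was actually needed counts as a match.
            if (targetFreq.get(r) >= 0) matched++;
        }

        // When all characters are matched, try shrinking from the left
        while (matched == t.length()) {
            if (right - left + 1 < minLen) {
                minLen = right - left + 1;
                start = left;
            }

            char l = s.charAt(left);
            if (targetFreq.containsKey(l)) {
                // A count of 0 means no surplus: dropping this copy breaks the window.
                if (targetFreq.get(l) == 0) matched--;
                targetFreq.put(l, targetFreq.get(l) + 1);
            }
            left++;
        }
    }

    return minLen == Integer.MAX_VALUE ? "" : s.substring(start, start + minLen);
}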

Complexity Discussion

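  • Time Complexity: $O(n + m)$ where $n$ is length of s and $m$ is length of t. Building the frequency map takes $O(m)$. During the scan, the right pointer advances exactly $n$ times and the left pointer advances at most $n$ times in total, so the nested while loop does not make the scan quadratic.
  • Space Complexity: $O(m)$ in the worst case to store target frequencies; more precisely, one map entry per distinct character in t.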
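The amortized argument can be checked empirically. The sketch below runs the same expand/shrink loop as the implementation above, but returns the number of right and left pointer advances instead of the window. The class name `PointerMoveCounter` and the method `countMoves` are hypothetical, introduced only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PointerMoveCounter {

    // Returns {rightMoves, leftMoves} for the sliding-window scan of s against t.
    static int[] countMoves(String s, String t) {
        Map<Character, Integer> targetFreq = new HashMap<>();
        for (char c : t.toCharArray()) targetFreq.merge(c, 1, Integer::sum);

        int left = 0, matched = 0, rightMoves = 0, leftMoves = 0;
        for (int right = 0; right < s.length(); right++) {
            rightMoves++;
            char r = s.charAt(right);
            if (targetFreq.containsKey(r)) {
                targetFreq.merge(r, -1, Integer::sum);
                if (targetFreq.get(r) >= 0) matched++;
            }
            // Guard against an empty t, for which "matched == 0" would always hold.
            while (!t.isEmpty() && matched == t.length()) {
                char l = s.charAt(left);
                if (targetFreq.containsKey(l)) {
                    if (targetFreq.get(l) == 0) matched--;
                    targetFreq.merge(l, 1, Integer::sum);
                }
                left++;
                leftMoves++;
            }
        }
        return new int[] {rightMoves, leftMoves};
    }

    public static void main(String[] args) {
        String s = "ADOBECODEBANC";
        int[] moves = countMoves(s, "ABC");
        System.out.println("length of s: " + s.length());
        System.out.println("right moves: " + moves[0]);
        System.out.println("left moves:  " + moves[1]);
    }
}
```

Because `left` only ever increases and never passes `right + 1`, the left-move count is bounded by the length of `s` for any input, which is where the $O(n)$ scan bound comes from.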

5. Verbal Interview Script (Staff Tier)

Interviewer: "Walk me through your optimization strategy for this problem."

You: "My first step is to look for structure that lets me avoid a brute-force search over every substring. Here the useful property is monotonicity: if a window covers t, extending it to the right still covers t, so as the right boundary advances the left boundary only ever needs to move forward. That gives a two-pointer sliding window backed by a HashMap of required character counts. It runs in O(n + m), where checking every substring would cost at least O(N^2). Extra memory is O(m) for the frequency map, and all state lives in local variables, so there is nothing to clean up or share between calls."

6. Staff-Level Interview Follow-Ups

Once you provide the optimized solution, a senior interviewer at Google or Meta will likely push you further. Here is how to handle the most common follow-ups:

Follow-up 1: "How does this scale to a Distributed System?"

If the input is too large to fit on a single machine (e.g., billions of characters), we would move from a single-node algorithm to a MapReduce or Spark-based approach, but the partitioning has to respect the problem. A window is a contiguous range of s, so the data would be split into contiguous chunks by offset rather than by a hash of keys, and windows that straddle a chunk boundary need explicit handling in the merge phase.

Follow-up 2: "What are the Concurrency implications?"

In a multi-threaded Java environment, the implementation above is safe to call concurrently as written: the frequency map and both pointers are local to each call, and String is immutable. Thread safety only becomes a concern if that state is shared, for example a frequency map reused across threads. While we could use synchronized blocks, a higher-performance approach would be to use ConcurrentHashMap or atomic variables such as AtomicInteger. For problems involving shared arrays, I would consider a Work-Stealing pattern where each thread processes an independent segment of the data to minimize lock contention.

7. Performance Nuances (The Java Perspective)

  1. Autoboxing Overhead: The solution uses Map<Character, Integer>, so every lookup and update boxes and unboxes Character and Integer values. In a performance-critical system, I would replace the map with a primitive int[] indexed by character code when the alphabet is small and known (for example, 128 entries for ASCII), or use a primitive-specialized collection library such as fastutil or Trove.
  2. Recursion Depth: The solution above is iterative, so it carries no risk of StackOverflowError. For problems that are naturally recursive, I ensure the recursion depth is bounded, or I rewrite the logic to be Iterative using an explicit stack on the heap.
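As a sketch of the primitive-array idea, the variant below replaces the map with an `int[128]`. It assumes every character in `s` and `t` is 7-bit ASCII; a character outside that range would throw `ArrayIndexOutOfBoundsException`. The class name `MinWindowAscii` and the counter name `missing` are illustrative choices, not part of the lesson's code.

```java
class MinWindowAscii {

    // Assumption: all characters of s and t have code points below 128.
    static String minWindow(String s, String t) {
        if (t.isEmpty() || s.length() < t.length()) return "";

        // need[c] > 0: the window still lacks that many copies of c.
        // need[c] < 0: the window holds surplus copies (or c is not in t at all).
        int[] need = new int[128];
        for (int i = 0; i < t.length(); i++) need[t.charAt(i)]++;

        int left = 0, start = 0, minLen = Integer.MAX_VALUE;
        int missing = t.length(); // characters of t not yet covered by the window

        for (int right = 0; right < s.length(); right++) {
            // A positive count before the decrement means this copy was needed.
            if (need[s.charAt(right)]-- > 0) missing--;

            while (missing == 0) {
                if (right - left + 1 < minLen) {
                    minLen = right - left + 1;
                    start = left;
                }
                // A positive count after the increment means the window lost a required copy.
                if (++need[s.charAt(left)] > 0) missing++;
                left++;
            }
        }
        return minLen == Integer.MAX_VALUE ? "" : s.substring(start, start + minLen);
    }

    public static void main(String[] args) {
        System.out.println(minWindow("ADOBECODEBANC", "ABC")); // BANC
    }
}
```

Characters that do not appear in `t` start at zero and only ever go negative while inside the window, so they never affect `missing`. This removes the `containsKey` checks as well as the boxing, at the cost of the fixed-alphabet assumption.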

8. Staff-Level Verbal Masterclass (Communication)

Interviewer: "How would you defend this specific implementation in a production review?"

You: "In a mission-critical environment, I prioritize the Big-O efficiency of the primary data path, but I also focus on the Predictability of the system. In this implementation, I chose an iterative two-pointer loop, so there is no recursion and no risk of a StackOverflowError. The cost is also predictable under skewed inputs: each index of s is visited at most twice, whatever the data looks like. From a memory perspective, the only structure that grows with the input is the frequency map, which is bounded by the number of distinct characters in t. If profiling showed garbage collection pressure from the boxed Integer counts, I would switch the map to a primitive int array."

9. Global Scale & Distributed Pivot

When a problem like this is moved from a single machine to a global distributed architecture, the constraints change fundamentally.

  1. Data Partitioning: Consistent Hashing is the usual way to shard keyed data so that, even if the dataset grows to petabytes, any single query only hits a small subset of the cluster. For this problem it would apply to routing independent (s, t) queries across workers, not to splitting a single s, because a window must be a contiguous range of s.
  2. State Consistency: This computation is a pure function of s and t, so it needs no coordination between replicas. Where a system does maintain shared mutable state (such as a cache of results), a Distributed Consensus protocol like Raft or Paxos keeps all replicas in agreement on that state, even in the event of a network partition (the P in the CAP theorem).

10. Performance Nuances (The Staff Perspective)

  1. Cache Locality: Accessing a 2D matrix in row-major order (reading [i][j] then [i][j+1]) is typically much faster than column-major order on modern CPUs, because consecutive elements share cache lines and the hardware prefetcher can follow the access pattern. I always structure my loops to align with how the memory is physically laid out. This solution already benefits from that: it scans s sequentially from left to right.
  2. Autoboxing and Generics: In Java, using List<Integer> instead of int[] can be several times slower due to the overhead of object headers and constant wrapping; the exact factor depends on the workload and the JVM. For the most performance-sensitive sections of this algorithm, such as the frequency counts, I advocate for primitive specialized structures.
