Lesson 31 of 47 3 min

MANG Problem #31: Regular Expression Matching (Hard)

Master 2D Dynamic Programming by implementing a regex parser that handles wildcards and Kleene stars in O(S*P) time.

1. Problem Statement

Given an input string s and a pattern p, implement regular expression matching with support for '.' and '*' where:

  • '.' Matches any single character.
  • '*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Input: s = "aab", p = "c*a*b"
Output: true (c can be repeated 0 times, a can be repeated 2 times).

2. Approach: 2D Bottom-Up DP

We use a DP table where dp[i][j] is true if s[0..i-1] matches p[0..j-1].

  1. State: dp[i][j].
  2. Base Cases:
    • dp[0][0] = true (empty matches empty).
    • dp[0][j] = dp[0][j-2] if p[j-1] == '*' (empty string matches a*b*).
  3. Transitions:
    • If p[j-1] == s[i-1] or p[j-1] == '.': dp[i][j] = dp[i-1][j-1].
    • If p[j-1] == '*':
      • Zero occurrences: dp[i][j] = dp[i][j-2].
      • One or more occurrences: If preceding char matches s[i-1], then dp[i][j] = dp[i-1][j].

3. Java Implementation

public boolean isMatch(String s, String p) {
    int m = s.length(), n = p.length();
    boolean[][] dp = new boolean[m + 1][n + 1];
    dp[0][0] = true;

    // Base case: empty string matches patterns like a*b*c*
    for (int j = 1; j <= n; j++) {
        if (p.charAt(j - 1) == '*') {
            dp[0][j] = dp[0][j - 2];
        }
    }

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (p.charAt(j - 1) == '.' || p.charAt(j - 1) == s.charAt(i - 1)) {
                dp[i][j] = dp[i - 1][j - 1];
            } else if (p.charAt(j - 1) == '*') {
                dp[i][j] = dp[i][j - 2]; // Zero occurrences
                if (p.charAt(j - 2) == '.' || p.charAt(j - 2) == s.charAt(i - 1)) {
                    dp[i][j] = dp[i][j] || dp[i - 1][j]; // One or more occurrences
                }
            }
        }
    }
    return dp[m][n];
}

4. 5-Minute "Video-Style" Walkthrough

  1. The "Aha!" Moment: The '*' character is the tricky part. It acts on the character before it. This means when we see * at index j, we look at j-2 for the "Zero occurrences" case.
  2. The Zero Case: a* can just disappear. So dp[i][j] = dp[i][j-2]. We always check this first.
  3. The Multiple Case: If a* matches the current a in the string, it could also match previous as. So we look UP in the DP table (dp[i-1][j]).

5. Interview Discussion

  • Interviewer: "What is the time complexity?"
  • You: "O(S * P) where S and P are the lengths of the string and pattern. We fill an (S+1) x (P+1) boolean matrix."
  • Interviewer: "Can this be optimized for space?"
  • You: "Yes, since dp[i][j] only depends on the current and previous row, we can use two 1D arrays of size P+1 to reduce space to O(P)."

Want to track your progress?

Sign in to save your progress, track completed lessons, and pick up where you left off.