1. Problem Statement
Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.
Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work.
2. Approach: Pre-order DFS (Recursive)
graph TD
subgraph "Serialization"
T1((1)) --> T2((2))
T1 --> T3((3))
T3 --> T4((4))
T3 --> T5((5))
String[1,2,X,X,3,4,X,X,5,X,X]
end
subgraph "Deserialization"
Q[Queue: 1,2,X,X,3,4,X,X,5,X,X] --> BuildRoot[Node 1]
BuildRoot --> BuildLeft[Node 2]
BuildRoot --> BuildRight[Node 3]
end
We use a Pre-order Traversal (Root -> Left -> Right) because it allows us to start building the tree from the root immediately during deserialization.
Serialization Logic:
- If a node is
null, append a marker (e.g.,"X"). - Otherwise, append the value and recurse.
- Use a comma to separate values.
Deserialization Logic:
- Split the string into a Queue.
- Pop the front:
- If it's
"X", returnnull. - Otherwise, create a new node and recursively build its left and right children.
- If it's
3. Java Implementation
public class Codec {
// Encodes a tree to a single string.
public String serialize(TreeNode root) {
StringBuilder sb = new StringBuilder();
buildString(root, sb);
return sb.toString();
}
private void buildString(TreeNode node, StringBuilder sb) {
if (node == null) {
sb.append("X,");
} else {
sb.append(node.val).append(",");
buildString(node.left, sb);
buildString(node.right, sb);
}
}
// Decodes your encoded data to tree.
public TreeNode deserialize(String data) {
Queue<String> nodes = new LinkedList<>(Arrays.asList(data.split(",")));
return buildTree(nodes);
}
private TreeNode buildTree(Queue<String> nodes) {
String val = nodes.poll();
if (val.equals("X")) return null;
TreeNode node = new TreeNode(Integer.parseInt(val));
node.left = buildTree(nodes);
node.right = buildTree(nodes);
return node;
}
}
4. 5-Minute "Video-Style" Walkthrough
- The "Aha!" Moment: How do you represent a 2D structure in a 1D string? You must include the Null Pointers. By explicitly marking where a branch ends (
"X"), we preserve the exact structure of the tree. - Why Pre-order?: If we used In-order, we wouldn't know which node is the root without extra information. Pre-order tells us the very first element in the string is the root.
- The Recursive Hand-off: During deserialization, the left-child call "consumes" as much of the string as it needs. Whatever is left over is exactly what the right-child needs.
5. Interview Discussion
- Interviewer: "Can we use BFS?"
- You: "Yes, level-order traversal also works using a Queue, but the recursive DFS approach is often cleaner and easier to implement bug-free under pressure."
- Interviewer: "How can we reduce the string size?"
- You: "Instead of a comma-separated string, we can use a Binary Format (Protocol Buffers style) to save space. We can also use a bit-map to represent where nulls are."