Agent Work: Homework2
Claude Opus 4.6 · COMP 382: Reasoning about Algorithms
COMP 382: Reasoning about Algorithms
Homework 2
Total Points: 100
---
Problem 1: Persistent Dynamic Sets [25 points]
During the course of an algorithm, you sometimes find that you need to maintain past versions of a dynamic set as it is updated. We call such a set persistent. One way to implement a persistent set is to copy the entire set whenever it is modified, but this approach can slow down a program and also consume a lot of space. Sometimes, you can do much better.
Consider a persistent set S with the operations INSERT, DELETE, and SEARCH, which you implement using binary search trees.
Maintain a separate root for every version of the set. In order to insert a key (e.g., key 5) into the set, create a new node with that key. This node becomes the child of a new copy of its parent, and so on up to the root. Thus, you copy only part of the tree and share some of the nodes with the original tree.
Assume that each tree node has the attributes key, left, and right but no parent.
Part (a)
For a persistent binary search tree (not a red-black tree, just a binary search tree), identify the nodes that need to change to insert or delete a node.
Part (b)
Write a procedure PERSISTENT-TREE-INSERT(T, z) that, given a persistent binary search tree T and a node z to insert, returns a new persistent tree T' that is the result of inserting z into T. Assume that you have a procedure COPY-NODE(x) that makes a copy of node x, including all of its attributes.
Part (c)
If the height of the persistent binary search tree T is h, what are the time and space requirements of your implementation of PERSISTENT-TREE-INSERT? (The space requirement is proportional to the number of nodes that are copied.)
Part (d)
Suppose that you include the parent attribute in each node. In this case, the PERSISTENT-TREE-INSERT procedure needs to perform additional copying. Prove that PERSISTENT-TREE-INSERT then requires Ω(n) time and space, where n is the number of nodes in the tree.
Part (e)
Show how to use red-black trees to guarantee that the worst-case running time and space are O(log n) per insertion. You may assume that all keys are distinct.
---
Problem 2: Join Operation on Red-Black Trees [35 points]
The join operation takes two dynamic sets S₁ and S₂ and an element x such that for any x₁ ∈ S₁ and x₂ ∈ S₂, we have x₁.key ≤ x.key ≤ x₂.key. It returns a set S = S₁ ∪ {x} ∪ S₂.
In this problem, we investigate how to implement the join operation on red-black trees.
Part (a)
Given a red-black tree T, let us store its black-height as the new attribute T.bh. Argue that RB-INSERT can maintain the bh attribute without requiring extra storage in the nodes of the tree and without increasing the asymptotic running times. Show how to determine the black-height of each node visited while descending through T, using O(1) time per node visited.
Let T₁ and T₂ be red-black trees and x be a key value such that for any nodes x₁ in T₁ and x₂ in T₂, we have x₁.key ≤ x.key ≤ x₂.key. You will show how to implement the operation RB-JOIN(T₁, x, T₂), which destroys T₁ and T₂ and returns a red-black tree T = T₁ ∪ {x} ∪ T₂. Let n be the total number of nodes in T₁ and T₂.
Part (b)
Assume that T₁.bh ≥ T₂.bh. Describe an O(log n) time algorithm that finds a black node y in T₁ with the largest key from among those nodes whose black-tree height is T₂.bh.
Part (c)
Let Tᵧ be the subtree rooted at y. Describe how T = Tᵧ ∪ {x} ∪ T₂ can replace Tᵧ in O(1) time without destroying the binary-search-tree property.
Part (d)
What color should we make x so that red-black properties 1, 3, and 5 are maintained? Describe how to enforce properties 2 and 4 in O(log n) time.
Part (e)
Argue that no generality is lost by making the assumption in part (b). Describe the symmetric situation that arises when T₁.bh ≤ T₂.bh.
Part (f)
Show that the running time of RB-JOIN is O(log n).
---
Problem 3: Joining and Splitting 2-3-4 Trees [40 Points]
The join operation takes two dynamic sets S' and S'' and an element x such that x'.key < x.key < x''.key for any x' ∈ S' and x'' ∈ S''. It returns a set S = S' ∪ {x} ∪ S''.
The split operation is like an "inverse" join: given a dynamic set S and an element x ∈ S, it creates a set S' that consists of all elements in S − {x} whose keys are less than x.key and a set S'' that consists of all elements in S − {x} whose keys are greater than x.key.
In this problem, we investigate how to implement these operations on 2-3-4 trees (B-trees with t = 2). We assume for convenience that elements consist only of keys and that all key values are distinct.
Part (a)
Show how to maintain, for every node x of a 2-3-4 tree, the height of the subtree rooted at x as an attribute x.height. Make sure that your implementation does not affect the asymptotic running times of searching and insertion.
Part (b)
Show how to implement the join operation. Given two 2-3-4 trees T' and T'' and a key k, the join operation should run in O(1 + |h' − h''|) time, where h' and h'' are the heights of T' and T'', respectively.
Part (c)
Consider the simple path p from the root of a 2-3-4 tree T to a given key k, the set S' of keys in T that are less than k, and the set S'' of keys in T that are greater than k. Show that p breaks S' into a set of trees {T'₀, T'₁, ..., T'ₘ} and a set of keys {k'₁, k'₂, ..., k'ₘ} such that y < k'ᵢ < z for i = 1, 2, ..., m and any keys y ∈ T'ᵢ₋₁ and z ∈ T'ᵢ. What is the relationship between the heights of T'ᵢ₋₁ and T'ᵢ? Describe how p breaks S'' into sets of trees and keys.
Part (d)
Show how to implement the split operation on T. Use the join operation to assemble the keys in S' into a single 2-3-4 tree T' and the keys in S'' into a single 2-3-4 tree T''. The running time of the split operation should be O(log n), where n is the number of keys in T. (Hint: The costs for joining should telescope.)
# COMP382 Homework 2
## Student Responses
<!-- Write your solutions below. Add sections for each problem as needed. -->
<!-- Use proper mathematical notation where applicable. -->
[Your solutions here]
COMP382 Homework 2
Student Responses
---
Problem 1: Persistent Dynamic Sets
Part (a)
Insert: When inserting a node into a persistent BST, the nodes that need to change are exactly those on the path from the root to the insertion point (the leaf position where the new node is placed). Each node on this path must be copied because its child pointer (left or right) changes to point to the new copy of its child below it. Nodes not on this path remain unchanged and are shared between the old and new tree versions.
Delete: The nodes that need to change depend on the deletion case:
- Node has 0 or 1 children: All nodes on the path from the root to the deleted node must be copied. The deleted node's parent copy has its child pointer updated to NIL (0 children) or to the deleted node's single child (1 child).
- Node has 2 children: We find the successor of the deleted node (go right, then all the way left). All nodes on the path from the root through the deleted node down to the successor must be copied. The deleted node's copy receives the successor's key, and the successor is removed by updating its parent's child pointer.
In all cases, only nodes on the path from root to the physically removed position need to be copied.
Part (b)
PERSISTENT-TREE-INSERT(T, z)
1 let T' be a new tree
2 if T.root == NIL
3 T'.root = z
4 return T'
5 T'.root = COPY-NODE(T.root)
6 y = T'.root
7 while TRUE
8 if z.key < y.key
9 if y.left == NIL
10 y.left = z
11 return T'
12 else
13 y.left = COPY-NODE(y.left)
14 y = y.left
15 else
16 if y.right == NIL
17 y.right = z
18 return T'
19 else
20 y.right = COPY-NODE(y.right)
21 y = y.rightExplanation: The procedure creates a new tree T' by copying only the nodes on the path from the root to the insertion point. Starting at the root, we copy the current node and descend toward the correct insertion position. At each level, we copy the next node on the path before descending into it. When we find a NIL child in the appropriate direction, we attach z there. Nodes not on the path are shared between T and T' (the copied nodes' other child pointers still reference the original, shared nodes).
Part (c)
- Time: O(h). We traverse the path from the root to the insertion point, performing O(1) work (one COPY-NODE call plus pointer updates) at each of the O(h) levels.
- Space: O(h). We create exactly one new node per level on the root-to-insertion path, plus the new node z. The number of nodes on this path is at most h + 1, so the space requirement is O(h).
Part (d)
Claim: If parent pointers are included, PERSISTENT-TREE-INSERT requires Omega(n) time and space.
Proof: Consider inserting a node z. We must copy all nodes on the root-to-insertion path (at least the root). When we copy a node c on this path, the copy c' replaces c in the new tree. Now consider a child d of c that is NOT on the insertion path. In the original tree, d.parent = c. In the new tree, d's parent should be c' (the copy), not c. However, d is shared between the old and new trees — we cannot modify d.parent to point to c' without corrupting the old tree. Therefore, we must also copy d (creating d') and set d'.parent = c'.
But now d' has children whose parent pointers point to d, not d'. So we must copy d's children as well, and their children, and so on. This cascading copying extends to the entire subtree rooted at d.
This cascading occurs for every off-path child of every node on the insertion path. Since every node in the tree is either on the path or in the subtree of some off-path child of a path node, we end up copying every node in the tree.
Therefore, the total number of nodes copied is n, giving Omega(n) time and space. QED.
Part (e)
We use a persistent red-black tree without parent pointers (as motivated by Part (d)). Red-black trees guarantee a height of O(log n), which directly gives us O(log n) worst-case time and space per insertion:
1. Path copying: Since the height of a red-black tree with n nodes is at most 2 lg(n+1), the root-to-insertion path has O(log n) nodes. We copy each of these nodes as in Part (b), taking O(log n) time and space.
2. Maintaining ancestors for fixup: Since we have no parent pointers, we store the path from the root to the inserted node in a stack/array as we descend. This uses O(log n) space (which does not increase the asymptotic space beyond what we already use for copying).
3. RB-INSERT-FIXUP on the persistent tree: During fixup, we traverse back up the path (using our stored path) performing recolorings and rotations: - Recolorings (Case 1): We recolor the parent, uncle, and grandparent, then move up two levels. The parent and grandparent are already on the path (already copied), so we can modify their colors freely. The uncle is NOT on the path, so we must create a copy of the uncle with the new color. We then update the grandparent's copy to point to this uncle copy. This adds at most one extra copy per Case 1 occurrence. Since Case 1 occurs at most O(log n) times, this contributes at most O(log n) additional copies. - Rotations (Cases 2 and 3): At most 2 rotations are performed. The rotated nodes are on or adjacent to the copied path, and rotations involve O(1) pointer changes on already-copied nodes.
4. Total cost: O(log n) path copies + O(log n) uncle copies + O(1) rotation work = O(log n) time and O(log n) space per insertion.
Since the red-black tree height is always O(log n) regardless of the insertion sequence, this O(log n) bound is a worst-case guarantee, not merely amortized.
---
Problem 2: Join Operation on Red-Black Trees
Part (a)
Maintaining T.bh: We store the black-height as a single attribute T.bh on the tree object itself (not on individual nodes), requiring no extra storage in the nodes.
During RB-INSERT, the black-height of the tree changes only when the root is recolored from red to black. This occurs at the very end of RB-INSERT-FIXUP (the final line: T.root.color = BLACK). We check: if the root was previously red (before being set to black), we increment T.bh by 1. This is O(1) additional work per insertion, so the asymptotic running time is unchanged.
Computing black-height during descent: Starting at the root with bh = T.bh, we can determine the black-height of each visited node in O(1) time using the following rule:
When moving from node x (with known black-height bh_x) to a child c:
- If c is black: bh_c = bh_x - 1
- If c is red: bh_c = bh_x
This follows from the definition: the black-height of x counts the black nodes on any path from x (exclusive) to a leaf. If c is black, then c accounts for one of those black nodes, so the remaining count from c to a leaf is bh_x - 1. If c is red, it doesn't count toward x's black-height, so the count from c to a leaf is still bh_x.
Part (b)
Algorithm to find black node y in T_1 with the largest key among those with black-height T_2.bh:
FIND-JOIN-POINT(T₁, T₂)
1 y = T₁.root
2 bh = T₁.bh
3 // Descend the right spine of T₁, tracking black-height
4 while bh ≠ T₂.bh or y.color ≠ BLACK
5 y = y.right
6 if y.color == BLACK
7 bh = bh - 1
8 return yNote: If T_1.bh = T_2.bh, the root (which is always black) is returned immediately without entering the loop.
Correctness: We descend along the right spine of T_1 (always going to the right child), tracking the black-height. Each time we pass through a black node, bh decreases by 1; red nodes leave bh unchanged. We stop at the first black node where bh = T_2.bh. Since we always go right, this node has the largest key among all black nodes at black-height T_2.bh (any other such node would be to the left in the tree and thus have a smaller key).
Running time: O(log n). The path length is bounded by the height of T_1, which is at most 2 lg(n+1) = O(log n).
Part (c)
Let y be the black node found in Part (b), and let p be y's parent in T_1. Since we descended via the right spine, y = p.right.
We replace T_y (the subtree rooted at y) with a new subtree T as follows:
1. Set x.left = y (the root of T_y) 2. Set x.right = T_2.root 3. Set p.right = x (replacing y with x as p's right child)
This takes O(1) time (three pointer assignments).
BST property is maintained:
- Within the new subtree rooted at x: All keys in T_y (= x.left) are from T_1, so they are all ≤ x.key. All keys in T_2 (= x.right) are ≥ x.key. So the BST property holds at x.
- At p: Previously p.right = y, and p.key < y.key (since y was p's right child). Now p.right = x, and p.key < y.key ≤ x.key, so the BST property still holds at p.
- All other nodes are unchanged, so the BST property is preserved throughout the tree.
Part (d)
We color x RED. This maintains:
- Property 1 (every node is red or black): x is colored red. ✓
- Property 3 (every leaf/NIL is black): x is an internal node, not a leaf. ✓
- Property 5 (all paths from a node to descendant leaves have the same number of black nodes): Since x is red, it does not contribute to black-height counts. The left subtree of x (= T_y) has black-height T_2.bh (by our choice of y), and the right subtree (= T_2) has black-height T_2.bh (by definition). So the black-heights are balanced at x. Furthermore, x replaced y (which had black-height T_2.bh) in T_1, and bh(x) = T_2.bh (same as y's was), so property 5 is maintained throughout the tree. ✓
Properties that may be violated:
- Property 2 (root is black): If T_1.bh = T_2.bh, then y was the root of T_1, and x becomes the new root. Since x is red, property 2 is violated. Fix: Simply recolor the root black. This increases the black-height by 1 but doesn't violate any other properties. O(1) time.
- Property 4 (no two consecutive red nodes): If y's parent p is red, then we have p (red) → x (red), violating property 4. Fix: This is exactly the same violation that occurs after a standard RB-INSERT. We run RB-INSERT-FIXUP starting at x and proceeding upward. The fixup performs at most O(log n) recolorings and at most 2 rotations, so it runs in O(log n) time.
Part (e)
No generality is lost because the two cases (T_1.bh ≥ T_2.bh and T_1.bh ≤ T_2.bh) are exhaustive. Every pair of red-black trees satisfies one of these conditions.
Symmetric case when T_1.bh ≤ T_2.bh: We perform the mirror-image operations:
1. Find a black node y in T_2 with the smallest key among those with black-height T_1.bh. This is done by descending the left spine of T_2 (always going to the left child), tracking black-height until we find a black node at bh = T_1.bh.
2. Replace T_y with a new subtree: set x.left = T_1.root, x.right = y, and update y's parent to point to x.
3. Color x red and run RB-INSERT-FIXUP from x upward to restore properties 2 and 4.
The analysis is identical to Parts (b)–(d) by symmetry, and the running time is O(log n).
Part (f)
The running time of RB-JOIN is O(log n), as shown by the following breakdown:
| Step | Operation | Time |
|---|---|---|
| (a) | Maintain/compute black-heights | O(1) per node, O(log n) total |
| (b) | Find join point y | O(log n) — descend one spine |
| (c) | Replace subtree | O(1) — three pointer assignments |
| (d) | Color x red | O(1) |
| (d) | RB-INSERT-FIXUP | O(log n) — at most O(log n) recolorings + 2 rotations |
Total: O(log n).
More precisely, the cost of finding y (step b) is proportional to the difference in heights of T_1 and T_2, which is at most 2|T_1.bh - T_2.bh| ≤ O(log n). The cost of fixup (step d) is proportional to the height of the resulting tree above x, which is also O(log n). Both are bounded by O(log n), and the remaining steps are O(1). Therefore, the total running time of RB-JOIN is O(log n).
---
Problem 3: Joining and Splitting 2-3-4 Trees
Part (a)
Maintaining x.height: We store the height of the subtree rooted at each node x as an attribute x.height. In a 2-3-4 tree, all leaves are at the same depth, so the height of a node equals 1 + the height of any of its children. Leaf nodes (at the bottom level) have height 0.
During search: No modifications are made to the tree, so heights are unchanged. Reading x.height is O(1) per node visited. The asymptotic running time O(log n) is unaffected.
During insertion: We insert a key into a leaf node, potentially splitting nodes along the path from the leaf to the root. We maintain heights as follows:
- When we split a node at height h, it produces two nodes, each also at height h (they have the same children as subsets of the original). The middle key is pushed up to the parent. The parent's height is unchanged (it remains 1 + the height of its children, which hasn't changed). This is O(1) work per split.
- When the root splits, a new root is created with height = (old root's height) + 1. This is O(1).
- All nodes not involved in splits retain their existing heights.
Since insertion visits at most O(log n) nodes (the height of the tree) and does O(1) extra work per node to maintain heights, the asymptotic running time of insertion remains O(log n).
Part (b)
Join algorithm for 2-3-4 trees T' (height h') and T'' (height h'') with key k:
Case 1: h' = h''. Create a new root node (a 2-node) containing key k, with T'.root as its left child and T''.root as its right child. Set the new root's height to h' + 1. Time: O(1).
Case 2: h' > h''. We attach T'' into T' at the appropriate level:
1. Starting at T'.root, descend along the rightmost child pointer at each node. At each node encountered, if it is a 4-node, split it (push the middle key up to the parent, creating two 2-nodes). This top-down splitting ensures no node on our descent path is a 4-node. 2. After descending exactly h' - h'' levels, we reach a node y at height h''. Let p be y's parent. Since we split 4-nodes on the way down, p is not a 4-node (it has at most 3 keys). 3. Insert key k into p as a new key to the right of y's position, and make T''.root the new child to the right of k. Node y remains the child to the left of k. - Since p had at most 3 keys and we added one, p now has at most 4 keys. But with top-down splitting, p had at most 3 children (a 2-node or 3-node), so after adding k and T'', it has at most 4 children — still a valid 2-3-4 tree node. (If p was a 3-node, it becomes a 4-node, which is valid.) 4. Update height attributes along the path if necessary (each is O(1)).
Time: O(h' - h''). We descend h' - h'' levels, doing O(1) work at each level (splitting and/or pointer updates). The insertion at the bottom is O(1).
Case 3: h'' > h'. Symmetric to Case 2. Descend the leftmost spine of T'' for h'' - h' levels, splitting 4-nodes top-down. Attach T' and k on the left side.
Overall time: O(1 + |h' - h''|).
Correctness of BST property: All keys in T' are less than k (given), and all keys in T'' are greater than k (given). Node y is on the rightmost spine of T', so all keys in T_y (the subtree rooted at y) are among the largest in T'. Inserting k to the right of y in p, with T'' as k's right child, preserves the ordering: keys in T_y < k < keys in T''.
Part (c)
Consider the path p from the root of T to key k. At each internal node on the path, the node contains some keys and has some children. One child leads to the next node on the path; the remaining keys and children are distributed to S' or S''.
How p breaks S' into trees and keys:
At each node n_i on the path (from root at depth 0 to the node containing k at depth d):
- Let the keys in n_i be a_1 < a_2 < ... < a_t, and suppose the path follows child c_j (the child between a_j and a_{j+1}, or the leftmost/rightmost child).
- Keys a_1, ..., a_j (those less than k at this node) are contributed to S'.
- Their corresponding left subtrees (children c_0, ..., c_{j-1}) are also contributed to S'.
Collecting all these contributions in order from smallest key to largest key, we get:
S' = T'_0, k'_1, T'_1, k'_2, T'_2, ..., k'_m, T'_m
where:
- Each T'_i is an off-path subtree whose keys are all less than k.
- Each k'_i is a key from a path node that is less than k.
- For any y in T'_{i-1} and z in T'_i: y < k'_i < z. This follows from the BST property of the original 2-3-4 tree — the keys and subtrees are naturally ordered from left to right.
Relationship between heights: height(T'_{i-1}) >= height(T'_i). The heights are non-increasing. This is because the trees are listed in order of increasing key value, and:
- Trees collected from shallower depths (near the root) have larger heights and smaller keys (they come from the left side of high-level nodes).
- Trees collected from deeper levels have smaller heights and keys closer to (but less than) k.
More precisely, an off-path subtree whose root is at depth d+1 in the original tree T has height h - d - 1. Since we list subtrees from shallowest (smallest key, largest height) to deepest (largest key among those < k, smallest height), consecutive heights satisfy height(T'_{i-1}) - height(T'_i) in {0, 1}: they differ by 0 when both come from the same node (same depth), and by 1 when they come from nodes at adjacent depths.
S'' is broken symmetrically: S'' = T''_0, k''_1, T''_1, ..., k''_{m''}, T''_{m''}
where:
- T''_0 has the smallest keys in S'' (closest to k) and the smallest height.
- T''_{m''} has the largest keys in S'' and the largest height.
- Heights are non-decreasing: height(T''_{i-1}) <= height(T''_i).
This is because the trees with the smallest keys in S'' come from the deepest level (keys just above k), while the trees with the largest keys come from the shallowest level (rightmost subtrees near the root).
Part (d)
Split algorithm:
Step 1: Walk the path and collect. Search for k in T, following the path from root to the node containing k. At each node on the path, separate the keys and off-path subtrees into two collections:
- L (for S'): trees T'_0, T'_1, ..., T'_m and keys k'_1, ..., k'_m (those less than k)
- R (for S''): trees T''_0, T''_1, ..., T''_{m''} and keys k''_1, ..., k''_{m''} (those greater than k)
This takes O(h) = O(log n) time.
Step 2: Assemble T' using joins. Since heights of T'_0, T'_1, ..., T'_m are non-increasing, we join from right to left (starting with the smallest trees):
T' = T'_m
for i = m down to 1:
T' = JOIN(T'_{i-1}, k'_i, T')Each JOIN takes O(1 + |height(T'_{i-1}) - height(T')|) time.
Step 3: Assemble T'' using joins. Since heights of T''_0, T''_1, ..., T''_{m''} are non-decreasing, we join from left to right (starting with the smallest trees):
T'' = T''_0
for i = 1 to m'':
T'' = JOIN(T'', k''_i, T''_i)Running time analysis (telescoping argument):
For assembling T', let r_i denote the height of the result after the i-th join, with r_0 = height(T'_m).
At step i (joining T'_{m-i} with the current result):
- height(T'_{m-i}) >= height(T'_{m-i+1}) >= ... (non-increasing sequence)
- Cost_i = O(1 + |height(T'_{m-i}) - r_{i-1}|)
- After join: r_i <= max(height(T'_{m-i}), r_{i-1}) + 1
Since height(T'_{m-i}) >= r_{i-1} in general (the new tree being joined is at least as tall as or taller than the current result), we have:
- Cost_i = O(1 + height(T'_{m-i}) - r_{i-1})
- r_i <= height(T'_{m-i}) + 1
The total cost telescopes:
Sum of costs = Sum_{i=1}^{m} O(1 + height(T'_{m-i}) - r_{i-1})
Since r_{i-1} >= height(T'_{m-i+1}), the height differences telescope:
Sum of (height(T'_{m-i}) - r_{i-1}) <= Sum of (height(T'_{m-i}) - height(T'_{m-i+1})) + m = height(T'_0) - height(T'_m) + m
Since height(T'_0) <= h - 1 = O(log n) and m <= 3h (at most 3 subtrees per node on the path, with path length h), the total cost is:
O(m + h) = O(h) = O(log n)
The same telescoping argument applies to assembling T'' (symmetric case).
Total running time of split: O(log n) for the search + O(log n) for assembling T' + O(log n) for assembling T'' = O(log n).
COMP 382 Homework 2 - Grading Report
Problem 1: Persistent Dynamic Sets [25 points]
Score: 25/25 points
Assessment:
Approach: The student provides thorough, correct answers for all five parts, often exceeding the reference solution in detail.
Strengths:
- Part (a) goes beyond the reference by providing a detailed case analysis for deletion (0/1 children vs. 2 children), not just insertion
- Part (b) pseudocode is clean, correct, and well-annotated with line numbers and a clear explanation
- Part (d) proof is rigorous: clearly shows the cascading copy argument (copying a path node forces copying off-path children due to parent pointer corruption, which cascades to all descendants)
- Part (e) provides an excellent analysis of the RB-INSERT-FIXUP cases on a persistent tree, including the key insight that uncle copies add at most O(log n) additional nodes
Errors/Gaps:
- None identified
Detailed Feedback: All five parts are mathematically correct and well-justified. The pseudocode in Part (b) correctly copies only nodes on the root-to-insertion path, sharing all other nodes between versions. The Ω(n) proof in Part (d) correctly argues that parent pointer updates cascade to the entire tree. Part (e) correctly identifies that red-black trees without parent pointers (using a stack for ancestors) yield O(log n) worst-case guarantees, and the case-by-case analysis of the fixup procedure is more thorough than the reference solution.
Rubric Breakdown:
- Part (a): 5/5 — Correctly identifies path nodes for both insert and delete
- Part (b): 5/5 — Correct, clean pseudocode with explanation
- Part (c): 5/5 — Correct O(h) time and O(h) space analysis
- Part (d): 5/5 — Rigorous cascading copy proof for Ω(n) bound
- Part (e): 5/5 — Correct use of RB trees with stack-based ancestor tracking
---
Problem 2: Join Operation on Red-Black Trees [35 points]
Score: 35/35 points
Assessment:
Approach: The student provides complete, correct solutions for all six parts with clear algorithms and correctness arguments.
Strengths:
- Part (a) precisely explains the O(1)-per-node black-height tracking during descent (decrement for black children, unchanged for red children)
- Part (b) provides clean pseudocode with a correct correctness argument (rightmost spine guarantees largest key at each black-height level)
- Part (c) carefully proves the BST property is maintained with three specific observations
- Part (d) provides a thorough property-by-property analysis for coloring x red
- Part (f) gives a clear summary table of all costs
Errors/Gaps:
- Part (c): Parent pointer updates (y.parent = x, T₂.root.parent = x, x.parent = p) are not explicitly mentioned, though the reference solution includes them. This is a very minor omission since the structural changes are complete and the BST property argument is correct.
Detailed Feedback: The student's algorithm in Part (b) correctly descends the right spine of T₁ tracking black-height, stopping at the first black node where bh = T₂.bh. The correctness argument is sound — always going right ensures the largest key among nodes at the target black-height. Part (c) correctly identifies the three pointer assignments and proves BST preservation at x and at p. Part (d) correctly identifies that x should be red, provides a clean property-by-property verification, and correctly identifies that properties 2 and 4 may be violated, with appropriate fixes (recolor root for property 2, RB-INSERT-FIXUP for property 4). The symmetric case in Part (e) is correctly described. The O(log n) total in Part (f) is correctly derived.
Rubric Breakdown:
- Part (a): 6/6 — Correct bh maintenance and O(1)-per-node descent computation
- Part (b): 6/6 — Correct algorithm with pseudocode and correctness proof
- Part (c): 6/6 — Correct O(1) replacement preserving BST property
- Part (d): 7/7 — Correct color choice with complete property analysis and fixup
- Part (e): 5/5 — Correct symmetric argument
- Part (f): 5/5 — Correct O(log n) running time analysis
---
Problem 3: Joining and Splitting 2-3-4 Trees [40 points]
Score: 40/40 points
Assessment:
Approach: The student provides detailed, correct solutions for all four parts, including a well-structured join algorithm with top-down splitting and a clear telescoping argument for the split operation.
Strengths:
- Part (a) correctly handles both search (no change) and insertion (O(1) per split, O(1) for root split)
- Part (b) provides a comprehensive three-case analysis (h' = h'', h' > h'', h'' > h') with a correct top-down splitting strategy that ensures the insertion node is not full
- Part (c) provides an exceptionally detailed decomposition of how the path breaks S' and S'' into trees and keys, with a precise analysis of the height relationship (non-increasing for S', non-decreasing for S'')
- Part (d) presents a clear split algorithm with a correct telescoping cost analysis
Errors/Gaps:
- The telescoping argument in Part (d) could be slightly more rigorous regarding the intermediate result heights (the +1 from potential root splits during joins), but the conclusion O(log n) is correct.
Detailed Feedback: Part (b) uses a top-down splitting approach (splitting 4-nodes during descent before reaching the target level), which is a valid and arguably more elegant approach than splitting only at the insertion point. The invariant that p is never a 4-node when we insert k is correctly maintained. The time bound O(1 + |h' - h''|) is correct since we do O(1) work per level traversed.
Part (c) is particularly well done. The student correctly identifies the non-increasing height sequence for S' (off-path subtrees from shallower depths have larger heights and smaller keys) and non-decreasing for S'' (symmetric). The precise observation that consecutive heights differ by 0 (same node) or 1 (adjacent depths) is correct and stronger than the reference.
Part (d) correctly assembles T' from right to left (starting with the smallest/shortest trees) and T'' from left to right. The telescoping argument is essentially correct: the sum of height differences across all joins telescopes to O(h) because each join's cost is absorbed by the overall height growth from h_m to h_0. The bound O(m + h) = O(h) = O(log n) follows from m ≤ 3h (at most 3 off-path subtrees per node on a path of length h).
Rubric Breakdown:
- Part (a): 8/8 — Correct height maintenance during search and insertion
- Part (b): 12/12 — Correct join algorithm with all cases and O(1 + |h' - h''|) time
- Part (c): 10/10 — Thorough path decomposition with correct height relationships
- Part (d): 10/10 — Correct split algorithm with valid telescoping argument
---
Overall Summary
| Problem | Score | Notes |
|---|---|---|
| 1: Persistent Dynamic Sets | 25/25 | Excellent throughout; Part (d) proof and Part (e) fixup analysis particularly strong |
| 2: Join on Red-Black Trees | 35/35 | Complete and correct; clean pseudocode and property analysis |
| 3: Joining/Splitting 2-3-4 Trees | 40/40 | Outstanding; thorough case analysis in join, precise height analysis in split |
| Total | 100/100 |
General Comments:
This is an exceptional submission that demonstrates mastery of persistent data structures, red-black tree operations, and B-tree variants. The solutions are consistently correct, complete, and well-organized. In several places — notably Problem 1(a) (deletion cases), Problem 1(e) (fixup case analysis), Problem 2(d) (property-by-property verification), and Problem 3(c) (precise height relationship) — the student provides more detail and rigor than the reference solution. The pseudocode is clean and clearly specified, the correctness arguments are convincing, and the complexity analyses are sound. The telescoping argument in Problem 3(d) correctly captures the key insight that sequential join costs are dominated by the total height of the original tree.
Total: 100/100
diff --git a/writeup.md b/writeup.md
index 0ea5554..7919b87 100644
--- a/writeup.md
+++ b/writeup.md
@@ -2,8 +2,312 @@
## Student Responses
-<!-- Write your solutions below. Add sections for each problem as needed. -->
-<!-- Use proper mathematical notation where applicable. -->
+---
-[Your solutions here]
+## Problem 1: Persistent Dynamic Sets
+### Part (a)
+
+**Insert:** When inserting a node into a persistent BST, the nodes that need to change are exactly those on the path from the root to the insertion point (the leaf position where the new node is placed). Each node on this path must be copied because its child pointer (left or right) changes to point to the new copy of its child below it. Nodes not on this path remain unchanged and are shared between the old and new tree versions.
+
+**Delete:** The nodes that need to change depend on the deletion case:
+- **Node has 0 or 1 children:** All nodes on the path from the root to the deleted node must be copied. The deleted node's parent copy has its child pointer updated to NIL (0 children) or to the deleted node's single child (1 child).
+- **Node has 2 children:** We find the successor of the deleted node (go right, then all the way left). All nodes on the path from the root through the deleted node down to the successor must be copied. The deleted node's copy receives the successor's key, and the successor is removed by updating its parent's child pointer.
+
+In all cases, only nodes on the path from root to the physically removed position need to be copied.
+
+### Part (b)
+
+```
+PERSISTENT-TREE-INSERT(T, z)
+ 1 let T' be a new tree
+ 2 if T.root == NIL
+ 3 T'.root = z
+ 4 return T'
+ 5 T'.root = COPY-NODE(T.root)
+ 6 y = T'.root
+ 7 while TRUE
+ 8 if z.key < y.key
+ 9 if y.left == NIL
+10 y.left = z
+11 return T'
+12 else
+13 y.left = COPY-NODE(y.left)
+14 y = y.left
+15 else
+16 if y.right == NIL
+17 y.right = z
+18 return T'
+19 else
+20 y.right = COPY-NODE(y.right)
+21 y = y.right
+```
+
+**Explanation:** The procedure creates a new tree T' by copying only the nodes on the path from the root to the insertion point. Starting at the root, we copy the current node and descend toward the correct insertion position. At each level, we copy the next node on the path before descending into it. When we find a NIL child in the appropriate direction, we attach z there. Nodes not on the path are shared between T and T' (the copied nodes' other child pointers still reference the original, shared nodes).
+
+### Part (c)
+
+- **Time:** O(h). We traverse the path from the root to the insertion point, performing O(1) work (one COPY-NODE call plus pointer updates) at each of the O(h) levels.
+- **Space:** O(h). We create exactly one new node per level on the root-to-insertion path, plus the new node z. The number of nodes on this path is at most h + 1, so the space requirement is O(h).
+
+### Part (d)
+
+**Claim:** If parent pointers are included, PERSISTENT-TREE-INSERT requires Omega(n) time and space.
+
+**Proof:** Consider inserting a node z. We must copy all nodes on the root-to-insertion path (at least the root). When we copy a node c on this path, the copy c' replaces c in the new tree. Now consider a child d of c that is NOT on the insertion path. In the original tree, d.parent = c. In the new tree, d's parent should be c' (the copy), not c. However, d is shared between the old and new trees — we cannot modify d.parent to point to c' without corrupting the old tree. Therefore, we must also copy d (creating d') and set d'.parent = c'.
+
+But now d' has children whose parent pointers point to d, not d'. So we must copy d's children as well, and their children, and so on. This cascading copying extends to the entire subtree rooted at d.
+
+This cascading occurs for every off-path child of every node on the insertion path. Since every node in the tree is either on the path or in the subtree of some off-path child of a path node, we end up copying every node in the tree.
+
+Therefore, the total number of nodes copied is n, giving Omega(n) time and space. QED.
+
+### Part (e)
+
+We use a persistent red-black tree **without parent pointers** (as motivated by Part (d)). Red-black trees guarantee a height of O(log n), which directly gives us O(log n) worst-case time and space per insertion:
+
+1. **Path copying:** Since the height of a red-black tree with n nodes is at most 2 lg(n+1), the root-to-insertion path has O(log n) nodes. We copy each of these nodes as in Part (b), taking O(log n) time and space.
+
+2. **Maintaining ancestors for fixup:** Since we have no parent pointers, we store the path from the root to the inserted node in a stack/array as we descend. This uses O(log n) space (which does not increase the asymptotic space beyond what we already use for copying).
+
+3. **RB-INSERT-FIXUP on the persistent tree:** During fixup, we traverse back up the path (using our stored path) performing recolorings and rotations:
+ - **Recolorings (Case 1):** We recolor the parent, uncle, and grandparent, then move up two levels. The parent and grandparent are already on the path (already copied), so we can modify their colors freely. The uncle is NOT on the path, so we must create a copy of the uncle with the new color. We then update the grandparent's copy to point to this uncle copy. This adds at most one extra copy per Case 1 occurrence. Since Case 1 occurs at most O(log n) times, this contributes at most O(log n) additional copies.
+ - **Rotations (Cases 2 and 3):** At most 2 rotations are performed. The rotated nodes are on or adjacent to the copied path, and rotations involve O(1) pointer changes on already-copied nodes.
+
+4. **Total cost:** O(log n) path copies + O(log n) uncle copies + O(1) rotation work = **O(log n) time and O(log n) space** per insertion.
+
+Since the red-black tree height is always O(log n) regardless of the insertion sequence, this O(log n) bound is a **worst-case** guarantee, not merely amortized.
+
+---
+
+## Problem 2: Join Operation on Red-Black Trees
+
+### Part (a)
+
+**Maintaining T.bh:** We store the black-height as a single attribute T.bh on the tree object itself (not on individual nodes), requiring no extra storage in the nodes.
+
+During RB-INSERT, the black-height of the tree changes only when the root is recolored from red to black. This occurs at the very end of RB-INSERT-FIXUP (the final line: T.root.color = BLACK). We check: if the root was previously red (before being set to black), we increment T.bh by 1. This is O(1) additional work per insertion, so the asymptotic running time is unchanged.
+
+**Computing black-height during descent:** Starting at the root with bh = T.bh, we can determine the black-height of each visited node in O(1) time using the following rule:
+
+When moving from node x (with known black-height bh_x) to a child c:
+- If c is **black**: bh_c = bh_x - 1
+- If c is **red**: bh_c = bh_x
+
+This follows from the definition: the black-height of x counts the black nodes on any path from x (exclusive) to a leaf. If c is black, then c accounts for one of those black nodes, so the remaining count from c to a leaf is bh_x - 1. If c is red, it doesn't count toward x's black-height, so the count from c to a leaf is still bh_x.
+
+### Part (b)
+
+**Algorithm** to find black node y in T_1 with the largest key among those with black-height T_2.bh:
+
+```
+FIND-JOIN-POINT(T₁, T₂)
+ 1 y = T₁.root
+ 2 bh = T₁.bh
+ 3 // Descend the right spine of T₁, tracking black-height
+ 4 while bh ≠ T₂.bh or y.color ≠ BLACK
+ 5 y = y.right
+ 6 if y.color == BLACK
+ 7 bh = bh - 1
+ 8 return y
+```
+
+Note: If T_1.bh = T_2.bh, the root (which is always black) is returned immediately without entering the loop.
+
+**Correctness:** We descend along the right spine of T_1 (always going to the right child), tracking the black-height. Each time we pass through a black node, bh decreases by 1; red nodes leave bh unchanged. We stop at the first black node where bh = T_2.bh. Since we always go right, this node has the largest key among all black nodes at black-height T_2.bh (any other such node would be to the left in the tree and thus have a smaller key).
+
+**Running time:** O(log n). The path length is bounded by the height of T_1, which is at most 2 lg(n+1) = O(log n).
+
+### Part (c)
+
+Let y be the black node found in Part (b), and let p be y's parent in T_1. Since we descended via the right spine, y = p.right.
+
+We replace T_y (the subtree rooted at y) with a new subtree T as follows:
+
+1. Set x.left = y (the root of T_y)
+2. Set x.right = T_2.root
+3. Set p.right = x (replacing y with x as p's right child)
+
+This takes O(1) time (three pointer assignments).
+
+**BST property is maintained:**
+- **Within the new subtree rooted at x:** All keys in T_y (= x.left) are from T_1, so they are all ≤ x.key. All keys in T_2 (= x.right) are ≥ x.key. So the BST property holds at x.
+- **At p:** Previously p.right = y, and p.key < y.key (since y was p's right child). Now p.right = x, and p.key < y.key ≤ x.key, so the BST property still holds at p.
+- **All other nodes** are unchanged, so the BST property is preserved throughout the tree.
+
+### Part (d)
+
+**We color x RED.** This maintains:
+
+- **Property 1** (every node is red or black): x is colored red. ✓
+- **Property 3** (every leaf/NIL is black): x is an internal node, not a leaf. ✓
+- **Property 5** (all paths from a node to descendant leaves have the same number of black nodes): Since x is red, it does not contribute to black-height counts. The left subtree of x (= T_y) has black-height T_2.bh (by our choice of y), and the right subtree (= T_2) has black-height T_2.bh (by definition). So the black-heights are balanced at x. Furthermore, x replaced y (which had black-height T_2.bh) in T_1, and bh(x) = T_2.bh (same as y's was), so property 5 is maintained throughout the tree. ✓
+
+**Properties that may be violated:**
+
+- **Property 2** (root is black): If T_1.bh = T_2.bh, then y was the root of T_1, and x becomes the new root. Since x is red, property 2 is violated. **Fix:** Simply recolor the root black. This increases the black-height by 1 but doesn't violate any other properties. O(1) time.
+
+- **Property 4** (no two consecutive red nodes): If y's parent p is red, then we have p (red) → x (red), violating property 4. **Fix:** This is exactly the same violation that occurs after a standard RB-INSERT. We run RB-INSERT-FIXUP starting at x and proceeding upward. The fixup performs at most O(log n) recolorings and at most 2 rotations, so it runs in O(log n) time.
+
+### Part (e)
+
+**No generality is lost** because the two cases (T_1.bh ≥ T_2.bh and T_1.bh ≤ T_2.bh) are exhaustive. Every pair of red-black trees satisfies one of these conditions.
+
+**Symmetric case when T_1.bh ≤ T_2.bh:** We perform the mirror-image operations:
+
+1. Find a black node y in T_2 with the **smallest** key among those with black-height T_1.bh. This is done by descending the **left** spine of T_2 (always going to the left child), tracking black-height until we find a black node at bh = T_1.bh.
+
+2. Replace T_y with a new subtree: set x.left = T_1.root, x.right = y, and update y's parent to point to x.
+
+3. Color x red and run RB-INSERT-FIXUP from x upward to restore properties 2 and 4.
+
+The analysis is identical to Parts (b)–(d) by symmetry, and the running time is O(log n).
+
+### Part (f)
+
+The running time of RB-JOIN is O(log n), as shown by the following breakdown:
+
+| Step | Operation | Time |
+|------|-----------|------|
+| (a) | Maintain/compute black-heights | O(1) per node, O(log n) total |
+| (b) | Find join point y | O(log n) — descend one spine |
+| (c) | Replace subtree | O(1) — three pointer assignments |
+| (d) | Color x red | O(1) |
+| (d) | RB-INSERT-FIXUP | O(log n) — at most O(log n) recolorings + 2 rotations |
+
+**Total: O(log n).**
+
+More precisely, the cost of finding y (step b) is proportional to the difference in heights of T_1 and T_2, which is at most 2|T_1.bh - T_2.bh| ≤ O(log n). The cost of fixup (step d) is proportional to the height of the resulting tree above x, which is also O(log n). Both are bounded by O(log n), and the remaining steps are O(1). Therefore, the total running time of RB-JOIN is O(log n).
+
+---
+
+## Problem 3: Joining and Splitting 2-3-4 Trees
+
+### Part (a)
+
+**Maintaining x.height:** We store the height of the subtree rooted at each node x as an attribute x.height. In a 2-3-4 tree, all leaves are at the same depth, so the height of a node equals 1 + the height of any of its children. Leaf nodes (at the bottom level) have height 0.
+
+**During search:** No modifications are made to the tree, so heights are unchanged. Reading x.height is O(1) per node visited. The asymptotic running time O(log n) is unaffected.
+
+**During insertion:** We insert a key into a leaf node, potentially splitting nodes along the path from the leaf to the root. We maintain heights as follows:
+
+- When we **split** a node at height h, it produces two nodes, each also at height h (they have the same children as subsets of the original). The middle key is pushed up to the parent. The parent's height is unchanged (it remains 1 + the height of its children, which hasn't changed). This is O(1) work per split.
+- When the **root splits**, a new root is created with height = (old root's height) + 1. This is O(1).
+- All nodes not involved in splits retain their existing heights.
+
+Since insertion visits at most O(log n) nodes (the height of the tree) and does O(1) extra work per node to maintain heights, the asymptotic running time of insertion remains O(log n).
+
+### Part (b)
+
+**Join algorithm** for 2-3-4 trees T' (height h') and T'' (height h'') with key k:
+
+**Case 1: h' = h''.** Create a new root node (a 2-node) containing key k, with T'.root as its left child and T''.root as its right child. Set the new root's height to h' + 1. Time: O(1).
+
+**Case 2: h' > h''.** We attach T'' into T' at the appropriate level:
+
+1. Starting at T'.root, descend along the **rightmost** child pointer at each node. At each node encountered, if it is a 4-node, **split it** (push the middle key up to the parent, creating two 2-nodes). This top-down splitting ensures no node on our descent path is a 4-node.
+2. After descending exactly h' - h'' levels, we reach a node y at height h''. Let p be y's parent. Since we split 4-nodes on the way down, p is not a 4-node (it has at most 3 keys).
+3. Insert key k into p as a new key to the right of y's position, and make T''.root the new child to the right of k. Node y remains the child to the left of k.
+ - Since p had at most 3 keys and we added one, p now has at most 4 keys. But with top-down splitting, p had at most 3 children (a 2-node or 3-node), so after adding k and T'', it has at most 4 children — still a valid 2-3-4 tree node. (If p was a 3-node, it becomes a 4-node, which is valid.)
+4. Update height attributes along the path if necessary (each is O(1)).
+
+**Time:** O(h' - h''). We descend h' - h'' levels, doing O(1) work at each level (splitting and/or pointer updates). The insertion at the bottom is O(1).
+
+**Case 3: h'' > h'.** Symmetric to Case 2. Descend the **leftmost** spine of T'' for h'' - h' levels, splitting 4-nodes top-down. Attach T' and k on the left side.
+
+**Overall time:** O(1 + |h' - h''|).
+
+**Correctness of BST property:** All keys in T' are less than k (given), and all keys in T'' are greater than k (given). Node y is on the rightmost spine of T', so all keys in T_y (the subtree rooted at y) are among the largest in T'. Inserting k to the right of y in p, with T'' as k's right child, preserves the ordering: keys in T_y < k < keys in T''.
+
+### Part (c)
+
+Consider the path p from the root of T to key k. At each internal node on the path, the node contains some keys and has some children. One child leads to the next node on the path; the remaining keys and children are distributed to S' or S''.
+
+**How p breaks S' into trees and keys:**
+
+At each node n_i on the path (from root at depth 0 to the node containing k at depth d):
+- Let the keys in n_i be a_1 < a_2 < ... < a_t, and suppose the path follows child c_j (the child between a_j and a_{j+1}, or the leftmost/rightmost child).
+- Keys a_1, ..., a_j (those less than k at this node) are contributed to S'.
+- Their corresponding left subtrees (children c_0, ..., c_{j-1}) are also contributed to S'.
+
+Collecting all these contributions in order from smallest key to largest key, we get:
+
+**S' = T'_0, k'_1, T'_1, k'_2, T'_2, ..., k'_m, T'_m**
+
+where:
+- Each T'_i is an off-path subtree whose keys are all less than k.
+- Each k'_i is a key from a path node that is less than k.
+- For any y in T'_{i-1} and z in T'_i: y < k'_i < z. This follows from the BST property of the original 2-3-4 tree — the keys and subtrees are naturally ordered from left to right.
+
+**Relationship between heights:** height(T'_{i-1}) >= height(T'_i). The heights are **non-increasing**. This is because the trees are listed in order of increasing key value, and:
+- Trees collected from shallower depths (near the root) have larger heights and smaller keys (they come from the left side of high-level nodes).
+- Trees collected from deeper levels have smaller heights and keys closer to (but less than) k.
+
+More precisely, an off-path subtree whose root is at depth d+1 in the original tree T has height h - d - 1. Since we list subtrees from shallowest (smallest key, largest height) to deepest (largest key among those < k, smallest height), consecutive heights satisfy height(T'_{i-1}) - height(T'_i) in {0, 1}: they differ by 0 when both come from the same node (same depth), and by 1 when they come from nodes at adjacent depths.
+
+**S'' is broken symmetrically:** S'' = T''_0, k''_1, T''_1, ..., k''_{m''}, T''_{m''}
+
+where:
+- T''_0 has the smallest keys in S'' (closest to k) and the **smallest** height.
+- T''_{m''} has the largest keys in S'' and the **largest** height.
+- Heights are **non-decreasing**: height(T''_{i-1}) <= height(T''_i).
+
+This is because the trees with the smallest keys in S'' come from the deepest level (keys just above k), while the trees with the largest keys come from the shallowest level (rightmost subtrees near the root).
+
+### Part (d)
+
+**Split algorithm:**
+
+**Step 1: Walk the path and collect.** Search for k in T, following the path from root to the node containing k. At each node on the path, separate the keys and off-path subtrees into two collections:
+- **L** (for S'): trees T'_0, T'_1, ..., T'_m and keys k'_1, ..., k'_m (those less than k)
+- **R** (for S''): trees T''_0, T''_1, ..., T''_{m''} and keys k''_1, ..., k''_{m''} (those greater than k)
+
+This takes O(h) = O(log n) time.
+
+**Step 2: Assemble T' using joins.** Since heights of T'_0, T'_1, ..., T'_m are non-increasing, we join from right to left (starting with the smallest trees):
+
+```
+T' = T'_m
+for i = m down to 1:
+ T' = JOIN(T'_{i-1}, k'_i, T')
+```
+
+Each JOIN takes O(1 + |height(T'_{i-1}) - height(T')|) time.
+
+**Step 3: Assemble T'' using joins.** Since heights of T''_0, T''_1, ..., T''_{m''} are non-decreasing, we join from left to right (starting with the smallest trees):
+
+```
+T'' = T''_0
+for i = 1 to m'':
+ T'' = JOIN(T'', k''_i, T''_i)
+```
+
+**Running time analysis (telescoping argument):**
+
+For assembling T', let r_i denote the height of the result after the i-th join, with r_0 = height(T'_m).
+
+At step i (joining T'_{m-i} with the current result):
+- height(T'_{m-i}) >= height(T'_{m-i+1}) >= ... (non-increasing sequence)
+- Cost_i = O(1 + |height(T'_{m-i}) - r_{i-1}|)
+- After join: r_i <= max(height(T'_{m-i}), r_{i-1}) + 1
+
+Since height(T'_{m-i}) >= r_{i-1} in general (the new tree being joined is at least as tall as or taller than the current result), we have:
+- Cost_i = O(1 + height(T'_{m-i}) - r_{i-1})
+- r_i <= height(T'_{m-i}) + 1
+
+The total cost telescopes:
+
+Sum of costs = Sum_{i=1}^{m} O(1 + height(T'_{m-i}) - r_{i-1})
+
+Since r_{i-1} >= height(T'_{m-i+1}), the height differences telescope:
+
+Sum of (height(T'_{m-i}) - r_{i-1}) <= Sum of (height(T'_{m-i}) - height(T'_{m-i+1})) + m
+= height(T'_0) - height(T'_m) + m
+
+Since height(T'_0) <= h - 1 = O(log n) and m <= 3h (at most 3 subtrees per node on the path, with path length h), the total cost is:
+
+**O(m + h) = O(h) = O(log n)**
+
+The same telescoping argument applies to assembling T'' (symmetric case).
+
+**Total running time of split:** O(log n) for the search + O(log n) for assembling T' + O(log n) for assembling T'' = **O(log n)**.
Sub-Model Usage
| Model | Input | Output | Cache Read | Cost |
|---|---|---|---|---|
| claude-opus-4-6 | 6 | 38.0K | 217.1K | $1.43 |
BSCS Bench