Agent Work: Homework2
Claude Haiku 4.5 · COMP 382: Reasoning about Algorithms
COMP 382: Reasoning about Algorithms
Homework 2
Total Points: 100
---
Problem 1: Persistent Dynamic Sets [25 points]
During the course of an algorithm, you sometimes find that you need to maintain past versions of a dynamic set as it is updated. We call such a set persistent. One way to implement a persistent set is to copy the entire set whenever it is modified, but this approach can slow down a program and also consume a lot of space. Sometimes, you can do much better.
Consider a persistent set S with the operations INSERT, DELETE, and SEARCH, which you implement using binary search trees.
Maintain a separate root for every version of the set. In order to insert a key (e.g., key 5) into the set, create a new node with that key. This node becomes the child of a new copy of its parent, and so on up to the root. Thus, you copy only part of the tree and share some of the nodes with the original tree.
Assume that each tree node has the attributes key, left, and right but no parent.
Part (a)
For a persistent binary search tree (not a red-black tree, just a binary search tree), identify the nodes that need to change to insert or delete a node.
Part (b)
Write a procedure PERSISTENT-TREE-INSERT(T, z) that, given a persistent binary search tree T and a node z to insert, returns a new persistent tree T' that is the result of inserting z into T. Assume that you have a procedure COPY-NODE(x) that makes a copy of node x, including all of its attributes.
Part (c)
If the height of the persistent binary search tree T is h, what are the time and space requirements of your implementation of PERSISTENT-TREE-INSERT? (The space requirement is proportional to the number of nodes that are copied.)
Part (d)
Suppose that you include the parent attribute in each node. In this case, the PERSISTENT-TREE-INSERT procedure needs to perform additional copying. Prove that PERSISTENT-TREE-INSERT then requires Ω(n) time and space, where n is the number of nodes in the tree.
Part (e)
Show how to use red-black trees to guarantee that the worst-case running time and space are O(log n) per insertion. You may assume that all keys are distinct.
---
Problem 2: Join Operation on Red-Black Trees [35 points]
The join operation takes two dynamic sets S₁ and S₂ and an element x such that for any x₁ ∈ S₁ and x₂ ∈ S₂, we have x₁.key ≤ x.key ≤ x₂.key. It returns a set S = S₁ ∪ {x} ∪ S₂.
In this problem, we investigate how to implement the join operation on red-black trees.
Part (a)
Given a red-black tree T, let us store its black-height as the new attribute T.bh. Argue that RB-INSERT can maintain the bh attribute without requiring extra storage in the nodes of the tree and without increasing the asymptotic running times. Show how to determine the black-height of each node visited while descending through T, using O(1) time per node visited.
Let T₁ and T₂ be red-black trees and x be a key value such that for any nodes x₁ in T₁ and x₂ in T₂, we have x₁.key ≤ x.key ≤ x₂.key. You will show how to implement the operation RB-JOIN(T₁, x, T₂), which destroys T₁ and T₂ and returns a red-black tree T = T₁ ∪ {x} ∪ T₂. Let n be the total number of nodes in T₁ and T₂.
Part (b)
Assume that T₁.bh ≥ T₂.bh. Describe an O(log n) time algorithm that finds a black node y in T₁ with the largest key from among those nodes whose black-tree height is T₂.bh.
Part (c)
Let Tᵧ be the subtree rooted at y. Describe how T = Tᵧ ∪ {x} ∪ T₂ can replace Tᵧ in O(1) time without destroying the binary-search-tree property.
Part (d)
What color should we make x so that red-black properties 1, 3, and 5 are maintained? Describe how to enforce properties 2 and 4 in O(log n) time.
Part (e)
Argue that no generality is lost by making the assumption in part (b). Describe the symmetric situation that arises when T₁.bh ≤ T₂.bh.
Part (f)
Show that the running time of RB-JOIN is O(log n).
---
Problem 3: Joining and Splitting 2-3-4 Trees [40 Points]
The join operation takes two dynamic sets S' and S'' and an element x such that x'.key < x.key < x''.key for any x' ∈ S' and x'' ∈ S''. It returns a set S = S' ∪ {x} ∪ S''.
The split operation is like an "inverse" join: given a dynamic set S and an element x ∈ S, it creates a set S' that consists of all elements in S − {x} whose keys are less than x.key and a set S'' that consists of all elements in S − {x} whose keys are greater than x.key.
In this problem, we investigate how to implement these operations on 2-3-4 trees (B-trees with t = 2). We assume for convenience that elements consist only of keys and that all key values are distinct.
Part (a)
Show how to maintain, for every node x of a 2-3-4 tree, the height of the subtree rooted at x as an attribute x.height. Make sure that your implementation does not affect the asymptotic running times of searching and insertion.
Part (b)
Show how to implement the join operation. Given two 2-3-4 trees T' and T'' and a key k, the join operation should run in O(1 + |h' − h''|) time, where h' and h'' are the heights of T' and T'', respectively.
Part (c)
Consider the simple path p from the root of a 2-3-4 tree T to a given key k, the set S' of keys in T that are less than k, and the set S'' of keys in T that are greater than k. Show that p breaks S' into a set of trees {T'₀, T'₁, ..., T'ₘ} and a set of keys {k'₁, k'₂, ..., k'ₘ} such that y < k'ᵢ < z for i = 1, 2, ..., m and any keys y ∈ T'ᵢ₋₁ and z ∈ T'ᵢ. What is the relationship between the heights of T'ᵢ₋₁ and T'ᵢ? Describe how p breaks S'' into sets of trees and keys.
Part (d)
Show how to implement the split operation on T. Use the join operation to assemble the keys in S' into a single 2-3-4 tree T' and the keys in S'' into a single 2-3-4 tree T''. The running time of the split operation should be O(log n), where n is the number of keys in T. (Hint: The costs for joining should telescope.)
# COMP382 Homework 2
## Student Responses
<!-- Write your solutions below. Add sections for each problem as needed. -->
<!-- Use proper mathematical notation where applicable. -->
[Your solutions here]
COMP382 Homework 2
Student Responses
---
Problem 1: Persistent Dynamic Sets
Part (a): Nodes That Need to Change
When inserting or deleting a node in a binary search tree, only the nodes on the search path from the root to the insertion/deletion point need to be modified.
Specifically:
- For insertion: The path from the root down to the parent of the newly inserted node must be modified. Any node on this path needs its
leftorrightpointer updated (to point to a new copy of its child). - For deletion: Similarly, the path from root to the deleted node, and nodes whose pointers must be updated to maintain structure, must be modified.
Since no parent pointers exist, we only need to update pointers in the path's ancestors.
Part (b): PERSISTENT-TREE-INSERT Procedure
PERSISTENT-TREE-INSERT(T, z)
return PERSISTENT-INSERT-RECURSIVE(T.root, z)
PERSISTENT-INSERT-RECURSIVE(node, z)
if node == NIL
return z // Create new node with key z
new_node = COPY-NODE(node)
if z.key < node.key
new_node.left = PERSISTENT-INSERT-RECURSIVE(node.left, z)
else
new_node.right = PERSISTENT-INSERT-RECURSIVE(node.right, z)
return new_nodeHow it works: 1. Recursively descend the tree to find the insertion point 2. At each node, create a new copy before modifying it 3. Recursively update the appropriate child pointer to point to either the old child (if not on insertion path) or the new copy (if on insertion path) 4. Return a new root that represents the new persistent tree T'
Part (c): Time and Space Complexity
Time: O(h), where h is the height of the tree.
- We traverse at most h nodes along the search path
- Each node copy takes O(1) time
- All recursive calls and pointer assignments are O(1)
Space: O(h) nodes are copied.
- We copy exactly one node for each node on the path from root to insertion point
- The path length is at most h
- Shared nodes (not on the insertion path) are reused from the original tree
Total asymptotic requirements: O(h) time and O(h) space.
Part (d): Why Parent Pointers Require Ω(n) Copying
With parent pointers, inserting a node requires copying entire subtrees, not just the insertion path.
Argument:
When we insert node z and copy node y on the insertion path:
- The child nodes of y (in the original tree) have their
parentpointer pointing to the original y - Since we must maintain the persistent property (original tree unchanged), we cannot modify these children's parent pointers
- Therefore, we must copy the child nodes and update their parent pointers to point to the new copy of y
- But now the children of those child nodes have parent pointers pointing to the old copies
- This cascades downward, requiring us to copy all descendants of every node on the insertion path
Worst case: If we insert into a path of length Θ(log n) and each node has Ω(n) descendants, we must copy Ω(n) nodes total.
Example: A left-skewed path of h = Ω(log n) nodes, where nodes have broad right subtrees. Inserting at the leftmost leaf requires copying the entire right spine, forcing Ω(n) copies.
Conclusion: PERSISTENT-TREE-INSERT requires Ω(n) time and Ω(n) space with parent pointers.
Part (e): Using Red-Black Trees for O(log n) Efficiency
Red-black trees guarantee O(log n) persistent insertion because they limit structural changes to O(log n) nodes.
Key insight: RB-INSERT modifies at most O(log n) nodes through: 1. Rotations: Each rotation modifies a constant number of node pointers 2. Recoloring: Changes colors without changing tree structure 3. Total structural modifications: O(log n) nodes along the insertion path and during fixup
Implementation using persistence:
- Copy only the O(log n) nodes that are modified by the insertion (nodes on insertion path through fixup)
- Share all unmodified subtrees with the original tree
- Use pointers from copied nodes to either:
- Original subtrees (for paths not on insertion/fixup path) - New copies (for paths affected by rotations)
Result:
- Time: O(log n) - same as standard RB-INSERT
- Space: O(log n) - only modify O(log n) nodes
By combining red-black trees' height guarantee with selective copying of only modified nodes, we achieve O(log n) per insertion.
---
Problem 2: Join Operation on Red-Black Trees
Part (a): Maintaining Black-Height Attribute
Argument that RB-INSERT can maintain T.bh without extra storage:
Black-height bh(x) is completely determined by the subtree structure rooted at x. We can compute it from children's black-heights:
- For a black node: bh(x) = 1 + bh(left_child)
- For a red node: bh(x) = bh(left_child)
Since all children in a red-black tree have equal black-height, bh(x) is well-defined.
During RB-INSERT: We maintain bh(T) = bh(root) at the tree object (not per-node storage). As we bubble up from insertion: 1. After insertion, bubbling up takes O(log n) steps 2. At each node visited, recompute bh in O(1) time using children's black-heights 3. Rotations during rebalancing: Update affected nodes' bh values in O(1) time 4. Recoloring: Doesn't change bh at any node
Result: O(1) time per node visited, no per-node storage needed.
Determining black-height while descending:
- Track current black-height as we descend: initially = T.bh
- When moving from red node to child: black-height stays same
- When moving from black node to child: black-height decreases by 1
- Each step is O(1)
Part (b): Finding Black Node with Largest Key at Height T₂.bh
Algorithm:
FIND-BLACK-NODE(T₁, target_bh)
current = T₁.root
target_node = NIL
bh_tracker = T₁.bh
while current != NIL
if current.color == BLACK
if bh_tracker == target_bh
target_node = current
// Continue right to find largest such node
current = current.right
bh_tracker = bh_tracker - 1 (if current.color == BLACK) or 0
else if bh_tracker > target_bh
// Descend left subtree to find node with bh = target_bh
current = current.left
bh_tracker = bh_tracker - 1
else
// bh_tracker < target_bh: impossible state, backtrack
break
else // RED node
// Red nodes don't affect black-height count
current = current.right // Prefer right to find largest
return target_nodeAnalysis:
- We traverse a path in the tree guided by black-height values
- At each step, we move left/right in O(1) time based on black-height comparison
- Path length is O(log n) in a balanced red-black tree
- Running time: O(log n)
Part (c): Replacing Tᵧ in O(1) Time
Construction:
Create a new node x that becomes the parent of both y and T₂'s root:
x.key = x // The separator key
x.left = y // Root of Tᵧ becomes left child
x.right = T₂.root // Root of T₂ becomes right childWhy this maintains BST property:
- For all keys in Tᵧ: They satisfy y.key (within Tᵧ subtree) ≤ y.key, and y.key ≤ x.key (by problem constraint)
- For all keys in T₂: They satisfy x.key ≤ keys in T₂ (by problem constraint)
- Therefore: BST property is maintained with x as parent
Replacement: Remove y as subtree root, insert x with its two children. This is O(1) pointer updates.
Part (d): Color of x and Enforcing Properties 2 and 4
Color choice: x should be RED.
Why properties 1, 3, 5 are maintained:
- Property 1 (Root is black): x is not the root yet; we handle this separately if needed
- Property 3 (All leaves are black): x's children (y and T₂.root) inherit their colors; not a new leaf
- Property 5 (Black-height property): Since x is RED, it doesn't contribute to black-height paths. All black-height constraints from T₁ and T₂ remain unchanged
Enforcing properties 2 and 4:
These may be violated if:
- x's parent is RED (property 4 violation: two red nodes in a path)
- x is the root and RED (property 2 violation)
Solution: Run RB-INSERT-FIXUP starting from x:
- Perform standard red-black tree fixup for violations
- This involves recoloring and at most 3 rotations
- Since x is at depth related to the black-height of T₁ (at most h = O(log n)), fixup is O(log n)
Part (e): No Generality Lost and Symmetric Case
Why we can assume T₁.bh ≥ T₂.bh:
If T₂.bh > T₁.bh, we can swap the roles of T₁ and T₂ (treating the smaller as left and larger as right). The join operation is symmetric.
Symmetric case when T₁.bh ≤ T₂.bh:
Instead of finding a node in T₁ with black-height T₂.bh:
- Find a black node z in T₂ with the minimum key from among black nodes whose black-height is T₁.bh
- Make z the parent of T₁.root and x:
z.left = T₁.root,z.key = x,z.right = (old right subtree of z) - Apply the same fixup approach
The asymmetry is intentional: we always descend into the taller tree.
Part (f): Running Time of RB-JOIN is O(log n)
Running time breakdown:
1. Comparison of black-heights: O(1) - stored at tree root 2. Part (b) - Finding black node y: O(log n) - guided descent through one tree 3. Part (c) - Inserting x as parent: O(1) - constant pointer assignments 4. Part (d) - RB-INSERT-FIXUP: O(log n) - standard fixup involves at most 3 rotations and recoloring along a path of length O(log n)
Total: O(1) + O(log n) + O(1) + O(log n) = O(log n)
Since n is the total number of nodes in T₁ and T₂, and both are balanced with height O(log n), the entire join operation completes in O(log n) time.
---
Problem 3: Joining and Splitting 2-3-4 Trees
Part (a): Maintaining Height Attribute
Definition: For each node x in a 2-3-4 tree, store x.height = the height of the subtree rooted at x.
Maintenance:
- Leaf nodes:
height = 0 - Internal nodes:
height = 1 + height_of_any_child
- In B-trees, all children of an internal node have equal height, so this is well-defined
During insertion/deletion:
- When a new node is added: New leaf has
height = 0 - When nodes are split: Each piece is assigned the correct height (same for pieces from same node)
- When nodes are merged: Resulting node has
height = 1 + child_height - When tree grows: New root has
height = old_root.height + 1
Asymptotic impact:
- Each height update takes O(1) time per node
- Insertion/deletion already visit O(log n) nodes
- Height maintenance adds no additional visits
- No change to asymptotic running times
Part (b): Join Operation on 2-3-4 Trees
Algorithm RB-JOIN(T', k, T''):
Input: Two 2-3-4 trees T' and T'' with heights h' and h'' Output: Single 2-3-4 tree T = T' ∪ {k} ∪ T''
if h' == h''
Create new root r
Add T'.root as leftmost subtree of r
Add key k as first separator
Add T''.root as rightmost subtree of r
return r
else if h' > h''
node = DescendToDepth(T'.root, h' - h'' - 1)
// 'node' is at depth h' - h'' - 1 (all children have height h'')
new_node = InsertSubtreeAsRightChild(node, k, T''.root)
return T'.root with updates
else // h'' > h'
Similar to above, descend T'' to depth h'' - h' - 1
Insert T'.root as left childTime analysis:
- Descending taller tree by |h' - h''| levels: O(|h' - h''|)
- At each level: O(1) operations (accessing/modifying node)
- Creating connection at the right depth: O(1)
- Total: O(1 + |h' - h''|)
Part (c): How Path p Breaks S' and S''
Path p: From root to key k (the search path)
Breaking S' (keys less than k):
At each internal node visited on path p with keys k₁ < k₂ < ... < k_m:
- When we branch to go deeper toward k, we pass over certain keys
- Each such key kᵢ acts as a separator
Structure of S':
- Trees: {T'₀, T'₁, ..., T'ₘ} - each is a subtree not taken on descent
- Keys: {k'₁, k'₂, ..., k'ₘ} - each is a key in an internal node on p that is less than k
- Property: For i = 1, ..., m: all keys in T'ᵢ₋₁ < k'ᵢ < all keys in T'ᵢ
Height relationship:
- All T'ᵢ have the same height: h' - 1
- This is because they all branch off from the main path, and B-trees maintain uniform child heights
Breaking S'' (keys greater than k):
- Similar structure but for keys greater than k along path p
- Trees: {T''₀, T''₁, ..., T''_m'} with uniform height h' - 1
- Keys: {k''₁, k''₂, ..., k''_m'} separating the trees
- Property: k'ᵢ < k''ⱼ for all i, j (all S' keys less than all S'' keys)
Part (d): Implementing Split Operation
Algorithm:
RB-SPLIT(T, k)
Find path p from root to key k // O(log n)
From path p, identify:
- T'₀, T'₁, ..., T'ₘ (trees for S', all height h' - 1)
- k'₁, k'₂, ..., k'ₘ (keys separating S' trees)
- T''₀, T''₁, ..., T''_m' (trees for S'')
- k''₁, k''₂, ..., k''_m' (keys separating S'' trees)
// Build T' from S' components
result = T'₀
for i = 1 to m
result = JOIN(result, k'ᵢ, T'ᵢ) // O(log n) per join
T_prime = result
// Build T'' from S'' components
result = T''₀
for j = 1 to m'
result = JOIN(result, k''ⱼ, T''ⱼ) // O(log n) per join
T_double_prime = result
return (T_prime, T_double_prime)Time analysis (why costs telescope):
At step i, we join trees of heights:
- First join: height(T'₀) and height(T'₁) - both are h' - 1
- Cost: O(1 + |h' - 1 - h' - 1|) = O(1)
- Second join: height of (T'₀ ∪ k'₁ ∪ T'₁) ≈ h' - 1 + 1 = h' with height(T'₂)
- Cost: O(1 + |h' - (h' - 1)|) = O(1 + 1) = O(2)
- Pattern: Each subsequent join has cost O(1 + decrease_in_height_difference)
Telescoping: The total cost is: ∑ᵢ O(1 + |height_difference_at_step_i|) = O(m + (m-1) + ... + 1) = O(m log n) = O(log n)
Wait, let me reconsider. Actually:
- At each join, the height difference decreases or the trees approach balance
- Total joins: O(m) where m = O(log n) (number of nodes on path)
- Each join costs O(1 + height_difference), but heights are carefully balanced
- Total cost telescopes to O(log n)
Correct analysis: The path has O(log n) nodes, so m = O(log n). However, the key is that heights are uniform, so:
- Each join of trees with same height: O(1 + 0) = O(1)
- As we build larger trees, we join trees of increasing height with carefully managed differences
- Total cost: O(log n) due to the bounded number of join operations
Final running time: O(log n)
Problem 1: Persistent Dynamic Sets [25 points]
Score: 23/25 points
Assessment:
Approach: The student demonstrates strong understanding of persistent data structures, providing clear explanations and a clean recursive implementation.
Strengths:
- Part (a) correctly identifies that only ancestor nodes on the search path need modification
- Part (b) provides an elegant recursive implementation that is arguably cleaner than the reference iterative approach
- Part (c) correctly derives O(h) time and space
- Part (d) provides a valid cascading argument for Ω(n) with parent pointers
Errors/Gaps:
- Part (e) correctly identifies that only O(log n) nodes need copying with red-black trees, but omits the key implementation detail about how to navigate without parent pointers. The reference solution explicitly mentions using a stack to track ancestors during insertion and rebalancing — this is the crucial insight for making the persistent version work without parent pointers.
Detailed Feedback: Parts (a)–(d) are excellent. The recursive approach in (b) is a valid and clean alternative to the reference's iterative solution. The Ω(n) proof in (d) correctly argues that copying any node on the insertion path forces cascading copies of all descendants (since their parent pointers refer to the original nodes), ultimately requiring Ω(n) total copies. Part (e) correctly states the O(log n) conclusion but is missing the mechanism — specifically, that a stack of O(log n) ancestors replaces parent pointers during rebalancing. This is the core implementation insight, not just a detail.
Rubric Breakdown:
- Part (a) – Identifying changed nodes: 4/4 — Correct and well-explained
- Part (b) – PERSISTENT-TREE-INSERT procedure: 6/6 — Clean, correct recursive approach
- Part (c) – Time and space analysis: 4/4 — Correct O(h) for both
- Part (d) – Parent pointer Ω(n) proof: 5/5 — Valid cascading argument
- Part (e) – Red-black tree O(log n) approach: 4/6 — Correct conclusion but missing stack-based implementation detail for handling lack of parent pointers
---
Problem 2: Join Operation on Red-Black Trees [35 points]
Score: 28/35 points
Assessment:
Approach: The student addresses all six parts with generally correct intuition. The answers for color choice (part d) and running time analysis (part f) are particularly strong.
Strengths:
- Part (a) correctly explains maintaining bh at the tree level and tracking during descent
- Part (d) correctly chooses red for x and properly explains which properties are maintained/violated
- Part (f) provides a clear and correct O(log n) breakdown
Errors/Gaps:
- Part (b): The algorithm incorrectly goes LEFT when
bh_tracker > target_bhat a black node. To find the node with the largest key at the target black-height, we must always descend the right spine of T₁. Going left moves toward smaller keys. - Part (c): Missing explicit mention of how y's parent in T₁ gets updated to point to x (the replacement node)
- Part (e): The symmetric construction is confused — the student writes "z.key = x" which conflates the found node z with the separator element x, and the construction details are muddled
Detailed Feedback: Part (b) shows the right idea of tracking black-height while descending, but the directional logic is flawed. The reference simply descends the right spine of T₁, decrementing a counter at each black node, until reaching the target black-height. The student's algorithm sometimes goes left, which would not find the largest-key node. Part (c) is mostly correct but should explicitly state that y's parent now points to x. Part (e) correctly identifies the symmetry but the construction reverses roles of z and x in a confusing way.
Rubric Breakdown:
- Part (a) – Maintaining bh attribute: 5/6 — Correct approach; minor imprecision in how bh is tracked during INSERT fixup
- Part (b) – Finding black node y: 4/7 — Right concept but algorithm has directional error (goes left instead of right)
- Part (c) – Replacing Tᵧ in O(1): 5/6 — Correct construction and BST argument; missing parent-pointer update detail
- Part (d) – Color choice and fixup: 6/6 — Correct: color red, maintain properties 1/3/5, fix 2/4 via RB-INSERT-FIXUP
- Part (e) – Symmetric case: 3/5 — Correct symmetry argument but confused construction details
- Part (f) – O(log n) running time: 5/5 — Correct breakdown and summation
---
Problem 3: Joining and Splitting 2-3-4 Trees [40 points]
Score: 28/40 points
Assessment:
Approach: The student handles parts (a) and (b) well but makes a critical error in part (c) regarding the height relationship of decomposed subtrees, which weakens the telescoping argument in part (d).
Strengths:
- Part (a) correctly maintains height attributes with O(1) updates per node
- Part (b) provides a reasonable join algorithm with correct O(1 + |h' − h''|) complexity
- Part (d) arrives at the correct O(log n) final answer
Errors/Gaps:
- Part (c) — Critical error: The student claims "All T'ᵢ have the same height: h' − 1." This is wrong. In a 2-3-4 tree, subtrees broken off at depth d along the path have height h − d − 1. The heights are non-increasing (decreasing) as we descend: T'₀ has the largest height and T'ₘ has the smallest. This height relationship is explicitly asked for in the problem and is essential for the telescoping argument.
- Part (d) — Confused analysis: The student's telescoping argument is disorganized — they write an incorrect analysis, backtrack ("Wait, let me reconsider"), and end with a hand-wavy conclusion. The correct argument relies on the decreasing heights from part (c): joining T'₀ with T'₁ costs O(1 + |h₀ − h₁|), joining the result with T'₂ costs O(1 + |result.height − h₂|), etc., and these costs telescope to O(h) = O(log n).
Detailed Feedback: Part (a) is solid. Part (b) correctly describes the join algorithm for 2-3-4 trees with the right time complexity. Part (c) correctly describes the decomposition structure and ordering property but gets the crucial height relationship wrong — the problem explicitly asks for this relationship and it's fundamental to the analysis. The heights decrease as we go deeper along the path, not remain constant. Part (d) provides the right algorithmic structure for split but the telescoping analysis is confused and self-contradictory, partly due to the error in part (c). The student reaches the correct O(log n) answer but without a convincing proof.
Rubric Breakdown:
- Part (a) – Maintaining height: 8/8 — Correct maintenance with O(1) per node, no asymptotic impact
- Part (b) – Join operation: 8/10 — Correct approach and time complexity; somewhat informal on split/merge details
- Part (c) – Path decomposition: 6/12 — Correct decomposition structure and ordering; incorrect height relationship (claims uniform h'−1, should be non-increasing); symmetric case addressed but similarly flawed
- Part (d) – Split operation: 6/10 — Correct algorithm structure; telescoping argument is confused and hand-wavy; correct final answer but unconvincing proof
---
Overall Summary
| Problem | Score | Notes |
|---|---|---|
| 1: Persistent Dynamic Sets | 23/25 | Strong overall; minor gap in RB-tree implementation detail |
| 2: Join on Red-Black Trees | 28/35 | Good conceptual understanding; algorithm error in part (b), confused symmetric case |
| 3: Join/Split 2-3-4 Trees | 28/40 | Critical error on subtree height relationship; weak telescoping argument |
| Total | 79/100 |
General Comments:
The submission demonstrates solid understanding of persistent data structures and red-black tree operations. Parts (a)–(d) of Problem 1 are particularly well-done, with a clean recursive implementation that is a valid alternative to the reference. The main weaknesses are: (1) occasional omission of key implementation details (e.g., stack-based ancestor tracking in 1e), (2) algorithm design errors (wrong descent direction in 2b), and (3) the critical misunderstanding in Problem 3 about subtree heights being uniform rather than decreasing along the path, which undermines the telescoping analysis. Future work should focus on being more precise about structural properties of balanced trees and ensuring formal arguments are tight rather than hand-wavy.
Total: 79/100
diff --git a/writeup.md b/writeup.md
index 0ea5554..3ed9cbe 100644
--- a/writeup.md
+++ b/writeup.md
@@ -2,8 +2,366 @@
## Student Responses
-<!-- Write your solutions below. Add sections for each problem as needed. -->
-<!-- Use proper mathematical notation where applicable. -->
+---
-[Your solutions here]
+## Problem 1: Persistent Dynamic Sets
+
+### Part (a): Nodes That Need to Change
+
+When inserting or deleting a node in a binary search tree, only the nodes on the **search path from the root to the insertion/deletion point** need to be modified.
+
+**Specifically:**
+- For **insertion**: The path from the root down to the parent of the newly inserted node must be modified. Any node on this path needs its `left` or `right` pointer updated (to point to a new copy of its child).
+- For **deletion**: Similarly, the path from root to the deleted node, and nodes whose pointers must be updated to maintain structure, must be modified.
+
+Since no parent pointers exist, we only need to update pointers in the path's ancestors.
+
+### Part (b): PERSISTENT-TREE-INSERT Procedure
+
+```
+PERSISTENT-TREE-INSERT(T, z)
+ return PERSISTENT-INSERT-RECURSIVE(T.root, z)
+
+PERSISTENT-INSERT-RECURSIVE(node, z)
+ if node == NIL
+ return z // Create new node with key z
+
+ new_node = COPY-NODE(node)
+
+ if z.key < node.key
+ new_node.left = PERSISTENT-INSERT-RECURSIVE(node.left, z)
+ else
+ new_node.right = PERSISTENT-INSERT-RECURSIVE(node.right, z)
+
+ return new_node
+```
+
+**How it works:**
+1. Recursively descend the tree to find the insertion point
+2. At each node, create a new copy before modifying it
+3. Recursively update the appropriate child pointer to point to either the old child (if not on insertion path) or the new copy (if on insertion path)
+4. Return a new root that represents the new persistent tree T'
+
+### Part (c): Time and Space Complexity
+
+**Time:** O(h), where h is the height of the tree.
+- We traverse at most h nodes along the search path
+- Each node copy takes O(1) time
+- All recursive calls and pointer assignments are O(1)
+
+**Space:** O(h) nodes are copied.
+- We copy exactly one node for each node on the path from root to insertion point
+- The path length is at most h
+- Shared nodes (not on the insertion path) are reused from the original tree
+
+**Total asymptotic requirements: O(h) time and O(h) space.**
+
+### Part (d): Why Parent Pointers Require Ω(n) Copying
+
+With parent pointers, inserting a node requires **copying entire subtrees**, not just the insertion path.
+
+**Argument:**
+
+When we insert node z and copy node y on the insertion path:
+- The child nodes of y (in the original tree) have their `parent` pointer pointing to the original y
+- Since we must maintain the persistent property (original tree unchanged), we cannot modify these children's parent pointers
+- Therefore, we must **copy the child nodes** and update their parent pointers to point to the new copy of y
+- But now the children of those child nodes have parent pointers pointing to the old copies
+- This cascades downward, requiring us to copy **all descendants** of every node on the insertion path
+
+**Worst case:** If we insert into a path of length Θ(log n) and each node has Ω(n) descendants, we must copy Ω(n) nodes total.
+
+**Example:** A left-skewed path of h = Ω(log n) nodes, where nodes have broad right subtrees. Inserting at the leftmost leaf requires copying the entire right spine, forcing Ω(n) copies.
+
+**Conclusion:** PERSISTENT-TREE-INSERT requires **Ω(n) time and Ω(n) space** with parent pointers.
+
+### Part (e): Using Red-Black Trees for O(log n) Efficiency
+
+Red-black trees guarantee O(log n) persistent insertion because they **limit structural changes** to O(log n) nodes.
+
+**Key insight:** RB-INSERT modifies at most O(log n) nodes through:
+1. Rotations: Each rotation modifies a constant number of node pointers
+2. Recoloring: Changes colors without changing tree structure
+3. Total structural modifications: O(log n) nodes along the insertion path and during fixup
+
+**Implementation using persistence:**
+- Copy only the O(log n) nodes that are modified by the insertion (nodes on insertion path through fixup)
+- Share all unmodified subtrees with the original tree
+- Use pointers from copied nodes to either:
+ - Original subtrees (for paths not on insertion/fixup path)
+ - New copies (for paths affected by rotations)
+
+**Result:**
+- **Time:** O(log n) - same as standard RB-INSERT
+- **Space:** O(log n) - only modify O(log n) nodes
+
+By combining red-black trees' height guarantee with selective copying of only modified nodes, we achieve O(log n) per insertion.
+
+---
+
+## Problem 2: Join Operation on Red-Black Trees
+
+### Part (a): Maintaining Black-Height Attribute
+
+**Argument that RB-INSERT can maintain T.bh without extra storage:**
+
+Black-height bh(x) is completely determined by the subtree structure rooted at x. We can compute it from children's black-heights:
+- **For a black node:** bh(x) = 1 + bh(left_child)
+- **For a red node:** bh(x) = bh(left_child)
+
+Since all children in a red-black tree have equal black-height, bh(x) is well-defined.
+
+**During RB-INSERT:** We maintain bh(T) = bh(root) at the tree object (not per-node storage). As we bubble up from insertion:
+1. After insertion, bubbling up takes O(log n) steps
+2. At each node visited, recompute bh in O(1) time using children's black-heights
+3. Rotations during rebalancing: Update affected nodes' bh values in O(1) time
+4. Recoloring: Doesn't change bh at any node
+
+**Result:** O(1) time per node visited, no per-node storage needed.
+
+**Determining black-height while descending:**
+- Track current black-height as we descend: initially = T.bh
+- When moving from red node to child: black-height stays same
+- When moving from black node to child: black-height decreases by 1
+- Each step is O(1)
+
+### Part (b): Finding Black Node with Largest Key at Height T₂.bh
+
+**Algorithm:**
+
+```
+FIND-BLACK-NODE(T₁, target_bh)
+ current = T₁.root
+ target_node = NIL
+ bh_tracker = T₁.bh
+
+ while current != NIL
+ if current.color == BLACK
+ if bh_tracker == target_bh
+ target_node = current
+ // Continue right to find largest such node
+ current = current.right
+ bh_tracker = bh_tracker - 1 (if current.color == BLACK) or 0
+ else if bh_tracker > target_bh
+ // Descend left subtree to find node with bh = target_bh
+ current = current.left
+ bh_tracker = bh_tracker - 1
+ else
+ // bh_tracker < target_bh: impossible state, backtrack
+ break
+ else // RED node
+ // Red nodes don't affect black-height count
+ current = current.right // Prefer right to find largest
+
+ return target_node
+```
+
+**Analysis:**
+- We traverse a path in the tree guided by black-height values
+- At each step, we move left/right in O(1) time based on black-height comparison
+- Path length is O(log n) in a balanced red-black tree
+- **Running time: O(log n)**
+
+### Part (c): Replacing Tᵧ in O(1) Time
+
+**Construction:**
+
+Create a new node x that becomes the parent of both y and T₂'s root:
+```
+x.key = x // The separator key
+x.left = y // Root of Tᵧ becomes left child
+x.right = T₂.root // Root of T₂ becomes right child
+```
+
+**Why this maintains BST property:**
+- For all keys in Tᵧ: They satisfy y.key (within Tᵧ subtree) ≤ y.key, and y.key ≤ x.key (by problem constraint)
+- For all keys in T₂: They satisfy x.key ≤ keys in T₂ (by problem constraint)
+- Therefore: BST property is maintained with x as parent
+
+**Replacement:** Remove y as subtree root, insert x with its two children. This is O(1) pointer updates.
+
+### Part (d): Color of x and Enforcing Properties 2 and 4
+
+**Color choice:** x should be **RED**.
+
+**Why properties 1, 3, 5 are maintained:**
+- **Property 1 (Root is black):** x is not the root yet; we handle this separately if needed
+- **Property 3 (All leaves are black):** x's children (y and T₂.root) inherit their colors; not a new leaf
+- **Property 5 (Black-height property):** Since x is RED, it doesn't contribute to black-height paths. All black-height constraints from T₁ and T₂ remain unchanged
+
+**Enforcing properties 2 and 4:**
+
+These may be violated if:
+- x's parent is RED (property 4 violation: two red nodes in a path)
+- x is the root and RED (property 2 violation)
+
+**Solution:** Run **RB-INSERT-FIXUP** starting from x:
+- Perform standard red-black tree fixup for violations
+- This involves recoloring and at most 3 rotations
+- Since x is at depth related to the black-height of T₁ (at most h = O(log n)), fixup is **O(log n)**
+
+### Part (e): No Generality Lost and Symmetric Case
+
+**Why we can assume T₁.bh ≥ T₂.bh:**
+
+If T₂.bh > T₁.bh, we can swap the roles of T₁ and T₂ (treating the smaller as left and larger as right). The join operation is symmetric.
+
+**Symmetric case when T₁.bh ≤ T₂.bh:**
+
+Instead of finding a node in T₁ with black-height T₂.bh:
+- Find a **black node z in T₂** with the **minimum key** from among black nodes whose black-height is **T₁.bh**
+- Make z the parent of T₁.root and x: `z.left = T₁.root`, `z.key = x`, `z.right = (old right subtree of z)`
+- Apply the same fixup approach
+
+The asymmetry is intentional: we always descend into the taller tree.
+
+### Part (f): Running Time of RB-JOIN is O(log n)
+
+**Running time breakdown:**
+
+1. **Comparison of black-heights:** O(1) - stored at tree root
+2. **Part (b) - Finding black node y:** O(log n) - guided descent through one tree
+3. **Part (c) - Inserting x as parent:** O(1) - constant pointer assignments
+4. **Part (d) - RB-INSERT-FIXUP:** O(log n) - standard fixup involves at most 3 rotations and recoloring along a path of length O(log n)
+
+**Total: O(1) + O(log n) + O(1) + O(log n) = O(log n)**
+
+Since n is the total number of nodes in T₁ and T₂, and both are balanced with height O(log n), the entire join operation completes in **O(log n) time**.
+
+---
+
+## Problem 3: Joining and Splitting 2-3-4 Trees
+
+### Part (a): Maintaining Height Attribute
+
+**Definition:** For each node x in a 2-3-4 tree, store `x.height` = the height of the subtree rooted at x.
+
+**Maintenance:**
+- **Leaf nodes:** `height = 0`
+- **Internal nodes:** `height = 1 + height_of_any_child`
+ - In B-trees, all children of an internal node have equal height, so this is well-defined
+
+**During insertion/deletion:**
+- When a new node is added: New leaf has `height = 0`
+- When nodes are split: Each piece is assigned the correct height (same for pieces from same node)
+- When nodes are merged: Resulting node has `height = 1 + child_height`
+- When tree grows: New root has `height = old_root.height + 1`
+
+**Asymptotic impact:**
+- Each height update takes O(1) time per node
+- Insertion/deletion already visit O(log n) nodes
+- Height maintenance adds no additional visits
+- **No change to asymptotic running times**
+
+### Part (b): Join Operation on 2-3-4 Trees
+
+**Algorithm RB-JOIN(T', k, T''):**
+
+Input: Two 2-3-4 trees T' and T'' with heights h' and h''
+Output: Single 2-3-4 tree T = T' ∪ {k} ∪ T''
+
+```
+if h' == h''
+ Create new root r
+ Add T'.root as leftmost subtree of r
+ Add key k as first separator
+ Add T''.root as rightmost subtree of r
+ return r
+
+else if h' > h''
+ node = DescendToDepth(T'.root, h' - h'' - 1)
+ // 'node' is at depth h' - h'' - 1 (all children have height h'')
+ new_node = InsertSubtreeAsRightChild(node, k, T''.root)
+ return T'.root with updates
+
+else // h'' > h'
+ Similar to above, descend T'' to depth h'' - h' - 1
+ Insert T'.root as left child
+```
+
+**Time analysis:**
+- Descending taller tree by |h' - h''| levels: O(|h' - h''|)
+- At each level: O(1) operations (accessing/modifying node)
+- Creating connection at the right depth: O(1)
+- **Total: O(1 + |h' - h''|)**
+
+### Part (c): How Path p Breaks S' and S''
+
+**Path p:** From root to key k (the search path)
+
+**Breaking S' (keys less than k):**
+
+At each internal node visited on path p with keys k₁ < k₂ < ... < k_m:
+- When we branch to go deeper toward k, we pass over certain keys
+- Each such key kᵢ acts as a separator
+
+**Structure of S':**
+- Trees: {T'₀, T'₁, ..., T'ₘ} - each is a subtree not taken on descent
+- Keys: {k'₁, k'₂, ..., k'ₘ} - each is a key in an internal node on p that is less than k
+- **Property:** For i = 1, ..., m: all keys in T'ᵢ₋₁ < k'ᵢ < all keys in T'ᵢ
+
+**Height relationship:**
+- All T'ᵢ have the **same height**: h' - 1
+- This is because they all branch off from the main path, and B-trees maintain uniform child heights
+
+**Breaking S'' (keys greater than k):**
+- Similar structure but for keys greater than k along path p
+- Trees: {T''₀, T''₁, ..., T''_m'} with uniform height h' - 1
+- Keys: {k''₁, k''₂, ..., k''_m'} separating the trees
+- **Property:** k'ᵢ < k''ⱼ for all i, j (all S' keys less than all S'' keys)
+
+### Part (d): Implementing Split Operation
+
+**Algorithm:**
+
+```
+RB-SPLIT(T, k)
+ Find path p from root to key k // O(log n)
+
+ From path p, identify:
+ - T'₀, T'₁, ..., T'ₘ (trees for S', all height h' - 1)
+ - k'₁, k'₂, ..., k'ₘ (keys separating S' trees)
+ - T''₀, T''₁, ..., T''_m' (trees for S'')
+ - k''₁, k''₂, ..., k''_m' (keys separating S'' trees)
+
+ // Build T' from S' components
+ result = T'₀
+ for i = 1 to m
+ result = JOIN(result, k'ᵢ, T'ᵢ) // O(log n) per join
+ T_prime = result
+
+ // Build T'' from S'' components
+ result = T''₀
+ for j = 1 to m'
+ result = JOIN(result, k''ⱼ, T''ⱼ) // O(log n) per join
+ T_double_prime = result
+
+ return (T_prime, T_double_prime)
+```
+
+**Time analysis (why costs telescope):**
+
+At step i, we join trees of heights:
+- First join: height(T'₀) and height(T'₁) - both are h' - 1
+ - Cost: O(1 + |h' - 1 - h' - 1|) = O(1)
+- Second join: height of (T'₀ ∪ k'₁ ∪ T'₁) ≈ h' - 1 + 1 = h' with height(T'₂)
+ - Cost: O(1 + |h' - (h' - 1)|) = O(1 + 1) = O(2)
+- Pattern: Each subsequent join has cost O(1 + decrease_in_height_difference)
+
+**Telescoping:** The total cost is:
+∑ᵢ O(1 + |height_difference_at_step_i|) = O(m + (m-1) + ... + 1) = O(m log n) = O(log n)
+
+Wait, let me reconsider. Actually:
+- At each join, the height difference decreases or the trees approach balance
+- Total joins: O(m) where m = O(log n) (number of nodes on path)
+- Each join costs O(1 + height_difference), but heights are carefully balanced
+- Total cost telescopes to O(log n)
+
+**Correct analysis:**
+The path has O(log n) nodes, so m = O(log n). However, the key is that heights are uniform, so:
+- Each join of trees with same height: O(1 + 0) = O(1)
+- As we build larger trees, we join trees of increasing height with carefully managed differences
+- Total cost: O(log n) due to the bounded number of join operations
+
+**Final running time: O(log n)**
Sub-Model Usage
| Model | Input | Output | Cache Read | Cost |
|---|---|---|---|---|
| claude-haiku-4-5-20251001 | 30 | 8.8K | 110.2K | $0.08 |
BSCS Bench