Agent Work: Homework4
Claude Opus 4.6 · COMP 382: Reasoning about Algorithms
COMP 382: Reasoning about Algorithms
Homework 4
Total Points: 100
---
Problem 1: Hashing with Chaining [20 Points]
Suppose that we have stored n keys in a hash table of size m, with collisions resolved by chaining. We know the length of each chain, including the length L of the longest chain.
Describe a procedure that selects a key uniformly at random from among the keys in the hash table and returns it in expected time O(L · (1 + 1/α)), where α = n/m is the load factor.
---
Problem 2: Longest-Probe Bound for Hashing [40 Points]
Suppose you are using an open-addressed hash table of size m to store n ≤ m/2 items.
Part (a)
Assuming independent uniform permutation hashing, show that for i = 1, 2, ..., n, the probability is at most 2⁻ᵏ that the i-th insertion requires strictly more than k probes.
Part (b)
Show that for i = 1, 2, ..., n, the probability is O(1/n²) that the i-th insertion requires more than 2 log₂ n probes.
Let the random variable Xᵢ denote the number of probes required by the i-th insertion. You have shown in part (b) that Pr[Xᵢ > 2 log₂ n] = O(1/n²). Let the random variable X = max₁≤ᵢ≤ₙ Xᵢ denote the maximum number of probes required by any of the n insertions.
Part (c)
Show that Pr[X > 2 log₂ n] = O(1/n).
Part (d)
Show that the expected length E[X] of the longest probe sequence is O(log₂ n).
---
Problem 3: Randomized Coloring [15 Points]
We are given an undirected graph G = (V, E) and we want to color each node with one of three colors, even if we aren't necessarily able to give different colors to every pair of adjacent nodes. Instead, we say that an edge (u, v) is *satisfied* if the colors assigned to u and v are different, and seek to maximize the number of satisfied edges.
Consider a 3-coloring that maximizes the number of satisfied edges, and let c* denote this number.
Give a polynomial-time randomized algorithm for our problem where the expected number of satisfied edges is at least (2/3)c*.
---
Problem 4: Analysis of Selection Using Recurrence [25 Points]
Consider the randomized variant of selection shown in Algorithm 1, which selects the element with rank k in an unsorted array A[1..n]. Assume that the array elements have distinct values.
Algorithm 1: SELECT(A[1..n], k)
Input: array of integers A[1..n] and integer k ∈ {1, ..., n}
Output: Element of rank k
1 i ← RANDOM(1..n) // select i from {1, ..., n} uniformly at random
2 r ← PARTITION(A[1..n], i) // re-arrange elements using A[i] as pivot
3 if k < r then
4 return SELECT(A[1..(r − 1)], k)
5 else if k > r then
6 return SELECT(A[(r + 1)..n], k − r)
7 else
8 return A[k]The function PARTITION(A[1..n], i) partitions an array A into three parts by comparing the pivot element A[i] to every other element, using (n − 1) comparisons, and returns the new index of the pivot element. The elements that are smaller (resp., larger) than the pivot are placed to the left (resp., right) of the pivot.
The function RANDOM(1..n) selects a number from the set {1, ..., n} uniformly at random.
Let T(n) be the expected number of comparisons that SELECT performs. Find the smallest integer constant c you can for which T(n) ≤ cn for every n.
Hint: You can follow the recurrence approach that was discussed in class.
# COMP382 Homework 4
## Student Responses
<!-- Write your solutions below. Add sections for each problem as needed. -->
<!-- Use proper mathematical notation where applicable. -->
[Your solutions here]
COMP382 Homework 4
Problem 1: Hashing with Chaining [20 Points]
Setup: We have n keys stored in a hash table of size m with chaining. We know the length l_j of each chain and the length L of the longest chain. We want to select a key uniformly at random in expected time O(L(1 + 1/α)), where α = n/m.
Algorithm (Rejection Sampling):
UNIFORM-RANDOM-KEY(H, m, L):
repeat
1. Pick a slot j uniformly at random from {0, 1, ..., m-1}.
2. Pick an integer k uniformly at random from {1, 2, ..., L}.
3. If k ≤ l_j (the length of chain at slot j):
Walk to the k-th element in chain j and return it.
// Otherwise (k > l_j), reject and repeat.Correctness (Uniform Selection):
Consider any key x stored at position p in the chain at slot s (so 1 ≤ p ≤ l_s). In a single trial, x is selected if and only if j = s and k = p. The probability of this is:
P(select x in one trial) = P(j = s) · P(k = p) = (1/m)(1/L)
This probability is the same for every key in the table, regardless of which slot it is in or its position in the chain. Therefore, conditioned on acceptance, each of the n keys is equally likely to be returned. The algorithm selects a key uniformly at random.
Expected Running Time:
The probability of accepting (returning a key) in a single trial is:
P(accept) = Σ_{j=0}^{m-1} P(j selected) · P(k ≤ l_j) = Σ_{j=0}^{m-1} (1/m)(l_j/L) = (1/(mL)) Σ_{j=0}^{m-1} l_j = n/(mL) = α/L
So the expected number of trials until acceptance is L/α.
Now we analyze the cost per trial:
- Rejection cost: When k > l_j, the trial costs O(1) time (picking two random numbers and comparing k with l_j, which we know).
- Acceptance cost: When k ≤ l_j, we walk to position k in the chain, which takes O(k) ≤ O(L) time.
We accept exactly once (the last trial). All other trials are rejections. Therefore:
E[total time] = (expected number of rejections) · O(1) + O(L) = (L/α - 1) · O(1) + O(L) = O(L/α) + O(L) = O(L(1/α + 1)) = O(L(1 + 1/α))
This matches the required bound. ∎
---
Problem 2: Longest-Probe Bound for Hashing [40 Points]
Setup: Open-addressed hash table of size m with n ≤ m/2 items, using independent uniform permutation hashing.
Part (a)
Claim: For i = 1, 2, ..., n, the probability that the i-th insertion requires strictly more than k probes is at most 2^(-k).
Proof:
The i-th insertion requires more than k probes if and only if the first k probed positions are all occupied. When inserting the i-th element, there are i - 1 elements already in the table. Under uniform permutation hashing, the probe sequence is a random permutation of all m slots.
P(X_i > k) = P(first k probes all hit occupied slots) = ((i-1)/m) · ((i-2)/(m-1)) · ((i-3)/(m-2)) · ... · ((i-k)/(m-k+1)) = ∏_{j=0}^{k-1} (i - 1 - j)/(m - j)
If i - 1 < k, then one of the numerators is zero or negative, so P(X_i > k) = 0 ≤ 2^(-k). ✓
If i - 1 ≥ k, we bound each factor. Since i ≤ n ≤ m/2, for each j = 0, 1, ..., k-1:
(i - 1 - j)/(m - j) ≤ (m/2 - 1 - j)/(m - j) = (m - 2 - 2j)/(2(m - j))
Since m - 2 - 2j = (m - j) - (j + 2) < m - j, we have:
(m - 2 - 2j)/(2(m - j)) < (m - j)/(2(m - j)) = 1/2
Therefore each factor is strictly less than 1/2, and:
P(X_i > k) = ∏_{j=0}^{k-1} (i - 1 - j)/(m - j) < (1/2)^k = 2^(-k) ∎
Part (b)
Claim: P(X_i > 2 log₂ n) = O(1/n²).
Proof:
Setting k = 2 log₂ n in the result from Part (a):
P(X_i > 2 log₂ n) ≤ 2^(-2 log₂ n) = (2^(log₂ n))^(-2) = n^(-2) = 1/n²
Therefore P(X_i > 2 log₂ n) ≤ 1/n² = O(1/n²). ∎
Part (c)
Claim: P(X > 2 log₂ n) = O(1/n), where X = max_{1 ≤ i ≤ n} X_i.
Proof:
By the union bound:
P(X > 2 log₂ n) = P(∃ i ∈ {1,...,n} : X_i > 2 log₂ n) ≤ Σ_{i=1}^{n} P(X_i > 2 log₂ n) ≤ n · (1/n²) [by Part (b)] = 1/n = O(1/n) ∎
Part (d)
Claim: E[X] = O(log₂ n).
Proof:
We use the tail-sum formula for expectation of a non-negative integer-valued random variable:
E[X] = Σ_{k=0}^{∞} P(X > k)
We split this sum at t = ⌊2 log₂ n⌋:
E[X] = Σ_{k=0}^{⌊2 log₂ n⌋} P(X > k) + Σ_{k=⌊2 log₂ n⌋+1}^{∞} P(X > k)
First sum: Each probability is at most 1, so:
Σ_{k=0}^{⌊2 log₂ n⌋} P(X > k) ≤ ⌊2 log₂ n⌋ + 1 ≤ 2 log₂ n + 1
Second sum: For k > 2 log₂ n, we apply the union bound with Part (a):
P(X > k) ≤ Σ_{i=1}^{n} P(X_i > k) ≤ n · 2^(-k)
Therefore:
Σ_{k=⌊2 log₂ n⌋+1}^{∞} P(X > k) ≤ Σ_{k=⌊2 log₂ n⌋+1}^{∞} n · 2^(-k) = n · 2^{-(⌊2 log₂ n⌋+1)} · Σ_{j=0}^{∞} 2^(-j) = n · 2^{-(⌊2 log₂ n⌋+1)} · 2
Since ⌊2 log₂ n⌋ ≥ 2 log₂ n - 1:
≤ 2n · 2^{-(2 log₂ n - 1 + 1)} = 2n · 2^{-2 log₂ n} = 2n · n^{-2} = 2/n
Combining both sums:
E[X] ≤ (2 log₂ n + 1) + 2/n = O(log₂ n) ∎
---
Problem 3: Randomized Coloring [15 Points]
Algorithm:
RANDOM-3-COLOR(G = (V, E)):
For each vertex v ∈ V:
Assign color(v) ← uniform random choice from {1, 2, 3} (independently)
Return the coloring.This runs in O(|V|) time, which is polynomial.
Analysis:
Consider any edge (u, v) ∈ E. Since color(u) and color(v) are chosen independently and uniformly from {1, 2, 3}:
P(edge (u,v) is satisfied) = P(color(u) ≠ color(v)) = 1 - P(color(u) = color(v)) = 1 - 3 · (1/3)² = 1 - 1/3 = 2/3
Let S denote the number of satisfied edges. By linearity of expectation:
E[S] = Σ_{(u,v) ∈ E} P(color(u) ≠ color(v)) = (2/3)|E|
Since c* is the maximum number of satisfied edges over all 3-colorings, and no coloring can satisfy more edges than exist in the graph:
c* ≤ |E|
Therefore:
E[S] = (2/3)|E| ≥ (2/3)c*
The randomized algorithm produces a 3-coloring whose expected number of satisfied edges is at least (2/3)c*. ∎
---
Problem 4: Analysis of Selection Using Recurrence [25 Points]
Goal: Find the smallest integer constant c such that T(n) ≤ cn for all n, where T(n) is the expected number of comparisons performed by SELECT.
Setting up the recurrence:
When SELECT is called on an array of size n, the pivot is chosen uniformly at random. PARTITION uses n - 1 comparisons. After partitioning, the pivot ends up at some rank r (equally likely to be any value in {1, ..., n}), and we recurse on one side.
For a specific rank-k query, when the pivot has rank r:
- If r = k: we are done (no further recursion).
- If r > k: we recurse on the left subarray of size r - 1.
- If r < k: we recurse on the right subarray of size n - r.
To find the worst case over k, consider k = ⌈n/2⌉ (the median query). This maximizes the expected recursive subproblem size because both the left and right recursions produce large subproblems.
For k = n/2 (assuming n is even):
- When r > k: subproblem sizes are r - 1 = n/2, n/2+1, ..., n-1 (for r = n/2+1, ..., n).
- When r < k: subproblem sizes are n - r = n/2+1, n/2+2, ..., n-1 (for r = 1, ..., n/2-1).
Size n/2 appears once, and each size from n/2+1 to n-1 appears twice. This gives the upper bound:
T(n) ≤ (n - 1) + (2/n) Σ_{j=⌈n/2⌉}^{n-1} T(j)
Solving the recurrence by substitution:
We claim T(n) ≤ cn for all n ≥ 1. Assume inductively that T(j) ≤ cj for all j < n. Then:
T(n) ≤ (n - 1) + (2/n) Σ_{j=⌈n/2⌉}^{n-1} cj = (n - 1) + (2c/n) Σ_{j=⌈n/2⌉}^{n-1} j
Computing the sum (for n even, with ⌈n/2⌉ = n/2):
Σ_{j=n/2}^{n-1} j = Σ_{j=1}^{n-1} j - Σ_{j=1}^{n/2-1} j = n(n-1)/2 - (n/2 - 1)(n/2)/2 = n(n-1)/2 - (n² - 2n)/8 = (4n(n-1) - n² + 2n) / 8 = (4n² - 4n - n² + 2n) / 8 = (3n² - 2n) / 8
Substituting:
T(n) ≤ (n - 1) + (2c/n) · (3n² - 2n)/8 = (n - 1) + c(3n - 2)/4
For the induction to go through, we need:
(n - 1) + c(3n - 2)/4 ≤ cn
Rearranging:
n - 1 ≤ cn - 3cn/4 + c/2 = cn/4 + c/2 = c(n + 2)/4
So: c ≥ 4(n - 1)/(n + 2)
Checking what values of c work: Since 4(n-1)/(n+2) is increasing in n and approaches 4 as n → ∞ (but is strictly less than 4 for all finite n), we need c ≥ 4. Specifically:
- For c = 3: The inequality requires 3 ≥ 4(n-1)/(n+2), i.e., 3n + 6 ≥ 4n - 4, i.e., n ≤ 10. This fails for n > 10, so c = 3 does not work.
- For c = 4: The inequality requires 4 ≥ 4(n-1)/(n+2), i.e., 4n + 8 ≥ 4n - 4, i.e., 8 ≥ -4. This is always true. ✓
Verification with c = 4:
(n - 1) + 4(3n - 2)/4 = (n - 1) + (3n - 2) = 4n - 3 ≤ 4n ✓
Base case: T(1) = 0 ≤ 4 · 1 = 4. ✓
Therefore, by induction, T(n) ≤ 4n for all n ≥ 1, and since c = 3 fails for large n:
The smallest integer constant is c = 4. ∎
Problem 1: Hashing with Chaining
Score: 20/20 points
Assessment:
Approach: Rejection sampling — repeatedly pick a random slot and random position within L, accept if position exists in that chain.
Strengths:
- Algorithm is clearly stated and correct
- Correctness proof is rigorous: correctly shows each key has equal selection probability 1/(mL), establishing uniformity
- Time analysis is clean and well-structured: correctly computes P(accept) = α/L, expected trials = L/α
- Cost decomposition into rejection cost O(1) and single acceptance cost O(L) is elegant and valid
Errors/Gaps:
- None
Detailed Feedback: This is an excellent solution that matches the reference approach. The student clearly separates correctness (uniform selection) from efficiency analysis. The insight that all rejections cost O(1) while the single acceptance costs O(L) leads cleanly to the O(L/α + L) = O(L(1 + 1/α)) bound. The presentation is exemplary.
Rubric Breakdown:
- Algorithm description: 6/6 — Clear, correct rejection sampling procedure
- Correctness proof (uniformity): 7/7 — Rigorous proof that each key is selected with equal probability
- Running time analysis: 7/7 — Correct derivation of O(L(1 + 1/α))
---
Problem 2: Longest-Probe Bound for Hashing
Score: 40/40 points
Part (a): 10/10 points
Approach: Expresses the exact probability as a product of conditional probabilities, handles edge cases, bounds each factor by 1/2.
Strengths:
- More detailed than the reference solution — writes out the exact product formula
- Correctly handles the edge case i − 1 < k (probability is zero)
- Rigorously bounds each factor (i−1−j)/(m−j) < 1/2 using n ≤ m/2
- Clean algebraic manipulation
Errors/Gaps: None
Part (b): 10/10 points
Approach: Direct substitution of k = 2 log₂ n into part (a).
Strengths:
- Clean, correct application of part (a)
- Algebra is correct: 2^(−2 log₂ n) = n^(−2)
Errors/Gaps: None
Part (c): 10/10 points
Approach: Union bound over all n insertions.
Strengths:
- Correct application of union bound
- Clear chain of inequalities
Errors/Gaps: None
Part (d): 10/10 points
Approach: Uses the tail-sum formula E[X] = Σ P(X > k), splitting at threshold 2 log₂ n.
Strengths:
- This is actually a *more rigorous* approach than the reference solution's simple two-case split
- First sum bounded by trivially bounding each probability by 1
- Second sum uses union bound + part (a) to bound P(X > k) ≤ n · 2^(−k), then correctly sums the geometric series
- Correctly handles the floor function in the split point
- Final bound of (2 log₂ n + 1) + 2/n = O(log₂ n) is tight
Errors/Gaps: None
Detailed Feedback: All four parts are solved correctly with complete rigor. Part (a) provides more detail than necessary (which is a virtue), and part (d) uses the tail-sum technique which is more elegant than the reference solution's approach. The handling of edge cases and floor functions demonstrates strong mathematical maturity.
---
Problem 3: Randomized Coloring
Score: 15/15 points
Assessment:
Approach: Color each vertex independently and uniformly at random from {1, 2, 3}.
Strengths:
- Algorithm is clearly stated and trivially polynomial (O(|V|))
- Probability calculation is correct: P(satisfied) = 1 − 3·(1/3)² = 2/3
- Linearity of expectation applied correctly
- Key inequality c* ≤ |E| is explicitly stated and used
- Conclusion E[S] = (2/3)|E| ≥ (2/3)c* follows cleanly
Errors/Gaps:
- None
Detailed Feedback: This is a textbook-quality solution that matches the reference perfectly. The student correctly identifies the random coloring approach, computes the satisfaction probability for each edge, and uses linearity of expectation to establish the 2/3 approximation guarantee. The argument that c* ≤ |E| to bridge from E[S] = (2/3)|E| to E[S] ≥ (2/3)c* is explicitly stated, which is good practice.
Rubric Breakdown:
- Algorithm description: 4/4 — Clear polynomial-time algorithm
- Probability computation: 5/5 — Correct P(satisfied) = 2/3
- Linearity of expectation + final bound: 6/6 — Complete and correct
---
Problem 4: Analysis of Selection Using Recurrence
Score: 25/25 points
Assessment:
Approach: Sets up the recurrence for worst-case k (median query), applies substitution method, finds c = 4.
Strengths:
- Correctly identifies k = n/2 as the worst case for the query parameter
- Recurrence T(n) ≤ (n−1) + (2/n) Σ_{j=⌈n/2⌉}^{n−1} T(j) is the standard upper bound and is valid
- Sum computation Σ_{j=n/2}^{n−1} j = (3n²−2n)/8 is correct
- Substitution and algebra are carefully done
- Shows c = 3 fails for large n (good — demonstrates minimality)
- Shows c = 4 works with clean verification: 4n − 3 ≤ 4n
- Base case T(1) = 0 ≤ 4 is verified
- Final answer c = 4 matches the reference solution
Errors/Gaps:
- The claim that k = n/2 is worst case is stated with intuition rather than derived rigorously (the reference takes the derivative), but this is a minor point since the bound obtained is still valid regardless — the student's expression is an upper bound on the true recurrence for any k.
Detailed Feedback: The solution is thorough and correct. The student sets up the worst-case recurrence properly, executes the substitution proof by induction cleanly, and correctly arrives at c = 4. The additional analysis showing c = 3 fails (by checking 4(n−1)/(n+2) → 4 as n → ∞) demonstrates understanding that c = 4 is indeed the smallest integer constant. The verification step (4n − 3 ≤ 4n) provides a clean sanity check.
Rubric Breakdown:
- Recurrence setup: 7/7 — Correct worst-case recurrence with proper justification
- Inductive proof: 10/10 — Clean substitution, correct algebra, valid base case
- Finding c = 4 and showing minimality: 8/8 — Correct answer with proof that c = 3 fails
---
Overall Summary
| Problem | Score | Notes |
|---|---|---|
| 1: Hashing with Chaining | 20/20 | Perfect — clean rejection sampling with rigorous analysis |
| 2a: Probe bound | 10/10 | Perfect — detailed proof with edge case handling |
| 2b: O(1/n²) bound | 10/10 | Perfect — direct application of part (a) |
| 2c: O(1/n) bound | 10/10 | Perfect — clean union bound |
| 2d: E[X] = O(log n) | 10/10 | Perfect — elegant tail-sum approach, more rigorous than reference |
| 3: Randomized Coloring | 15/15 | Perfect — textbook solution with all components |
| 4: Selection Recurrence | 25/25 | Perfect — correct answer c = 4 with complete proof |
| Total | 100/100 |
General Comments:
This is an outstanding submission. Every problem is solved correctly with clear mathematical reasoning and complete rigor. The student demonstrates strong command of probabilistic analysis (Problems 1–3) and recurrence-based algorithm analysis (Problem 4). Several solutions are actually more detailed and rigorous than the reference — notably Problem 2(a)'s explicit product formula with edge case handling, and Problem 2(d)'s tail-sum approach. The writing is clear and well-organized throughout, with proper use of indicator random variables, linearity of expectation, union bounds, and inductive proofs. No errors or gaps were found.
Total: 100/100
diff --git a/writeup.md b/writeup.md
index daaca40..c048f66 100644
--- a/writeup.md
+++ b/writeup.md
@@ -1,9 +1,254 @@
# COMP382 Homework 4
-## Student Responses
+## Problem 1: Hashing with Chaining [20 Points]
-<!-- Write your solutions below. Add sections for each problem as needed. -->
-<!-- Use proper mathematical notation where applicable. -->
+**Setup:** We have n keys stored in a hash table of size m with chaining. We know the length l_j of each chain and the length L of the longest chain. We want to select a key uniformly at random in expected time O(L(1 + 1/α)), where α = n/m.
-[Your solutions here]
+**Algorithm (Rejection Sampling):**
+```
+UNIFORM-RANDOM-KEY(H, m, L):
+ repeat
+ 1. Pick a slot j uniformly at random from {0, 1, ..., m-1}.
+ 2. Pick an integer k uniformly at random from {1, 2, ..., L}.
+ 3. If k ≤ l_j (the length of chain at slot j):
+ Walk to the k-th element in chain j and return it.
+ // Otherwise (k > l_j), reject and repeat.
+```
+
+**Correctness (Uniform Selection):**
+
+Consider any key x stored at position p in the chain at slot s (so 1 ≤ p ≤ l_s). In a single trial, x is selected if and only if j = s and k = p. The probability of this is:
+
+P(select x in one trial) = P(j = s) · P(k = p) = (1/m)(1/L)
+
+This probability is the same for every key in the table, regardless of which slot it is in or its position in the chain. Therefore, conditioned on acceptance, each of the n keys is equally likely to be returned. The algorithm selects a key uniformly at random.
+
+**Expected Running Time:**
+
+The probability of accepting (returning a key) in a single trial is:
+
+P(accept) = Σ_{j=0}^{m-1} P(j selected) · P(k ≤ l_j) = Σ_{j=0}^{m-1} (1/m)(l_j/L) = (1/(mL)) Σ_{j=0}^{m-1} l_j = n/(mL) = α/L
+
+So the expected number of trials until acceptance is L/α.
+
+Now we analyze the cost per trial:
+- **Rejection cost:** When k > l_j, the trial costs O(1) time (picking two random numbers and comparing k with l_j, which we know).
+- **Acceptance cost:** When k ≤ l_j, we walk to position k in the chain, which takes O(k) ≤ O(L) time.
+
+We accept exactly once (the last trial). All other trials are rejections. Therefore:
+
+E[total time] = (expected number of rejections) · O(1) + O(L)
+ = (L/α - 1) · O(1) + O(L)
+ = O(L/α) + O(L)
+ = O(L(1/α + 1))
+ = **O(L(1 + 1/α))**
+
+This matches the required bound. ∎
+
+---
+
+## Problem 2: Longest-Probe Bound for Hashing [40 Points]
+
+**Setup:** Open-addressed hash table of size m with n ≤ m/2 items, using independent uniform permutation hashing.
+
+### Part (a)
+
+**Claim:** For i = 1, 2, ..., n, the probability that the i-th insertion requires strictly more than k probes is at most 2^(-k).
+
+**Proof:**
+
+The i-th insertion requires more than k probes if and only if the first k probed positions are all occupied. When inserting the i-th element, there are i - 1 elements already in the table. Under uniform permutation hashing, the probe sequence is a random permutation of all m slots.
+
+P(X_i > k) = P(first k probes all hit occupied slots)
+ = ((i-1)/m) · ((i-2)/(m-1)) · ((i-3)/(m-2)) · ... · ((i-k)/(m-k+1))
+ = ∏_{j=0}^{k-1} (i - 1 - j)/(m - j)
+
+If i - 1 < k, then one of the numerators is zero or negative, so P(X_i > k) = 0 ≤ 2^(-k). ✓
+
+If i - 1 ≥ k, we bound each factor. Since i ≤ n ≤ m/2, for each j = 0, 1, ..., k-1:
+
+(i - 1 - j)/(m - j) ≤ (m/2 - 1 - j)/(m - j) = (m - 2 - 2j)/(2(m - j))
+
+Since m - 2 - 2j = (m - j) - (j + 2) < m - j, we have:
+
+(m - 2 - 2j)/(2(m - j)) < (m - j)/(2(m - j)) = 1/2
+
+Therefore each factor is strictly less than 1/2, and:
+
+P(X_i > k) = ∏_{j=0}^{k-1} (i - 1 - j)/(m - j) < (1/2)^k = 2^(-k) ∎
+
+### Part (b)
+
+**Claim:** P(X_i > 2 log₂ n) = O(1/n²).
+
+**Proof:**
+
+Setting k = 2 log₂ n in the result from Part (a):
+
+P(X_i > 2 log₂ n) ≤ 2^(-2 log₂ n) = (2^(log₂ n))^(-2) = n^(-2) = 1/n²
+
+Therefore P(X_i > 2 log₂ n) ≤ 1/n² = O(1/n²). ∎
+
+### Part (c)
+
+**Claim:** P(X > 2 log₂ n) = O(1/n), where X = max_{1 ≤ i ≤ n} X_i.
+
+**Proof:**
+
+By the union bound:
+
+P(X > 2 log₂ n) = P(∃ i ∈ {1,...,n} : X_i > 2 log₂ n)
+ ≤ Σ_{i=1}^{n} P(X_i > 2 log₂ n)
+ ≤ n · (1/n²) [by Part (b)]
+ = 1/n
+ = O(1/n) ∎
+
+### Part (d)
+
+**Claim:** E[X] = O(log₂ n).
+
+**Proof:**
+
+We use the tail-sum formula for expectation of a non-negative integer-valued random variable:
+
+E[X] = Σ_{k=0}^{∞} P(X > k)
+
+We split this sum at t = ⌊2 log₂ n⌋:
+
+E[X] = Σ_{k=0}^{⌊2 log₂ n⌋} P(X > k) + Σ_{k=⌊2 log₂ n⌋+1}^{∞} P(X > k)
+
+**First sum:** Each probability is at most 1, so:
+
+Σ_{k=0}^{⌊2 log₂ n⌋} P(X > k) ≤ ⌊2 log₂ n⌋ + 1 ≤ 2 log₂ n + 1
+
+**Second sum:** For k > 2 log₂ n, we apply the union bound with Part (a):
+
+P(X > k) ≤ Σ_{i=1}^{n} P(X_i > k) ≤ n · 2^(-k)
+
+Therefore:
+
+Σ_{k=⌊2 log₂ n⌋+1}^{∞} P(X > k) ≤ Σ_{k=⌊2 log₂ n⌋+1}^{∞} n · 2^(-k)
+ = n · 2^{-(⌊2 log₂ n⌋+1)} · Σ_{j=0}^{∞} 2^(-j)
+ = n · 2^{-(⌊2 log₂ n⌋+1)} · 2
+
+Since ⌊2 log₂ n⌋ ≥ 2 log₂ n - 1:
+
+ ≤ 2n · 2^{-(2 log₂ n - 1 + 1)}
+ = 2n · 2^{-2 log₂ n}
+ = 2n · n^{-2}
+ = 2/n
+
+Combining both sums:
+
+E[X] ≤ (2 log₂ n + 1) + 2/n = O(log₂ n) ∎
+
+---
+
+## Problem 3: Randomized Coloring [15 Points]
+
+**Algorithm:**
+
+```
+RANDOM-3-COLOR(G = (V, E)):
+ For each vertex v ∈ V:
+ Assign color(v) ← uniform random choice from {1, 2, 3} (independently)
+ Return the coloring.
+```
+
+This runs in O(|V|) time, which is polynomial.
+
+**Analysis:**
+
+Consider any edge (u, v) ∈ E. Since color(u) and color(v) are chosen independently and uniformly from {1, 2, 3}:
+
+P(edge (u,v) is satisfied) = P(color(u) ≠ color(v))
+ = 1 - P(color(u) = color(v))
+ = 1 - 3 · (1/3)²
+ = 1 - 1/3
+ = 2/3
+
+Let S denote the number of satisfied edges. By linearity of expectation:
+
+E[S] = Σ_{(u,v) ∈ E} P(color(u) ≠ color(v)) = (2/3)|E|
+
+Since c* is the maximum number of satisfied edges over all 3-colorings, and no coloring can satisfy more edges than exist in the graph:
+
+c* ≤ |E|
+
+Therefore:
+
+E[S] = (2/3)|E| ≥ (2/3)c*
+
+The randomized algorithm produces a 3-coloring whose expected number of satisfied edges is at least (2/3)c*. ∎
+
+---
+
+## Problem 4: Analysis of Selection Using Recurrence [25 Points]
+
+**Goal:** Find the smallest integer constant c such that T(n) ≤ cn for all n, where T(n) is the expected number of comparisons performed by SELECT.
+
+**Setting up the recurrence:**
+
+When SELECT is called on an array of size n, the pivot is chosen uniformly at random. PARTITION uses n - 1 comparisons. After partitioning, the pivot ends up at some rank r (equally likely to be any value in {1, ..., n}), and we recurse on one side.
+
+For a specific rank-k query, when the pivot has rank r:
+- If r = k: we are done (no further recursion).
+- If r > k: we recurse on the left subarray of size r - 1.
+- If r < k: we recurse on the right subarray of size n - r.
+
+To find the worst case over k, consider k = ⌈n/2⌉ (the median query). This maximizes the expected recursive subproblem size because both the left and right recursions produce large subproblems.
+
+For k = n/2 (assuming n is even):
+- When r > k: subproblem sizes are r - 1 = n/2, n/2+1, ..., n-1 (for r = n/2+1, ..., n).
+- When r < k: subproblem sizes are n - r = n/2+1, n/2+2, ..., n-1 (for r = 1, ..., n/2-1).
+
+Size n/2 appears once, and each size from n/2+1 to n-1 appears twice. This gives the upper bound:
+
+T(n) ≤ (n - 1) + (2/n) Σ_{j=⌈n/2⌉}^{n-1} T(j)
+
+**Solving the recurrence by substitution:**
+
+We claim T(n) ≤ cn for all n ≥ 1. Assume inductively that T(j) ≤ cj for all j < n. Then:
+
+T(n) ≤ (n - 1) + (2/n) Σ_{j=⌈n/2⌉}^{n-1} cj
+ = (n - 1) + (2c/n) Σ_{j=⌈n/2⌉}^{n-1} j
+
+**Computing the sum** (for n even, with ⌈n/2⌉ = n/2):
+
+Σ_{j=n/2}^{n-1} j = Σ_{j=1}^{n-1} j - Σ_{j=1}^{n/2-1} j
+ = n(n-1)/2 - (n/2 - 1)(n/2)/2
+ = n(n-1)/2 - (n² - 2n)/8
+ = (4n(n-1) - n² + 2n) / 8
+ = (4n² - 4n - n² + 2n) / 8
+ = (3n² - 2n) / 8
+
+Substituting:
+
+T(n) ≤ (n - 1) + (2c/n) · (3n² - 2n)/8
+ = (n - 1) + c(3n - 2)/4
+
+For the induction to go through, we need:
+
+(n - 1) + c(3n - 2)/4 ≤ cn
+
+Rearranging:
+
+n - 1 ≤ cn - 3cn/4 + c/2 = cn/4 + c/2 = c(n + 2)/4
+
+So: c ≥ 4(n - 1)/(n + 2)
+
+**Checking what values of c work:** Since 4(n-1)/(n+2) is increasing in n and approaches 4 as n → ∞ (but is strictly less than 4 for all finite n), we need c ≥ 4. Specifically:
+
+- For c = 3: The inequality requires 3 ≥ 4(n-1)/(n+2), i.e., 3n + 6 ≥ 4n - 4, i.e., n ≤ 10. This fails for n > 10, so c = 3 does not work.
+- For c = 4: The inequality requires 4 ≥ 4(n-1)/(n+2), i.e., 4n + 8 ≥ 4n - 4, i.e., 8 ≥ -4. This is always true. ✓
+
+**Verification with c = 4:**
+
+(n - 1) + 4(3n - 2)/4 = (n - 1) + (3n - 2) = 4n - 3 ≤ 4n ✓
+
+**Base case:** T(1) = 0 ≤ 4 · 1 = 4. ✓
+
+Therefore, by induction, T(n) ≤ 4n for all n ≥ 1, and since c = 3 fails for large n:
+
+**The smallest integer constant is c = 4.** ∎
Sub-Model Usage
| Model | Input | Output | Cache Read | Cost |
|---|---|---|---|---|
| claude-opus-4-6 | 7 | 19.5K | 215.8K | $0.80 |
BSCS Bench