Question text is in black, solutions are in blue.
Q1: 10 points Q2: 10 points Q3: 20 points Q4: 20 points Q5: 25 points Q6: 15 points Total: 100 points
True or false with justification: Let A be an algorithm that operates on a binary tree, without changing the tree, as follows: A first spends O(1) time processing the root node of the tree, then recursively calls A on the left and the right subtrees. Then A can be substantially sped up on general binary trees using dynamic programming.
FALSE. Dynamic programming exploits overlap between the subproblems of a problem. Here the subproblems that are identified through the recursive calls involve processing subtrees of the tree, and different subtrees of the same tree do not overlap unless one is a subtree of the other. Memoization will not help at all as the algorithm will be called only once on each subtree.
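The claim that each subtree is visited only once can be checked directly. Below is a minimal sketch, not part of the original solution: the names `TreeVisitCounter`, `Node`, `visit` and `complete` are made up for illustration, and the O(1) work at each node is stood in for by a counter update. It counts how many times the recursion is invoked on each distinct node.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical illustration; none of these names come from the exam.
class TreeVisitCounter {
    static final class Node {
        final Node left, right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Mimics algorithm A: O(1) work at the root, then recurse on both subtrees.
    // The "work" here is recording one more call on this node.
    static void visit(Node node, Map<Node, Integer> calls) {
        if (node == null) return;
        calls.merge(node, 1, Integer::sum);
        visit(node.left, calls);
        visit(node.right, calls);
    }

    // Largest number of calls made on any single node (0 for an empty tree).
    // A result of 1 means a memo table keyed on subtrees would never be hit.
    static int maxCallsPerNode(Node root) {
        Map<Node, Integer> calls = new IdentityHashMap<>();
        visit(root, calls);
        int max = 0;
        for (int c : calls.values()) max = Math.max(max, c);
        return max;
    }

    // Builds a complete binary tree; height 0 is a single node.
    static Node complete(int height) {
        if (height < 0) return null;
        return new Node(complete(height - 1), complete(height - 1));
    }
}
```

Since no node is ever reached twice, caching results per subtree would only add overhead.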
True or false with justification: Let k be any positive integer constant greater than 1. Then the recurrence T(n) = kT(n/k) + O(n), with T(1) = O(1), has the same big-O solution for any such k. (You may assume that T(n) is evaluated only when n is a power of k.)
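TRUE. We are familiar with the Mergesort recurrence, with k=2, and this
results in T(n) = O(n log n). For any integer constant k, writing the O(n)
term as cn and unrolling the recurrence, we will have:
T(n) = kT(n/k) + cn
     = k^2 T(n/k^2) + 2cn
     = ...
     = k^i T(n/k^i) + icn.
Taking i = log_k n gives T(n) = nT(1) + cn log_k n = O(n log_k n).
As long as k > 1 the unrolling stops this way after log_k n levels, and as
long as k is O(1), log_k n = (log n)/(log k) = Θ(log n).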
Consider the following Java method (assume that it will only be called with 0 ≤ k and k ≤ n):
int choose (int n, int k) {
    if ((k == 0) || (k == n)) return 1;
    return choose (n-1, k) + choose (n-1, k-1);
}
Explain why the running time will not be polynomial in n in general. Describe (in code or in English) how to revise the algorithm to be polynomial time in n. Carefully determine and justify a big-O bound on the running time of your improved version.
The recursion will call the method choose on pairs of arguments
(i,j) with i ≤ n and j ≤ k, once it is originally called on (n,k). But
there will be multiple calls to the same pairs of arguments. For example,
if n is even and the original call is to (n,n/2), then both the subcalls to
(n-1,n/2) and (n-1,n/2-1) will result in separate calls to (n-2,n/2-1). Each
of these two calls will result in two separate calls to (n-4,n/2-2), making
four calls with those arguments. Similarly we get (at least)
eight total calls to (n-6,n/2-3), 16 calls to (n-8,n/2-4), and eventually
2^(n/2-1) calls to (2,1), which make 2^(n/2) base-case calls between them.
This is exponential in n.
Memoization would make only one call to each of these O(n^2)
pairs of arguments, and the processing of each pair of arguments requires only
O(1) time if we exclude the time for the recursive calls. Thus the total time
for the memoized algorithm (which records the value for each pair of arguments
in a table the first time it is computed) is O(n^2).
Similarly, we could fill out the table of results without recursion as
follows:
int dpChoose (int n, int k) {
    int[][] table = new int[n+1][k+1];
    for (int i=0; i < n+1; i++) {
        table[i][0] = 1;                   // base case: k == 0
        if (i <= k) table[i][i] = 1;}      // base case: k == n
    for (int j=1; j <= k; j++)             // j starts at 1; column 0 is already filled
        for (int m=j+1; m <= n; m++)
            table[m][j] = table[m-1][j-1] + table[m-1][j];
    return table[n][k];}
This code's running time is dominated by the two nested loops on lines
6 and 7, which take O(n^2) time.
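The memoized version described in this solution is not written out, so here is a minimal sketch of it. The class name `MemoChoose` and the table `memo` are assumptions made for illustration, and `long` is used instead of `int` only to postpone overflow. An entry of 0 marks a pair of arguments that has not been computed yet, which is safe because every binomial coefficient is at least 1.

```java
// Hypothetical sketch of the memoized recursion; names are not from the exam.
class MemoChoose {
    // Assumes 0 <= k <= n, as in the question. Ignores arithmetic overflow.
    static long choose(int n, int k) {
        long[][] memo = new long[n + 1][k + 1];   // memo[i][j] == 0 means "not computed"
        return choose(n, k, memo);
    }

    private static long choose(int n, int k, long[][] memo) {
        if (k == 0 || k == n) return 1;            // same base cases as the original
        if (memo[n][k] != 0) return memo[n][k];    // reuse a recorded value
        long value = choose(n - 1, k, memo) + choose(n - 1, k - 1, memo);
        memo[n][k] = value;                        // record it the first time
        return value;
    }
}
```

Each of the O(n^2) table entries is filled at most once with O(1) work outside the recursive calls, which matches the O(n^2) bound argued above.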
Let G be a directed graph and let s1,...,sk and t1,...,tk be any 2k distinct vertices of G. We want to know whether there exist k paths P1,...,Pk such that:
Describe an algorithm to solve this problem in a time that is polynomial in m, the number of edges in G. Determine the big-O running time of your algorithm in terms of m, n (the number of vertices in G) and k (the number of paths to be found).
Here's the solution for the problem I meant to assign. Take the graph, make a
new source s and a new sink t, connect s to each of the nodes si,
connect each of the nodes ti to t, and add loops at any other sources
or sinks so that s and t are the only source and the only sink. Give every
edge a capacity of 1.
We can prove that there is a flow of k through this new graph if and only if
the desired k edge-disjoint paths exist in the original graph. First, note that
any integer flow in this graph must have unit flow over each edge in some
subset, and this subset must include all k edges out of s and all k edges into
t. Every other node must have the same number of unit-flow edges in and out of
it. We can prove by induction on j that such a set of edges must consist of
j edge-disjoint paths from j of the si's to j of the ti's.
For j=0 there are no edges and these form zero edge-disjoint paths. For the
case of j, consider any si with an edge out of it and follow a path
of unit-flow edges starting there. This path can only stop at t, so it goes
through one of the ti's. If we delete this path, there is still a
flow of j-1 from s to t, so by the inductive hypothesis there are j-1 paths that
union together to form the rest of the edges. The new path we just constructed
makes j paths, completing the induction.
Finally, we need to show that if the k edge-disjoint paths exist in the old
graph, the flow of size k exists in the new graph. If we place a unit flow on
each edge of each path, and on the 2k edges out of s and into t, we have
a flow of size k -- it meets the conservation constraints because there are
an equal number of edges into and out of each vertex other than s or t.
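The construction above can be sketched in code. This is a rough sketch under stated assumptions, not a definitive implementation. It assumes the intended problem asks for k edge-disjoint paths from the si's to the ti's, and that vertices are numbered 0 to n-1. The names `EdgeDisjointPaths`, `addEdge`, `augment` and `pathsExist` are made up here. It omits the loops added at other sources and sinks, since self-loops never carry useful flow. Each augmentation is a single depth-first search of the residual graph, and at most k augmentations are needed, so the sketch runs in O(k(m + n)) time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: unit-capacity Ford-Fulkerson with a new source and sink.
class EdgeDisjointPaths {
    private final List<List<Integer>> adj = new ArrayList<>(); // edge ids leaving each vertex
    private final List<Integer> to = new ArrayList<>();        // head of each edge
    private final List<Integer> cap = new ArrayList<>();       // residual capacity of each edge

    EdgeDisjointPaths(int vertexCount) {
        for (int v = 0; v < vertexCount; v++) adj.add(new ArrayList<>());
    }

    // Adds a capacity-1 edge u -> v and its capacity-0 reverse edge.
    // The two edges get ids 2i and 2i+1, so (id ^ 1) is always the partner.
    void addEdge(int u, int v) {
        adj.get(u).add(to.size()); to.add(v); cap.add(1);
        adj.get(v).add(to.size()); to.add(u); cap.add(0);
    }

    // Depth-first search for one augmenting path; pushes one unit of flow along it.
    private boolean augment(int u, int sink, boolean[] seen) {
        if (u == sink) return true;
        seen[u] = true;
        for (int e : adj.get(u)) {
            int v = to.get(e);
            if (cap.get(e) > 0 && !seen[v] && augment(v, sink, seen)) {
                cap.set(e, cap.get(e) - 1);
                cap.set(e ^ 1, cap.get(e ^ 1) + 1);
                return true;
            }
        }
        return false;
    }

    // True if a flow of size k = sources.length exists from the new source to the new sink.
    static boolean pathsExist(int n, int[][] edges, int[] sources, int[] targets) {
        int s = n, t = n + 1;                        // new source and new sink
        EdgeDisjointPaths g = new EdgeDisjointPaths(n + 2);
        for (int[] e : edges) g.addEdge(e[0], e[1]);
        for (int si : sources) g.addEdge(s, si);
        for (int ti : targets) g.addEdge(ti, t);
        int flow = 0;
        while (flow < sources.length && g.augment(s, t, new boolean[n + 2])) flow++;
        return flow == sources.length;
    }
}
```

The recursive search is fine for small graphs. A very deep graph would need an explicit stack instead.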
Consider the following algorithm strangeSort, which sorts n Comparable
items in a list A. (I should have added the assumption that
the n items are all distinct. Otherwise the algorithm below fails to terminate
if given a list of two or more items that are all equal.)
Clearly if n ≤ 1 the list is already sorted, and thus line 1 returns the sorted version of the list as desired. Assume now that strangeSort sorts all lists of length n/2, with smaller items first. Consider the operation of strangeSort on a list of size n. In line 2, each item is assigned a number from 0 through n-1, and in line 3 the smallest n/2 items are put in B. The two lists B and C thus each have n/2 items, so by the inductive hypothesis the recursive calls sort B and C correctly. Since every item in B is smaller than every item in C, the append operation in line 6 creates a sorted list of n elements. So the inductive step is complete, assuming that n is a power of 2.
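The steps the proof refers to can be sketched in Java. This is a reconstruction from the line-by-line description in this solution, not the exam's own listing, so the details are assumptions. In particular, the ranks are computed by counting smaller items, and the class name `StrangeSort` and the arrays and lists inside are made up. The comments give the line of the algorithm each step is meant to correspond to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction of strangeSort; assumes all items are distinct.
class StrangeSort {
    static <T extends Comparable<T>> List<T> strangeSort(List<T> a) {
        int n = a.size();
        if (n <= 1) return new ArrayList<>(a);              // line 1: already sorted
        int[] rank = new int[n];                            // line 2: rank = number of smaller items,
        for (int i = 0; i < n; i++)                         //         found by one scan per item
            for (T other : a)
                if (other.compareTo(a.get(i)) < 0) rank[i]++;
        List<T> b = new ArrayList<>();
        List<T> c = new ArrayList<>();
        for (int i = 0; i < n; i++)                         // lines 3 and 4: smallest n/2 items go to B,
            (rank[i] < n / 2 ? b : c).add(a.get(i));        //                the rest go to C
        List<T> result = strangeSort(b);                    // line 5: sort both halves recursively
        result.addAll(strangeSort(c));                      // line 6: append sorted C to sorted B
        return result;
    }
}
```

With two or more equal items every rank is 0, so all items land in B and the recursion never shrinks. This is the failure to terminate mentioned in the question.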
Line 1 means that T(1) = O(1). Line 2 requires n scans of the entire list of
n elements and so takes O(n^2) time. Lines 3 and 4 each move n/2 items and so
take O(n) time. Line 5 makes two calls to strangeSort with arguments of size
n/2 and so takes time 2T(n/2). Line 6 also takes O(n) time. The total time is
thus 2T(n/2) + O(n^2) + O(n) = 2T(n/2) + O(n^2).
This recurrence was solved to T(n) = O(n^2) in the book -- we
repeat the derivation here:
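One standard way to unroll the recurrence is shown below. It is not necessarily the book's presentation. It assumes n is a power of 2 and writes the O(n^2) term as cn^2 for a constant c, which is a symbol introduced here.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + c n^2 \\
     &= 4\,T(n/4) + c n^2 + \tfrac{1}{2} c n^2 \\
     &= 2^i\,T(n/2^i) + c n^2 \sum_{j=0}^{i-1} 2^{-j} \\
     &= n\,T(1) + c n^2 \sum_{j=0}^{\log_2 n - 1} 2^{-j}
        \qquad (i = \log_2 n) \\
     &\le n \cdot O(1) + 2 c n^2 = O(n^2).
\end{align*}
```

Level j of the recursion has 2^j subproblems of size n/2^j, so it contributes 2^j · c(n/2^j)^2 = cn^2/2^j. The geometric sum is less than 2, so the top level dominates.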
Let G be a directed graph with exactly one source s, exactly one sink t, and a positive integer capacity on each edge. Explain carefully why we know that there is a maximum-size flow in G from s to t that sends an integer flow over each edge of G.
We have proved that the Ford-Fulkerson algorithm, which finds augmenting paths,
sends the maximum possible augmenting flow across them, and then recalculates
the residual graph, achieves a maximum flow in any graph with positive integer
capacities. (It adds an integer flow on each phase, and after at most C phases,
where C is the capacity out of s, it must reach a point where the residual
graph has no path from s to t, meaning that there is a cut of the graph that is
saturated.)
By induction on the number of phases, the edge labels in the residual graph
are always integers. This is because they start out as the integer capacities
of the graph, and the flows added on the augmenting paths are always equal to
one of the edge labels in the previous residual graph. Adding this integer
flow can only change the residual graph edge labels by an integer, so they
stay integers. At the end of the algorithm, the flow over each edge is the
difference between the label of that edge in the residual graph and the capacity
of the edge -- since this is the difference of two integers, it is an integer.
Last modified 6 November 2006