Question text is in black, solutions in blue.
Q1: 15 points Q2: 15 points Q3: 10 points Q4: 20 points Q5: 10 points Q6: 20 points Q7: 10 points Q8: 20 points Total: 120 points
Give an algorithm, with running time polynomial in n, that determines an order of the jobs that will achieve the optimal reward for all values of t. (You do not need to prove here that your order has this property, that is Question 2.) State and justify a polynomial bound on the running time.
For each job J_i, compute z_i = r_i/d_i,
the rate at which we accumulate reward while doing job J_i.
Then order the jobs by decreasing z_i, breaking ties arbitrarily.
The computation of the rates takes O(n) time, and sorting the jobs by rate
takes O(n log n) time if we use Mergesort, for example. Total time to find a
correct order is thus O(n) + O(n log n) = O(n log n).
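A minimal Python sketch of this ordering step follows. It assumes each job is given as a (reward, duration) pair with positive duration; that representation and the name order_by_rate are illustrative assumptions, not part of the exam.

```python
def order_by_rate(jobs):
    """Return the jobs sorted by decreasing reward rate r/d.

    Each job is assumed to be a (reward, duration) pair with duration > 0.
    Ties are left in their input order, which is one arbitrary tie-break.
    """
    return sorted(jobs, key=lambda job: job[0] / job[1], reverse=True)


if __name__ == "__main__":
    # Rates are 0.5, 3.0 and 1.0, so the middle job comes first.
    print(order_by_rate([(1, 2), (6, 2), (3, 3)]))
```

Python's built-in sort runs in O(n log n) time, matching the bound above.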
Prove that no other order on the jobs can achieve a greater value than your order from Question 1, for any possible value of t. (Hint: You can show that any other order either achieves the same value as yours does, or can be altered to get a greater value for some t.)
We first claim that any two orders that are nonincreasing by rate score the
same reward for any t. We prove this by an exchange argument -- we can get from
any one such order to any other by a sequence of swaps of adjacent jobs with the
same rate. Let the jobs be ordered J_1,..., J_n by
nonincreasing rate, assume that J_i and J_{i+1} have the
same rate z, and consider the order J_1,..., J_{i-1},
J_{i+1}, J_i, J_{i+2},..., J_n obtained by
swapping J_i and J_{i+1}. We need to prove that for any t,
these two orders achieve the same reward. (Then the general result will follow
by induction on these single exchanges.)
If t is less than the sum of the first i-1 durations, or greater than the sum
of the first i+1 durations, then the two orders complete exactly the same jobs
and finish by devoting the same amount of time to the same final job. Thus in
these cases they achieve the same reward. Now consider some t between these
two sums, so that t = (sum of first i-1 durations) + y. Both orders complete
the first i-1 jobs, and each then spends a time of y on jobs J_i
and/or J_{i+1}, which each have rate z. So each order achieves
(sum of first i-1 rewards) + yz, and thus both achieve the same reward.
Now we must prove that any order that is not nonincreasing by rate
fails to achieve the optimal reward for some value of t. Again we use an
exchange argument. Suppose that the order J_1,..., J_n
fails to be nonincreasing and let J_i and J_{i+1} be the first
pair of adjacent jobs such that z_i < z_{i+1}. We will
show that swapping J_i and J_{i+1} gets a greater reward for
some t. Let t be the sum of the first i-1 durations, plus the smaller of the
two numbers d_i and d_{i+1}, which we may call d.
The original order gets a reward of (sum of first i-1 rewards) + d z_i,
while the other order gets (sum of first i-1 rewards) + d z_{i+1}, which
is strictly greater by our hypothesis.
As above, we can check that swapping two adjacent items that are out of
order by rate has no effect on the total reward for t's that are outside the
range from (sum of first i-1 durations) to (sum of first i+1 durations), and
increases the reward for t's within that range. Thus if we take any order and
perform swaps of adjacent jobs until we get an order that is nonincreasing by
rate, we never hurt the reward and we sometimes help it, so our final order is
optimal for all values of t.
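The claim can be sanity-checked by brute force on small inputs. The sketch below assumes, as in the argument above, that partial work on a job earns reward at the constant rate r/d; the function names and the (reward, duration) representation are illustrative assumptions. It is a check on examples, not a substitute for the proof.

```python
from itertools import permutations


def reward(order, t):
    """Reward earned in time t when jobs are done in the given order.

    Assumes a job of reward r and duration d pays out at the constant
    rate r/d, so partial work earns a proportional share of r.
    """
    total = 0.0
    for r, d in order:
        if t <= 0:
            break
        spent = min(t, d)
        total += r * spent / d
        t -= spent
    return total


def rate_order_is_optimal(jobs, times):
    """Compare the rate order against every permutation at each time."""
    best = sorted(jobs, key=lambda job: job[0] / job[1], reverse=True)
    return all(
        reward(best, t) >= reward(perm, t) - 1e-9  # tolerance for rounding
        for perm in permutations(jobs)
        for t in times
    )


if __name__ == "__main__":
    jobs = [(1, 2), (6, 2), (3, 3), (4, 1)]
    times = [0.5 * k for k in range(17)]
    print(rate_order_is_optimal(jobs, times))
```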
True or false with justification: Let A be an algorithm that operates on a list of size n as follows. If n ≤ 1, then A takes only O(1) time. Otherwise A spends O(n) time to split the list into two pieces, each of which has size at most 2n/3. It then calls itself recursively on the two pieces. Then any such A has a running time that is polynomial in n.
TRUE. The recurrence is T(n) ≤ 2T(2n/3) + O(n), with base case T(O(1)) =
O(1), and we saw in lecture that this has a solution of T(n) =
O(n^(log_{3/2} 2)), which is polynomial.
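As a numerical sanity check, the sketch below evaluates one concrete instance of the recurrence and compares it with n raised to the power log_{3/2} 2 (about 1.71). The unit costs in the base case and the split step, the use of floor(2n/3), and the name worst_case_cost are assumptions made only for illustration.

```python
import math
from functools import lru_cache

# Exponent log base 3/2 of 2, roughly 1.71.
ALPHA = math.log(2) / math.log(1.5)


@lru_cache(maxsize=None)
def worst_case_cost(n):
    """Evaluate T(n) = 2 T(floor(2n/3)) + n with T(0) = T(1) = 1."""
    if n <= 1:
        return 1
    return 2 * worst_case_cost((2 * n) // 3) + n


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # The ratio should stay bounded if T(n) = O(n^ALPHA).
        print(n, worst_case_cost(n) / n ** ALPHA)
```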
Many of you wrote that the solution was O(n log n), which is true if we
interpret the word "split" in the problem statement to mean that the sizes
of the two pieces add to n. If we assume that T(m) ≤ c(m log m) for m
smaller than n, and then compute T(n) ≤ T(s) + T(n-s) + dn where s is
between n/3 and 2n/3, we get the following. T(n) is at most
c(s log s) + c(n-s)(log(n-s)) + dn ≤ cn(log(2n/3)) + dn = cn(log n) -
cn(log 3/2) + dn, which is less than cn(log n) for the appropriate choice of
c in terms of d.
Consider the special case of the BIN-PACKING problem where the size of each item is a positive integer and the bin size is a constant positive integer, called b. Given n items, we need to find the exact minimum number of bins needed to store them all. (The problem is no longer NP-complete in this special case, unless P = NP.) Describe an algorithm to solve this problem that has a running time that is polynomial in n if b is considered to be a constant. State and justify a polynomial bound on your running time.
My intended solution here was to use dynamic programming, in the same way that
we solved the subset-sum problem in lecture when the size of each object
was an integer polynomial in n. We can represent any assignment of objects to
bins by a vector (n_1,...,n_b) where n_i is the
number of bins that have exactly i units in them. Each n_i is between 0 and n,
so there are at most (n+1)^b = O(n^b) of these vectors.
Let's make a table with a boolean for each one of these vectors. Originally
we have (0,...,0) true and all the other entries false. This represents all
the possible arrangements of the first 0 items. Assuming our table represents
all the arrangements of the first i items, we scan the table and determine all
the ways we can make a valid arrangement by adding the (i+1)st item and make a
new table containing true entries for all those vectors. For example, if b=3
and we are considering the arrangement (4,7,2) with a new item of size 2, we
could get a new arrangement (4,8,2) by putting the new item in a new bin, or
(3,7,3) by adding the new item to a bin with one unit already in it.
After we have
processed all n items, we have a list of all the valid arrangements of n items
and we can report the fewest bins used by any of them. This takes n passes through
the O(n^b) size table, for total time O(n^(b+1)), a polynomial in n
as long as b is constant.
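The dynamic program can be sketched in Python as follows. This version stores the reachable vectors in a set rather than a boolean table, which is a swapped-in representation of the same idea; the function name min_bins is an illustrative assumption.

```python
def min_bins(sizes, b):
    """Minimum number of bins of capacity b, by DP over fill-level vectors.

    A state is a tuple counts where counts[i - 1] is the number of bins
    holding exactly i units, for i = 1..b.
    """
    if any(s < 1 or s > b for s in sizes):
        raise ValueError("each item size must be an integer in 1..b")
    states = {(0,) * b}  # the only arrangement of zero items
    for s in sizes:
        next_states = set()
        for counts in states:
            # Option 1: open a new bin holding only this item.
            opened = list(counts)
            opened[s - 1] += 1
            next_states.add(tuple(opened))
            # Option 2: add the item to an existing bin that has room.
            for level in range(1, b - s + 1):
                if counts[level - 1] > 0:
                    moved = list(counts)
                    moved[level - 1] -= 1
                    moved[level + s - 1] += 1
                    next_states.add(tuple(moved))
        states = next_states
    # Every surviving vector is a valid arrangement of all the items.
    return min(sum(counts) for counts in states)


if __name__ == "__main__":
    print(min_bins([2, 2, 2, 1, 1, 1], 3))
```

With b = 3, the state (4,7,2) and a new item of size 2 produce exactly the two successors (4,8,2) and (3,7,3) described above.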
Most of you took a greedy approach to the problem, filling the first bin as
full as possible, then the next bin as full as possible from the remaining
items, and so forth. You got significant partial credit for presenting and
analyzing such an algorithm, but these algorithms are not correct in
general. There may be many ways to fill the next bin using the remaining items,
and some ways may be better than others with regard to leaving a set of items
that will fit into a particular number of remaining bins. For example, let
b = 3 and let our item sizes be 2, 2, 2, 1, 1, 1. We can fit these items into
three bins but if we fill the first bin greedily with the three 1's we wind
up using four bins. A natural heuristic will solve this example but we can
easily make more complicated ones. (If your algorithm never makes use of the
fact that b = O(1), for example, then it is either incorrect or proves that
P = NP by solving the general case where b is polynomial in n -- we asserted in
class that BIN-PACKING is strongly NP-complete.)
True or false with justification: The SQUARE-TILING problem has as input a set of n^2 tiles, each of which has one of n^2 possible colors. (It is possible that not all the colors are used.) The output is a boolean saying whether the tiles can be put in an n by n square such that each tile is used exactly once and such that no color appears more than once in any row or in any column. Then there is a polynomial-time reduction from SQUARE-TILING to HAMILTON-CIRCUIT.
TRUE. Any decision problem in NP reduces to HAMILTON-CIRCUIT, because
HAMILTON-CIRCUIT is NP-complete. It is easy to show that SQUARE-TILING is in
the class NP, because given an alleged valid tiling we have only to show that
each tile is used exactly once, that no color appears twice in the same row,
and that no color appears twice in the same column. Thus the problem does not
require us to determine whether SQUARE-TILING is in P, is NP-complete, or
neither.
As it turns out, SQUARE-TILING is in P. If there are n+1 or more tiles
with the same color, then by the Pigeonhole Principle there cannot possibly
be a valid tiling. If there are no more than n of any given color, then we
can always do it. Place the tiles in any order where tiles of each color
occur consecutively, and colors with n tiles come first. Then place the tiles
in the square starting with location (0,0), then (1,1),..., (n-1,n-1), (1,0),
(2,1),..., (n-1,n-2), (0,n-1), (2,0), (3,1),..., (1,n-1), and so forth. The
colors with n tiles, if any, are put on complete diagonals. Every other color
has at most n-1 tiles, and any n-1 consecutive entries in this ordering occur
in n-1 different rows and n-1 different columns.
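The diagonal construction can be sketched in Python as below. The input is assumed to be a list of n^2 color labels with n at least 1; the function names and the small validity checker are illustrative assumptions.

```python
import math
from collections import Counter


def diagonal_positions(n):
    """Cells in the order (0,0), (1,1), ..., then (1,0), (2,1), ...,
    one wrapped diagonal at a time."""
    return [((k + j) % n, j) for k in range(n) for j in range(n)]


def tile_square(colors):
    """Return an n x n grid of colors, or None if no valid tiling exists."""
    n = math.isqrt(len(colors))
    if n < 1 or n * n != len(colors):
        raise ValueError("expected a positive square number of tiles")
    counts = Counter(colors)
    if max(counts.values()) > n:
        return None  # pigeonhole: some row would repeat this color
    # Group tiles by color, most frequent first, so that colors with
    # exactly n tiles land on complete diagonals.
    ordered = [c for c, m in counts.most_common() for _ in range(m)]
    grid = [[None] * n for _ in range(n)]
    for (row, col), color in zip(diagonal_positions(n), ordered):
        grid[row][col] = color
    return grid


def is_valid(grid):
    """Check that no color repeats in any row or any column."""
    n = len(grid)
    rows_ok = all(len(set(row)) == n for row in grid)
    cols_ok = all(len({grid[r][c] for r in range(n)}) == n for c in range(n))
    return rows_ok and cols_ok


if __name__ == "__main__":
    print(tile_square(["a"] * 3 + ["b"] * 2 + ["c"] * 2 + ["d"] * 2))
```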
The RECTANGLE problem is the following variation of BIN-PACKING. We are given a set of n jobs, each with a positive integer size. Let S be the sum of the n sizes. The problem is to determine whether there exist integers b and k such that b > 1, k > 1, bk = S, and the items can be divided into k sets each of which has total size b. (The name RECTANGLE comes from thinking of each job of size s_i as a 1 by s_i rectangle, and asking whether these rectangles can be packed into any single rectangle without wasted space, other than the trivial 1 by S solution.) Prove that the RECTANGLE problem is NP-complete.
We first show that the RECTANGLE problem is in the class NP. If we are given
a division of the jobs into sets of equal size, we only have to verify that
each job appears in exactly one set and that the sets all have the same size
(and that b and k are each greater than 1). This takes O(n) time, and there
exists an arrangement that we will verify if and only if the answer to the
RECTANGLE question is "yes".
We can prove RECTANGLE to be NP-complete by reducing SUBSET-SUM to it,
though there are complications. If we are given a SUBSET-SUM instance of n
items of total size s
and a target t, the natural solution is to make a job for each item and
add two new jobs so that in order to divide the set of jobs in half we must
put original jobs adding to t with one new job and original jobs adding to s-t
with the other.
The complication, though, is that we must make sure there is no
other way to divide the new set of jobs into more than two equal pieces.
For example, if we just did the simplest thing and made our two new jobs have
size t and s-t, it might be possible to divide the jobs into three groups each
of size 2s/3. This would mean that we convert an insoluble SUBSET-SUM
instance into a soluble RECTANGLE instance, making our reduction invalid.
However, we can fix this easily by making the new jobs so large that no
division into more than two sets is possible. For example, if they are size
2s+t and 3s-t, the total size is 6s and we can't possibly have three or more
sets because the new items are too big to fit into a bin of size 2s. Now
we can fit the jobs into two sets if and only if we put the 3s-t job with
original jobs of size t and the other new job with original jobs of size s-t.
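The reduction with the two large jobs can be sketched in Python as follows. The sketch assumes positive integer item sizes and 0 < t < s; the function names are illustrative assumptions, and the brute-force checker looks only at splits into two sets, relying on the argument above that three or more sets are impossible.

```python
from itertools import combinations


def subset_sum_to_rectangle(items, t):
    """Map a SUBSET-SUM instance (items, target t) to RECTANGLE job sizes.

    Adds jobs of size 2s + t and 3s - t, where s is the sum of the items,
    so that the total size becomes 6s.
    """
    s = sum(items)
    if not 0 < t < s:
        raise ValueError("this sketch assumes 0 < t < s")
    return list(items) + [2 * s + t, 3 * s - t]


def splits_in_half(jobs):
    """Brute force: can the jobs be divided into two sets of equal size?"""
    total = sum(jobs)
    if total % 2:
        return False
    half = total // 2
    return any(
        sum(chosen) == half
        for r in range(1, len(jobs))
        for chosen in combinations(jobs, r)
    )


if __name__ == "__main__":
    # {8} sums to 8, so the first instance should split; nothing sums to 4.
    print(splits_in_half(subset_sum_to_rectangle([3, 5, 8], 8)))
    print(splits_in_half(subset_sum_to_rectangle([3, 5, 8], 4)))
```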
It is natural to try to reduce the given general BIN-PACKING problem
to RECTANGLE, and this can be done similarly to the reduction above with the
same complication to address. Given n items of total size s,
a bin size b, and a target number
of bins k, we make a job for each item and then make k new jobs of size z (where
z is to be determined) and
bk - s new jobs of size 1. If we can put the original items into k bins, then
we can add the size-1 jobs to get k sets with exactly b units each in them,
then add a new item of size z to each bin to get k bins with exactly b+z
units in each. Now we want to pick z so that there is no other way to
divide the jobs equally into a number of sets other than k. If we
pick z > bk, we have z > k(z+b)/(k+1) and the new jobs of size z are
too big to fit into k+1 or more sets.
But there's one more problem -- there could be a way to divide the jobs into
fewer than k bins by putting more than one size-z job in the same bin.
This can't happen if k is a prime number, so we can adjust the jobs to fix this
by picking a prime number p with p ≥ k, and adding p-k new jobs of size
z+b. Then any division of the p jobs of size z or z+b into fewer than p bins
would leave a gap too large to be filled with the original jobs (which total to
less than z).
Of course the reduction from SUBSET-SUM or NUMBER-PARTITION works fine, so
we don't need this prime-number trick to solve the problem. I just wanted
to show that the reduction from general BIN-PACKING is possible.
True or false with justification: Let R be a resource to which I want access, and suppose that in any round of a protocol I have at least a 1/n chance of getting R. Further suppose that the events of success in each round are independent. Then my chance of succeeding at least once in the first n rounds is at least 1/2.
TRUE. The chance that I fail n times in a row is at most ((n-1)/n)^n
because we take the product of the probabilities of the
n different failure events and each of those is at most (1 - 1/n). We saw in
lecture that this number ((n-1)/n)^n is less than 1/e for any positive n, and
since 1/e < 1/2 the success probability is more than 1/2.
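A quick numerical check of this bound, as a Python sketch; the name failure_bound is an illustrative assumption, and floating-point evaluation stands in for the exact inequality proved in lecture.

```python
import math


def failure_bound(n):
    """Upper bound ((n-1)/n)^n on the chance of failing n independent rounds."""
    return ((n - 1) / n) ** n


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        # Each value should be below 1/e, so success exceeds 1 - 1/e > 1/2.
        print(n, failure_bound(n), 1 - failure_bound(n))
```

For n = 2 the bound is 1/4, giving a success chance of at least 3/4.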
A disappointing number of you got to the correct answer of "TRUE" by a
completely bogus argument. You said that since you have a 1/n chance of
success in each of the n attempts, your total chance of success is n(1/n) = 1.
You may have been thinking of the Union Bound, which tells us that this
total probability is at most 1. This is true but doesn't help
us for this problem since what we care about is whether it is at least 1/2.
You probably don't really believe that if you flip a fair coin twice, you
are guaranteed to get heads at least once. (If you do believe this, stay
away from Foxwoods.) Note that this is exactly the n=2 case of the
reasoning that most of you used on this problem -- the two trials each have
success probability 1/2 but the chance of at least one success is 3/4, not 1.
Now consider the variant of BIN-PACKING where the bin size is b (not necessarily an integer) and each item has size less than b/3. This version is still NP-complete, though we won't prove this here. Your problem is to give an algorithm that approximates the optimal packing into bins, and prove a bound on the quality of your approximation. In particular, prove that if your algorithm uses 3a+1 bins for some integer a, then the optimal algorithm uses at least 2a+1 bins.
As in Discussion #11, we can use the simple algorithm where we keep one bin
open at a time, look at each item in turn, and put it into the current bin if
and only if it fits. If it doesn't fit, we open a new bin. Every bin except
the last one we use must be filled to more than 2b/3, since we could only close
it if a new item, of size less than b/3, failed to fit.
Thus if our algorithm uses 3a+1 bins, the first 3a bins contain a total size
of more than 3a(2b/3) = 2ab. The optimal algorithm could not possibly fit this
much size into 2a bins of size b, so it must use at least 2a+1 bins as desired.
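The one-open-bin algorithm, often called next-fit, can be sketched in Python as below. The sketch assumes every size is positive and less than b/3; the function name and the list-of-lists output are illustrative assumptions.

```python
def next_fit(sizes, b):
    """Pack items in the given order, keeping a single open bin.

    Assumes 0 < s < b/3 for every size s. Returns the bins as lists of sizes.
    """
    bins = []
    room = 0  # remaining capacity of the open bin; 0 until one is opened
    for s in sizes:
        if s > room:
            # The item does not fit, so close the bin and open a new one.
            bins.append([])
            room = b
        bins[-1].append(s)
        room -= s
    return bins


if __name__ == "__main__":
    packed = next_fit([3] * 10, 10)
    print(len(packed), packed)
```

In this example the algorithm uses 4 = 3a+1 bins with a = 1, each closed bin holds 9 > 2b/3 units, and the total size of 30 forces any packing to use at least 3 = 2a+1 bins.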
Last modified 15 January 2007