Today we consider an optimization problem called BIN-PACKING which is similar to the LOAD-BALANCING problem from Section 11.1 of KT. In BIN-PACKING our input is a set of objects, each with a given size, and a bin size that is at least as big as any object size. Our goal is to partition the objects into as few subsets as possible, such that the total size of the objects in each subset is at most the bin size. The optimization problem is to find the minimum number of bins possible, and the decision problem takes a positive integer k as additional input and asks whether it can be done with k bins.
In Problem 8.26 on HW #5, you proved that the NUMBER-PARTITIONING problem is NP-complete, where you are given a set of objects with positive integer sizes and ask whether they can be partitioned into two subsets of equal total size. This is simply a special case of the decision version of BIN-PACKING: Given a set of objects of total size S, we add a bin size of S/2 and a k of 2, and we have a BIN-PACKING instance where the packing is possible if and only if the partition is possible. We have thus reduced a known NP-complete problem to BIN-PACKING. The only remaining task is to show that BIN-PACKING is in the class NP. This is easy because a division of the objects into bins can be specified by a string of polynomial length, and in polynomial time we can check that each item occurs in exactly one subset and that each subset meets the condition of having its total size less than or equal to the bin size.
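As a minimal sketch of the two pieces of this argument, the Python below checks a proposed packing in polynomial time and builds the BIN-PACKING instance from a NUMBER-PARTITIONING instance. The function names and the encoding of a certificate as a list of lists of object indices are assumptions made for illustration; they are not part of the original problem statement.

```python
from fractions import Fraction

def verify_packing(sizes, bin_size, k, bins):
    """Check a certificate: `bins` is a list of lists of object indices."""
    if len(bins) > k:
        return False
    # Every object must appear in exactly one bin.
    used = sorted(i for b in bins for i in b)
    if used != list(range(len(sizes))):
        return False
    # No bin may exceed the bin size.
    return all(sum(sizes[i] for i in b) <= bin_size for b in bins)

def partition_to_bin_packing(sizes):
    """Map a NUMBER-PARTITIONING instance to (sizes, bin_size, k)."""
    total = sum(sizes)
    return sizes, Fraction(total, 2), 2
```

For example, the sizes 3, 1, 1, 2, 2, 1 have total 10, so the reduction produces bin size 5 and k = 2, and the certificate [[0, 3], [1, 2, 4, 5]] passes the check.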
Now consider the special case of BIN-PACKING where the bin size is 1 and the object sizes are all greater than 1/3. Describe a poly-time algorithm to find the exact optimal number of bins in this case. (Hint: Reduce to a problem from earlier in the course, which is solved by one of our standard methods.)
First note that no bin can contain more than two objects, since any three
objects have total size greater than 1. Also, no two objects of size more
than 1/2 can share a bin, so each of these (call them large objects) must go
in a separate bin. If we have n objects, k of them large, the only question is
how many of the n-k other (small) objects we can put into bins that already
contain a large object. Any remaining small objects can be packed two to a
bin, since each has size at most 1/2. So if we match m small objects, we use
k + ceiling((n - k - m)/2) bins. Our goal then is to match as many small
objects as possible to large ones.
The reduction I had in mind was to MAXIMUM BIPARTITE MATCHING, which we know is
solvable in polynomial time by reduction to NETWORK-FLOW. We make a graph
with a vertex for each large object on the left, a vertex for each small
object on the right, and an edge between the vertices for any large object and
small object that fit together into one bin. A maximum bipartite matching
in this graph gives us the largest possible m and thus the optimal number of
bins.
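The reduction can be sketched in Python as follows. This sketch finds the maximum matching with a simple augmenting-path search rather than an explicit NETWORK-FLOW computation, which is a substitution made here for brevity; the function name `optimal_bins` is hypothetical. It assumes bin size 1 and that every size is greater than 1/3, and it is best called with exact `Fraction` sizes to avoid rounding at the boundary.

```python
from fractions import Fraction

def optimal_bins(sizes):
    """Optimal bin count for bin size 1 when every size exceeds 1/3."""
    half = Fraction(1, 2)
    large = [s for s in sizes if s > half]
    small = [s for s in sizes if s <= half]
    # adj[i] lists the small objects that fit in a bin with large object i.
    adj = [[j for j, t in enumerate(small) if s + t <= 1] for s in large]
    match_of_small = [None] * len(small)  # large object matched to each small one

    def try_augment(i, visited):
        """Try to match large object i, rematching others if necessary."""
        for j in adj[i]:
            if j in visited:
                continue
            visited.add(j)
            if match_of_small[j] is None or try_augment(match_of_small[j], visited):
                match_of_small[j] = i
                return True
        return False

    m = sum(try_augment(i, set()) for i in range(len(large)))
    n, k = len(sizes), len(large)
    # k bins for the large objects, then the unmatched small ones two to a bin.
    return k + (n - k - m + 1) // 2
```

On the sizes 3/5, 7/10, 2/5, 2/5, 9/20, only the object of size 3/5 can be paired with a small object, so m = 1 and the count is 2 + ceiling(2/2) = 3 bins.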
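Most of you in discussion applied a greedy algorithm to find the maximum
matching, first sorting the objects by size and then taking each object in
descending order and pairing it with the smallest remaining object that fits
with it, or giving it a bin by itself if no remaining object fits.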
The only problem with this solution is that we have to prove that it is correct,
meaning that there is no way to match more pairs than this algorithm does.
Suppose the optimal algorithm forms m pairs (a,a'), (b,b'), ..., (z,z'), where
we list the larger object first in each pair and put the pairs in descending
order of size of the larger object. Note that we can rearrange the smaller
elements so that they are in ascending order of size, using an exchange
argument: if (b,b') and (c,c') are both valid pairs with b >= c and b' > c',
then (b,c') and (c,b') are both valid pairs because each has total size no
larger than that of (b,b'). Now replace a', b', ..., z' with the m smallest
objects in the set, in ascending order. We still have m valid pairs, because
the i'th smallest object overall is no bigger than the i'th smallest of
a', b', ..., z'.
We want to show
that our greedy algorithm found at least as many matches as the optimal
algorithm did. Each of the items a, b, ..., z will be matched by the greedy
algorithm unless it finds another match first. The greedy algorithm's matches
will use the m smallest items in order. It cannot fail to find these matches,
because if the item d, for example, matches the fourth smallest item, then
the greedy algorithm will find its fourth match at or before the time it reaches
d.
We could replace the word "smallest" in the greedy rule by "largest", which
would make more sense as a greedy algorithm because we are filling the current
bin as full as possible. This also works, because whenever we make a match we
leave a situation that is at least as good as the one we would have by making
the smallest match: for each i the i'th smallest item remaining is no bigger
than the i'th smallest item in the other list. So the "largest" version finds
at least as many matches as the "smallest" version, which we just showed is
optimal. This implies that both greedy algorithms find the same number of
matches, though of course they don't find the same matches.
for each object X in downward order
    if there is an object that fits with X
        put X in a bin with the smallest remaining such object
    else put X in a bin by itself
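A runnable Python sketch of this greedy rule follows, with a parameter to switch between the "smallest" and "largest" variants. The names `greedy_pairs`, `greedy_bins`, and `pick` are hypothetical, bin size 1 is assumed, and the bin count relies on every size being greater than 1/3 so that no bin holds more than two objects.

```python
from fractions import Fraction

def greedy_pairs(sizes, pick="smallest"):
    """Count the pairs the greedy rule forms (bin size 1)."""
    remaining = sorted(sizes)  # ascending order
    pairs = 0
    while remaining:
        x = remaining.pop()  # largest object not yet placed
        # Indices of remaining objects that fit with x, in ascending size.
        fits = [i for i, t in enumerate(remaining) if x + t <= 1]
        if fits:
            i = fits[0] if pick == "smallest" else fits[-1]
            remaining.pop(i)
            pairs += 1
        # Otherwise x goes in a bin by itself.
    return pairs

def greedy_bins(sizes, pick="smallest"):
    """Bins used: each pair saves one bin relative to one object per bin."""
    return len(sizes) - greedy_pairs(sizes, pick)
```

On the sizes 3/5, 7/10, 2/5, 2/5, 9/20, both variants form two pairs (3/5 with 2/5, then 9/20 with 2/5) and so use 3 bins.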
The key observation is that when we start a new bin, the total size of the old bin's contents plus the new object must be greater than the bin size (which we'll call "1" as we can use whatever units we like). This means that the first two bins contain more than 1 unit of size between them, that the third and fourth bins contain more than a unit, and so on through each successive pair of bins. If we use an odd number of bins, say 2k+1, then the first k pairs contain more than one unit each, so the total size is greater than k and the optimal algorithm must use at least k+1 bins, more than half the number we used. If we use an even number, say 2k, the k pairs again contain more than a unit each (for example, we would not have started bin 2k unless the size of the item that opened it plus the contents of bin 2k-1 were more than one), and again the total size is more than k and we need at least k+1 bins.
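The argument above appears to concern the greedy algorithm that keeps adding objects to the current bin and opens a new bin only when the next object does not fit. That reading is an assumption, since the algorithm is not restated here. Under that assumption, a minimal Python sketch is below; the name `next_fit` is chosen for illustration.

```python
from fractions import Fraction

def next_fit(sizes, bin_size=1):
    """Pack objects in the given order, opening a new bin only when forced."""
    bins, current, load = [], [], 0
    for s in sizes:
        if current and load + s > bin_size:
            bins.append(current)  # close the old bin
            current, load = [], 0
        current.append(s)
        load += s
    if current:
        bins.append(current)
    return bins

# Example: consecutive pairs of bins each hold more than one unit in total.
example = next_fit([Fraction(1, 2), Fraction(2, 3), Fraction(1, 3),
                    Fraction(1, 2), Fraction(3, 4)])
```

In the example the four bins hold 1/2, 1, 1/2 and 3/4, so the pairs total 3/2 and 5/4. The overall size is 11/4, which is more than 2, so no packing can use fewer than 3 bins.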
For any positive integer n, let the bin size be n and consider the sequence of 2n objects with sizes n, 1, n, 1, ..., n, 1. The greedy algorithm must put each item in a separate bin and thus uses 2n bins. The optimal algorithm could put each size-n item in its own bin and all n of the size-1 items together in one bin, and thus use only n+1 bins. For any positive ε, we can choose n large enough that 2n/(n+1) > 2 - ε.
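To see how large n must be, the ratio can be rewritten as a short derivation:

```latex
\[
\frac{2n}{n+1} \;=\; 2 - \frac{2}{n+1} \;>\; 2 - \varepsilon
\quad\Longleftrightarrow\quad
\frac{2}{n+1} < \varepsilon
\quad\Longleftrightarrow\quad
n > \frac{2}{\varepsilon} - 1 .
\]
```

For instance, with ε = 0.01 any n of at least 200 works: for n = 200 the ratio is 400/201, which is about 1.99005.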
Last modified 7 December 2006