In the last two discussions we've looked at various versions of the problem of assigning students to classes and classes to timeslots in a school. Some of these versions could be solved by network flow, and others (as far as we could see) could not. Now that we have introduced NP-completeness in last week's lectures, we can prove that one version of this problem cannot be solved in polynomial time, unless P = NP.
The version is this: We have n students and c classes. Each student has two disjoint lists of classes and must take exactly one class from each list. Each class has a positive integer capacity, the maximum number of students that may be assigned to it. Finally, each class must be assigned to one of two timeslots, and no student may take two classes in the same timeslot.
The known NP-complete problem we will reduce from is CNF-SAT, where the input is a set of m clauses, each an OR of one or more literals on n boolean variables. The output is a boolean telling whether there exists a setting of the variables making all the clauses true.
To show CNF-SAT to be NP-complete, we must show that CNF-SAT is in NP and then
show X ≤p CNF-SAT, where X is some known NP-complete problem. The
first task is easy because a formula φ is in CNF-SAT if and only if a
setting of the variables exists making φ true. Given such a
setting (which is only n bits) together with φ, we can evaluate φ
on the setting in time polynomial in the length of φ. Hence by the
definition of NP, CNF-SAT is in NP.
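The polynomial-time check can be sketched in a few lines of Python. The encoding here is an assumption made for illustration, not something fixed by the discussion: a literal is a nonzero integer (+i for xi, -i for NOT xi), a clause is a list of literals, and a formula is a list of clauses.

```python
# Hypothetical encoding: +i means x_i, -i means NOT x_i.
def satisfies(formula, setting):
    """Return True if `setting` (dict: variable index -> bool) makes every clause true."""
    def literal_true(lit):
        value = setting[abs(lit)]
        return value if lit > 0 else not value
    # Each literal is inspected at most once, so the running time is
    # linear in the length of the formula.
    return all(any(literal_true(lit) for lit in clause) for clause in formula)

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
phi = [[1, -2], [2, 3], [-1, -3]]
good = satisfies(phi, {1: True, 2: True, 3: False})
bad = satisfies(phi, {1: True, 2: False, 3: True})
print(good, bad)
```

The setting plays the role of the certificate: the verifier never searches for it, it only checks the one it is handed.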
It is possible to reduce CIRCUIT-SAT to CNF-SAT, following the reduction from
CIRCUIT-SAT to 3-SAT given in lecture, but it is far easier to reduce 3-SAT to
CNF-SAT. The reduction function is the identity function -- if φ
is a properly formed 3-CNF formula we simply let f(φ) = φ. Then we
know that f(φ) is a valid CNF formula (since it is in 3-CNF), and of course
it is satisfiable if and only if φ is. By the definition of reductions, we
have shown 3-SAT ≤p CNF-SAT and thus that CNF-SAT is NP-complete.
Explain briefly why our schedule problem is in the class NP.
The input to our problem is a list of students, each with two lists of classes, and a list of classes with capacities. The language of our problem is the set of inputs such that there exists a solution, which is an assignment of students to classes and an assignment of classes to timeslots that meets the given conditions. The solution can be written down in a polynomial number of bits, and we can check the conditions in polynomial time: We need only check that each student is assigned one class from each of her lists, that each class is given a timeslot, that no student gets two classes in the same timeslot, and that no class gets a number of students greater than its capacity.
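As a rough sketch of that verifier, assuming a dictionary-based encoding of our own choosing (the names students, capacity, choice, and slot are hypothetical, and timeslots are written 0 and 1), the four checks look like this:

```python
def check_schedule(students, capacity, choice, slot):
    """Verify a proposed solution.

    students: name -> (list1, list2), the student's two lists of classes
    capacity: class -> maximum number of students
    choice:   name -> (class taken from list1, class taken from list2)
    slot:     class -> 0 or 1
    Assumes every class on a student's list appears in `capacity`.
    """
    # Every class must be given one of the two timeslots.
    if any(slot.get(c) not in (0, 1) for c in capacity):
        return False
    load = {c: 0 for c in capacity}
    for name, (list1, list2) in students.items():
        if name not in choice:
            return False
        a, b = choice[name]
        if a not in list1 or b not in list2:
            return False          # must take one class from each list
        if slot[a] == slot[b]:
            return False          # two classes in the same timeslot
        load[a] += 1
        load[b] += 1
    # No class may exceed its capacity.
    return all(load[c] <= capacity[c] for c in capacity)

students = {"ann": (["math"], ["art"]), "bob": (["math"], ["art"])}
slot = {"math": 0, "art": 1}
choice = {"ann": ("math", "art"), "bob": ("math", "art")}
ok = check_schedule(students, {"math": 2, "art": 2}, choice, slot)
too_full = check_schedule(students, {"math": 1, "art": 2}, choice, slot)
print(ok, too_full)
```

Each check is a single pass over the students or the classes, so the whole verification is polynomial in the size of the input.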
First note that we need to designate the two timeslots as "true" and "false" --
we do this by making two special classes T and F and a student s0
whose two lists are {T} and {F}. Then T and F must go in different slots, and
we can call these "true" and "false" respectively.
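Now we make a class ci for each variable xi;
it will go to slot "true" if the variable
is set to true and to slot "false" otherwise. We give T, F, and each
ci a capacity large enough never to bind (the total number of
students will do). We now need to design students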
and lists to force all the clauses to be satisfied. It's easy to handle
clauses of either all positive or all negative literals. If the clause is
"x1 OR x2", for example, we make a student whose two
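lists are {c1, c2} and {F}. This student can be scheduled
if either c1 or c2 (or both) is in timeslot "true", and
not otherwise. Symmetrically, for "NOT x1 OR NOT x2" the
lists are {c1, c2} and {T}, and the student can be scheduled
exactly when c1 or c2 is in timeslot "false".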
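The claim about the clause "x1 OR x2" is small enough to check by brute force. The sketch below is only an illustration: the helper schedulable and the encoding (slot 1 for "true", slot 0 for "false", capacities set to 2 so they never bind) are assumptions of ours. Fixing T in slot 1 and F in slot 0 loses nothing, since the two timeslots are interchangeable.

```python
from itertools import product

def schedulable(students, capacity, slot):
    """Brute force: under a FIXED slot map, can every student pick one class
    from each list, in different timeslots, without exceeding capacities?"""
    names = list(students)
    options = [[(a, b) for a in students[n][0] for b in students[n][1]
                if slot[a] != slot[b]] for n in names]
    for picks in product(*options):
        load = {}
        for a, b in picks:
            load[a] = load.get(a, 0) + 1
            load[b] = load.get(b, 0) + 1
        if all(load[c] <= capacity[c] for c in load):
            return True
    return False

students = {
    "s0":     (["T"], ["F"]),          # forces T and F into different slots
    "clause": (["c1", "c2"], ["F"]),   # student for "x1 OR x2"
}
capacity = {"T": 2, "F": 2, "c1": 2, "c2": 2}

results = {}
for x1, x2 in product([False, True], repeat=2):
    slot = {"T": 1, "F": 0, "c1": int(x1), "c2": int(x2)}
    results[(x1, x2)] = schedulable(students, capacity, slot)
print(results)   # only the setting (False, False) fails
```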
The problem comes with a clause like "x1 OR NOT x2",
which has both positive and negative literals. We might try giving the
student the lists {c1, T} and {c2, F}, but this won't do
because we could schedule this student with the classes T and F even if the
clause is not satisfied.
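To see the failure concretely, the following sketch lists which pairs of classes this student could legally take under each setting. The encoding (slot 1 for "true", slot 0 for "false") and the helper legal_pairs are illustrative assumptions.

```python
from itertools import product

list1, list2 = ["c1", "T"], ["c2", "F"]   # attempted lists for "x1 OR NOT x2"

def legal_pairs(x1, x2):
    """Pairs (one class per list) that fall in different timeslots."""
    slot = {"T": 1, "F": 0, "c1": int(x1), "c2": int(x2)}
    return [(a, b) for a in list1 for b in list2 if slot[a] != slot[b]]

# The clause is false exactly when x1 is false and x2 is true.
unsatisfied = legal_pairs(False, True)
always_ok = all(("T", "F") in legal_pairs(x1, x2)
                for x1, x2 in product([False, True], repeat=2))
print(unsatisfied, always_ok)
```

The pair (T, F) is legal under every setting, so this student places no constraint on the variables at all.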
But we haven't yet used the capacities of the courses. If the student for
clause j comes with her own courses Tj and Fj, we can
make both these courses have small capacity and then use other students to make
sure Tj goes to "true", Fj goes to "false", and the
original student goes to only one of the two. Here's how we do it -- we make
Tj, Fj, and another course Zj each have
capacity 1 and have a total of three students for the clause. The main
student, as above, has one list of classes for all the positive literals plus
Tj, and the other list of classes for all the negative literals
plus Fj. The second student has lists {Tj, Zj}
and {F}, and the third student has lists {T} and {Fj, Zj}.
We should check carefully that these three students can all be scheduled
if and only if the setting satisfies the clause. If a positive literal's
class is put in timeslot "true", we give the first student that literal and
Fj, the second student Tj and F, and the third student
T and Zj. If a negative literal's class is in timeslot "false", we
give the first student Tj and that literal, the second student
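Zj and F, and the third student T and Fj. But if the
first student is left with only Tj
and Fj, we need Zj for both the other
two students, which is impossible as Zj has capacity 1.
For this to cover every unsatisfied clause, the first student must have no
other legal pair. A clause with both kinds of literals needs extra care here:
a positive literal's class in timeslot "false" and a negative literal's class
in timeslot "true" are in different timeslots, so the lists as given would
let the first student take that pair.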
(It would be simpler if we could put Zj on both of the
first student's lists, but it was stated that those lists had to be disjoint.)
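The individual steps of this check can be replayed mechanically for the clause "x1 OR NOT x2". The sketch below is an illustration under our own encoding (slot 1 for "true", slot 0 for "false", capacity 4 for the classes that are not meant to bind). It confirms that the two schedules described above are valid and that, once the main student holds both Tj and Fj, the other two students cannot both be placed. It checks only those steps, not the full if-and-only-if claim.

```python
from itertools import product

def valid(students, capacity, choice, slot):
    """Check one proposed schedule against lists, timeslots, and capacities."""
    load = {}
    for name, (list1, list2) in students.items():
        a, b = choice[name]
        if a not in list1 or b not in list2 or slot[a] == slot[b]:
            return False
        load[a] = load.get(a, 0) + 1
        load[b] = load.get(b, 0) + 1
    return all(load[c] <= capacity[c] for c in load)

students = {
    "s0":     (["T"], ["F"]),
    "main":   (["c1", "Tj"], ["c2", "Fj"]),   # positives + Tj, negatives + Fj
    "second": (["Tj", "Zj"], ["F"]),
    "third":  (["T"], ["Fj", "Zj"]),
}
capacity = {"T": 4, "F": 4, "c1": 4, "c2": 4, "Tj": 1, "Fj": 1, "Zj": 1}

# Positive literal true (c1 in slot 1): main takes c1 and Fj.
case_a = valid(students, capacity,
               {"s0": ("T", "F"), "main": ("c1", "Fj"),
                "second": ("Tj", "F"), "third": ("T", "Zj")},
               {"T": 1, "F": 0, "c1": 1, "c2": 1, "Tj": 1, "Fj": 0, "Zj": 0})

# Negative literal's class in "false" (c2 in slot 0): main takes Tj and c2.
case_b = valid(students, capacity,
               {"s0": ("T", "F"), "main": ("Tj", "c2"),
                "second": ("Zj", "F"), "third": ("T", "Fj")},
               {"T": 1, "F": 0, "c1": 0, "c2": 0, "Tj": 1, "Fj": 0, "Zj": 1})

def fallback_possible():
    """With main fixed to (Tj, Fj), try every slot choice for Tj, Fj, Zj and
    every choice for the other two students."""
    for tj, fj, zj in product([0, 1], repeat=3):
        slot = {"T": 1, "F": 0, "c1": 0, "c2": 1, "Tj": tj, "Fj": fj, "Zj": zj}
        for second, third in product(["Tj", "Zj"], ["Fj", "Zj"]):
            choice = {"s0": ("T", "F"), "main": ("Tj", "Fj"),
                      "second": (second, "F"), "third": ("T", third)}
            if valid(students, capacity, choice, slot):
                return True
    return False

print(case_a, case_b, fallback_possible())
```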
Last modified 26 November 2006