CMPSCI 311 Discussion #9: An NP-Complete School Scheduling Problem

David Mix Barrington

13/15 November 2006

In the last two discussions we've looked at various versions of the problem of assigning students to classes and classes to timeslots in a school. Some of these versions could be solved by network flow, and others (as far as we could see) could not. Now that we have introduced NP-completeness in last week's lectures, we can prove that one version of this problem cannot be solved in polynomial time, unless P = NP.

The version is this: We have n students and c classes. Each student has two disjoint lists of classes and must take exactly one class from each list. Each class has a positive integer capacity, the maximum number of students that may be assigned to it. Finally, each class must be assigned to one of two timeslots, and no student may take two classes in the same timeslot.

The known NP-complete problem we will reduce from is CNF-SAT, where the input is a set of m clauses, each an OR of one or more literals on n boolean variables. The output is a boolean telling whether there exists a setting of the variables making all the clauses true.

  1. We have not actually proved CNF-SAT to be NP-complete in lecture. We have proved two similar problems to be NP-complete: CIRCUIT-SAT, where the AND of clauses is replaced by an arbitrary boolean circuit, and 3-SAT, which is the special case of CNF-SAT where each clause has exactly three literals. From the NP-completeness of one of these two problems it is easy to prove that CNF-SAT is NP-complete -- give a careful argument for this.

    To show CNF-SAT to be NP-complete, we must show that CNF-SAT is in NP and then show X ≤p CNF-SAT, where X is some known NP-complete problem. The first task is easy: a formula φ is in CNF-SAT if and only if a setting of the variables exists making φ true, and such a setting serves as the certificate. Given the setting (which is only n bits) and φ, we can evaluate φ on the setting in time polynomial in the length of φ. Hence by the definition of NP, CNF-SAT is in NP.
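
    The polynomial-time check can be sketched in a few lines. This is only an illustration: the signed-integer encoding of literals (+i for xi, -i for NOT xi) and the function name are assumptions, not notation from lecture.

```python
def satisfies(clauses, setting):
    """Return True if `setting` makes every clause true.

    clauses: list of clauses, each a list of nonzero ints
             (+i stands for x_i, -i for NOT x_i; an assumed encoding)
    setting: dict mapping each variable index i to a bool
    """
    # One pass over the formula: linear in its length.
    return all(
        any(setting[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
phi = [[1, -2], [2, 3], [-1, -3]]
print(satisfies(phi, {1: True, 2: True, 3: False}))
print(satisfies(phi, {1: True, 2: False, 3: True}))
```

    The first setting satisfies all three clauses; the second falsifies the last one.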

    It is possible to reduce CIRCUIT-SAT to CNF-SAT, following the reduction from CIRCUIT-SAT to 3-SAT given in lecture, but it is far easier to reduce 3-SAT to CNF-SAT. The reduction function is the identity function -- if φ is a properly formed 3-CNF formula we simply let f(φ) = φ. The identity is certainly computable in polynomial time. We know that f(φ) is a valid CNF formula (since it is in 3-CNF), and of course it is satisfiable if and only if φ is. By the definition of reductions, we have shown 3-SAT ≤p CNF-SAT, so CNF-SAT is NP-hard, and since it is also in NP it is NP-complete.

  2. Explain briefly why our scheduling problem is in the class NP.

    The input to our problem is a list of students, each with two lists of classes, and a list of classes with capacities. The language of our problem is the set of inputs such that there exists a solution, which is an assignment of students to classes and an assignment of classes to timeslots that meets the given conditions. The solution can be written down in a polynomial number of bits, and we can check the conditions in polynomial time: We need only check that each student is assigned one class from each of her lists, that each class is given a timeslot, that no student gets two classes in the same timeslot, and that no class gets a number of students greater than its capacity.
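
    The four checks listed above can be written out directly. The sketch below assumes one particular representation (dictionaries keyed by student and class names, timeslots numbered 0 and 1); the function name and data layout are hypothetical.

```python
from collections import Counter


def check_schedule(students, capacity, choice, slot):
    """Verify a proposed solution in time polynomial in the input size.

    students: dict student -> (list_a, list_b), two disjoint sets of class names
    capacity: dict class -> positive int (assumed to cover every listed class)
    choice:   dict student -> (class taken from list_a, class taken from list_b)
    slot:     dict class -> 0 or 1
    """
    # Every class must be given one of the two timeslots.
    if any(slot.get(c) not in (0, 1) for c in capacity):
        return False
    load = Counter()
    for name, (list_a, list_b) in students.items():
        if name not in choice:
            return False
        a, b = choice[name]
        # One class from each of the student's lists.
        if a not in list_a or b not in list_b:
            return False
        # No student takes two classes in the same timeslot.
        if slot[a] == slot[b]:
            return False
        load[a] += 1
        load[b] += 1
    # No class exceeds its capacity.
    return all(load[c] <= capacity[c] for c in load)


students = {"s0": ({"T"}, {"F"})}
capacity = {"T": 1, "F": 1}
choice = {"s0": ("T", "F")}
print(check_schedule(students, capacity, choice, {"T": 1, "F": 0}))
print(check_schedule(students, capacity, choice, {"T": 1, "F": 1}))
```

    The second call fails because both of the student's classes land in the same timeslot.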

  3. The remaining problem is to reduce CNF-SAT to our scheduling problem. Given a set of m clauses on n variables, we need a set of students, a set of classes, two class lists for each student, and a capacity for each class, such that the schedule is feasible if and only if the clauses are satisfiable. The basic idea is natural -- have a class for each variable, and put it in one timeslot if that variable is set to true and in the other if it is set to false. Then we need to arrange students and student class lists to make an assignment of classes to timeslots work if and only if the setting of the variables satisfies all the clauses. See whether you can work this out -- remember that you may introduce additional students and classes if it helps you.

    First note that we need to designate the two timeslots as "true" and "false" -- we do this by making two special classes T and F and a student s0 whose two lists are {T} and {F}. Then T and F must go in different slots, and we can call these "true" and "false" respectively.

    Now we make a class ci for each variable xi, that will go to slot "true" if the variable is set to true and to slot "false" otherwise. We now need to design students and lists to force all the clauses to be satisfied. It's easy to handle clauses of either all positive or all negative literals. If the clause is "x1 OR x2", for example, we make a student whose two lists are {c1, c2} and {F}. This student can be scheduled if either c1 or c2 (or both) is in timeslot "true", and not otherwise.
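
    A quick enumeration confirms the behavior of this clause student on all four settings of x1 and x2. The sketch names the two timeslots by booleans, with T in slot True and F in slot False. The text gives only the all-positive example; the second list {T} used for the all-negative clause is the symmetric choice and is an assumption here, as are the function names.

```python
from itertools import product


def can_schedule(slot, list_a, list_b):
    """One student alone: is there a class on each list, in different slots?"""
    return any(slot[a] != slot[b] for a in list_a for b in list_b)


def clause_student_ok(x1, x2, list_a, list_b):
    # Each c_i sits in the slot given by the value of x_i.
    slot = {"T": True, "F": False, "c1": x1, "c2": x2}
    return can_schedule(slot, list_a, list_b)


for x1, x2 in product([False, True], repeat=2):
    positive = clause_student_ok(x1, x2, ["c1", "c2"], ["F"])  # x1 OR x2
    negative = clause_student_ok(x1, x2, ["c1", "c2"], ["T"])  # NOT x1 OR NOT x2
    print(f"x1={x1}, x2={x2}: positive clause {positive}, negative clause {negative}")
```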

    The problem comes with a clause like "x1 OR NOT x2", which has both positive and negative literals. We might try giving the student the lists {c1, T} and {c2, F}, but this won't do because we could schedule this student with the classes T and F even if the clause is not satisfied.
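
    To see the failure concretely, the sketch below lists, for each setting, which pairs of classes this student could take. Timeslots are again named by booleans, and the names `LIST_A`, `LIST_B`, and `usable_pairs` are hypothetical.

```python
from itertools import product

LIST_A = ["c1", "T"]  # the positive literal x1, plus T
LIST_B = ["c2", "F"]  # the negative literal NOT x2, plus F


def usable_pairs(x1, x2):
    """Pairs (a, b), one class from each list, lying in different timeslots."""
    # T is in slot True, F in slot False, and c_i is in the slot named by x_i.
    slot = {"T": True, "F": False, "c1": x1, "c2": x2}
    return [(a, b) for a in LIST_A for b in LIST_B if slot[a] != slot[b]]


for x1, x2 in product([False, True], repeat=2):
    clause = x1 or not x2
    print(f"x1={x1}, x2={x2}: clause={clause}, pairs={usable_pairs(x1, x2)}")
```

    The pair (T, F) is usable under every setting, including x1 false and x2 true, where the clause is false. Under that same setting the pair (c1, c2) is usable as well, since c1 and c2 are then in different timeslots.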

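    Removing T and F from these lists, or guarding them with extra classes of small capacity, does not repair this, because the real trouble lies elsewhere. As long as the positive literals are on one list and the negative literals on the other, the student can take c1 from the first list and c2 from the second whenever those two classes are in different timeslots. That happens when x1 is false and x2 is true, which is exactly the setting that makes "x1 OR NOT x2" false. So all the literals of a clause must sit on the same list.

    To arrange this, we make a second class di for each variable xi, standing for the literal NOT xi, together with a student whose two lists are {ci} and {di}. This student forces ci and di into different timeslots, so di is in slot "true" exactly when xi is set to false. Now every clause can be handled like an all-positive one. The student for clause j has one list containing ci for each positive literal xi and di for each negative literal NOT xi in the clause, and has {F} as her other list. For "x1 OR NOT x2" the lists are {c1, d2} and {F}. The two lists are disjoint, and since she must take F, she can be scheduled if and only if some class on her first list is in slot "true". The capacities are not needed for this construction, so we give every class capacity n + m + 1, the total number of students, and no capacity constraint can ever be violated.

    We should check carefully that all the students can be scheduled if and only if the clauses are satisfiable. If a setting satisfies every clause, put T in one timeslot and F in the other, put ci with T if xi is true and with F otherwise, and put di in the slot opposite ci. Student s0 takes T and F, the student for variable i takes ci and di, and the student for each clause takes F together with the class of some true literal of that clause, which is in slot "true". Conversely, suppose a feasible schedule exists. Student s0 forces T and F into different slots, and we set xi to true exactly when ci shares a slot with T. Each clause student takes F and a class from her first list in the other slot. If that class is ci then xi is true, and if it is di then ci is in slot "false" and xi is false, so in either case a literal of the clause is true. The construction has 2n + 2 classes and n + m + 1 students, so it can be built in polynomial time.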
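
    As a sanity check on tiny inputs, the sketch below compares brute-force schedule feasibility with brute-force satisfiability. It assumes an encoding in which every literal has its own class: ci for xi and a complement class di for NOT xi, kept in opposite timeslots by a student whose lists are {ci} and {di}. Each clause then gets one student with lists {its literal classes} and {F}, and every capacity is set to the number of students. The function names and the signed-integer encoding of literals are hypothetical, and the search is exponential, so it is only meant for very small formulas.

```python
from collections import Counter
from itertools import product


def reduce_cnf(clauses, n):
    """Build (students, capacity) from clauses over variables 1..n.

    Literals are nonzero ints: +i is x_i, -i is NOT x_i (assumed encoding).
    """
    def cls(lit):
        return f"c{lit}" if lit > 0 else f"d{-lit}"

    students = [({"T"}, {"F"})]                                   # names the slots
    students += [({f"c{i}"}, {f"d{i}"}) for i in range(1, n + 1)]  # c_i opposite d_i
    students += [({cls(l) for l in clause}, {"F"}) for clause in clauses]
    classes = ["T", "F"] + [f"{k}{i}" for i in range(1, n + 1) for k in "cd"]
    capacity = {c: len(students) for c in classes}  # large enough never to bind
    return students, capacity


def feasible(students, capacity):
    """Try every timeslot assignment and every choice of classes."""
    classes = list(capacity)
    for bits in product([0, 1], repeat=len(classes)):
        slot = dict(zip(classes, bits))
        options = [[(a, b) for a in la for b in lb if slot[a] != slot[b]]
                   for la, lb in students]
        for picks in product(*options):  # empty if any student has no option
            load = Counter(c for pair in picks for c in pair)
            if all(load[c] <= capacity[c] for c in load):
                return True
    return False


def satisfiable(clauses, n):
    return any(
        all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses)
        for bits in product([False, True], repeat=n)
    )


examples = [
    ([[1, -2]], 2),
    ([[1, -2], [2]], 2),
    ([[1], [-1]], 1),
    ([[1, -2], [-1, 2], [1, 2], [-1, -2]], 2),
]
for clauses, n in examples:
    print(clauses, satisfiable(clauses, n), feasible(*reduce_cnf(clauses, n)))
```

    On each of these formulas the two answers agree, as the reduction requires.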

Last modified 26 November 2006