CMPSCI 311 Discussion #7: Scheduling With Network Flow

David Mix Barrington

30 October/1 November 2006

Scheduling the classes of a school is far more complicated than the interval scheduling problems we've seen in the course -- in fact we will prove later that even simplified versions of the problem are NP-complete. But network flow can help with some of the difficulties we face. This week we have three problems which you are to reduce to the network flow problem. This means that for each problem you need to describe a directed graph, with a single source s and a single sink t and a positive integer capacity on each edge, such that finding the maximum flow from s to t will solve the problem. You need to argue that your graph correctly does this, and make it clear that your graph can be computed from the given input and has reasonable size. In lecture we've seen that there is a general solution to the network flow problem, so here we will consider ourselves done once we've transformed the problem into a network flow instance.

  1. Perhaps the easiest thing to schedule at our hypothetical high school is the set of phys-ed courses. There are n students, each with a list of three preferred phys-ed classes, and c classes, each with a positive integer capacity. Each student is to take exactly one phys-ed class, and since all the phys-ed classes meet at the same time, there are no time conflicts. Our goal is to assign as many students as possible to one class each, taken from their list of three, such that no class exceeds its capacity.

    Our graph has four layers. We have a single source s, a node for each student, a node for each class, and a single sink t. There is an edge of capacity 1 from s to each student node. There is an edge of capacity 1 from each student node to the class nodes for each of their three preferred classes. Finally, there is an edge from each class node to t, whose capacity is the capacity of the class.

    If we have an integer flow of size z in this graph, the flow out of s can only go to a set of exactly z student nodes. Since each of these student nodes has a flow of 1 into it, it must have a flow of 1 out, on a single edge as the flows are integer-valued. The only way to do this is to have the flow go to one of the class nodes for that student's classes -- we assign the student to that class. This assignment of students to classes gives each class a number of students exactly equal to the flow out of that class' node in the flow. Since the flow meets the capacity constraints, the assignment meets the class capacity constraints.

    We must also show that every valid schedule corresponds to a flow in this graph, whose size is equal to the number of students assigned. We make this flow by putting one unit on the edge from s to x for every student x that is assigned a class, one unit on the edge from x to y whenever student x is assigned to class y, and a flow from class node y to t equal to the number of students assigned to y. This flow is valid because there is one unit in and one out of each node for an assigned student, an equal flow in and out of each class node, and a flow to t that meets the edge capacity constraint.

    Since the assignments are in one-to-one correspondence with the integer flows, with the number of students assigned equaling the size of the flow, the maximum flow found by the network flow algorithm must give us the assignment with the maximum number of students assigned.
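
    The construction for problem 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the names max_flow, assign_phys_ed, prefs, and capacity are hypothetical, and the flow routine is a plain Edmonds-Karp (shortest augmenting path) search standing in for whichever max-flow algorithm was covered in lecture.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map (assumes no antiparallel
    edges). Returns (flow value, {(u, v): flow on edge u->v})."""
    res = defaultdict(lambda: defaultdict(int))   # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    value = 0
    while True:
        parent = {s: None}                        # BFS for an augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][w] for u, w in path)    # bottleneck capacity
        for u, w in path:
            res[u][w] -= push
            res[w][u] += push
        value += push
    flow = {}
    for u in cap:
        for v, c in cap[u].items():
            if c - res[u][v] > 0:
                flow[(u, v)] = c - res[u][v]
    return value, flow

def assign_phys_ed(prefs, capacity):
    """Four-layer graph: s -> students -> classes -> t."""
    cap = defaultdict(dict)
    for student, choices in prefs.items():
        cap["s"][("student", student)] = 1            # one class per student
        for c in choices:
            cap[("student", student)][("class", c)] = 1
    for c, k in capacity.items():
        cap[("class", c)]["t"] = k                    # class capacity
    value, flow = max_flow(cap, "s", "t")
    # Saturated student -> class edges are the assignment.
    assignment = {u[1]: v[1] for (u, v) in flow if u != "s" and v != "t"}
    return value, assignment

# Hypothetical input: four students compete for three one-seat classes.
prefs = {name: ["yoga", "swim", "track"] for name in ["ann", "bob", "cat", "dan"]}
capacity = {"yoga": 1, "swim": 1, "track": 1}
placed, assignment = assign_phys_ed(prefs, capacity)
print(placed, assignment)
```

    In this made-up instance only three of the four students can be placed, since the three classes have one seat each; which student is left out depends on the order in which augmenting paths are found.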

  2. Now we want to assign each student to the set of six classes they will take over the course of a year. We have one or more sections for each course, and each section has a positive integer capacity. Each student has six sets of sections, and needs to be assigned to exactly one section from each set. For example, one set might be {English 1a, English 1b, English 1c} and another might be {Theatre Tech, Basket Weaving, Art Appreciation 2a, Art Appreciation 2b}. You may assume that no section appears on more than one list for the same student. We want to make as many student-to-section assignments as possible (that is, to maximize the total, over all sections, of the number of students placed in that section), such that no section exceeds its capacity and no student ever gets a section not on their lists. Note that we are completely ignoring any time conflicts.

    Our graph has a single node s, seven nodes for each student, one node for each section, and a single node t. We'll let {x0,...,x6} be the set of seven nodes for student x. There is an edge of capacity 6 from s to each node x0, and edges of capacity 1 from each x0 to its corresponding nodes x1, x2,...,x6. Each node xi (other than x0) has an edge of capacity 1 to the node for each of the sections on the i-th list of student x. Finally, each section node has an edge to t, whose capacity is the capacity of that section.

    We must show that valid partial schedules (which assign some of the students to some of their classes on their lists without exceeding any section capacities) are in one-to-one correspondence with the integer flows in this graph. The flow through the graph will equal the sum over all the sections of the number of students assigned to the section -- since this is the quantity that we want to maximize, the maximum flow through the graph will solve our problem.

    If we have a valid partial schedule we create a flow from it as follows. On each edge from s to x0 we put a flow equal to the number of sections to which student x is assigned. For each of these sections, we put a flow of 1 on the edge from x0 to xi, where i is the number of the list on which that section occurs. We then also put a flow of 1 from this xi to the node for that section. The flow from each section node to t is equal to the number of students assigned to that section. This is a valid flow because no edge has a flow exceeding the edge capacities defined above, and because we have assigned an equal flow into and out of each node other than s and t.

    Given an integer flow in the graph we have defined, we may define the corresponding partial schedule as follows. Each student node x0 has a flow of some integer from 0 through 6 from s -- that student will be assigned to a number of sections equal to the size of that flow. Then we must decide which sections the student is assigned to. If the node x0 has a flow of k into it, there must be exactly k different nodes xi with flows of size 1 from x0, and each of these nodes must have a flow of size 1 to some section node. This defines the k sections to which student x is assigned. The structure of the xi nodes means that this partial schedule includes sections from distinct lists for each student, and the capacities of the edges into t force the schedule to obey the capacity constraints on the sections.
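
    A minimal Python sketch of the problem 2 graph, under the same caveats as any illustration: max_flow, assign_sections, lists, and capacity are hypothetical names, and Edmonds-Karp is used only as a convenient stand-in max-flow routine. The node ("stu", x, 0) plays the role of x0 and ("stu", x, i) the role of xi.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map (assumes no antiparallel
    edges). Returns (flow value, {(u, v): flow on edge u->v})."""
    res = defaultdict(lambda: defaultdict(int))
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][w] for u, w in path)
        for u, w in path:
            res[u][w] -= push
            res[w][u] += push
        value += push
    flow = {}
    for u in cap:
        for v, c in cap[u].items():
            if c - res[u][v] > 0:
                flow[(u, v)] = c - res[u][v]
    return value, flow

def assign_sections(lists, capacity):
    """lists[x] holds student x's lists of acceptable sections (six in the
    handout); capacity maps each section name to its seat count."""
    cap = defaultdict(dict)
    for x, wanted in lists.items():
        cap["s"][("stu", x, 0)] = len(wanted)            # capacity 6 into x0
        for i, options in enumerate(wanted, start=1):
            cap[("stu", x, 0)][("stu", x, i)] = 1        # x0 -> xi
            for sec in options:
                cap[("stu", x, i)][("sec", sec)] = 1     # xi -> section
    for sec, k in capacity.items():
        cap[("sec", sec)]["t"] = k
    value, flow = max_flow(cap, "s", "t")
    schedule = defaultdict(dict)                         # schedule[x][i] = section
    for (u, v) in flow:
        if v != "t" and v[0] == "sec":
            schedule[u[1]][u[2]] = v[1]
    return value, schedule

# Hypothetical input: three students, six subjects, two one-seat sections each.
subjects = ["eng", "math", "sci", "hist", "art", "lang"]
lists = {x: [[f"{subj}-a", f"{subj}-b"] for subj in subjects]
         for x in ["ann", "bob", "cat"]}
capacity = {f"{subj}-{k}": 1 for subj in subjects for k in "ab"}
total, schedule = assign_sections(lists, capacity)
print(total)
```

    Each subject here has only two seats for three students, so at most 12 of the 18 requested placements can be made, and the flow value reported is that total rather than a count of fully scheduled students.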

  3. Finally we look at the problem of dividing the class sections for one term among three timeslots A, B, and C subject to two constraints. First, each section has exactly one instructor, each instructor has exactly two sections, and the instructor's two sections cannot be scheduled at the same time. Second, we are given numbers nA, nB, and nC such that we may assign no more than nA sections to timeslot A, no more than nB to timeslot B, and no more than nC to timeslot C.

    The problem is to get an assignment of sections to timeslots that obeys these constraints and assigns as many sections as possible. Note that we are ignoring the possibility that a student gets assigned to two classes at the same time (there was a solution to this latter problem offered in the third Harry Potter book).

    The trick here is to largely ignore the classes and consider the instructors. Each instructor whose two sections are both scheduled falls into one of three categories: those teaching in slots A and B, those in A and C, and those in B and C. The most straightforward way to make a graph is to have global nodes s, A, B, C, and t, and then a node x for each instructor x. The edges are as follows: an edge of capacity 2 from s to each node x, edges of capacity 1 from each x to A, B, and C, and edges of capacity nA, nB, and nC respectively from A, B, and C to t.

    Again we show that we can go from a valid partial schedule to an integer flow in this graph, and vice versa. Given the schedule, we take each instructor x who has courses assigned, assign a flow from s to x equal to the number of classes x has assigned, then assign flows of size 1 from x to the timeslot or timeslots of x's assigned classes. Then the flow from A, B, and C to t will be the number of classes assigned to each timeslot. Each x has the same flow in and out, as do nodes A, B, and C. The edge capacities are all obeyed, because a valid schedule cannot exceed nA classes in timeslot A, etc.

    Given a valid integer flow, we can construct a valid partial schedule. Each instructor x has a flow of 0, 1, or 2 into their node and flows of size 1 out to 0, 1, or 2 of the three timeslots. We arbitrarily assign 0, 1, or 2 of x's two sections to the timeslots that have flow to them from x, one section per timeslot. Then the number of sections assigned to each timeslot will be the same as the amount of flow out of that timeslot's node.

    The correspondence of integer flows and schedules is no longer one-to-one, because different schedules will give rise to the same flow as long as each instructor has the same set of timeslots assigned. But the size of the flow equals the number of sections given timeslots in each case, so maximizing the size of the flow will solve the problem.
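
    The instructor graph for problem 3 can be sketched the same way. The names max_flow, schedule_instructors, instructors, and limits are hypothetical, and Edmonds-Karp is again just an assumed stand-in for the lecture's max-flow algorithm. The sketch returns, for each instructor, the set of timeslots that carry flow; which of the two sections goes in which slot is left arbitrary, as in the argument above.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map (assumes no antiparallel
    edges). Returns (flow value, {(u, v): flow on edge u->v})."""
    res = defaultdict(lambda: defaultdict(int))
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][w] for u, w in path)
        for u, w in path:
            res[u][w] -= push
            res[w][u] += push
        value += push
    flow = {}
    for u in cap:
        for v, c in cap[u].items():
            if c - res[u][v] > 0:
                flow[(u, v)] = c - res[u][v]
    return value, flow

def schedule_instructors(instructors, limits):
    """limits maps each timeslot name to its section limit (nA, nB, nC)."""
    cap = defaultdict(dict)
    for x in instructors:
        cap["s"][("instr", x)] = 2                     # two sections each
        for slot in limits:
            cap[("instr", x)][("slot", slot)] = 1      # at most one per slot
    for slot, n in limits.items():
        cap[("slot", slot)]["t"] = n
    value, flow = max_flow(cap, "s", "t")
    slots = defaultdict(list)                          # slots[x] = timeslots used
    for (u, v) in flow:
        if u != "s" and v != "t":
            slots[u[1]].append(v[1])
    return value, slots

# Hypothetical input: three instructors (six sections) but only five seats.
instructors = ["p", "q", "r"]
limits = {"A": 2, "B": 2, "C": 1}
placed, slots = schedule_instructors(instructors, limits)
print(placed, dict(slots))
```

    With these made-up limits five of the six sections can be placed, for example two instructors in A and B and the third with a single section in C.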

    There is an even easier solution to this problem. Since we have no reason to care which instructors get which timeslot assignments, as long as we know how many are of type AB, type AC, and type BC, all we care about is that the number of classes in each timeslot is at most the given limit. So if we can find nonnegative integers x, y, and z such that x + y ≤ nA, x + z ≤ nB, y + z ≤ nC, x + y + z is no more than the number of instructors, and x + y + z is maximized, then any assignment of x instructors to AB, y instructors to AC, and z instructors to BC will be an optimal solution among schedules that place both sections of every scheduled instructor. These inequalities are a simple, O(1)-size example of a linear program and can be solved by simple algebra or graphing.
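
    For small inputs the three inequalities can simply be checked by enumeration. The sketch below is an assumption-laden illustration, not the algebraic or graphical solution the text has in mind: best_type_counts and the parameter m (the number of instructors, which caps x + y + z) are hypothetical names. It counts only instructors with both sections placed, so it can fall short of the flow-based answer when placing a single section of some instructor would help.

```python
def best_type_counts(m, nA, nB, nC):
    """Try every (x, y, z) with x + y + z <= m, where x, y, z count the
    instructors of type AB, AC, and BC, and keep the largest feasible total."""
    best = (0, 0, 0)
    for x in range(m + 1):
        for y in range(m + 1 - x):
            for z in range(m + 1 - x - y):
                feasible = x + y <= nA and x + z <= nB and y + z <= nC
                if feasible and x + y + z > sum(best):
                    best = (x, y, z)
    return best

# Hypothetical inputs: three instructors under two different sets of limits.
even = best_type_counts(3, 2, 2, 2)
tight = best_type_counts(3, 2, 2, 1)
print(even, tight)
```

    With limits (2, 2, 2) all three instructors fit, one of each type. With limits (2, 2, 1) only two instructors can be fully scheduled, because adding the three inequalities gives 2(x + y + z) ≤ 5.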

Last modified 5 November 2006