Lecture 18: You can never have enough logic (by Trek Palmer)
============================================================

Carry-lookahead
---------------

If we consult the truth table for the full adder (or just think hard about
bit-wise addition), we see that the only way we can have a carry is if at
least two 1s flow into the full adder. So, if we want to detect whether or
not a given bit pattern is going to carry, we can construct a logic function
that is true only if two or more of the full-adder inputs are true. If the
inputs to the full adder are Xi, Yi, and Ci (the ith bits of the two inputs,
and the carry from the (i-1)th bit), then, writing ^ for AND and v for OR,

    Ci+1 = (Xi ^ Yi) v (Xi ^ Ci) v (Yi ^ Ci)

By a similar line of reasoning (or a protracted battle with algebra), the sum
for the ith bit is

    Si = Xi XOR Yi XOR Ci

Now for some algebra. If we look at the expression for Ci+1 and factor Ci out
of the last two terms:

    Ci+1 = (Xi ^ Yi) v (Xi ^ Ci) v (Yi ^ Ci)
 -> Ci+1 = (Xi ^ Yi) v (Ci ^ (Xi v Yi))

If we call (Xi ^ Yi) Gi, and (Xi v Yi) Pi, this becomes

    Ci+1 = Gi v (Ci ^ Pi)

If we look at this, we see that Gi ("generate") is 1 if Xi and Yi are both 1,
in which case bit i produces a carry all by itself, and that Pi ("propagate")
is 1 if either Xi or Yi is 1, in which case an input carry will produce an
output carry.

Now, we can iterate this process. Substituting Ci = Gi-1 v (Ci-1 ^ Pi-1)
expresses Ci+1 in terms of Ci-1 (switching to the more compact notation where
+ is OR and juxtaposition is AND):

    Ci+1 = Gi + PiGi-1 + PiPi-1Ci-1

Now here's the witty bit. If you repeat this enough, you can derive an
expression for a carry in terms of the initial input carry, C0. The
expression looks something like:

    Ci+1 = Gi + PiGi-1 + PiPi-1Gi-2 + ... + PiPi-1Pi-2...P1G0 + PiPi-1...P0C0

What this actually means is this: you can determine, a priori, whether or not
a given bit will generate a carry given only the two input values and the
initial input carry! This means that if you implement this function in
circuitry, the nth bit carry can be determined without having to wait for the
(n-1)th full adder to compute it! This means that you can add all the bits of
the inputs in parallel! This means that you can have very fast addition!
Hooray!
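
The expansion above can be checked with a small software model. What follows is only a sketch, in Python (a choice made for this example, not something from the lecture): the function name `carry_lookahead_add` and the `width` parameter are hypothetical. Each carry is computed straight from the G and P terms and C0, never from the previous carry. In hardware all of those expressions would be evaluated at once; here they simply run one after another.

```python
def carry_lookahead_add(x, y, c0=0, width=32):
    """Add two width-bit integers using generate/propagate terms.

    Returns (sum, carry_out). Illustrative model only, not a hardware
    description.
    """
    xs = [(x >> i) & 1 for i in range(width)]
    ys = [(y >> i) & 1 for i in range(width)]
    g = [a & b for a, b in zip(xs, ys)]  # Gi = Xi AND Yi
    p = [a | b for a, b in zip(xs, ys)]  # Pi = Xi OR Yi

    def carry(i):
        """Carry into bit i, written only in terms of G, P and C0.

        Ci = G(i-1) + P(i-1)G(i-2) + ... + P(i-1)...P1 G0 + P(i-1)...P0 C0
        """
        result = 0
        prefix = 1  # running product P(i-1) P(i-2) ... P(j+1)
        for j in range(i - 1, -1, -1):
            result |= prefix & g[j]
            prefix &= p[j]
        return result | (prefix & c0)

    # No carry below depends on another carry, only on g, p and c0.
    carries = [carry(i) for i in range(width + 1)]

    total = 0
    for i in range(width):
        total |= (xs[i] ^ ys[i] ^ carries[i]) << i  # Si = Xi XOR Yi XOR Ci
    return total, carries[width]
```

For instance, `carry_lookahead_add(5, 3, width=4)` returns `(8, 0)`, and `carry_lookahead_add(0b1111, 1, width=4)` returns `(0, 1)`, since the sum wraps around and the carry out is set.
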

An adder that uses this technique is called a carry-lookahead adder. There
are faster, more exotic adders, but this is pretty zippy. Of course, in
science and engineering there's no such thing as a free lunch, so there is a
downside: the carry-lookahead logic is complex, and it requires the input
carry to be propagated to all the other bits. On high-frequency chips, this
propagation distance may be prohibitive, so actual adders in hardware are
much more twisted creations. But, on average, carry-lookahead is pretty cool.

Subtraction
------------

Subtraction is relatively simple. Because systems use 2's complement for
negative representations, in order to subtract B from A you just take the 2's
complement of B and add it to A. The 2's complement of B is ~B + 1. To get
~B, remember that X XOR 1 = ~X (and X XOR 0 = X), so if you put an XOR in
line with each bit of B, one input can be the incoming bit and the other
input can be the 2's complement selector. Feeding the same selector into the
initial carry, C0, supplies the +1.

The only other issue with subtraction is that you need to detect overflow a
little differently. Overflow occurs when the two values fed to the adder have
the same sign but the output has a different sign. This can actually be
detected with Cn XOR Cn-1, the carry out of the high-order bit XORed with the
carry into it. So, to turn a 32-bit add into a 32-bit subtract, you need 33
XOR gates (32 to complement the bits of B, plus one for the overflow check).

Condition codes
----------------

In order to set the condition codes, we need to reexamine them. There are 4
condition codes: N, C, Z, and V. C is easy: it's just the carry out of the
31st bit full adder (the high-order bit, counting from bit 0). V we have
already discussed. N is 1 if the result is negative. Knowing, as we do, that
the numbers are stored in 2's complement form, we know that if the number is
negative the high-order bit is 1. So N is just the sum from the 31st bit full
adder. Last is Z: if we just NOR all the sum bits from all the adders we can
generate Z (because NOR is 1 only if all its inputs are 0). Now you know how
to generate condition codes!

Memory and Flip Flops
----------------------

Although there is considerably more that goes into an ALU, you now have a
fair grasp of the basics.
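
Before moving on, the subtraction and condition-code rules above can be pulled together in a small model. This is only a sketch under a few assumptions: the name `add_sub`, the flag dictionary, and the `width` parameter are hypothetical, and a plain ripple of full adders stands in for whatever adder the hardware really uses. C is reported as the raw carry out, as described above. Some architectures instead invert it on subtraction so that it means "borrow".

```python
def add_sub(a, b, subtract=False, width=32):
    """Add or subtract two width-bit values and compute N, Z, V, C.

    Illustrative model: one XOR per bit of B, selector fed into C0.
    """
    mask = (1 << width) - 1
    sel = 1 if subtract else 0
    a &= mask
    b_in = (b ^ (mask * sel)) & mask  # each bit of B XORed with the selector
    carry = sel                       # selector doubles as C0 (the "+1")
    carry_into_msb = 0
    result = 0
    for i in range(width):
        xi = (a >> i) & 1
        yi = (b_in >> i) & 1
        if i == width - 1:
            carry_into_msb = carry    # Cn-1: carry into the high-order bit
        result |= (xi ^ yi ^ carry) << i
        carry = (xi & yi) | (xi & carry) | (yi & carry)
    flags = {
        "N": (result >> (width - 1)) & 1,  # high-order sum bit
        "Z": int(result == 0),             # NOR of all the sum bits
        "V": carry ^ carry_into_msb,       # Cn XOR Cn-1
        "C": carry,                        # carry out of the high-order bit
    }
    return result, flags
```

With `width=8`, `add_sub(127, 1)` gives the result 128 with N and V set, because adding two positive numbers produced a negative one. `add_sub(5, 5, subtract=True, width=8)` gives the result 0 with Z set.
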
In addition to multiply and division circuits, there is the issue of
selecting the operation, routing the inputs to the appropriate units, and
routing the correct output back out. This is a lot of combinatorial logic
which you don't really need to know (but still should learn because it's good
for your soul :).

The big, untouched area of functionality is memory. Note that none of the
circuits we have looked at can be said to have memory. They just process
whatever flows in one end and spit out the result. There's no way for them to
'remember' what the last value generated was, or other state information. It
is possible, in fact, to synthesize memory components from logic gates.
First, we need to look at a simplified case:

     0/1  |\     1/0
    +-----+ \o-------+
    |     | /        |
    |     |/         |
    |                |
    |         /|     |
    |  0/1   / | 1/0 |
    +------o\  +-----+
              \|

Here we have connected two inverters such that there's a feedback loop
running the previous values back through the circuit. In the example there
are two stable states: 0 on the left (and 1 on the right), or 1 on the left
(and 0 on the right). Passing a 0 through the top inverter forces the system
to go into the 0-on-the-left state. Passing a 1 through the inverter forces
the system into the other state. Similarly, the second inverter can force the
system into one of these stable states.

Now we basically have a single-bit memory. If you can get this circuit into
one of its two stable states it'll retain that state indefinitely (as long as
it keeps receiving power). There is a problem with it: how do you put a value
into the circuit? In order to have a memory cell that can be set and read,
you need to build a similar sort of feedback circuit, but out of NANDs (or
NORs).
In this case, the circuit is known as a flip-flop:

    ~Set ----+------+
             | NAND |o----+------ Q
        +----+------+     |
        |                 |
        +----------\ /----+
                    X
        +----------/ \----+
        |                 |
        +----+------+     |
             | NAND |o----+------ ~Q
  ~Reset ----+------+

In this circuit, if Set is high (~Set is low), then the system will set Q to
1 and retain that value. If Reset is high (~Reset is low), then the system
will set Q to 0 and retain that value. Now we have a 1-bit memory!

Now that we have a memory, we'd like a way to control it. Essentially we need
a way to read and write values into the thing. So, as a first step, we can
guard the inputs to the flip-flop with a second set of NANDs.

    ----------NAND-----~Set
         +---/
         |
         +---\
    -----|----NAND-----~Reset
         |
      Control

What you've gained by this is a control line, which decides whether or not
the values coming into the circuit get to be sent to the flip-flop. The
control line is now essentially a write bit. If it's 1, then the incoming
values can make their way into the flip-flop and change the stored value. If
it's 0, it forces both NAND outputs (~Set and ~Reset) to be 1, thereby
leaving the flip-flop unchanged. We can now add an input by cleverly
inserting an inverter thus:

    D-----+-----------NAND-----~Set
         _|_     +---/
         \ /     |
          o      |
          |      +---\
          +------|----NAND-----~Reset
                 |
              Control

Now, D is the bit that is going into the flip-flop, and Control determines
whether or not D is actually stored in the flip-flop. Now we have a usable
1-bit memory!

Of course, we're not done yet. There is another issue: namely, reading out
values from the flip-flop while it's being written to. With our current
circuit, if the flip-flop's state is changing while we read from it, we may
read the old value, the new value, or an indeterminate value. In order to
have a usable, consistent system, we need to ensure that whenever you read
from memory, you're always guaranteed to get a good value. A possible
solution would be to have two flip-flops.
One is the flip-flop you write to, the other is the one you read from, and
control logic between the two makes sure that the readable flip-flop only
receives good values. This sort of a system is actually known as a
master-slave flip-flop. Here's a diagram of such a flip-flop:

    D-----+---------NAND---~Set ... Q--------NAND---~Set ... ---Q
         _|_    +--/                     +--/
         \ /    |                        |
          o     |                        |
          |     +--\                     +--\
          +-----|---NAND---~Reset ... ~Q-|---NAND---~Reset ... ---~Q
                |                        |
             Control                  Control
                |          |\            |
    clock-------+----------+ \o----------+
                           |/

In this case, we are using the clock signal to make sure that the master is
set before the slave can receive the value: the master's control line is the
clock, and the slave's control line is the inverted clock. There is still an
issue. This flip-flop depends on the propagation delay of the clock signal,
and so there's actually a hardware race condition to deal with. Most
flip-flops that are actually used are what are known as edge-clocked (or
edge-triggered). This eliminates the race-condition problem, but the basic
idea is the same. So now we have a 1-bit memory that can safely be read from
while being written to! Woot!
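
To see the three memory circuits above behave as described, here is a minimal gate-level sketch in Python. The class names (`SRLatch`, `DLatch`, `MasterSlave`) and the `update` method are hypothetical names made up for this example. The model also assumes ideal gates that settle instantly, so it deliberately does not reproduce the propagation-delay race mentioned above.

```python
def nand(a, b):
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)


class SRLatch:
    """Cross-coupled NAND pair with active-low inputs ~Set and ~Reset."""

    def __init__(self):
        self.q, self.nq = 0, 1  # start in one of the two stable states

    def update(self, n_set, n_reset):
        # Run the feedback loop until the outputs stop changing.
        for _ in range(4):
            q = nand(n_set, self.nq)
            nq = nand(n_reset, q)
            if (q, nq) == (self.q, self.nq):
                break
            self.q, self.nq = q, nq
        return self.q


class DLatch:
    """SR latch guarded by two NANDs, a control line and an inverter on D."""

    def __init__(self):
        self.core = SRLatch()

    def update(self, d, control):
        n_set = nand(d, control)        # goes low only when D=1 and control=1
        n_reset = nand(1 - d, control)  # goes low only when D=0 and control=1
        return self.core.update(n_set, n_reset)


class MasterSlave:
    """Two D latches; the slave's control is the inverted clock."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, clock):
        m = self.master.update(d, clock)        # master follows D while clock=1
        return self.slave.update(m, 1 - clock)  # slave copies master while clock=0


ff = MasterSlave()
ff.update(1, 1)         # clock high: master captures D=1, output still 0
print(ff.update(1, 0))  # clock low: slave takes the master's value
```

While the clock is high the output holds its old value, and the new value only appears once the clock goes low. Changing D while the clock is low has no effect on the output, which is the "safe to read while writing" property the text is after.
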