Lecture 5: Even More ARM Goodies (by Trek Palmer)
-------------

&&, ||, !
=========

In Java, you're used to being able to have arbitrarily complicated
boolean expressions as an argument to an 'if' statement. Things are not
so convenient in assembly land. Basically, all you can test is one
simple arithmetic expression at a time. So, for instance:

Java:
    if((i < 3) && (j > 15)) {
        i++;
    }
    i = i - 5;

In ARM, you can't simply test i and then j, because the test for j will
overwrite the condition values generated by the test of i. So, what
would you do? Does this work? (Assuming i is in R2 and j in R3)

    CMP   R2, #3
    CMPLT R3, #15
    ADDGT R2, R2, #1
    SUB   R2, R2, #5

No, it doesn't. There is a subtle bug here. Remember the condition
codes are a global register. The processor has no real way of
associating a given condition code with the instruction that set it.
Consider the case where i (R2) is 5. Now do you see the bug? The first
CMP leaves the flags saying "greater than", the CMPLT is skipped, and
the ADDGT then fires off the stale flags from the test of i, so i is
incremented even though i < 3 is false.

In order to correctly implement the semantics of this Java if
statement, we're going to have to resort to some branching. Does this
code work?

    CMP   R2, #3
    BGE   after
    CMP   R3, #15
    ADDGT R2, R2, #1
after:
    SUB   R2, R2, #5

Yes. If i >= 3 the branch skips the rest of the test, so the ADDGT can
only ever see the flags produced by the comparison of j.

Now, what about the following Java expression:

Java:
    if((i < 3) || (j > 15)) {
        i++;
    }
    i = i - 5;

Would the following ARM assembly work?

    CMP   R2, #3
    CMPGE R3, #15
    ADDGT R2, R2, #1
    SUB   R2, R2, #5

Again, no. What if i = 2? The first CMP sets "less than", the CMPGE is
skipped, and the ADDGT does not fire, so i is not incremented even
though i < 3 is true. Once again we will need to branch.

    CMP   R2, #3
    BLT   if_body
    CMP   R3, #15
    BGT   if_body
    B     after
if_body:
    ADD   R2, R2, #1
after:
    SUB   R2, R2, #5

Logical Operations
===================

In addition to the old chestnuts ADD, SUB, RSB, MUL, there is a bevy of
logical operations that can be performed on values.

AND
---

AND performs a bitwise and of two 32-bit values. For those familiar
with the & operation in Java (and C, C++, etc.), AND is the same thing.
For those unfamiliar, AND is like the && operator, but it performs it
on each bit separately.
If you consider 1 to be true and 0 to be false, then the AND truth
table is:

a AND b
-------
         b  0 | 1                b  false | true
      a ------------          a ------------------
      0     0 | 0         false     false | false
      1     0 | 1          true     false | true

So if a = 11001111 and b = 11100011, a AND b = 11000011

ORR
---

ORR performs a bitwise OR of its two arguments (like the | operator in
Java). As AND is to &&, ORR is to ||. Here are the appropriate truth
tables:

a OR b
------
         b  0 | 1                b  false | true
      a ------------          a ------------------
      0     0 | 1         false     false | true
      1     1 | 1          true      true | true

EOR
---

EOR stands for exclusive OR, which is a different operation than normal
(inclusive) OR. It is analogous to the ^ operator in Java, but it is
unlikely that any of you have encountered it. The basic rule for
exclusive or is that the exclusive or of two values is true if the two
values are different. Here are the truth tables:

a EOR b
-------
         b  0 | 1                b  false | true
      a ------------          a ------------------
      0     0 | 1         false     false | true
      1     1 | 0          true      true | false

For the immediate future, you are unlikely to need these operations,
but when we dig into boolean algebra they'll start to make more sense.
In terms of programmatic usage, logical operators are often used for
masking and setting bit patterns. Often an integer will actually be
used as a set of boolean flags (one bit per flag), and ORR can be used
to set flags, while AND can be used to check for flags.

Example
-------

Consider the case where you are maintaining your own version of the
condition codes. In R0, the low-order 4 bits have been designated (by
you) as NCZV (in that order), so R0 drawn in ASCII is:

                                        N  C  Z  V
    bit: 31 30 29 . . .  8  7  6  5  4  3  2  1  0
          X  X  X . . .  X  X  X  X  X  Y  Y  Y  Y

So, in order to get the value of the lowest 4 bits from R0 and store
them in R1, you could use the following code:

    AND R1, R0, #hF

Why is this? Because 0 AND X = 0, the constant 0xF (that is,
0x0000000F) will automatically zero (the technical term for this is
mask) the upper 28 bits.
And because 1 AND X = X, 0xF will preserve the value of the lower bits.

Now in order to get and test for a specific condition code, you could
use the following code:

    AND R1, R0, #h8   ; test for N
    CMP R1, #h8
    AND R1, R0, #h4   ; test for C
    CMP R1, #h4
    AND R1, R0, #h2   ; test for Z
    CMP R1, #h2
    AND R1, R0, #h1   ; test for V
    CMP R1, #h1

Note that the result of each AND is either 0 or the mask value itself
(8, 4, 2 or 1), not 1, so each CMP compares against the mask. After the
CMP, the EQ condition means the flag is set.

To set codes, you can use ORR, because 1 OR X = 1 and 0 OR X = X. The
following code will set the N bit in R0 to 1:

    ORR R0, R0, #h8

This sort of masking and setting of bits happens all the time in
systems. Network packet flags and fields, object metadata, and process
status flags are all instances of systems constructs where bit-level
masking and setting is crucial.

Shifting
=========

Because assembly programs deal mostly with collections of bits, there
are all kinds of operations that can be performed on bits. The logical
ops above are an example of that, but there's more that can be done.
Some of you may have seen the left and right shift operations in
C/Java. Left shift is '<<' and right shift is '>>'. What these
operations do is literally shift their first argument's bit pattern
either left or right by a number of bits specified by their second
argument. For instance, if a = 10111101 (with an additional 24 zeros in
front), then a << 2 = 1011110100 and a >> 2 = 101111. Note that in the
right shifting case the low-order bits 'fell off the end'. Where did
they go? you may ask. Well, they were discarded. Bits that are shifted
out of the register are usually just dropped on the floor. So, for
instance: (a >> 2) << 2 = 10111100 (left shift always fills in the new
spaces with 0). So, be careful with shifting: shifting right and then
left by the same amount does not give you back the original value, and
you can permanently lose bits of a register.

Extension
----------

You may have noticed that when you shifted a value left, the new bits
were automatically set to 0. This is known as zero or logical
extension. Left shifting is always logical (that is, it always extends
the bit pattern with 0s).
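
The shift behaviour described so far can be checked with a few lines of Java. This is only a minimal sketch: the class name ShiftBasics and its two helper methods are made up for illustration and are not part of the lecture, and it sticks to the non-negative example value a = 10111101 so that the sign of the value does not come into play.

```java
// Hypothetical demo class; ShiftBasics and its methods are illustrative
// names, not something defined in the lecture.
final class ShiftBasics {
    // Shift left by n bits; the vacated low-order bits are filled with 0.
    static int shiftLeft(int a, int n) {
        return a << n;
    }

    // Shift right by n bits; the low-order bits that fall off are discarded.
    // Only non-negative values are used here, so the new high bits are 0.
    static int shiftRight(int a, int n) {
        return a >> n;
    }

    public static void main(String[] args) {
        int a = 0b10111101; // the lecture's example value

        System.out.println(Integer.toBinaryString(shiftLeft(a, 2)));  // 1011110100
        System.out.println(Integer.toBinaryString(shiftRight(a, 2))); // 101111

        // Right then left by the same amount loses the two low-order bits.
        int roundTrip = shiftLeft(shiftRight(a, 2), 2);
        System.out.println(Integer.toBinaryString(roundTrip));        // 10111100
        System.out.println(roundTrip == a);                           // false
    }
}
```

The last line printing false is the point of the warning above: the two bits shifted out on the way right are gone for good.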
There are two forms of right shifting: arithmetic and logical. Logical
right shifting fills in the new bits with 0s. Arithmetic shifting fills
in the new bits with either 0 or 1 depending on the sign of the value
being shifted. That is, arithmetic shifting preserves the sign of the
shifted value. Because signed values are stored in 2's complement form,
the sign of the argument can be determined by examining the MSB, so
arithmetic shift reduces to filling in the new bits with the MSB of the
shifted value.

Ex. Consider -1 in 2's complement form on an 8-bit system:

    A = -00000001 -> 11111110 + 1 -> 11111111
    A logical right shifted 2 = 00111111 = 63
    A arith.  right shifted 2 = 11111111 = -1

Consider -5 in 2's complement form on an 8-bit system:

    B = -00000101 -> 11111010 + 1 -> 11111011
    B logical right shifted 2 = 00111110 = 62
    B arith.  right shifted 2 = 11111110 = -2

Why is sign-extension important? It is important because shifting isn't
just some instruction for the convenience of bit-twiddling assembly
programmers. Shifting actually corresponds to multiplication and
division by powers of 2! Consider the 8-bit value for 5

    00000101

now consider 5 << 2

    00010100 = 16 + 4 = 20 = 5 * 4 = 5 * 2^2 !

Therefore, left shifting a value by x is equivalent to multiplying that
value by 2^x (assuming no overflow). Now look at the 8-bit value for 80

    01010000

now look at 80 >> 3

    00001010 = 2 + 8 = 10 = 80 / 8 = 80 / 2^3 !

Therefore right shifting a value by x is equivalent to dividing that
value by 2^x (this is integer division). Now consider -80 (in 2's
comp) = 10110000:

    -80 logical >> 3 = 00010110 = 16 + 4 + 2 = 22 != -80 / 8
    -80 arith   >> 3 = 11110110 = - 00001010 = - 10 = -80 / 8

One caveat: an arithmetic right shift rounds toward negative infinity.
In the B example above, -5 arith >> 2 = -2, whereas Java's -5 / 4
truncates to -1. The two agree whenever the division is exact, as with
-80 / 8.

Ok. So left and right shift have this nifty property that they are
equivalent to certain kinds of multiplication and division. Why not
just use the multiply and divide instructions? Two reasons.

Reason 1: on some architectures (like the ARM) there is no integer
divide operation.
Reason 2: generic multiplication and division are sloooooooow
operations. Slow enough that compilers will still try to turn
multiplications and divisions into shifts.

Shifting on the ARM
-------------------

Most architectures have several shifting instructions (left, right,
right arith). The ARM doesn't. Remember, the ARM was designed to have
some DSP functionality in it, so it has a few funny quirks. The lack of
specific shifting instructions is just one such quirk. On the ARM,
almost any instruction can cause one of its arguments to be shifted
before being operated on. This feature is one of the reasons why only
the last argument to an ARM instruction can be constant valued. So, say
you wanted to compute the value R0 << 3, you could write:

    MOV R0, R0, LSL #3

This new syntax specifies that the value to be copied into R0 is the
value of R0 left shifted (LSL) by 3. LSR specifies logical (0-extended)
right shift, and ASR specifies arithmetic (sign-extended) right shift.

Now, here's where it gets wacky. Assume you wanted to compute the value
R1 + (R0 << 3) (this sort of stuff comes up all the time with
cryptographic and DSP systems). You could write:

    MOV R2, R0, LSL #3
    ADD R1, R1, R2

But, because you can shift the last argument for most ARM instructions,
you can actually perform this operation in one line thus:

    ADD R1, R1, R0, LSL #3

It's important here to remember that this argument shifting thing is
really ARM-specific. Most general purpose architectures support nothing
of the kind. But this gives you an idea as to how odd DSP systems must
be.
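
For readers who think more easily in Java, the three ARM shift kinds and the shifted-operand ADD can be modelled roughly as follows. This is a sketch under stated assumptions, not a description of how the processor implements them: the class ArmShifts and its method names are invented for illustration, registers are modelled as 32-bit Java ints, and shift amounts are assumed to lie in the range 0 to 31.

```java
// Hypothetical helper class; the method names simply mirror the ARM
// mnemonics and are not part of any real library.
final class ArmShifts {
    // LSL: logical shift left, new low-order bits are 0.
    static int lsl(int x, int n) {
        return x << n;
    }

    // LSR: logical shift right, new high-order bits are 0 (Java's >>>).
    static int lsr(int x, int n) {
        return x >>> n;
    }

    // ASR: arithmetic shift right, new high-order bits copy the MSB (Java's >>).
    static int asr(int x, int n) {
        return x >> n;
    }

    // Models ADD Rd, Rn, Rm, LSL #n, i.e. Rn + (Rm << n).
    static int addShifted(int rn, int rm, int n) {
        return rn + lsl(rm, n);
    }

    public static void main(String[] args) {
        System.out.println(lsl(5, 2));           // 20, same as 5 * 2^2
        System.out.println(asr(80, 3));          // 10, same as 80 / 2^3
        System.out.println(asr(-80, 3));         // -10, sign is preserved
        System.out.println(lsr(-80, 3));         // 536870902, sign is lost
        System.out.println(addShifted(7, 5, 3)); // 47, i.e. 7 + (5 << 3)
    }
}
```

The logical shift of -80 prints 536870902 rather than the 22 from the 8-bit worked example because a Java int is 32 bits wide; the effect is the same, the sign bit is replaced by 0 and the result is no longer -80 / 8.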