Lecture 10: Function Calls and the Stack by Example (by Trek Palmer)
----------------------------------------------------

Functions are very important in assembly-level programming. They matter to
assembly programmers for all the reasons that functions matter in Java: they
help reduce complexity and enforce useful interfaces. For compiled languages,
function support at the hardware level is important because it makes libraries
and separate compilation possible, and this allows for many useful high-level
features. But, as in most things, function calls at the assembly level are
more complicated for the programmer than their Java counterparts. The
programmer has to manage local data, procedure linkage, and the stack by hand.
By itself, each responsibility seems simple and manageable; all together they
are a source of additional complexity.

To illustrate this fully, we'll walk through a (somewhat contrived) example of
functions. Consider the following Java-like pseudo-code:

    int foo() {
        int temp = bar(5);
        temp += 2;
        temp = gah(temp);
        return temp;
    }

    int bar(int arg) {
        int temp2;
        if(gah(arg) > 3) {
            temp2 = gah(arg + 1);
        }else {
            temp2 = 1;
        }
        return temp2;
    }

    int gah(int arg2) {
        int temp3;
        if((arg2 >= 5) && (arg2 < 9)) {
            return arg2 + 1;
        }else if(arg2 > 11) {
            return arg2 + 2;
        }else {
            return gah(arg2 + 1);
        }
    }

Now, in order to convert this into ARM assembly, we need a calling convention.
For the purposes of the lecture, we'll formalize the calling convention we've
been using so far:

    R0-R3   are for system call params and returns
    R4-R7   are for function parameters and returns
    R13     is the stack pointer
    R14     is the link register
    R15     is PC

So, R8-R12 are definitely free for local data, and unused registers in the
R4-R7 range can also be used.
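As a sanity check on what the pseudo-code computes, here is a small Python
sketch of the same three functions. Python is used only because it is easy to
run; the calls list is an illustrative addition for tracing and is not part of
the lecture's pseudo-code. foo returns temp here so the result can be
inspected.

```python
# Python transliteration of foo/bar/gah, for checking values by hand.
# `calls` is a hypothetical tracing aid, not part of the original pseudo-code.
calls = []  # records every (function, argument) pair in call order


def gah(arg2):
    calls.append(("gah", arg2))
    if 5 <= arg2 < 9:
        return arg2 + 1
    elif arg2 > 11:
        return arg2 + 2
    else:
        return gah(arg2 + 1)  # recurse until one of the two cases above applies


def bar(arg):
    calls.append(("bar", arg))
    if gah(arg) > 3:
        temp2 = gah(arg + 1)
    else:
        temp2 = 1
    return temp2


def foo():
    calls.append(("foo", None))
    temp = bar(5)
    temp += 2
    temp = gah(temp)
    return temp  # returned so the final value of temp can be inspected
```

Calling foo() evaluates to 14, and the trace in calls shows bar(5) followed by
gah on 5, 6, 9, 10, 11 and 12. That is the same sequence of calls whose stack
frames are drawn in the walkthrough of foo().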
With this convention, the pseudo-code has the following ARM equivalents:

foo:    STMDB   R13!, {R8, R14}     ; save R8 and the return address
        MOV     R4, #5
        BL      bar                 ; R4 = bar(5)
        MOV     R8, R4              ; temp = bar(5)
        ADD     R8, R8, #2          ; temp += 2
        MOV     R4, R8
        BL      gah                 ; R4 = gah(temp)
        MOV     R8, R4              ; temp = gah(temp); R4 still holds it
        LDMIA   R13!, {R8, R14}     ; restore R8 and the return address
        MOV     PC, R14

bar:    STMDB   R13!, {R8, R14}
        MOV     R8, R4              ; keep arg in R8 across the call
        BL      gah                 ; R4 = gah(arg)
        CMP     R4, #3
        BLE     else
        ADD     R4, R8, #1
        BL      gah                 ; R4 = gah(arg + 1)
        MOV     R8, R4              ; temp2 = gah(arg + 1)
        LDMIA   R13!, {R8, R14}
        MOV     PC, R14
else:   MOV     R8, #1              ; temp2 = 1
        MOV     R4, R8
        LDMIA   R13!, {R8, R14}
        MOV     PC, R14

gah:    STMDB   R13!, {R14}
        CMP     R4, #5
        BLT     elseif              ; arg2 < 5: first test fails
        CMP     R4, #9
        ADDLT   R4, R4, #1          ; 5 <= arg2 < 9: return arg2 + 1
        LDMLTIA R13!, {R14}
        MOVLT   PC, R14
elseif: CMP     R4, #11
        ADDGT   R4, R4, #2          ; arg2 > 11: return arg2 + 2
        LDMGTIA R13!, {R14}
        MOVGT   PC, R14
else2:  ADD     R4, R4, #1
        BL      gah                 ; return gah(arg2 + 1)
        LDMIA   R13!, {R14}
        MOV     PC, R14

Some things to note: in the translation of gah, I used predication to
translate the if-statements, but in bar I used branches. The reason for the
difference is that in bar, I have to make a call in the middle of a
conditional. Because I don't know what the called function will do to the
condition codes, predicated instructions after the call may not behave
correctly, so I used branches instead. Also, bar and gah have multiple return
sites.

Now let's examine the call stack for a call to foo():

================ <-------------SP at entry into foo
R14
---------------
R8
================ <-------------SP after foo's prologue, entry into bar(5)
R14
---------------
R8
================ <-------------SP after bar's prologue, entry into gah(arg)
R14
================ <-------------SP after gah's prologue

At this point, R4 = 5, so gah will add 1 to R4 and return (after restoring
R14, of course; gah's prologue saves only R14).

================ <-------------SP at entry into foo
R14
---------------
R8
================ <-------------SP after foo's prologue, entry into bar(5)
R14
---------------
R8
================ <-------------SP after gah(arg) returns
R14
================ <-------------SP after gah(arg + 1) prologue

Now R4 = 6 and gah will add one to it and return.

================ <-------------SP at entry into foo
R14
---------------
R8
================ <-------------SP after foo's prologue, entry into bar(5)
R14
---------------
R8
================ <-------------SP after gah(arg + 1) returns

Now bar will assign the return value (in R4) to temp2 (in R8) and return to
foo.

================ <-------------SP at entry into foo
R14
---------------
R8
================ <-------------SP after return of bar(5)

Now foo will assign the return value to temp, add two to temp and then call
gah with temp as an argument (temp = 9).

================ <-------------SP at entry into foo
R14
---------------
R8
================ <-------------SP at entry to gah(temp)
R14
================ <-------------SP at entry to gah(9 + 1)
R14
================ <-------------SP at entry to gah(10 + 1)
R14
================ <-------------SP at entry to gah(11 + 1)
R14
================ <-------------SP after gah(11 + 1) prologue

Now, after some recursion, gah is called with an argument that exceeds 11, so
it returns the argument + 2, which is returned all the way back to foo, which
then assigns it to temp and returns.

=============== <--------------SP after foo's return


Functions with too many arguments to pass in registers
-------------------------------------------------------

With our current calling convention, it is possible to pass up to 4 arguments
in registers. Although most functions will probably have no difficulty with
that, systems need to support functions with arities greater than 4. The
standard solution is to 'spill' the excess arguments to the stack.
Consider the following code:

    int tooMany(int a, int b, int c, int d, int e) {
        int temp = a + b + c + d + e;
        return temp;
    }

And if we translate into ARM in a brain-dead manner:

tooMany: STMDB  R13!, {R8,R9,R14}
         ?????

And already we've run into a problem: exactly where do the excess arguments
get spilled? Remember, too, that the calling function is the one that'll have
to do the spilling (because it knows in which registers it's storing the
arguments to tooMany). This is where our calling convention comes in and lets
us know the layout of arguments on the stack. Because the calling function is
responsible for writing out the excess arguments, and because SP will be
pointing to the top of the stack when it calls tooMany, it would be convenient
to have the excess parameters at the very top (or bottom, depending on your
view) of the next stack frame. So the frame layout would be:

================ <--------------Frame pointer
Excess arg n
----------------
Excess arg n - 1
----------------
...
----------------
Excess arg 1
----------------
R14
----------------
Additional saved registers
================ <---------------Stack pointer

Therefore the code for tooMany would look like:

tooMany: STMDB  R13!, {R8,R9,R14}   ;3 saved words = 12 bytes
         LDR    R9, [R13, #12]      ;fetching e, just above the saved registers
         ADD    R8, R4, R5
         ADD    R8, R8, R6
         ADD    R8, R8, R7
         ADD    R8, R8, R9          ;R8 = a+b+c+d+e
         MOV    R4, R8              ;return value goes in R4
         LDMIA  R13!, {R8, R9, R14}
         MOV    PC, R14

Notice that tooMany isn't cleaning up the excess arguments. This is part of
the calling convention, namely whose responsibility it is to correctly adjust
the stack pointer. So, with the current code for tooMany, the following code
needs to be used to call the function:

         STMDB  R13!, {R9}          ;spill e (held in R9) onto the stack
         BL     tooMany
         LDMIA  R13!, {R9}          ;pop e again, restoring SP
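
To see why the offset of 12 lands on e and why the caller's final pop leaves
SP where it started, here is a toy Python model of the same call sequence. It
is only a sketch: the Machine class, the starting SP of 0x1000 and the helper
names are all made up for illustration, and BL's write to R14 is not modelled.

```python
# Toy model of a descending stack, in the spirit of STMDB/LDMIA on R13.
# Machine, the start address and the function names are illustrative only.
WORD = 4


class Machine:
    def __init__(self, sp=0x1000):
        self.reg = {"R%d" % i: 0 for i in range(16)}
        self.reg["R13"] = sp   # stack pointer
        self.mem = {}          # address -> stored word

    def push(self, *names):
        # STMDB R13!, {names}: lowest-numbered register ends at the lowest address
        for name in sorted(names, key=lambda n: int(n[1:]), reverse=True):
            self.reg["R13"] -= WORD
            self.mem[self.reg["R13"]] = self.reg[name]

    def pop(self, *names):
        # LDMIA R13!, {names}: lowest-numbered register is loaded first
        for name in sorted(names, key=lambda n: int(n[1:])):
            self.reg[name] = self.mem[self.reg["R13"]]
            self.reg["R13"] += WORD


def too_many(m):
    r = m.reg
    m.push("R8", "R9", "R14")            # prologue: 3 words = 12 bytes
    r["R9"] = m.mem[r["R13"] + 12]       # LDR R9, [R13, #12]  (fetch e)
    r["R8"] = r["R4"] + r["R5"] + r["R6"] + r["R7"] + r["R9"]
    r["R4"] = r["R8"]                    # return value in R4
    m.pop("R8", "R9", "R14")             # epilogue restores the saved registers


def call_too_many(m, a, b, c, d, e):
    m.reg["R4"], m.reg["R5"], m.reg["R6"], m.reg["R7"] = a, b, c, d
    m.reg["R9"] = e
    m.push("R9")       # caller spills e:   STMDB R13!, {R9}
    too_many(m)        # BL tooMany
    m.pop("R9")        # caller cleans up:  LDMIA R13!, {R9}
    return m.reg["R4"]
```

For example, call_too_many(Machine(), 1, 2, 3, 4, 5) returns 15 and leaves R13
back at its starting value. Changing the offset in too_many to anything other
than 12 makes the load miss the spilled word, which is the layout dependency
the calling convention is there to pin down.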