Lecture 23: memory organization and Exceptions (by Trek Palmer)
================================================

SDRAMs
--------------
Now, what most systems have is known as SDRAM, or synchronous DRAM. The word "synchronous" derives from the fact that the RAM chip itself has a clock signal. SDRAMs are more complicated than plain DRAMs, but the complexity allows for higher performance. Most SDRAMs support a kind of burst mode, where the RAM increments the column address itself and outputs successive words stored in a given row.

DDR
--------
Some SDRAMs are so-called double data rate (DDR) RAMs. Single data rate RAMs do all their work on the rising edge of the clock, but double data rate RAMs can operate on both the rising and falling edges of the clock. This is accomplished by banking the RAM columns. Banking is a standard RAM technique: words are grouped together into banks, and the banks can be accessed separately (and in parallel). In a DDR system, one bank works on the rising edge while the other works on the falling edge. In modern computers, much of the memory controlling circuitry is now part of the actual chip.

ROM
---------
Not all memory is RAM-based. Most systems have at least some read-only memory (ROM) in them. A ROM is basically a small, fast, permanent storage device. Most systems' BIOS is, at least in part, ROM. Often, embedded systems have no large-scale storage devices, and so they place all the long-term data in ROM chips (although this is sometimes flash). Usually, ROM is where you place all the stuff necessary to bootstrap yourself to the point where you can get more information out of more complicated devices (like a disk).

There are several kinds of ROMs. PROMs, or programmable ROMs, are like dynamic memory chips, but rather than a capacitor, each cell has a fuse. You program the ROM by overpowering it and burning out certain fuses. PROMs are write-once, and aren't used much any more.
EPROMs (erasable PROMs) allow the ROM chip itself to be placed in a special mode where it can be erased. A special type of transistor is used rather than a fuse. EPROMs often needed special equipment to be erased (like a UV light source). EEPROMs (electrically erasable PROMs) can be erased in situ. Many systems use EEPROMs now.

Flash
-----------
Flash is the new, sexy, fast storage device. The density of data in flash circuitry is impressive (multiple GBs on something the size of a quarter and as thick as a playing card). Flash memories don't require power to retain values, and they have very low power requirements for reading. Writing is tricky: not only does it require more power, but it is only possible to write blocks of words at a time. In addition, flash cells can only be written a limited number of times (a few thousand, in the worst case) before they degrade to the point where they are non-functional. But without them, most digital cameras would only be able to store a few dozen pictures.

Exceptions and interrupts
--------------------------
Exceptions are events, often asynchronous, that the processor has to react to. I/O interrupts are a sort of exception (at least as far as ARM is concerned), and SWI instructions generate exceptions as well. Also, certain kinds of instruction errors are handled with exceptions. For instance, on processors with a divide operation, division by zero usually generates an exception. So, an exception represents some unexpected condition that the CPU suddenly finds itself in. Usually the CPU has to bail out of the current execution, do a context switch, and start executing exception handling code.

How the ARM does it
--------------------
ARM processors distinguish between 7 different types of exceptions. This is a little on the high side. Many processors have different types of exceptions (they usually give them different names, too).
The types are:

Type                            Mode
----------------------------------------------------
Reset                           Supervisor
Undefined Instruction           Undefined
SWI                             Supervisor
Prefetch Abort (icache fetch)   Abort
Data Abort (dcache abort)       Abort
IRQ (I/O interrupt)             IRQ
FIQ (fast interrupt)            FIQ

The mode column tells you which mode the processor enters to handle the exception. In the case of the ARM, each mode corresponds to a separate bank of registers. Also, each mode has its own saved processor status register (SPSR) so that exception handlers won't corrupt user condition codes (among other things).

Reset is basically a software reset that halts the processor and starts it over again in the OS (for rebooting, presumably).

Undefined instruction is what happens when the processor reads in a bunch of bits that it can't decode. In this case the processor waits to see if any attached co-processors can execute it; otherwise it triggers an undefined instruction exception. The exception handler can then be used to emulate co-processor functionality in software (like division, for instance).

SWI is basically a trap into the operating system. It is for controlled access to hardware and other processes. You've already used SWI, so it should make a certain amount of sense.

Memory aborts (prefetch abort and data abort) are raised when an instruction fetch or a data access cannot be completed, for example because the memory system has marked the address as invalid. This mostly has to do with caching and so we won't dwell on it.

IRQ and FIQ are for I/O. An IRQ (interrupt request) is generated by an I/O device that requires servicing by the processor/OS. An FIQ is a higher priority interrupt than an IRQ, and FIQ mode has more banked registers of its own (this reduces context switching overhead).

How an exception is handled
----------------------------
When the processor needs to deal with an exception, a whole sequence of events is triggered.

1) Each mode has its own (banked) R14, which is set to a return address derived from the address of the instruction that was executing when the exception arrived.
2) Each mode has its own saved copy of the CPSR (called the SPSR); this is set to the value of CPSR at the time of the exception
3) CPSR[4:0] = exception mode number
4) CPSR[5] = 0 (handlers run in ARM state, not Thumb)
5) if mode == Reset or FIQ, disable fast interrupts (CPSR[6] = 1)
6) CPSR[7] = 1 (disable normal interrupts)
7) PC = exception vector address

CPSR Structure:

 31 30 29 28 27 26                   8  7  6  5  4  3  2  1  0
+--+--+--+--+--+----------------------+--+--+--+--+--+--+--+--+
|N |Z |C |V |Q |  DNM (don't matter)  |I |F |T |M4|M3|M2|M1|M0|
+--+--+--+--+--+----------------------+--+--+--+--+--+--+--+--+

N, Z, C, V are the condition codes.
Q is a special condition code used by DSP instructions that you won't (mercifully) be exposed to.
I is the interrupt disable flag (1 = normal interrupts disabled)
F is the fast interrupt disable flag (1 = fast interrupts disabled)
T is the Thumb mode flag
M4 - M0 are the mode bits (telling you what exception mode you're in)

The exception vector is where the code to handle the exception is stored. On the ARM, the vectors are each 1 word wide, so the code is actually going to be a branch into the OS handling code (usually). In the case of FIQs, because theirs is the last vector, the handler code can start right there. The structure of exception vectors varies from system to system, but the basic idea is that at some fixed memory address, the system (usually the OS) places exception handling code that the processor will jump to when it encounters an exception.

When the exception handler is done executing, the CPSR is restored (from the SPSR) and the mode's R14 is moved back into the PC. This will cause execution to resume back in user mode, and the user code has no way of telling that an exception occurred.

An example, the getc (SWI #h00F00001) trap
-------------------------------------------
Now, actual I/O for a real OS (like Linux) is fairly complicated, involving file descriptors, buffers, and a whole bunch of overhead. So we use a simpler abstraction: getc just grabs the next character off of standard input.
Of course, this is implemented in the simulator as just a hunk of code, but as an exception it works like this. At address 0xFEED0000, SWI #h00F00001 is encountered:

1) We enter supervisor exception handling mode
2) R14_svc = 0xFEED0004 (the address of the instruction after the SWI; a read of the PC on ARM is always off by 8, and the hardware saves PC - 4)
3) SPSR_svc = CPSR
4) CPSR[4:0] = 0b10011
5) CPSR[5] = 0 (no Thumb exception handlers)
6) CPSR[6] is left unchanged (we allow fast interrupts, b/c they have higher priority)
7) CPSR[7] = 1 (disable normal interrupts)
8) PC = 0x00000008

Now at 0x00000008, there will be a branch instruction, like

    B 0xFFFF0000    (address of the SWI handler)

The SWI handler will decode the SWI instruction to decide which syscall it's doing (inst & 0x000FFFFF), and will then use that as an index into an array of addresses, each of which is the address of the appropriate trap code. For instance, assuming the trap number is in R4:

    ADR R5, trapBaseAddr
    LDR PC, [R5, R4, LSL #2]

The LSL is necessary because each entry in the table is a 4-byte address, so you need to scale the index by 4, which is the same as shifting left by 2. Now the PC will have the address of the OS internal function that processes this trap (code that reads the PC there will see that address + 8, as usual).

At the end of the trap handler will be the following line of code:

    MOVS PC, R14

In supervisor mode, this will copy R14_svc into PC AND restore CPSR from SPSR. And that's it: now you know how user-mode programs can communicate with supervisor-mode programs!
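
To make the bookkeeping concrete, here is a minimal Python sketch of the SWI entry and return sequence described above. It is a toy model, not the course simulator: the class ToyCpu and the helpers swi_number and handler_slot are hypothetical names invented for illustration. It assumes the architectural ARM-state behavior in which R14_svc receives the address of the instruction after the SWI (SWI address + 4), and it reuses the 20-bit mask given in these notes.

```python
# Toy model of ARM exception entry and return for an SWI.
# All names here are illustrative assumptions, not a real simulator API.

MODE_USR = 0b10000       # user mode number
MODE_SVC = 0b10011       # supervisor mode number
T_BIT = 1 << 5           # Thumb state flag
F_BIT = 1 << 6           # fast interrupt (FIQ) disable flag
I_BIT = 1 << 7           # normal interrupt (IRQ) disable flag
SWI_VECTOR = 0x00000008  # exception vector address for SWI


class ToyCpu:
    def __init__(self, pc, cpsr):
        self.pc = pc          # address of the instruction being executed
        self.cpsr = cpsr      # current program status register
        self.r14_svc = 0      # banked link register for supervisor mode
        self.spsr_svc = 0     # saved copy of CPSR for supervisor mode

    def take_swi(self):
        """Exception entry for an SWI executed in ARM (not Thumb) state."""
        self.r14_svc = self.pc + 4              # return to the next instruction
        self.spsr_svc = self.cpsr               # save the caller's status word
        cpsr = (self.cpsr & ~0x1F) | MODE_SVC   # CPSR[4:0] = supervisor
        cpsr &= ~T_BIT                          # CPSR[5] = 0, ARM state
        cpsr |= I_BIT                           # CPSR[7] = 1, mask IRQs
        self.cpsr = cpsr                        # CPSR[6] is untouched for SWI
        self.pc = SWI_VECTOR                    # jump to the SWI vector

    def return_from_exception(self):
        """Effect of MOVS PC, R14 when executed in supervisor mode."""
        self.pc = self.r14_svc
        self.cpsr = self.spsr_svc


def swi_number(instruction):
    """Trap number: the low 20 bits of the SWI instruction word."""
    return instruction & 0x000FFFFF


def handler_slot(table_base, trap_number):
    """Address read by LDR PC, [R5, R4, LSL #2] with 4-byte table entries."""
    return table_base + (trap_number << 2)
```

Creating ToyCpu(pc=0xFEED0000, cpsr=0x60000010) and calling take_swi() walks through the numbered steps of the getc example; return_from_exception() then puts the saved PC and CPSR back, which is why the user program cannot tell that anything happened in between.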