Switching Threads
- On a cache miss (fewest switches, but the pipeline
still stalls briefly while the miss is detected)
- On every load (more switches, but loads are detected
at decode, so the switch can come early enough to avoid stalls)
- On every instruction (most switches, but simple, and
fills the pipeline with independent operations from different threads)
- On I-cache blocks (like switching on every instruction,
but with fewer switches and more locality)
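
To make the relative switch frequencies concrete, the sketch below counts how often each policy would trigger a switch over one thread's instruction stream. It is only an illustration: the function name `count_switches`, the policy labels, the op names, and the block size of 4 instructions are assumptions, not taken from any real simulator.

```python
def count_switches(trace, policy, block_size=4):
    """Count the switch points a policy would trigger in one thread's trace.

    trace      -- list of op names: "alu", "load_hit", or "load_miss" (hypothetical)
    policy     -- "cache_miss", "every_load", "every_instruction", or "icache_block"
    block_size -- instructions per I-cache block (assumed value, illustration only)
    """
    switches = 0
    for i, op in enumerate(trace):
        if policy == "cache_miss":
            trigger = op == "load_miss"          # switch only when a load misses
        elif policy == "every_load":
            trigger = op.startswith("load")      # switch on any load, hit or miss
        elif policy == "every_instruction":
            trigger = True                       # switch after each instruction
        elif policy == "icache_block":
            trigger = (i + 1) % block_size == 0  # switch at each block boundary
        else:
            raise ValueError(f"unknown policy: {policy}")
        switches += trigger
    return switches


# A made-up 8-instruction trace with three loads, one of which misses.
trace = ["alu", "load_hit", "alu", "alu", "load_miss", "alu", "load_hit", "alu"]
for policy in ("cache_miss", "icache_block", "every_load", "every_instruction"):
    print(policy, count_switches(trace, policy))
```

On this trace the counts rise from switch-on-miss to switch-on-every-instruction, matching the ordering in the list above. Where the I-cache-block policy falls relative to switch-on-load depends on the assumed block size and on how many loads the code contains.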