High-Bandwidth/Low-Latency Temporary Storage for Supercomputers
Swensen, John Alan
Technical Report Identifier: CSD-87-383
Abstract: The traditional use of memory and a symmetrical set of registers for storage of temporary results of scientific programs requires more execution time, hardware, and instruction-stream bandwidth than necessary. Novel register organizations that can be easily integrated into traditional supercomputer architectures can reduce all of these requirements.
Execution speed can be more than doubled by storing temporary results in an asymmetrical set of general-purpose registers or an asymmetrical set of vector registers, instead of in memory and a small register set. Faster access, at one-fourth the hardware cost of traditional vector registers, can be achieved by using a vector register that incorporates a pipelined random-access-memory chip. If a large enough set of registers is used, the need to store temporary results in memory and then reload them for later use can be eliminated; this saves both instruction-stream bandwidth and execution time.
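
The last point, that a large enough register set removes the store and reload instructions for temporaries, can be illustrated with a toy model. The sketch below is not taken from the report. The function name `count_spill_traffic`, the trace format, and the oldest-first eviction policy are assumptions chosen for illustration, and each temporary is assumed to be defined once and used once.

```python
def count_spill_traffic(trace, num_registers):
    """Count spill stores and reloads for a trace of temporaries.

    trace is a list of ("def", name) and ("use", name) pairs.
    Hypothetical toy model: each temporary is defined once and used once,
    and the oldest resident temporary is spilled when registers run out.
    """
    if num_registers < 1:
        raise ValueError("num_registers must be at least 1")

    resident = []      # temporaries currently held in registers, oldest first
    spilled = set()    # temporaries currently held only in memory
    stores = 0         # extra store instructions caused by spilling
    reloads = 0        # extra load instructions caused by reloading

    for op, name in trace:
        if op == "def":
            if len(resident) >= num_registers:
                victim = resident.pop(0)   # evict the oldest temporary
                spilled.add(victim)
                stores += 1
            resident.append(name)
        elif op == "use":
            if name in spilled:
                spilled.remove(name)       # value must come back from memory
                reloads += 1
            else:
                resident.remove(name)      # value is consumed; register freed
        else:
            raise ValueError(f"unknown operation: {op!r}")

    return stores, reloads


# Eight temporaries that are all live at once, then consumed in order.
names = [f"t{i}" for i in range(8)]
trace = [("def", n) for n in names] + [("use", n) for n in names]

small = count_spill_traffic(trace, num_registers=4)
large = count_spill_traffic(trace, num_registers=8)
print("4 registers (stores, reloads):", small)
print("8 registers (stores, reloads):", large)
```

In this model, every store and reload is an extra instruction in the instruction stream as well as an extra memory access. Both disappear once the register count reaches the number of simultaneously live temporaries.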