Abstract
Improvements in semiconductor technology have enabled smaller feature sizes, higher clock speeds, and higher performance. Improvements in computer architecture were enabled by RISC designs and efficient high-level language compilers. Together, these advances have enabled customized computer architectures, from systems-on-chip to powerful GPUs and high-performance processors. Users expect the CPU to access unlimited amounts of memory with low latency, but fast memory costs many times more than slower memory. Another characteristic of CPU memory access is the principle of spatial and temporal locality. The solution is to organize memory into a hierarchy, caching data at different levels. Section 12.3 covers cache basics in detail. Not all memory addressable by the CPU needs to reside in physical memory; due to space and cost constraints, it can reside on disk, with the address range mapped by the virtual memory manager. A virtual address consists of a page number and an offset within the page. A page is placed in an available free page slot in physical memory and indexed in the page table; thus, virtual memory is mapped onto physical memory. Section 12.4 details virtual memory management. RISC stands for Reduced Instruction Set Computer; its clocks per instruction (CPI) is one. This architecture uses an optimized set of instructions, each executed in one cycle, which allows pipelining: multiple instructions execute simultaneously in different stages. RISC has many registers, simple instruction decoding, and simple addressing modes. Section 12.5 explains RISC architectures in detail. An efficient implementation of instruction execution overlaps instruction executions so that each hardware unit is busy all the time. Section 12.6 explains this concept of pipelining and how hazards are controlled in the architecture. Several advances in pipelining architecture have been developed.
However, performance improvements saturate as new constraints and implementation issues arise. When a single instruction operates on multiple data elements in a single instruction cycle, it is called a Single Instruction Multiple Data (SIMD) instruction. Section 12.7 introduces data-level parallelism with vector processing. Section 12.9 introduces Single Instruction, Multiple Threads (SIMT) in GPUs. We can exploit certain types of programs that are inherently parallel and have very little dependence between their parts; these parts are called threads of execution. Thread-Level Parallelism (TLP) is explained in detail in Sect. 12.10. FPGA-based technology has made system-on-chip design straightforward: systems with high-performance requirements can be built with hardware configured to those requirements. Temporal reconfiguration in FPGAs, mimicking DLLs in software, allows reuse of the same FPGA fabric for just-in-time "use and throw" hardware blocks. Section 12.11 covers reconfigurable computing in detail. After reading this chapter, readers will be able to understand the internal architecture of any processor, which helps in selecting a processor for their individual requirements.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this chapter
Murti, K. (2022). Embedded Processor Architectures. In: Design Principles for Embedded Systems. Transactions on Computer Systems and Networks. Springer, Singapore. https://doi.org/10.1007/978-981-16-3293-8_12
Print ISBN: 978-981-16-3292-1
Online ISBN: 978-981-16-3293-8