Processor design: Difference between revisions

From Wikipedia
Basics: fixed typo
 
 
Line 1: Line 1:
{{short description|Task of creating a processor}}
{{short description|Task of creating a processor}}
'''Processor design''' is a subfield of [[computer science]] and [[computer engineering]] (fabrication) that deals with creating a [[processor (computing)|processor]], a key component of [[computer hardware]].
[[File:CPU_Intel_80486DX-50.JPG | thumb | right | alt=CPU Intel 80486DX-50 | CPU Intel 80486DX-50]]
'''Processor design''' is a subfield of computer engineering and electronics that deals with creating a processor, a key component of computer hardware. While historically focused on the [[central processing unit]] (CPU), modern design often involves [[system-on-chip]] (SoC) architectures<ref>{{Cite thesis |last=Bejo |first=Agus |title=A system on chip design of A 6-axis robotic ARM controller implemented on a low-cost FPGA |date=2008 |publisher=Office of Academic Resources, Chulalongkorn University |doi=10.58837/chula.the.2008.1665 |url=https://doi.org/10.58837/chula.the.2008.1665|url-access=subscription }}</ref>, which integrate multiple processing units such as CPUs, [[graphics processing units]] (GPUs), and [[neural processing units]] (NPUs)<ref>{{Cite web |date=2024-09-27 |title=What is a Neural Processing Unit (NPU)? {{!}} IBM |url=https://www.ibm.com/think/topics/neural-processing-unit |access-date=2025-12-16 |website=www.ibm.com |language=en}}</ref> onto a single die or set of chiplets.<ref>{{Cite journal |last=Vaithianathan |first=Muthukumaran |date=2025 |title=The Future of Heterogeneous Computing: Integrating CPUs, GPUs, and FPGAs for High-Performance Applications |url=https://doi.org/10.63282/3050-9246.ijetcsit-v6i1p102 |journal=International Journal of Emerging Trends in Computer Science and Information Technology |volume=6 |pages=12–23 |doi=10.63282/3050-9246.ijetcsit-v6i1p102 |issn=3050-9246}}</ref><ref name=":0">{{Cite web |title=Analysis: Five Key Trends for Compute in 2025 {{!}} TechInsights |url=https://www.techinsights.com/blog/analysis-five-key-trends-compute-2025 |access-date=2025-12-16 |website=www.techinsights.com}}</ref>


The design process involves choosing an [[instruction set]] and a certain execution paradigm (e.g. [[Very long instruction word|VLIW]] or [[Reduced instruction set computing|RISC]]) and results in a [[microarchitecture]], which might be described in e.g. [[VHDL]] or [[Verilog]]. For [[microprocessor]] design, this description is then manufactured employing some of the various [[semiconductor device fabrication]] processes, resulting in a [[Die (integrated circuit)|die]] which is bonded onto a [[chip carrier]]. This chip carrier is then soldered onto, or inserted into a [[CPU socket|socket]] on, a [[printed circuit board]] (PCB).
The design process involves choosing an [[instruction set]] and a certain execution paradigm (e.g. [[Very long instruction word|VLIW]] or [[Reduced instruction set computing|RISC]]) and results in a [[microarchitecture]], which might be described in e.g. [[VHDL]] or [[Verilog]]. For [[microprocessor]] design, this description is then manufactured employing some of the various [[semiconductor device fabrication]] processes, resulting in a [[Die (integrated circuit)|die]] which is bonded onto a [[chip carrier]]. This chip carrier is then soldered onto, or inserted into a [[CPU socket|socket]] on, a [[printed circuit board]] (PCB).
Line 6: Line 7:
The mode of operation of any processor is the execution of lists of instructions. Instructions typically include those to compute or manipulate data values using [[Processor register|registers]], change or retrieve values in read/write memory, perform relational tests between data values and to control program flow.
The mode of operation of any processor is the execution of lists of instructions. Instructions typically include those to compute or manipulate data values using [[Processor register|registers]], change or retrieve values in read/write memory, perform relational tests between data values and to control program flow.


Processor designs are often tested and validated on one or several FPGAs before sending the design of the processor to a foundry for [[semiconductor fabrication]].<ref>{{cite web|url=https://www.anandtech.com/show/14798/xilinx-announces-world-largest-fpga-virtex-ultrascale-vu19p-with-9m-cells|title=Xilinx Announces World Largest FPGA: Virtex Ultrascale+ VU19P with 9m Cells|first=Ian|last=Cutress|date=August 27, 2019|website=[[AnandTech]]}}</ref>
Processor designs are often tested and validated on one or several FPGAs before sending the design of the processor to a foundry for [[semiconductor fabrication]].<ref>{{cite web|url=https://www.anandtech.com/show/14798/xilinx-announces-world-largest-fpga-virtex-ultrascale-vu19p-with-9m-cells|archive-url=https://web.archive.org/web/20190827160514/https://www.anandtech.com/show/14798/xilinx-announces-world-largest-fpga-virtex-ultrascale-vu19p-with-9m-cells|url-status=dead|archive-date=August 27, 2019|title=Xilinx Announces World Largest FPGA: Virtex Ultrascale+ VU19P with 9m Cells|first=Ian|last=Cutress|date=August 27, 2019|website=[[AnandTech]]}}</ref>


== Details ==
== Details ==
{{Prose|section|date=May 2011}}


=== Basics ===
=== Basics ===
CPU design is divided into multiple components. Information is transferred through [[datapath]]s (such as [[Arithmetic logic unit|ALUs]] and [[Pipeline (computing)|pipelines]]). These datapaths are controlled through logic by [[control unit]]s. [[Memory (computing)|Memory]] components include [[register file]]s and [[Cache (computing)|caches]] to retain information, or certain actions. [[Clock signal|Clock circuitry]] maintains internal rhythms and timing through clock drivers, [[Phase-locked loop|PLLs]], and [[clock distribution network]]s. Pad transceiver circuitry which allows signals to be received and sent and a [[logic gate]] cell [[Library (electronics)|library]] which is used to implement the logic. Logic gates are the foundation for processor design as they are used to implement most of the processor's components.<ref>{{cite book | url=https://books.google.com/books?id=GBVADQAAQBAJ&q=processor+logic+gates | title=Digital Systems: From Logic Gates to Processors | isbn=978-3-319-41198-9 | last1=Deschamps | first1=Jean-Pierre | last2=Valderrama | first2=Elena | last3=Terés | first3=Lluís | date=12 October 2016 | publisher=Springer }}</ref>
CPU design is divided into multiple components. Information is transferred through [[datapath]]s (such as [[Arithmetic logic unit|ALUs]] and [[Pipeline (computing)|pipelines]]). These datapaths are controlled through logic by [[control unit]]s. [[Memory (computing)|Memory]] components include [[register file]]s and [[Cache (computing)|caches]] to retain information, or certain actions. [[Clock signal|Clock circuitry]] maintains internal rhythms and timing through clock drivers, [[Phase-locked loop|PLLs]], and [[clock distribution network]]s. Pad transceiver circuitry which allows signals to be received and sent and a [[logic gate]] cell [[Library (electronics)|library]] which is used to implement the logic. Logic gates are the foundation for processor design as they are used to implement most of the processor's components.<ref>{{cite book | url=https://books.google.com/books?id=GBVADQAAQBAJ&q=processor+logic+gates | title=Digital Systems: From Logic Gates to Processors | isbn=978-3-319-41198-9 | last1=Deschamps | first1=Jean-Pierre | last2=Valderrama | first2=Elena | last3=Terés | first3=Lluís | date=12 October 2016 | publisher=Springer }}</ref>


CPUs designed for high-performance markets might require custom (optimized or application specific (see below)) designs for each of these items to achieve frequency, [[power consumption|power-dissipation]], and chip-area goals whereas CPUs designed for lower performance markets might lessen the implementation burden by acquiring some of these items by purchasing them as [[intellectual property]]. Control logic implementation techniques ([[logic synthesis]] using CAD tools) can be used to implement datapaths, register files, and clocks. Common logic styles used in CPU design include unstructured random logic, [[finite-state machine]]s, [[microprogramming]] (common from 1965 to 1985), and [[Programmable logic array]]s (common in the 1980s, no longer common).
CPUs designed for high-performance markets might require custom (optimized or application-specific (see below)) designs for each of these items to achieve frequency, [[power consumption|power-dissipation]], and chip-area goals whereas CPUs designed for lower performance markets might lessen the implementation burden by acquiring some of these items by purchasing them as [[intellectual property]]. Control logic implementation techniques ([[logic synthesis]] using [[Computer-aided design|CAD]] tools) can be used to implement datapaths, register files, and clocks. Common logic styles used in CPU design include unstructured random logic, [[finite-state machine]]s, [[microprogramming]] (common from 1965 to 1985), and [[programmable logic array]]s (common in the 1980s, no longer common).
 
=== Specialized Accelerators ===
Modern processor designs increasingly rely on [[heterogeneous computing]], integrating specialized accelerators alongside general-purpose cores. The most prominent addition is the [[Neural processing unit|Neural Processing Unit (NPU)]], designed specifically to execute machine learning mathematics (matrix multiplication) more efficiently than a standard CPU. This specialization allows for significant gains in performance-per-watt for AI workloads.<ref>{{Cite web |last=Fleischer |first=Adam J. |date=Mar 17, 2025 |title=6 Top Microprocessor Trends for 2025 |url=https://octopart.com/pulse/p/top-microprocessor-trends }}</ref><ref name=":0" />


=== Implementation logic ===
=== Implementation logic ===
Device types used to implement the logic include:
Device technologies used to implement [[Central processing unit|CPU]] logic have changed over time. Early implementations used individual relays, vacuum tubes, and discrete components ([[Transistor|transistors]] and [[Diode|diodes]]), and later small-scale integration [[Transistor–transistor logic|TTL]] chips, but these are no longer used for CPUs. Programmable array logic and other programmable logic devices are also no longer used for CPUs in this role, and ECL gate arrays are now uncommon. [[CMOS]] gate arrays are no longer used for CPUs, while CMOS mass-produced integrated circuits account for most CPUs by volume. Custom CMOS [[Application-specific integrated circuit|ASICs]] are generally practical only for high-volume applications because of the engineering cost. [[Field-programmable gate array|Field-programmable gate arrays (FPGAs)]] remain common for soft microprocessors and are often used for reconfigurable computing.
* Individual [[vacuum tube]]s, individual [[transistor]]s and semiconductor [[diode]]s, and [[transistor-transistor logic]] [[small-scale integration]] logic chips no longer used for CPUs
* [[Programmable array logic]] and [[programmable logic device]]s – no longer used for CPUs
* [[Emitter-coupled logic]] (ECL) [[gate array]]s – no longer common
* [[CMOS]] [[gate array]]s – no longer used for CPUs
* [[CMOS]] [[Integrated circuit|mass-produced IC]]s – the vast majority of CPUs by volume
* [[CMOS]] [[Application-specific integrated circuit|ASIC]]s – only for a minority of special applications due to expense
* [[Field-programmable gate array]]s (FPGA) – common for [[soft microprocessor]]s, and more or less required for [[reconfigurable computing]]


A CPU design project generally has these major tasks:
A CPU design project generally has these major tasks:
Line 42: Line 38:
As with most complex electronic designs, the [[functional verification|logic verification]] effort (proving that the design does not have bugs) now dominates the project schedule of a CPU.
As with most complex electronic designs, the [[functional verification|logic verification]] effort (proving that the design does not have bugs) now dominates the project schedule of a CPU.


Key CPU architectural innovations include [[index register]], [[CPU cache|cache]], [[virtual memory]], [[instruction pipelining]], [[superscalar]], [[Complex instruction set computer|CISC]], [[Reduced instruction set computer|RISC]], [[virtual machine]], [[emulator]]s, [[microprogram]], and [[Stack (data structure)|stack]].
Key CPU architectural innovations include [[Accumulator (computing)|accumulator]], [[index register]], [[general-purpose register]], [[CPU cache|cache]], [[virtual memory]], [[instruction pipelining]], [[superscalar]], [[Complex instruction set computer|CISC]], [[Reduced instruction set computer|RISC]], [[virtual machine]], [[emulator]]s, [[microprogram]], and [[Stack (data structure)|stack]].


=== Microarchitectural concepts ===
=== Microarchitectural concepts ===
Line 55: Line 51:


===Performance analysis and benchmarking===
===Performance analysis and benchmarking===
{{Main| Computer performance}}
{{Main|Computer performance}}
 
[[benchmark (computing)|Benchmarking]] is a way of testing CPU speed. Examples include SPECint and [[SPECfp]], developed by [[Standard Performance Evaluation Corporation]], and ConsumerMark developed by the Embedded Microprocessor Benchmark Consortium [[EEMBC]].
[[benchmark (computing)|Benchmarking]] is a way of testing CPU speed. Examples include SPECint and [[SPECfp]], developed by [[Standard Performance Evaluation Corporation]], and ConsumerMark developed by the Embedded Microprocessor Benchmark Consortium [[EEMBC]].


Line 72: Line 69:


==Markets==
==Markets==
{{Update section|date=December 2023|reason=No update since 2010, the market has significantly evolved since then}}
There are several different markets in which CPUs are used. Since each of these markets differ in their requirements for CPUs, the devices designed for one market are in most cases inappropriate for the other markets.
There are several different markets in which CPUs are used. Since each of these markets differ in their requirements for CPUs, the devices designed for one market are in most cases inappropriate for the other markets.


===General-purpose computing===
===General-purpose computing===
{{As of|2010}}, in the general-purpose computing market, that is, desktop, laptop, and server computers commonly used in businesses and homes, the Intel [[IA-32]] and the 64-bit version [[x86-64]] architecture dominate the market, with its rivals [[PowerPC]] and [[SPARC]] maintaining much smaller customer bases. Yearly, hundreds of millions of IA-32 architecture CPUs are used by this market.  A growing percentage of these processors are for mobile implementations such as netbooks and laptops.<ref>Kerr, Justin. [http://www.maximumpc.com/article/news/amd_loses_market_share_mobile_cpu_sales_outsell_desktop_first_time "AMD Loses Market Share as Mobile CPU Sales Outsell Desktop for the First Time."]  Maximum PC. Published 2010-10-26.</ref>
In the general-purpose computing market (desktop, laptop, and server computers), processors implementing the x86-64 instruction set architecture remain widely used, with Intel and AMD as the primary suppliers. Within the x86 CPU market, Mercury Research estimated that Intel held 74.4% and AMD 25.6% of unit shipments in Q3 2025.<ref>{{Cite web |author1=Anton Shilov |date=2025-11-14 |title=AMD continues to chip away at Intel's X86 market share — company now sells over 25% of all x86 chips and powers 33% of all desktop systems |url=https://www.tomshardware.com/pc-components/cpus/amd-continues-to-chip-away-at-intels-x86-market-share-company-now-sells-over-25-percent-of-all-x86-chips-and-powers-33-percent-of-all-desktop-systems |access-date=2025-12-16 |website=Tom's Hardware |language=en}}</ref> Arm-based processors dominate smartphones and are also used in some PCs and servers; ABI Research forecast that Arm-based PCs would represent about 13% of total PC shipments in 2025, while IDC estimated that Arm-architecture servers would account for 21.1% of total server shipments in 2025.<ref>{{Cite web |title=2025 Will See AI PCs become the New Normal, but ARM-Based PCs Will Not Grow Out of Its Minority Segment |url=https://www.abiresearch.com/press/2025-will-see-ai-pcs-become-the-new-normal-but-arm-based-pcs-will-not-grow-out-of-its-minority-segment |access-date=2025-12-16 |website=www.abiresearch.com |language=en}}</ref> RISC-V has also seen growing adoption in embedded systems, and some vendors have announced RISC-V-based microcontroller families for automotive applications.<ref>{{Cite web |title=Infineon brings RISC-V to the automotive industry and i {{!}} Infineon Technologies |url=https://www.infineon.com/press-release/2025/infatv202503-067 |access-date=2025-12-16 |website=www.infineon.com |language=en}}</ref>


Since these devices are used to run countless different types of programs, these CPU designs are not specifically targeted at one type of application or one function. The demands of being able to run a wide range of programs efficiently has made these CPU designs among the more advanced technically, along with some disadvantages of being relatively costly, and having high power consumption.
Since these devices are used to run countless different types of programs, these CPU designs are not specifically targeted at one type of application or one function. The demands of being able to run a wide range of programs efficiently has made these CPU designs among the more advanced technically, along with some disadvantages of being relatively costly, and having high power consumption.
====High-end processor economics====
In 1984, most high-performance CPUs required four to five years to develop.<ref>
"New system manages hundreds of transactions per second" article
by Robert Horst and Sandra Metz, of Tandem Computers Inc.,
"Electronics" magazine, 1984 April 19:
"While most high-performance CPUs require four to five years to develop,
The [[NonStop (server computers)|NonStop]] TXP processor took just 2+1/2 years --
six months to develop a complete written specification,
one year to construct a working prototype,
and another year to reach volume production."
</ref>


===Scientific computing===
===Scientific computing===
{{Main|Supercomputer}}
{{Main|Supercomputer}}
Scientific computing is a much smaller niche market (in revenue and units shipped).  It is used in government research labs and universities. Before 1990, CPU design was often done for this market, but mass market CPUs organized into large clusters have proven to be more affordable. The main remaining area of active hardware design and research for scientific computing is for high-speed data transmission systems to connect mass market CPUs.
Scientific computing is a much smaller niche market (in revenue and units shipped).  It is used in government research labs and universities. Before 1990, CPU design was often done for this market, but mass market CPUs organized into large clusters have proven to be more affordable. The main remaining area of active hardware design and research for scientific computing is for high-speed data transmission systems to connect mass market CPUs.


===Embedded design===
===Embedded design===
{{Main|Embedded system}}
{{Main|Embedded system}}
As measured by units shipped, most CPUs are embedded in other machinery, such as telephones, clocks, appliances, vehicles, and infrastructure. Embedded processors sell in the volume of many billions of units per year, however, mostly at much lower price points than that of the general purpose processors.
As measured by units shipped, most CPUs are embedded in other machinery, such as telephones, clocks, appliances, vehicles, and infrastructure. Embedded processors sell in the volume of many billions of units per year, however, mostly at much lower price points than that of the general purpose processors.


Line 113: Line 99:
{{cite web| url = http://www.keil.com/dd/docs/datashts/evatronix/t8051_ds.pdf| title = T8051 Tiny 8051-compatible Microcontroller| archive-url = https://web.archive.org/web/20110929033902/https://www.keil.com/dd/docs/datashts/evatronix/t8051_ds.pdf| archive-date = 2011-09-29}}</ref><ref>To figure dollars per square millimeter, see [http://www.overclockers.com/forums/showthread.php?t=550542], and note that an SOC component has no pin or packaging costs.</ref>
{{cite web| url = http://www.keil.com/dd/docs/datashts/evatronix/t8051_ds.pdf| title = T8051 Tiny 8051-compatible Microcontroller| archive-url = https://web.archive.org/web/20110929033902/https://www.keil.com/dd/docs/datashts/evatronix/t8051_ds.pdf| archive-date = 2011-09-29}}</ref><ref>To figure dollars per square millimeter, see [http://www.overclockers.com/forums/showthread.php?t=550542], and note that an SOC component has no pin or packaging costs.</ref>


As of 2009, more CPUs are produced using the [[ARM architecture family]] instruction sets than any other 32-bit instruction set.<ref>
ARM architecture dominates embedded and mobile processor shipments globally. As of 2024, ARM-based processors account for the majority of all processor units shipped annually, driven by widespread adoption in smartphones, IoT devices, and microcontrollers. The original ARM architecture and first ARM chip were designed in approximately one and a half years with 5 human-years of effort.<ref>{{Cite web |title=Microcontroller Market Size & Share {{!}} Industry Report, 2033 |url=https://www.grandviewresearch.com/industry-analysis/microcontroller-market |access-date=2025-12-16 |website=www.grandviewresearch.com |language=en}}</ref>
[http://www.extremetech.com/extreme/52180-arm-cores-climb-into-3g-territory "ARM Cores Climb Into 3G Territory"] by Mark Hachman, 2002.
</ref><ref>
[http://www.embedded.com/electronics-blogs/significant-bits/4024488/The-Two-Percent-Solution "The Two Percent Solution"] by Jim Turley 2002.
</ref>
The ARM architecture and the first ARM chip were designed in about one and a half years and 5 human years of work time.<ref>[https://web.archive.org/web/20090606152116/http://atterer.net/acorn/arm.html "ARM's way"] 1998</ref>


The 32-bit [[Parallax Propeller]] microcontroller architecture and the first chip were designed by two people in about 10 human years of work time.<ref>{{Cite web
The 32-bit [[Parallax Propeller]] microcontroller architecture and the first chip were designed by two people in about 10 human years of work time.<ref>{{Cite web
Line 147: Line 128:
==== Soft microprocessor cores ====
==== Soft microprocessor cores ====
{{Main|Soft microprocessor}}
{{Main|Soft microprocessor}}
For embedded systems, the highest performance levels are often not needed or desired due to the power consumption requirements. This allows for the use of processors which can be totally implemented by [[logic synthesis]] techniques. These synthesized processors can be implemented in a much shorter amount of time, giving quicker [[time-to-market]].
For embedded systems, the highest performance levels are often not needed or desired due to the power consumption requirements. This allows for the use of processors which can be totally implemented by [[logic synthesis]] techniques. These synthesized processors can be implemented in a much shorter amount of time, giving quicker [[time-to-market]].


Line 169: Line 151:
* [[Network on a chip]]
* [[Network on a chip]]
* [[Process design kit]] – a set of documents created or accumulated for a semiconductor device production process
* [[Process design kit]] – a set of documents created or accumulated for a semiconductor device production process
* [[Uncore]]


== References ==
== References ==

Latest revision as of 05:26, 9 May 2026

CPU Intel 80486DX-50

Processor design is a subfield of computer engineering and electronics that deals with creating a processor, a key component of computer hardware. While historically focused on the central processing unit (CPU), modern design often involves system-on-chip (SoC) architectures[1], which integrate multiple processing units such as CPUs, graphics processing units (GPUs), and neural processing units (NPUs)[2] onto a single die or set of chiplets.[3][4]

The design process involves choosing an instruction set and a certain execution paradigm (e.g. VLIW or RISC) and results in a microarchitecture, which might be described in e.g. VHDL or Verilog. For microprocessor design, this description is then manufactured employing some of the various semiconductor device fabrication processes, resulting in a die which is bonded onto a chip carrier. This chip carrier is then soldered onto, or inserted into a socket on, a printed circuit board (PCB).

The mode of operation of any processor is the execution of lists of instructions. Instructions typically include those to compute or manipulate data values using registers, change or retrieve values in read/write memory, perform relational tests between data values and to control program flow.
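The four instruction categories named above (register computation, memory access, relational tests, and control flow) can be illustrated with a minimal sketch of an interpreter. The `ToyCPU` class, its opcode names (`li`, `add`, `load`, `store`, `blt`, `halt`), and the four-register, sixteen-word layout are all hypothetical and chosen only for illustration; they are not modeled on any real instruction set.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """Hypothetical 4-register machine used only to illustrate instruction execution."""
    regs: list = field(default_factory=lambda: [0] * 4)
    mem: list = field(default_factory=lambda: [0] * 16)
    pc: int = 0  # program counter: index of the next instruction

    def run(self, program, max_steps=1000):
        """Execute a list of instruction tuples until 'halt' or the program ends."""
        for _ in range(max_steps):
            if self.pc >= len(program):
                break
            op, *args = program[self.pc]
            self.pc += 1
            if op == "li":        # load an immediate value into a register
                dest, value = args
                self.regs[dest] = value
            elif op == "add":     # compute on register values
                dest, a, b = args
                self.regs[dest] = self.regs[a] + self.regs[b]
            elif op == "load":    # retrieve a value from read/write memory
                dest, addr = args
                self.regs[dest] = self.mem[addr]
            elif op == "store":   # change a value in read/write memory
                src, addr = args
                self.mem[addr] = self.regs[src]
            elif op == "blt":     # relational test that controls program flow
                a, b, target = args
                if self.regs[a] < self.regs[b]:
                    self.pc = target
            elif op == "halt":
                break
            else:
                raise ValueError(f"unknown opcode: {op}")
        return self


# Sum the integers 1..5 and store the result in mem[0].
SUM_1_TO_5 = [
    ("li", 0, 0),      # r0 = running sum
    ("li", 1, 1),      # r1 = counter
    ("li", 2, 6),      # r2 = loop limit (exclusive)
    ("li", 3, 1),      # r3 = constant 1
    ("add", 0, 0, 1),  # r0 += r1 (instruction index 4, the loop head)
    ("add", 1, 1, 3),  # r1 += 1
    ("blt", 1, 2, 4),  # repeat while r1 < r2
    ("store", 0, 0),   # mem[0] = r0
    ("halt",),
]

cpu = ToyCPU().run(SUM_1_TO_5)
print(cpu.mem[0])
```

A real processor performs the same fetch, decode, and execute steps in hardware rather than in an interpreter loop, but the division of labor between registers, memory, and the program counter is the same.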

Processor designs are often tested and validated on one or several FPGAs before sending the design of the processor to a foundry for semiconductor fabrication.[5]

Details

Basics

CPU design is divided into multiple components. Information is transferred through datapaths (such as ALUs and pipelines). These datapaths are controlled through logic by control units. Memory components include register files and caches to retain information, or certain actions. Clock circuitry maintains internal rhythms and timing through clock drivers, PLLs, and clock distribution networks. Pad transceiver circuitry allows signals to be received and sent, and a logic gate cell library is used to implement the logic. Logic gates are the foundation for processor design as they are used to implement most of the processor's components.[6]
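To make the claim about logic gates concrete, the following sketch builds an 8-bit adder, a typical ALU building block, from nothing but a two-input NAND function. The function names and the choice of NAND as the only primitive are assumptions made for illustration; real cell libraries offer many gate types and real adders usually use faster carry schemes.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate used in this sketch."""
    return 1 - (a & b)


def and_(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(n, n)


def or_(a: int, b: int) -> int:
    return nand(nand(a, a), nand(b, b))


def xor(a: int, b: int) -> int:
    # Classic four-NAND construction of exclusive-or.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def full_adder(a: int, b: int, carry_in: int):
    """Add three single bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


def ripple_add(x: int, y: int, width: int = 8):
    """Chain full adders bit by bit; return (result modulo 2**width, final carry)."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


print(ripple_add(100, 27))
print(ripple_add(200, 100))
```

The second call overflows the 8-bit width, so the final carry is set and the result wraps around, which is the behavior a hardware adder of that width would show.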

CPUs designed for high-performance markets might require custom (optimized or application-specific (see below)) designs for each of these items to achieve frequency, power-dissipation, and chip-area goals, whereas CPUs designed for lower-performance markets might lessen the implementation burden by purchasing some of these items as intellectual property. Control logic implementation techniques (logic synthesis using CAD tools) can be used to implement datapaths, register files, and clocks. Common logic styles used in CPU design include unstructured random logic, finite-state machines, microprogramming (common from 1965 to 1985), and programmable logic arrays (common in the 1980s, no longer common).
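Of the logic styles listed, the finite-state machine is the easiest to sketch in software. The example below models a control unit as a Moore machine whose outputs depend only on the current state. The four states and the control-signal names (`mem_read`, `ir_load`, `alu_enable`, `reg_write`) are hypothetical; an actual control unit would have states and signals dictated by its microarchitecture.

```python
from enum import Enum, auto


class State(Enum):
    FETCH = auto()
    DECODE = auto()
    EXECUTE = auto()
    WRITEBACK = auto()


# Next-state table: this simplified machine cycles through the states unconditionally.
NEXT_STATE = {
    State.FETCH: State.DECODE,
    State.DECODE: State.EXECUTE,
    State.EXECUTE: State.WRITEBACK,
    State.WRITEBACK: State.FETCH,
}

# Output table: control signals asserted in each state (names are illustrative).
CONTROL_SIGNALS = {
    State.FETCH: {"mem_read": 1, "ir_load": 1, "alu_enable": 0, "reg_write": 0},
    State.DECODE: {"mem_read": 0, "ir_load": 0, "alu_enable": 0, "reg_write": 0},
    State.EXECUTE: {"mem_read": 0, "ir_load": 0, "alu_enable": 1, "reg_write": 0},
    State.WRITEBACK: {"mem_read": 0, "ir_load": 0, "alu_enable": 0, "reg_write": 1},
}


def control_trace(cycles: int, start: State = State.FETCH):
    """Return the (state, signals) pair seen on each clock cycle."""
    state = start
    trace = []
    for _ in range(cycles):
        trace.append((state, CONTROL_SIGNALS[state]))
        state = NEXT_STATE[state]
    return trace


for state, signals in control_trace(5):
    print(state.name, signals)
```

In hardware, the two tables would be implemented as combinational logic and the current state would be held in a small register that updates on each clock edge.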

Specialized Accelerators

Modern processor designs increasingly rely on heterogeneous computing, integrating specialized accelerators alongside general-purpose cores. The most prominent addition is the neural processing unit (NPU), designed specifically to execute the mathematics of machine learning (such as matrix multiplication) more efficiently than a standard CPU. This specialization allows for significant gains in performance per watt for AI workloads.[7][4]
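Matrix multiplication reduces to a long, regular sequence of multiply-accumulate (MAC) steps, which is one reason it lends itself to dedicated hardware. The sketch below is a plain software reference that also counts those steps; the function name `matmul_with_mac_count` is hypothetical, and the code says nothing about how any particular NPU is organized internally.

```python
def matmul_with_mac_count(a, b):
    """Multiply two matrices given as lists of rows; return (product, number of MACs)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    macs = 0
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one multiply-accumulate step
                macs += 1
            out[i][j] = acc
    return out, macs


product, macs = matmul_with_mac_count([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, macs)
```

The MAC count is always rows × inner × cols, and every output element can be computed independently of the others, so the work can be spread across many arithmetic units at once.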

Implementation logic

Device technologies used to implement CPU logic have changed over time. Early implementations used individual relays, vacuum tubes, and discrete components (transistors and diodes), and later small-scale integration TTL chips, but these are no longer used for CPUs. Programmable array logic and other programmable logic devices are also no longer used for CPUs in this role, and ECL gate arrays are now uncommon. CMOS gate arrays are no longer used for CPUs, while CMOS mass-produced integrated circuits account for most CPUs by volume. Custom CMOS ASICs are generally practical only for high-volume applications because of the engineering cost. Field-programmable gate arrays (FPGAs) remain common for soft microprocessors and are often used for reconfigurable computing.

A CPU design project generally has these major tasks:

Re-designing a CPU core to a smaller die area helps to shrink everything (a "photomask shrink"), resulting in the same number of transistors on a smaller die. It improves performance (smaller transistors switch faster), reduces power (smaller wires have less parasitic capacitance) and reduces cost (more CPUs fit on the same wafer of silicon). Releasing a CPU on the same size die, but with a smaller CPU core, keeps the cost about the same but allows higher levels of integration within one very-large-scale integration chip (additional cache, multiple CPUs or other components), improving performance and reducing overall system cost.
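The cost argument (more CPUs fit on the same wafer) can be checked with a common first-order estimate of gross dies per wafer, which subtracts an edge-loss term from the ratio of wafer area to die area. The 300 mm wafer diameter and the 100 mm² and 50 mm² die areas below are assumed values for illustration, and the estimate ignores defects, yield, and scribe lines.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate: wafer area / die area, minus dies lost at the round edge."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return math.floor(wafer_area / die_area_mm2 - edge_loss)


before = dies_per_wafer(300, 100)  # hypothetical original die
after = dies_per_wafer(300, 50)    # same design after a shrink to half the area
print(before, after, after / before)
```

Under these assumptions, halving the die area slightly more than doubles the number of candidate dies, because smaller dies also waste less of the wafer's curved edge.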

As with most complex electronic designs, the logic verification effort (proving that the design does not have bugs) now dominates the project schedule of a CPU.

Key CPU architectural innovations include the accumulator, index registers, general-purpose registers, caches, virtual memory, instruction pipelining, superscalar execution, CISC, RISC, virtual machines, emulators, microprograms, and stacks.

Microarchitectural concepts

Research topics

A variety of new CPU design ideas have been proposed, including reconfigurable logic, clockless CPUs, computational RAM, and optical computing.

Performance analysis and benchmarking

Benchmarking is a way of testing CPU speed. Examples include SPECint and SPECfp, developed by the Standard Performance Evaluation Corporation, and ConsumerMark, developed by the Embedded Microprocessor Benchmark Consortium (EEMBC).

Some of the commonly used metrics include:

  • Instructions per second - Most consumers pick a computer architecture (normally Intel IA32 architecture) to be able to run a large base of pre-existing pre-compiled software. Being relatively uninformed on computer benchmarks, some of them pick a particular CPU based on operating frequency (see Megahertz Myth).
  • FLOPS - The number of floating point operations per second is often important in selecting computers for scientific computations.
  • Performance per watt - System designers building parallel computers, such as those at Google, pick CPUs based on their speed per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself.[8][9]
  • Some system designers building parallel computers pick CPUs based on the speed per dollar.
  • System designers building real-time computing systems want to guarantee worst-case response. That is easier to do when the CPU has low interrupt latency and when it has deterministic response. (DSP)
  • Computer programmers who program directly in assembly language want a CPU to support a full featured instruction set.
  • Low power - For systems with limited power sources (e.g. solar, batteries, human power).
  • Small size or low weight - for portable embedded systems, systems for spacecraft.
  • Environmental impact - Minimizing the environmental impact of computers during manufacturing and recycling as well as during use. Reducing waste, reducing hazardous materials. (see Green computing).

There may be tradeoffs in optimizing some of these metrics. In particular, many design techniques that make a CPU run faster make the "performance per watt", "performance per dollar", and "deterministic response" much worse, and vice versa.
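The tradeoff can be seen by ranking the same set of CPUs under different metrics. In the sketch below, the three candidates and all of their scores, power figures, and prices are invented for illustration and do not describe any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSpec:
    name: str
    score: float   # benchmark score in arbitrary units (hypothetical)
    watts: float   # power drawn under load (hypothetical)
    price: float   # unit price in dollars (hypothetical)

    @property
    def perf_per_watt(self) -> float:
        return self.score / self.watts

    @property
    def perf_per_dollar(self) -> float:
        return self.score / self.price


def best_by(specs, metric: str) -> CpuSpec:
    """Return the candidate with the highest value of the named attribute."""
    return max(specs, key=lambda spec: getattr(spec, metric))


CANDIDATES = [
    CpuSpec("fast", score=1000, watts=200, price=800),
    CpuSpec("efficient", score=600, watts=60, price=400),
    CpuSpec("cheap", score=300, watts=50, price=100),
]

for metric in ("score", "perf_per_watt", "perf_per_dollar"):
    print(metric, "->", best_by(CANDIDATES, metric).name)
```

With these made-up numbers, each metric selects a different CPU, which is the situation the paragraph above describes: optimizing for raw speed tends to cost efficiency and price.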

Markets

There are several different markets in which CPUs are used. Since each of these markets differs in its requirements for CPUs, the devices designed for one market are in most cases inappropriate for the other markets.

General-purpose computing

In the general-purpose computing market (desktop, laptop, and server computers), processors implementing the x86-64 instruction set architecture remain widely used, with Intel and AMD as the primary suppliers. Within the x86 CPU market, Mercury Research estimated that Intel held 74.4% and AMD 25.6% of unit shipments in Q3 2025.[10] Arm-based processors dominate smartphones and are also used in some PCs and servers; ABI Research forecast that Arm-based PCs would represent about 13% of total PC shipments in 2025, while IDC estimated that Arm-architecture servers would account for 21.1% of total server shipments in 2025.[11] RISC-V has also seen growing adoption in embedded systems, and some vendors have announced RISC-V-based microcontroller families for automotive applications.[12]

Since these devices are used to run countless different types of programs, these CPU designs are not specifically targeted at one type of application or one function. The demand to run a wide range of programs efficiently has made these CPU designs among the more technically advanced, with the disadvantages of being relatively costly and having high power consumption.

Scientific computing

Scientific computing is a much smaller niche market (in revenue and units shipped), serving mainly government research labs and universities. Before 1990, CPU design was often done for this market, but mass market CPUs organized into large clusters have proven to be more affordable. The main remaining area of active hardware design and research for scientific computing is high-speed data transmission systems that connect mass market CPUs.

Embedded design

As measured by units shipped, most CPUs are embedded in other machinery, such as telephones, clocks, appliances, vehicles, and infrastructure. Embedded processors sell in volumes of many billions of units per year, though mostly at much lower price points than general-purpose processors.

These single-function devices differ from the more familiar general-purpose CPUs in several ways:

  • Low cost is of high importance.
  • It is important to maintain low power dissipation, as embedded devices often have a limited battery life and it is often impractical to include cooling fans.
  • To give lower system cost, peripherals are integrated with the processor on the same silicon chip.
  • Keeping peripherals on-chip also reduces power consumption as external GPIO ports typically require buffering so that they can source or sink the relatively high current loads that are required to maintain a strong signal outside of the chip.
  • Many embedded applications have a limited amount of physical space for circuitry; keeping peripherals on-chip will reduce the space required for the circuit board.
  • The program and data memories are often integrated on the same chip. When the only allowed program memory is ROM, the device is known as a microcontroller.
  • For many embedded applications, interrupt latency will be more critical than in some general-purpose processors.

Embedded processor economics

The embedded CPU family with the largest number of total units shipped is the 8051, averaging nearly a billion units per year.[13] The 8051 is widely used because it is very inexpensive. The design time is now roughly zero, because it is widely available as commercial intellectual property. It is now often embedded as a small part of a larger system on a chip. The silicon cost of an 8051 is now as low as US$0.001, because some implementations use as few as 2,200 logic gates and take 0.4730 square millimeters of silicon.[14][15]
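The arithmetic behind that cost figure can be sketched as follows. The area, gate count, and cost are the values quoted above; the cost per square millimeter is not stated in the text, so the sketch only back-computes the rate that those figures imply. Because the quoted cost is a lower bound ("as low as"), the implied rate is an assumption for illustration, not a quoted wafer price. Pin and packaging costs are ignored, which matches the case of a core embedded in a larger system on a chip.

```python
# Figures quoted in the text for a small 8051 implementation.
AREA_MM2 = 0.4730   # silicon area in square millimeters
COST_USD = 0.001    # quoted lower bound on silicon cost
GATES = 2200        # quoted logic gate count

# Rates implied by the quoted figures (derived, not from the source).
implied_usd_per_mm2 = COST_USD / AREA_MM2
implied_usd_per_gate = COST_USD / GATES


def silicon_cost(area_mm2: float, usd_per_mm2: float) -> float:
    """Silicon cost of a block: area times cost per unit area.

    Ignores yield, pins, packaging, and test costs.
    """
    return area_mm2 * usd_per_mm2


print(f"implied cost per mm^2: ${implied_usd_per_mm2:.6f}")
print(f"implied cost per gate: ${implied_usd_per_gate:.9f}")
print(f"cost check: ${silicon_cost(AREA_MM2, implied_usd_per_mm2):.6f}")
```

The same `silicon_cost` helper (a hypothetical name) can be reused with a different rate to see how the estimate scales with process cost or core area.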

ARM architecture dominates embedded and mobile processor shipments globally. As of 2024, ARM-based processors account for the majority of all processor units shipped annually, driven by widespread adoption in smartphones, IoT devices, and microcontrollers. The original ARM architecture and first ARM chip were designed in approximately one and a half years with 5 human-years of effort.[16]

The 32-bit Parallax Propeller microcontroller architecture and the first chip were designed by two people in about 10 human-years of work.[17]

The 8-bit AVR architecture and first AVR microcontroller were conceived and designed by two students at the Norwegian Institute of Technology.

The 8-bit 6502 architecture and the first MOS Technology 6502 chip were designed in 13 months by a group of about 9 people.[18]

Research and educational CPU design

The 32-bit Berkeley RISC I and RISC II processors were mostly designed by a series of students as part of a four-quarter sequence of graduate courses.[19] This design became the basis of the commercial SPARC processor design.

For about a decade, every student taking the 6.004 class at MIT was part of a team; each team had one semester to design and build a simple 8-bit CPU out of 7400 series integrated circuits. One team of 4 students designed and built a simple 32-bit CPU during that semester.[20]

Some undergraduate courses require a team of 2 to 5 students to design, implement, and test a simple CPU in an FPGA in a single 15-week semester.[21]

The MultiTitan CPU was designed with 2.5 man-years of effort, which was considered "relatively little design effort" at the time.[22] 24 people contributed to the 3.5-year MultiTitan research project, which included designing and building a prototype CPU.[23]

Soft microprocessor cores

For embedded systems, the highest performance levels are often not needed or desired because of power consumption requirements. This allows the use of processors that can be implemented entirely by logic synthesis techniques. These synthesized processors can be implemented in a much shorter amount of time, giving quicker time-to-market.

See also

References

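  1. Bejo, Agus (2008). A system on chip design of A 6-axis robotic ARM controller implemented on a low-cost FPGA (Thesis). Office of Academic Resources, Chulalongkorn University. doi:10.58837/chula.the.2008.1665.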
  2. "What is a Neural Processing Unit (NPU)? | IBM". www.ibm.com. 2024-09-27. Retrieved 2025-12-16.
  3. Vaithianathan, Muthukumaran (2025). "The Future of Heterogeneous Computing: Integrating CPUs, GPUs, and FPGAs for High-Performance Applications". International Journal of Emerging Trends in Computer Science and Information Technology. 6: 12–23. doi:10.63282/3050-9246.ijetcsit-v6i1p102. ISSN 3050-9246.
  4. 4.0 4.1 "Analysis: Five Key Trends for Compute in 2025 | TechInsights". www.techinsights.com. Retrieved 2025-12-16.
  5. Cutress, Ian (August 27, 2019). "Xilinx Announces World Largest FPGA: Virtex Ultrascale+ VU19P with 9m Cells". AnandTech. Archived from the original on August 27, 2019.
  6. Deschamps, Jean-Pierre; Valderrama, Elena; Terés, Lluís (12 October 2016). Digital Systems: From Logic Gates to Processors. Springer. ISBN 978-3-319-41198-9.
  7. Fleischer, Adam J. (Mar 17, 2025). "6 Top Microprocessor Trends for 2025".
  8. "EEMBC ConsumerMark". Archived from the original on March 27, 2005.
  9. Stephen Shankland (December 9, 2005). "Power could cost more than servers, Google warns". ZDNet.
  10. Anton Shilov (2025-11-14). "AMD continues to chip away at Intel's X86 market share — company now sells over 25% of all x86 chips and powers 33% of all desktop systems". Tom's Hardware. Retrieved 2025-12-16.
  11. "2025 Will See AI PCs become the New Normal, but ARM-Based PCs Will Not Grow Out of Its Minority Segment". www.abiresearch.com. Retrieved 2025-12-16.
  12. "Infineon brings RISC-V to the automotive industry and i | Infineon Technologies". www.infineon.com. Retrieved 2025-12-16.
  13. Curtis A. Nelson. "8051 Overview" (PDF). Archived from the original (PDF) on 2011-10-09. Retrieved 2011-07-10.
  14. "T8051 Tiny 8051-compatible Microcontroller" (PDF). Archived from the original (PDF) on 2011-09-29.
  15. To figure dollars per square millimeter, see [1], and note that an SOC component has no pin or packaging costs.
  16. "Microcontroller Market Size & Share | Industry Report, 2033". www.grandviewresearch.com. Retrieved 2025-12-16.
  17. Gracey, Chip. "Why the Propeller Works" (PDF). Archived from the original (PDF) on 2009-04-19.
  18. "Interview with William Mensch". Archived from the original on 2016-03-04. Retrieved 2009-02-01.
  19. C.H. Séquin; D.A. Patterson. "Design and Implementation of RISC I" (PDF). Archived (PDF) from the original on 2006-03-05.
  20. "the VHS". Archived from the original on 2010-02-27.
  21. Jan Gray. "Teaching Computer Design with FPGAs".
  22. Jouppi, N.P.; Tang, J.Y.-F. (October 1989). "A 20-MIPS sustained 32-bit CMOS microprocessor with high ratio of sustained to peak performance". IEEE Journal of Solid-State Circuits. 24 (5): 1348–1359. Bibcode:1989IJSSC..24.1348J. doi:10.1109/JSSC.1989.572612.
  23. "MultiTitan: Four Architecture Papers" (PDF). 1988. pp. 4–5. Archived (PDF) from the original on 2004-08-25.

General references
