Inside the Machine: An Illustrated Introduction to Microprocessors and Computer Architecture by Jon Stokes
Author: Jon Stokes
Language: eng
Format: epub
Tags: Computers, Systems Architecture, General, Microprocessors
ISBN: 9781593271046
Google: Q1zSIarI8xoC
Amazon: B004OEJO0A
Publisher: No Starch Press
Published: 2007-11-15T03:45:30.222000+00:00
[Figure: block diagram of the G4e's back end. Reservation stations (RS) feed the vector arithmetic logic units (VPU, VSIU, VCIU, VFPU), the scalar arithmetic logic units (integer units IU1a and IU2, and the FPU), and the memory access unit (load-store unit, LSU). Instructions finish into the completion queue and are written back by the commit unit.]
Figure 7-4: The basic microarchitecture of the G4e
Chapter 7: Intel’s Pentium 4 vs. Motorola’s G4e: Approaches and Design Philosophies
Before instructions can enter the G4e’s pipeline, they have to be available
in its 32KB instruction cache. This instruction cache, together with the
32KB data cache, makes up the G4e’s 64KB L1 cache. An instruction leaves
the L1 and goes down through the various front-end stages until it hits the
back end, at which point it’s executed by one of the G4e’s eight execution
units (not counting the branch execution unit, which we’ll talk about in a
second).
As I’ve already noted, the G4e breaks down the G4’s classic four-stage
pipeline into seven shorter stages:
G4                    G4e
1  Fetch              1  Fetch-1
                      2  Fetch-2
2  Decode/dispatch    3  Decode/dispatch
                      4  Issue
3  Execute            5  Execute
                      6  Complete
4  Write-back         7  Write-back (Commit)
Notice that the G4e dedicates one pipeline stage each to the characteristic
issue and complete phases that bracket the out-of-order execution phase
of a dynamically scheduled instruction’s lifecycle.
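The correspondence between the two pipelines can be written down as a small lookup table. This is only a sketch: the stage names come from the table above, but the grouping of G4e stages under each G4 stage is my reading of the table's row order, and the names `G4_TO_G4E` and `g4e_stages` are made up for illustration.

```python
# Stage names are taken from the G4/G4e table in the text.
# ASSUMPTION: the grouping below is inferred from the row order of that table.
G4_TO_G4E = {
    "Fetch": ["Fetch-1", "Fetch-2"],
    "Decode/dispatch": ["Decode/dispatch", "Issue"],
    "Execute": ["Execute", "Complete"],
    "Write-back": ["Write-back (Commit)"],
}


def g4e_stages():
    """Flatten the mapping into the G4e's stage order (stage 1 first)."""
    return [stage for group in G4_TO_G4E.values() for stage in group]


if __name__ == "__main__":
    for number, stage in enumerate(g4e_stages(), start=1):
        print(number, stage)
```

Laid out this way, the issue and complete stages show up as the second entry under decode/dispatch and execute, respectively.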
Let’s take a quick look at the basic pipeline stages of the G4e, because
this will highlight some of the ways in which the G4e differs from the original
G4. Also, an understanding of the G4e’s more classic RISC pipeline will
provide you with a good foundation for our upcoming discussion of the
Pentium 4’s much longer, more peculiar pipeline.
Stages 1 and 2: Instruction Fetch
These two stages are both dedicated primarily to grabbing an instruction
from the L1 cache. Like its predecessor, the G4, the G4e can fetch up to four
instructions per clock cycle from the L1 cache and send them on to the next
stage. Ideally, the needed instructions are already in the L1 cache. If they
aren’t, the G4e has to go to the much slower L2 cache to find them,
which can add up to nine cycles of delay to the instruction pipeline.
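To get a feel for what an L1 miss costs, here is a toy cycle count for the fetch stages. It is a rough sketch under loud assumptions: every fetch brings in a full group of four instructions, and every L1 miss is charged the worst-case nine-cycle penalty mentioned above. The function name and constants are hypothetical, not anything from Motorola's documentation.

```python
FETCH_WIDTH = 4        # from the text: up to four instructions per cycle
L2_ACCESS_PENALTY = 9  # from the text: "up to nine cycles"; worst case assumed


def fetch_cycles(num_instructions, l1_misses=0):
    """Rough cycles needed to fetch num_instructions.

    ASSUMPTIONS: each cycle fetches a full group of FETCH_WIDTH instructions,
    and each L1 miss adds the full L2_ACCESS_PENALTY.
    """
    groups = -(-num_instructions // FETCH_WIDTH)  # ceiling division
    return groups + l1_misses * L2_ACCESS_PENALTY


if __name__ == "__main__":
    print(fetch_cycles(16))               # all hits in the L1
    print(fetch_cycles(16, l1_misses=1))  # one trip to the L2
```

Under these assumptions, a single miss more than triples the fetch time for a 16-instruction sequence, from 4 cycles to 13.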
Stage 3: Decode/Dispatch
Once an instruction has been fetched, it goes into the G4e’s 12-entry instruction
queue to be decoded. Once instructions are decoded, they’re dispatched
at a rate of up to three non-branch instructions per cycle to the proper issue queue.
Note that the G4e’s dispatch logic dispatches instructions to the issue
queues in accordance with “The Four Rules of Instruction Dispatch” on
page 127. The only modification to the rules is in the issue buffer rule; instead
of requiring that the proper execution unit and reservation station be
available before an instruction can be dispatched, the G4e requires that
there be space in one of the three issue queues.
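One dispatch cycle under the modified issue buffer rule might be sketched like this. The three-per-cycle limit comes from the text; everything else is an assumption made for illustration: the queue names and capacities are placeholders, dispatch is assumed to proceed in program order, and the other dispatch rules (and branch handling) are left out entirely.

```python
from collections import deque

DISPATCH_WIDTH = 3  # from the text: up to three non-branch instructions per cycle


def dispatch_cycle(instruction_queue, issue_queues, capacities):
    """Dispatch from the front of instruction_queue for one cycle.

    instruction_queue holds (name, queue_kind) pairs in program order.
    ASSUMPTION: dispatch is in order, so it stops at the first instruction
    whose issue queue has no free entry.
    """
    dispatched = []
    while instruction_queue and len(dispatched) < DISPATCH_WIDTH:
        name, queue_kind = instruction_queue[0]
        if len(issue_queues[queue_kind]) >= capacities[queue_kind]:
            break  # no space in the proper issue queue: dispatch stalls
        issue_queues[queue_kind].append(name)
        instruction_queue.popleft()
        dispatched.append(name)
    return dispatched


if __name__ == "__main__":
    # Placeholder queue names and sizes, chosen only to make the example small.
    capacities = {"queue_a": 2, "queue_b": 1}
    issue_queues = {"queue_a": deque(), "queue_b": deque()}
    pending = deque([("add", "queue_a"), ("fmul", "queue_b"),
                     ("lwz", "queue_a"), ("sub", "queue_a")])
    print(dispatch_cycle(pending, issue_queues, capacities))
    print(dispatch_cycle(pending, issue_queues, capacities))
```

In the example, the first cycle dispatches three instructions. The second dispatches none, because nothing has issued out of the already full `queue_a`.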
Stage 4: Issue
The issue stage is the place where the G4e differs the most from the G4.
Specifically, the presence of the G4e’s three issue queues endows it with
power and flexibility that the G4 lacks.
As you learned in Chapter 6, instructions can stall in the original G4’s
dispatch stage if there is no execution unit available to take them. The G4e
eliminates this potential dispatch stall condition by placing a set of buffers,
called issue queues, in between the dispatch stage and the reservation
stations. On the G4e, it doesn’t matter if the execution units are busy and
their reservation stations are full; an instruction can still dispatch to the
back end if there is space in the proper issue queue.
