Chip Multiprocessor Architecture - Techniques to Improve Throughput and Latency (Paperback)
Series: Synthesis Lectures on Computer Architecture
Chip multiprocessors - also called multi-core microprocessors or
CMPs for short - are now the only way to build high-performance
microprocessors, for a variety of reasons. Large uniprocessors are
no longer scaling in performance, because it is only possible to
extract a limited amount of parallelism from a typical instruction
stream using conventional superscalar instruction issue techniques.
In addition, one cannot simply ratchet up the clock speed on
today's processors, or the power dissipation will become
prohibitive in all but water-cooled systems. Compounding these
problems is the simple fact that with the immense numbers of
transistors available on today's microprocessor chips, it is too
costly to design and debug ever-larger processors every year or
two. CMPs avoid these problems by filling up a processor die with
multiple, relatively simple processor cores instead of just one
huge core. The exact size of a CMP's cores can vary from very
simple pipelines to moderately complex superscalar processors, but
once a core has been selected the CMP's performance can easily
scale across silicon process generations simply by stamping down
more copies of the hard-to-design, high-speed processor core in
each successive chip generation. In addition, parallel code
execution, obtained by spreading multiple threads of execution
across the various cores, can achieve significantly higher
performance than would be possible using only a single core. While
parallel threads are already common in many useful workloads, there
are still important workloads that are hard to divide into parallel
threads. The low inter-processor communication latency between the
cores in a CMP helps make a much wider range of applications viable
candidates for parallel execution than was possible with
conventional, multi-chip multiprocessors; nevertheless, limited
parallelism in key applications is the main factor limiting
acceptance of CMPs in some types of systems.

After a discussion of
the basic pros and cons of CMPs when they are compared with
conventional uniprocessors, this book examines how CMPs can best be
designed to handle two radically different kinds of workloads that
are likely to be used with a CMP: highly parallel,
throughput-sensitive applications at one end of the spectrum, and
less parallel, latency-sensitive applications at the other.
Throughput-sensitive applications, such as server workloads that
handle many independent transactions at once, require careful
balancing of all parts of a CMP that can limit throughput, such as
the individual cores, on-chip cache memory, and off-chip memory
interfaces. Several studies and example systems, such as the Sun
Niagara, that examine the necessary tradeoffs are presented here.
In contrast, latency-sensitive applications - many desktop
applications fall into this category - require a focus on reducing
inter-core communication latency and applying techniques to help
programmers divide their programs into multiple threads as easily
as possible. This book discusses many techniques that can be used
in CMPs to simplify parallel programming, with an emphasis on
research directions proposed at Stanford University. To illustrate
the advantages possible with a CMP using a couple of solid
examples, extra focus is given to thread-level speculation (TLS), a
way to automatically break up nominally sequential applications
into parallel threads on a CMP, and transactional memory. The
latter model can greatly simplify manual parallel programming by using
hardware - instead of conventional software locks - to enforce
atomic code execution of blocks of instructions, a technique that
makes parallel coding much less error-prone.

Contents: The Case for CMPs / Improving Throughput / Improving Latency
Automatically / Improving Latency using Manual Parallel Programming /
A Multicore World: The Future of CMPs