|
|
|
Modern Operating Systems by Herbert Bos and Andrew S. Tanenb...
Modern_Operating_Systems_by_Herbert_Bos_and_Andrew_S._Tanenbaum_4th_Ed.pdf
Showing 555-556 out of 1137
Modern Operating Systems by Herbert Bos and Andrew...
Modern_Operating_Systems_by_Herbert_Bos_and_Andrew_S._Tanenbaum_4th_Ed.pdf-M ODERN O PERATING S YSTEMS
Modern Operating Systems by Herbert...
Modern_Operating_Systems_by_Herbert_Bos_and_Andrew_S._Tanenbaum_4th_Ed.pdf-M ODERN O PERATING S YSTEMS
Page 555
524
MULTIPLE PROCESSOR SYSTEMS
CHAP. 8
CPUs
b
b
b
b
a
a
a
a
3 Stages
Memories
000
001
010
011
100
101
110
111
000
001
010
011
100
101
110
111
1A
1B
1C
1D
2A
2B
2C
2D
3A
3B
3C
3D
Figure 8-5.
An omega switching network.
All the second-stage switches, including 2D, use the second bit for routing.
This, too, is a 1, so the message is now forwarded via the lower output to 3D.
Here
the third bit is tested and found to be a 0.
Consequently, the message goes out on
the upper output and arrives at memory 110, as desired. The path followed by this
message is marked in Fig. 8-5 by the letter
a
.
As the message moves through the switching network, the bits at the left-hand
end of the module number are no longer needed. They can be put to good use by
recording the incoming line number there, so the reply can find its way back. For
path
a
, the incoming lines are 0 (upper input to 1D), 1 (lower input to 2D), and 1
(lower input to 3D), respectively. The reply is routed back using 011, only reading
it from right to left this time.
At the same time all this is going on, CPU 001 wants to write a word to memo-
ry module 001.
An analogous process happens here, with the message routed via
the upper, upper, and lower outputs, respectively, marked by the letter
b
.
When it
arrives, its
Module
field reads 001, representing the path it took. Since these two
requests do not use any of the same switches, lines, or memory modules, they can
proceed in parallel.
Now consider what would happen if CPU 000 simultaneously wanted to access
memory module 000.
Its request would come into conflict with CPU 001’s request
at switch 3A.
One of them would then have to wait. Unlike the crossbar switch,
the omega network is a
blocking network
.
Not every set of requests can be proc-
essed simultaneously. Conflicts can occur over the use of a wire or a switch, as
well as between requests
to
memory and replies
from
memory.
Since it is highly desirable to spread the memory references uniformly across
the modules, one common technique is to use the low-order bits as the module
number.
Consider, for example, a byte-oriented address space for a computer that
Page 556
SEC. 8.1
MULTIPROCESSORS
525
mostly accesses full 32-bit words. The 2 low-order bits will usually be 00, but the
next 3 bits will be uniformly distributed. By using these 3 bits as the module num-
ber, consecutively words will be in consecutive modules. A memory system in
which consecutive words are in different modules is said to be
interleaved
. Inter-
leaved memories maximize parallelism because most memory references are to
consecutive addresses. It is also possible to design switching networks that are
nonblocking and offer multiple paths from each CPU to each memory module to
spread the traffic better.
NUMA Multiprocessors
Single-bus UMA multiprocessors are generally limited to no more than a few
dozen CPUs, and crossbar or switched multiprocessors need a lot of (expensive)
hardware and are not that much bigger.
To get to more than 100 CPUs, something
has to give.
Usually, what gives is the idea that all memory modules have the same
access time. This concession leads to the idea of NUMA multiprocessors, as men-
tioned above.
Like their UMA cousins, they provide a single address space across
all the CPUs, but unlike the UMA machines, access to local memory modules is
faster than access to remote ones. Thus all UMA programs will run without change
on NUMA machines, but the performance will be worse than on a UMA machine.
NUMA machines have three key characteristics that all of them possess and
which together distinguish them from other multiprocessors:
1.
There is a single address space visible to all CPUs.
2.
Access to remote memory is via
LOAD
and
STORE
instructions.
3.
Access to remote memory is slower than access to local memory.
When the access time to remote memory is not hidden (because there is no cach-
ing), the system is called
NC-NUMA
(
Non Cache-coherent NUMA
). When the
caches are coherent, the system is called
CC-NUMA
(
Cache-Coherent NUMA
).
A popular approach for building large CC-NUMA multiprocessors is the
directory-based multiprocessor
.
The idea is to maintain a database telling where
each cache line is and what its status is. When a cache line is referenced, the data-
base is queried to find out where it is and whether it is clean or dirty.
Since this
database is queried on every instruction that touches memory, it must be kept in ex-
tremely fast special-purpose hardware that can respond in a fraction of a bus cycle.
To make the idea of a directory-based multiprocessor somewhat more concrete,
let us consider as a simple (hypothetical) example, a 256-node system, each node
consisting of one CPU and 16 MB of RAM connected to the CPU via a local bus.
The total memory is 2
32
bytes and it is divided up into 2
26
cache lines of 64 bytes
each. The memory is statically allocated among the nodes, with 0–16M in node 0,
16M–32M in node 1, etc. The nodes are connected by an interconnection network,
Ace your assessments! Get Better Grades
Browse thousands of Study Materials & Solutions from your Favorite Schools
Concordia University
Concordia_University
School:
Operating_Systems
Course:
Introducing Study Plan
Using AI Tools to Help you understand and remember your course concepts better and faster than any other resource.
Find the best videos to learn every concept in that course from Youtube and Tiktok without searching.
Save All Relavent Videos & Materials and access anytime and anywhere
Prepare Smart and Guarantee better grades
Students also viewed documents
lab 18.docx
lab_18.docx
Course
Course
3
Module5QuizSTA2023.d...
Module5QuizSTA2023.docx.docx
Course
Course
10
Week 7 Test Math302....
Week_7_Test_Math302.docx.docx
Course
Course
30
Chapter 1 Assigment ...
Chapter_1_Assigment_Questions.docx.docx
Course
Course
5
Week 4 tests.docx.do...
Week_4_tests.docx.docx
Course
Course
23
Week 6 tests.docx.do...
Week_6_tests.docx.docx
Course
Course
106