% Preamble
\documentclass[11pt,fleqn]{article}
\usepackage{amsmath, amsthm, amssymb}
\usepackage{fancyhdr}
\oddsidemargin -0.25in
\textwidth 6.75in
\topmargin -0.5in
\headheight 0.75in
\headsep 0.25in
\textheight 8.75in
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
\fancyhf{}
\lhead{HW Chap. 6\\\ \\\ }
\rhead{Josh Holtrop\\2008-11-19\\CS 677}
\rfoot{\thepage}

\begin{document}

\noindent
\begin{enumerate}

\item[1.]{
The obvious benefit of using non-blocking communication is that
the calling process does not block while the communication is taking
place.
This leaves the process free to do further computation or to work on
something else.
Another benefit of non-blocking communication is that certain kinds
of deadlock can be avoided because the send call does not block
until the message is received (for example, when two machines each
perform a send operation followed by a receive operation).

One challenge of using non-blocking communication is that it is
harder to program safely.
The programmer must take more care to ensure that a buffer involved
in a non-blocking operation is not modified while the transfer is
still in progress.
A second challenge is that if synchronization is necessary (e.g.,
the sender wants to know when the message has been received), the
programmer must explicitly poll or wait to obtain this information.
A sketch of such an exchange is shown below.
}
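
A minimal sketch of such a non-blocking exchange follows (written in C
as an assumption; the buffer names, message size, and two-rank layout
are also illustrative assumptions, not code from this assignment):

\begin{verbatim}
/* Sketch: symmetric exchange between two ranks with non-blocking calls,
 * so neither rank can deadlock the way two blocking sends might.
 * Assumes the job is run with exactly two ranks. */
#include <mpi.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank, peer;
    double sendbuf[N], recvbuf[N];
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                   /* the other of the two ranks */

    for (int i = 0; i < N; i++)
        sendbuf[i] = rank;

    /* Post both operations without blocking; sendbuf must not be
     * modified and recvbuf must not be read until the wait completes. */
    MPI_Isend(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... other computation could overlap the communication here ... */

    /* Explicit synchronization point: the program only learns that the
     * transfers have finished by waiting (or polling with MPI_Test). */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    MPI_Finalize();
    return 0;
}
\end{verbatim}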

\vskip 1em
\item[2.]{
Assume the mesh size is $n \times n$.
Further assume that the node doing the scatter is located in the
top left corner of the mesh.
First, the total data is divided $n$ ways.
One of these sections is kept by the node doing the scattering,
and the remaining $n-1$ are sent to the next node down.
The nodes along the left edge continue passing data sections down,
each keeping one section for itself.
Then, each node along the left edge breaks its section into $n$
parts and repeats the previous process to distribute those $n$
subparts to the nodes to its right.

This method will take $2n$ steps to complete (a step-count sketch
is given below).
The message transfer size is not fixed.
The messages at the beginning (closer to the scattering node) are
larger, and subsequent messages get smaller and smaller as they get
farther from the scattering node.
}
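
One way to see the $2n$ figure, assuming one message hop per node per
step as described above: the column phase along the left edge takes on
the order of $n$ steps to move the $n-1$ sections down from the corner,
and the row phase takes on the order of another $n$ steps, so
\[
\underbrace{n}_{\text{column phase}} + \underbrace{n}_{\text{row phase}}
\approx 2n \text{ steps}.
\]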

\vskip 1em
\item[3.]{
I wrote an MPI application which incremented a length variable from
100 to 100000 and then sent and received a message of that length
(the master did a send and then a receive, while the slave did a
receive and then a send); a sketch of this test appears below.
For each length value, I repeated this test 100 times and averaged
the times to get the final round-trip time.
Finally, I divided the length value by the round-trip time to get
the round-trip bytes per second that could be sent from one MPI host
to another and back again.
I recorded the length that gave the highest round-trip throughput
value and printed it out at the end of the test.
Unfortunately, each time I ran the test the optimal length value
varied significantly: sometimes it printed 6700 or 8400, and
sometimes 27000 or 86000.
I am not sure whether this is because, once a certain length is
reached, MPI simply packs the data into packets of the same size for
transfer, or whether for some other reason all transfers take about
the same amount of time, making it essentially random which length
comes out as the most efficient.
}
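
A minimal sketch of this ping-pong test follows (in C as an assumption;
the increment step for the length, the buffer handling, and the variable
names are my own illustrative choices, not the exact code used for the
measurements above):

\begin{verbatim}
/* Sketch: time a send/receive round trip between rank 0 (master) and
 * rank 1 (slave) for a range of message lengths.  Assumes two ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define REPS 100   /* repetitions averaged per length */

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Step size of 100 is an assumption; the write-up only says the
     * length ran from 100 to 100000. */
    for (int len = 100; len <= 100000; len += 100) {
        char *buf = malloc(len);
        double t0 = MPI_Wtime();
        for (int r = 0; r < REPS; r++) {
            if (rank == 0) {            /* master: send, then receive */
                MPI_Send(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {     /* slave: receive, then send */
                MPI_Recv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double avg = (MPI_Wtime() - t0) / REPS;   /* avg round-trip time */
        if (rank == 0)
            printf("len %d: %.0f bytes/s round trip\n", len, len / avg);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}
\end{verbatim}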

\end{enumerate}

\end{document}