% Preamble
\documentclass[11pt,fleqn]{article}
\usepackage{amsmath, amsthm, amssymb}
\usepackage{fancyhdr}
\oddsidemargin -0.25in
\textwidth 6.75in
\topmargin -0.5in
\headheight 0.75in
\headsep 0.25in
\textheight 8.75in
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
\fancyhf{}
\lhead{HW Chap. 6\\\ \\\ }
\rhead{Josh Holtrop\\2008-11-19\\CS 677}
\rfoot{\thepage}

\begin{document}
\noindent
\begin{enumerate}

\item[1.]{
The obvious benefit of non-blocking communication is that the calling process does not block while the communication takes place, leaving it free to perform further computation or to work on something else. Another benefit is that certain kinds of deadlock can be avoided, because the send call does not block until the message is received (for example, when two machines each perform a send operation followed by a receive operation). One challenge of non-blocking communication is that it is harder to program safely: the programmer must take care that a buffer involved in a non-blocking operation is not modified while the operation is still in flight. A second challenge is that if synchronization is necessary (i.e., the sender wants to know when the message was received), the program must explicitly test or wait on the outstanding request to obtain that information. A minimal sketch of this pattern appears after the answers.
}
\vskip 1em

\item[2.]{
Assume the mesh is $n \times n$ and that the node performing the scatter sits in the top-left corner. First, the total data is divided into $n$ sections. The scattering node keeps one of these sections for itself and sends the remaining $n-1$ down to the next node in its column. The nodes along the left edge continue forwarding sections downward, each keeping one section for itself. Then each left-edge node breaks its section into $n$ parts and repeats the same process along its row, distributing the $n$ subparts to the nodes on its right. This method takes $2(n-1)$ steps: $n-1$ hops down the left column followed by $n-1$ hops across each row (a worked check appears after the answers). The message transfer size is not fixed: the messages near the scattering node are large, and subsequent messages shrink as the data travels farther from it.
}
\vskip 1em

\item[3.]{
I wrote an MPI application that incremented a length variable from 100 to 100000 and then sent and received a message of that length (the master did a send, then a receive, while the slave did a receive, then a send). For each length value, I repeated this test 100 times and averaged the times to get the final round-trip time. Finally, I divided the message length by the round-trip time to get the round-trip throughput (bytes per second) from one MPI host to another and back again. I recorded the length that gave the highest round-trip throughput and printed it out at the end of the test (a sketch of the timing loop appears after the answers). Unfortunately, each time I ran the test the optimal length value varied significantly: sometimes it printed 6700 or 8400, and sometimes 27000 or 86000. I am not sure whether this is because, beyond a certain length, MPI simply packs the data into same-sized packets for transfer, or whether for some other reason all transfers take about the same amount of time, making it relatively random which length comes out most efficient.
}

\end{enumerate}
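\vskip 1em
\noindent
The following is a minimal sketch in C of the non-blocking pattern from problem 1. It assumes exactly two ranks, and the buffer names are illustrative rather than taken from any particular program. Because each rank posts its send without blocking, the send/send deadlock described above cannot occur, and \texttt{MPI\_Wait} provides the explicit synchronization point.
\begin{verbatim}
/* Sketch: two ranks exchange buffers without the send/send deadlock.
 * Assumes exactly two ranks; buffer names are illustrative. */
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, other, i;
    double sendbuf[1024], recvbuf[1024];
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;
    for (i = 0; i < 1024; i++)
        sendbuf[i] = rank;

    /* Post the send without blocking; sendbuf must not be modified
     * until MPI_Wait reports that the operation has completed. */
    MPI_Isend(sendbuf, 1024, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, &req);

    /* Both ranks can now receive, so neither blocks forever. */
    MPI_Recv(recvbuf, 1024, MPI_DOUBLE, other, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    /* Explicit synchronization: wait before reusing sendbuf. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}
\end{verbatim}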
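\vskip 1em
\noindent
As a check of the step count in problem 2 (my own accounting, assuming each message hop counts as one step and writing $m$ for the total data size), the column phase forwards shrinking bundles down the left edge, and the row phase does the same across each row:
\[
\text{column phase: } \frac{(n-1)m}{n} \;\to\; \frac{(n-2)m}{n} \;\to\; \cdots \;\to\; \frac{m}{n}
\qquad (n-1 \text{ steps}),
\]
\[
\text{row phase: } \frac{(n-1)m}{n^2} \;\to\; \frac{(n-2)m}{n^2} \;\to\; \cdots \;\to\; \frac{m}{n^2}
\qquad (n-1 \text{ steps per row}).
\]
The bottom-left node receives its bundle after $n-1$ steps and its row finishes $n-1$ steps later, giving $2(n-1)$ steps in total; the shrinking bundle sizes are why the message transfer size is not fixed.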
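\vskip 1em
\noindent
The following is a sketch of the timing loop from problem 3, reconstructed from the description above rather than copied from the original program; the length increment and the \texttt{REPS} constant are illustrative, and exactly two ranks are assumed.
\begin{verbatim}
/* Sketch of the round-trip benchmark: rank 0 sends then receives,
 * rank 1 receives then sends; times are averaged over REPS trials.
 * Reconstruction from the write-up; constants are illustrative. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define REPS 100

int main(int argc, char *argv[])
{
    int rank, len, i, best_len = 0;
    double best_rate = 0.0;
    char *buf = calloc(100000, 1);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (len = 100; len <= 100000; len += 100) {
        double start = MPI_Wtime();
        for (i = 0; i < REPS; i++) {
            if (rank == 0) {   /* master: send, then receive */
                MPI_Send(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else {           /* slave: receive, then send */
                MPI_Recv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double rtt = (MPI_Wtime() - start) / REPS; /* avg round trip */
        if (rank == 0 && len / rtt > best_rate) {
            best_rate = len / rtt;
            best_len = len;
        }
    }

    if (rank == 0)
        printf("best length: %d bytes (%.0f bytes/s round trip)\n",
               best_len, best_rate);

    free(buf);
    MPI_Finalize();
    return 0;
}
\end{verbatim}

\end{document}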