finishing up report

git-svn-id: svn://anubis/gvsu@408 45c1a28c-8058-47b2-ae61-ca45b979098e
2009-04-16 02:49:37 +00:00 · ea4e68b769
commit ea4e68b769
parent fbe0691460
1 changed file with 27 additions and 0 deletions
--- a/cs658/html/report.html
+++ b/cs658/html/report.html
@@ -206,6 +206,33 @@
        In this way, one worker process was spawned per core on the
        slave node.
    </p>
+    <p>
+        I had originally planned to implement fault tolerance in the
+        distribution architecture by establishing a second TCP connection
+        from each slave node to the master, which would serve as a polling
+        connection to verify that the slaves were still alive.
+        During implementation, I arrived at a more elegant solution.
+        The master process was already keeping track of the set of tasks
+        that it considered "in progress."
+        When the master received a request from a slave for a task to
+        work on, it would normally respond with the next available task
+        number until all tasks had been given out, after which it would
+        respond that there were no more tasks to work on.
+        I changed this slightly so that when the master got a request
+        from a slave for a task to work on after all of the tasks had
+        already been given out, it would respond to the slave with a
+        task ID from the set of tasks that were still in progress.
+        That way, whether the original slave node or the new one finished
+        the task, the data for it would be collected.
+        If the original node was dead, then the new slave node would
+        take over the task and return the data.
+        If the original node was alive, but just responding very slowly,
+        then the replacement node could finish the task and return
+        the results before the original node.
+        This worked very well in practice: when I killed all of the
+        worker processes on a given slave node, the tasks that they
+        had been working on were later finished by other nodes.
+    </p>

    <a name="evaluation" />
    <h4>Evaluation</h4>