finishing up report
git-svn-id: svn://anubis/gvsu@408 45c1a28c-8058-47b2-ae61-ca45b979098e
parent fbe0691460
commit ea4e68b769
@@ -206,6 +206,33 @@
In this way, one worker process was spawned per core on the
slave node.
</p>
<p>
I had originally planned to implement fault tolerance in the
distribution architecture by establishing a second TCP connection
from each slave node to the master, which would serve as a polling
connection to verify that the slaves were still alive.
During implementation, I arrived at a more elegant solution.
I was already keeping track of the set of tasks that the master
process considered "in progress".
When the master process received a request from a slave for a task
to work on, it would normally respond with the next available task
number until all tasks had been given out, and after that it would
respond that there were no more tasks to work on.
I changed this slightly so that if the master received a request
from a slave for a task to work on after all of the tasks had
already been given out, the master would respond with a
task ID from the set of tasks that were still in progress.
That way, whether the original slave node or the new one finished
the task, the data for it would be collected.
If the original node was dead, the new slave node would
take over the task and return the data.
If the original node was alive but responding very slowly,
the replacement node could finish the task and return
the results before the original node did.
This worked very well in practice: I was able to kill all of
the worker processes on a given slave node, and the tasks
that they had been working on were later finished by other nodes.
</p>
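<p>
The re-issuing scheme described above can be sketched roughly as
follows. This is only an illustration in Python, not the code used in
the project: the class name <code>TaskMaster</code>, its method names,
the choice to re-issue the lowest outstanding task ID, and the rule
that the first result received for a task wins are all assumptions
made for the example. The TCP communication with the slaves is left out.
</p>

```python
class TaskMaster:
    """Hypothetical sketch of the master's task bookkeeping.

    Hands out task IDs in order; once every task has been given out,
    re-issues IDs that are still in progress so that a dead or slow
    slave cannot stall the whole job.
    """

    def __init__(self, num_tasks):
        self.num_tasks = num_tasks
        self.next_task = 0          # next task ID never handed out yet
        self.in_progress = set()    # handed out, no result received yet
        self.results = {}           # task ID -> collected data

    def request_task(self):
        """Return a task ID for a requesting slave, or None if no work is left."""
        if self.next_task < self.num_tasks:
            task_id = self.next_task
            self.next_task += 1
            self.in_progress.add(task_id)
            return task_id
        if self.in_progress:
            # Every task has been given out at least once: re-issue one
            # that is still outstanding (lowest ID is an assumption here).
            return min(self.in_progress)
        return None

    def submit_result(self, task_id, data):
        """Store the first result for a task; report whether it was used."""
        if task_id in self.in_progress:
            self.in_progress.remove(task_id)
            self.results[task_id] = data
            return True
        # A duplicate from the slower node: the data was already collected.
        return False
```

<p>
With this bookkeeping, a slave that asks for work after the last task
has been handed out simply receives an outstanding task ID, and
whichever node returns the result first has its data recorded.
</p>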

<a name="evaluation" />
<h4>Evaluation</h4>