Times for big.seq and big.unk on mp:

Sequential: 5.33091 sec
Threaded:
    num_threads  time (sec)
    1            10.7522
    2            11.1476
    4            15.0832
    8            36.2662

The threaded version is generally slower than the sequential version because
each step involves considerably more logic: it has to calculate indices into
the matrix and traverse it along diagonals instead of row-wise. With more
threads, it also has to synchronize more, since every thread does a barrier
wait at the end of each step. Because some threads have little work,
especially at the beginning and end of the algorithm, more time is spent on
synchronization than on the actual computation.

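To make the cost of the per-step barrier concrete, here is a minimal sketch in
Python (not the language or code of the timed program) that counts how many
cells lie on each diagonal of a matrix. The function names and the 10 x 100000
shape are hypothetical and chosen only for illustration; they are not the real
lengths of big.seq or big.unk.

```python
def diagonal_lengths(rows, cols):
    """Return the number of cells on each anti-diagonal of a rows x cols matrix.

    Diagonal d holds the cells (i, j) with i + j == d, so there are
    rows + cols - 1 diagonals in total, one per synchronized step.
    """
    return [
        min(d, rows - 1) - max(0, d - (cols - 1)) + 1
        for d in range(rows + cols - 1)
    ]


def cells_per_step(rows, cols):
    """Average number of cells available to share between two barrier waits."""
    lengths = diagonal_lengths(rows, cols)
    return sum(lengths) / len(lengths)


# Hypothetical shape: a short sequence compared against a long one.
rows, cols = 10, 100_000
lengths = diagonal_lengths(rows, cols)
print("barrier steps:     ", len(lengths))
print("largest diagonal:  ", max(lengths))
print("avg cells per step:", round(cells_per_step(rows, cols), 2))
```

For this assumed shape the sweep needs 100009 steps, and no step has more than
10 cells to divide among the threads, so each barrier wait is paid for only a
handful of cell updates. The first and last diagonals hold a single cell each.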
In contrast, when the "unknown" search sequence grows in length and the known
database sequence shrinks, so that the two sequences are closer in length,
the threaded version scales much better with the number of threads, because
less synchronization is required relative to the amount of actual computation
being done:

Times for more-even.seq and more-even.unk on mp:

Sequential: 1.34995 sec
Threaded:
    num_threads  time (sec)
    1            10.0506
    2            4.26101
    4            2.7733
    8            2.64885
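
The two sets of timings are easier to compare as speedups. The sketch below is
a small Python helper (the names are hypothetical, not part of the timed
program) that divides a baseline time by each threaded time, once against the
sequential run and once against the 1-thread run. It assumes the threaded
times are in seconds, like the sequential ones.

```python
# Timings copied from the two result listings (assumed to be in seconds).
sequential = {"big": 5.33091, "more-even": 1.34995}
threaded = {
    "big": {1: 10.7522, 2: 11.1476, 4: 15.0832, 8: 36.2662},
    "more-even": {1: 10.0506, 2: 4.26101, 4: 2.7733, 8: 2.64885},
}


def speedups(baseline, times):
    """Map each thread count to baseline / time (values above 1 mean faster)."""
    return {n: baseline / t for n, t in times.items()}


for name, times in threaded.items():
    vs_sequential = speedups(sequential[name], times)
    vs_one_thread = speedups(times[1], times)
    print(name)
    for n in sorted(times):
        print(f"  {n} threads: {vs_sequential[n]:.2f}x vs sequential, "
              f"{vs_one_thread[n]:.2f}x vs 1 thread")
```

On these numbers, the more-even input runs roughly 3.8 times faster with 8
threads than with 1 thread, while the big input gets slower with every added
thread. Even so, the 8-thread run on the more-even input takes about twice as
long as the sequential version.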