Using a state-of-the-art multi-threaded computing concept, we implement a new parallel version of hmmpfam on EARTH (Efficient Architecture for Running Threads). EARTH is an event-driven, fine-grain multi-threaded program execution model that supports fine-grain, non-preemptive fibers; it was developed by CAPSL (Computer Architecture and Parallel Systems Laboratory) at the University of Delaware. In its current implementations, the EARTH multi-threaded execution model is built from off-the-shelf microprocessors in a distributed-memory environment. The EARTH runtime system (version 2.5) performs fiber scheduling, inter-node communication, inter-fiber synchronization, global memory management, and dynamic load balancing, and it supports SMP nodes. Applications for EARTH are coded in Threaded-C, a multi-threaded extension of C.
To parallelize hmmpfam, we develop two schemes: the first predetermines the job distribution across all computing nodes with a round-robin algorithm; the second takes advantage of the dynamic load balancing support of the EARTH runtime system, which simplifies the programmer's work by making job distribution completely transparent.
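The difference between the two schemes can be illustrated with a toy model. The sketch below is not Threaded-C and does not reproduce the EARTH load balancer; it only assumes that each job has a known cost and that, in the dynamic case, whichever worker becomes idle first takes the next job. The function names and the job costs are hypothetical and chosen purely for illustration.

```python
import heapq


def round_robin_makespan(job_costs, num_workers):
    """Static scheme: job i is assigned up front to worker i % num_workers."""
    loads = [0] * num_workers
    for i, cost in enumerate(job_costs):
        loads[i % num_workers] += cost
    # The run finishes when the most heavily loaded worker finishes.
    return max(loads)


def dynamic_makespan(job_costs, num_workers):
    """Dynamic scheme (assumed model): the first idle worker takes the next job."""
    finish_times = [0] * num_workers
    heapq.heapify(finish_times)
    for cost in job_costs:
        earliest = heapq.heappop(finish_times)  # worker that becomes idle first
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Hypothetical job costs: two expensive jobs that round-robin
# happens to place on the same worker.
costs = [8, 1, 1, 1, 8, 1, 1, 1]
static_time = round_robin_makespan(costs, 4)
dynamic_time = dynamic_makespan(costs, 4)
```

With these made-up costs, the static assignment puts both expensive jobs on the same worker, while the dynamic assignment spreads them out. Real search jobs have costs that are not known in advance, which is the situation where a runtime-level balancer is most useful.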
In this poster, we present a detailed analysis of the hmmpfam program and the different parallel schemes, along with some basic concepts of multi-threaded parallelization of hmmpfam on EARTH RTS 2.5. We then report test results on various computing environments and compare them with the PVM version of hmmpfam. When searching 250 sequences against a 585-family HMMER database on 18 dual-CPU computing nodes, the PVM version achieves an absolute speedup of 18.50, while the EARTH version achieves 30.91, a 40.1% reduction in execution time. We also test our program on Chiba City, the supercomputing cluster at Argonne National Laboratory. When the seqfile contains 38192 sequences and the HMMER database has 50 families, the EARTH version achieves an absolute speedup of 222.8 on 128 dual-CPU nodes, reducing the total execution time from 15.9 hours (serial program) to only 4.3 minutes.
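The derived figures follow from the reported speedups by simple arithmetic. The sketch below assumes that absolute speedup means serial time divided by parallel time and that both versions are measured against the same serial baseline; the helper names are hypothetical.

```python
def time_reduction(speedup_old, speedup_new):
    """Fractional reduction in execution time when moving from the old
    speedup to the new one, against the same serial baseline.
    T_old = T_serial / speedup_old and T_new = T_serial / speedup_new,
    so 1 - T_new / T_old = 1 - speedup_old / speedup_new."""
    return 1.0 - speedup_old / speedup_new


def parallel_minutes(serial_hours, speedup):
    """Parallel execution time in minutes for a given serial time and speedup."""
    return serial_hours * 60.0 / speedup


# Figures quoted in the text.
reduction_pct = round(100.0 * time_reduction(18.50, 30.91), 1)
chiba_minutes = round(parallel_minutes(15.9, 222.8), 1)
```

Under these assumptions, `reduction_pct` reproduces the quoted 40.1% and `chiba_minutes` reproduces the quoted 4.3 minutes.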