Parallelizing Protein Folding

We obtained good speedups on multiple platforms, ranging from small Linux clusters to distributed shared-memory machines to massively parallel machines.
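
For readers unfamiliar with the metric, the sketch below shows how speedup and parallel efficiency are conventionally computed from wall-clock times. It is an illustration only: the function names and the timing values are hypothetical and are not measurements from the experiments reported here.

```python
def speedup(t_serial, t_parallel):
    """Speedup = single-processor time divided by parallel time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_procs):
    """Parallel efficiency = speedup divided by the processor count."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return speedup(t_serial, t_parallel) / num_procs


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by processor count.
    # These are placeholder values, not results from this project.
    timings = {1: 1200.0, 2: 620.0, 4: 320.0, 8: 170.0}
    baseline = timings[1]
    for procs in sorted(timings):
        s = speedup(baseline, timings[procs])
        e = efficiency(baseline, timings[procs], procs)
        print(f"{procs:>3} procs: speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency near 1.0 means the run scales almost linearly with the number of processors; lower values indicate communication or load-imbalance overhead.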

The following are performance results for all three types of proteins:

Speedups for Linux Cluster A
The cluster consists of four boards, each with two processors and 2 GB of RAM. Two boards have 1 GHz processors with 256 KB caches, and two boards have 1.1 GHz processors with 512 KB caches. The boards are connected by a dedicated Gigabit Ethernet switch.
Plots: total time and each phase, for Protein CTXIII
Speedups for SGI Altix 3700
A distributed shared-memory machine at the Texas A&M University Supercomputing Facility. It contains 32 nodes, each with two pairs of 1.3 GHz 64-bit processors, and 256 GB of RAM.
Plots: total time and each phase, for Protein A
Speedups for MCR
A large, dedicated Linux cluster at Lawrence Livermore National Laboratory. It has 1152 nodes, each with two 2.4 GHz processors and 4 GB of RAM. The nodes are connected by a Gigabit Ethernet switch.
Plots: total time and each phase, for Protein G
Speedups for BlueGene/L
A scalable, massively parallel 180-teraflop machine that will have up to 65,536 compute nodes configured as a 64x32x32 three-dimensional torus. Each node has a single ASIC and 256 MB of memory.
Plots: total time and each phase, for Protein CTXIII
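
To make the 64x32x32 torus configuration concrete, the following sketch checks that the dimensions multiply out to 65,536 nodes and lists the six wraparound neighbors of a node. The rank-to-coordinate mapping and the function names are assumptions chosen for illustration; they are not BlueGene/L's actual node-numbering scheme.

```python
# Torus dimensions as stated in the machine description.
DIMS = (64, 32, 32)


def rank_to_coords(rank, dims=DIMS):
    """Map a linear node index to (x, y, z), with x varying fastest.

    This ordering is a hypothetical convention used only for this sketch.
    """
    x_dim, y_dim, z_dim = dims
    if not 0 <= rank < x_dim * y_dim * z_dim:
        raise ValueError("rank out of range")
    x = rank % x_dim
    y = (rank // x_dim) % y_dim
    z = rank // (x_dim * y_dim)
    return (x, y, z)


def torus_neighbors(coords, dims=DIMS):
    """Return the six nearest neighbors, wrapping around at the edges."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            shifted = list(coords)
            shifted[axis] = (shifted[axis] + step) % dims[axis]
            neighbors.append(tuple(shifted))
    return neighbors


if __name__ == "__main__":
    total_nodes = DIMS[0] * DIMS[1] * DIMS[2]
    print("total nodes:", total_nodes)
    print("neighbors of node 0:", torus_neighbors(rank_to_coords(0)))
```

Because every axis wraps around, each node has exactly six neighbors, including nodes on the faces and corners of the grid.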

Parallel Protein Folding with STAPL, Shawna Thomas, Nancy M. Amato, In Proc. IEEE Int. Wkshp. on High Performance Computational Biology, Santa Fe, NM, Apr 2004.

Probabilistic Roadmap Methods are Embarrassingly Parallel, Nancy M. Amato, Lucia K. Dale, In Proc. IEEE Int. Conf. Robot. Autom. (ICRA), pp. 688-694, Detroit, Michigan, USA, May 1999.


