Issue No 23, Oct 2009 

 
Editorial - HPC Resources for Life Sciences Research - by Tan Chee Chiang, Associate Director, Computer Centre

In the past few years, we have seen a significant increase in the use of HPC resources for life sciences computing, such as bioinformatics and computational biology. Compared to other major HPC applications, life sciences research computation is known to be at least as demanding in terms of compute and storage requirements. In this issue, we share some of our ongoing efforts to enable more effective and productive use of HPC resources and technologies for life sciences research.

The HPCBIO Portal, which was launched recently, provides one-stop web-based access to some 30 life sciences applications in areas such as docking, sequencing, modelling and phylogeny. The Portal hides the complexity of the backend HPC resources from users while allowing them to tap into the new computing capability provided. More details are provided in the Technical Updates section. You will also find an article on the upcoming parallel file system, which provides tens of terabytes of high-performance storage space for parallel processing on the compute clusters. The large storage will come in handy for researchers who need to work on data-intensive life sciences applications. We also revisit the NUS Grid infrastructure to highlight its strategic role in life sciences computational support.

Two user articles are presented in the HPC Showcasing section, one focusing on PHYLIP and the other on the Gaussian application software. The PHYLIP article demonstrates the speedup capability of the NUS Grid (reducing one year of CPU time to one week of runtime in one case), whereas the Gaussian article highlights the data intensiveness of such research simulations.

 
 
HPC Showcasing
 
  Molecular Simulation of a Functional Polyimide Containing Electron-Donor and -Acceptor Groups Using Gaussian 03 Software Package  - by Liu Yiliang, Prof. Kang En-Tang, Dept. of Chemical and Biomolecular Engineering
The Gaussian 03 software package has been used to model the excited state of a functional polyimide, a potential material for organic memory devices. This helps us understand the charge transfer mechanism when the material is placed under an electric field. The calculation was carried out on the Linux clusters with a GPFS file system at HPC, Computer Centre. As the calculation is I/O intensive and can create scratch files as large as 800GB, the large-capacity GPFS file system ensures that the calculation jobs finish properly and promptly. Please read the article for details.
 
  Accelerating Protein Phylogenetic Analysis by PHYLIP on NUS Grid - by Hu Yongli, Dept. of Biochemistry, Yong Loo Lin School of Medicine 

Bioinformatics numerical analyses in the life sciences are usually data intensive and time-consuming. In this article we share how we investigated and grid-parallelised PHYLIP, the open source package for inferring phylogenies. With the grid-parallelised PHYLIP, one analysis that could take a year to run on a standalone server was completed in one week. This greatly speeds up protein phylogenetic tree construction and contributes positively to biological research. Read more details in this article.
 
Technical Updates
 
  NUS HPCBIO Portal  - by Grace Foo, Principal HPC Specialist, Computer Centre 
The HPCBIO Portal aims to provide NUS Life Sciences researchers with convenient access to over 30 applications in Bioinformatics and Molecular Modelling. The web-based, menu-driven Portal allows users to submit and manage jobs and to manage their HPC home and working directories. One advantage is that users do not need to specify which servers or queues to run their jobs on, as the Portal does this for them. To find out more about the Portal, please read on.
 
  A Highly Scalable, Parallel File System for Biological Sciences - by Yeo Eng Hee, Principal HPC Specialist, Computer Centre

Researchers, especially those in the biological sciences, may soon generate a tsunami of data. The upcoming parallel file system being implemented by the Computer Centre aims to meet the demands of these and other researchers by providing a total of 120TB of highly scalable, parallel file system storage in the coming weeks. Read on for more details.
 
 
  Accelerate Life Science Applications Using Grid Computing  - by Wang Junhong, Lead HPC Specialist, Computer Centre 
Though grid computing is not new, many users have the impression that a grid is like a "supercomputer", a very powerful computer that they can use to run any extremely compute-intensive or memory-intensive simulation or analysis. This is not true of every large computational task; it holds only for tasks that can be partitioned into multiple sub-tasks and processed independently. Read this article to find out how four open source bioinformatics application packages were parallelised and enabled on the NUS Grid, and the benefits and performance offered to researchers. We welcome researchers doing similar bioinformatics analyses to explore the feasibility of parallelising and enabling their applications on the NUS Grid.
 
  The Leading Molecular Electronic Structure Calculation Software Packages - by Zhang Xinhuai, Principal HPC Specialist, Computer Centre

Nowadays, molecular systems of tens to hundreds of atoms are routinely studied, thanks to increased computational resources, advanced computing technology and the availability of powerful computational software. Computational chemistry software based on electronic structure theory plays a very important role in life science and material science. It enables scientists to go beyond the capabilities of the laboratory and to investigate the nature and origin of the electronic, optical, and structural properties of a system with high accuracy, without the need for any experimental input other than the atomic number and mass of the constituent atoms. Many software packages are available in this area, and some of them have evolved over the years. This article summarises a few popular packages and gives a brief overview of their features and specialties.
 
