1. Background and research objectives
(a) Research Background Merged from Statistical Physics and Protein Physics:
Paradigm Shift for Uncovering Novel Biological Phenomena in Life Sciences
A protein is a sequence built from the 20 naturally occurring amino acids. At high temperature it is in the unfolded state, which has high structural diversity; at low temperature it is in the folded state, which has a stable energy and a unique native structure that performs its biological function. A protein therefore exhibits markedly different structural, thermodynamic, kinetic, and biological characteristics on either side of the folding-unfolding transition temperature. So far only about seven thousand representative protein structures are known in terms of the structural classification scheme of proteins. Given a new amino acid sequence, that is, a new protein, it is of great importance to predict or recognize the native structure of the target protein and to clarify its folding mechanism. This is the starting point for understanding the biological function of a protein in the purely academic sense, and it carries even more important biological implications for preventing disease and regulating life phenomena. The most fundamental issue in the protein folding problem is to determine the protein energy function, from which one can begin to understand the folding mechanism and characterize the thermodynamic, kinetic, and mutagenesis behaviors of proteins quantitatively from first principles.
One of the critical issues in frontier protein science worldwide is to understand, both theoretically and experimentally, the relationship among the amino acid sequence, the native structure, and the biological function of a protein from the viewpoint of protein folding kinetics. Moreover, we wish to be able to regulate the biological function of a protein and direct the outcome toward the well-being of human life. This will become possible if we can understand, design, and regulate the structural, thermodynamic, and kinetic characteristics of a protein.
Although many research groups have worked within this paradigm, the theoretical and computational description of how the biological function of a protein and the corresponding life phenomena are controlled remains far from satisfactory. To this end we need a breakthrough not only in our theoretical concepts for describing protein function but also in our computational capability to simulate protein structures. Here we propose to pursue this through an interdisciplinary approach that merges the essential concepts and ideas of physics, chemistry, life science, and computer science, in order to make fundamental and profound progress in uncovering and regulating the biological function of a protein.
A unified theoretical and computational framework of biophysics and protein science, one that can both recognize the native structure of a protein and describe the thermodynamic, kinetic, and mutagenesis characteristics of a target protein, is not yet well established. This poses a formidable difficulty in uncovering, designing, and regulating the fundamental biological function of a protein. In this project, we start by designing a multiscale-heterogeneous protein energy function that encompasses the pairwise interaction energies between all atoms and then coarse-grains them into interaction energies between all amino acids. This novel coarse-graining scheme will be carried out by sampling structural ensembles of a protein with molecular dynamics simulation, and it will eventually allow us to construct a protein energy function that depends specifically on both the amino acid sequence and the native structure of the protein. Our design of the protein energy function therefore captures the interaction energies of a protein at both the microscopic and the macroscopic length scales. Once the protein energy function is designed, we can take advantage of developments in equilibrium and nonequilibrium statistical physics to calculate exactly the thermodynamic and kinetic properties of a protein. We will also trace out the folding pathways of a target protein by using the supercomputing cluster that we have constructed ourselves and by developing the phase ordering-supercomputing Monte Carlo algorithm. Our novel interdisciplinary approach to uncovering, designing, and regulating the biological and life phenomena of a protein will place us at the top level worldwide in the frontier research of theoretical/computational protein biophysics, and we are determined to take the initiative in this direction.
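The coarse-graining step described above, from atom-atom pairwise energies to residue-residue energies averaged over a sampled structural ensemble, can be illustrated with a minimal sketch. It is not the project's actual scheme: the function name `coarse_grain_energy`, the array layout, and the simple "average over frames, then sum over the atoms of each residue" rule are all assumptions made here for illustration, and the input energies would in practice come from a molecular dynamics force field rather than a hand-written array.

```python
import numpy as np

def coarse_grain_energy(atom_energy_frames, atom_to_residue, n_residues):
    """Illustrative coarse-graining (assumed rule, not the proposal's scheme).

    atom_energy_frames: array of shape (n_frames, n_atoms, n_atoms) holding
        symmetric atom-atom pairwise energies for each sampled structure.
    atom_to_residue: length n_atoms list giving the residue index of each atom.
    Returns an (n_residues, n_residues) matrix of ensemble-averaged energies.
    """
    frames = np.asarray(atom_energy_frames, dtype=float)
    mean_atom = frames.mean(axis=0)  # ensemble average over sampled frames
    n_atoms = mean_atom.shape[0]
    # membership[a, r] = 1 if atom a belongs to residue r, else 0
    membership = np.zeros((n_atoms, n_residues))
    membership[np.arange(n_atoms), np.asarray(atom_to_residue)] = 1.0
    # Sum atom-pair energies over all atoms of each residue pair.
    # Off-diagonal entries are residue-residue energies; diagonal entries
    # count every intra-residue atom pair twice, (a, b) and (b, a).
    return membership.T @ mean_atom @ membership

# Toy usage: 4 atoms in 2 residues, one frame, every atom pair at energy -1.
toy_frame = -1.0 * (np.ones((4, 4)) - np.eye(4))
residue_energy = coarse_grain_energy([toy_frame], [0, 0, 1, 1], 2)
print(residue_energy)
```

Writing the sum as a product with a membership matrix keeps the sketch short and makes the symmetry of the residue-level matrix follow directly from the symmetry of the atom-level one.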
(b) Research Objectives
By merging concepts and ideas from physics, chemistry, life science, and supercomputing technology, we aim, for a given target protein or protein complex, to uncover the fundamental scientific knowledge that governs the biological function of proteins and regulates their life phenomena, and to develop creative, interdisciplinary concepts in theoretical/computational protein biophysics.
● We will design the multiscale-heterogeneous protein energy function using protein threading, neural perceptron learning, and molecular dynamics simulation. We will then construct, for a given target protein or protein complex, an analytic free energy function similar in type to that of a magnetic spin system in statistical physics.
● We will establish a theoretical/computational framework of statistical physics in order to describe exactly the thermodynamic, kinetic, and mutagenesis characteristics of a protein as well as its stability. We will also develop the multi-residue heat bath-supercomputing Monte Carlo algorithm and the master equation for tracing out the folding kinetics of proteins.
● We will construct the free energy landscape of proteins, which is the most fundamental quantity from the viewpoint of basic science, disease, and life phenomena alike. We seek to describe the effect of the cellular environment around a protein, such as temperature, pH, and ionic concentration, on its thermodynamic, kinetic, and mutagenesis characteristics.
● We aim to uncover the mechanism of protein aggregation and amyloid fibril formation, which are responsible for neurodegenerative diseases such as Alzheimer's, Parkinson's, Huntington's, and mad cow disease. We also seek to uncover the folding-binding mechanism of protein-DNA and protein complexes, which is responsible for the growth of cancer cells and for various diseases.
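The objectives above mention a spin-system-like free energy and a heat-bath Monte Carlo algorithm. As a rough illustration only, the sketch below runs a single-residue heat-bath update on a toy model in which each residue is either native (1) or non-native (0). This is a simplification of the multi-residue scheme named in the objectives, and everything specific in it is an assumption introduced here: the effective free energy F = -(1/2) Σ J_ij s_i s_j + T · ΔS · Σ s_i, the nearest-neighbour coupling matrix, the parameter `entropy_cost`, and the function names.

```python
import numpy as np

def heat_bath_sweep(state, coupling, temperature, entropy_cost, rng):
    """One sweep of single-residue heat-bath updates (illustrative toy model).

    state: integer array of 0 (non-native) / 1 (native) residue variables.
    coupling: symmetric matrix J_ij of assumed contact energies.
    entropy_cost: assumed entropy lost per residue on becoming native (k_B = 1).
    """
    n = len(state)
    for _ in range(n):
        i = rng.integers(n)
        # Contact energy gained if residue i is native, given the other residues.
        local = coupling[i] @ state - coupling[i, i] * state[i]
        # Free-energy change for s_i: 0 -> 1, divided by temperature.
        delta_over_t = -local / temperature + entropy_cost
        # Heat bath: draw s_i from its conditional Boltzmann distribution,
        # independent of the current value of s_i.
        p_native = 1.0 / (1.0 + np.exp(delta_over_t))
        state[i] = 1 if rng.random() < p_native else 0
    return state

def mean_native_fraction(coupling, temperature, entropy_cost=1.0,
                         n_sweeps=2000, seed=0):
    """Average fraction of native residues over the second half of the run."""
    rng = np.random.default_rng(seed)
    state = rng.integers(0, 2, size=coupling.shape[0])
    total, n_measured = 0.0, 0
    for sweep in range(n_sweeps):
        heat_bath_sweep(state, coupling, temperature, entropy_cost, rng)
        if sweep >= n_sweeps // 2:  # discard the first half as equilibration
            total += state.mean()
            n_measured += 1
    return total / n_measured

# Toy chain of 12 residues with nearest-neighbour contacts of strength 1.
n_res = 12
toy_coupling = np.zeros((n_res, n_res))
for k in range(n_res - 1):
    toy_coupling[k, k + 1] = toy_coupling[k + 1, k] = 1.0

for temp in (0.3, 5.0):
    print(temp, mean_native_fraction(toy_coupling, temp))
```

In this toy model the native fraction is high at low temperature and low at high temperature, which mimics a folding-unfolding transition. A multi-residue variant would draw several residues at once from their joint conditional distribution.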
2. Contents and directions of research
3. Research Facilities
● We are a theoretical and computational research group, so our main research facility is the supercomputing facility. Since 1999 our group has accumulated 12 years of experience in constructing and operating supercomputing PC-clusters ourselves, instead of directly purchasing a mainframe supercomputer, which costs about 100 times more than the clusters we built. In fact, we were the first group in Korea to construct a 64-node supercomputing PC-cluster, in 1999, and we have led the Korean supercomputing community. Since then we have accumulated leading-edge technology for supercomputing PC-clusters. As a consequence, we are now equipped with a 296-node supercomputing PC-cluster (864 CPU cores and 43,040 GPU/CUDA cores) with a total computing speed of 143 Tflops. The pictures below show the supercomputing PC-cluster now in full operation at our Center for Proteome Biophysics, DGIST. This supercomputing facility greatly facilitates our globally competitive research on theoretical/computational protein physics.
4. Organization plan for research group