Practical Estimation of TCR-pMHC Binding Free-Energy Based on the Dielectric Model and the Coarse-Grained Model

To evaluate free energy changes of bio-molecules in a water solution, ab initio molecular dynamics (MD) simulations such as Quantum Mechanical Molecular Mechanics (QM/MM) and MD are the most theoretically rigorous methods (Car and Parrinello 1985; Kuhne, Krack et al. 2007), although the calculation cost is far too large for large molecular systems that contain many electrons. Therefore, all-atom MD simulations based on classical mechanics (i.e., Newton’s equations) are used for the usual bio-molecular systems. As the conventional free energy perturbation (FEP) method based on all atom MD simulation is a strict method, to elucidate the molecular principles upon which the selectivity of a TCR is based, FEP simulations are used to analyse the binding free energy difference of a particular TCR (A6) for a wild-type peptide (Tax) and a mutant peptide (Tax P6A), both presented in HLA A2. The computed free energy difference is 2.9 kcal mol-1 and the agreement with the experimental value is good, although the calculation is very time-consuming and the simulation time is still insufficient for fully sampling the phase space. From this FEP calculation, better solvation of the mutant peptide when bound to the MHC molecule is important to the greater affinity of the TCR for the latter. This suggests that the exact and efficient evaluation of solvation is important for the affinity calculation (Michielin and Karplus 2002). Other FEP calculations of the wild-type and the variant human T cell lymphotropic virus type 1 Tax peptide presented by the MHC to the TCR have been performed using large scale massively parallel molecular dynamics simulations and the computed free energy difference using alchemical mutationbased thermodynamic integration, which agrees well with experimental data semiquantitatively (Wan, Coveney et al. 2005). However, the conventional FEP is still very timeconsuming when searching for so many unknown docking structures because all-atom MD for a large molecular system is a computationally hard task and MD simulations must be done not only in initial and final states but also in many intermediate states.


Introduction
To evaluate free energy changes of bio-molecules in a water solution, ab initio molecular dynamics (MD) simulations such as Quantum Mechanical Molecular Mechanics (QM/MM) and MD are the most theoretically rigorous methods (Car and Parrinello 1985;Kuhne, Krack et al. 2007), although the calculation cost is far too large for large molecular systems that contain many electrons. Therefore, all-atom MD simulations based on classical mechanics (i.e., Newton's equations) are used for the usual bio-molecular systems. As the conventional free energy perturbation (FEP) method based on all atom MD simulation is a strict method, to elucidate the molecular principles upon which the selectivity of a TCR is based, FEP simulations are used to analyse the binding free energy difference of a particular TCR (A6) for a wild-type peptide (Tax) and a mutant peptide (Tax P6A), both presented in HLA A2. The computed free energy difference is 2.9 kcal mol -1 and the agreement with the experimental value is good, although the calculation is very time-consuming and the simulation time is still insufficient for fully sampling the phase space. From this FEP calculation, better solvation of the mutant peptide when bound to the MHC molecule is important to the greater affinity of the TCR for the latter. This suggests that the exact and efficient evaluation of solvation is important for the affinity calculation (Michielin and Karplus 2002). Other FEP calculations of the wild-type and the variant human T cell lymphotropic virus type 1 Tax peptide presented by the MHC to the TCR have been performed using large scale massively parallel molecular dynamics simulations and the computed free energy difference using alchemical mutationbased thermodynamic integration, which agrees well with experimental data semiquantitatively (Wan, Coveney et al. 2005). However, the conventional FEP is still very timeconsuming when searching for so many unknown docking structures because all-atom MD for a large molecular system is a computationally hard task and MD simulations must be done not only in initial and final states but also in many intermediate states.
108 binding free energy. Of course, this ER method can be combined with the approximate models described below. Instead of MD simulations, Monte Carlo simulations are also used for the sampling of the configurations. This type of approach which only considers initial and final states is called an endpoint method.
Most of the calculation cost in all-atom MD involves the sampling of the solvent atom configurations because the number of solvent atoms -such as water and co-solvent ions -is much larger than that of the target bio-molecules, and long-range electrostatic potential is especially time consuming although efficient algorithms such as the Fast Multi Pole method and several Ewald methods are developed for all-atom MD. To decrease the calculation cost of the long-range electrostatic term, a continuum dielectric model -which can calculate the electrostatic free energy term of the system very efficiently -is widely used in many biomolecular systems and is described in the next section.
In the case of large molecules, the entropy term of solvation change becomes important (Asakura and Oosawa 1954), and the solvent accessible surface area (SA) based calculation method becomes insufficient because the excluded volume effect increases. Therefore, integration equation (IE) theories such as the Ornstein-Zernike equation and the closureswhich are developed in molecular liquid theory -promise to evaluate entropy change, including solvation and de-solvation processes (Kinoshita 2006;Kinoshita 2009;Yasuda, Yoshidome et al. 2010). The recent MD software package AMBER also contains such an IE algorithm, 3D-RISM, which is a reference site model employing Cartesian coordinates (Luchko, Gusarov et al. 2010). In particular, the simple morphological theory obtained from this IE approach is now applied to the elucidation of protein folding (Yasuda, Yoshidome et al. 2010) and F1-ATPase mechanisms and has proven to be useful (Yoshidome, Ito et al. 2011). The other approaches for decreasing the calculation cost of the solvent molecules are coarsegrained (CG) solvent models. The protein-dipole Langevin-dipole (PDLD) model, which can efficiently calculate the electrostatic interaction among permanent dipoles and induced dipoles of proteins and solvent atoms, is one of the coarse-grained solvent models. As the PDLD model is usually used in the outer area of the all-atom region, this is a hybrid approach of CG and all-atom models (Warshel and Levitt 1976;Xu, Wang et al. 1992). Hybrid approaches of all-atom, CG and continuum solvent models are evolving. A smoothly decoupled particle interface (SDPI) model has a switching region that gradually transitions from fully interacting particles to a continuum solvent. The resulting SDPI model allows for the use of an implicit solvent model based on a simple theory that needs only to reproduce the behaviour of bulk solvent rather than the more complex features of local interactions (Wagoner and Pande 2011). Of course, CG models for solute molecules -which are described in the third section -are promising for the understanding of protein folding (Liwo, He et al. 2011) and predictions of the ligand-receptor docking structure, etc.
The relationship among the theoretical models and approaches is summarised in Fig. 1.

The dielectric model and the MM-PBSA (GBSA) method
In this section, we describe briefly the principles behind the methods, the differences between PBSA and GBSA and explicit and implicit treatment.

Principles of the method
A molecule has an atomic polarisability due to its electrons and an orientational polarisability when the molecule is polar and has a permanent electric dipole moment. A high value of the relative dielectric constant (εr=78.4 at 298K) of water is mainly due to its orientational polarisation, where the electric dipole moment is 2.95 Debye. Moreover, the solution in our body contains several co-solvent ions such as Na + , Cl -, K + and so on for the usual physiological condition. Therefore, electrostatic interactions among bio-molecules are largely decreased by water and solvent ions in a very complicated manner when compared with the in vacuo case (Koehl 2006).
To obtain the electrostatic contribution to free energy change, the dielectric model is a good approximation and is widely used to calculate the electrostatic potential of molecular systems in many scientific and technological fields.
First, the use of a simple function of the effective relative dielectric constant is the easiest way to reduce the calculation time required in obtaining electrostatic potentials.
A simple distance-dependent function, 4.5 r, which is proposed by Pickersgill (Pickersgill 1988), can well explain site-directed mutagenesis experiments. Warwicker showed that the simple Debye-Hückel shielding function with a uniform effective relative dielectric constant of 50 was sufficient to explain experimental results when compared with a continuum model (Warwicker 1999). Mehler et al. challenged this problem and proposed a sigmoid function considering the local hydrophobicity and hydrophilicity of protein molecules whose results were also in good agreement with pKa shift measurements (Mehler and Guarnieri 1999). These methods are simple and very fast; however, they all require parameter readjustment for each new system to be studied. Unfortunately, a universal function applicable to all macromolecular systems does not yet exist.
Empirically obtained effective dielectric functions that depend on the inter-atomic distance, r, such as linear functions (εr=r or 4r) and the sigmoid function is simple, and low calculation-cost method is still used in recent drug design studies for the docking simulations of large molecular systems so as to save on the calculation-cost, although the calculation error is large (Takahashi, Sugiura et al. 2002).

PB approach
On the other hand, the typical dielectric model solves the Poisson equation and treats biomolecules and water as continuum media which have a specific dielectric constant, although the position-dependent local dielectric constant -which is calculated from the electronic polarisation of atoms and the orientational polarisation of local dipoles -is also possible for a finite difference equation (Nakamura, Sakamoto et al. 1988;Pitera, Falta et al. 2001).
Moreover, the Poisson-Boltzmann (PB) equation, which was first proposed by Gouy in 1910 and was complemented by Chapman in 1913, is widely-used for considering the contribution of solvent ions. The Gouy-Chapman theory, which solves a simple onedimensional nonlinear PB equation, is often used in a membrane-electrolyte system that has electrical double layers (Forsten, Kozack et al. 1994).
The PB equation is a differential equation and it describes electrostatic interactions between molecules in ionic solutions by using a mean-filed approximation where the correlations among the solvent ions are neglected. The equation in SI units can be written as: where is the divergence operator and () r   is the position-dependent dielectric, which is set to be constant in the solvent, the bio-molecule and the boundary regions in continuum dielectric models. ψ() r    is the gradient of the electrostatic potential, () r   represents the charge density of the solute (i.e., the fixed charges of the bio-molecule), i c  represents the concentration of the ion i at a distance of infinity from the solute, z i is the charge of the solvent ion, q is the charge of a proton, k B is the Boltzmann constant, T is the temperature and is a factor for the position-dependent accessibility of position r  to the ions in the solution. If the potential is small and the electrostatic energy is negligible compared to the thermal fluctuation, kBT, the equation can be linearised and solved more efficiently.
Here, κ is the Debye shielding parameter, defined as follows: This weak field limit approach is called the Debye-Hückel approximation (Fogolari, Brigo et al. 2002 To solve the PB equation, there are typically three numerical methods: a finite difference (FD) method is relatively time consuming, but simple and applicable to a complex system which has a position-dependent local dielectric constant. Therefore, the FD method is firstly applied to calculate the electrostatic potential in a protein-solvent system, and the pKa shift of the protein ionisable residues are well-explained (Gilson and Honig 1987) and the effect of the salt concentration on the pKa are also reproduced (Takahashi, Nakamura et al. 1992). The finite element method (FEM) and the boundary element method (BEM) are more powerful and the calculation cost is smaller than the FD method, although only a uniform dielectric constant must be set in each region (Lu, Zhou et al. 2008).

GB approach
One other powerful way to obtain the electrostatic potential based on the dielectric model is the Generalised Born (GB) model, which solves the linearised PB equation by approximating such bio-molecules as proteins and nucleic acids as a set of spheres whose internal dielectric constant differs from the external solvent (Koehl 2006 where is the dielectric constant in vacuo, is the dielectric constant of the solvent, qi is the electrostatic charge on the particle i, rij is the distance between particles i and j, and ai is a length defined as the effective Born radius (Still, Tempczyk et al. 1990).
The effective Born radius of an atom represents its degree of burial inside the solute and corresponds to the distance from the atom to the molecular surface. The exact evaluation of the effective Born radii is the central issue for the GB model (Onufriev, Bashford et al. 2004).
To consider the electrostatic shielding effect due to the solvent ions, a simplified function based on the Debye-Hückel approximation is added to the function Gs in the AMBER software package (Case, Cheatham et al. 2005), which is one of the most used packages in the world of bio-molecular simulations, as follows by (Srinivasan, Trevathan et al. 1999 They calculated the solvation free energies, Gs, with this GB model for proteins and nucleic acids, which agreed very well with those of the PB model. The salt-dependence of the electrostatic binding free energy based on the Debye-Hückel approximation is still under investigation (Harris, Bredenberg et al. 2011).

The GBSA (PBSA) approach
GBSA (PBSA) is simply a GB (PB) model with the hydrophobic solvent accessible surface area (SA) term. This is the most commonly used implicit solvent model combination and is widely used in MD simulations for large bio-molecules. This approach is known as MM/GBSA in the context of molecular mechanics. This formulation can well identify the native states of short peptides with a precise stereoscopic structure (Ho and Dill 2006), although the conformational ensembles produced by GBSA models in other studies differ significantly from those produced by an explicit solvent and do not identify the protein's native state (Zhou 2003). In particular, strong charge-charge interaction such as salt bridges are overstabilised due to insufficient electrostatic screening, and the alpha helix population became higher than the native one. These problems are common in PBSA. Variants of the GB model have also been developed to approximate the electrostatic environment of membranes, which have had some success in folding the transmembrane helices of integral membrane proteins (Im, Feig et al. 2003).
There are several kinds of software containing the GB algorithm. For example, the AMBER software package has three types of GBSA models as has as the PBSA model.
The MM-PBSA and GBSA approaches are the endpoint methods and usually only consider the initial unbound state and the final bound state. The binding free energy change, dGbind, is written as: The term dGgas refers to total free energy change and the term dHgas contains the van der Waals and electrostatic interaction energies as well as internal energy variation, such as bond, angle and torsional angle energies in vacuo (i.e., gas phase). The terms dHtr/ro denote the energy difference due to translational and rotational degrees of freedom, and becomes 3 RT in the classical limit (i.e., thermal energy is large enough). The term dS refers to the conformational entropy change (Tidor and Karplus 1994;Ben-Tal, Honig et al. 2000). The term dGsolv is the difference between the initial and final solvation free energies and is divided into the electrostatic contribution, dGelsolv, and the nonpolar contribution, dGnpsolv. The term dGnpsolv, which is the sum of a cavity term and a solute-solvent van der Waals term, is calculated from the SA as follows: dGnpsolv=γSA+b.

Practical Estimation of TCR-pMHC Binding Free-Energy Based on the Dielectric Model and the Coarse-Grained Model 113
The several types of the GBSA models are not only applied to many protein folding simulations (Zhou 2003), but also to nucleic acid conformational dynamics from massively parallel stochastic simulations, where the ubiquitous helical hairpin conformation is reproduced and folding pathway is investigated (Sorin, Rhee et al. 2003).

Review of recent work
As mentioned in the previous section 2.1, the dielectric models and the hybrid approaches are widely used in many scientific and technological fields, such as protein folding, molecular docking and drug design, etc. In particular, the binding free energy (BFE) calculation and the prediction of the binding affinity and binding structure between ligands and proteins is the most important aim (Gilson and Zhou 2007) because the major purpose of molecular docking Leis and Zacharias 2011) is to predict the experimentally-obtained BFE and the binding site of a receptor to a specific ligand molecule, and drug design is usually supported by suitable molecular docking methods.
For example, the linear interaction energy (Rastelli, Rio et al. 2010) method -which combines two different continuum solvent models -is applied to calculate protein-ligand BFEs for a set of inhibitors against the malarial aspartic protease plasmepsin II, and the explicit solvent LIE calculations and LIE-PB reproduce absolute experimental BFEs with an average unsigned error of 0.5 and 0.7 kcal mol -1 respectively (Carlsson, Ander et al. 2006). Moreover, the ligand-water interaction energies -which are calculated from both PB and GB models using snapshots from explicit solvent MD simulations of the ligand and proteinligand complex -are compared with the explicit solvent MD results. The obtained energy from the explicit water MD agrees well with those from the PB model, although the GB model overestimates the change in solvation energy, which overestimation is caused by consistent underestimation of the effective Born radii in the protein-ligand complex.
Xu and Wang applied the MM-PBSA method to FK506-binding proteins (Xu and Wang 2006) -which are important targets of pharmaceutical interests -and calculated the binding of a set of 12 non-immunosuppressive small-molecule inhibitors to FKBP12 through MD simulations, where each complex is subjected to 1-ns MD simulation conducted in an explicit solvent environment under constant temperature and pressure. The BFE of each complex is then calculated with the MM-PBSA method in the AMBER program and the MM-PBSA computation agrees very well with the experimentally determined BFEs, with a correlation coefficient (R 2 ) of 0.93 and a standard deviation as low as 0.30 kcal mol -1 . The vibrational entropy term given by the normal mode analysis is necessary for achieving this correlation. Moreover, an adjustment to one weight factor in the PBSA model is essential to correct the absolute values of the final binding free energies to a reasonable range, which suggests that the very good correlation is due to the similar properties of ligand molecules and that this artificial weight factor is not universal. A comparison of the MM-PBSA model with a Linear Response Approximation model suggests that the MM-PBSA method seems to be robust in binding affinity prediction for this class of compounds (Lamb, Tirado-Rives et al. 1999).
To systematically evaluate the performance of MM-PBSA and several versions of the MM-GBSA models, extensive calculations of BFEs are done for 59 ligands interacting with six different proteins with the AMBER 9.0 software . First, the effects of the length of the MD simulation are explored, ranging from 400 to 4800 ps, and the simulation length has an obvious impact on the predictions. Interestingly, longer MD simulation is not always necessary for achieving better predictions. Second, the effect of a solute dielectric constant (1, 2, or 4) on the BFEs of MM-PBSA is also checked and the predictions are quite sensitive to the solute dielectric constant. Therefore, this parameter should be carefully determined according to the characteristics of the protein/ligand binding interface. Third, conformational entropy often shows large fluctuations in MD trajectories, and a large number of snapshots are necessary to achieve stable predictions. Next, the comparison of the accuracy of the BFEs of three GB models: (1) GB-HCT, the pair wise model by Hawkins et al. (Hawkins, Cramer et al. 1996) parameterised by Tsui and Case (Tsui and Case 2000); (2) GB-OC1 and (Case, Cheatham et al.) GB-OC2, the parameters of which are modified by Onufriev et al. (Onufriev, Bashford et al. 2004) and the GB-OC1 model which gives better results compared to the other two GB models in ranking the binding affinities of the studied inhibitors. This may be explained by the better agreement of GB-OC1 with PBSA. The better performance of MM-PBSA when compared with MM-GBSA in calculating absolute -but not necessarily relative -BFEs is confirmed, which is not surprising because the GBSA is the approximation of PBSA, but it suggests the reliability of the dielectric continuum model itself. Considering its computational efficiency, MM-GBSA gives good relative BFEs and is much faster than MM-PBSA, and can serve as a powerful tool in drug design where the correct ranking of inhibitors is often emphasised and the obtaining of the absolute value of BFEs is not so important.
Interestingly, the successive study of MM-PBSA and MM-GBSA-OC1 using 98 proteinligand complexes to develop an excellent scoring function by Hou et al. shows that MM-GBSA (success rate 69.4%) outperformed MM-PBSA (45.5%) and many popular scoring functions in identifying the correct binding conformations, and the best prediction of the MM-GBSA model with an internal dielectric constant of 2.0 produced a Spearman correlation coefficient of 0.66, which is better than MM/PBSA (0.49) and almost all the scoring functions used in molecular docking ). However, the reason why the PBSA underperforms the GBSA is not clear. One possibility is the difference of the SA term and the other possibility is the insufficiency of the conformational sampling of proteins, as the authors are also emphasising the importance of MD calculation time. In any case, MM-GBSA performs well, for both binding pose predictions and binding free-energy estimations and it is efficient at re-scoring the top-hit poses produced by other less-accurate scoring functions.
As AMBER and other software packages -including the PB and GB models -are widely used and drug design is the important issue, many studies concerning ligand-protein docking based on the dielectric model have been done (Rastelli, Rio et al. 2010). The above calculation results of the GB and PB dielectric models are limited to relatively small ligand molecules and receptor proteins, and the size of the complex is not so large compared to socalled super-molecules, such as the immune complex and membrane proteins, etc. To calculate and analyse the BFE of a large, complex T-cell receptor (TCR) and immunogenic peptides (p) presented by class I major histocompatibility complexes (MHC), binding free energy decomposition (BFED) calculations based on the MM-GBSA approach including entropic terms were done on the 2C TCR/SIYR/H-2Kb system and provided a detailed description of the energetics of the interaction (Zoete and Michielin 2007), since this BFED method can detect the important individual side chains for the stability of a protein fold with computational alanine scanning of the insulin monomer (Zoete and Meuwly 2006). A correlation between the decomposition results and experimentally-determined activity differences for alanine mutants of the TCR-pMHC complex is 0.67 when the conformational entropy is neglected, and 0.72 when the entropy is considered. Similarly, a comparison of experimental activities with variations in the BFEs determined by computational alanine scanning yields correlations of 0.72 and 0.74 when the entropy is neglected or taken into account, respectively. In addition, a comparison of the two theoretical approaches for estimating the role of each side chain in the complex formation is given, and a new ad hoc approach for decomposing the vibrational entropy term into atomic contributions -the linear decomposition of vibrational entropy (LDVE) -is introduced. The latter allows the rapid calculation of the entropic contribution of interesting side chains to the binding. This approach is justified by the idea that the most important contributions to the vibrational entropy of a molecule originate from residues that contribute most to the vibrational amplitude of the normal modes. The results of the LDVE are very similar to those of the exact but highly computationally demanding method. The BFED approach is also applicable to the design of rational TCR by calculating each amino acid contribution in mutated TCR. As melanoma patients frequently show unusually positive clinical outcomes, it represents an interesting target for adoptive transfer with modified TCR. Sequence modifications of TCR which potentially increase the affinity for this epitope have been proposed and tested in vitro. T-cells expressing some of the proposed TCR mutants showed better T-cell functionality, with the improved killing of peptide-loaded T2 cells and better proliferative capacity compared to the wild type TCR expressing cells (Zoete, Irving et al. 2010).
As there are still not many applications for massive simulations with dielectric models to large bio-molecules like the TCR-pMHC complex, more extensive studies are necessary to evaluate the validity of the method and improve its accuracy and performance because the excluded volume effect due to water entropy change in binding will become larger in the larger systems.

The correlation between calculation-cost and accuracy
It is not easy to state the calculation cost and accuracy exactly because the method is only now developing and the accuracy depends on the system size.
Previous studies have shown a very good correlation between PB and GB results because the GB parameter is modified to achieve better agreement with that of PB (Gohlke, Kuhn et al. 2004;Onufriev, Bashford et al. 2004). Moreover, GB and PB methods also enable the rapid scoring of protein structures when they are combined with physics-based energy functions. The direct comparison of these two approaches on large protein data sets is done with a scoring function based on a GB and PB solvation model and short MD simulations. Against seven publicly available decoy sets, the results of the MM-PBSA approach are comparable to the GB-based scoring function (Lee, Yang et al. 2005).
We also compared the MM-PBSA and MM-GBSA methods. Table 1 shows the comparison of the binding electrostatic free energies of the PB and GB methods for two TCR-pMHC complexes (2gj6 and 3pwp), a complex of A6 and Tax peptide-HLA A2, and A6 with Hud-A2 respectively. Constant regions of TCR were removed (Gregoire, Lin et al. 1996), hydrogen was added and the complexes were neutralised and solvated with TIP3P. The numbers of atoms involved in the systems were 130,545 for 2GJ6 and 127,023 for 3PWP. Calculations were performed with Sander of AMBER 11 for 5 ns. The Gelsolv, which is always largely negative in each case, represents the electrostatic energy contribution due to solvents. The Gnp is the hydrophobic and van der Waals contributions were calculated from the solvent accessible surface area (SA). The difference of the PB and GB results of each case is 3-4%, although the total binding free energy, dGbind, differs by almost 20% because the binding energies in vacuo, dGgas, and the contribution of the solvent, dGsolv=dGelsolv+dGnpsolv, have a different sign and cancel each other. We must note that the ratio of the SA contributions between GBSA and PBSA is larger than the Eelsov, although the absolute contribution is 1/10 th of the Gelsolv.  Table 1. A comparison of the binding electrostatic free energies of the PBSA and GB methods for two TCR-pMHC complexes (PDB ID: 2gj6 and 2pwp). The Gelsolv, which is always largely negative in each case, represents the electrostatic energy contribution due to solvents. The Gnpsolv is the hydrophobic and van der Waals contributions are calculated from the solvent accessible surface area (SA). The total binding free energy, dGbind, is the sum of the binding energy in vacuo, dGgas, and the contribution of the solvent, dGsolv=dGelsolv+dESA. All energies in the table are given in kcal mol -1 .

The limits of all-atom simulations
Even though all-atom simulations provide the most detailed information about the system of interest, its calculation costs are quite high. A system containing a large protein molecule such as several 10 5 to 10 6 Dalton comes up to several 10 5 atoms when solvated in explicit water molecules, and expands to nm 3 in size; hence, the calculation time of less than sec www.intechopen.com

Practical Estimation of TCR-pMHC Binding Free-Energy
Based on the Dielectric Model and the Coarse-Grained Model 117 even using a recent multi-core PC. These figures are too short and still too small to reproduce such biologically interesting phenomena as protein folding, protein-assembly and enzymatic reaction, etc. Therefore, the increase of calculation efficiency is quite an urgent requirement. The calculation cost increases approximately in proportion to the square of the number of atoms, and the time for one step is approximately proportional to the order of the square-root of the mean mass of elements. The number of atoms constituting an amino acid (AA), when polymerized in a peptide, is 7 (Gly) to 24 (Trp), and the mass is between 57 (Gly) and 186 (Trp) -about 5 to 15 times of a C, N or O. When an AA is coarse-grained to 2 to 4 pseudo-atoms, the calculation cost decreased by 2 to 3 orders of magnitude, and the time for a step increases by 2 or 3 orders. In most CG models, the interaction between pseudo-atoms through bonds of less than 5 is described as follows: Extension potential between two beads  The whole energy of the system is described as the combination of these elemental potentials. For example, the Head-Gordon et al. model (Brown, Fawzi et al. 2003 where,, and i, j are summed for all the AAs contained in the peptide. The interaction between non-bonded pseudo-atoms is usually described as the Lennard-Jones potential. The methods for configuration sampling, usually MD (Shih, Arkhipov et al. 2006) and the Monte Carlo simulation (Levy, Karplus et al. 1980;Horejs, Mitra et al. 2011), are the same as those used in all-atom simulations. The equation of motions for MD is principally the same as that used in all-atom simulations, i.e.,

www.intechopen.com Molecular Dynamics -Studies of Synthetic and Biological Macromolecules
where F  ,  , and W  are external force, friction and thermal noise, respectively. Any modification is made according to the kind of ensemble adopted.

The difference of CG models between proteins and other molecules
As might be guessed by Fig.2A and B, it is easier to treat a homopolymer by the CG model than to treat a polypeptide or a protein. A homopolymer can be described with rather a few parameters and, under certain circumstances, several components can be coarse-grained as a pseudo-atom (4 styrenes in a dotted circle are treated as one bead). Rheological features such as phase-transition, diffusion coefficient, compressibility, ductility, elasticity and viscosity have been reproduced fairly well (Yaoita, Isaki et al. 2008;Harmandaris and Kremer 2009;Kalra and Joo 2009;Posel, Lísal et al. 2009). On the other hand, peptides and proteins consist of diversified 20 AAs and the particular functions of proteins such as specific binding and enzymatic functions are based on a unique configuration of those characteristic AAs. Therefore, to evaluate the interaction on the CG model is especially difficult due to the effect of averaging specific properties and the anisotropicity of components. Notwithstanding this state of affairs, some CG models have come to predict the docking and binding of proteins fairly well. In this section, representative protein CG models are reviewed and the application of the CG model to the evaluation of TCR-pMHC interaction is foreseen.

The one-bead model
Many one-bead models (Taketomi, Ueda et al. 1975;Brown, Fawzi et al. 2003;Jang, Hall et al. 2004) can be deemed as descendants of the Go-model. Go-like models, even though extremely simplified in their format, principally succeeded in reproducing several aspects of protein folding. This is presumably due to the finding that the protein-folding rate and mechanism are largely determined by a protein's topology rather than its inter-atomic interaction (Baker 2000). Those descendant models have equipped their own features, but still have a tendency towards a reference configuration. This might be due to the difficulty of incorporating the geometric and physicochemical aspects of all the AAs in only a few parameters. Recently, the finding that the underlying physicochemical principles of the interaction between the domains in protein folding are similar to those between the binding sites of protein assembly has been accepted (Haliloglu, Keskin et al. 2005;Levy, Cho et al. 2005;Turjanski, Gutkind et al. 2008;Baxter, Jennings et al. 2011). This fact will probably provide another aspect of the application of the CG model to issues of protein-binding.
Miyazawa and Jernigan (MJ) extracted inter-residue potentials from the crystallography of 1168 proteins (Miyazawa and Jernigan 1996). The principle adopted in this method is that the number of residue-residue contacts observed in a large number of protein crystals will represent the actual intrinsic inter-residue interactions. Namely, to regard the effect (contacts in the observed structure) in the same light as the cause (interaction energy) based on "the principle of structural consistency" or "the principle of minimal frustration".

Fig. 2. Coarse-grained models for protein
Homopolymer such as polystytene can be described with rather a few parameters, and in some cases, several units are mapped to one bead (A). Protein consists of heterogenious components, hence more detailed and complicated description (B). Main chain is represented by C and each side chain is mapped to one bead, which retains its original geometiric and physico-chemical features (Liwo, Pincus et al. 1993) (C). MARTINI force field maps more beads to a side chain (Marrink, Monticelli et al. 2008), enabled to simulate the release of inner water molecules through stress-sensitive channel enbedded in vesicle membrane (Louhivouri, Lisselada et al. 2008) (D). OPEP model all the atoms of main chain and maps one bead for side chain. As can be guessed, this models is suitable for dealing with the issues where backbone structure such as -helix and -sheet play essential roles , Laghaei et al. 2011, Nasica-Labouze et al. 2011.
Adopting this model for the parameters of LJ potentials, Kim and Hummer constructed a one-bead model combined with the Debye-Hückel type potential and performed configuresampling on replica exchange MC -applied to ubiquitin binding -and obtained good agreement with other experiments (Kim and Hummer 2008). Chakraborty's group applied an MJ matrix to estimate TCR-pMHC and explained the effect of the HLA class I haplotype on TCR repertoire-formation (Kosmrlj, Read et al. 2010). The above mentioned CG models are tabulated in Table 2.

UNRES
Scheraga's group described a CG model which consists of a C, side chain centroid (SC) and one dihedral angle (Liwo, Pincus et al. 1993). They searched the conformation space on this model with compactness of the protein as an indicator. The obtained structure was then decoded into an all-atom-backbone with the SC model and then searched further for the lowest-energy structure. Finally, an all-atom model was reconstructed from the obtained structure and searched for the lowest-energy structure on an electrostatically driven Monte Carlo (EDMC) simulation based on the ECEPP/2 potential. They succeeded in predicting ab initio the moderate size of proteins (53-235 residues) (Oldziej, Czaplewski et al. 2005). This hybrid method -the sampling of a configuration on a CG model and the estimation of binding energy on an atomistic model -presents quite a reasonable combination of efficiency and accuracy. Their recent accomplishment was a 1 msec simulation of more than 500 AA proteins through massive parallelisation (Scheraga, Maisuradze et al. 2010).

ATTRACT
Zacharias described a docking method of protein-protein or protein-ligand using a reduced protein model and docking algorithm, ATTRACT (Zacharias 2003). An AA is represented with 2 to 3 (Zacharias 2003) or 2 to 4 (Zacharias and Fiorucci 2010) pseudo-atoms and the interactions of specific pseudo-atom pairs, including their size and physicochemical characters, are interweaved into the parameters of the Lennard-Jones potential. ATTRACT assumes that both interacting molecules are rigid, smaller molecules is tried to dock from thousands of sites with 6 degrees of freedom, 3-translational and 3-rotational. Docking includes the minimisation of side chains described as rotamer, hence total minimisation is performed. They applied this CG model and ATTRACT to the Critical Assessment of Prediction of Interest (CAPRI) (Janin 2002) and showed two acceptable bindings out of 6 targets (May and Zacharias 2007) or else obtained better (4 out of 6 targets) prediction by improving the scoring function and docking method . The estimation of TCR and pMHC binding not only deals with the binding energy of a predetermined configuration, but also deals with the determination of the bindingconfiguration, because the TCR-pMHC complex has several binding modes (Wucherpfennig, Call et al. 2009). They showed that it is possible to uncover a binding site by using an electrostatic desolvation profile ) based on ODA method (Fernandez-Recio, Totrov et al. 2005).

The MARTINI force field
The MARTINI force field was originally devised for describing lipids or surfactants, such as dipalmitoylphosphatidylcholine (DPPC), dicapryloyl-PC (DCPC), dodecylphosphocholine 1 (Taketomi, Ueda et al. 1975); 2 (Jang, Hall et al. 2004);3 (Miyazawa and Jernigan 1996); 4 (Hummer and Kim 20 6 (Brown, Fawzi et al. 2003); 7 (Turjanski, Gutkind et al. 2008); 8 (Liwo, Pincus et al. 1993); 9 (Bahar and Jernigan 2005); 11 (Liwo, Oldziej et al. 2010); 12 (Zacharias 2003); 13 (Marrink, Monticelli et al. 2008 (Marrink, de Vries et al. 2004;Marrink, Risselada et al. 2007). The adoption of a very limited atom type and short range potentials provided very efficient computation, hence the micrometer length in scales and milliseconds in time, and succeeded in the simulation of the spontaneous aggregation of DPPC lipids into a bilayer and the formation of DPC in water. The hydrogen atom is neglected in this model. Heavy four atoms on average are represented as one pseudo-atom (four-to-one mapping) with an exception for ringlike molecules. Ringlike molecules are mapped with higher resolution (up to two-to-one mapping). Interaction sites are classified into 4 types: polar (P), nonpolar (N), apolar (C) and charged (Q). Within a main type, subtypes are distinguished either by a letter denoting the hydrogen-bonding capabilities (d = donor, a = acceptor, da= both, 0 = none) or by a number indicating the degree of polarity (from 1 = lower polarity to 5 = higher polarity). The interaction of each atom-type was parameterised at five levels: attractive (e = 5 kJ/mol), semi-attractive (e = 4.2 kJ/mol), intermediate (e = 3.4 kJ/mol), semi-repulsive (e = 2.6 kJ/mol) and repulsive ( e = 1.8 kJ/mol). Non-bonded interactions between the interaction sites i and j are described by the Lennard-Jones potential: with  ij representing the effective minimum distance of approach between two particles and  ij representing the strength of their interaction. This model was extended to deal with proteins (Marrink, Monticelli et al. 2008). The basic parameters are the same as used in the lipid model. Bonded interaction is described by the following set of potential energy functions acting between the bonded sites i, j, k, and l with an equilibrium distance d b , an angle  a , and a dihedral angle  i and  id : where V b , V a , V d and V id represent potential sites for bonding, stiffness, dihedral angle and improper dihedral angle, respectively. The total energy of the system is obtained by summing (17) to (20). The mapping of all AAs is mapped into 4 types of beads or a combination of them. In this mapping, Leu, Pro, Ile, Val, Cys and Met are classified as apolar (C-type), where as Thr, Ser, Asn and Gln are polar (P-type). Glu and Asp are charged (Q-type), and Arg and Lys are modelled by a combination of a Q and an uncharged particle (N-type). The bulky ring-based side chains are modelled by three (His, Phe, and Tyr) or four (Trp) beads. Gly and Ala residues are only represented by the backbone particle. The type of the backbone particle depends on the protein secondary structure; free in solution or in a coil or bend, the backbone has a strong polar character (P-type); as part of  helix or  strand, the interbackbone hydrogen bonds reduce the polar character significantly (N-type). Proline is less polar due to the lack of hydrogen-donor capabilities. More detailed geometrical representation is given in Fig.2 D, illustrating the binding distance, angle, dihedral angle, improper angle and bead configuration. This CG protein model contains directional specificity and heterogeneity in side chains to some extent, hence a feature of a secondary structure (-helix and -strand) and the gross physicochemical property, such as being charged, hydrophilic and hydrophobic. They succeeded in the partitioning of AAs in the DOPC bilayer, keeping the AA association (Leu-Leu, Lys-Glu) constant in water, the portioning and orientation of pentapeptides at the border of the water and cyclohexane. The tilt and orientation of hexapeptides in the DOPC bilayer is also reproduced after sub- sec to sec MD simulation on GROMACS software (van Der Spoel, Lindahl et al. 2005). They recently accomplished the simulation of the rapid release of content from a pressurised liposome through a particular mechano-sensitive protein channel, MscL, embedded in the liposomal membrane (Louhivuori, Risselada et al. 2010). The behaviour of this tiny functional organelle, which consists of 5 MscL molecules, 2108 DOPC lipids, 5,444 water beads with an additional 54,649 water beads forming a 4 m layer around the vesicle, was described in almost atomistic detail. In response to the increase of internal pressure, this vesicle released water molecules by opening the Mscl channel. MD was performed for 40 s, which corresponds to 160 s in an all-atom model. This model demonstrated that CG-MD provides for the computer-aided design of super-molecules and organelles of a practically usable size.

The optimised potential for efficient peptide-structure representation (OPEP) model
OPEP is, as shown in Fig.2 E, a CG protein model that uses a detailed representation of all backbone atoms (N, H, C, C and O) and reduces each side chain to one single bead with appropriate geometrical parameters and physicochemical properties (Derreumaux and Forcellino 2001). The OPEP energy function, which includes the implicit effects of an aqueous solution -expressed as (21) -is formulated as a sum of local potentials (E local ), nonbonded potential (E nonbonded ), and hydrogen-bonding potential (E H-bond ): Local potentials are expressed by: K b , K a , and K  represent force constants associated with changes in bond length, the bond angles of all particles and force constants related to changes in improper torsions of the side chains. The dihedral potentials associated with N-Cassociated are expressed as (23) and C-C expressed as as (24) which includes all the interaction works through more than 3-bonds, and all these functions are expressed as Van der Waals potentials, as shown in (11) Here, the Heavyside function H(x) = 1 if x >= 0 and 0 of x < 0, r ij is the distance between particles i and j, 00 0 () / 2 ij i j rr r  with 0 i r as the Van der Waals radius of particle i.
The hydrogen-bonding potential (E H-bond ) consists of two-body and three-body terms (Derreumaux, Maupetit et al. 2007).
This model was originally devised for predicting the structure and folding of proteins (Derreumaux 1999;Derreumaux and Forcellino 2001) and, by combining a Monte Carlo simulation, fairly succeeded in prefiguring basic supersecondary structures. This model, containing all the protein-backbone components, excels in issues where secondarystructure features play an essential role. They combined this potential with MD, which resulted in reproducing the aggregation of Alzheimer's A [16][17][18][19][20][21][22] , (Derreumaux and Mousseau 2007;Wei, Song et al. 2008). In adopting the sampling of Replica Exchange MD (REMD), they obtained an accurate structural description of Alzheimer's Amyloid-, hairpin and Trp-cage peptides ). A detailed atomic characterisation of oligomer-formation was obtained by combining OPEP, the atomistic model and REMD (Nasica-Labouze, Meli et al. 2011) Their reduced model on REMD enabled the calculation of several tens of sec in 40 replicas and the full assessment of convergence to the equilibrium ensemble, demonstrating the probability of determining the thermodynamic features of large proteins and assemblies (Laghaei, Mousseau et al. 2011).
As was mentioned above, the main CG models are tabulated in Table 2.

The trial for the TCR-pMHC and larger systems
At the starting point of the whole immunological synapse (IS) simulation, Wan, Flower and Coveny constructed a ternary complex of TCR-pMHC-CD4 between opposite membraneswhich consists of 329,265 atoms -and performed molecular dynamics for 10 ns on 128 processors of SGI Altix (Wan, Flower et al. 2008). It took 23 hours for one ns simulation. This run was not enough to calculate the binding free-energy by MM/PBSA due to the shortness of the simulation time and the lack of entropy evaluation. They intended to simulate a system consisting of four sets of the TCR-pMHC-CD4 complex, made up of about one million atoms. They pointed out the difficulty of the whole IS simulation on the all-atom model due to the too heavy load imposed upon the computer, and pointed out the feasability of adopting the hybrid atomistic/CG simulation for accomplishing the project (Diestler, Zhou et al. 2006).
At present, there have been only very limited trials of evaluated TCR-pMHC binding energy by the CG model. The evaluation of TCR-pMHC binding consists of at least three steps: 1) to determine the binding site, 2) to determine the binding configuration, and 3) to calculate the binding energy. Several works have provided not only the method to determine the binding configuration but also to detect the binding site from the surface nature of its own (Fernandez-Recio, Totrov et al. 2005;Burgoyne and Jackson 2006;Fiorucci and Zacharias 2010). The factors that concern the evaluation of TCR-pMHC binding are: 1) the evaluation of energy from a particular configuration, and 2) the sampling of independent configurations. In most CG models, the calculation of binding energy as the function of the configuration is based on their own parameters (Liwo, Pincus et al. 1993;Miyazawa and Jernigan 1996;Derreumaux 1999;Zacharias 2003;Buchete, Straub et al. 2004;Oldziej, Czaplewski et al. 2005;Zhou, Thorpe et al. 2007;Kim and Hummer 2008;Marrink, Monticelli et al. 2008). The sampling of independent configurations is most time-consuming but critically important process. If the sampling on the CG model reflects the distribution of the atomistic model with reasonable fidelity, it is quite a smart way to sample configurations on a CG-model , to reconstruct to the atom-scale the structure and then calculate the binding energy on these reconstructed atomistic structures using MM/PBSA. From this point of view, a general method to reconstruct the all-atom from the C atom position, RACOGS, was devised and the energy landscapes of both the CG-and the all-atom-model were shown to be quite similar, suggesting the validity of this principle (Heath, Kavraki et al. 2007).

Application of GPGPU in molecular dynamics
As mentioned above, all-atom simulation is very expensive, and hence is restricted scope in both time and scale. There have been attempts to breakthrough these circumstances, not only by improving the algorithm but also by devising novel hardware. Special purpose machines for MD have been developed (Susukita, Ebisuzaki et al. 2003;Shaw, Deneroff et al. 2008) and showed fairly good performance (Kikugawa, Apostolov et al. 2009). However, such purpose-specific machines are very expensive and their continuous development is difficult. The recent development of the general purpose graphic processor unit (GPGPU) has had much influence on high performance computing (Giupponi, Harvey et al. 2008). In 2011, three of the top 5 super-computers are constructed mainly on NVIDIA's GPGPU (http://www.top500.org/). Many applications are now being preparing to respond to this momentum, and representative molecular dynamics software such as Amber, CHARM, GROMACS and NAMDA are now being prepared to equip programs working on GPGPUs. Recent representative GPGPUs, such as Tesla C2075, have a performance of 1.03 T Flops on single precision. We calculated the binding energy of two TCR-pMHC complexes, 2GJ6 and 3PWP, on C2075 and compared the results calculated on a Xeon processor. After heating, density-equilibration and equilibration, product runs were performed for 10 runs, corresponding to 5 nsec in total.
The results are shown in Table 3. As can be seen from the

Conclusion
Physically meaningful models are rapidly advancing and are being applied to large macromolecular systems with the rapid evolution of parallel computation and hardware, such as multi-core processors and GPGPUs. Although the exact models become realistic for calculations of large bio-molecules, continuum dielectric models are still useful for the binding free energy calculation and bound complex structure prediction as well as the structure prediction tasks of bio-molecules such as proteins and nucleic acids, etc., because of the high cost performance and fairly good accuracy. In future, hybrid approaches will become promising, where QM model, the all-atom model, the CG model and continuum models are combined with a good conformational sampling technique such as the ER method, and we can choose the optimal hybrid approach according to purpose and the system size.
It has been clear that the calculation of TCR-pMHC binding energy with reasonable efficiency and accuracy is feasible. MMPBSA/GBSA seems quite promising. The sampling method affects both the efficiency and accuracy of the calculation. The combination of sampling on the CG model and energy-calculation on the atomistic model is very reasonable approaches. GPGPUs will be quite important facilities. A combination of those factors will provide for the valid simulation of biologically interesting phenomena for an adequately long time.