# Coarse Grained Go Models

## Introduction

In previous sections, especially the one describing the energy function, we have implicitly assumed a representation of the molecular system in which each atom is described as a single point that may be connected via bonds to other points (atoms). The force acting on an atom is the sum of that caused by these bonds as well as non-bonded interactions (electrostatics, van der Waals). However, this is only one way that a molecular system can be described, Quantum chemistry software packages, for example, calculate energy based on the wave functions (representing the electron densities) of the system's atoms. Depending on exactly how this is done, this can be many orders of magnitude more computationally expensive than calculating classical energy and forces. It is also possible to construct a potential energy function that is much cheaper to calculate, and that is what will be discussed in this section of the tutorial.

In particular, we will discuss a coarse-grained Go-model that is based off of the work of Klimov and Thirumalai, which we call the KT Go model. For more information about this model, please see:

• Ueda Y, Taketomi H, Go N. "Studies on protein folding, unfolding, and fluctuations by computer simulation. II. A. Three-dimensional lattice model of lysozyme." Biopolymers. 17(6), pp. 1531-1548 (1978).
• Klimov DK, Thirumalai D. "Linking rates of folding in lattice models of proteins with underlying thermodynamic characteristics." J. Chem. Phys. 109, p. 4119 (1998).
• Klimov DK, Thirumalai D. "Mechanisms and kinetics of β-hairpin formation." Proc. Nat. Acad. Sci. USA. 97(6), pp. 2544-2549 (1999).

This model is designed to study only proteins. Each amino acid it represented by two beads: one for its backbone placed at the alpha-carbon position, and one for the sidechain, placed at the sidechain center of mass. Since Glycine does not have much of a sidechain, it is represented by only one bead.

This model confers two major advantages over the all-atom model:

1. Since there are many fewer interaction centers, it is much less costly to compute the energy and forces.
2. Since high-frequency motions have been eliminated, it is possible to run with a longer time step, meaning that fewer energy evaluations are needed for a given amount of simulation time. KT Go models may be run conservatively at time steps of 5 fs, compared to 1-2 fs for all atom models.

The cost of these advantages, of course, is a less precise representation of the atomic system. Nonetheless, it has been shown in the literature that models such as this one can still reproduce dynamic and thermodynamic properties of molecular systems. Other coarse grained models exist at varying levels of precision. A full description is beyond the scope of this tutorial, however.

## The functional form of the coarse-grained Go model

Go models bake the implicit assumption that native structures are more stable than non-native ones. This immediately leads to the concept of native contacts. Two residues are considered to form a native contact if their sidechain centers of mass are within 4.5 angstroms of one other. Note that this value is configurable whgen building the model. The parameters of the model, especially the non-bonded ones, are designed to preserve these interactions while discouraging non-native contacts from forming.

### Bonded interactions

The bond, angle, dihedral, and improper dihedral terms of the KT Go model have the same functional form of those in the standard CHARMM force field. All distances are chosen such that the coarse grained representation of the crystal structure has the lowest energy for each parameter. Neighboring back-bone beads are connected, and each side-chain bead is connected to its own back bone. The bond force constants are, by default, all set to 50 kcal(mol) − 1\AA − 2, which is much weaker than would be used for bonds between backbone atoms in an all-atom force field. For instance, this force constant is about a factor of 6 weaker than that of an aliphatic carbon bond. This is because these pseudo-bonds are designed to represent coupling between different residues or different parts of the same residue, as opposed to strong chemical bonds.

A weak angle bending term (force constant 40 kcal(mol) − 1rad − 2) is chosen to approximate the stiffness of a peptide chain. Again, this is to allow for flexibility in the backbone-backbone, backbone-sidechain, and sidechain-sidechainb linkages.

Dihedral parameters are chosen depending on what secondary structural element the residue is a part of. Helix formation is encouraged in residues that are part of a helix in the crystal structure; otherwise, it is discouragel. Two parameters are given and used for each dihedral: one with a multiplicity of 1 and the other with a multiplicity of 3. The table below lists the default force constants used for the dihedrals (in kcal(mol) − 1$\AA^{-2}$):

 Secondary structure type Dihedral force constant (periodicity 1) Dihedral force constant (periodicity 3) α-helix 0.30 0.15 3-10 helix 0.30 0.15 Non-helix 0.55 0.275

Improper dihedral terms are used to maintain the chirality of the amino acids. Without them, there is nothing to prevent sidechain beads from rotating 180 degrees around the peptide chain, which would represent an unphysical chirality switch of an all-atom system. Although there are some L-amino acids, they are not common and this energy term mimics the all-atom energy barrier required for their formation.

### Non-bonded interactions

Non-bonded interactions in the KT Go model are substantially different from those in an all-atom model. The only non-bonded parameters used is a 12-6 Lennard-Jones (LJ) function (of the same form that is used to calculate the van der Waals interactions in the CHARMM force field). This term is used to represent all non-bonded interactions, and it includes the effects of electrostatics. Since a Go model makes the assumption that the starting structure is stable, and thus desirable, the Lennard Jones interactions are designed such that residues which are in contact with one another in the crystal structure have an attractive potential while all other pairs have a non-attractive potential. The only exception to this rule is backbone-backbone or backbone-sidechain pairs that participate in hydrogen bonds, which also have an attractive LJ potential with a well.depth of 0.25 kcal if the residues are within an α-helix (doubled to 0.5 kcal otherwise. The exact native sidechain-sidechain LJ well-depth is determined via the use of a contact potential. Contact potentials are derived experimentally based on how much different amino acids attract one another. CHARMMing supports several such potentials, one of the most used of which is the Miyazawa-Jernigan potential. For information, please see:

• Miyazawa S, Jernigan RL. "Residue-Residue Potentials with a Favorable Contact Pair Term and an Unfavorable High Packing Density Term, for Simulation and Threading." J. Mol. Biol. 256(3) pp. 623-644 (1996).

One important fact to realize is that the strength of the contact potential must be tuned to be in balance with that of the bonded forces present in the model. Therefore, the Lennard Jones parameters are scaled by a factor called nScale. This parameter varies from protein to protein, and thus must be recalculated for each system. This is usually done by running the system at different temperatures with carying choices of nScale. The ultimate nScale chosen for the model is that which best reproduces a physoiological melting temperature for the protein, which can often be found in the experimental literature. Often, temperature replica exchange is used to facilitate transitions between different temperatures, which reduces the amount of simulation steps necessary to obtain a converged melting temperature.

One other note is that the conbtact potential includes the effect of solvation on amino acid interactions. Therefore, KT Go models should never be run in the presence of explicit or implicit solvent, as this would duplicate an effect already found in the model in a highly unrealistic way (since none of the coarse grained beads are charged).

### Choosing nScale

In the very near future, CHARMMing will allow the user to select a target melting temperature and will automatically run calculations to try to determine the correct nScale. The determined nScale will be used for future calculations using that peptide.

The procedure for determining nScale is fairly straightforward. Firstly, an initial guess of the value must be made. Using 1.0 (no scaling up or down) is often a reasonable first guess. Then a brief (10-100 nanosecond) temperature replica exchange simulation is conducted using a range of temperatures above and below the hypothesized melting temperature. The fraction of time that each native contact is preserved is calculated at each temperature and from that a melting curve and melting temperature can be determined. If the melting temperature is too high, the value of nScale guess must be lowered (native contacts must be made weaker), and if it is too low then nScale should be raised (making native contacts stronger). This process is repeated until a physiological melting temperature is achieved.

## How to build the KT Go Model

Note: the following section describes the Python scripts that are used to build CHARMM topology and parameter files for KT Go models. It assumes minimal knowledge of Python.

CHARMMing uses a set of Python classes to build a KT Go model. These classes are part of the pychm library, which is developed in conjunction with CHARMMing. The main class is pychm.cg.ktgo.KTGo. A KT Go Python object can be easily created as follows:

```import pychm

pdbfile = pychm.io.pdb.PDBFile('your_pdb_file.pdb')[0]
cgm = pychm.cg.ktgo.KTGo(pdbfile)
```

There are a number of options that can be passed to the KTGo constructor. For example, the stride program is required to calculate the secondary structure of the PDB file, and a path to it can be passed in via the strideBion argument, e.g.:

```cgm = pychm.cg.ktgo.KTGo(pdbfile,strideBin="/path/to/stride/executable")
```

When the Python object is created, pychm determines the secondary structure and, most importantly, which pairs of residues form native contacts. It also stores the alpha carbon and side-chain center of mass positions, which are the initial coordinates of the coarse-grained beads. This is the basic information necessary to produce the topology, parameter, and PDB files for the model. However, other parameters (e.g. nScale, the contact set used, bond parameters, etc.) can be adjusted. This is done by setting model parameters in Python.

e.g. to change nScale from its default value of 1 to 0.7, you would use:

```cgm.nScale = 0.7
```

Once you've adjusted all the parameters to your liking (all of which are documented as pydoc strings so that Python's "help" functionality will list them), you can write out the foles to be read into CHARMM:

```cgm.write_pdb('mygomodel.pdb')
cgm.write_psf('mygomodel.psf')
cgm.write_rtf('mygomodel.rtf')
```

These files can be read directly into CHARMM to generate the PSF for the model; here is an example CHARMM script:

```* read in a KT GO model
*

open unit 10 read card name mygomodel.pdb
rewind unit 10

generate go setup first none last none ! makes the PSF

energy ! make sure everything is OK

write psf card name mygomodel.psf
* PSF for Go model
*

write coor card name mygomodel.crd
* CRD for Go model
*

stop
```

The model can then be used for future CHARMM simulations.

## What you can do with this model

Fundamentally, a KT Go model can be used to answer the same types of scientific questions that an all-atom model could, as long as those questions exist within the resolution of the model. Since atoms are heavily extracted, the KT Go model is not a good choice for studies that depend heavily on the exact position of particular atoms (e.g. the precise conformation of a binding pocket). However, these models have proven quite successful at reproducing thermodynamic properties such as transition rates.

### Simulation

Most simulations performed on KT Go models are dynamics. It is, of course, possible to do normal mode analysis, but these models are not always tuned to reproduce the correct frequency spectrum of the peptide. For details, see:

• Ghysels A, Miller BT, Pickard IV FC, and Brooks BR. "Comparing normal modes across different models and scales: Hessian reduction versus coarse-graining." J. Comput. Chem. 33(28) pp. 2250-2275 (2012).

Because these models are so cheap to run as compared to all-atom representations, it is generally possible to obtain hundreds of nanoseconds to several microseconds of dynamics. The use of an advanced sampling method like replica exchange can be used to rapidly explore conformational space.

### Analysis

Once an MD trajectory is obtained, either via single temperature dynamics or using a method like replica exchange, several types of analysis can be performed. As with an all-atom trajectory, structures can be clustered to identify various metastable states and folding transition pathways.