University of Kurdistan, Sanandaj logo

Department of Chemistry, Faculty of Science, University of Kurdistan

Protein set up for molecular dynamics simulations with AMBER

Molecular dynamics (MD) simulations are a widely used computational chemistry method to study biomolecules. The technique has been extensively applied to study enzymes (biomolecules responsible for performing and speeding up chemical reactions in living organisms). Three-dimensional structures of enzymes have been obtained and deposited in protein databanks with the PDB file format (https://www.rcsb.org/). The files are rough and cannot be used for simulations without being modified. It is necessary to set up these files to perform MD simulations. The setup process includes: i) assigning protonation states of protein residues with more than one possible protonation state (His, Glu, Asp, Arg, Lys, and Cys), ii) creating parameter files for unusual residues, iii) selecting the best alternative conformation for residues that have more than one conformation, and iv) performing a few tidy-up steps.

Here we discuss how to set up and run MD simulations for human glyoxalase II (GlxII), PDB ID 1QH5, as an example [1]. The procedure is customized for the AMBER program package but it may be used for other MM programs.

Start with a PDB file (e.g., 1QH5 PDB ID of GlxII; https://www.rcsb.org/structure/1QH5) and read all the initial lines in the file before the coordinates (lines start with ATOM or HETATM). Reading the article describing the PDB structure is better to find complete information. We need to know the number of protein chains and if all chains are included in the structure. Are all subunits included in the PDB file? Are there any Cys-Cys crosslinks? Are there any missing parts of the structure (check REMARKs 465 and 470)? Are there any unnormal molecules in the structure (check HET and HETNAM)? At what pH was the structure collected?

1.   Create a directory on your machine, download the PDB file, and rename it, e.g., to 0-1QH5.pdb.

mkdir               MD-1QH5

wget                 https://files.rcsb.org/download/1QH5.pdb

mv                    1QH5.pdb               00-1QH5

 

2.   Remove unnecessary lines to make the file smother and easy to use

We need only the lines which start with ATOM, HETATM, and TER. Execute the following command and export the output in a new file (e.g., 01-tided_up).

cat      00-1QH5 | egrep "^ATOM|^HETATM|^TER" >   01-tided_up

3.   Remove ions from the buffer (e.g., SO4)

sed     '/SO4/d'        01-tided_up                                   >   02-no_buffer_ions

4.   Treating residues or atoms with alternative locations

Some residues or atoms in some PDB structures may have alternative locations. These can be found by looking at the occupancy numbers stored in columns 57-60 in PDB files. These atoms have occupancy numbers less than 1.00. If so, Keep the one with the highest occupation number and remove the alternative one. If all alternatives of one atom have the same occupation numbers (0.500), keep the first or do MD simulations to find the most stable one, as we did in a recent work [2]. Residues with occupation numbers less than 1.00 can be checked by changepdb (a local program), command occ. The 1QH5 structure has no atom or residue with alternative locations.

5.   Treating missing residues and missing atoms

Some residues or atoms are missing from some PDB files. These can be found in REMARKs 465 and 470 in the PDB files. Normally we do not attempt to model missing residues, especially if they are located at the start or the end of chains. However, missing atoms can be located by Swiss-PdbViewer. To this, open the PDB file by the software, and you will encounter a dialogue box saying, "Some amino acid side chain atoms are missing. A reconstruction of the whole side chain will be attempted. Affected amino acids will appear in pink.". After that, save the file with the missing atoms located. The 1QH5 structure does not have any missing atoms or residues.

6.   Check for short contacts

Crystal water molecules with short contacts with each other or other residues have to be removed from the final PDB file. Short contacts can be checked by changepdb, command short con.

7.   Assign protonation states of titrable residues

Some amino acid residues can have more than one protonation state (Arg, Lys, His, Glu, Asp, and Cys). Protonation states of these residues are not identified in the PDB files since there are no hydrogen atoms in such files. First, submit the PDB file to the PROPKA server (https://www.ddl.unimi.it/vegaol/propka.htm). It calculates pKa values of protein residues. Note the pKa of the His residues, and check if any of the Asp and Glu have a pKa above 7 or if any Lys, Arg, and Tyr have pKa below 7.

The main philosophy of assigning proper protonation states of titrable residues is that charged groups can only be presented on the surface of proteins, and there should be no buried charges not involved in ionic pairs. However, buried charges close to metal centers or active sites can be presented in the protein (for example, Glu-374 is the nucleophilic residue of aphid myrosinase [3], which is buried in the active site with no ionic pair and has to be left charged). The bc command of changepdb can extract buried charges. Check all these graphically with RasMol or similar programs.

7.1     Protonation states of Arg and Lys residues

Our experience shows that most Arg and Lys residues are located on the surface, and if they are buried, they have a negative charge neighbor (Glu or Asp). Then we leave the symbol of these residues as are in the PDB structures (ARG or LYS).

7.2     Protonation states of Glu and Asp residues

The story is a little complicated for the Asp and Glu residues. These residues are usually found paired with Arg and Lys residues on surfaces and sometimes buried inside proteins. In these cases, we leave their symbols as are in the starting PDB files (GLU or ASP). Sometimes they are buried but coordinated to a metal ion (e.g., Glu-99 in human glyoxalase I, which is coordinated to a Zn ion [4]) and have to be charged (GLU and ASP symbols). In some proteins, a Glu or Asp residue is buried with no ion pair and is not a metal-ligand but has a general acid/base or nucleophilic role in the catalytic reaction. When an Asp or Glu residue has a general acid role in the catalytic reaction, they have to be protonated (their symbol has to be changed to ASH or GLH). For example, Glu-167 and Glu-374 are the general acid/base and nucleophilic residues of aphid myrosinase [3] and have to be shown by GLH and GLU symbols, respectively. Note that if you change the symbol of Asp and Glu residues to ASH and GLH, the tleap module of the program will add a proton to their OD2 and OE2 atoms, respectively. Sometimes, we may desire to add a proton to OD1 or OE1 atoms. For example, if we want to add a proton to the OE1 atom of a Glu residue, first change its symbol to GLH, then swap the coordinates of the OE1 and OE2 atoms in the PDB file.

7.3     Protonation states of His residues

His residues can have three different protonation states. They can be protonated on ND1 (HID), protonated on NE2 (HIE), or doubly protonated on both ND1 and NE2 atoms and charged (HIP). We need to know the hydrogen bond network around ND1 and NE2 atoms to assign protonation states of His residues. RasMol is a good candidate for this. If you want to show, for example, neighboring atoms of 26th residue with at least 3.5 Å by RasMol, execute the following, and then from the Display tab, select Ball & Stick.

restrict within(3.5, 26)

 

For instance, if the ND1 atom of a His residue is close to a hydrogen bond acceptor, e.g., a backbone O atom, then the His residue has to be protonated on that atom. On the other hand, if the ND1 atom is close to a hydrogen bond donor atom, e.g., the OG atom of a serine residue, it has to be protonated on NE2. Sometimes His residues are metal ligands. In these cases, the His residue has to be protonated on the atom that is not coordinated with the metal ion (e.g., in the 1QH5 structure His-54 is coordinated to a Zn ion by its NE2 atom [1], and its symbol has to be changed to HID). Those His residues which are on the surface and there to many negative charges in the protein can be assigned as HIP.

7.4     Protonation states of Cys residues

Most Cys residues must be unchanged (protonated) unless they are coordinated to a metal ion or make Cys-Cys crosslinks. In the former, we change the symbol of the corresponding residue to CYM, and in the latter, we change the symbol to CYX. The Cys-Cys crosslinks are mentioned in the lines starting with SSBOND in the PDB files. In these cases, we need to modify the tleap.in file (see section 1 on the MD page).

After assigning the protonation states, change the symbol of the desired residue in the final PDB file (01-tided_up) by hand. The his command of changepdb changes the symbols of his residues to desired ones.

8.   References

[1]      A.D. Cameron, M. Ridderström, B. Olin, B. Mannervik, Crystal structure of human glyoxalase II and its complex with a glutathione thiolester substrate analogue, Structure. 7 (1999) 1067–1078. https://doi.org/10.1016/S0969-2126(99)80174-9.

[2]      S. Jafari, U. Ryde, M. Irani, QM/MM Study of the Catalytic Reaction of Myrosinase; Importance of Assigning Proper Protonation States of Active-Site Residues, J. Chem. Theory Comput. 17 (2021) 1822–1841. https://doi.org/10.1021/acs.jctc.0c01121.

[3]      H. Husebye, S. Arzt, W.P. Burmeister, F. V. Härtel, A. Brandt, J.T. Rossiter, A.M. Bones, Crystal structure at 1.1Å resolution of an insect myrosinase from Brevicoryne brassicae shows its close relationship to β-glucosidases, Insect Biochem. Mol. Biol. 35 (2005) 1311–1320. https://doi.org/10.1016/j.ibmb.2005.07.004.

[4]      A.D. Cameron, M. Ridderström, B. Olin, M.J. Kavarana, D.J. Creighton, B. Mannervik, Reaction Mechanism of Glyoxalase I Explored by an X-ray Crystallographic Analysis of the Human Enzyme in Complex with a Transition State Analogue, Biochemistry. 38 (1999) 13480–13490. https://doi.org/10.1021/bi990696c.