University of Kurdistan, Sanandaj logo

Department of Chemistry, Faculty of Science, University of Kurdistan

Creating parameter files for unusual residues

Unusual residues are in lines starting with HETATM in PDB files. There are no parameters in AMBER libraries for these residues. There are two distinct categories for the hetero groups, disconnected and connected hetero groups. In the following, we calculate parameters for S-(N-hydroxy-N-bromophenylcarbamoyl)glutathione (the residue shown by the GBP symbol in the 1QH5 PDB structure) as an example of a separate hetero group (there are no covalent bonds between this group and protein residues and a negatively charged  His residue which is connected to a protein backbone. We also explain treating metal centers in MD simulations.

1.   Calculating parameters of GBP, a disconnected ligand

1.1     Copy the structure of GBP in a text file. This can be copied from the starting pdb file (1QH5.pdb) or another file in the setup steps (cf. the AMBER-setup page).

grep "GBP" 01-tided_up    >  gbp.pdb

 

1.2     Protonate the GBP structure

Open the gbp.pdb file created in the previous step with a program such as GaussView6 (https://gaussian.com/gaussview6/). Let the program add protons to GBP and then save the structure as a .gjf of .com file. Check that the added protons be correct, and assign charge and multiplicity of the protonated structure.

1.3     Optimize the molecule and calculate atomic point charges

Optimize the geometrical structure of GBP with QM methods, typically B3LYP/6-31G* or AM1 for a big molecule. Then calculate the electrostatic potential at the HF/6-31G* level. For the Amber-99SB force fields, this should be done at the HF/6-31G* level. However, other force filed may use different levels of theory. The following is the gbp.gjf file we created for the GBP structure. Execute the following command, which optimizes the structure and calculates the electrostatic potentials.

g16 < gbp.gjf > gbp.out

 

The gbp.gjf  file:

%chk=gbp.chk

%nprocshared=28

%mem=10GB

#P B3LYP/6-31g* Opt

 

Title Card Required

 

-1 1

 C                 27.14000000   47.03800000   60.87800000

 O                 27.40700000   47.65900000   61.94000000

 O                 26.97900000   47.63600000   59.78500000

 N                 28.14900000   44.84500000   60.18000000

 C                 26.99900000   45.48600000   60.86500000

 C                 26.85800000   44.81700000   62.25900000

 C                 23.76500000   40.78800000   63.65500000

 S                 24.90300000   40.41100000   62.26200000

 C                 25.64500000   45.21100000   63.15100000

 C                 24.35400000   44.38800000   63.02900000

 O                 23.38100000   44.87400000   62.43900000

 N                 24.35500000   43.17000000   63.57800000

 C                 23.21500000   42.22800000   63.50800000

 C                 22.11900000   42.46500000   64.61800000

 O                 21.07100000   41.81000000   64.58200000

 N                 22.47700000   43.28500000   65.61700000

 C                 21.67700000   43.49300000   66.84100000

 C                 21.34000000   44.94400000   67.17600000

 O                 21.72000000   45.83700000   66.38800000

 O                 20.69300000   45.18000000   68.21100000

 N                 24.50500000   40.11300000   59.52100000

 O                 25.82900000   40.19500000   59.53200000

 C                 23.95100000   40.27100000   60.73500000

 O                 22.74100000   40.21900000   60.93800000

 C                 23.85600000   39.90200000   58.22700000

 C                 22.62600000   40.49800000   57.91200000

 C                 24.50000000   39.08900000   57.28200000

 C                 22.03200000   40.30000000   56.66800000

 C                 23.90900000   38.89400000   56.03900000

 C                 22.68900000   39.48800000   55.74800000

 Br                21.89300000   39.20400000   54.06600000

 H                 22.29423980   43.15145279   67.64552504

 H                 20.73665901   43.01956161   66.64991126

 H                 22.48780168   44.18789027   65.18726463

 H                 27.98971022   44.84244516   59.19277144

 H                 28.24783678   43.90416574   60.50413297

 H                 26.08116122   45.32830039   60.33811973

 H                 27.71771851   45.14200767   62.80686321

 H                 26.71816501   43.77684798   62.05060146

 H                 22.92878532   40.12775384   63.55640980

 H                 24.35556602   40.77384285   64.54714985

 H                 25.37488587   46.19465187   62.82794064

 H                 25.97894082   45.05041059   64.15479010

 H                 22.73499311   42.38795075   62.56518022

 H                 26.16519315   40.06293506   58.64254341

 H                 22.13705342   41.11360064   58.63785612

 H                 25.43521583   38.62447931   57.51543496

 H                 21.09639772   40.75940663   56.42614481

 H                 24.39531990   38.28517902   55.30569927

 H                 28.98432404   45.35648272   60.38154198

 H                 25.17669583   42.87640593   64.06648590

 

--Link1--

%chk=gbp.chk

%nprocshared=28

%mem=10GB

#P HF/6-31G* SCF=Tight Geom=AllCheck Guess=Read

   Pop=MK IOp(6/33=2,6/41=10,6/42=17)

 

 

1.4     Extract the reactants' GAFF parameters using the antechamber module of AMBER

antechamber  -fi  gout  -fo  prepi   -c  resp  -i  gbp.out  -o  gbp.in  -rn  GBP  -at  gaff  -pf  y

 

Sometimes, you may need to prepare some missing parameters using AMBER's parmchk2 module. The following command should be executed if this is the case.

parmchk2   -i   gbp.in   -f  prepi   -o   gbp.dat

2.   Parameterizing a negatively charged His residue

Histidine has three major protonation states at biological pHs. Single-protonated at the delta- or epsilon- positions and double-protonated (protonated at both). The first two states are neutral, and the third is positively charged. You have to change the residue name of HIS to HIE or HID for epsilon- or delta-protonated histidine, respectively (HIS is interpreted as HIE). To get the double-protonated form, change the residue name to HIP. A fully deprotonated histidine (no protons on either NE2 or ND1) would need to be parametrized (mainly just the charge derivation). Note that this is a highly unusual form of histidine that would only be present in special cases. For example, in the Rieske site, a His group coordinated to a Fe ion can be deprotonated on both ND1 and NE2 atoms and then negatively charged [1]. The difference in the parametrizing of this case with the previous one (GBP) is that it is ready to be connected to other amino acids at its N- and C-termini and we need to strip off the atoms at the beginning and the end to make an "amino acid"-like residue. A detailed procedure has been described for parametrizing{2-[(1R,2R)-1-amino-2-hydroxypropyl]-4-(4-hydroxybenzylidene)-5-oxo-4,5-dihydro-1H-imidazol-1-yl}acetic acid (CRO) at the AMBER home page on the following address. Here, we customize that procedure for a negatively charged histidine residue.

https://ambermd.org/tutorials/basic/tutorial5/index.php

2.1     Start with a PDB file containing at least one HIS residue (e.g., 1RIE) and protonate it using pdb4amber as:

 

pdb4amber -i 1RIE.pdb -o 1RIE-protonated.pdb --reduce

If your PDB structure is already protonated, remove HE2 or HD1 proton from an HIE or HID residue, respectively.

2.2     Change the symbol of the HID  or HIE residue to HIN in your PDB file by hand.

2.3     Copy the modified histidine residue into a new PDB file.

Grep “HIN” comqum.pdb > HIN.pdb

 

2.4     Open the HIN.pdb file with GaussView and then add two protons (one on the C and the other on the N of the backbone to complete the valance of these atoms).

2.5     Rename the added H atoms to HX1 and HX2 in the modified HIN.pdb file (HIN-V-complete.pdb).

2.6     Calculate BCC charges and assign AMBER atom types to the structure.

antechamber -fi pdb -i HIN-V-complete.pdb -bk HIN -fo ac -o hin.ac -c bcc -at amber -nc -1

This creates the hin.ac file which is so similar to the PDB file format, but the last column is the atom types.

2.7     The antechamber module assigns an incorrect atom type to backbone nitrogen (NT). Change it to N by hand or using the following command.

sed -i 's/NT/ N/' hin.ac

 

2.8     Create the hin.in file using prepgen

prepgen -i hin.ac -o hin.in -m hin.mc -rn HIN

where the hin.mc file is as

HEAD_NAME N

TAIL_NAME C

MAIN_CHAIN CA

OMIT_NAME HX1

OMIT_NAME HX2

PRE_HEAD_TYPE C

POST_TAIL_TYPE N

CHARGE -1.0

2.9     Copy the parm10.dat file into your working directory.

cp $AMBERHOME/dat/leap/parm/parm10.dat .

 

2.10  Creat hin.dat file

parmchk2 -i hin.in -f prepi -o hin.dat -a Y -p parm10.dat

grep -v "ATTN" hin.dat > hin1.dat # Strip out ATTN lines

parmchk2 -i hin.in -f prepi -o hin2.dat

 At this point, we get two .dat files; one with the parameters extracted from the Amber parameter database (hin1.dat) and one with the parameters extracted by matching to gaff atom types (hin2.dat). Between these two files, we have all the parameters we need.

However, running tleap, you need to load hin2.dat first, followed by hin1.dat, to ensure that all of the gaff parameters are overwritten, where we would rather use the ff14SB parameters instead.

3.   Treating metal centers

If there are metal centers in your protein, special treatment must be done. We treat metal sites with a nonbonded model with restraints between metals and their ligands. This is necessary to keep the structure of the metal cluster intact [2]. Otherwise, the metal-ligand bonds will oscillate during the simulations and break up the cluster in an unrealistic manner.

3.1     Cut the metal centers from a protonated PDB file, e.g., pdbout from tleap

grep "HID    54" pdbout  >   site.pdb

grep "HIE    56" pdbout  >> site.pdb

grep "ASP    58" pdbout  >> site.pdb

grep "HID    59" pdbout  >> site.pdb

grep "HID   110" pdbout >> site.pdb

grep "ASP   134" pdbout >> site.pdb

grep "HID   173" pdbout >> site.pdb

grep " ZN    521" pdbout >> site.pdb

grep " ZN    522" pdbout >> site.pdb

grep "XOH   523" pdbout >> site.pdb

 

3.2     Open the site.pdb file by GaussView

3.3     Cut the amino acids up to their alpha carbons (CA)

3.4     Replace the CA atoms with hydrogens, and save the final structure in the PDB file format.

3.5     Optimize the metal cluster by a QM program, like Turbomole, ORCA or Gaussian. In the following, we explain our customized procedure for Turbomole using our local programs.

3.5.1     Convert the PDB file to a control file (Turbomole input file).

changepdb <<EOF

site.pdb

c

 

n

 

EOF

 

Sometimes changepdb does not read GaussView files. If so, first convert the PDB file to the xyz format and then to the PDB format again, using OpenBabel [3], ad follows.

module purge

ml  GCC/4.9.3-2.25  OpenMPI/1.10.2 OpenBabel/2.3.2-Python-2.7.11

babel -i pdb site.pdb -o xyz site.xyz

babel -i xyz site.xyz -o pdb  pdb

 

3.5.2     Defile Turbomole calculations

define <<EOF

 

 

 

bb all def2-SV(P)

*

eht

 

0 # enter the charge of the cluster

 

 

 

dft

on

func tpss

 

 

 

ri

on

m

2500

 

 

dsp

on

bj

 

 

 

*

EOF

 

3.5.3     Fix the alpha carbons in the control file, adding the "f" letter at the end of the lines containing their coordinates.

3.5.4     Optimize the cluster by executing the following command

jobex      -backup      -c       900      -ri

 

3.6     Calculate atomic partial charges and force constants

In the directory where you performed geometry optimization, execute the following which is customized to be used on AURORA.

mkdir RESP FREQ

cp alpha  auxbasis  basis  beta  control  energy  hessapprox mos coord RESP/

cp alpha  auxbasis  basis  beta  control  energy  hessapprox mos coord FREQ/

cd RESP/

sed -i '/$coord/i $esp_fit kollman' control

sbatch /lunarc/nobackup/projects/teobio/Mehdi/q-files/q-RESP

cd ../FREQ/

sed -i 's/$rij/#$rij/' control

sbatch /lunarc/nobackup/projects/teobio/Mehdi/q-files/q-FREQ

 

This submits two jobs; one calculates RESP charges in the RESP subdirectory, and the other calculates vibrational frequencies in the FREQ subdirectory.

3.6.1     Extraction of RESP charges and updating atomic charges of the cluster in the prmtop file.

Copy the RESP charges from the end of logd file in the RESP subdirectory into an excel sheet. Change the charge of alpha carbons so that the total charge of the protein does not change.

First, update the charges in a PDB file which can be created by changeparm.

changeparm <<EOF

prmtop

p

pdbch

m

prmcrd

q

EOF

Then update the files in the prmtop file using the correct charges in the pdbch file.

changeparm  <<EOF

prmtop

m

pdbch

w

prmtop

q

EOF

 

3.6.2     Extraction of force constants from the frequency calculations.

You can extract force constants between the metal and its ligating atoms using describe, a local program. Execute something like the following in the Freq subdirectory. It would be better to add extra pairs, those bonds whose force constants are of interest.

describe <<EOF

t

control

fc.out

298.150

1.0

 

 

1.0

 

68 9

68 16

68 48

68 56

68 70

69 28

69 37

69 56

69 65

69 70

 

 

 

EOF

 

 

The force constants can be extracted from fc.out file. Note that AMBER reads force constants in kcal/mol/Å2. Fortunately, force constants with different units can be extracted from the fc.out file. The calculated restraints have to be included in a file called rst. A sample rst file is as the following, in which restraints of a metal ion coordinated by four ligands are listed. The values shown in boldface have to be changed to those of your system, where iat(1)iat(2)r2 and r3, and rk2 and rk3 are the atom number of the metal in the pdbout file, the ligand atom number in the pdbout file, metal-ligand distance in Å, and metal-ligand force constant in kcal/mol/Å2, respectively. The metal−ligand distances can be extracted from the crystal structure (averaged over the subunits) or from a small optimized QM-cluster model (cf. sections ‎3.1 to ‎3.5).

&wt type='END' &end

 

# ZN-HIE-562

# Site 1

&rst

iat(1)= 18762, iat(2)= 9005

r1=0.0, r2=2.19, r3=2.19, r4=10

rk2=9.6, rk3=9.6

&end

 

# ZN-CYM-568

# Site 1

&rst

iat(1)= 18762, iat(2)= 9090

r1=0.0, r2=2.36, r3=2.36, r4=10

rk2=16.3, rk3=16.3

&end

 

# ZN-CYM-573

# Site 1

&rst

iat(1)= 18762, iat(2)= 9167

r1=0.0, r2=2.36, r3=2.36, r4=10

rk2=16.3, rk3=16.3

&end

 

# ZN-CYM-577

# Site 1

&rst

iat(1)= 18762, iat(2)= 9232

r1=0.0, r2=2.36, r3=2.36, r4=10

rk2=16.3, rk3=16.3

&end

 

    &rst iat=0 &end

 

4.   References

[1]      T.A. LINK, W.R. HAGEN, A.J. PIERIK, C. ASSMANN, G. von JAGOW, Determination of the redox properties of the Rieske [2Fe-2S] cluster of bovine heart bc1 complex by direct electrochemistry of a water-soluble fragment, Eur. J. Biochem. 208 (1992) 685–691.

[2]      L. Hu, U. Ryde, Comparison of methods to obtain force-field parameters for metal sites, J. Chem. Theory Comput. 7 (2011) 2452–2463. https://doi.org/10.1021/ct100725a.

[3]      N.M. O’Boyle, M. Banck, C.A. James, C. Morley, T. Vandermeersch, G.R. Hutchison, Open Babel: An open chemical toolbox, J. Cheminform. 3 (2011) 33. https://doi.org/10.1186/1758-2946-3-33.