
Department of Chemistry, Faculty of Science, University of Kurdistan
Protein Setup for Molecular Dynamics Simulations
Molecular Dynamics (MD) simulations are widely used in computational chemistry to study the structural and dynamical behavior of biomolecules. The technique has been extensively applied to enzymes, the biological catalysts that accelerate chemical reactions in living organisms. Three-dimensional structures of enzymes are available from the Protein Data Bank (PDB) (https://www.rcsb.org/) in .pdb format. However, raw PDB files are not directly suitable for simulations: they often contain incomplete residues, alternate conformations, and crystallographic artifacts that must be corrected before starting an MD simulation.

This tutorial describes a standardized protocol for preparing protein structures for MD simulations, using Aphid Myrosinase (PDB ID: 1WCG) as a case study. The procedure is optimized for the AMBER software package but can be adapted to other molecular mechanics programs. Some setup steps use our PDBtoORCA toolkit, which can be downloaded from its GitHub repository (https://github.com/iranimehdi/pdbtoorca).

1. Preliminary Steps

1.1 Gather Structural Information
Before modifying the PDB file, inspect it to understand:
· Number of chains and completeness of the structure
· Presence of missing residues or atoms (see REMARK 465 and REMARK 470)
· Metal centers and Cys–Cys disulfide bonds
· Non-standard molecules (check the HET and HETNAM lines)
· Experimental pH and crystallographic details
It is highly recommended to read the publication that describes your PDB structure.
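The inspection items listed in 1.1 can be partly automated. The following is a minimal sketch, assuming a standard fixed-column PDB file; `pdb_summary` is a hypothetical helper name and is not part of PDBtoORCA. It only counts and lists records, so the REMARK sections and the publication still need to be read by hand.

```shell
# Sketch: summarize the records worth inspecting before editing a PDB file.
# pdb_summary is a hypothetical helper (an assumption, not a PDBtoORCA command).
pdb_summary() {
    local pdb="$1"
    # Chain identifiers are stored in column 22 of ATOM/HETATM records.
    printf 'chains: '
    awk '/^(ATOM|HETATM)/ { print substr($0, 22, 1) }' "$pdb" | sort -u | tr '\n' ' '
    printf '\n'
    # Residue names (columns 18-20) of HETATM records: waters, ions, ligands.
    printf 'hetero residues: '
    awk '/^HETATM/ { print substr($0, 18, 3) }' "$pdb" | sort -u | tr '\n' ' '
    printf '\n'
    # REMARK 465 lists missing residues, REMARK 470 lists missing atoms.
    printf 'REMARK 465 lines: %s\n' "$(grep -c '^REMARK 465' "$pdb")"
    printf 'REMARK 470 lines: %s\n' "$(grep -c '^REMARK 470' "$pdb")"
    # SSBOND records describe disulfide bonds.
    printf 'SSBOND records: %s\n' "$(grep -c '^SSBOND' "$pdb")"
}
```

Once the file has been downloaded (section 1.2), it could be called as `pdb_summary 00-1WCG.pdb`. The REMARK counts include the header lines of each REMARK block, so they indicate presence rather than the exact number of missing residues.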
1.2 Setting Up the Working Directory
    mkdir MD-1WCG
    cd MD-1WCG
    wget https://files.rcsb.org/download/1WCG.pdb
    mv 1WCG.pdb 00-1WCG.pdb
Use sequential filename prefixes (e.g., 00-, 01-, 02-) for each step to keep the workflow reproducible and easy to follow.

2. Structure Cleanup

2.1 Remove Unnecessary Lines
Keep only the coordinate and termination lines:
    grep -E '^(ATOM|HETATM|TER)' 00-1WCG.pdb > 01-tidy_up.pdb
2.2 Remove Buffer Ions
Example: remove sulfate ions (SO₄²⁻)
    sed '/SO4/d' 01-tidy_up.pdb > 02-no_buffer_ions.pdb

2.3 Remove Irrelevant Molecules
Example: remove glycerol molecules (GOL)
    sed '/GOL/d' 02-no_buffer_ions.pdb > 03-no_GOL.pdb
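The sed commands above delete every line that contains the pattern anywhere. That is fine for SO4 and GOL, but a short residue name such as CA (calcium) would also match every alpha-carbon line. A stricter variant, sketched below under the assumption of standard PDB columns, compares only the residue-name field (columns 18-20). `strip_resname` is a hypothetical helper, not a PDBtoORCA command.

```shell
# Sketch: drop ATOM/HETATM records whose residue name matches exactly.
# strip_resname is a hypothetical helper (an assumption, not part of PDBtoORCA).
strip_resname() {
    local resname="$1" pdb="$2"
    awk -v res="$resname" '
        /^(ATOM|HETATM)/ {
            # Residue name lives in columns 18-20; strip padding before comparing.
            name = substr($0, 18, 3)
            gsub(/ /, "", name)
            if (name == res) next      # skip records of the unwanted residue
        }
        { print }                      # keep everything else unchanged
    ' "$pdb"
}
```

For example, `strip_resname SO4 01-tidy_up.pdb > 02-no_buffer_ions.pdb` would give the same result as the sed command in 2.2 for this structure.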
2.4 (Optional) Remove Crystal Waters
This is useful before docking or initial preparation. After docking, you can restore the crystal water molecules, but remove any that are in short contact with the docked ligand.
    sed '/HOH/d' 03-no_GOL.pdb > 04-no_HOH.pdb
3. Select a Representative Chain
If the structure contains multiple homologous chains, you may select one (e.g., chain A) to reduce computational cost. This can be done manually or with the PDBtoORCA toolkit using the command:
    pdbtoorca <<EOF
    04-no_HOH.pdb
    Chain A
    05-chainA.pdb
    q
    EOF
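For the manual route, the chain identifier can be read directly from column 22 of each coordinate record. The sketch below assumes standard PDB columns; `select_chain` is a hypothetical helper name and does not reproduce what PDBtoORCA does internally.

```shell
# Sketch: keep only the records that belong to one chain.
# select_chain is a hypothetical helper (an assumption, not part of PDBtoORCA).
select_chain() {
    local chain="$1" pdb="$2"
    awk -v c="$chain" '
        # Column 22 holds the chain ID in ATOM, HETATM, and full TER records.
        /^(ATOM|HETATM|TER)/ && substr($0, 22, 1) == c { print }
    ' "$pdb"
}
```

A call such as `select_chain A 04-no_HOH.pdb > 05-chainA.pdb` mirrors the filenames used above. Bare TER lines that carry no chain ID are dropped by this filter, so check that the output still ends with a terminator if your downstream tools expect one.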
4. Handling Missing Residues or Atoms
Check REMARK 465 and REMARK 470 in the PDB file. It is usually unnecessary to model missing residues, especially if they are located at the start or end of a chain.

5. Managing Alternative Locations
Atoms or residues with alternative conformations have occupancies below 1.00. Keep the conformation with the highest occupancy and delete the others. Automate this with:
    pdbtoorca occ
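To illustrate the rule "keep the conformation with the highest occupancy", here is a minimal awk sketch. It is not the algorithm used by `pdbtoorca occ`; it assumes standard PDB columns (alternate-location flag in column 17, occupancy in columns 55-60) and decides atom by atom, which can mix conformers if occupancies differ within one residue. `keep_top_altloc` is a hypothetical helper name.

```shell
# Sketch: keep, for each atom, only the alternate location with the highest occupancy.
# keep_top_altloc is a hypothetical helper (an assumption, not part of PDBtoORCA).
keep_top_altloc() {
    local pdb="$1"
    # The file is read twice: pass 1 records the best altLoc per atom, pass 2 filters.
    awk '
        /^(ATOM|HETATM)/ {
            alt = substr($0, 17, 1)
            # Key: chain + residue number + insertion code, then atom name.
            k = substr($0, 22, 6) "|" substr($0, 13, 4)
            if (NR == FNR) {                      # pass 1
                occ = substr($0, 55, 6) + 0
                if (alt != " " && (!(k in best) || occ > bestocc[k])) {
                    best[k] = alt
                    bestocc[k] = occ
                }
                next
            }
            if (alt != " ") {                     # pass 2
                if (best[k] != alt) next          # discard the lower-occupancy copy
                $0 = substr($0, 1, 16) " " substr($0, 18)   # blank the altLoc flag
            }
        }
        NR != FNR { print }
    ' "$pdb" "$pdb"
}
```

When two alternates have equal occupancy, the first one in the file is kept. Occupancy values are left as they are in the input.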
6. Checking for Short Contacts
Short contacts may arise from overlapping residues, water molecules, or alternate locations. Identify them using:
    pdbtoorca shortcon
Remove or adjust problematic atoms accordingly.

7. Assigning Protonation States
Assigning correct protonation states is essential for accurate electrostatics and catalytic modeling. Key principles:
· Charged residues should generally be located on the protein surface to maintain solubility and a realistic electrostatic distribution.
· Buried charges should be carefully examined. They are acceptable only if they are:
  o participating in metal coordination,
  o forming an ionic pair, or
  o acting as a catalytic residue (e.g., a nucleophile).

7.1 Calculate pKa Values Using PROPKA
Upload your PDB file to the PROPKA server to estimate residue pKa values and to identify surface and buried charges.

7.2 Protonation States of Arginine (Arg) and Lysine (Lys)
Typically charged (pKa > 7).
· If pKa < 7, they may be buried.
  o Check for ionic partners (Asp or Glu) using RasMol, e.g., for residue 26:
        restrict within(3.5, 26)
· If no ionic partner is present, neutralize Lys → LYN. (AMBER has no default parameters for neutral Arg.)

7.3 Protonation States of Glutamate (Glu) and Aspartate (Asp)
Usually charged (pKa < 7).
· If pKa > 7, they may be protonated (neutral):
  o Asp → ASH
  o Glu → GLH
· Always retain the charged form when the residue is:
  o coordinated to a metal ion (e.g., Zn²⁺)
  o acting as a nucleophile in catalysis
· Examples from Aphid Myrosinase:
  o Glu-374 (nucleophile [1]) → GLU
  o Glu-167 (acid/base role) → GLH
⚠️ In AMBER, if you protonate GLU/ASP, the hydrogen is added to OE2/OD2 by default. To protonate OE1/OD1 instead, swap the coordinates of the two oxygens before running tleap.
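Renaming a residue such as Glu-167 to GLH means rewriting columns 18-20 on every atom line of that residue while leaving all other columns in place. A minimal sketch follows, assuming standard PDB columns and a three-letter new name; `rename_residue` is a hypothetical helper, and the filenames in the usage line are placeholders.

```shell
# Sketch: rename one residue (e.g. GLU -> GLH) without shifting any PDB columns.
# rename_residue is a hypothetical helper (an assumption, not part of PDBtoORCA).
rename_residue() {
    local chain="$1" resseq="$2" old="$3" new="$4" pdb="$5"
    awk -v c="$chain" -v n="$resseq" -v old="$old" -v new="$new" '
        /^(ATOM|HETATM)/ {
            # Chain ID: column 22; residue number: columns 23-26; name: columns 18-20.
            if (substr($0, 22, 1) == c && substr($0, 23, 4) + 0 == n &&
                substr($0, 18, 3) == old) {
                $0 = substr($0, 1, 17) new substr($0, 21)
            }
        }
        { print }
    ' "$pdb"
}
# Usage (placeholder filenames): rename_residue A 167 GLU GLH in.pdb > out.pdb
```

The old residue name is checked as a safeguard, so a wrong residue number leaves the file unchanged instead of renaming the wrong residue.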
7.4 Protonation States of Histidine (His)
· pKa > 7: protonated (HIP), particularly if the residue is solvent-exposed or buried near negatively charged residues such as aspartate or glutamate
· pKa < 7: neutral (HID or HIE)
· Examine hydrogen-bonding interactions:
  o ND1 near a backbone carbonyl → HID
  o NE2 near a backbone nitrogen → HID
  o NE2 near a backbone carbonyl → HIE
  o ND1 near a backbone nitrogen → HIE
· If coordinated to a metal (e.g., Zn²⁺ via NE2), protonate the other nitrogen:
  o Example: His-54 in 1QIN [2] → HID
⚠️ If the protein carries too many negative charges, His residues on the surface can be assigned as HIP.

8. Protonation States of Cysteine (Cys)
· Default: CYS (protonated)
· Metal coordination → CYM
· Disulfide bonds → CYX (check the SSBOND lines in the PDB file)
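Because the cleanup in 2.1 keeps only ATOM, HETATM, and TER lines, the SSBOND records have to be read from the original download (00-1WCG.pdb). The sketch below lists the chain and residue number of every cysteine named in an SSBOND record, assuming the standard SSBOND column layout; `list_ssbond_cys` is a hypothetical helper name.

```shell
# Sketch: list the cysteines that take part in disulfide bonds (candidates for CYX).
# list_ssbond_cys is a hypothetical helper (an assumption, not part of PDBtoORCA).
list_ssbond_cys() {
    local pdb="$1"
    awk '/^SSBOND/ {
        # First partner: chain in column 16, residue number in columns 18-21.
        printf "%s %d\n", substr($0, 16, 1), substr($0, 18, 4) + 0
        # Second partner: chain in column 30, residue number in columns 32-35.
        printf "%s %d\n", substr($0, 30, 1), substr($0, 32, 4) + 0
    }' "$pdb"
}
```

Running `list_ssbond_cys 00-1WCG.pdb` prints one "chain residue-number" pair per line. An empty output means the file declares no disulfide bonds.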
9. Finalizing the Structure
After assigning all protonation states:
· Modify residue names manually (e.g., ASP → ASH, GLU → GLH).
· Or automate the histidine adjustments:
    pdbtoorca his
The finalized structure now represents the biologically relevant state and is ready for parameterization and solvation in AMBER.

10. References
[1] S. Jafari, U. Ryde, M. Irani, QM/MM study of the catalytic reaction of aphid myrosinase, Int. J. Biol. Macromol. 262 (2024) 130089. https://doi.org/10.1016/j.ijbiomac.2024.130089.
[2] S. Jafari, N. Kazemi, U. Ryde, M. Irani, Higher Flexibility of Glu-172 Explains the Unusual Stereospecificity of Glyoxalase I, Inorg. Chem. 57 (2018) 4944–4958. https://doi.org/10.1021/acs.inorgchem.7b03215.
Mehdi Irani