Ch121a SP2011 Problem Set 3: MD simulation of alanine tripeptide
Due: 04/25/2011 2:30pm in class
In this problem set, we will build all-alanine pentapeptide and tripeptide by ourselves in Maestro and simulate their conformational changes by LAMMPS molecular dynamics simulator. We will have the chance to re-discover the most prominent characteristic of an α-helix that Linus Pauling found by drawing a polypeptide on a strip of paper and folding it back and forth.
1) Login to one of servers (wolf11~wolf20.wag.caltech.edu, dell17~dell44.wag.caltech.edu) as you did in previous two lab sessions. Check if the server is busy by typing “top”. Create a clean directory by “mkdir yourdirectoryname”. Change working directory by “chdir yourdirectoryname”. Type “maestro” to start.
2) Build an alanine pentapeptide. In Maestro menu, go to Edit->Fragments. In the Build panel, select “Amino acids” in the Fragments select box. Select “Ala” and place it in the workspace. Grow four more alanines from the COOH-terminus of the first alanine. In Maestro's case, the C-term is capped by -NH-CH3 and the N-term is capped by CH3-CO-.
3) Change the backbone torsion angles to form an α-helix. Maestro Menu->Edit->Adjust->Dihedral. In the Adjust panel, pick four backbone atoms to define a dihedral angle. By selecting C(carbonyl C)-N-C(α-carbon)-N in the order of N-term to C-term, we define angle phi. Change phi angle to -60. By selecting N-C(α-carbon)-C(carbonyl C)-N, we define angle psi. Change psi angle to -45. You can delete the entry of the modified angle from the table panel to remove the marks in the workspace for better identifying the next angle to select. Repeat (phi, psi) selection and modification for the remaining alanines. Delete the capping group -NH-CH3 of the C-term and CH3-CO- of the N-term by double clicking the “X” button in the left toolbar and then selecting the unwanted atoms. Add hydrogens back to the N-term and C-term by double-clicking the “+H” button in the left toolbar and then click on the N and C whose valencies are not saturated by hydrogen.
4) What interaction would stabilize the α-helix conformation? What is the pattern of this interaction if you further grow the polypeptide in the same torsional angles for each added amino acid? Briefly describe it in your report. After you figure it out, you may want to hear from Linux Pauling on how he discovered α-helix (http://www.youtube.com/watch?v=yh9Cr5n21EE).
5) Save the α-helix alanine pentapeptide structure in mol2 format. Maestro Menu->Project->Export Structures.
6) Modify current α-helix pentapeptide structure to get tripeptide structure with standard torsional angles of α-helix. Save the alanine tripeptide structure to the working directory.
7) Get Mulliken charges for each structure. With only the target structure in workspace, go to Menu->Applications->Jaguar->Single Point Energy. In the Jaguar panel, select basis set 6-311G in “Molecule” subpanel, and check the Mulliken population box in the “Properties” subpanel. Click “Write...” at the bottom of the panel and write out yourjaguarjobname.in file. Open another Putty terminal, login to matrix by typing “ssh matrix”. Change to your working directory by “chdir yourdirectoryname”. Submit the jaguar job by “~wgliu/bin/qsubjaguar.py yourjaguarjobname.in”. Use “qstat -u yourusername” to check job status. Once it's finished, you will have a yourjaguarjobname.out file containing the results. If not mentioned specifically, the following steps can be done on the original server you logged in (wolf or dell). You don't need to stay on matrix.
8) Prepare bgf format structure file for MD simulation.
a. Convert your polypeptide structure file from mol2 format to bgf format by “/exec/cassandra/babel_linux -imol2 'yourstructurename.mol2' -obgf 'yourstructurename.bgf' ” (single quatation marks needed).
b. Assigning correct atom types of DREIDING force field by LINGRAF simulator developed at Caltech. Type “/exec/lingraf/lingv323/lingraf linux dreidii” to open LINGRAF graphical user interface.
Read a structure by Top menu->in-out->BioDesign->yourstructurename.bgf->OK->return. Assign DREIDING atom type by build->Organic->first selection under “Files”->edit->auto type->auto->return (after this your icon arrow will have a question mark, but ignore it)->return->return->return (back to Top menu).
c. Write the new bgf file with correct atom type and exit: in-out->write->BioDesign-> type yourstructurename_2.bgf under “Enter a name for the output file”->OK->OK->return->exit->OK.
d. Open yourstructurename_2.bgf file, you will find that the atomtype of for the N of the amine group is N_31, followed by “4 0” which means 4 bonds and 0 lone pair. This is wrong. Manually change the atomtype to N_3, and bonding info to “3 1” which means 3 bonds and 1 lone pair. Make sure the atomtype and the bonding info columns are aligned.
9) Assign Mullikan charges to the bgf file with correct atom type. Use “/project/Biogroup/Scripts/python/mullikenCharges.py yourjaguarjobname.out yourstructurename_2.bgf -o yourstructurename_3.bgf”. yourstructurename_3.bgf is final bgf file that we will use for MD simulation. Check the charges in the last column of bgf file and see if they make sense.
10) Create LAMMPS input files (we only use in.yoursuffix_singlepoint and data.yoursuffix ) using script of “/ul/tpascal/scripts/createLammpsInput.pl -b yourstructurename_3.bgf -f DREIDING -s yoursuffix”.
11) Fix the “in.yoursuffix_singlepoint” file to make it suitable for the non-periodic system.
a. Change “boundary p p p ” to “boundary f f f”.
b. Change “lj/charmm/coul/long/opt 9 10.00000” to “lj/charmm/coul/charmm 9 10.00000”.
c. Change “kspace_style pppm 0.0001” to “kspace_style none”.
12) Further modify the “in.yoursuffix_singlepoint” file for MD simulation.
a. Remove last line “run 0”.
b. Add the following lines (explanation for the lines are in the parenthesis which you don't need to type in):
timestep 1.0 (timestep 1.0 femtosecond)
thermo 100 (output thermodynamic info, like temperature, energy, pressure..., every 100 steps)
c. Assign initial velocities using Maxwell-Boltzmann distribution. We will set the initial temperature to 1000K.
velocity all create 1000.0 4928459 rot yes dist gaussian
(velocity all: one processor loops all atoms to generate velocities. create 1000.0 4928459: generate an ensemble of velocities with respect to specified temperature 1000.0K using randome number generator with specified seed 4928459; rot yes: angular momentum is zeroed; dist gaussian: ensemble of generated velocities satisfy guassian distribution).
- Heat the system from 1 to 1000 K
dump 1 all custom 1000 yoursuffix_heat.lammpstrj id type xu yu zu (1000 here means to save a snapshot every 1000 timesteps (1000*1 fs = 1 ps) to the trajectory file. You may want to change it to 100 for your trajectory analysis)
fix 1 all nve
fix 2 all temp/berendsen 1.0 1000.0 100.0
run 100000
unfix 2
unfix 1
undump 1
e. Run constant temperature MD at 1000K
dump 1 all custom 1000 yoursuffix_1000K.lammpstrj id type xu yu zu
fix 1 all nve
fix 2 all temp/berendsen 1000.0 1000.0 100.0
run 100000
unfix 2
unfix 1
undump 1
13) Enlarge the simulation box in the “data.yoursuffix” file
Change xlo xhi, ylo yhi, zlo zhi to
-100.0 100.0 xlo xhi
-100.0 100.0 ylo yhi
-100.0 100.0 zlo zhi
14) Run the MD simulation using:
/ul/ajaramil/lib/openmpi_matrix/bin/mpirun -np 2 /exec/lammps/bin/lmp_serial32 -in in.yoursuffix_singlepoint
15) View the trajectory from VMD
a. VMD (Visual Molecular Dynamics) is a molecular modeling program developed primarily for viewing and analyzing MD trajectories. Download the program from http://www.ks.uiuc.edu/Development/Download/download.cgi?PackageName=VMD. Install the program in your computer. Go to VMD main menu->File->New Molecule->Browse->Choose a trajectory file in current working directory->Load
b. Change Representation of the molecule and background color.
Main menu->Graphics->Representations->Drawing Method select box, try CPK, Licorice, VdW
and other methods. Use CPK or Licorice for now.
Main menu->Graphics->Colors..->Display(in Categories)->Background->8 white.
c. View the trajectory.
In the VMD Main window, play with all the buttons at the bottom of the window in the same way you play with your DVD player.
Conformational analysis for tripeptide 1000K simulation trajectory
a. Pick atoms in OpenGL Display, find the indexes of the atoms that define the two torsion angles of the middle alanine.
Press 1 on keyboard, click the atoms that define the dihedral angle in the order from N-term to C-term->Graphics->Labels->Check the index of each atom, write them down (say i, j, k, l)
b. Create a text file named get_phi.tcl, type in the following lines and save the file.
set outfile [open "phi.txt" w]
set phi [measure dihed {i j k l} frame all]
puts $outfile "$phi"
close $outfile
c. Menu->Extensions->Tk Console, type “source get_phi.tcl”. It will generate a “phi.txt” file in your current directory. Type
tr ' ' '\n' < phi.txt > phi.csv
You can open the phi.csv file using excel or openoffice in linux (the openoffice version is old in our wolf servers) and plot the trajectory of the dihedral angle as a function of time.
d. Plot histogram of the phi angle in Excel
Open phi.csv in Excel. Say I have 1001 frames by dumping 1 frame every 100 fs in a 100 ps simulation. Then column A1:A1001 contains my phi angle data. In entry B1, type '-180. Note that you type a single quote before the -180. In entry B2, type '-170. Select these two entries, move your mouse icon to the black dot in the bottom right corner of the selected box and drag down till you get a column of angles from -180 to 0 in B1:B19. Type '10 in B20, '20 in B21, use the same method get angles 10 to 180 in B20:B37. (Let me know if you have any alternative way to get the column from -180 to 180 fast...)
In entry C1, type =COUNTIFS($A$1:$A$1001, “>=”&B1, $A$1:$A$1001, “<”&B2), press Enter. Select entry C1, drag the black dot till C37. Then you have the frequencies of phi angles in each angle bin in column C1:C37.
Select B1:C37, go to Menu->Insert->Column (Charts)->2-D column. Update axis titles and delete the legend.
e. Repeat the above steps for the psi angle.
Conformational analysis for pentapeptide heating simulation trajectory
a. Delete your previously loaded trajectory (select the trajectory in VMD Main window, right click->Delete Molecule) and load the heating trajectory of your pentapeptide (yoursuffix_heat.lammpstrj). View the molecule in Licorice or CPK.
b. Watch the whole trajectory and get an idea what is happening as the temperature increases. Pay special attention to the characteristic interaction and similar interactions you identified when drawing the pentapeptide. How many drastic conformational changes happened in the 100 ps simulation? What are the interactions that stabilize the dominating conformations in the time periods between those drastic changes?
c. If it is hard to pick up the drastic conformational changes by eyes, let's do it in a quantitative way by RMSD trajectory analysis. RMSD is the root mean square deviation of the atom coordinates with respect to some reference frame (usually we exclude hydrogens in RMSD analysis if we only want to look at the overall conformational changes).
Go to Menu->Extensions->Analysis->RMSD Trajectory Tool. In the upper left region, change “protein” to “all”. Check the noh box. In the upper right panel, check the Trajectory Frame box and the Plot box. Click Ref: Top, set Trajectory Frame ref: to 1000 (this sets your last frame as your reference frame), click Align, then click RMSD. Report your plot. Change the Trajectory Frame ref: to 0 (this sets your first frame as the reference). Report this plot. What can you conclude from these two RMSD plots?
d. Identify the indexes of the frames where drastic conformational changes happen. Look for the temperatures associated with theses frames in log.lammps file. Report these temperatures.