Basic Solvation Input Script
Preparation
This tutorial will surround a protein in water. First, the user must
have a coordinate file containing enough waters to cover the entire
system. One with roughly 140,000 atoms can be downloaded from the
tutorial website
http://www.charmmtutorial.org
along with the other files used in this section.
The protein file will be read in before the water file. After the proteins structure and coordinates have been read in, the user must align the protein at the geometric origin with the "COOR ORIEnt" command.
At this point the waters that will used for solvation should be read in. Since we need to generate a new segment for the waters, we must read them in as a sequence and then execute the GENErate command. Once the file containing the waters is opened the following command should be issued:
| READ SEQUence TIPS 46656 |
The "SEQUence" command will read in the residues. Since this PDB file is purely waters, the RESId "TIPS" can be specified followed by the number of residues in the PDB (46656). Once the waters are read in the sequence can be generated:
| GENErate WATErs NOANgle NODIhedral |
The water coordinates should now be read and appended to the PSF using the append option to "READ COOR".
* Solvate Protein * bomlev -2 ! Read in Topology and Parameter files open unit 1 card read name top_all27_prot_na.rtf read RTF card unit 1 close unit 1 open unit 1 card read name par_all27_prot_na.prm read PARA card unit 1 close unit 1 ! solvation structure is rhdo ! Read protein structure from PSF file open read formatted unit 27 name 1yjp-full-min.psf read psf card unit 27 close unit 27 open read unit 10 card name 1yjp-full-min.crd read coor card unit 10 close unit 10 ! orient about origin so the alignment is correct with the water crystal coordinate orient |
Since the protein has been centered about the origin, the waters must also be centered about the origin with the "COOR ORIEnt" command. To make sure that the waters do not rotate, use the "NOROtation" command. Using...
| COORdinate ORIEnt NOROtation SELEct SEGId WATErs END |
will ensure that only the waters will be centered about the origin.
There may be overlapping atoms between the protein and the newly added waters, which will cause dynamics simulations to fail. The overlapping water molecules will have to be removed with the "DELEte" command. "DELEte ATOM SORT" informs CHARMM that the user will be deleting atoms and after that is done, the PSF will be re-sorted with the selected atoms removed.
A select command will specify which atoms are to be removed.
| ! Delete waters which overlap with protein DELEte ATOM SORT SELEct .BYREs. (SEGId WATErs .AND. TYPE OH2 .AND. - ((.NOT. (SEGId WATErs .OR. HYDRogen)) .AROUnd. 2.5)) END |
Since the residues are known and were read in with the "SEQU" command, it would be a good idea to select the atoms for deletion by their residues. A segid of "WATErs" was given to the water PDB so to select waters, the user can input "SEGId WATErs". Just to be sure all waters are selected, "TYPE OH2" will select all atoms with the atom name, "OH2". Now that the command has ensured that only waters will be chosen, it must be guaranteed that the waters will only be removed if they are too close to the protein. Select offers an "AROUnd" feature that will select atoms within a given proximity. Select should be followed by a number in angstroms and 2.5 is a reasonable value.
Not all 46,656 waters may be necessary in the solvation of a protein. Removing extraneous waters can result in faster data gathering and reduce clutter. CHARMM has a "CUT" command which can cut off all atoms in a radius specified in angstroms. The user will have to scan their PDB and decide a good cutoff based on the Cartesian coordinates of their protein. In order to use the feature, the user must first establish a point from where the radius will extend. Since this tutorial positioned both the waters and the protein at the origin, the point will be at the origin.
|
! Delete extraneous waters DELEte ATOM SORT SELEct .BYREs. - (.NOT.(POINt 0. 0. 0. CUT @SAFESPHERE) .AND. ( SEGId WATErs )) END
! where @SAFESPHERE is the size of the protein (25 Å) plus some buffer
|
If the user wants to minimize and/or apply molecular dynamics to the solvated PDB then a sphere will result in errors when using images; therefore, the user may want to solvate in a regular geometric structure. First, the user must determine the X,Y, and Z length of the box. This tutorial will remove anything outside the 45x45x45 Å box.
|
! Takes statistics on coordinates to figure out the minimum, maximum, and ! average values of the X, Y, and Z coordinates of the protein. These are ! stored in XMIN,XMAX,XAVE,YMIN,YMAX,YAVE,ZMIN,ZMAX,ZAVE. COORdinate STATistics SELEct protein END
! If you want to solvate 10 Å from the edge of the protein then
CALC XDIM = ( ABS ( ?XMAX - ?XMIN ) + 20 |
The input script should now look like:
* Solvate Protein * bomlev -2 ! Read in Topology and Parameter files open unit 1 card read name top_all27_prot_na.rtf read RTF card unit 1 close unit 1 open unit 1 card read name par_all27_prot_na.prm read PARA card unit 1 close unit 1 ! Read protein structure from PSF file open read formatted unit 27 name 1yjp-full-min.psf read psf card unit 27 close unit 27 open read unit 10 card name 1yjp-full-min.crd read coor card unit 10 close unit 10 ! orient about origin so the alignment is correct with the water crystal coordinate orient ! Takes statistics on coordinates to figure out the minimum, maximum, and ! average values of the X, Y, and Z coordinates of the protein. These are ! stored in XMIN,XMAX,XAVE,YMIN,YMAX,YAVE,ZMIN,ZMAX,ZAVE. COORdinate STATistics SELEct protein END ! Read in water sequence READ SEQUence TIPS 46656 ! Generate new segment for the water GENErate WATers NOANgle NODIhedral ! Read the water PDB coordinates and append them to the protein OPEN READ FORMATTED UNIT 1 water.crd READ COOR CARD UNIT 1 APPEND CLOSE UNIT 1 COORdinate ORIEnt NOROtations SELEct SEGId WATErs END ! Delete extraneous waters DELEte ATOM SORT SELEct .BYREs. - (.NOT.(POINt 0. 0. 0. CUT @SAFESPHERE) .AND. ( SEGId WATErs )) END ! xdim, ydim, and zdim are variables that store the dimensions of ! the protein along the x, y, and z axis respectively ! Because the user requested 10 distance from the edge, ! the distance is multipled by two and then added to the diameter of ! the structure. CALC XDIM = ( ABS ( ?XMAX - ?XMIN ) + 20 ) CALC YDIM = ( ABS ( ?YMAX - ?YMIN ) + 20 ) CALC ZDIM = ( ABS ( ?ZMAX - ?ZMIN ) + 20 )
Now that we have the dimensions of the system defined it is time to build the crystal and setup the images that will be employed in molecular dynamics simulations that use periodic boundary conditions.
To start things off, use the "CRYStal" command followed by the "DEFIne CUBE" so CHARMM knows we are simulating a protein in a water cube. Following this command, the Cartesian dimensions of the cubic box (the total length of each of the three sides) must be given. As shown above the structure can be used to determine the dimensions of the box. Using the values of XDIM, YDIM and ZDIM the lattice parameters (a, b and c) can be defined. Additionally, the angles of the shape need to be defined and since a cubic box is requested alpha, beta and gamma should be set to "90. 90. 90." These commands should be followed by:
| CRYStal BUILd NOPER 0 |
NOPEr is the number of space group operations. For a cube, "NOPEr" is zero since there are no operations needed for a cubic crystal. All together you should have something that resembles the following:
| CRYSTAL DEFINE CUBE @XDIM @XDIM @XDIM 90.0 90.0 90.0
CRYSTal BUILd NOPEr 0 IMAGe BYREs SELEct SEGId .NOT. SEGId PROTein END IMAGe BYSEg SELEct SEGId PROTein END |
The images should be defined with the "IMAGe BYREs" (by residue) command as shown above for atoms not in the protein. The SELEct command is used to choose atoms that are not proteins (.NOT. SEGId PROTEIN). This assumes that the protein SEGId name is PROTein. For the protein, the images must be based on the entire segment rather than the residues and therefore the "IMAGe BYSEg" (by segment id) command is used.
Next the image list must be built and atoms that are located outside of the box must be deleted.
! COPY MAIN COORDINATES TO COMPARISON SET COOR COPY COMP ! SET WRNLEV DOWN TO REDUCE CLUTTER... WRNLev 1 ! UPDATE THE IMAGE LISTS. IF ANY ATOMS MOVE DURING THIS PROCESS, IT MEANS ! THAT THEY ARE EXTRANEOUS TO THE CRYSTAL STRUCTURE THAT WE WANT TO BUILD ! AND SHOULD BE DELETED. THEREFORE, COOR DIFF IS USED TO DETECT MOVING ! ATOMS, WHICH HAD SELECTED FOR DELETION. UPDAte INBFrq 0 COOR DIFF DEFIne SEL1 SELEct .BYREs. (PROPerty X .NE. 0.0 .OR. - PROPerty Y .NE. 0.0 .OR. PROPerty Z .NE. 0.0) END ! COPY COMPARISON COORDINATES BACK TO MAIN COOR SWAP ! DELECT ATOMS THAT ARE OUTSIDE THE BOX DELEte ATOM SELEct SEL1 END ! WRITE OUT A CRYSTAL TRANSFORM FILE OPEN UNIT 1 CARD WRITe NAME 1yjp_cube.xtl CRYStal WRITe CARD UNIT 1 * SOLVATED STRUCTURE * SOLVATION: CUBE WITH A CRYSTAL DIMENSION OF @XDIM * |
It is now advisable to minimize a newly solvated structure before applying molecular dynamics, otherwise, the initial motion of the protein may be very jerky as the system tries to reach a stable conformation. Therefore, the recommended method is to first minimize the system to stabilize the energy and then gradually add energy back to it by heating during dynamics (will be discussed later).
Now the non-bond values must be set. Start off the command with "NBONd". In order to save time and memory, CHARMM allows the user to create a cutoff so the van der Waals energy (which is uses the common Lennard-Jones potential) is only calculated if two atoms are within a given distance of one another. When two atoms approach the cutoff distance, a switching function removes the contributing energies from the atoms without creating a discontinuity. CHARMM will keep track of the coordinates via "INBFRq"to check if they are within the cutoff value. The "e;CTOFnb"e; and "CTONnb" parameters define the range within which the switching function is used. The numbers associated with cutnb, ctofnb, and ctonnb are in angstroms and a description of these parameters follows.
Taking the above figure as an example, a pair of atoms that are separated by 8 (CTONnb) to 10 Å (CTOFnb) will initiate the switching function. CHARMM will keep track of atom pairs between 10 and 12 Å until the distance between the atoms in the pairs is greater than CUTNb (12 Å). A graph of the potential energy between two would look like:
As soon as pairwise distance between atoms is between 8 and 10 Å, CHARMM will activate the switching function bringing the contribution value to zero as shown by the graph. This is done using the "VSWItch" option to the NBONd command.
This tutorials values for CUTNb, CTOFnb, and CTONnb will be specified with the options:
| CUTNb 16.0 CTOFnb 12. CDIE EPS 1. CTONnb 8. |
The molecules near the edge of a box act as if they are interacting with images of other molecules that are also near the edge of the box in the system. The user controls this by defining the ends of the boundaries in the crystal command discussed above. The diagram below shows the original system (in red) with CHARMM-created images surrounding the original system. The circles represent the radius in which the original system atoms interact with the image atoms.
Setting the radius size can be done with the "CUTIm" option, which creates a cutoff for atoms interacting with another images atoms. This tutorial will use the value 16.0. CHARMM will then need to know how often to update the image list and this can be done with the "IMGFrq" command. Setting it to -1 will let CHARMM know to only update it when necessary.
If two atoms become too close, CHARMM has a warning system implemented to inform the user. To set the warning distance (in Å) use the command "WMIn" followed by a number. 1.5 is the default.
Generally, the most costly part of the calculation is the long-range electrostatic interactions between atoms. supports the Particle-Mesh Ewald (PME) summation method which is a particularly efficient way of calculating these. More information can be found by consulting ewald.doc in the CHARMM documentation. This can be set up by the command "EWALd PMEW". The grid points used by PME have to be set with the "FFTX" "FFTY" and "FFTZ" options. The values for these grid points should be near the value of the simulation box size. These must be integers and cannot have any other prime factors besides 2,3 and 5. Therefore, a number like 32 would work whereas 49 would cause an error. In general, powers of 2 are suggested. KAPPa is another value that has to be given for an Ewald summation and it is the real number that governs the width of the Gaussian distribution central to the summation. A general value is .34. The "SPLIne ORDEr 6" command can help speed up the process of the Ewald summation because it uses approximations to determine the polynomials in the function of cubic spline interpolation. Shake, which is described above, may also be used.
The input script should resemble the following:
* Solvate Protein * bomlev -2 ! Read in Topology and Parameter files open unit 1 card read name top_all27_prot_na.rtf read RTF card unit 1 close unit 1 open unit 1 card read name par_all27_prot_na.prm read PARA card unit 1 close unit 1 ! Read protein structure from PSF file open read formatted unit 27 name 1yjp-full-min.psf read psf card unit 27 close unit 27 open read unit 10 card name 1yjp-full-min.crd read coor card unit 10 close unit 10 ! orient about origin so the alignment is correct with the water crystal coordinate orient ! Takes statistics on coordinates to figure out the minimum, maximum, and ! average values of the X, Y, and Z coordinates of the protein. These are ! stored in XMIN,XMAX,XAVE,YMIN,YMAX,YAVE,ZMIN,ZMAX,ZAVE. COORdinate STATistics SELEct protein END ! Read in water sequence READ SEQUence TIPS 46656 ! Generate new segment for the water GENErate WATers NOANgle NODIhedral ! Read the water PDB coordinates and append them to the protein OPEN READ FORMATTED UNIT 1 water.crd READ COOR CARD UNIT 1 APPEND CLOSE UNIT 1 COORdinate ORIEnt NOROtations SELEct SEGId WATErs END ! Delete extraneous waters DELEte ATOM SORT SELEct .BYREs. - (.NOT.(POINt 0. 0. 0. CUT @SAFESPHERE) .AND. ( SEGId WATErs )) END ! xdim, ydim, and zdim are variables that store the dimensions of ! the protein along the x, y, and z axis respectively ! Because the user requested 10 distance from the edge, ! the distance is multipled by two and then added to the diameter of ! the structure. CALC XDIM = ( ABS ( ?XMAX - ?XMIN ) + 20 ) CALC YDIM = ( ABS ( ?YMAX - ?YMIN ) + 20 ) CALC ZDIM = ( ABS ( ?ZMAX - ?ZMIN ) + 20 ) CRYSTAL DEFINE CUBE @XDIM @XDIM @XDIM 90.0 90.0 90.0 CRYSTal BUILd NOPEr 0 IMAGe BYREs SELEct SEGId .NOT. SEGId PROTein END IMAGe BYSEg SELEct SEGId PROTein END ! COPY MAIN COORDINATES TO COMPARISON SET COOR COPY COMP ! SET WRNLEV DOWN TO REDUCE CLUTTER... WRNLev 1 ! UPDATE THE IMAGE LISTS. IF ANY ATOMS MOVE DURING THIS PROCESS, IT MEANS ! THAT THEY ARE EXTRANEOUS TO THE CRYSTAL STRUCTURE THAT WE WANT TO BUILD ! AND SHOULD BE DELETED. THEREFORE, COOR DIFF IS USED TO DETECT MOVING ! ATOMS, WHICH HAD SELECTED FOR DELETION. UPDAte INBFrq 0 COOR DIFF DEFIne SEL1 SELEct .BYREs. (PROPerty X .NE. 0.0 .OR. - PROPerty Y .NE. 0.0 .OR. PROPerty Z .NE. 0.0) END ! COPY COMPARISON COORDINATES BACK TO MAIN COOR SWAP ! DELECT ATOMS THAT ARE OUTSIDE THE BOX DELEte ATOM SELEct SEL1 END ! WRITE OUT A CRYSTAL TRANSFORM FILE OPEN UNIT 1 CARD WRITe NAME 1yjp_cube.xtl CRYStal WRITe CARD UNIT 1 * SOLVATED STRUCTURE * SOLVATION: CUBE WITH A CRYSTAL DIMENSION OF @XDIM * ! FIX HYDROGEN BONDS TO THEIR VALUES IN THE PARAMETER FILE SHAKe BONH PARAmeters ! NON-BOND SETUP (SAME VALUES AS WILL BE USED IN DYNAMICS) NBONd INBFrq -1 ELEC FSWItch VDW VSWItch CUTNb 16. CTOFnb 12. CTONnb 8. - CUTIm 16.0 MGFRQ -1 WMIN 1.0 EWALd PMEW FFTX 48 FFTY 48 FFTZ 48 KAPPa - .34 SPLIne ORDEr 6 ENERgy MINI SD NSTEp 100 MINI ABNR NSTEp 1000 NPRInt 100 TOLG 0.05 OPEN WRITE UNIT 1 CARD NAME 1yjp-full-solv-min.pdb WRITE COOR PDB UNIT 1 * new_1yjp-11213-min.pdb * OPEN WRITE UNIT 1 CARD NAME 1yjp-full-solv-min.crd WRITE COOR CARD UNIT 1 * new_1yjp-11213-min.crd * STOP |